The Strategic Data Project is committed to helping education data analysts at all skill levels develop new skills. For this reason, in the future OpenSDP will house a number of tutorials on topics including:

  • How to clean your data
  • How to produce and share synthetic data
  • How to build a guide for code sharing
  • How to create open source visualizations

You can also go to the GitHub repositories to explore the tutorial files. To contribute to tutorials in OpenSDP, send us your ideas and feedback: opensdp@gse.harvard.edu.

Data Janitor

As most data analysts know, much of the job (80%, by the common rule of thumb) is getting raw data ready for analysis. Each new dataset is a fresh challenge, but fortunately there are best practices and useful programming skills specific to education data. The goal of the Data Janitor tutorial series is to shorten the learning curve for education analysts struggling with data cleaning chores.

Nearly Unique (Stata)

This tutorial teaches how to implement decision rules in Stata when cleaning longitudinal data. You will start with a sample data file that is nearly unique at the student and school year level, and clean each variable until the data is internally consistent.


Nearly Unique (R)

This tutorial teaches how to implement decision rules in R when cleaning longitudinal data. You will start with a sample data file that is nearly unique at the student and school year level, and clean each variable until the data is internally consistent.
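
To make the idea of a decision rule concrete, here is a minimal sketch, written in plain Python rather than Stata or R and not taken from the tutorial files. It assumes that "nearly unique" means a few student and school year pairs appear more than once with conflicting values, and it applies one possible rule: keep the most frequent non-missing value, breaking ties by whichever value appears first. The sample records, the variable names (`student_id`, `school_year`, `gender`), the helper `resolve_by_mode`, and the rule itself are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical sample records (not from the tutorial data): some
# student/school-year pairs appear more than once with conflicting values.
records = [
    {"student_id": 1, "school_year": 2015, "gender": "F"},
    {"student_id": 1, "school_year": 2015, "gender": "F"},
    {"student_id": 1, "school_year": 2015, "gender": "M"},
    {"student_id": 1, "school_year": 2016, "gender": "F"},
    {"student_id": 2, "school_year": 2015, "gender": "M"},
    {"student_id": 2, "school_year": 2015, "gender": None},
]


def resolve_by_mode(rows, keys, variable):
    """Collapse rows to one per key, keeping the most frequent non-missing value.

    Assumed decision rule: ties go to the value that appears first in the data.
    """
    grouped = defaultdict(list)
    for row in rows:
        grouped[tuple(row[k] for k in keys)].append(row[variable])

    cleaned = []
    for key, values in grouped.items():
        observed = [v for v in values if v is not None]
        # Counter.most_common lists equal counts in first-seen order.
        chosen = Counter(observed).most_common(1)[0][0] if observed else None
        cleaned.append({**dict(zip(keys, key)), variable: chosen})
    return cleaned


cleaned = resolve_by_mode(records, ("student_id", "school_year"), "gender")

# After the rule is applied, each student/school-year pair appears exactly once.
pairs = [(r["student_id"], r["school_year"]) for r in cleaned]
assert len(pairs) == len(set(pairs))
```

In practice the right rule depends on the variable: a different choice, such as preferring the most recent record, may suit some fields better than the modal value used here.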


Data Viz

Compelling, effective presentations change minds and can change policy, but good data visualization doesn’t happen by accident. OpenSDP Data Viz tutorials will help education analysts learn tools and principles for designing effective data visualizations.

R Shiny

This tutorial teaches how to use R Shiny, a powerful, free interactive graphics tool. You will work through three hands-on exercises and write code for an interactive graph with user controls.
