How MDI Piloted Synthetic Data with Nebraska’s SLDS to Advance Privacy-Protected Education Research

In collaboration with the Nebraska Statewide Workforce & Educational Reporting System (NSWERS), the state's longitudinal data system, the Massive Data Institute (MDI) at Georgetown University piloted synthetic data as a privacy-enhancing technology (PET) to support secure analysis of education data.

👉 Read the full use case, access the PDF brief, and explore keywords from our report:
Nebraska Synthetic Data Pilot – Use Case

👉 Explore more about PETs on our PET website.

This pilot addressed a core challenge in state data systems: enabling research access to linked datasets, such as K–12 graduation data and professional licensing records, without exposing sensitive individual-level information. Using synthetic data generation, the team demonstrated a method that preserved analytic value while substantially reducing reidentification risk.
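To make the idea concrete, here is a minimal sketch of one very simple form of synthetic data generation. It is not the method used in the Nebraska pilot, which this post does not describe in detail. The toy records, the two attributes, and the function names are all hypothetical. The sketch estimates how often each combination of values occurs in the original records and then draws artificial records from those frequencies. A sampler this simple carries no formal privacy guarantee; real deployments typically add protections such as noise or suppression of rare combinations.

```python
import random
from collections import Counter


def fit_joint_distribution(records):
    """Estimate the share of records with each combination of values."""
    counts = Counter(records)
    total = sum(counts.values())
    return {combo: n / total for combo, n in counts.items()}


def generate_synthetic(distribution, n, seed=0):
    """Draw n artificial records from the estimated joint distribution."""
    rng = random.Random(seed)
    combos = list(distribution)
    weights = [distribution[c] for c in combos]
    return rng.choices(combos, weights=weights, k=n)


def license_rate_among_graduates(rows):
    """Share of graduates who also hold a professional license."""
    grads = [r for r in rows if r[0]]
    return sum(1 for r in grads if r[1]) / len(grads)


# Hypothetical toy data: (graduated_high_school, holds_professional_license)
original = (
    [(True, True)] * 30
    + [(True, False)] * 50
    + [(False, True)] * 5
    + [(False, False)] * 15
)

dist = fit_joint_distribution(original)
synthetic = generate_synthetic(dist, n=1000, seed=42)

# Compare one analytic quantity on the original and the synthetic records.
print("original:", round(license_rate_among_graduates(original), 3))
print("synthetic:", round(license_rate_among_graduates(synthetic), 3))
```

In this toy setup the license rate among graduates is 0.375 in the original records, and the synthetic estimate should land close to that value, which is the sense in which analytic value is preserved even though no synthetic row corresponds to a real person.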

“Synthetic data offers a promising alternative when traditional suppression or generalization techniques are too limiting,” said MDI Policy Fellow Stephanie Straus.

The Nebraska pilot tested the generation of realistic but artificial datasets that reflect the distributions in the original data, allowing researchers to analyze linked records without access to real individual-level information.

This project was part of MDI’s broader mission to deploy PETs in real-world policy settings. It reflects our commitment to helping state education agencies adopt modern data governance approaches that are both innovative and compliant.