How State Education Systems Are Charting a New Data Privacy Frontier

By: Stephanie Straus, Policy Fellow, Massive Data Institute

In recent years, state education departments have expanded their existing databases into integrated, multi-agency systems called Statewide Longitudinal Data Systems (SLDSs). These systems, funded by U.S. Department of Education grants, follow a jurisdiction’s students across their learning lifecycle, from K-12 education to postsecondary education and into the workforce. Because they span multiple sectors, SLDSs often require linking data from several state agencies, such as departments of education, higher education, and labor.

This integration presents data privacy and governance challenges. SLDSs cite three in particular: sharing sensitive data across disparate agencies, provisioning safe data access to external stakeholders, and matching records without unique identifiers. Privacy Enhancing Technologies (PETs), a suite of cryptographic and statistical techniques that increase data protection while maintaining data utility, can ease these challenges. PETs work by masking identifiers, automating access rules, and altering outputs to protect individuals and groups from harm.
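To make "masking identifiers" concrete, here is a minimal Python sketch of keyed hashing for record linkage. It is an illustration under stated assumptions, not a description of any SLDS's actual method: the field choices (first name, last name, date of birth), the sample records, the secret value, and the helper names `normalize` and `masked_key` are all hypothetical. The idea is that two agencies sharing a secret key can each hash their own records and compare only the hashes.

```python
import hashlib
import hmac
import unicodedata


def normalize(value: str) -> str:
    """Strip accents, punctuation, and case so minor spelling differences hash alike."""
    decomposed = unicodedata.normalize("NFKD", value)
    return "".join(ch for ch in decomposed if ch.isalnum()).lower()


def masked_key(secret: bytes, first: str, last: str, dob: str) -> str:
    """Return a keyed SHA-256 hash (HMAC) of the normalized identifying fields."""
    message = "|".join(normalize(part) for part in (first, last, dob)).encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


# Hypothetical secret that both agencies would agree on out of band.
SHARED_SECRET = b"hypothetical-secret-shared-out-of-band"

# Hypothetical records held separately by an education agency and a labor agency.
k12_records = [("Ana", "Lopez", "2008-04-17"), ("Ben", "O'Neil", "2007-11-02")]
labor_records = [("ana", "López", "2008-04-17"), ("Cara", "Smith", "2006-01-30")]

# Each agency hashes locally; only the hashes are compared.
k12_keys = {masked_key(SHARED_SECRET, *record) for record in k12_records}
labor_keys = {masked_key(SHARED_SECRET, *record) for record in labor_records}
matches = k12_keys & labor_keys

print(f"{len(matches)} record(s) matched without exchanging names or birth dates")
```

A keyed hash is used here rather than a plain hash because names and birth dates are easy to guess: without the secret, an outsider cannot simply hash a list of likely values and look for matches. This approach still only finds exact matches after normalization, so real deployments typically need additional handling for typos and name changes.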

Helping SLDSs Adopt PETs to Meet Data Privacy Needs

At the Massive Data Institute, I lead our initiative on PETs in education, funded by the Gates Foundation. Since 2021, I have partnered with SLDSs to surface their greatest data governance tension points, and I advise government staff on where PETs can fit into their existing infrastructure. PETs are versatile techniques, ranging from secure hashing algorithms to hardware-based enclaves, that can be combined with government agencies’ current data privacy protections, such as cell suppression, manual access controls, or trusted intermediaries.
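As a point of reference for the existing protections mentioned above, cell suppression can be sketched in a few lines of Python. This is an illustrative sketch only: the threshold of 10, the group labels, the counts, and the function name `suppress_small_cells` are assumptions, and agencies set their own minimum cell sizes.

```python
from typing import Dict, Optional


def suppress_small_cells(
    counts: Dict[str, int], threshold: int = 10
) -> Dict[str, Optional[int]]:
    """Replace any count below the threshold with None (i.e., not published)."""
    return {
        group: (count if count >= threshold else None)
        for group, count in counts.items()
    }


# Hypothetical enrollment counts for one school, by subgroup.
enrollment = {"Group A": 142, "Group B": 7, "Group C": 35, "Group D": 3}

published = suppress_small_cells(enrollment)
for group, count in published.items():
    print(group, "suppressed" if count is None else count)
```

Note that this covers only primary suppression. If a total is published alongside the table, a single suppressed cell can be recovered by subtraction, which is one reason agencies look at combining such rules with PETs.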

Impetus for This Work

Our PETs in education work was born out of two projects: a landscape analysis paper and an ED pilot.

For the landscape analysis paper, Dr. Amy O’Hara and I surveyed 40 SLDS staff and other education data owners regarding PETs. We recorded their knowledge (or lack thereof), enthusiasm, and concerns about these advanced data privacy techniques. We found that one of the major barriers to PET adoption in their organizations is the lack of successful use cases.

For the ED pilot, I helped the National Center for Education Statistics recreate postsecondary aid statistics via a PET called secure multiparty computation. The resulting report highlights the implementation barriers we faced. The experience made me realize that the most difficult part of integrating privacy technologies into education data infrastructure is not the technology itself, but rather everything else around it.

Applications in the Education Field

With that barrier in mind, we shifted our focus to real-world implementation of PETs within SLDSs’ governance structures. Every day, I work with SLDS managers to help them test PETs. I start with an overview of their data privacy concerns and then see where PETs can fit in. Are they most worried about group inference disclosures via public data releases? Are they struggling to gain the trust of a neighboring agency in order to link their data? Each use case necessitates a different calculus and involves different PETs, as can be seen in our successful PET pilots with Nebraska, Arkansas, and DC, which used synthetic data, secure hashing, and private set intersection, respectively.

During these collaborations, I also record the non-technical aspects of implementation, such as how to talk to your legal counsel about the technology, how to ensure your data is in the proper schema before testing, and what sort of procurement considerations you need to bring to your IT or CIO staff. We have created how-to binders (see our synthetic data binder) based on the projects with the aforementioned states, as we have guided them through the administrative, regulatory, and governance aspects of PET adoption. We have also created a primer and a PET 101 training series for government agencies thinking about adopting PETs.

Scaling and Future Directions 

As we continue to work alongside SLDSs, we uncover new issue areas that are critical to their adoption of PETs. SLDS staff still emphasize the need for cross-agency data linkages in low-trust environments, the demand for statistical transparency from their stakeholders, and the need to account for the overall privacy loss across all of their data releases.

However, they (and other government agency staff beyond SLDSs) would like federal entities to set national standards for PETs to guide technology selection, procurement, and implementation. Some federal institutions have started this work (e.g., NIST’s recent differential privacy guidelines and a White House national strategy), but more needs to be done.

I look forward to relaying SLDSs’ concerns to national decision-making bodies, in order to move the needle further in the balancing act of protecting students’ data and using those data to their greatest benefit.

To learn more about Privacy-Enhancing Technologies (PETs), please visit https://mdi.georgetown.edu/pets/
