
MDI Faculty Spotlight: Dr. Rebecca Johnson

Written by Carrie McDonald, MDI Journalism Intern

Embodying the central mission of the Massive Data Institute (MDI) to tackle societal-scale issues and impact public policy through data-driven research, Dr. Rebecca Johnson is committed to creating innovative datasets that address the limitations of traditional data and ultimately improve public policy outcomes.

An Assistant Professor in the McCourt School of Public Policy and MDI faculty affiliate, Johnson focuses on demography, sociology, and social policy. She researches how government agencies prioritize different groups with varying needs, predominantly focusing on K-12 education and rental housing policy. 

Johnson was first inspired to study computational methods out of her passion for bioethics.

“Working in bioethics, you realize that families of kids with disabilities or other needs can face a lot of administrative burdens and challenges when advocating for their children and, oftentimes, they are up against institutions that have a lot more power and a lot more resources,” Johnson said. “While on the one hand, there are more philosophical or normative researchers working on more abstract ethical issues with that, I think it’s really important to have empirical research documenting those inequalities.”

Noticing the lack of ready-made datasets in this field of interest when she reached graduate school, Johnson began to focus on natural language processing, web scraping, and other computational techniques to study topics that mattered to her from a policy perspective. 

“I’m interested in using computational methods to build innovative data sources to study things that we haven’t been able to study before due to data limitations,” Johnson said.

One part of Johnson’s continued work on using computational methods to inform education policy is her role as a Faculty Advisor for our flagship MDI Scholars program. Over the past two academic years, her team built a database of school board videos and transcripts to study which issues are debated.

According to Johnson, her team embarked on this challenge without knowing if it would be computationally possible and overcame many technical difficulties along the way. They have collected transcripts from approximately 1,500 school districts thus far, representing about one in eight districts nationwide. This year, Johnson hopes to improve the efficiency of this data pipeline and maximize its potential to influence policy decisions.

“I’ve started to talk to parents, school board members, and education administrators, and they see immediate applied uses for the data, which is always my goal as someone who works in a policy school and wants to do research that impacts policy,” Johnson said. 

Johnson is also an academic affiliate with the federal Office of Evaluation Sciences (OES) and a Data Science Fellow with The Lab at DC, an applied research team in Mayor Bowser’s administration. Through these partnerships, Johnson supports real-world government programs with her data science skills to ensure that her research “is not only relevant for the academic community but can ideally have a more direct path to influencing policy-making in local and federal contexts.” 

In one project, a partnership with the D.C. Deputy Mayor for Education, Johnson used natural language processing to analyze how communications between teachers and parents or guardians changed during the pandemic. At the federal level, Johnson’s research has included projects with the Department of Education, the Department of Housing and Urban Development, and the Census Bureau.

Looking forward, Johnson believes in the importance of emphasizing the role of domain expertise within data science, massive data, and computational social science. She also sees potential in combining computational methods with the large-scale field experiments and randomized controlled trials commonly used by government agencies. She advises her students to find a policy area that they are passionate about and learn computational methods by concentrating on a specific domain.

“Even though the tools we use are domain-general, the nuances of what we’re able to find using those tools are shaped by deep knowledge of a particular domain or a policy context,” Johnson said.

Last year, Johnson presented her work at multiple research conferences, including the Sociology of Education Association Annual Conference, the Polarization Lab Annual Meeting, and the Population Association of America Annual Meeting.