
2023 – 2026 Multi-Year Recap

Written by Lisa Singh, MDI Director, Sonneborn Chair, and a Professor in the Department of Computer Science and McCourt School of Public Policy

What a privilege – 4 years as the Director of the Massive Data Institute. While I have been connected to the institute since its founding, I am thankful for being given the opportunity to lead MDI and help the MDI community shape the institute. 

I came into this position with a lot of passion and a bold vision of what research at Georgetown could look like. And while I did not accomplish all I had set out to do, I hope that both the passion and the initiatives that came to fruition have benefited the MDI community, the McCourt community, the university, and beyond. 

I have struggled to decide what to write in this final letter. But I hope that what I share below highlights how much we have grown, how much impact we have, and how well positioned we are to continue to make a difference over the next decade. 

THE IMPACT OF MDI 

Like everyone else, we have faced some abrupt federal grant losses. But we pivoted quickly and now have approximately $15 million in active funding. Our faculty and fellows have used external and internal funding to publish over 75 research articles and policy briefs, participate in over 50 research conferences, write multiple books, and submit a handful of formal responses to requests for comment (most on AI).

We believe in training for everyone. Over the last four years, we partnered with universities, industry, and government to organize summer learning experiences, co-host multiple data set challenges and forums, and sponsor Wikipedia edit-a-thons and save-the-data events. We have hosted multiple conferences, including the first administrative data research conference in the US, ADRCon (led by Amy O’Hara), and over 10 panels discussing topics ranging from emerging technologies to data governance and equity issues to gun violence to election misinformation. Many of these were with Tech and Society partners or our GU Politics neighbors. Each fall we invited a distinguished lecturer to share their research and vision for the future of their field with our community. Our postdoctoral fellows and faculty also hosted 7 to 10 different technical workshops each year for the Georgetown community to learn about new data-related methods and technologies, ranging from how to program using network analysis to research methods for human-centered AI.

For those who work with restricted data or large data sets or LLMs or run large simulations, it can be challenging to manage the facilities needed to access, collect, and/or use the data required to conduct their research. The usage of the Research Data Center (RDC), directed by Amy O’Hara, continues to increase, supporting over 20 approved research projects at Georgetown and other local organizations, including American University, Virginia Tech, RAND, USDA, and the Inter-American Development Bank. MDI has also supported cloud computing infrastructure needs for over 25 research projects and in 2024 spearheaded the first on-premises GPU and HPC cluster to support research that was too expensive in the cloud environment.

Going through all these statistics does not do justice to all the work done at MDI. The largest impact is on the specific policies and communities affected by our research and the technology we build. It is impossible to share all the research and evaluations taking place (yes – over 10 of them involve AI), but I want to highlight one – a simple one that may have a global impact over the next few years. I highlight this one, not because it is better than all the other work we do, but because it reminds us about the power of data and the need for its availability and its transparency.

The #PublicDebtIsPublic initiative launched in January 2025. It is a collaboration between MDI and the Sovereign Debt Forum at Georgetown Law. The premise of the project is straightforward. Governments borrow money to feed, protect, and invest in their people, and the public has a right to read and understand those promises. To support that, MDI has built a platform to easily search for sovereign debt contracts and learn about them. Creating a search engine for sovereign debt contracts may not sound very novel or revolutionary. But this platform is the first step toward making transparency of sovereign debt a global norm. This customized search engine is showing the world the different “deals” that are being made, and researchers are using these data to highlight inequities and oddities as they emerge. Sovereign debt transparency is an ambitious goal, and MDI wants to support building tools and writing reports to help make that vision a reality. The platform is gaining traction and was highlighted last week in the annual Paris Club Report.

While I picked this example because of the recent report, I could have highlighted 10 other examples of how we not only conduct the analysis or build the technology, but we also help nudge the policies forward. 

THE PEOPLE OF MDI

To do great things, you must be surrounded by brilliant, passionate people. MDI is a distributed group of clusters that interconnect through the methods we advance, the substantive research and policy questions we answer, the data we use, and the technologies we build.

Our institute has grown substantially in the last 4 years. We have over 10 core research faculty and over 60 affiliated research faculty from seven schools across campus. We have policy experts who have spent decades in different agencies across the government. We have a tech team with software developers who help build public interest technologies and data scientists who clean and analyze lots of messy, complicated data sets. We have an administrative and project management team that can take any idea and operationalize it into a unique event that our communications interns help execute or an innovative proposal for funding. We have MDI Policy and Postdoctoral Fellows who spend a few years with us, advancing our research and policy agenda to meet the moment. Our team has a wealth of expertise from government, academia, industry, and non-profits. MDI’s core team is filled with brilliant, passionate people who want to make a difference. I am truly blessed to have learned from so many of them.

Our growth has not only been with faculty and staff. We also have 30 to 40 Scholars who participate in advancing our research each year. These undergraduate and Master’s students span a large number of disciplines and almost all the schools at Georgetown. During the summers, we also have students from other universities engage in data and research projects. Over the last four years, we have housed over 100 MDI Scholars, Fritz Fellows, Sonneborn Scholars, NSF REU students, and research assistants who worked on over 50 research projects. I may be biased, but I believe the MDI Scholars program is a flagship research opportunity for students at Georgetown. I appreciate all the hard work the faculty mentors and postdoctoral fellows put into our experiential learning opportunities. I also want to give a shout-out to Ethics Lab for collaborating with us to ensure that every project team pauses to think about the ethics of working with specific data or technologies within their research project.

THANK YOU.

This short letter has gotten long. So one last thank you for working on the hard problems, continuing to persevere during challenging times, and inspiring me to answer questions and build tools that improve people’s lives. It is this courage, resilience, and kindness that I will always associate with MDI. 

A NEW CHAPTER

The next Director of MDI is no stranger to our community. Amy O’Hara has been at MDI for a decade as the RDC Director and as research faculty. She is a hub within MDI – ADRCon and the MDI Summer Institute are two of her current outreach initiatives. The Summer Institute (which met just last week) convenes technical and policy staff, researchers, technologists, and education leaders from across the country to build practical skills for applying privacy-enhancing technologies to Statewide Longitudinal Data Systems (SLDS).

Amy also works on multiple data science and research projects that support both state and federal government, including building open tools to enable responsible AI in education, evaluating the statistical validity and privacy protections of AI-generated synthetic education data, expanding access to administrative data through AI-driven tools and training resources, and evaluating the ability of large language models (LLMs) to accurately respond to questions about government open data assets. She also serves as the president of the Association of Public Data Users. I could continue for a while, but I think you can see how perfect she is for this role. 

I am thrilled that Amy is the next MDI Director and look forward to seeing where she leads the Institute.  
