
Fall 2023 Scholars Present Projects and Reflect on the Semester

Written by Tilde Jaques, MDI Journalism Intern

The Massive Data Institute’s Fall 2023 MDI Scholars Showcase took place on Wednesday, Dec. 6. MDI Scholars shared posters of their research projects with the community and presented flash talks. These Scholars are undergraduate and M.S. students who are selected by MDI faculty to work on faculty-guided research during the Fall semester. The research spans different disciplines and includes a data and/or policy component. The MDI Scholars program is in its fourth year. Along with working on specific research projects, the cohort convenes each month to learn different data science skills and share research experiences.

Learn more about the MDI Scholars’ reflections on their research this past semester and how they got involved with the MDI Scholars program!

Project 1: Deciphering Lawmakers’ Ideologies: Integrating Machine Learning and Large Language Models
Zhiqiang Ji (M.S. Data Science for Public Policy ’24) worked with Professor Michael Bailey on a project exploring the application of ChatGPT for analyzing ideological positions in political texts.


Project 2: DistrictView: Building a First-of-Its-Kind Database of U.S. School Board Meeting Transcripts
Maggie Sullivan (M.S. Data Science for Public Policy ’24) worked with Corinna Calanoc (M.S. Data Science and Analytics ’24) and Faculty Advisor Rebecca Johnson to build a first-of-its-kind database of transcripts from U.S. school board meetings. More info here! 


Project 3: Building A Data Collection Protocol to Preserve Privacy
Jason Yi (B.S. Computer Science ’26) and Jamie Spoeri (B.S. Computer Science ’25) worked with Jianan Su, Professors Micah Sherr and Lisa Singh, and Fellow Harel Berger on a project aiming to facilitate the secure collection of distributed data without compromising users’ privacy.


“I aim for my research to have a notable effect on the broader community by addressing real-world issues and enhancing societal well-being,” Li said. “I am enthusiastic about utilizing data science tools to contribute to fostering a more positive social environment.”

Xinyu Li

Project 4: Measuring French Racism and Misrepresentation on Social Media to Better Understand Online Perception
Xinyu Li (DSAN ’24) worked with Faculty Advisor Lisa Singh on a project examining the dynamics of online interactions, behaviors, and perceptions, with a focus on the interplay of gender, race, and identity in the digital realm.


Project 5: Automated pipeline to extract wetland damage data from US Army Corps of Engineers notices using LLMs
Xinyu Zheng (M.A. Public Policy ’23) and Himangshu Kumar (M.S. Data Science for Public Policy ’24) worked on a project using an automated pipeline to extract wetland damage data from US Army Corps of Engineers notices, with the goal of understanding the impact of development projects on sensitive areas.


Project 6: The Graying of the Federal Workforce
Haiyang Chen (M.S. Data Science for Public Policy ’24) worked with Linlin Wang (M.S. Data Science and Analytics ’24) and Faculty Advisor Mark Richardson on a project measuring the aging federal workforce. Here’s more.


Project 7: Exploring the Patterns of Toxic Release: Towards Environmental Equity
Raunak Advani (B.S. Data Science & Analytics ’24) worked with Fellow Le Bao on a project that investigated the spatial distribution of environmental pollution, focusing on the relationship between demographics and exposure to these toxic release emissions.


Project 8: Environmental Justice Data Solution: A Holistic Approach
Minh Quach (M.S. Data Science for Public Policy ’24) worked with Madhvi Malhotra (M.S. Data Science for Public Policy ’24) and Fanni Varhelyi (M.S. Data Science for Public Policy ’24) under the supervision of Professor Michael Bailey on a project to create a prototype for policymakers to more easily access environmental justice data. Here’s more.


Project 9: Spatial and demographic patterns of building-level emissions in Washington, D.C.
Himangshu Kumar (M.S. Data Science for Public Policy ’24) and Anthony Moubarak (B.S. Data Science & Analytics ’24) worked with Professor Michael Bailey and Fellow Ahmed Eissa on a project looking at spatial and demographic patterns of building-level emissions in Washington, D.C. Here’s more information about their project.


Project 10: Evaluating the Reach & Efficacy of Head Start Locations
Amanda Hao (B.A. Science, Technology, and International Affairs ’26) worked on a project under the supervision of Professor Amy O’Hara and MDI Research Specialist Gabriel Taylor that aimed to evaluate the reach and efficacy of national Head Start locations. Here’s more about this project.


Project 11: Exploring Methods for Privacy-Protecting Administrative Record Linkage
Alicia Gopal (B.S. Political Economy ’25) worked with faculty advisors Professor Amy O’Hara and Dr. Nathan Wycoff on a project that examines ways to protect privacy and confidentiality by disrupting data.


Project 12: Using Selenium to automate education finance data pipelines
Andrew Lee (B.A. Science, Technology, and International Affairs ’24) worked on a project aiming to use Selenium to automate education finance data pipelines.


Project 13: Learning Lessons from Incident Reporting
Brian Holland (M.S. Data Science for Public Policy ’24) worked with Professor Robin Dillon-Merrill on a project that analyzes data from incident reporting systems to find trends and lessons.


Project 14: Sentiment and Emotion as Tools for Forced Migration Prediction?
Bernardo Medeiros (B.A. Computer Science & Government ’24) worked with Kate Liggio (B.A. Computer Science ’24) and Rich Pihlstrom (B.A. Computer Science ’24) on a project investigating emotion and sentiment expressed on social media as tools to improve predictive models for forced migration. The MDI Scholar team was advised by Lisa Singh, Katherine Donato, Ali Arab, Nathan Schneider, and Ameeta Agrawal. Here’s more about this project.


Project 15: Tracking Cross-National Score Trajectory in PISA with Dynamic Time Warping Methods
Kefan Yu (B.S. Data Science & Analytics ’24) worked with Professor Qiwei Britt He on a project that tracked scores, ranks, and performance disparities between boys and girls in the Programme for International Student Assessment (PISA) across nations.


Project 16: Risk Mapping in Forensic DNA Analysis
Roy Hwang (B.S. Computer Science ’25) and Julia Nonnenkamp (B.S. Computer Science & Mathematics ’24) worked with faculty advisors Lisa Singh, Elissa Redmiles, and Ioannis Ziogas on a project investigating the technologies used for genetic data analysis to identify potential risks and limitations of technology used in forensic analysis.
