MDI Project Spotlight: Professor Maria Alva Uses Open-Source Data to Track Death Rate Trends
Written by Paige Kupas, MDI Journalism Intern
Maria Alva, Assistant Professor in the School of Health and MDI-affiliated faculty member, has been working on a project to harness open-source data to track changes in death rates. Her project remains particularly relevant because the Centers for Disease Control and Prevention (CDC) stopped tracking and reporting COVID-19 cases in May, and the reporting lags for all-cause and cause-specific mortality data are long.
The project, funded by a grant from the National Institutes of Health (NIH), aims to track death trends in Washington, D.C., across demographic groups defined by age, gender, race, or ethnicity. A spike in death rates within any group would indicate an underlying factor that merits further investigation, according to Alva.
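To make the idea of a group-level spike concrete, here is a minimal Python sketch. It is not the project's method: the trailing-baseline z-score rule, the function name `flag_spikes`, the thresholds, the group labels, and the monthly counts are all assumptions made up for illustration, and it works on raw counts rather than population-adjusted rates.

```python
from statistics import mean, stdev


def flag_spikes(counts, baseline=12, threshold=3.0):
    """Return indices of periods whose count sits more than `threshold`
    standard deviations above the mean of the preceding `baseline` periods.

    Hypothetical rule for illustration only.
    """
    flagged = []
    for i in range(baseline, len(counts)):
        window = counts[i - baseline:i]
        mu, sigma = mean(window), stdev(window)
        # Skip perfectly flat baselines to avoid dividing by zero.
        if sigma > 0 and (counts[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Made-up monthly death counts for two hypothetical groups.
monthly_deaths = {
    "age_65_plus": [40, 42, 38, 41, 39, 43, 40, 41, 42, 39, 40, 41, 70],
    "age_under_65": [20, 22, 19, 21, 20, 23, 21, 20, 22, 19, 21, 20, 21],
}

spikes = {group: flag_spikes(series) for group, series in monthly_deaths.items()}
print(spikes)
```

In this invented data, only the final month of the older group is flagged. A flag of this kind says nothing about the cause; it only marks where a closer look might be warranted.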
“What this project seeks to do is use open-source data — that people already contribute to through funeral homes and newspaper online obituaries — to track at the aggregate level changes in all-cause mortality across groups,” Alva said.
Using open-source data to track death rates provides advantages in both cost and timeliness, according to Alva. Gaining access to official administrative death data requires time and money, and Vital Records and the CDC typically operate under a two-year lag in releasing such data, although this lag was dropped during the COVID-19 pandemic in response to FOIA requests.
To test the representativeness and reliability of online obituary and funeral home records, Alva obtained death certificates for individuals who died in the District of Columbia from 2015 to 2021. Comparing the administrative data to the open-source data will help the project’s researchers learn how best to use open-source data to predict changes in administrative data as accurately as possible.
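One simple way to picture a representativeness check is to compare how deaths are distributed across groups in each source. The Python sketch below is an assumption-laden illustration, not the team's analysis: the function names, group labels, and counts are invented, and a real comparison would likely involve record linkage and statistical modeling.

```python
def group_shares(counts):
    """Convert per-group counts into each group's share of the total."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("counts must contain at least one death")
    return {group: n / total for group, n in counts.items()}


def representativeness_gap(official, open_source):
    """Share of each group in the open-source data minus its share in the
    official data. Positive values mean the group is over-represented
    in the open-source records."""
    official_shares = group_shares(official)
    open_shares = group_shares(open_source)
    return {g: open_shares.get(g, 0.0) - official_shares[g] for g in official_shares}


# Made-up counts: death certificates vs. obituary-derived records.
official = {"group_a": 500, "group_b": 300, "group_c": 200}
obituaries = {"group_a": 300, "group_b": 120, "group_c": 80}

gap = representativeness_gap(official, obituaries)
# Fraction of officially recorded deaths that also appear in obituaries.
coverage = {g: obituaries.get(g, 0) / official[g] for g in official}

for g in official:
    print(f"{g}: coverage={coverage[g]:.2f}, share gap={gap[g]:+.2f}")
```

In this invented example, the first group is over-represented in obituaries relative to death certificates, which is the kind of imbalance a correction step would need to account for before open-source counts could stand in for official ones.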
Alva and her team recently submitted a paper about the project, “Death, Inequality, and the Pandemic in the Nation’s Capital,” which suggests that tracking changes in mortality rates helps researchers understand how shocks disproportionately affect different demographic groups.
Alva hopes the project will enable organizations and countries that lack surveillance infrastructure to detect and identify new diseases and outbreaks in a timely way using open-source data.
“We plan on making our code open-source. This is a multi-disciplinary effort between public health, computer science and data scientists. In essence, our approach involves collecting and aggregating the information people already provided voluntarily to different repositories – we leverage this crowdsourcing effort to give back insights and identify patterns or changes in our community as one would do in epidemiological surveillance,” Alva explained.
Although a spike in deaths does not tell researchers exactly what is happening, it “serves as an immediate alarm system that something is going on” that warrants investigation, which is vital for keeping up with future pandemics and public health crises.