How AI and Causal Inference Are Changing the Way We Understand Impact
By: Miranda M. Yarowsky, SFS ’26, Fall 2025 MDI Communications & Events Assistant
In an age where algorithms shape what we see, who gets hired, and even who receives aid, “the hardest question isn’t how to predict; it’s how to measure.” This was a central message of the Fall 2025 MDI Distinguished Lecture, delivered on October 22 by Susan Athey, economist and Professor of the Economics of Technology at Stanford University. Active in the field of machine learning and causal inference since 2007, and co-author of the influential papers “Recursive Partitioning for Heterogeneous Causal Effects” and “Estimation and Inference of Heterogeneous Treatment Effects Using Random Forests,” Athey has helped reshape how researchers use machine learning to understand cause and effect.
In her lecture, Athey focused on how the next frontier of artificial intelligence (AI) is not just about building smarter systems, but about defining what successful impact truly means and how to measure it, whether in labor markets, public policy, or social well-being.
Creativity in Defining What to Measure
Athey reminded the audience that creativity is as vital for scientists as it is for artists or engineers. “Creativity comes into defining what to measure,” she said. In a world ruled by data, deciding what to quantify is not neutral; our choices reflect our priorities, and they ultimately shape where innovation leads.
In her own work, which includes building causal trees and causal forests, Athey homes in on improving the estimation of cause and effect. But in her lecture, she made a larger point: the metrics researchers pick define what good or successful AI means, and in this way the act of measuring can quietly steer how the whole field develops.
The Challenge of Measuring Long-Term Impact
Athey also pointed out a long-standing problem: good measurement is rarely easy measurement. Many of the outcomes that matter most, whether in education or well-being, are difficult to observe in the short term. “It’s hard to measure in the long term,” she noted, because meaningful effects play out through complex social and institutional feedback loops.
Her research on heterogeneous treatment effects explores this challenge. Instead of asking whether an intervention “works” on average, Athey’s methods identify for whom it works and under what conditions. Through papers like the two named above, she has developed tools that help researchers move beyond prediction and into causal inference, or, in her words, “picking the right objective,” especially given the challenge of heterogeneity.
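To make the contrast between average and heterogeneous effects concrete, here is a minimal sketch, written in Python rather than the R tooling associated with Athey’s work, of a simple two-model (“T-learner”) estimate of a conditional average treatment effect on simulated data. It illustrates the underlying idea only; it is not Athey’s causal-forest method, and the variable names and simulated experiment are invented for this example.

```python
# Minimal sketch: heterogeneous treatment effects on simulated data.
# This is a simple "T-learner," not Athey's causal forest; the data
# and names are invented for illustration.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Simulated randomized experiment: covariates X, random treatment T,
# and a treatment effect that exists only when the first covariate
# is positive (the "for whom does it work?" question).
n = 5000
X = rng.normal(size=(n, 3))
T = rng.integers(0, 2, size=n)
true_effect = 2.0 * (X[:, 0] > 0)
Y = X @ np.array([1.0, -0.5, 0.2]) + T * true_effect + rng.normal(size=n)

# Fit separate outcome models for treated and control units.
treated_model = RandomForestRegressor(n_estimators=200, random_state=0)
control_model = RandomForestRegressor(n_estimators=200, random_state=0)
treated_model.fit(X[T == 1], Y[T == 1])
control_model.fit(X[T == 0], Y[T == 0])

# The estimated conditional average treatment effect (CATE) is the
# difference between the two models' predictions for each unit.
cate = treated_model.predict(X) - control_model.predict(X)
print("Estimated effect where x0 > 0: ", cate[X[:, 0] > 0].mean())
print("Estimated effect where x0 <= 0:", cate[X[:, 0] <= 0].mean())
```

An average-effect analysis of this data would report a single number near 1.0; splitting by covariates recovers that the intervention helps one subgroup and not the other. Athey’s causal trees and forests pursue the same goal far more rigorously, using honest sample splitting to produce valid confidence intervals, and are implemented in the grf R package that a student mentions below.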
When Social Science Meets Artificial Intelligence
What is particularly meaningful in Athey’s work is how it demonstrates the indispensable role of social scientists in the future of AI. These researchers bring frameworks for understanding the structural factors that shape human well-being, dimensions that cannot be optimized by code alone. Athey illustrated this through diverse examples. She discussed analyzing data from 180,000 laid-off workers in Sweden to understand how automation influences employment trajectories. Another collaboration, in Poland, tested digital upskilling programs for women, showing how interventions must be evaluated not only on immediate outcomes but on long-term opportunities and constraints rooted in gender and labor markets.
When researchers reframed the question to include these structural constraints, they found that programs that seemed effective in the short term often did little to change women’s long-term mobility or access to higher-quality jobs, insights that would have been invisible using only immediate metrics. In both cases, the model mattered, but the framing of what to measure mattered even more.

Students at the reception echoed this emphasis on intentional design. Hailey, an Economics Ph.D. candidate, noted that Athey’s work made machine learning feel less opaque:
“One takeaway for me…is the fact that machine learning methods don’t have to be as much of a black box as they’re perceived to be…So I’m definitely looking forward to checking out her generalized random forest R package and learning more about how to do that.”
Mondrita, another Ph.D. candidate, added that it was refreshing to see someone so deeply trained in economics speak comfortably in technical language:
“I can see in her language that there’s a lot related to economics that is familiar to us.”

These reflections point to how AI is not simply a technical frontier but a methodological one. In her closing remarks, Dr. Lisa Singh, the Director of MDI, emphasized that the real unifying thread in Athey’s work is the challenge of causal inference.
Economists and data scientists alike struggle not just to detect correlations, but to understand why outcomes differ across people and contexts. That is precisely where Athey’s work becomes so influential. As she put it, the future of AI depends on “picking the right objective”: deciding what values we want our systems to optimize before we scale them. Measurement, in this sense, becomes an ethical choice.

Building on this point, Athey’s lecture made clear that the future of AI will be defined not only by computational advances but by the values embedded in our evaluation systems. The central question, what should we optimize, is ultimately a societal one. As institutions and companies increasingly deploy AI at scale, choosing meaningful metrics becomes a form of institutional responsibility.
In that sense, Athey’s work extends beyond technical innovation and offers a blueprint for scaling AI that reflects human priorities rather than obscuring them. The path forward requires the same creativity she emphasized: the creativity to measure what truly matters.
