The Green Space Data Challenge
ANNOUNCING THE GREEN SPACE DATA CHALLENGE WINNERS
Hosted by the Place-Based Indicators Project at The Massive Data Institute, the Green Space Data Challenge was an opportunity for students and early career professionals to turn green space data into indicators that could help local leaders understand and improve their communities.
The winning entries were:
- First place ($5,000) – Zhaowen Guo, Shuang Wu, Chenyue Cao, Yiwen Wang
- Second place ($2,000) – Yuxin Liang, Wenhao Jiang (Penn, NYU)
- Third place ($1,000) – Ryan Gentzler
- First place ($5,000) – Jia Xu, Tianyu Shi, Yimin Sheng, Yingtong Zhong (Penn)
- Second place* ($2,000) – Zairui Yang, Ying Shu, Yao Jiang, Jiaxi Lin (Penn)
- Second place* ($2,000) – Anna Scaramuzza, Timothy Putnam
- Third place ($1,000) – Vickram Peter, Anna Kramer, Vaishnavi Akilla, Yu Ze Toh (NYU)
*Because two teams tied for second place in Community Health, each team is awarded a separate $2,000 prize.
- First place ($5,000) – Joseph Edgerton, Shahrzad Badri, Veronica Malabanan Lucchese
- Second place ($2,000) – Diana Schoder (Georgetown)
- Third place ($1,000) – Kumar H, Ayush Lahiri (Georgetown)
- First place ($5,000) – Yifan Bian, Kangdong Han (Georgetown)
- Second place ($2,000) – Tyler Hoffman, Shaylyn Trego, Timara Crichlow (Arizona State)
- Third place ($1,000) – Ben Garza
- $1,000 prize – Jia Xu, Tianyu Shi, Yimin Sheng, Yingtong Zhong (Penn)
- $500 prize – Zairu Yang, Ying Shu, Yao Jiang, Jiaxin Lin (Penn)
- $500 prize – Yifan Bian, Kangdong Han (Georgetown)
Each submission was carefully evaluated by expert judges from the U.S. Department of Housing and Urban Development (HUD), USDA-Forest Service, Urban Health Collaborative at Drexel University, and the Spatial Analysis Lab at the University of Vermont. Interested in learning more about the winning projects? Join this APDU webinar on April 17, 2023 at 3:00 PM ET to hear from the winners about their ideas and hopes for improving community well-being by examining the effects of green space on community health, community safety, specific populations, and physical environment.
To Registrants: The data challenge has officially begun! Please see important communications for registered participants, including registration instructions and next steps, on this page. To ensure that you have the full month to plan and conduct your analysis, we highly recommend completing this process as soon as possible.
How does access to green space impact public health? How can green space data provide actionable information about our communities and where people’s needs are or aren’t being met?
The Massive Data Institute at Georgetown’s McCourt School of Public Policy welcomes your participation in the virtual Green Space Data Challenge. Participants will have the entire month of February 2023 to demonstrate the value of green space data by creating analyses, visualizations, and new community indicators.
Access to green space is an essential need with significant implications for quality of life and community well-being: it has been found to encourage outdoor recreation, human connection, and positive mental and physical health. Yet access to green space is not equitable across communities nationwide.
In this data challenge, you will analyze the impact of inequitable green space access on communities, working in individual or team notebooks with various data sources on our Redivis platform. Your goal will be to transform these green space datasets into actionable community indicators that illustrate the effects of green space across one of four dimensions: public health, public safety, effects on a specific population (e.g., by age, race, or location), or physical environment (e.g., pollution levels).
We provide six datasets directly with information on green space for you to work with for the challenge. We are also linking to a variety of datasets on green space as well as subject area indicators (e.g., for public health or public safety) from the Environmental Impact Data Collaborative; you are free to use any of these in the challenge. In addition, you may ask to use other publicly available data in your submission by submitting a request by Wednesday, February 8th at 5:00 PM EST. Any data added this way will be made available to all participants through the Redivis site. See the FAQ below for more information.
You will be evaluated by judges based on the relevance, completeness, and quality of your submission. Judges include:
Stephen T. Dickinson, PhD, Urban Health Collaborative, Drexel University
Alexander Din, United States Department of Housing and Urban Development (HUD)
Michelle Kondo, PhD, USDA-Forest Service
Jarlath O’Neil-Dunne, Spatial Analysis Lab, University of Vermont
- Are communities in close proximity to green space more likely to have better air quality?
- Is there a relationship between green space and social vulnerability?
- Do communities with better green space access experience lower rates of violent crime or gun deaths?
- How strong is the relationship between community green space and obesity?
- What are the effects of good park systems on mortality?
For more ideas, see our lit review with examples of research involving green space in each of the four challenge subject areas.
When: Feb 1-28, 2023
Where: The data challenge is entirely virtual, with information and rules of the challenge available on this website. Updates will be posted on this website as well as emailed to registered participants.
Who you are: We welcome any participants over age 18, especially undergraduate students, graduate students, and early professional data scientists. Analysts based in academic institutions, government statistical offices, think tanks and policy labs, and community organizations are encouraged to participate.
What topics you can explore: Participants can analyze green space data in tandem with various other datasets that include health, environment, and/or public safety data.
Submissions and evaluation: Participants will conduct their analyses and submit a short project narrative that describes the research question, analytic approach, and key findings. We encourage participants to find creative ways to incorporate visualizations and other aspects of data storytelling to create a compelling narrative. In addition to their completed analysis with the indicators they used, participants will be asked to submit documentation describing each step of their process. The documentation should be detailed enough to make the project fully replicable.
The narrative, indicator and methods, and organization and documentation of each participant’s project will be evaluated for relevance, completeness, and quality. More information on evaluation can be found in the FAQs at the bottom of this page.
Accommodation requests related to a disability should be made by 1/27/23 to firstname.lastname@example.org.
There will be separate prize categories for submissions examining the effects of green space on the following subject areas:
|Best Graduate Student Submission||one $1,000 prize|
|Best Undergraduate Submission||one $1,000 prize|
Participating teams will be listed on MDI’s Place-Based Indicators project webpage. Winners will be invited to present their project both at a webinar hosted by the Association of Public Data Users (APDU) and at APDU’s annual conference in July 2023.
We will provide access to the following datasets. Challenge participants are welcome to bring their own data sources to be added or to use sources from our companion site, the Environmental Impact Data Collaborative (see our FAQs for more details).
A variety of datasets on community health, public safety, physical environment, and specific populations are available for use in your analysis through the EIDC and from other sources.
|Dataset||Description||Geographic Coverage||Data type(s)||Estimated difficulty|
|EnviroAtlas (US EPA)||This data shows land cover for 30 US urban areas at 1-meter spatial resolution. Land cover data present a “bird’s-eye” view that can help identify important features, patterns, and relationships in the landscape.||30 U.S. Communities. See full list here.||ESRI FileGDB, CSV||🌳🌳🌳|
|Provision and Access to Open Spaces in Cities||Shows average share of built-up area in nine urban areas that is open public space, as well as percent of population living within 400 meters walking distance of open public space.||9 US cities and urban areas.||CSV (various)||🌳|
|ParkServe Data||Comprehensive database of local parks in census-defined urban areas.||Census-defined urban areas.||ESRI Shapefile, CSV||🌳🌳🌳|
|Green space data by census block||Illustrates the square meters of total land per person within each census block group that is covered by green space.||National||ESRI FileGDB, CSV||🌳🌳|
|Climate and Economic Justice Screening Tool||Highlights disadvantaged census tracts across all 50 states, the District of Columbia, and the U.S. territories.||National (census tract)||Varies||🌳🌳|
|Tree Equity Score||Metric intended to help communities assess how well they are delivering equitable tree canopy cover to all residents.||National (by state)||ESRI Shapefile||🌳|
|PAD-US-AR||Curated dataset based on the Protected Areas Database-United States (PAD-US) from the USGS.||National||Varies||🌳🌳|
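As a rough illustration of what turning green space data into a community indicator might look like, the sketch below computes square meters of green space per person for a few census block groups and flags those that fall below a cutoff. Everything in it is an assumption made for illustration: the field names (`geoid`, `green_m2`, `population`), the sample values, and the 10 m² per person threshold are hypothetical and are not taken from any of the challenge datasets.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BlockGroup:
    geoid: str        # hypothetical census block group identifier
    green_m2: float   # square meters of land covered by green space
    population: int   # resident population


def green_space_per_person(bg: BlockGroup) -> Optional[float]:
    """Return square meters of green space per resident, or None if unpopulated."""
    if bg.population <= 0:
        return None
    return bg.green_m2 / bg.population


def flag_low_access(block_groups: List[BlockGroup],
                    threshold_m2: float = 10.0) -> List[str]:
    """Return identifiers whose per-person green space is below the threshold.

    The default threshold is an arbitrary illustrative value, not a standard.
    """
    flagged = []
    for bg in block_groups:
        per_person = green_space_per_person(bg)
        if per_person is not None and per_person < threshold_m2:
            flagged.append(bg.geoid)
    return flagged


# Made-up sample records, not real data.
sample = [
    BlockGroup("A", 50_000.0, 1_000),  # 50 m2 per person
    BlockGroup("B", 4_000.0, 800),     # 5 m2 per person
    BlockGroup("C", 12_000.0, 0),      # unpopulated, skipped
]

low_access = flag_low_access(sample)
print(low_access)
```

In an actual submission, a flag like this would typically be joined to a health, safety, or environmental outcome for the same geography, and the choice of threshold would need to be justified in the project documentation.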
Q: Can I register as an individual or am I required to participate through a team? What is the team size limit?
A: You can choose to either work individually or work within a team of up to four people, including yourself.
Q: Am I limited to a single submission? How should I submit my files?
A: Yes, each individual or team is limited to one submission. One person cannot submit multiple entries or participate through multiple teams.
Q: Can I present my findings from this competition afterwards, in a different research space?
A: Yes, in fact, it is encouraged! You will not be able to download the data from Redivis, but you are welcome to share your findings with others.
Q: I want to use additional datasets – will I be ineligible if I cannot submit this data?
A: You may submit a request to add additional public data to the Challenge workspace. Note: The data must be made available to all participants. To submit a request to add public data to the Challenge, please email email@example.com by Wednesday, February 8th at 5:00 PM EST.
Q: Can I use other datasets publicly available on Redivis in the challenge?
A: Yes! You can use any dataset that’s publicly available on Redivis (even if it’s not from the Data Challenge or EIDC) that’s relevant for your analysis. If you plan to do so, we recommend running the dataset you’re considering by us to confirm that it’s a good fit for the challenge.
Q: How will the challenge be assessed?
A: Each judge will use the same rubric to score submissions – scoring is based on a range of factors (e.g., the overall narrative, methods used, visualization, and creativity). Participants will conduct their analyses and submit a short project narrative that describes the research question, analytic approach, and key findings. We encourage participants to find creative ways to incorporate visualizations and other aspects of data storytelling to create a compelling narrative.
The effort and depth shown in the project narrative in answering the research question will be given the most weight, followed by the quality and utility of the resulting indicator or finding. The final assessment criterion is participants’ organization and level of documentation.
Q: Should I participate even if I don’t have the same level of experience as others?
A: We expect a broad range of experience: some participants will have minimal data experience and others will have more background. This competition is aimed at people who are still in college (undergraduate or graduate) or are early career professionals, but anyone is welcome to participate. Substantial experience working with these types of data is not required.
Q: If my submission touches on more than one subject area, will it be evaluated for all of the relevant subject areas or just one area?
A: Submissions can only be considered for one subject area. If your submission touches on multiple subject areas, please indicate clearly which area you would like your project to be considered for.
Q: Will submissions be publicized after?
A: Only the submissions that place in the competition will be publicized. The Massive Data Institute will arrange a webinar with APDU where competition winners will have an opportunity to present their work.