Two Years of GUITAR

When I finished graduate school and swapped coasts from California to DC amid the first summer of the COVID pandemic, I needed the comfort of a familiar routine—so I started a text-as-data working group. Text-as-data, also known as computational text analysis, describes the use of coding and software to analyze digital texts. Researchers need the skills to work with the increasingly available, massive text data in social media, digitized archives, websites, and more.

As an amateur guitarist who loves the music of language, I couldn’t resist calling this new working group Georgetown University’s Interdisciplinary Text Analysis Research (GUITAR). With MDI’s support, GUITAR took the (virtual) stage to provide an open, supportive space for scholars across the university to explore and practice text-as-data alongside researchers from industry and other universities. GUITAR’s mission is to build an interdisciplinary and collaborative research community by supporting learning for students, faculty, and researchers working with text data. GUITAR is open to all and assumes no prior training (it complements Georgetown University Computational Linguistics (GUCL), a working group at the intersection of language and computation).

I kicked off the group on November 23, 2020, by presenting my research analyzing charter school websites and organizational scholarship. GUITAR meets monthly online. Meetings feature 2-3 speakers with varied backgrounds and projects and highlight in-progress research, lessons learned in industry, and tutorials; the group also holds social meetings. Past talk titles and meeting topics include:

  • “Distributed Text Processing with Apache Spark” by Colton Padden, during the 5/5/21 meeting on Optimization and Ethics in NLP
  • “A Bibliometric Horizon Scanning Methodology for Identifying Emerging Topics in the Scientific Literature” by Dr. Ryan Zelnio at the Office of Naval Research, during the 12/1/21 meeting on When Computers Read Literature
  • “Detecting Client Reasons for Calling and Agent Expressions of Empathy” by Yasi Haghpanah & Mark Arehart at Qualtrics, during the 4/20/22 meeting on Learning about People from Writing

Member testimonials suggest GUITAR plays an important role in educating the broader community about the role of text-as-data in research:

“I loved the different presenters and charts and graphs. It was a great place to learn about a new field and enjoy fellow academics’ company.” -Brian Holland, GU Master’s student in Data Science for Public Policy

“For me, the best part of GUITAR was being able to listen to talks from established researchers on projects they had worked out.” -Micah Musser, Research Analyst at the Center for Security and Emerging Technology at Georgetown University

“The diversity of perspectives helps us all come away from meetings with new ideas and advance progress towards our (sometimes very different) goals.” -Jordan Jasuta Fischer, IBM

For more information on topics covered by GUITAR, check out our openly accessible meeting notes archive. To join GUITAR and receive meeting invites, RSVP here. Members also have access to slides and recordings from past meetings. We hope to see you at the next GUITAR meeting!

Meeting participants above a presentation intro slide for the Spring 2022 GUITAR meeting on “Detecting client reasons for calling and agent expressions of empathy.” Participants included (clockwise from top left) Yujia Liu, Jaren Haber, Mark Arehart, Colton Padden, Kibum Moon, Yasi Haghpanah, Nathan Lado, and Emily Penner.

Written by Dr. Jaren Haber, a Postdoctoral Fellow with Georgetown University’s Massive Data Institute. His research applies computational methods to study how organizational contexts, social categories, and media segmentation shape the impacts of structural inequalities. He also leads the GU Interdisciplinary Text Analysis Research (GUITAR) working group. Dr. Haber received his PhD in Sociology from the University of California, Berkeley in 2020.
