Workshop Highlight: Advanced Models Using Text

Written by Tilde Jaques, MDI Journalism Intern

Artificial intelligence seems to be dominating the headlines lately. Whether it’s new technologies being developed or the recent Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence, technology is a major theme of public policy today.

As part of the MDI Fall Workshop series exploring text as data, the Massive Data Institute (MDI) hosted the series’ second workshop, “Advanced Models Using Text,” with MDI Fellow Dr. Helge Marahrens on October 23 and 24.

This workshop, which took place at the McCourt School of Public Policy on Georgetown’s main campus, covered several kinds of models for working with text, including topic-noise models, word embeddings, and neural networks.

Dr. Marahrens spoke in depth about the role of neural networks in sentiment analysis and emotion detection, a topic made newly relevant by the emergence of sophisticated artificial intelligence such as ChatGPT. He discussed the benefits and challenges of neural networks, focusing on how social media platforms like Facebook have used them to analyze user data. He explained that ultimately “whether these models become a utopia or dystopia is up to what we do with them.”

While much of the discourse around technology like neural networks tends to be negative, models using text can also have extremely valuable uses. For example, properly trained neural networks can be very effective at analyzing text and image data. Dr. Marahrens’ workshop focused specifically on how we can present text and image data to computers in a form they can process and then analyze.
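For readers curious what presenting text to a computer can look like in practice, here is a minimal sketch of one common approach, a bag-of-words count vector. It is an illustration only and is not taken from the workshop materials; the function names and the two example sentences are hypothetical.

```python
from collections import Counter


def build_vocabulary(documents):
    """Map each distinct lowercase word to a column index, in sorted order."""
    words = sorted({word for doc in documents for word in doc.lower().split()})
    return {word: index for index, word in enumerate(words)}


def to_count_vector(document, vocabulary):
    """Turn one document into a list of word counts, one slot per vocabulary word."""
    counts = Counter(document.lower().split())
    return [counts.get(word, 0) for word in vocabulary]


# Hypothetical example sentences, not drawn from the workshop.
documents = ["the service was great", "the wait was long"]
vocabulary = build_vocabulary(documents)
vectors = [to_count_vector(doc, vocabulary) for doc in documents]

print(vocabulary)
print(vectors)
```

Each sentence becomes a row of numbers that a model can compare or learn from. Word embeddings and neural networks build on the same idea with richer numeric representations.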

MDI workshops are open to the whole Georgetown community: students, faculty, and staff. Among the participants was Professor Emisa Nategh, PhD, who has been working on a new research project called “Improving Customer Experiences in Service Settings and Increasing Fairness in Operations: Evidence from Twitter.” This project requires in-depth knowledge of text analysis models. She is also working on a project focusing on cancer patients, which she says involves “multiple notes from various sources such as doctors, nurses, and laboratories.”

Professor Nategh attended this workshop to gain insight into advanced models in text mining in order to enhance her capabilities for these projects. She says that this workshop “presented excellent examples of text mining code,” and that Dr. Marahrens “guided [participants] through the code, explaining each line in detail.”

Associate Professor Vicki Wei Tang, PhD, also attended the workshop. Professor Tang has been working on research involving social media data. Recently, she published a paper about a project in which she “developed a way to aggregate information on whether consumers liked a product to see how social media influences sales and revenue.”

Professor Tang explained that she came to the workshop to “figure out how [text modeling] techniques apply to my research.”

This workshop was the second in a series of workshops presented this fall by MDI Fellows. The next workshop in the series will be “Cutting Large Language Models Down to Size” with Dr. Nathan Wycoff. It will take place in St. Mary’s Hall on November 6 and 7 from 4:00 to 5:30 p.m.

Tagged: Fellows; Jaques; MDI Workshops; MDI Workshops Fall 2023; workshops