Amir Zeldes

I am a computational linguist specializing in work on and with corpora, including corpus linguistics studies, building corpora, and creating annotation interfaces and NLP tools that make corpus creation easier. I also run the Georgetown University Corpus Linguistics lab, Corpling@GU, and I am currently president of the ACL Special Interest Group on Annotation (SIGANN).

My main research interests are in computational models of discourse above the sentence level: I study "how we construct discourse, given what we want to say". In particular, I have been working on predictive computational models of referentiality and discourse relations. Which entities do we track in conversation? How are they introduced into the discourse and referred back to? How do we recognize discourse relations, which signal how a current utterance relates to preceding or subsequent utterances, for example by contrasting with other claims or supporting them with evidence? How do we signal the main point of a text or a paragraph, and how do we signal supporting information?

For example, if we read even a very short text such as "Yun fell. Kim pushed her", we infer a lot of things: we understand that there are two events, that the same two people were involved in both (her=Yun), that the second probably happened before the first, and that the first was caused by the second. But how do we do this? And can we make computers understand these kinds of inferences? It turns out that computers find this very hard!

Academic Appointment(s)

Primary
Associate Professor, College - Department of Linguistics