Discourse indicators for content selection in summarization

Annie Louis, Aravind Joshi, Ani Nenkova
University of Pennsylvania

Abstract

We present analyses aimed at eliciting which specific aspects of discourse provide the strongest indication for text importance. In the context of content selection for single-document summarization of news, we examine the benefits of both the graph structure of text provided by discourse relations and the semantic sense of these relations. We find that the structure information from discourse is the most robust indicator of importance. Semantic sense only provides constraints on content selection but is not indicative of important content by itself. However, sense features complement structure information and lead to improved performance. Further, both types of discourse information prove complementary to non-discourse features. While our results establish the usefulness of discourse features, we also find that lexical overlap provides a simple and cheap alternative to discourse for computing text structure, with comparable performance for the task of content selection.