Asli Celikyilmaz, Microsoft Research
Title: Neural Text Generation: Progress and Challenges
Abstract: Automatic text generation enables computers to summarize text, describe pictures to the visually impaired, write stories or articles about an event, hold customer-service conversations, chit-chat with individuals, and much more. Neural text generation – using neural network models to generate coherent text – has seen a paradigm shift in recent years, driven by advances in deep contextual language modeling (e.g., LSTMs, GPT) and transfer learning (e.g., ELMo, BERT). While these tools have dramatically improved the state of text generation, particularly for low-resource tasks, state-of-the-art neural text generation models still face many challenges: a lack of diversity in generated text, commonsense violations in depicted situations, difficulties in making use of multi-modal input, and many more. I will discuss existing technology for generating text with better discourse structure or narrative flow, or text that uses world knowledge more intelligently. I will conclude the talk with a discussion of the current challenges and shortcomings of neural text generation, pointing to avenues for future research.
Bio:
Asli Celikyilmaz is a Principal Researcher at Microsoft Research (MSR) in Redmond, Washington. She is also an Affiliate Professor at the University of Washington. She received her Ph.D. in Information Science from the University of Toronto, Canada, and completed postdoctoral work in the Computer Science Department at the University of California, Berkeley. Her research interests are mainly in deep learning and natural language, specifically language generation with long-term coherence, language understanding, language grounding with vision, and building intelligent agents for human-computer interaction. She serves on the editorial boards of Transactions of the ACL (TACL) as area editor and of the Open Journal of Signal Processing (OJSP) as associate editor. She has received several “best of” awards, including at NAFIPS 2007, Semantic Computing 2009, and CVPR 2019.
Diane Litman, University of Pittsburgh
Title: Argument Mining, Discourse Analysis, and Educational Applications
Abstract: The written and spoken arguments of students are educational data that can be automatically mined for purposes such as student assessment or teacher professional development. This talk will illustrate some of the opportunities and challenges in educationally oriented argument mining. I will first describe how we are using discourse analysis to improve argument mining systems that are being embedded in educational technologies for essay grading and for analyzing classroom discussions. I will then present intrinsic and extrinsic evaluation results for two of our argument mining systems, using benchmark persuasive essay corpora as well as our recently released Discussion Tracker corpus of collaborative argumentation in high school classrooms.
Bio: Diane Litman is Professor of Computer Science, Senior Scientist with the Learning Research and Development Center, and Faculty Co-Director of the Graduate Program in Intelligent Systems, all at the University of Pittsburgh. Her current research focuses on enhancing the effectiveness of educational technology through the use of spoken and natural language processing techniques such as argument mining, summarization, multi-party dialogue systems, and revision analysis. She is a Fellow of the Association for Computational Linguistics, has twice been elected Chair of the North American Chapter of the Association for Computational Linguistics, has co-authored multiple papers winning best paper awards, and was the SIGdial Program Co-Chair in 2018.
Gabriel Skantze, KTH Royal Institute of Technology
Title: Conversational Turn-taking in Human-robot Interaction
Abstract: The last decade has seen a breakthrough for speech interfaces, thanks largely to advances in speech recognition. Apart from voice assistants in smart speakers and phones, an emerging application area is social robots, which are expected to serve as receptionists, teachers, companions, coworkers, etc. Just as we prefer physical meetings over phone calls and video conferencing, social robots can potentially offer a much richer interaction experience than non-embodied dialogue systems. One example of this is the Furhat robot head, which started as a research project at KTH but is now used in commercial applications, such as serving as a concierge at airports and conducting job interviews.
However, even though this recent progress is very exciting, current dialogue systems are still limited in several ways, especially for human-robot interaction. In this talk, I will specifically address the modelling of conversational turn-taking. Because current systems lack the sophisticated coordination mechanisms found in human-human interaction, they are often plagued by interruptions or sluggish responses. In face-to-face conversation, we use various multi-modal signals for this coordination, including linguistic and prosodic cues as well as gaze and gestures. I will present our work on using deep learning to model these cues, which allows the system to predict, and even project, potential turn-shifts. I will also present user studies showing how the robot can regulate turn-taking in multi-party dialogue by employing various turn-taking signals. This can be used both to facilitate smoother interaction and to shape the turn-taking dynamics and participation equality in multi-party settings.
Bio: Gabriel Skantze is a professor of speech technology, with a specialization in dialogue systems, at KTH Royal Institute of Technology. His research focuses on the development of computational models for situated dialogue and human-robot interaction. He is also co-founder and chief scientist at Furhat Robotics, a Stockholm-based startup developing a platform for social robotics. Since 2019, he has been president of SIGdial.