Modeling and Predicting Quality in Spoken Human-Computer Interaction

Alexander Schmitt, Benjamin Schatz, Wolfgang Minker
Ulm University


Abstract

In this work, we describe the modeling and prediction of Interaction Quality (IQ) in Spoken Dialogue Systems (SDS) using Support Vector Machines. The model can be employed to estimate the quality of the ongoing interaction at arbitrary points in a spoken human-computer interaction. We show that using 52 fully automatic features characterizing the system-user exchange significantly outperforms state-of-the-art approaches. The model is evaluated on publicly available data from the CMU Let’s Go Bus Information system. It reaches a performance of 61.6% unweighted average recall when discriminating between 5 classes (good to very poor). We further show that incorporating knowledge about the user’s emotional state hardly improves performance.
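Unweighted average recall (UAR), the metric reported above, is the mean of the per-class recalls, so each of the 5 classes contributes equally regardless of how often it occurs. The following is a minimal sketch of that computation, not the evaluation code used in this work. The function name, the numeric encoding of the classes (5 for good down to 1 for very poor), and the toy labels are all assumptions made for illustration.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of per-class recalls (each class weighted equally)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one labeled sample is required")

    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1

    # Recall is computed only for classes that occur in y_true.
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical toy labels on an assumed 5 (good) to 1 (very poor) scale.
y_true = [5, 5, 5, 5, 4, 4, 3, 2, 1, 1]
y_pred = [5, 5, 5, 4, 4, 3, 3, 2, 1, 2]

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"UAR = {uar:.3f}, accuracy = {accuracy:.3f}")
```

On these toy labels the two measures differ because the classes are unevenly represented. With skewed class distributions, plain accuracy is dominated by the most frequent class, whereas UAR is not.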