Evaluation Metrics For End-to-End Coreference Resolution Systems

Jie Cai and Michael Strube


Commonly used coreference resolution evaluation metrics can only be applied to key mentions. We propose two variants of the B-cubed and CEAF coreference resolution evaluation algorithms that can be applied to coreference resolution systems dealing with system mentions. We describe experiments showing that our variants lead to intuitive and reliable results.