OpenCodePapers

Dialogue Evaluation on USR-TopicalChat

Dialogue Evaluation
Dataset Link
Leaderboard
| Paper | Code | Spearman Correlation | Pearson Correlation | Model Name | Release Date |
|---|---|---|---|---|---|
| MDD-Eval: Self-Training on Augmented Data for Multi-Domain Dialogue Evaluation | ✓ | 0.5109 | 0.4575 | MDD-Eval | 2021-12-14 |
| Proxy Indicators for the Quality of Open-domain Dialogues | ✓ | 0.4877 | 0.4974 | Lin-Reg (all) | |
| USR: An Unsupervised and Reference Free Evaluation Metric for Dialog Generation | ✓ | 0.4192 | 0.4220 | USR | 2020-05-01 |
| USR: An Unsupervised and Reference Free Evaluation Metric for Dialog Generation | ✓ | 0.3245 | 0.4068 | USR - DR (x = c) | 2020-05-01 |
| USR: An Unsupervised and Reference Free Evaluation Metric for Dialog Generation | ✓ | 0.3086 | 0.3345 | USR - MLM | 2020-05-01 |
| USR: An Unsupervised and Reference Free Evaluation Metric for Dialog Generation | ✓ | 0.1419 | 0.3221 | USR - DR (x = f) | 2020-05-01 |
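
The leaderboard reports Spearman and Pearson correlations but does not say how they are computed. On benchmarks of this kind, these are typically correlations between an automatic metric's scores and human quality ratings for the same responses. The following is a minimal sketch of that computation in plain Python, not the procedure used by any listed paper. The function names and the sample scores are hypothetical and are not taken from the dataset.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation: linear agreement between two score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: metric scores and human ratings for five responses.
metric_scores = [0.12, 0.48, 0.33, 0.91, 0.57]
human_ratings = [1.0, 3.0, 2.0, 5.0, 3.0]

print("Spearman:", round(spearman(metric_scores, human_ratings), 4))
print("Pearson: ", round(pearson(metric_scores, human_ratings), 4))
```

Spearman depends only on the ordering of the scores, while Pearson is sensitive to their actual values, which is one reason the two columns can rank the listed models differently.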