OpenCodePapers

Reading Comprehension on MuSeRC

Reading Comprehension
Results over time: interactive chart of Average F1 and EM by release date (not reproduced here).
Leaderboard
| Paper | Code | Average F1 | EM | Model Name | Release Date |
|---|---|---|---|---|---|
| | | 0.941 | 0.819 | Golden Transformer | |
| mT5: A massively multilingual pre-trained text-to-text transformer | ✓ | 0.844 | 0.543 | MT5 Large | 2020-10-22 |
| | | 0.83 | 0.561 | ruRoberta-large finetune | |
| | | 0.815 | 0.537 | ruT5-large-finetune | |
| RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark | ✓ | 0.806 | 0.42 | Human Benchmark | 2020-10-29 |
| | | 0.769 | 0.446 | ruT5-base-finetune | |
| | | 0.76 | 0.427 | ruBert-large finetune | |
| | | 0.742 | 0.399 | ruBert-base finetune | |
| | | 0.74 | 0.546 | RuGPT3XL few-shot | |
| | | 0.729 | 0.333 | RuGPT3Large | |
| | | 0.711 | 0.324 | RuBERT plain | |
| | | 0.706 | 0.308 | RuGPT3Medium | |
| | | 0.687 | 0.278 | RuBERT conversational | |
| | | 0.673 | 0.364 | YaLM 1.0B few-shot | |
| Unreasonable Effectiveness of Rule-Based Heuristics in Solving Russian SuperGLUE Tasks | | 0.671 | 0.237 | heuristic majority | 2021-05-03 |
| | | 0.653 | 0.221 | RuGPT3Small | |
| | | 0.646 | 0.327 | SBERT_Large | |
| | | 0.642 | 0.319 | SBERT_Large_mt_ru_finetuning | |
| | | 0.639 | 0.239 | Multilingual Bert | |
| RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark | ✓ | 0.587 | 0.242 | Baseline TF-IDF1.1 | 2020-10-29 |
| Unreasonable Effectiveness of Rule-Based Heuristics in Solving Russian SuperGLUE Tasks | | 0.45 | 0.071 | Random weighted | 2021-05-03 |
| Unreasonable Effectiveness of Rule-Based Heuristics in Solving Russian SuperGLUE Tasks | | 0.0 | 0.0 | majority_class | 2021-05-03 |
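
For readers comparing entries: MuSeRC follows the MultiRC format, where each question comes with several candidate answers and each answer is independently labeled correct or incorrect. EM counts a question as solved only when every one of its answer labels is predicted correctly, while the F1 column scores the binary answer labels themselves. Below is a minimal scoring sketch under those assumptions; the function name `muserc_scores` is invented for illustration, and the choice of micro-averaged F1 over all answer instances (rather than a per-question macro average, which "Average F1" could also denote) is an assumption, not the benchmark's official scorer.

```python
from typing import Dict, List


def muserc_scores(gold: Dict[str, List[int]], pred: Dict[str, List[int]]) -> Dict[str, float]:
    """Score MuSeRC-style predictions (illustrative sketch, not the official scorer).

    gold/pred map a question id to one 0/1 label per candidate answer
    (1 = the answer is marked correct for that question).
    """
    tp = fp = fn = 0
    exact = 0
    for qid, gold_labels in gold.items():
        pred_labels = pred[qid]
        # EM: the full label vector for the question must match exactly.
        exact += int(gold_labels == pred_labels)
        # F1 here is micro-averaged over all answer instances (an assumption).
        for g, p in zip(gold_labels, pred_labels):
            tp += int(g == 1 and p == 1)
            fp += int(g == 0 and p == 1)
            fn += int(g == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"f1": f1, "em": exact / len(gold) if gold else 0.0}


# Toy example: question q1 has three candidate answers, q2 has two.
gold = {"q1": [1, 0, 1], "q2": [0, 1]}
pred = {"q1": [1, 0, 0], "q2": [0, 1]}
print(muserc_scores(gold, pred))  # ≈ {'f1': 0.8, 'em': 0.5}
```

The toy example shows why EM trails F1 on the leaderboard: q1 gets most of its answer labels right (helping F1) but misses one, so it contributes nothing to EM.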