OpenCodePapers

Common Sense Reasoning on PARus
Results over time
Leaderboard
| Paper | Code | Accuracy | Model Name | Release Date |
|---|---|---|---|---|
| RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark | ✓ | 0.982 | Human Benchmark | 2020-10-29 |
| | | 0.908 | Golden Transformer | |
| | | 0.766 | YaLM 1.0B few-shot | |
| | | 0.676 | RuGPT3XL few-shot | |
| | | 0.660 | ruT5-large-finetune | |
| | | 0.598 | RuGPT3Medium | |
| | | 0.584 | RuGPT3Large | |
| | | 0.574 | RuBERT plain | |
| | | 0.562 | RuGPT3Small | |
| | | 0.554 | ruT5-base-finetune | |
| | | 0.528 | Multilingual BERT | |
| | | 0.508 | ruRoberta-large finetune | |
| | | 0.508 | RuBERT conversational | |
| mT5: A massively multilingual pre-trained text-to-text transformer | ✓ | 0.504 | MT5 Large | 2020-10-22 |
| | | 0.498 | SBERT_Large_mt_ru_finetuning | |
| | | 0.498 | SBERT_Large | |
| Unreasonable Effectiveness of Rule-Based Heuristics in Solving Russian SuperGLUE Tasks | | 0.498 | majority_class | 2021-05-03 |
| | | 0.492 | ruBert-large finetune | |
| RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark | ✓ | 0.486 | Baseline TF-IDF1.1 | 2020-10-29 |
| Unreasonable Effectiveness of Rule-Based Heuristics in Solving Russian SuperGLUE Tasks | | 0.480 | Random weighted | 2021-05-03 |
| Unreasonable Effectiveness of Rule-Based Heuristics in Solving Russian SuperGLUE Tasks | | 0.478 | heuristic majority | 2021-05-03 |
| | | 0.476 | ruBert-base finetune | |
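The single metric on this leaderboard is accuracy: PARus is a binary-choice task, so a model scores the fraction of items where its chosen alternative matches the gold one. A minimal sketch of the computation, with hypothetical predictions and gold labels (not taken from any listed submission):

```python
def accuracy(predictions, labels):
    """Fraction of examples where the predicted choice matches the gold choice."""
    assert len(predictions) == len(labels) and labels, "need equal-length, non-empty lists"
    correct = sum(p == g for p, g in zip(predictions, labels))
    return correct / len(labels)

# Hypothetical example: each PARus item picks alternative 0 or 1.
preds = [1, 0, 1, 1, 0]
gold  = [1, 0, 0, 1, 1]
print(accuracy(preds, gold))  # → 0.6
```

Note that with two alternatives, chance performance is 0.5, which is why the heuristic and majority-class baselines near the bottom of the table cluster around 0.48-0.50.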