OpenCodePapers

Common Sense Reasoning on RuCoS
Leaderboard
| Model | Average F1 | EM | Paper | Code | Release Date |
|---|---|---|---|---|---|
| Human Benchmark | 0.93 | 0.89 | RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark | ✓ | 2020-10-29 |
| Golden Transformer | 0.92 | 0.924 | – | – | – |
| YaLM 1.0B few-shot | 0.86 | 0.859 | – | – | – |
| ruT5-large-finetune | 0.81 | 0.764 | – | – | – |
| ruT5-base-finetune | 0.79 | 0.752 | – | – | – |
| ruBert-base finetune | 0.74 | 0.716 | – | – | – |
| ruRoberta-large finetune | 0.73 | 0.716 | – | – | – |
| ruBert-large finetune | 0.68 | 0.658 | – | – | – |
| RuGPT3XL few-shot | 0.67 | 0.665 | – | – | – |
| MT5 Large | 0.57 | 0.562 | mT5: A massively multilingual pre-trained text-to-text transformer | ✓ | 2020-10-22 |
| SBERT_Large | 0.36 | 0.351 | – | – | – |
| SBERT_Large_mt_ru_finetuning | 0.35 | 0.347 | – | – | – |
| RuBERT plain | 0.32 | 0.314 | – | – | – |
| Multilingual Bert | 0.29 | 0.29 | – | – | – |
| heuristic majority | 0.26 | 0.257 | Unreasonable Effectiveness of Rule-Based Heuristics in Solving Russian SuperGLUE Tasks | – | 2021-05-03 |
| Baseline TF-IDF1.1 | 0.26 | 0.252 | RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark | ✓ | 2020-10-29 |
| Random weighted | 0.25 | 0.247 | Unreasonable Effectiveness of Rule-Based Heuristics in Solving Russian SuperGLUE Tasks | – | 2021-05-03 |
| majority_class | 0.25 | 0.247 | Unreasonable Effectiveness of Rule-Based Heuristics in Solving Russian SuperGLUE Tasks | – | 2021-05-03 |
| RuGPT3Medium | 0.23 | 0.224 | – | – | – |
| RuBERT conversational | 0.22 | 0.218 | – | – | – |
| RuGPT3Small | 0.21 | 0.204 | – | – | – |
| RuGPT3Large | 0.21 | 0.202 | – | – | – |
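The two columns above are the standard reading-comprehension metrics: EM (exact match) scores 1 only when the normalized prediction equals a gold answer, while F1 measures token overlap between prediction and gold; with multiple acceptable gold answers, the best score per example is kept and then averaged over the dataset. A minimal sketch of these definitions (SQuAD-style normalization is simplified here; this is not the official RuCoS scorer):

```python
from collections import Counter

def normalize(text: str) -> list[str]:
    # Lowercase and split on whitespace; real scorers also strip punctuation.
    return text.lower().split()

def exact_match(prediction: str, gold: str) -> float:
    return float(normalize(prediction) == normalize(gold))

def token_f1(prediction: str, gold: str) -> float:
    pred_tokens, gold_tokens = normalize(prediction), normalize(gold)
    # Multiset intersection counts shared tokens, respecting duplicates.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

def score(prediction: str, golds: list[str]) -> tuple[float, float]:
    # Take the best EM and best F1 over all acceptable gold answers.
    return (max(exact_match(prediction, g) for g in golds),
            max(token_f1(prediction, g) for g in golds))
```

For example, `score("Владимир Путин", ["Путин", "Владимир Путин"])` gives EM 1.0 and F1 1.0, while a partial answer such as `score("Путин", ["Владимир Путин"])` gives EM 0.0 but F1 2/3 (precision 1, recall 0.5).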