OpenCodePapers

Common Sense Reasoning on PARus
Results over time
Leaderboard
| Paper | Code | Accuracy | Model Name | Release Date |
|---|---|---|---|---|
| RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark | ✓ | 0.982 | Human Benchmark | 2020-10-29 |
| | | 0.908 | Golden Transformer | |
| | | 0.766 | YaLM 1.0B few-shot | |
| | | 0.676 | RuGPT3XL few-shot | |
| | | 0.660 | ruT5-large-finetune | |
| | | 0.598 | RuGPT3Medium | |
| | | 0.584 | RuGPT3Large | |
| | | 0.574 | RuBERT plain | |
| | | 0.562 | RuGPT3Small | |
| | | 0.554 | ruT5-base-finetune | |
| | | 0.528 | Multilingual BERT | |
| | | 0.508 | ruRoberta-large finetune | |
| | | 0.508 | RuBERT conversational | |
| mT5: A massively multilingual pre-trained text-to-text transformer | ✓ | 0.504 | MT5 Large | 2020-10-22 |
| | | 0.498 | SBERT_Large_mt_ru_finetuning | |
| | | 0.498 | SBERT_Large | |
| Unreasonable Effectiveness of Rule-Based Heuristics in Solving Russian SuperGLUE Tasks | | 0.498 | majority_class | 2021-05-03 |
| | | 0.492 | ruBert-large finetune | |
| RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark | ✓ | 0.486 | Baseline TF-IDF1.1 | 2020-10-29 |
| Unreasonable Effectiveness of Rule-Based Heuristics in Solving Russian SuperGLUE Tasks | | 0.480 | Random weighted | 2021-05-03 |
| Unreasonable Effectiveness of Rule-Based Heuristics in Solving Russian SuperGLUE Tasks | | 0.478 | heuristic majority | 2021-05-03 |
| | | 0.476 | ruBert-base finetune | |
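The single metric on this leaderboard is accuracy: PARus is a binary-choice task, so a model scores the fraction of items where its chosen alternative matches the gold one. A minimal sketch of the computation, with hypothetical predictions and gold labels (not taken from any listed submission):

```python
def accuracy(predictions, labels):
    """Fraction of examples where the predicted choice matches the gold choice."""
    assert len(predictions) == len(labels) and labels, "need equal-length, non-empty lists"
    correct = sum(p == g for p, g in zip(predictions, labels))
    return correct / len(labels)

# Hypothetical example: each PARus item picks alternative 0 or 1.
preds = [1, 0, 1, 1, 0]
gold  = [1, 0, 0, 1, 1]
print(accuracy(preds, gold))  # → 0.6
```

Note that with two alternatives, chance performance is 0.5, which is why the heuristic and majority-class baselines near the bottom of the table cluster around 0.48-0.50.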