OpenCodePapers

Reading Comprehension on MuSeRC

Reading Comprehension
Results over time: interactive chart of Average F1 and EM by release date (not reproduced here).
Leaderboard
| Paper | Code | Average F1 | EM | Model Name | Release Date |
|---|---|---|---|---|---|
| | | 0.941 | 0.819 | Golden Transformer | |
| mT5: A massively multilingual pre-trained text-to-text transformer | ✓ | 0.844 | 0.543 | MT5 Large | 2020-10-22 |
| | | 0.83 | 0.561 | ruRoberta-large finetune | |
| | | 0.815 | 0.537 | ruT5-large-finetune | |
| RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark | ✓ | 0.806 | 0.42 | Human Benchmark | 2020-10-29 |
| | | 0.769 | 0.446 | ruT5-base-finetune | |
| | | 0.76 | 0.427 | ruBert-large finetune | |
| | | 0.742 | 0.399 | ruBert-base finetune | |
| | | 0.74 | 0.546 | RuGPT3XL few-shot | |
| | | 0.729 | 0.333 | RuGPT3Large | |
| | | 0.711 | 0.324 | RuBERT plain | |
| | | 0.706 | 0.308 | RuGPT3Medium | |
| | | 0.687 | 0.278 | RuBERT conversational | |
| | | 0.673 | 0.364 | YaLM 1.0B few-shot | |
| Unreasonable Effectiveness of Rule-Based Heuristics in Solving Russian SuperGLUE Tasks | | 0.671 | 0.237 | heuristic majority | 2021-05-03 |
| | | 0.653 | 0.221 | RuGPT3Small | |
| | | 0.646 | 0.327 | SBERT_Large | |
| | | 0.642 | 0.319 | SBERT_Large_mt_ru_finetuning | |
| | | 0.639 | 0.239 | Multilingual Bert | |
| RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark | ✓ | 0.587 | 0.242 | Baseline TF-IDF1.1 | 2020-10-29 |
| Unreasonable Effectiveness of Rule-Based Heuristics in Solving Russian SuperGLUE Tasks | | 0.45 | 0.071 | Random weighted | 2021-05-03 |
| Unreasonable Effectiveness of Rule-Based Heuristics in Solving Russian SuperGLUE Tasks | | 0.0 | 0.0 | majority_class | 2021-05-03 |
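
For readers comparing entries: MuSeRC follows the MultiRC format, where each question comes with several candidate answers and each answer is independently labeled correct or incorrect. EM counts a question as solved only when every one of its answer labels is predicted correctly, while the F1 column scores the binary answer labels themselves. Below is a minimal scoring sketch under those assumptions; the function name `muserc_scores` is invented for illustration, and the choice of micro-averaged F1 over all answer instances (rather than a per-question macro average, which "Average F1" could also denote) is an assumption, not the benchmark's official scorer.

```python
from typing import Dict, List


def muserc_scores(gold: Dict[str, List[int]], pred: Dict[str, List[int]]) -> Dict[str, float]:
    """Score MuSeRC-style predictions (illustrative sketch, not the official scorer).

    gold/pred map a question id to one 0/1 label per candidate answer
    (1 = the answer is marked correct for that question).
    """
    tp = fp = fn = 0
    exact = 0
    for qid, gold_labels in gold.items():
        pred_labels = pred[qid]
        # EM: the full label vector for the question must match exactly.
        exact += int(gold_labels == pred_labels)
        # F1 here is micro-averaged over all answer instances (an assumption).
        for g, p in zip(gold_labels, pred_labels):
            tp += int(g == 1 and p == 1)
            fp += int(g == 0 and p == 1)
            fn += int(g == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"f1": f1, "em": exact / len(gold) if gold else 0.0}


# Toy example: question q1 has three candidate answers, q2 has two.
gold = {"q1": [1, 0, 1], "q2": [0, 1]}
pred = {"q1": [1, 0, 0], "q2": [0, 1]}
print(muserc_scores(gold, pred))  # ≈ {'f1': 0.8, 'em': 0.5}
```

The toy example shows why EM trails F1 on the leaderboard: q1 gets most of its answer labels right (helping F1) but misses one, so it contributes nothing to EM.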