| Paper | Code | Average F1 | EM | Model Name | Release Date |
|---|---|---|---|---|---|
| RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark | ✓ Link | 0.93 | 0.89 | Human Benchmark | 2020-10-29 |
| | | 0.92 | 0.924 | Golden Transformer | |
| | | 0.86 | 0.859 | YaLM 1.0B few-shot | |
| | | 0.81 | 0.764 | ruT5-large-finetune | |
| | | 0.79 | 0.752 | ruT5-base-finetune | |
| | | 0.74 | 0.716 | ruBert-base finetune | |
| | | 0.73 | 0.716 | ruRoberta-large finetune | |
| | | 0.68 | 0.658 | ruBert-large finetune | |
| | | 0.67 | 0.665 | RuGPT3XL few-shot | |
| mT5: A massively multilingual pre-trained text-to-text transformer | ✓ Link | 0.57 | 0.562 | MT5 Large | 2020-10-22 |
| | | 0.36 | 0.351 | SBERT_Large | |
| | | 0.35 | 0.347 | SBERT_Large_mt_ru_finetuning | |
| | | 0.32 | 0.314 | RuBERT plain | |
| | | 0.29 | 0.29 | Multilingual Bert | |
| Unreasonable Effectiveness of Rule-Based Heuristics in Solving Russian SuperGLUE Tasks | | 0.26 | 0.257 | heuristic majority | 2021-05-03 |
| RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark | ✓ Link | 0.26 | 0.252 | Baseline TF-IDF1.1 | 2020-10-29 |
| Unreasonable Effectiveness of Rule-Based Heuristics in Solving Russian SuperGLUE Tasks | | 0.25 | 0.247 | Random weighted | 2021-05-03 |
| Unreasonable Effectiveness of Rule-Based Heuristics in Solving Russian SuperGLUE Tasks | | 0.25 | 0.247 | majority_class | 2021-05-03 |
| | | 0.23 | 0.224 | RuGPT3Medium | |
| | | 0.22 | 0.218 | RuBERT conversational | |
| | | 0.21 | 0.204 | RuGPT3Small | |
| | | 0.21 | 0.202 | RuGPT3Large | |