OpenCodePapers

Semantic Textual Similarity on MRPC

Task: Semantic Textual Similarity
Dataset: MRPC (Microsoft Research Paraphrase Corpus)
Results over time
Leaderboard
| Paper | Code | Accuracy (%) | F1 (%) | Model | Release Date |
|---|---|---|---|---|---|
| SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization | ✓ | 93.7 | 91.7 | MT-DNN-SMART | 2019-11-08 |
| ALBERT: A Lite BERT for Self-supervised Learning of Language Representations | ✓ | 93.4 | | ALBERT | 2019-09-26 |
| RoBERTa: A Robustly Optimized BERT Pretraining Approach | ✓ | 92.3 | | RoBERTa (ensemble) | 2019-07-26 |
| StructBERT: Incorporating Language Structures into Pre-training for Deep Language Understanding | | 91.5 | 93.6 | StructBERTRoBERTa ensemble | 2019-08-13 |
| Learning to Encode Position for Transformer with Continuous Dynamical Model | ✓ | 91.4 | | FLOATER-large | 2020-03-13 |
| SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization | ✓ | 91.3 | | SMART | 2019-11-08 |
| LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale | ✓ | 91.0 | | RoBERTa-large 355M (MLP quantized vector-wise, fine-tuned) | 2022-08-15 |
| SpanBERT: Improving Pre-training by Representing and Predicting Spans | ✓ | 90.9 | | SpanBERT | 2019-07-24 |
| XLNet: Generalized Autoregressive Pretraining for Language Understanding | ✓ | 90.8 | | XLNet (single model) | 2019-06-19 |
| AutoBERT-Zero: Evolving BERT Backbone from Scratch | | 90.7 | | AutoBERT-Zero (Large) | 2021-07-15 |
| CLEAR: Contrastive Learning for Sentence Representation | | 90.6 | | MLM + del-word + reorder | 2020-12-31 |
| AutoBERT-Zero: Evolving BERT Backbone from Scratch | | 90.5 | | AutoBERT-Zero (Base) | 2021-07-15 |
| A Statistical Framework for Low-bitwidth Training of Deep Neural Networks | ✓ | 90.4 | | PSQ (Chen et al., 2020) | 2020-10-27 |
| DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter | ✓ | 90.2 | | DistilBERT 66M | 2019-10-02 |
| Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | ✓ | 90.0 | 91.9 | T5-11B | 2019-10-23 |
| Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | ✓ | 89.9 | 92.4 | T5-Large | 2019-10-23 |
| Q8BERT: Quantized 8Bit BERT | ✓ | 89.7 | | Q8BERT (Zafrir et al., 2019) | 2019-10-14 |
| | | 89.6 | | ELECTRA | |
| Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | ✓ | 89.2 | 92.5 | T5-3B | 2019-10-23 |
| MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices | ✓ | 88.8 | | MobileBERT | 2020-04-06 |
| ERNIE: Enhanced Language Representation with Informative Entities | ✓ | 88.2 | | ERNIE | 2019-05-17 |
| Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT | | 88.2 | | Q-BERT (Shen et al., 2020) | 2019-09-12 |
| FNet: Mixing Tokens with Fourier Transforms | ✓ | 88 | | FNet-Large | 2021-05-09 |
| SqueezeBERT: What can computer vision teach NLP about efficient neural networks? | ✓ | 87.8 | | SqueezeBERT | 2020-06-19 |
| Charformer: Fast Character Transformers via Gradient-based Subword Tokenization | ✓ | 87.5 | 91.4 | Charformer-Tall | 2021-06-23 |
| Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | ✓ | 87.5 | 90.7 | T5-Base | 2019-10-23 |
| How to Train BERT with an Academic Budget | ✓ | 87.5 | | 24hBERT | 2021-04-15 |
| ERNIE 2.0: A Continual Pre-training Framework for Language Understanding | ✓ | 87.4 | | ERNIE 2.0 Large | 2019-07-29 |
| TinyBERT: Distilling BERT for Natural Language Understanding | ✓ | 87.3 | | TinyBERT-6 67M | 2019-09-23 |
| RealFormer: Transformer Likes Residual Attention | ✓ | 87.01 | 90.91 | RealFormer | 2020-12-21 |
| SubRegWeigh: Effective and Efficient Annotation Weighing with Subword Regularization | ✓ | 86.82 | | RoBERTa + SubRegWeigh (K-means) | 2024-09-10 |
| Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | ✓ | 86.6 | 89.7 | T5-Small | 2019-10-23 |
| TinyBERT: Distilling BERT for Natural Language Understanding | ✓ | 86.4 | | TinyBERT-4 14.5M | 2019-09-23 |
| ERNIE 2.0: A Continual Pre-training Framework for Language Understanding | ✓ | 86.1 | | ERNIE 2.0 Base | 2019-07-29 |
| Discriminative Improvements to Distributional Sentence Similarity | | 80.4 | 85.9 | TF-KLD | 2013-10-01 |
| Learning General Purpose Distributed Sentence Representations via Large Scale Multi-task Learning | ✓ | 78.6 | 84.4 | GenSen | 2018-03-30 |
| Supervised Learning of Universal Sentence Representations from Natural Language Inference Data | ✓ | 76.2 | 83.1 | InferSent | 2017-05-05 |
| Big Bird: Transformers for Longer Sequences | ✓ | | 91.5 | BigBird | 2020-07-28 |
| Entailment as Few-Shot Learner | ✓ | | 91.0 | RoBERTa-large 355M + Entailment as Few-shot Learner | 2021-04-29 |
| BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | ✓ | | 89.3 | BERT-LARGE | 2018-10-11 |
| Nyströmformer: A Nyström-Based Algorithm for Approximating Self-Attention | ✓ | | 88.1 | Nyströmformer | 2021-02-07 |
| Intrinsic Dimensionality Explains the Effectiveness of Language Model Fine-Tuning | ✓ | | | BERT-Base | 2020-12-22 |
| Intrinsic Dimensionality Explains the Effectiveness of Language Model Fine-Tuning | ✓ | | | BERT-Large | 2020-12-22 |
| SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization | ✓ | | | SMART-BERT | 2019-11-08 |
| SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization | ✓ | | | SMARTRoBERTa | 2019-11-08 |
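For readers who want to reproduce the two metrics reported above, the sketch below shows one common way to score MRPC predictions. It assumes the Hugging Face `datasets` and `evaluate` libraries (an assumption, not something this page prescribes), which ship MRPC as part of the GLUE benchmark and compute exactly the accuracy/F1 pair used in this leaderboard; the all-positive baseline predictions are purely illustrative.

```python
# Minimal sketch: scoring MRPC predictions with the leaderboard's metrics
# (accuracy and F1). Assumes `datasets` and `evaluate` are installed.
from datasets import load_dataset
import evaluate

# MRPC is distributed as a configuration of the GLUE benchmark.
mrpc = load_dataset("glue", "mrpc")      # splits: train / validation / test
metric = evaluate.load("glue", "mrpc")   # computes accuracy and F1 for MRPC

# Placeholder predictions: call every sentence pair a paraphrase (label 1).
# A real entry would use outputs from a fine-tuned model like those above.
references = mrpc["validation"]["label"]
predictions = [1] * len(references)

scores = metric.compute(predictions=predictions, references=references)
print(scores)  # e.g. {'accuracy': ~0.68, 'f1': ~0.81} for this naive baseline
```

Note that `evaluate` returns fractions (e.g. 0.81) where the table reports percentages (81.0), and that published leaderboard numbers are typically measured on the held-out GLUE test server rather than the local validation split used in this sketch.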