Paper | Code | Accuracy | MCC | Model | Date
--- | --- | --- | --- | --- | ---
Acceptability Judgements via Examining the Topology of Attention Maps | ✓ Link | 88.6% | | En-BERT + TDA + PCA | 2022-05-19
Can BERT eat RuCoLA? Topological Data Analysis to Explain | ✓ Link | 88.2% | 0.726 | BERT+TDA | 2023-04-04 |
Can BERT eat RuCoLA? Topological Data Analysis to Explain | ✓ Link | 87.3% | 0.695 | RoBERTa+TDA | 2023-04-04 |
tasksource: A Dataset Harmonization Framework for Streamlined NLP Multi-Task Learning and Evaluation | ✓ Link | 87.15% | | deberta-v3-base+tasksource | 2023-01-14 |
Entailment as Few-Shot Learner | ✓ Link | 86.4% | | RoBERTa-large 355M + Entailment as Few-shot Learner | 2021-04-29 |
Not all layers are equally as important: Every Layer Counts BERT | | 82.7 | | LTG-BERT-base 98M | 2023-11-03 |
Not all layers are equally as important: Every Layer Counts BERT | | 82.6 | | ELC-BERT-base 98M | 2023-11-03 |
Acceptability Judgements via Examining the Topology of Attention Maps | ✓ Link | 82.1% | 0.565 | En-BERT + TDA | 2022-05-19 |
FNet: Mixing Tokens with Fourier Transforms | ✓ Link | 78% | | FNet-Large | 2021-05-09 |
Not all layers are equally as important: Every Layer Counts BERT | | 77.6 | | LTG-BERT-small 24M | 2023-11-03 |
Not all layers are equally as important: Every Layer Counts BERT | | 76.1 | | ELC-BERT-small 24M | 2023-11-03 |
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | ✓ Link | 70.8% | | T5-11B | 2019-10-23 |
StructBERT: Incorporating Language Structures into Pre-training for Deep Language Understanding | | 69.2% | | StructBERT (RoBERTa) ensemble | 2019-08-13
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations | ✓ Link | 69.1% | | ALBERT | 2019-09-26 |
XLNet: Generalized Autoregressive Pretraining for Language Understanding | ✓ Link | 69% | | XLNet (single model) | 2019-06-19 |
Learning to Encode Position for Transformer with Continuous Dynamical Model | ✓ Link | 69% | | FLOATER-large | 2020-03-13 |
LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale | ✓ Link | 68.6% | | RoBERTa-large 355M (MLP quantized vector-wise, fine-tuned) | 2022-08-15 |
Multi-Task Deep Neural Networks for Natural Language Understanding | ✓ Link | 68.4% | | MT-DNN | 2019-01-31 |
ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators | | 68.2% | | ELECTRA | 
RoBERTa: A Robustly Optimized BERT Pretraining Approach | ✓ Link | 67.8% | | RoBERTa (ensemble) | 2019-07-26 |
A Statistical Framework for Low-bitwidth Training of Deep Neural Networks | ✓ Link | 67.5 | | PSQ (Chen et al., 2020) | 2020-10-27 |
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | ✓ Link | 67.1% | | T5-XL 3B | 2019-10-23 |
Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT | | 65.1 | | Q-BERT (Shen et al., 2020) | 2019-09-12 |
Q8BERT: Quantized 8Bit BERT | ✓ Link | 65.0 | | Q8BERT (Zafrir et al., 2019) | 2019-10-14 |
SpanBERT: Improving Pre-training by Representing and Predicting Spans | ✓ Link | 64.3% | | SpanBERT | 2019-07-24 |
CLEAR: Contrastive Learning for Sentence Representation | | 64.3% | | MLM + del-span + reorder | 2020-12-31
ERNIE 2.0: A Continual Pre-training Framework for Language Understanding | ✓ Link | 63.5% | | ERNIE 2.0 Large | 2019-07-29 |
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | ✓ Link | 61.2% | | T5-Large 770M | 2019-10-23 |
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | ✓ Link | 60.5% | | BERT-LARGE | 2018-10-11 |
data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language | ✓ Link | 60.3% | | data2vec | 2022-02-07 |
RealFormer: Transformer Likes Residual Attention | ✓ Link | 59.83% | | RealFormer | 2020-12-21 |
Big Bird: Transformers for Longer Sequences | ✓ Link | 58.5% | | BigBird | 2020-07-28 |
How to Train BERT with an Academic Budget | ✓ Link | 57.1 | | 24hBERT | 2021-04-15 |
ERNIE 2.0: A Continual Pre-training Framework for Language Understanding | ✓ Link | 55.2% | | ERNIE 2.0 Base | 2019-07-29 |
ERNIE: Enhanced Language Representation with Informative Entities | ✓ Link | 52.3% | | ERNIE | 2019-05-17 |
Charformer: Fast Character Transformers via Gradient-based Subword Tokenization | ✓ Link | 51.8% | | Charformer-Tall | 2021-06-23 |
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | ✓ Link | 51.1% | | T5-Base | 2019-10-23 |
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter | ✓ Link | 49.1% | | DistilBERT 66M | 2019-10-02 |
SqueezeBERT: What can computer vision teach NLP about efficient neural networks? | ✓ Link | 46.5% | | SqueezeBERT | 2020-06-19 |
TinyBERT: Distilling BERT for Natural Language Understanding | ✓ Link | 43.3% | | TinyBERT-4 14.5M | 2019-09-23 |
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | ✓ Link | 41.0% | | T5-Small | 2019-10-23 |
LM-CPPF: Paraphrasing-Guided Data Augmentation for Contrastive Prompt-Based Few-Shot Fine-Tuning | ✓ Link | 14.1% | | LM-CPPF RoBERTa-base | 2023-05-29 |
RuCoLA: Russian Corpus of Linguistic Acceptability | ✓ Link | | 0.6 | RemBERT | 2022-10-23 |
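The MCC column reports the Matthews correlation coefficient, the standard metric for CoLA-style acceptability classification; it is preferred over raw accuracy because the acceptable/unacceptable labels are imbalanced. Below is a minimal sketch of how both columns could be computed from binary model predictions, assuming 1 = acceptable and 0 = unacceptable and using scikit-learn; the arrays are illustrative placeholders, not results from any model in the table.

```python
# Sketch: computing the Accuracy and MCC columns from binary
# acceptability predictions (1 = acceptable, 0 = unacceptable).
# The label arrays below are illustrative placeholders only.
from sklearn.metrics import accuracy_score, matthews_corrcoef

y_true = [1, 1, 0, 1, 0, 0, 1, 1]   # gold acceptability judgements
y_pred = [1, 0, 0, 1, 0, 1, 1, 1]   # model predictions

acc = accuracy_score(y_true, y_pred)      # fraction of correct labels
mcc = matthews_corrcoef(y_true, y_pred)   # balanced correlation in [-1, 1]

print(f"Accuracy: {acc:.1%}")   # prints "Accuracy: 75.0%" (Accuracy column)
print(f"MCC: {mcc:.3f}")        # prints "MCC: 0.467" (MCC column)
```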