OpenCodePapers

natural-language-inference-on-snli

Natural Language Inference
Dataset Link
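SNLI pairs a premise with a hypothesis and a three-way gold label (entailment, neutral, contradiction); the leaderboard metric below is accuracy on the test split. A minimal sketch of loading the data and scoring predictions, assuming the Hugging Face `datasets` package; the `percent_test_accuracy` helper is illustrative only and not part of any listed system:

```python
# Minimal sketch: load SNLI and compute the leaderboard metric (% test accuracy).
# Assumes the Hugging Face `datasets` package; nothing here reproduces a listed model.
from datasets import load_dataset

snli = load_dataset("snli")                                 # splits: train / validation / test
test = snli["test"].filter(lambda ex: ex["label"] != -1)    # drop pairs without a gold label

print(test[0]["premise"], "||", test[0]["hypothesis"], "||", test[0]["label"])
# label ids: 0 = entailment, 1 = neutral, 2 = contradiction

def percent_test_accuracy(predictions, gold_labels):
    """Fraction of correctly predicted 3-way labels, reported as a percentage."""
    correct = sum(int(p == g) for p, g in zip(predictions, gold_labels))
    return 100.0 * correct / len(gold_labels)
```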
Results over time
Leaderboard
| Paper | Code | % Test Accuracy | % Train Accuracy | Parameters | % Dev Accuracy | Accuracy | Model | Release Date |
|---|---|---|---|---|---|---|---|---|
| First Train to Generate, then Generate to Train: UnitedSynT5 for Few-Shot NLI | | 94.7 | | | | | UnitedSynT5 (3B) | 2024-12-12 |
| First Train to Generate, then Generate to Train: UnitedSynT5 for Few-Shot NLI | | 93.5 | | | | | UnitedSynT5 (335M) | 2024-12-12 |
| Entailment as Few-Shot Learner | ✓ | 93.1 | | 355m | | | EFL (Entailment as Few-shot Learner) + RoBERTa-large | 2021-04-29 |
| Self-Explaining Structures Improve NLP Models | ✓ | 92.3 | | 340 | | | RoBERTa-large + Self-Explaining | 2020-12-03 |
| Self-Explaining Structures Improve NLP Models | ✓ | 92.3 | | 355m+ | | | RoBERTa-large + self-explaining layer | 2020-12-03 |
| Conditionally Adaptive Multi-Task Learning: Improving Transfer Learning in NLP Using Fewer Parameters & Less Data | ✓ | 92.1 | 92.6 | 340m | | | CA-MTL | 2020-09-19 |
| Semantics-aware BERT for Language Understanding | ✓ | 91.9 | 94.4 | 339m | | | SemBERT | 2019-09-05 |
| SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization | ✓ | 91.7 | 92.6 | | | | MT-DNN-SMARTLARGEv0 | 2019-11-08 |
| Multi-Task Deep Neural Networks for Natural Language Understanding | ✓ | 91.6 | 97.2 | 330m | | | MT-DNN | 2019-01-31 |
| Explicit Contextual Semantics for Text Comprehension | | 91.3 | 95.7 | 308m | | | SJRC (BERT-Large + SRL) | 2018-09-08 |
| Multi-Task Deep Neural Networks for Natural Language Understanding | ✓ | 90.5 | 99.1 | 220 | | | Ntumpha | 2019-01-31 |
| Semantic Sentence Matching with Densely-connected Recurrent and Co-attentive Information | | 90.1 | 95.0 | 53.3m | | | Densely-Connected Recurrent and Co-Attentive Network Ensemble | 2018-05-29 |
| What Do Questions Exactly Ask? MFAE: Duplicate Question Identification with Multi-Fusion Asking Emphasis | ✓ | 90.07 | 93.18 | | | | MFAE | 2020-05-07 |
| Improving Language Understanding by Generative Pre-Training | ✓ | 89.9 | 96.6 | 85m | | | Fine-Tuned LM-Pretrained Transformer | 2018-06-11 |
| Discourse Marker Augmented Network with Reinforcement Learning for Natural Language Inference | ✓ | 89.6 | 96.1 | 79m | | | 300D DMAN Ensemble | 2019-07-23 |
| Multiway Attention Networks for Modeling Sentence Pairs | ✓ | 89.4 | 95.5 | 58m | | | 150D Multiway Attention Network Ensemble | 2018-07-01 |
| DR-BiLSTM: Dependent Reading Bidirectional LSTM for Natural Language Inference | | 89.3 | 94.8 | 45m | | | 450D DR-BiLSTM Ensemble | 2018-02-15 |
| Compare, Compress and Propagate: Enhancing Neural Architectures with Alignment Factorization for Natural Language Inference | | 89.3 | 92.5 | 17.5m | | | 300D CAFE Ensemble | 2017-12-30 |
| Deep contextualized word representations | ✓ | 89.3 | 92.1 | 40m | | | ESIM + ELMo Ensemble | 2018-02-15 |
| Neural Natural Language Inference Models Enhanced with External Knowledge | ✓ | 89.1 | 93.6 | 43m | | | KIM Ensemble | 2017-11-12 |
| Explicit Contextual Semantics for Text Comprehension | | 89.1 | 89.1 | 6.1m | | | SLRC | 2018-09-08 |
| Simple and Effective Text Matching with Richer Alignment Features | ✓ | 88.9 | 94.0 | 2.8m | | | RE2 | 2019-08-01 |
| Semantic Sentence Matching with Densely-connected Recurrent and Co-attentive Information | | 88.9 | 93.1 | 6.7m | | | Densely-Connected Recurrent and Co-Attentive Network | 2018-05-29 |
| DEIM: An effective deep encoding and interaction model for sentence matching | | 88.9 | 92.6 | 22m | | | DEIM | 2022-03-20 |
| Natural Language Inference over Interaction Space | ✓ | 88.9 | 92.3 | 17m | | | 448D Densely Interactive Inference Network (DIIN) Ensemble | 2017-09-13 |
| Discourse Marker Augmented Network with Reinforcement Learning for Natural Language Inference | ✓ | 88.8 | 95.4 | 9.2m | | | 300D DMAN | 2019-07-23 |
| Bilateral Multi-Perspective Matching for Natural Language Sentences | ✓ | 88.8 | 93.2 | 6.4m | | | BiMPM Ensemble | 2017-02-13 |
| Deep contextualized word representations | ✓ | 88.7 | 91.6 | 8.0m | | | ESIM + ELMo | 2018-02-15 |
| Neural Natural Language Inference Models Enhanced with External Knowledge | ✓ | 88.6 | 94.1 | 4.3m | | | KIM | 2017-11-12 |
| Enhanced LSTM for Natural Language Inference | ✓ | 88.6 | 93.5 | 7.7m | | | 600D ESIM + 300D Syntactic TreeLSTM | 2016-09-20 |
| DR-BiLSTM: Dependent Reading Bidirectional LSTM for Natural Language Inference | | 88.5 | 94.1 | 7.5m | | | 450D DR-BiLSTM | 2018-02-15 |
| Stochastic Answer Networks for Natural Language Inference | ✓ | 88.5 | 93.3 | 3.5m | | | Stochastic Answer Network | 2018-04-21 |
| Compare, Compress and Propagate: Enhancing Neural Architectures with Alignment Factorization for Natural Language Inference | | 88.5 | 89.8 | 4.7m | | | 300D CAFE | 2017-12-30 |
| Multiway Attention Networks for Modeling Sentence Pairs | ✓ | 88.3 | 94.5 | 14m | | | 150D Multiway Attention Network | 2018-07-01 |
| Learned in Translation: Contextualized Word Vectors | ✓ | 88.1 | 88.5 | 22m | | | Biattentive Classification Network + CoVe + Char | 2017-08-01 |
| Attention Boosted Sequential Inference Model | | 88.1 | | | | | aESIM | 2018-12-05 |
| Natural Language Inference over Interaction Space | ✓ | 88.0 | 91.2 | 4.4m | | | 448D Densely Interactive Inference Network (DIIN) | 2017-09-13 |
| Enhanced LSTM for Natural Language Inference | ✓ | 88.0 | | | | | Enhanced Sequential Inference Model (Chen et al., 2017a) | 2016-09-20 |
| Bilateral Multi-Perspective Matching for Natural Language Sentences | ✓ | 87.5 | 90.9 | 1.6m | | | BiMPM | 2017-02-13 |
| Reading and Thinking: Re-read LSTM Unit for Textual Entailment Recognition | | 87.5 | 90.7 | 2.0m | | | 300D re-read LSTM | 2016-12-01 |
| Dynamic Self-Attention: Computing Attention over Words Dynamically for Sentence Embedding | ✓ | 87.4 | 89.0 | 7.0m | | | 2400D Multiple-Dynamic Self-Attention Model | 2018-08-22 |
| Neural Tree Indexers for Text Understanding | ✓ | 87.3 | 88.5 | 3.2m | | | 300D Full tree matching NTI-SLSTM-LSTM w/ global attention | 2016-07-15 |
| Cell-aware Stacked LSTMs for Modeling Sentences | | 87 | | | | | 300D 2-layer Bi-CAS-LSTM | 2018-09-07 |
| A Decomposable Attention Model for Natural Language Inference | ✓ | 86.8 | 90.5 | 580k | | | 200D decomposable attention feed-forward model with intra-sentence attention | 2016-06-06 |
| A Decomposable Attention Model for Natural Language Inference | ✓ | 86.8 | 90.5 | 580k | | | 200D decomposable attention model with intra-sentence attention | 2016-06-06 |
| Dynamic Self-Attention: Computing Attention over Words Dynamically for Sentence Embedding | ✓ | 86.8 | 87.3 | 2.1m | | | 600D Dynamic Self-Attention Model | 2018-08-22 |
| Parameter Re-Initialization through Cyclical Batch Size Schedules | | 86.73 | | | | | CBS-1 + ESIM | 2018-12-04 |
| Dynamic Meta-Embeddings for Improved Sentence Representations | ✓ | 86.7 | 91.6 | 9m | | | 512D Dynamic Meta-Embeddings | 2018-04-21 |
| Enhancing Sentence Embedding with Generalized Pooling | ✓ | 86.6 | 94.9 | 65m | | | 600D BiLSTM with generalized pooling | 2018-06-26 |
| Sentence Embeddings in NLI with Iterative Refinement Encoders | ✓ | 86.6 | 89.9 | 22m | | | 600D Hierarchical BiLSTM with Max Pooling (HBMP) | 2018-08-27 |
| Semantic Sentence Matching with Densely-connected Recurrent and Co-attentive Information | | 86.5 | 91.4 | 5.6m | | | Densely-Connected Recurrent and Co-Attentive Network (encoder) | 2018-05-29 |
| Reinforced Self-Attention Network: a Hybrid of Hard and Soft Attention for Sequence Modeling | ✓ | 86.3 | 92.6 | 3.1m | | | 300D Reinforced Self-Attention Network | 2018-01-31 |
| Distance-based Self-Attention Network for Natural Language Inference | | 86.3 | 89.6 | 4.7m | | | Distance-based Self-Attention Network | 2017-12-06 |
| A Decomposable Attention Model for Natural Language Inference | ✓ | 86.3 | 89.5 | 380k | | | 200D decomposable attention feed-forward model | 2016-06-06 |
| A Decomposable Attention Model for Natural Language Inference | ✓ | 86.3 | 89.5 | 380k | | | 200D decomposable attention model | 2016-06-06 |
| Long Short-Term Memory-Networks for Machine Reading | ✓ | 86.3 | 88.5 | 3.4m | | | 450D LSTMN with deep attention fusion | 2016-01-25 |
| Learning Natural Language Inference with LSTM | ✓ | 86.1 | 92.0 | 1.9m | | | 300D mLSTM word-by-word attention model | 2015-12-30 |
| Learning to Compose Task-Specific Tree Structures | ✓ | 86.0 | 93.1 | 10m | | | 600D Gumbel TreeLSTM encoders | 2017-07-10 |
| Shortcut-Stacked Sentence Encoders for Multi-Domain Inference | ✓ | 86.0 | 91.0 | 29m | | | 600D Residual stacked encoders | 2017-08-07 |
| Star-Transformer | ✓ | 86.0 | | | | | Star-Transformer (no cross sentence attention) | 2019-02-25 |
| Compare, Compress and Propagate: Enhancing Neural Architectures with Alignment Factorization for Natural Language Inference | | 85.9 | 87.3 | 3.7m | | | 300D CAFE (no cross-sentence attention) | 2017-12-30 |
| | | 85.9 | | | | | 1200D REGMAPR (Base+Reg) | |
| Shortcut-Stacked Sentence Encoders for Multi-Domain Inference | ✓ | 85.7 | 89.8 | 9.7m | | | 300D Residual stacked encoders | 2017-08-07 |
| Long Short-Term Memory-Networks for Machine Reading | ✓ | 85.7 | 87.3 | 1.7m | | | 300D LSTMN with deep attention fusion | 2016-01-25 |
| Learning to Compose Task-Specific Tree Structures | ✓ | 85.6 | 91.2 | 2.9m | | | 300D Gumbel TreeLSTM encoders | 2017-07-10 |
| DiSAN: Directional Self-Attention Network for RNN/CNN-Free Language Understanding | ✓ | 85.6 | 91.1 | 2.4m | | | 300D Directional self-attention network encoders | 2017-09-14 |
| Recurrent Neural Network-Based Sentence Encoder with Gated Attention for Natural Language Inference | ✓ | 85.5 | 90.5 | 12m | | | 600D (300+300) Deep Gated Attn. BiLSTM encoders | 2017-08-04 |
| Neural Semantic Encoders | ✓ | 85.4 | 86.9 | 3.2m | | | 300D MMA-NSE encoders with attention | 2016-07-14 |
| Modelling Interaction of Sentence Pair with coupled-LSTMs | | 85.1 | 86.7 | 190k | | | 50D stacked TC-LSTMs | 2016-05-18 |
| Learning Natural Language Inference using Bidirectional LSTM model and Inner-Attention | ✓ | 85.0 | 85.9 | 2.8m | | | 600D (300+300) BiLSTM encoders with intra-attention and symbolic preproc. | 2016-05-30 |
| Combining Similarity Features and Deep Representation Learning for Stance Detection in the Context of Checking Fake News | ✓ | 84.8 | | | | | Stacked Bi-LSTMs (shortcut connections, max-pooling) | 2018-11-02 |
| Neural Semantic Encoders | ✓ | 84.6 | 86.2 | 3.0m | | | 300D NSE encoders | 2016-07-14 |
| Deep Fusion LSTMs for Text Semantic Matching | | 84.6 | 85.2 | 320k | | | 100D DF-LSTM | 2016-08-01 |
| Supervised Learning of Universal Sentence Representations from Natural Language Inference Data | ✓ | 84.5 | 85.6 | 40m | | | 4096D BiLSTM with max-pooling | 2017-05-05 |
| Combining Similarity Features and Deep Representation Learning for Stance Detection in the Context of Checking Fake News | ✓ | 84.5 | | | | | Bi-LSTM sentence encoder (max-pooling) | 2018-11-02 |
| Combining Similarity Features and Deep Representation Learning for Stance Detection in the Context of Checking Fake News | ✓ | 84.4 | | | | | Stacked Bi-LSTMs (shortcut connections, max-pooling, attention) | 2018-11-02 |
| Learning Natural Language Inference using Bidirectional LSTM model and Inner-Attention | ✓ | 84.2 | 84.5 | 2.8m | | | 600D (300+300) BiLSTM encoders with intra-attention | 2016-05-30 |
| Baseline Needs More Love: On Simple Word-Embedding-Based Models and Associated Pooling Mechanisms | ✓ | 83.8 | | | | | SWEM-max | 2018-05-24 |
| Reasoning about Entailment with Neural Attention | ✓ | 83.5 | 85.3 | 250k | | | 100D LSTMs w/ word-by-word attention | 2015-09-22 |
| Neural Tree Indexers for Text Understanding | ✓ | 83.4 | 82.5 | 4.0m | | | 300D NTI-SLSTM-LSTM encoders | 2016-07-15 |
| Learning Natural Language Inference using Bidirectional LSTM model and Inner-Attention | ✓ | 83.3 | 86.4 | 2.0m | | | 600D (300+300) BiLSTM encoders | 2016-05-30 |
| A Fast Unified Model for Parsing and Sentence Understanding | ✓ | 83.2 | 89.2 | 3.7m | | | 300D SPINN-PI encoders | 2016-03-19 |
| Natural Language Inference by Tree-Based Convolution and Heuristic Matching | | 82.1 | 83.3 | 3.5m | | | 300D Tree-based CNN encoders | 2015-12-28 |
| Order-Embeddings of Images and Language | ✓ | 81.4 | 98.8 | 15m | | | 1024D GRU encoders w/ unsupervised 'skip-thoughts' pre-training | 2015-11-19 |
| DELTA: A DEep learning based Language Technology plAtform | ✓ | 80.7 | | | | | DELTA (LSTM) | 2019-08-02 |
| A Fast Unified Model for Parsing and Sentence Understanding | ✓ | 80.6 | 83.9 | 3.0m | | | 300D LSTM encoders | 2016-03-19 |
| A large annotated corpus for learning natural language inference | ✓ | 78.2 | 99.7 | | | | + Unigram and bigram features | 2015-08-21 |
| A large annotated corpus for learning natural language inference | ✓ | 77.6 | 84.8 | 220k | | | 100D LSTM encoders | 2015-08-21 |
| A large annotated corpus for learning natural language inference | ✓ | 50.4 | 49.4 | | | | Unlexicalized features | 2015-08-21 |
| SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization | ✓ | | | | 91.6 | | MT-DNN-SMART (100% of training data) | 2019-11-08 |
| SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization | ✓ | | | | 88.7 | | MT-DNN-SMART (10% of training data) | 2019-11-08 |
| SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization | ✓ | | | | 86 | | MT-DNN-SMART (1% of training data) | 2019-11-08 |
| SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization | ✓ | | | | 82.7 | | MT-DNN-SMART (0.1% of training data) | 2019-11-08 |
| SplitEE: Early Exit in Deep Neural Networks with Split Computing | ✓ | | | | | 79.0 | SplitEE-S | 2023-09-17 |