OpenCodePapers

natural-language-inference-on-anli-test

Natural Language Inference
Leaderboard
| Paper | Code | A1 | A2 | A3 | Model Name | Release Date |
|---|---|---|---|---|---|---|
| Prompting for explanations improves Adversarial NLI. Is this true? {Yes} it is {true} because {it weakens superficial cues} | | 81.8 | 72.5 | 74.8 | T5-3B (explanation prompting) | 2023-05-01 |
| Prompting for explanations improves Adversarial NLI. Is this true? {Yes} it is {true} because {it weakens superficial cues} | | 75.6 | 60.6 | 59.9 | T0-11B (explanation prompting) | 2023-05-01 |
| InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective | ✓ | 75 | 50.5 | 47.7 | InfoBERT (RoBERTa) | 2020-10-05 |
| PaLM 2 Technical Report | ✓ | 73.1 | 63.4 | 67.1 | PaLM 2-L (one-shot) | 2023-05-17 |
| RoBERTa: A Robustly Optimized BERT Pretraining Approach | ✓ | 72.4 | 49.8 | 44.4 | RoBERTa (Large) | 2019-07-26 |
| Adversarial Training for Large Neural Language Models | ✓ | 72.3 | 52.1 | 48.4 | ALUM (RoBERTa-LARGE) | 2020-04-20 |
| XLNet: Generalized Autoregressive Pretraining for Language Understanding | ✓ | 70.3 | 50.9 | 49.4 | XLNet (Large) | 2019-06-19 |
| A Systematic Study and Comprehensive Evaluation of ChatGPT on Benchmark Datasets | ✓ | 62.3 | 52.6 | 54.1 | ChatGPT | 2023-05-29 |
| PaLM 2 Technical Report | ✓ | 58.1 | 49.5 | 54.5 | PaLM 2-M (one-shot) | 2023-05-17 |
| PaLM 2 Technical Report | ✓ | 53.1 | 48.8 | 53.2 | PaLM 2-S (one-shot) | 2023-05-17 |
| The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning | ✓ | 41.7 | 37.2 | 41.9 | T0-3B (CoT fine-tuned) | 2023-05-23 |
| Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners | ✓ | 39.99 | 37.05 | 37.73 | Flipped-3B | 2022-10-06 |
| Language Models are Few-Shot Learners | ✓ | 36.8 | 34 | 40.2 | GPT-3 | 2020-05-28 |
| Knowledge-in-Context: Towards Knowledgeable Semi-Parametric Language Models | | 36.30 | 35.00 | 37.60 | KiC-770M | 2022-10-28 |
| Exploring the Benefits of Training Expert Language Models over Instruction Tuning | ✓ | 35.49 | 34.64 | 31.22 | RoE-3B | 2023-02-07 |
| BloombergGPT: A Large Language Model for Finance | ✓ | 33.6 | 33.8 | 35.17 | BLOOM 176B (one-shot) | 2023-03-30 |
| BloombergGPT: A Large Language Model for Finance | ✓ | 33.1 | 34.2 | 34.92 | OPT 66B (one-shot) | 2023-03-30 |
| BloombergGPT: A Large Language Model for Finance | ✓ | 32.9 | 34.4 | 37.33 | Bloomberg GPT (one-shot) | 2023-03-30 |
| BloombergGPT: A Large Language Model for Finance | ✓ | 32.6 | 33.8 | 36.17 | GPT-NeoX (one-shot) | 2023-03-30 |
| Large Language Models Can Self-Improve | | | 66.5 | 67.9 | PaLM 540B (Self Improvement, Self Consistency) | 2022-10-20 |
| Large Language Models Can Self-Improve | | | 65.3 | 67.3 | PaLM 540B (Self Improvement, CoT Prompting) | 2022-10-20 |
| Large Language Models Can Self-Improve | | | 64.8 | 66.9 | PaLM 540B (Self Improvement, Standard-Prompting) | 2022-10-20 |
| Large Language Models Can Self-Improve | | | 64.5 | 63.4 | PaLM 540B (Self Consistency) | 2022-10-20 |
| Large Language Models Can Self-Improve | | | 58.9 | 60.6 | PaLM 540B (CoT Prompting) | 2022-10-20 |
| Large Language Models Can Self-Improve | | | 55.8 | 55.8 | PaLM 540B (Standard-Prompting) | 2022-10-20 |