| Paper | Code | Bits per byte | Perplexity | Model | Date |
|---|---|---|---|---|---|
| Efficiently Learning at Test-Time: Active Fine-Tuning of LLMs | ✓ Link | 0.557 | | Test-Time Fine-Tuning with SIFT + Llama-3.2 (3B) | 2024-10-10 |
| Efficiently Learning at Test-Time: Active Fine-Tuning of LLMs | ✓ Link | 0.595 | | Test-Time Fine-Tuning with SIFT + Phi-3 (3.8B) | 2024-10-10 |
| Efficiently Learning at Test-Time: Active Fine-Tuning of LLMs | ✓ Link | 0.606 | | Test-Time Fine-Tuning with SIFT + Llama-3.2 (1B) | 2024-10-10 |
| Efficiently Learning at Test-Time: Active Fine-Tuning of LLMs | ✓ Link | 0.629 | | Gemma-2 27B | 2024-10-10 |
| GLM-130B: An Open Bilingual Pre-trained Model | ✓ Link | 0.634 | | GLM-130B | 2022-10-05 |
| Efficiently Learning at Test-Time: Active Fine-Tuning of LLMs | ✓ Link | 0.640 | | Llama-3.2 3B | 2024-10-10 |
| GLM-130B: An Open Bilingual Pre-trained Model | ✓ Link | 0.65 | | Jurassic-1 | 2022-10-05 |
| Efficiently Learning at Test-Time: Active Fine-Tuning of LLMs | ✓ Link | 0.651 | | Phi-3 14B | 2024-10-10 |
| Efficiently Learning at Test-Time: Active Fine-Tuning of LLMs | ✓ Link | 0.670 | | Gemma-2 9B | 2024-10-10 |
| Efficiently Learning at Test-Time: Active Fine-Tuning of LLMs | ✓ Link | 0.678 | | Phi-3 7B | 2024-10-10 |
| Efficiently Learning at Test-Time: Active Fine-Tuning of LLMs | ✓ Link | 0.679 | | Phi-3 3.8B | 2024-10-10 |
| Efficiently Learning at Test-Time: Active Fine-Tuning of LLMs | ✓ Link | 0.697 | | Llama-3.2 1B | 2024-10-10 |
| The Pile: An 800GB Dataset of Diverse Text for Language Modeling | ✓ Link | 0.7177 | | GPT-3 Davinci 175B (pre-trained) | 2020-12-31 |
| Efficiently Learning at Test-Time: Active Fine-Tuning of LLMs | ✓ Link | 0.721 | | Gemma-2 2B | 2024-10-10 |
| Efficiently Learning at Test-Time: Active Fine-Tuning of LLMs | ✓ Link | 0.737 | | Llama-3.2-Instruct 3B | 2024-10-10 |
| GLM-130B: An Open Bilingual Pre-trained Model | ✓ Link | 0.742 | | GPT-3 | 2022-10-05 |
| Efficiently Learning at Test-Time: Active Fine-Tuning of LLMs | ✓ Link | 0.762 | | Test-Time Fine-Tuning with SIFT + GPT-2 (774M) | 2024-10-10 |
| The Pile: An 800GB Dataset of Diverse Text for Language Modeling | ✓ Link | 0.7980 | | GPT-3 Curie 6.7B (pre-trained) | 2020-12-31 |
| Efficiently Learning at Test-Time: Active Fine-Tuning of LLMs | ✓ Link | 0.807 | | Llama-3.2-Instruct 1B | 2024-10-10 |
| Test-Time Training on Nearest Neighbors for Large Language Models | ✓ Link | 0.85 | | GPT-2 Large 774M (test-time training on nearest neighbors) | 2023-05-29 |
| Efficiently Learning at Test-Time: Active Fine-Tuning of LLMs | ✓ Link | 0.862 | | Test-Time Fine-Tuning with SIFT + GPT-2 (124M) | 2024-10-10 |
| The Pile: An 800GB Dataset of Diverse Text for Language Modeling | ✓ Link | 0.8718 | | GPT-3 Babbage 1.3B (pre-trained) | 2020-12-31 |
| The Pile: An 800GB Dataset of Diverse Text for Language Modeling | ✓ Link | 0.9631 | | GPT-3 Ada 350M (pre-trained) | 2020-12-31 |
| The Pile: An 800GB Dataset of Diverse Text for Language Modeling | ✓ Link | 1.0468 | | GPT-2 XL 1.5B (pre-trained) | 2020-12-31 |
| The Pile: An 800GB Dataset of Diverse Text for Language Modeling | ✓ Link | 1.0828 | | GPT-2 Large 774M (pre-trained) | 2020-12-31 |
| The Pile: An 800GB Dataset of Diverse Text for Language Modeling | ✓ Link | 1.0928 | | GPT-2 Medium 355M (pre-trained) | 2020-12-31 |
| The Pile: An 800GB Dataset of Diverse Text for Language Modeling | ✓ Link | 1.2253 | | GPT-2 Small 124M (pre-trained) | 2020-12-31 |
| Need a Small Specialized Language Model? Plan Early! | | | 10 | Larger Transformer 771M (fine-tuned) | 2024-02-02 |
| Hungry Hungry Hippos: Towards Language Modeling with State Space Models | ✓ Link | | 10.2 | Hybrid H3 125M | 2022-12-28 |
| Knowledge Unlearning for Mitigating Privacy Risks in Language Models | ✓ Link | | 10.44 | GPT-Neo 2.7B | 2022-10-04 |
| Hungry Hungry Hippos: Towards Language Modeling with State Space Models | ✓ Link | | 10.7 | Transformer 125M | 2022-12-28 |
| Knowledge Unlearning for Mitigating Privacy Risks in Language Models | ✓ Link | | 11.46 | GPT-Neo 1.3B | 2022-10-04 |
| Need a Small Specialized Language Model? Plan Early! | | | 12 | Smaller Transformer 126M (fine-tuned) | 2024-02-02 |
| Knowledge Unlearning for Mitigating Privacy Risks in Language Models | ✓ Link | | 17.81 | OPT 2.7B | 2022-10-04 |
| Knowledge Unlearning for Mitigating Privacy Risks in Language Models | ✓ Link | | 17.83 | GPT-Neo 125M | 2022-10-04 |
| Knowledge Unlearning for Mitigating Privacy Risks in Language Models | ✓ Link | | 19.55 | OPT 1.3B | 2022-10-04 |
| Need a Small Specialized Language Model? Plan Early! | | | 28.1 | Larger Transformer 771M (pre-trained) | 2024-02-02 |
| Knowledge Unlearning for Mitigating Privacy Risks in Language Models | ✓ Link | | 32.26 | OPT 125M | 2022-10-04 |
| Need a Small Specialized Language Model? Plan Early! | | | 33 | Smaller Transformer 126M (pre-trained) | 2024-02-02 |