OpenCodePapers

Long-Context Understanding on Ada-LEval (TSort)

Long-Context Understanding

Leaderboard

Paper	Code	2k	4k	8k	16k	32k	64k	128k	ModelName	ReleaseDate
GPT-4 Technical Report	✓ Link	18.5	15.5	7.5	3.5	6.0	6.0	6.0	GPT-4-Turbo-1106	2023-03-15
GPT-4 Technical Report	✓ Link	15.5	16.5	8.5	5.5	2.0	4.0	2.0	GPT-4-Turbo-0125	2023-03-15
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena	✓ Link	5.4	5.0	2.4	3.1				Vicuna-13b-v1.5-16k	2023-06-09
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena	✓ Link	5.3	5.0	3.1	2.5				LongChat-7b-v1.5-32k	2023-06-09
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena	✓ Link	5.3	2.2	2.3	1.7				Vicuna-7b-v1.5-16k	2023-06-09
InternLM2 Technical Report	✓ Link	5.1	3.9	5.1	4.3				InternLM2-7b	2024-03-26
-		5.0	5.0	4.5	3.0	0.0	0.0		Claude-2
-		4.0	4.5	4.5	5.5				GPT-3.5-Turbo-1106
GLM-130B: An Open Bilingual Pre-trained Model	✓ Link	2.3	2.4	2.0	0.7				ChatGLM3-6b-32k	2022-10-05
GLM-130B: An Open Bilingual Pre-trained Model	✓ Link	0.9	0.2	0.7	0.9				ChatGLM2-6b-32k	2022-10-05