OpenCodePapers

Long-Context Understanding on Ada-LEval (TSort)

Long-Context Understanding
Leaderboard
| Paper | Code | 2k | 4k | 8k | 16k | 32k | 64k | 128k | Model | Release Date |
|---|---|---|---|---|---|---|---|---|---|---|
| GPT-4 Technical Report | ✓ | 18.5 | 15.5 | 7.5 | 3.5 | 6.0 | 6.0 | 6.0 | GPT-4-Turbo-1106 | 2023-03-15 |
| GPT-4 Technical Report | ✓ | 15.5 | 16.5 | 8.5 | 5.5 | 2.0 | 4.0 | 2.0 | GPT-4-Turbo-0125 | 2023-03-15 |
| Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena | ✓ | 5.4 | 5.0 | 2.4 | 3.1 | - | - | - | Vicuna-13b-v1.5-16k | 2023-06-09 |
| Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena | ✓ | 5.3 | 5.0 | 3.1 | 2.5 | - | - | - | LongChat-7b-v1.5-32k | 2023-06-09 |
| Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena | ✓ | 5.3 | 2.2 | 2.3 | 1.7 | - | - | - | Vicuna-7b-v1.5-16k | 2023-06-09 |
| InternLM2 Technical Report | ✓ | 5.1 | 3.9 | 5.1 | 4.3 | - | - | - | InternLM2-7b | 2024-03-26 |
| - | - | 5.0 | 5.0 | 4.5 | 3.0 | 0.0 | 0.0 | - | Claude-2 | - |
| - | - | 4.0 | 4.5 | 4.5 | 5.5 | - | - | - | GPT-3.5-Turbo-1106 | - |
| GLM-130B: An Open Bilingual Pre-trained Model | ✓ | 2.3 | 2.4 | 2.0 | 0.7 | - | - | - | ChatGLM3-6b-32k | 2022-10-05 |
| GLM-130B: An Open Bilingual Pre-trained Model | ✓ | 0.9 | 0.2 | 0.7 | 0.9 | - | - | - | ChatGLM2-6b-32k | 2022-10-05 |