OpenCodePapers

Long-Context Understanding on Ada-LEval
Leaderboard
| Paper | Code | 1k | 2k | 4k | 6k | 8k | 12k | 16k | 32k | 64k | 128k | Model Name | Release Date |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| GPT-4 Technical Report | ✓ | 74.0 | 73.5 | 67.5 | 59.5 | 53.5 | 49.5 | 44.0 | 16.0 | 0.0 | 0.0 | GPT-4-Turbo-1106 | 2023-03-15 |
| GPT-4 Technical Report | ✓ | 73.5 | 73.5 | 65.5 | 63.0 | 56.5 | 52.0 | 44.5 | 30.0 | 0.0 | 0.0 | GPT-4-Turbo-0125 | 2023-03-15 |
| | | 65.0 | 43.5 | 23.5 | 15.0 | 17.0 | 12.0 | 11.0 | 4.0 | 0.0 | | Claude-2 | |
| | | 61.5 | 48.5 | 41.5 | 29.5 | 17.0 | 2.5 | 2.5 | | | | GPT-3.5-Turbo-1106 | |
| InternLM2 Technical Report | ✓ | 58.6 | 49.5 | 33.9 | 12.3 | 13.4 | 2.0 | 0.8 | 0.5 | 0.5 | 0.0 | InternLM2-7b | 2024-03-26 |
| Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena | ✓ | 53.4 | 29.2 | 13.1 | 4.3 | 2.2 | 1.4 | 0.9 | | | | Vicuna-13b-v1.5-16k | 2023-06-09 |
| GLM-130B: An Open Bilingual Pre-trained Model | ✓ | 39.8 | 18.8 | 9.0 | 5.0 | 3.4 | 0.9 | 0.5 | | | | ChatGLM3-6b-32k | 2022-10-05 |
| Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena | ✓ | 37.0 | 11.1 | 5.8 | 3.2 | 1.8 | 1.9 | 1.0 | | | | Vicuna-7b-v1.5-16k | 2023-06-09 |
| Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena | ✓ | 32.4 | 10.7 | 5.7 | 3.1 | 1.9 | 1.6 | 0.8 | | | | LongChat-7b-v1.5-32k | 2023-06-09 |
| GLM-130B: An Open Bilingual Pre-trained Model | ✓ | 31.2 | 10.9 | 4.5 | 1.6 | 1.6 | 0.0 | 0.3 | | | | ChatGLM2-6b-32k | 2022-10-05 |