OpenCodePapers

Text-to-SQL on BIRD (BIg Bench for Large-Scale Database Grounded Text-to-SQLs)

Text-To-SQL
Leaderboard
| Paper | Code | Execution Accuracy % (Test) | Execution Accuracy % (Dev) | Execution Accuracy % (Human) | Model Name | Release Date |
|---|---|---|---|---|---|---|
| A Preview of XiYan-SQL: A Multi-Generator Ensemble Framework for Text-to-SQL | ✓ | 75.63 | 73.34 | | XiYan-SQL | 2024-11-13 |
| | | 74.12 | 74.32 | | DSAIR + GPT-4o | |
| CHASE-SQL: Multi-Path Reasoning and Preference Optimized Candidate Selection in Text-to-SQL | | 74.06 | 73.14 | | CHASE-SQL + Gemini | 2024-10-02 |
| | | 73.17 | 72.43 | | ExSL + granite-34b-code | |
| | | 72.28 | 69.3 | | OpenSearch-SQL+ v2 + GPT-4o | |
| The Death of Schema Linking? Text-to-SQL in the Age of Well-Reasoned Language Models | | 71.83 | 67.21 | | Distillery + GPT-4o | 2024-08-14 |
| | | 70.26 | 72.16 | | Insights AI | |
| | | 70.21 | 68.12 | | PURPLE + RED + GPT-4o | |
| | | 69.40 | 68.91 | | MCTS-SQL | |
| | | 69.03 | 66.95 | | RECAP + Gemini | |
| | | 68.87 | 65.45 | | ByteBrain | |
| | | 67.86 | 65.38 | | ExSL + granite-20b-code | |
| CHESS: Contextual Harnessing for Efficient SQL Synthesis | ✓ | 66.69 | 65 | | CHESS | 2024-05-27 |
| | | 66.21 | 67.99 | | Arcwise + GPT-4o | |
| | | 65.45 | 63.36 | | MCS-SQL + GPT-4 | |
| | | 65.23 | 64.73 | | SCL-SQL | |
| | | 64.95 | 61.34 | | OpenSearch-SQL v1 + GPT-4 | |
| | | 64.84 | 60.5 | | PB-SQL v1 | |
| | | 64.51 | 62.97 | | PURPLE + GPT-4o | |
| | | 64.00 | 66.82 | | MSL-SQL + DeepSeek-V2.5 | |
| | | 63.39 | 55.48 | | SENSE-13B | |
| | | 63.39 | 55.48 | | SENSE | |
| | | 63.22 | 62.58 | | GRA-SQL | |
| | | 62.66 | 58.5 | | SuperSQL | |
| | | 60.71 | 59.71 | | Dubo-SQL, v1 | |
| | | 60.37 | 58.47 | | SFT CodeS-15B | |
| MAC-SQL: A Multi-Agent Collaborative Framework for Text-to-SQL | ✓ | 59.59 | 57.56 | | MAC-SQL + GPT-4 | 2023-12-18 |
| | | 59.25 | 57.17 | | SFT CodeS-7B | |
| Text-to-SQL Empowered by Large Language Models: A Benchmark Evaluation | ✓ | 57.41 | 54.76 | | DAIL-SQL + GPT-4 | 2023-08-29 |
| DIN-SQL: Decomposed In-Context Learning of Text-to-SQL with Self-Correction | ✓ | 55.90 | 50.72 | | DIN-SQL + GPT-4 | 2023-04-21 |
| Can LLMs Effectively Leverage Graph Structural Information through Prompts, and Why? | ✓ | 54.89 | 46.35 | | GPT-4 (Baseline) | 2023-09-28 |
| Can LLMs Effectively Leverage Graph Structural Information through Prompts, and Why? | ✓ | 49.02 | 42.70 | | Claude-2 (Baseline) | 2023-09-28 |
| | | 47.74 | 37.68 | | Open SQL-7B | |
| Can LLM Already Serve as A Database Interface? A BIg Bench for Large-Scale Database Grounded Text-to-SQLs | ✓ | 40.08 | 36.64 | | CoT + ChatGPT | 2023-05-04 |
| Can LLM Already Serve as A Database Interface? A BIg Bench for Large-Scale Database Grounded Text-to-SQLs | ✓ | 39.30 | 37.22 | | ChatGPT (Baseline) | 2023-05-04 |
| Can LLM Already Serve as A Database Interface? A BIg Bench for Large-Scale Database Grounded Text-to-SQLs | ✓ | 36.47 | 34.35 | | Codex (Baseline) | 2023-05-04 |
| Can LLM Already Serve as A Database Interface? A BIg Bench for Large-Scale Database Grounded Text-to-SQLs | ✓ | 33.04 | 27.38 | | Palm-2 (Baseline) | 2023-05-04 |
| MSc-SQL: Multi-Sample Critiquing Small Language Models For Text-To-SQL Translation | ✓ | | 65.6 | | MSc-SQL | 2024-10-16 |
| | | | 64.62 | | SFT CodeS-15B + SQLFixAgent | |
| Knowledge-to-SQL: Enhancing SQL Generation with Data Expert LLM | ✓ | | 48.92 | | DELLM + MAC-SQL | 2024-02-18 |
| Can LLM Already Serve as A Database Interface? A BIg Bench for Large-Scale Database Grounded Text-to-SQLs | ✓ | | | 92.96 | Human Performance | 2023-05-04 |
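
The leaderboard ranks systems by execution accuracy, which is commonly understood as the percentage of questions for which the predicted SQL returns the same result as the reference SQL when both are run against the database. The sketch below illustrates that idea on a toy in-memory SQLite database. It is not the official BIRD evaluation script: the helper names `execution_match` and `execution_accuracy`, the `singer` table, and the query pairs are all made up for illustration, and results are compared as unordered sets of rows, which is an assumption about the comparison rule.

```python
import sqlite3


def execution_match(conn, predicted_sql, gold_sql):
    """Return True when both queries yield the same set of rows."""
    try:
        predicted_rows = set(conn.execute(predicted_sql).fetchall())
    except sqlite3.Error:
        # Assumption: a predicted query that fails to run counts as a miss.
        return False
    gold_rows = set(conn.execute(gold_sql).fetchall())
    return predicted_rows == gold_rows


def execution_accuracy(conn, pairs):
    """Percentage of (predicted, gold) pairs whose results match."""
    hits = sum(execution_match(conn, pred, gold) for pred, gold in pairs)
    return 100.0 * hits / len(pairs)


# Toy database; the schema and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE singer (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO singer VALUES (?, ?)",
    [("Ann", 30), ("Bob", 45), ("Cy", 52)],
)

pairs = [
    # Different SQL text, same rows: counted as correct.
    ("SELECT name FROM singer WHERE age > 40",
     "SELECT name FROM singer WHERE age >= 41"),
    # Runs, but returns extra rows: counted as wrong.
    ("SELECT name FROM singer",
     "SELECT name FROM singer WHERE age > 40"),
    # Misspelled column, so execution fails: counted as wrong.
    ("SELECT nme FROM singer",
     "SELECT name FROM singer"),
    # Equivalent aggregates: counted as correct.
    ("SELECT COUNT(*) FROM singer",
     "SELECT COUNT(name) FROM singer"),
]

score = execution_accuracy(conn, pairs)
print(f"Execution accuracy: {score:.2f}%")
```

Comparing results rather than SQL strings means that differently written but equivalent queries are scored as correct, which is why the metric is reported per evaluation split (Test, Dev) and not per query template.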