OpenCodePapers

Question Answering on SQA3D

Question Answering
Results over time
Leaderboard
| Paper | Code | AnswerExactMatch (Question Answering) | Model Name | Release Date |
|---|---|---|---|---|
| CREMA: Generalizable and Efficient Video-Language Reasoning via Multimodal Modular Fusion | ✓ | 54.6 | CREMA | 2024-02-08 |
| Situational Awareness Matters in 3D Vision Language Reasoning | ✓ | 52.6 | Situation3D | 2024-06-11 |
| Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding | ✓ | 50.7 | Lexicon3D | 2024-09-05 |
| Frozen Transformers in Language Models Are Effective Visual Encoder Layers | ✓ | 48.09 | LM4VisualEncoding | 2023-10-19 |
| SQA3D: Situated Question Answering in 3D Scenes | ✓ | 47.20 | ScanQA (w/ auxiliary loss) | 2022-10-14 |
| SQA3D: Situated Question Answering in 3D Scenes | ✓ | 46.58 | ScanQA | 2022-10-14 |
| Deep Modular Co-Attention Networks for Visual Question Answering | ✓ | 43.42 | MCAN | 2019-06-25 |