OpenCodePapers


Vision and Language Navigation on Touchdown
[Chart: results over time]
Leaderboard
| Paper | Code | Task Completion (TC, %) | Model | Release Date |
|---|---|---|---|---|
| FLAME: Learning to Navigate with Multimodal LLM in Urban Environments | ✓ | 40.20 | FLAME | 2024-08-20 |
| Analyzing Generalization of Vision and Language Navigation to Unseen Outdoor Areas | ✓ | 29.1 | ORAR + junction type + heading delta | 2022-03-25 |
| Analyzing Generalization of Vision and Language Navigation to Unseen Outdoor Areas | ✓ | 24.2 | ORAR | 2022-03-25 |
| Learning to Stop: A Simple yet Effective Approach to Urban Vision-Language Navigation | | 16.68 | ARC + L2STOP | 2020-09-28 |
| Multimodal Text Style Transfer for Outdoor Vision-and-Language Navigation | ✓ | 16.2 | VLN Transformer + M-50 + style | 2020-07-01 |
| Multimodal Text Style Transfer for Outdoor Vision-and-Language Navigation | ✓ | 14.9 | VLN Transformer | 2020-07-01 |
| Learning to Stop: A Simple yet Effective Approach to Urban Vision-Language Navigation | | 14.13 | ARC | 2020-09-28 |
| Retouchdown: Adding Touchdown to StreetLearn as a Shareable Resource for Language Grounding Tasks in Street View | ✓ | 12.8 | Retouch-RConcat | 2020-01-10 |
| Multimodal Text Style Transfer for Outdoor Vision-and-Language Navigation | ✓ | 11.9 | Gated Attention (GA) | 2020-07-01 |
| Multimodal Text Style Transfer for Outdoor Vision-and-Language Navigation | ✓ | 11.8 | RConcat | 2020-07-01 |
| Touchdown: Natural Language Navigation and Spatial Reasoning in Visual Street Environments | ✓ | 10.7 | RConcat | 2018-11-29 |
| Touchdown: Natural Language Navigation and Spatial Reasoning in Visual Street Environments | ✓ | 5.5 | Gated Attention (GA) | 2018-11-29 |
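For reference, Task Completion (TC) on Touchdown is defined in the original Touchdown paper as the fraction of episodes in which the agent stops at the goal panorama or at a node adjacent to it in the environment graph. Below is a minimal sketch of how such a metric could be computed; the episode record fields (`stop_node`, `goal_node`, `goal_neighbors`) are illustrative assumptions, not an official API.

```python
# Minimal sketch of the Task Completion (TC) metric as defined for Touchdown:
# an episode succeeds if the agent stops at the goal node or one of its neighbors.
# Field names below are hypothetical, chosen for illustration only.
from dataclasses import dataclass, field

@dataclass
class Episode:
    stop_node: str                                     # panorama where the agent stopped
    goal_node: str                                     # target panorama
    goal_neighbors: set = field(default_factory=set)   # nodes adjacent to the goal

def task_completion(episodes):
    """Return TC as a percentage: share of episodes ending at the goal
    or at a node adjacent to it."""
    if not episodes:
        return 0.0
    successes = sum(
        ep.stop_node == ep.goal_node or ep.stop_node in ep.goal_neighbors
        for ep in episodes
    )
    return 100.0 * successes / len(episodes)

# Example: two of three episodes succeed, so TC = 66.7
episodes = [
    Episode("pano_12", "pano_12"),                               # exact goal
    Episode("pano_07", "pano_08", {"pano_07", "pano_09"}),       # adjacent to goal
    Episode("pano_03", "pano_44", {"pano_43", "pano_45"}),       # failure
]
print(f"TC = {task_completion(episodes):.1f}")  # TC = 66.7
```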