Paper | Code | nDTW | Model Name | Release Date |
---|---|---|---|---|
A New Path: Scaling Vision-and-Language Navigation with Synthetic Instructions and Imitation Learning | | 66.76 | MARVAL | 2022-10-06 |
EnvEdit: Environment Editing for Vision-and-Language Navigation | ✓ Link | 64.61 | EnvEdit-PT | 2022-03-29 |
History Aware Multimodal Transformer for Vision-and-Language Navigation | ✓ Link | 59.94 | HAMT | 2021-10-25 |
How Much Can CLIP Benefit Vision-and-Language Tasks? | ✓ Link | 53.69 | CLEAR-CLIP | 2021-07-13 |
Room-Across-Room: Multilingual Vision-and-Language Navigation with Dense Spatiotemporal Grounding | ✓ Link | 41.05 | Monolingual Baseline | 2020-10-15 |
Room-Across-Room: Multilingual Vision-and-Language Navigation with Dense Spatiotemporal Grounding | ✓ Link | 36.81 | Multilingual Baseline | 2020-10-15 |