Paper | Code | val mAP | test mAP | Model Name | Release Date |
---|---|---|---|---|---|
VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking | ✓ Link | 42.6 | | VideoMAE V2-g | 2023-03-29 |
End-to-End Spatio-Temporal Action Localisation with Video Transformers | | 41.7 | | STAR/L | 2023-04-24 |
InternVideo: General Video Foundation Models via Generative and Discriminative Learning | ✓ Link | 41.01 | | InternVideo | 2022-12-06 |
Relation Modeling in Spatio-Temporal Action Localization | | 40.52 | | RM (multi-scale, ensemble) | 2021-06-15 |
Actor-Context-Actor Relation Network for Spatio-Temporal Action Localization | ✓ Link | 40.49 | 39.62 | ACAR (multi-scale, ensemble) | 2020-06-14 |
Relation Modeling in Spatio-Temporal Action Localization | | 37.95 | | RM (multi-scale, ir-CSN-152) | 2021-06-15 |
Actor-Context-Actor Relation Network for Spatio-Temporal Action Localization | ✓ Link | 36.36 | | ACAR (multi-scale, R-101, 8 × 8) | 2020-06-14 |