Token Turing Machines | ✓ Link | 28.79 | TTM | 2022-11-16 |
CTRN: Class-Temporal Relational Network for Action Detection | | 27.8 | CTRN | 2021-10-26 |
Weakly-guided Self-supervised Pretraining for Temporal Activity Detection | ✓ Link | 26.95 | Coarse-Fine Networks (w/ self-supervised detection pretraining) | 2021-11-26 |
UniMD: Towards Unifying Moment Retrieval and Temporal Action Detection | ✓ Link | 26.53 | UniMD+Sync. (RGB+Flow) | 2024-04-07 |
PDAN: Pyramid Dilated Attention Network for Action Detection | ✓ Link | 26.5 | PDAN (RGB+Flow) | 2021-01-05 |
PAT: Position-Aware Transformer for Dense Multi-Label Action Detection | | 26.5 | PAT | 2023-08-09 |
MS-TCT: Multi-Scale Temporal ConvTransformer for Action Detection | ✓ Link | 25.4 | MS-TCT (RGB only) | 2021-12-07 |
AViD Dataset: Anonymized Videos from Diverse Countries | ✓ Link | 25.2 | 3D ResNet-50 + super-events pretrained on AViD | 2020-07-10 |
Coarse-Fine Networks for Temporal Activity Detection in Videos | ✓ Link | 25.1 | Coarse-Fine Networks | 2021-03-01 |
Representation Learning on Visual-Symbolic Graphs for Video Understanding | | 23.7 | I3D + biGRU + VS-ST-MPNN | 2019-05-17 |
Modeling Multi-Label Action Dependencies for Temporal Action Localization | ✓ Link | 23.7 | MLAD (RGB+Flow) | 2021-03-04 |
AViD Dataset: Anonymized Videos from Diverse Countries | ✓ Link | 23.2 | 3D ResNet-50 pretrained on AViD | 2020-07-10 |
Temporal Gaussian Mixture Layer for Videos | ✓ Link | 22.3 | TGM (RGB+Flow) | 2018-03-16 |
Learning Latent Super-Events to Detect Multiple Activities in Videos | ✓ Link | 19.41 | Super-events (RGB+Flow) | 2017-12-05 |
R-C3D: Region Convolutional 3D Network for Temporal Activity Detection | ✓ Link | 12.4 | R-C3D | 2017-03-22 |
Asynchronous Temporal Fields for Action Recognition | ✓ Link | 9.6 | Sigurdsson et al. | 2016-12-19 |