OpenCodePapers
Action Recognition in Videos on ActivityNet
Action Recognition
Leaderboard
| Paper | Code | mAP | Model Name | Release Date |
|---|---|---|---|---|
| Revisiting Classifier: Transferring Vision-Language Models for Video Recognition | ✓ | 96.9 | Text4Vis (w/ ViT-L) | 2022-07-04 |
| Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models | ✓ | 96.1 | BIKE | 2022-12-31 |
| InternVideo2: Scaling Foundation Models for Multimodal Video Understanding | ✓ | 95.9 | InternVideo2-6B | 2024-03-22 |
| NSNet: Non-saliency Suppression Sampler for Efficient Video Recognition | - | 94.3 | NSNet (w/ Swin-L) | 2022-07-21 |
| Temporal Saliency Query Network for Efficient Video Recognition | - | 93.7 | TSQNet (w/ Swin-L) | 2022-07-21 |
| DSANet: Dynamic Segment Aggregation Network for Video-Level Representation Learning | ✓ | 90.5 | DSANet (w/ 3D ResNet50) | 2021-05-25 |
| Multi-Agent Reinforcement Learning Based Frame Sampling for Effective Untrimmed Video Recognition | - | 90.05 | MARL (w/ SEResNeXt-152) | 2019-07-31 |
| Listen to Look: Action Recognition by Previewing Audio | ✓ | 89.9 | ListenToLook | 2019-12-10 |
| Dynamic Sampling Networks for Efficient Action Recognition in Videos | - | 87.9 | DSN | 2020-06-28 |
| SMART Frame Selection for Action Recognition | - | 84.4 | SMART | 2020-12-19 |
| 2D or not 2D? Adaptive 3D Convolution Selection for Efficient Video Recognition | - | 84.0 | Ada3D | 2020-12-29 |
| Fine-grained Video Categorization with Redundancy Reduction Attention | - | 83.4 | RRA | 2018-10-26 |
| Learning Spatio-Temporal Representation with Pseudo-3D Residual Networks | ✓ | 78.9 | P3D | 2017-11-28 |
| Do Less and Achieve More: Training CNNs for Action Recognition Utilizing Action Images from the Web | - | 53.8 | VGG19 + 393K webcam images | 2015-12-22 |
| Towards Universal Representation for Unseen Action Recognition | - | 53.8 | CD-UAR | 2018-03-22 |
| Do Less and Achieve More: Training CNNs for Action Recognition Utilizing Action Images from the Web | - | 52.3 | VGG19 | 2015-12-22 |