Temporal Action Localization on THUMOS14

Leaderboard

Paper	Code	Avg mAP (0.3:0.7)	mAP IoU@0.1	mAP IoU@0.2	mAP IoU@0.3	mAP IoU@0.4	mAP IoU@0.5	mAP IoU@0.6	mAP IoU@0.7	Model Name	Release Date
End-to-End Temporal Action Detection with 1B Parameters Across 1000 Frames	✓ Link	76.9			89.7	86.7	80.9	71.0	56.1	AdaTAD (VideoMAEv2-giant)	2023-11-28
Enhancing Temporal Action Localization: Advanced S6 Modeling with Recurrent Mechanism	✓ Link	74.2			88.7	84.6	78.2	66.6	51.9	RDFA-S6 (InternVideo2-6B)	2024-07-18
Video Mamba Suite: State Space Model as a Versatile Alternative for Video Understanding	✓ Link	72.72			86.89	83.09	76.90	65.91	50.82	ActionMamba(InternVideo2-6B)	2024-03-14
InternVideo2: Scaling Foundation Models for Multimodal Video Understanding	✓ Link	72.0								InternVideo2-6B	2024-03-22
InternVideo: General Video Foundation Models via Generative and Discriminative Learning	✓ Link	71.58								ActionFormer (InternVideo features)	2022-12-06
Temporal Action Localization with Enhanced Instant Discriminability	✓ Link	70.1			84.8	80.0	73.3	63.8	48.8	TriDet (VideoMAE v2-g feature)	2023-09-11
InternVideo2: Scaling Foundation Models for Multimodal Video Understanding	✓ Link	69.8								InternVideo2-1B	2024-03-22
VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking	✓ Link	69.6			84.0	79.6	73.0	63.5	47.7	ActionFormer (VideoMAE V2-g features)	2023-03-29
TriDet: Temporal Action Detection with Relative Boundary Modeling	✓ Link	69.3			83.6	80.1	72.9	62.4	47.4	TriDet (I3D features)	2023-03-13
Action Sensitivity Learning for Temporal Action Localization		67.9			83.1	79.0	71.7	59.7	45.8	ASL(I3D features)	2023-05-25
TemporalMaxer: Maximize Temporal Context with only Max Pooling for Temporal Action Localization	✓ Link	67.7			82.8	78.9	71.8	60.5	44.7	TemporalMaxer (I3D features)	2023-03-16
ActionFormer: Localizing Moments of Actions with Transformers	✓ Link	66.8			82.1	77.8	71.0	59.4	43.9	ActionFormer (I3D features)	2022-02-16
Dual DETRs for Multi-Label Temporal Action Detection		66.8			82.9	78.0	70.4	58.5	44.4	DualDETR (I3D features)	2024-03-31
TadML: A fast temporal action detection with Mechanics-MLP	✓ Link	59.70			73.29	69.73	62.53	53.36	39.60	TadML(two-stream)	2022-06-07
BasicTAD: an Astounding RGB-Only Baseline for Temporal Action Detection	✓ Link	59.6			75.5	70.8	63.5	50.9	37.4	BasicTAD (160,6,192,R50-SlowOnly)	2022-05-05
End-to-end Temporal Action Detection with Transformer	✓ Link	56.7			74.8	69.1	60.1	46.6	32.8	TadTR	2021-06-18
ReAct: Temporal Action Detection with Relational Queries	✓ Link	55.0			69.2	65.0	57.1	47.8	35.6	ReAct (TSN features)	2022-07-14
BasicTAD: an Astounding RGB-Only Baseline for Temporal Action Detection	✓ Link	54.9			68.4	65.0	58.6	49.2	33.5	BasicTAD (112,3,96,R50-SlowOnly)	2022-05-05
An Empirical Study of End-to-End Temporal Action Detection	✓ Link	54.2			69.4	64.3	56.0	46.4	34.9	E2E-TAD (SlowFast R50+TadTR)	2022-04-06
TadML: A fast temporal action detection with Mechanics-MLP	✓ Link	53.46			68.78	64.66	56.61	45.40	31.88	TadML(rgb-only)	2022-06-07
Multi-shot Temporal Event Localization: a Benchmark	✓ Link	53.4			68.9	64.0	56.9	46.3	31.0	MUSES	2020-12-17
Hear Me Out: Fusional Approaches for Audio Augmented Temporal Action Localization	✓ Link	53.3			70.1	64.9	57.1	45.4	28.8	AVFusion	2021-06-27
Proposal-Free Temporal Action Detection via Global Segmentation Mask Learning	✓ Link	52.8			68.6	63.8	57.0	46.3	31.8	TAGS (I3D)	2022-07-14
DCAN: Improving Temporal Action Detection via Dual Context Aggregation	✓ Link	52.3			68.2	62.7	54.1	43.9	32.6	DCAN (TSN features)	2021-12-07
TSP: Temporally-Sensitive Pretraining of Video Encoders for Localization Tasks	✓ Link	50.46	74.02	72.29	69.1	63.3	53.5	40.4	26	TSP	2020-11-23
Video Self-Stitching Graph Network for Temporal Action Localization	✓ Link	50.2			66.7	60.4	52.4	41.0	30.4	VSGN	2020-11-30
RGB Stream Is Enough for Temporal Action Detection	✓ Link	50.0			62.8	59.5	53.8	43.6	30.1	DaoTAD	2021-07-09
Decoupling Localization and Classification in Single Shot Temporal Action Detection	✓ Link	42.0			60.2	54.1	44.2	32.3	19.1	Decouple-SSAD	2019-04-16
Rethinking the Faster R-CNN Architecture for Temporal Action Localization		39.8	59.8	57.1	53.2	48.5	42.8	33.8	20.8	TAL-Net	2018-04-20
Graph Convolutional Module for Temporal Action Localization in Videos			72.5	70.9	66.5	60.8	51.9			GCM	2021-12-01
Activity Graph Transformer for Temporal Action Localization			72.1	69.8	65	58.1	50.2			AGT (Ours)	2021-01-21
Graph Convolutional Networks for Temporal Action Localization	✓ Link		69.5	67.8	63.6	57.8	49.1			P-GCN	2019-09-07
Weakly Supervised Temporal Action Localization Using Deep Metric Learning	✓ Link		62.3		46.8		29.6		9.7	DeepMetricLearner	2020-01-21
Cascaded Boundary Regression for Temporal Action Detection			60.1	56.7	50.1	41.3	31	19.1	9.9	CBR-TS	2017-05-02
R-C3D: Region Convolutional 3D Network for Temporal Activity Detection	✓ Link		54.5	51.5	44.8	35.6	28.9			R-C3D	2017-03-22
TURN TAP: Temporal Unit Regression Network for Temporal Action Proposals	✓ Link		54	50.9	44.1	34.9	25.6			TURN-FL-16 + S-CNN	2017-03-17
End-to-end Learning of Action Detection from Frame Glimpses in Videos	✓ Link		48.9	44.0	36.0	26.4	17.1			Yeung et al.	2015-11-22
Temporal Action Localization in Untrimmed Videos via Multi-stage CNNs	✓ Link		47.7	43.5	36.3	28.7	19			S-CNN	2016-01-09
BSN: Boundary Sensitive Network for Temporal Action Proposal Generation	✓ Link				53.5	45	36.9	28.4	20	BSN UNet	2018-06-08
CDC: Convolutional-De-Convolutional Networks for Precise Temporal Action Localization in Untrimmed Videos	✓ Link				40.1	29.4	23.3	13.1	7.9	CDC	2017-03-04
G-TAD: Sub-Graph Localization for Temporal Action Detection	✓ Link						40.2			G-TAD	2019-11-26
BMN: Boundary-Matching Network for Temporal Action Proposal Generation	✓ Link						32.2			BMN	2019-07-23
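
The "Avg mAP (0.3:0.7)" column appears to be the arithmetic mean of the five per-threshold mAP columns from IoU 0.3 to 0.7; this holds for the rows checked by hand, such as AdaTAD and ActionFormer (I3D features). The thresholds refer to temporal IoU, the overlap between a predicted segment and a ground-truth segment along the time axis. The following is a minimal Python sketch of both quantities. The function names `average_map` and `temporal_iou` are illustrative and do not come from any benchmark toolkit, and the segment values in the temporal IoU example are made up.

```python
def average_map(map_by_iou, thresholds=(0.3, 0.4, 0.5, 0.6, 0.7)):
    """Mean of the per-threshold mAP values.

    Returns None when any threshold is missing, as in rows of the
    table that report only a subset of IoU thresholds.
    """
    values = [map_by_iou.get(t) for t in thresholds]
    if any(v is None for v in values):
        return None
    return sum(values) / len(values)


def temporal_iou(seg_a, seg_b):
    """Temporal IoU of two (start, end) segments, e.g. in seconds."""
    start_a, end_a = seg_a
    start_b, end_b = seg_b
    intersection = max(0.0, min(end_a, end_b) - max(start_a, start_b))
    union = (end_a - start_a) + (end_b - start_b) - intersection
    return intersection / union if union > 0 else 0.0


# Per-threshold mAP values copied from the AdaTAD (VideoMAEv2-giant) row.
adatad = {0.3: 89.7, 0.4: 86.7, 0.5: 80.9, 0.6: 71.0, 0.7: 56.1}
adatad_avg = average_map(adatad)
print(round(adatad_avg, 1))  # the table lists 76.9 for this row

# G-TAD reports only mAP at IoU 0.5, so no average can be formed.
gtad_avg = average_map({0.5: 40.2})

# Hypothetical segments: a 10 s prediction overlapping a 10 s ground truth by 5 s.
example_tiou = temporal_iou((0.0, 10.0), (5.0, 15.0))
```

Under this reading, rows with an empty "Avg mAP" cell are the ones missing at least one threshold between 0.3 and 0.7, which is why `average_map` returns `None` for them instead of averaging a partial set.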

