Zero-Shot Action Recognition on HMDB51

Leaderboard

Paper	Code	Top-1 Accuracy (%)	Top-5 Accuracy (%)	Accuracy (%)	Model Name	Release Date
Multimodal Open-Vocabulary Video Classification via Pre-Trained Vision and Language Models		64.7			MOV (ViT-L/14)	2022-07-15
Orthogonal Temporal Interpolation for Zero-Shot Video Recognition	✓ Link	64			OTI (ViT-L/14)	2023-08-14
Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models	✓ Link	61.4			BIKE	2022-12-31
Multimodal Open-Vocabulary Video Classification via Pre-Trained Vision and Language Models		60.8			MOV (ViT-B/16)	2022-07-15
Alternating Gradient Descent and Mixture-of-Experts for Integrated Multimodal Perception		59.1			IMP-MoE-L	2023-05-10
VideoCoCa: Video-Text Modeling with Zero-Shot Transfer from Contrastive Captioners		58.7	84.5		VideoCoCa	2022-12-09
Revisiting Classifier: Transferring Vision-Language Models for Video Recognition	✓ Link	58.4			Text4Vis	2022-07-04
Leveraging Temporal Contextualization for Video Action Recognition	✓ Link	56.0			TC-CLIP	2024-04-15
OST: Refining Text Knowledge with Optimal Spatio-Temporal Descriptor for General Video Recognition	✓ Link	55.9			OST	2023-11-30
MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge	✓ Link	52.3			MAXI	2023-03-15
VicTR: Video-conditioned Text Representations for Activity Recognition		51.0			VicTR (ViT-B/16)	2023-04-05
LoCATe-GAT: Modeling Multi-Scale Local Context and Action Relationships for Zero-Shot Action Recognition	✓ Link	50.7			LoCATe-GAT	2024-11-27
Expanding Language-Image Pretrained Models for General Video Recognition	✓ Link	44.6			X-CLIP	2022-08-04
CLASTER: Clustering with Reinforcement Learning for Zero-Shot Action Recognition		43.2			CLASTER	2021-01-18
Cross-modal Representation Learning for Zero-shot Action Recognition		41.1			ResT	2022-05-03
Alignment-Uniformity aware Representation Learning for Zero-shot Video Classification	✓ Link	39			AURL	2022-03-29
Rethinking Zero-shot Action Recognition: Learning from Latent Atomic Actions	✓ Link	38.7			JigsawNet	2022-03-28
Synthetic Sample Selection for Generalized Zero-Shot Learning		35.9			SPOT	2023-04-06
Elaborative Rehearsal for Zero-shot Action Recognition	✓ Link	35.3			ER-ZSAR	2021-08-05
Rethinking Zero-shot Video Classification: End-to-end Training for Realistic Applications	✓ Link	32.7			E2E	2020-03-03
Towards Universal Representation for Unseen Action Recognition		24.4			UR	2018-03-22
I Know the Relationships: Zero-Shot Action Recognition via Two-Stream Graph Convolutional Networks and Knowledge Graphs	✓ Link	23.2			TS-GCN	2019-07-17
Zero-Shot Action Recognition With Error-Correcting Output Codes		22.6			ZSECOC	2017-07-01
Alternative Semantic Representations for Zero-Shot Human Action Recognition		21.8			ASR	2017-06-28
Multi-Task Zero-Shot Action Recognition with Prioritised Data Augmentation		19.7			MTE	2016-11-26
		18.5			ESZSL	
Objects2action: Classifying and localizing actions without any video example		15.6			O2A	2015-10-23
Evaluation of Output Embeddings for Fine-Grained Image Classification	✓ Link	13.3			SJE (word embedding)	2014-09-30
Actor-agnostic Multi-label Action Recognition with Multi-modal Query	✓ Link			69.43	MSQNet	2023-07-20
