Zero-Shot Action Recognition on UCF101



Leaderboard

Paper	Code	Top-1 Accuracy (%)	Top-5 Accuracy (%)	Model Name	Release Date
Orthogonal Temporal Interpolation for Zero-Shot Video Recognition	✓ Link	92.8		OTI(ViT-L/14)	2023-08-14
Alternating Gradient Descent and Mixture-of-Experts for Integrated Multimodal Perception		91.5		IMP-MoE-L	2023-05-10
Multimodal Open-Vocabulary Video Classification via Pre-Trained Vision and Language Models		87.1		MOV (ViT-L/14)	2022-07-15
VideoCoCa: Video-Text Modeling with Zero-Shot Transfer from Contrastive Captioners		86.6	98.4	VideoCoCa	2022-12-09
Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models	✓ Link	86.6		BIKE	2022-12-31
Revisiting Classifier: Transferring Vision-Language Models for Video Recognition	✓ Link	85.8		Text4Vis	2022-07-04
Leveraging Temporal Contextualization for Video Action Recognition	✓ Link	85.4		TC-CLIP	2024-04-15
EVA-CLIP: Improved Training Techniques for CLIP at Scale	✓ Link	83.1		EVA-CLIP-E/14+	2023-03-27
Multimodal Open-Vocabulary Video Classification via Pre-Trained Vision and Language Models		82.6		MOV (ViT-B/16)	2022-07-15
OST: Refining Text Knowledge with Optimal Spatio-Temporal Descriptor for General Video Recognition	✓ Link	79.7		OST	2023-11-30
EZ-CLIP: Efficient Zeroshot Video Action Recognition	✓ Link	79.1		EZ-CLIP	2023-12-13
MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge	✓ Link	78.2		MAXI	2023-03-15
LoCATe-GAT: Modeling Multi-Scale Local Context and Action Relationships for Zero-Shot Action Recognition	✓ Link	76.0		LoCATe-GAT	2024-11-27
VicTR: Video-conditioned Text Representations for Activity Recognition		72.4		VicTR (ViT-B/16)	2023-04-05
Expanding Language-Image Pretrained Models for General Video Recognition	✓ Link	72.0		X-CLIP	2022-08-04
Cross-modal Representation Learning for Zero-shot Action Recognition		58.7		ResT	2022-05-03
Alignment-Uniformity aware Representation Learning for Zero-shot Video Classification	✓ Link	58		AURL	2022-03-29
Rethinking Zero-shot Action Recognition: Learning from Latent Atomic Actions	✓ Link	56.0		JigsawNet	2022-03-28
CLASTER: Clustering with Reinforcement Learning for Zero-Shot Action Recognition		53.9		CLASTER	2021-01-18
Elaborative Rehearsal for Zero-shot Action Recognition	✓ Link	51.8		ER-ZSAR	2021-08-05
Rethinking Zero-shot Video Classification: End-to-end Training for Realistic Applications	✓ Link	48		E2E	2020-03-03
Synthetic Sample Selection for Generalized Zero-Shot Learning		40.9		SPOT	2023-04-06
I Know the Relationships: Zero-Shot Action Recognition via Two-Stream Graph Convolutional Networks and Knowledge Graphs	✓ Link	34.2		TS-GCN	2019-07-17
Objects2action: Classifying and localizing actions without any video example		30.3		O2A	2015-10-23
Alternative Semantic Representations for Zero-Shot Human Action Recognition		24.4		ASR	2017-06-28
Towards Universal Representation for Unseen Action Recognition		17.5		UR	2018-03-22
(no paper listed)		16.7		IAP
(no paper listed)		15.9		DAP
Multi-Task Zero-Shot Action Recognition with Prioritised Data Augmentation		15.8		MTE	2016-11-26
Zero-Shot Action Recognition With Error-Correcting Output Codes		15.1		ZSECOC	2017-07-01
An embarrassingly simple approach to zero-shot learning	✓ Link	15.0		ESZSL	2015-07-06
(no paper listed)		14.9		HAA
Evaluation of Output Embeddings for Fine-Grained Image Classification	✓ Link	12.0		SJE(Attribute)	2014-09-30
Semantic Embedding Space for Zero-Shot Action Recognition		10.9		SVE	2015-02-05
Evaluation of Output Embeddings for Fine-Grained Image Classification	✓ Link	9.9		SJE(Word Embedding)	2014-09-30
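
The Top-1 and Top-5 columns in the table report top-k accuracy: the percentage of test videos whose true class is among the k highest-scoring predicted classes. The leaderboard does not state each paper's evaluation protocol (class splits, number of runs), so the following is only a minimal sketch of the metric itself. The function name `top_k_accuracy` and the toy scores are hypothetical and are not taken from any listed paper.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]],
    labels: Sequence[int],
    k: int = 1,
) -> float:
    """Return the percentage of samples whose true label is among
    the k highest-scoring classes."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not scores:
        raise ValueError("need at least one sample")

    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by score, highest first, truncated to k.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        hits += label in top_k
    return 100.0 * hits / len(labels)


# Hypothetical video-to-class similarity scores: 4 videos, 3 unseen classes.
scores = [
    [0.7, 0.2, 0.1],  # true class 0: correct at k=1
    [0.1, 0.5, 0.4],  # true class 2: missed at k=1, found at k=2
    [0.2, 0.3, 0.5],  # true class 2: correct at k=1
    [0.6, 0.3, 0.1],  # true class 2: missed at k=1 and k=2
]
labels = [0, 2, 2, 2]

top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
print(f"top-1: {top1:.1f}%  top-2: {top2:.1f}%")
```

In a zero-shot setting the scores would typically come from comparing a video embedding against text embeddings of class names that were not seen during training. That step depends on the model and is left out here.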
