OpenCodePapers

Zero-Shot Action Recognition on UCF101
Leaderboard
| Paper | Code | Top-1 Accuracy (%) | Top-5 Accuracy (%) | Model Name | Release Date |
|---|---|---|---|---|---|
| Orthogonal Temporal Interpolation for Zero-Shot Video Recognition | ✓ | 92.8 | - | OTI (ViT-L/14) | 2023-08-14 |
| Alternating Gradient Descent and Mixture-of-Experts for Integrated Multimodal Perception | - | 91.5 | - | IMP-MoE-L | 2023-05-10 |
| Multimodal Open-Vocabulary Video Classification via Pre-Trained Vision and Language Models | - | 87.1 | - | MOV (ViT-L/14) | 2022-07-15 |
| VideoCoCa: Video-Text Modeling with Zero-Shot Transfer from Contrastive Captioners | - | 86.6 | 98.4 | VideoCoCa | 2022-12-09 |
| Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models | ✓ | 86.6 | - | BIKE | 2022-12-31 |
| Revisiting Classifier: Transferring Vision-Language Models for Video Recognition | ✓ | 85.8 | - | Text4Vis | 2022-07-04 |
| Leveraging Temporal Contextualization for Video Action Recognition | ✓ | 85.4 | - | TC-CLIP | 2024-04-15 |
| EVA-CLIP: Improved Training Techniques for CLIP at Scale | ✓ | 83.1 | - | EVA-CLIP-E/14+ | 2023-03-27 |
| Multimodal Open-Vocabulary Video Classification via Pre-Trained Vision and Language Models | - | 82.6 | - | MOV (ViT-B/16) | 2022-07-15 |
| OST: Refining Text Knowledge with Optimal Spatio-Temporal Descriptor for General Video Recognition | ✓ | 79.7 | - | OST | 2023-11-30 |
| EZ-CLIP: Efficient Zeroshot Video Action Recognition | ✓ | 79.1 | - | EZ-CLIP | 2023-12-13 |
| MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge | ✓ | 78.2 | - | MAXI | 2023-03-15 |
| LoCATe-GAT: Modeling Multi-Scale Local Context and Action Relationships for Zero-Shot Action Recognition | ✓ | 76.0 | - | LoCATe-GAT | 2024-11-27 |
| VicTR: Video-conditioned Text Representations for Activity Recognition | - | 72.4 | - | VicTR (ViT-B/16) | 2023-04-05 |
| Expanding Language-Image Pretrained Models for General Video Recognition | ✓ | 72.0 | - | X-CLIP | 2022-08-04 |
| Cross-modal Representation Learning for Zero-shot Action Recognition | - | 58.7 | - | ResT | 2022-05-03 |
| Alignment-Uniformity aware Representation Learning for Zero-shot Video Classification | ✓ | 58 | - | AURL | 2022-03-29 |
| Rethinking Zero-shot Action Recognition: Learning from Latent Atomic Actions | ✓ | 56.0 | - | JigsawNet | 2022-03-28 |
| CLASTER: Clustering with Reinforcement Learning for Zero-Shot Action Recognition | - | 53.9 | - | CLASTER | 2021-01-18 |
| Elaborative Rehearsal for Zero-shot Action Recognition | ✓ | 51.8 | - | ER-ZSAR | 2021-08-05 |
| Rethinking Zero-shot Video Classification: End-to-end Training for Realistic Applications | ✓ | 48 | - | E2E | 2020-03-03 |
| Synthetic Sample Selection for Generalized Zero-Shot Learning | - | 40.9 | - | SPOT | 2023-04-06 |
| I Know the Relationships: Zero-Shot Action Recognition via Two-Stream Graph Convolutional Networks and Knowledge Graphs | ✓ | 34.2 | - | TS-GCN | 2019-07-17 |
| Objects2action: Classifying and localizing actions without any video example | - | 30.3 | - | O2A | 2015-10-23 |
| Alternative Semantic Representations for Zero-Shot Human Action Recognition | - | 24.4 | - | ASR | 2017-06-28 |
| Towards Universal Representation for Unseen Action Recognition | - | 17.5 | - | UR | 2018-03-22 |
| - | - | 16.7 | - | IAP | - |
| - | - | 15.9 | - | DAP | - |
| Multi-Task Zero-Shot Action Recognition with Prioritised Data Augmentation | - | 15.8 | - | MTE | 2016-11-26 |
| Zero-Shot Action Recognition With Error-Correcting Output Codes | - | 15.1 | - | ZSECOC | 2017-07-01 |
| An embarrassingly simple approach to zero-shot learning | ✓ | 15.0 | - | ESZSL | 2015-07-06 |
| - | - | 14.9 | - | HAA | - |
| Evaluation of Output Embeddings for Fine-Grained Image Classification | ✓ | 12.0 | - | SJE (Attribute) | 2014-09-30 |
| Semantic Embedding Space for Zero-Shot Action Recognition | - | 10.9 | - | SVE | 2015-02-05 |
| Evaluation of Output Embeddings for Fine-Grained Image Classification | ✓ | 9.9 | - | SJE (Word Embedding) | 2014-09-30 |
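
The leaderboard ranks models by Top-1 and Top-5 accuracy. As a rough illustration of how such numbers are typically computed, here is a minimal Python sketch. It assumes each test video already has one score per candidate class (for example, a similarity between a video embedding and a class-name text embedding). The function name `top_k_accuracy` and the toy data are hypothetical, and the evaluation protocol behind the reported results (class splits, number of runs) is not specified on this page and may differ per paper.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]],
    labels: Sequence[int],
    k: int = 1,
) -> float:
    """Percentage of samples whose true class is among the k highest-scoring classes.

    scores[i][c] is the (hypothetical) score of class c for sample i;
    labels[i] is the index of the true class of sample i.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")

    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score; ties keep the lower index first.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return 100.0 * hits / len(labels)


# Toy example: 4 videos, 3 candidate classes (made-up scores).
toy_scores = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
    [0.6, 0.3, 0.1],
]
toy_labels = [1, 1, 0, 0]

print(top_k_accuracy(toy_scores, toy_labels, k=1))  # 50.0
print(top_k_accuracy(toy_scores, toy_labels, k=2))  # 75.0
```

Top-5 accuracy is always at least as high as Top-1 for the same predictions, which is consistent with the one row above that reports both (86.6 vs. 98.4).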