Revealing Key Details to See Differences: A Novel Prototypical Perspective for Skeleton-based Action Recognition | ✓ Link | 90.9 | 92.2 | 6 | | ProtoGCN | 2024-11-28 |
Joint Mixing Data Augmentation for Skeleton-based Action Recognition | ✓ Link | 90.9 | 91.9 | | | JMDA (based on Skeleton MixFormer) | 2024-10-13 |
Language Knowledge-Assisted Representation Learning for Skeleton-Based Action Recognition | ✓ Link | 90.7 | 91.8 | 6 | | LA-GCN | 2023-05-21 |
MSA-GCN: Exploiting Multi-Scale Temporal Dynamics With Adaptive Graph Convolution for Skeleton-Based Action Recognition | | 90.6 | 92.2 | 6 | | MSA-GCN | 2024-12-19 |
Shap-Mix: Shapley Value Guided Mixing for Long-Tailed Skeleton Based Action Recognition | ✓ Link | 90.4 | 91.7 | 4 | | Shap-Mix | 2024-07-17 |
Multi-Modality Co-Learning for Efficient Skeleton-based Action Recognition | ✓ Link | 90.3 | 91.7 | 6 | | MMCL | 2024-07-22 |
BlockGCN: Redefine Topology Awareness for Skeleton-Based Action Recognition | ✓ Link | 90.3 | 91.5 | 4 | | BlockGCN | 2024-01-01 |
TSGCNeXt: Dynamic-Static Multi-Graph Convolution for Efficient Skeleton-Based Action Recognition with Long-term Learning Potential | ✓ Link | 90.2 | 91.7 | 4 | | TSGCNeXt | 2023-04-23 |
Hierarchically Decomposed Graph Convolutional Networks for Skeleton-Based Action Recognition | ✓ Link | 90.1 | 91.6 | 6 | | HD-GCN | 2022-08-23 |
Masked Motion Predictors are Strong 3D Action Representation Learners | ✓ Link | 90.0 | 91.3 | | | MAMP | 2023-08-14 |
STEP CATFormer: Spatial-Temporal Effective Body-Part Cross Attention Transformer for Skeleton-based Action Recognition | ✓ Link | 90.0 | 91.2 | 4 | | STEP-CATFormer | 2023-12-06 |
Hypergraph Transformer for Skeleton-based Action Recognition | ✓ Link | 89.9 | 91.3 | 4 | | Hyperformer | 2022-11-17 |
Generative Action Description Prompts for Skeleton-based Action Recognition | ✓ Link | 89.9 | 91.1 | 4 | | LST | 2022-08-10 |
SkateFormer: Skeletal-Temporal Transformer for Human Action Recognition | ✓ Link | 89.8 | 91.4 | 4 | | SkateFormer | 2024-03-14 |
InfoGCN: Representation Learning for Human Skeleton-Based Action Recognition | ✓ Link | 89.8 | 91.2 | 6 | | InfoGCN | 2022-01-01 |
Action Recognition with Multi-stream Motion Modeling and Mutual Information Maximization | | 89.7 | 91.0 | | | Stream-GCN | 2023-06-13 |
DG-STGCN: Dynamic Spatial-Temporal Modeling for Skeleton-based Action Recognition | ✓ Link | 89.6 | 91.3 | 4 | | DG-STGCN | 2022-10-12 |
Graph Contrastive Learning for Skeleton-based Action Recognition | ✓ Link | 89.5 | 91.0 | 4 | | SkeletonGCL (based on CTR-GCN) | 2023-01-26 |
Joint-Partition Group Attention for skeleton-based action recognition | ✓ Link | 89.4 | 91.4 | 4 | | JPFormer | 2024-07-30 |
Skeleton-based Action Recognition via Temporal-Channel Aggregation | ✓ Link | 89.4 | 90.8 | 4 | | TCA-GCN | 2022-05-31 |
PSUMNet: Unified Modality Part Streams are All You Need for Efficient Pose-based Action Recognition | ✓ Link | 89.4 | 90.6 | | | PSUMNet | 2022-08-11 |
DSTSA-GCN: Advancing Skeleton-Based Gesture Recognition with Semantic-Aware Spatio-Temporal Topology Modeling | ✓ Link | 89.12 | 90.97 | 4 | | DSTSA-GCN | 2025-01-21 |
TSGCNeXt: Dynamic-Static Multi-Graph Convolution for Efficient Skeleton-Based Action Recognition with Long-term Learning Potential | ✓ Link | 89.1 | 90.3 | 4 | | TSGCNeXt | 2023-04-23 |
Channel-wise Topology Refinement Graph Convolution for Skeleton-Based Action Recognition | ✓ Link | 88.9 | 90.6 | 4 | | CTR-GCN | 2021-07-26 |
LLMs are Good Action Recognizers | | 88.7 | 91.5 | | | Lit-llama | 2024-03-31 |
Spatial Temporal Graph Attention Network for Skeleton-Based Action Recognition | ✓ Link | 88.7 | 90.4 | 4 | | STGAT | 2022-08-18 |
Constructing Stronger and Faster Baselines for Skeleton-based Action Recognition | ✓ Link | 88.7 | 89.1 | | | EfficientGCN-B4 | 2021-06-29 |
PYSKL: Towards Good Practices for Skeleton Action Recognition | ✓ Link | 88.6 | 90.8 | 4 | | ST-GCN++ [PYSKL, 3D Skeleton] | 2022-05-19 |
Learning Multi-Granular Spatio-Temporal Graph Network for Skeleton-based Action Recognition | ✓ Link | 88.2 | 89.3 | 4 | | DualHead-Net | 2021-08-10 |
Fusing Higher-order Features in Graph Neural Networks for Skeleton-based Action Recognition | ✓ Link | 88.2 | 89.2 | 4 | | AngNet-JA + BA + JBA + VJBA | 2021-05-04 |
Constructing Stronger and Faster Baselines for Skeleton-based Action Recognition | ✓ Link | 87.9 | 88.0 | | | EfficientGCN-B2 | 2021-06-29 |
Learning Skeletal Graph Neural Networks for Hard 3D Pose Estimation | | 87.5 | 89.2 | 4 | | Skeletal GNN | 2021-08-16 |
MaskCLR: Attention-Guided Contrastive Learning for Robust Action Representation Learning | | 87.4 | 89.5 | | | MaskCLR | 2024-01-01 |
Multi-scale spatial–temporal convolutional neural network for skeleton-based action recognition | ✓ Link | 87.4 | 88.3 | | | MSSTNet | 2023-05-12 |
Stronger, Faster and More Explainable: A Graph Convolutional Baseline for Skeleton-based Action Recognition | ✓ Link | 87.3 | 88.3 | | | PA-ResGCN-B19 | 2020-10-20 |
Quo Vadis, Skeleton Action Recognition ? | ✓ Link | 87.22 | 88.8 | | | Ensemble-top5 (MS-G3D Net + 4s Shift-GCN + VA-CNN (ResNeXt101) + 2s SDGCN + GCN-NAS (retrained)) | 2020-07-04 |
Revisiting Skeleton-based Action Recognition | ✓ Link | 86.9 | 90.3 | | | PoseC3D (w. HRNet 2D Skeleton) | 2021-04-28 |
Disentangling and Unifying Graph Convolutions for Skeleton-Based Action Recognition | ✓ Link | 86.9 | 88.4 | | | MS-G3D Net | 2020-03-31 |
Decoupled Spatial-Temporal Attention Network for Skeleton-Based Action Recognition | ✓ Link | 86.6 | 89.0 | | | DSTA-Net | 2020-07-07 |
VPN: Learning Video-Pose Embedding for Activities of Daily Living | ✓ Link | 86.3 | 87.8 | | | VPN | 2020-07-06 |
Spatial Temporal Graph Convolutional Networks for Skeleton-Based Action Recognition | ✓ Link | 86.2 | 88.4 | | | ST-GCN [PYSKL, 3D Skeleton] | 2018-01-23 |
Skeleton-Based Action Recognition With Shift Graph Convolutional Network | ✓ Link | 85.9 | 87.6 | 4 | | 4s Shift-GCN | 2020-06-01 |
Constructing Stronger and Faster Baselines for Skeleton-based Action Recognition | ✓ Link | 85.9 | 84.3 | | | EfficientGCN-B0 | 2021-06-29 |
Feedback Graph Convolutional Network for Skeleton-based Action Recognition | | 85.4 | 87.4 | | | FGCN | 2020-03-17 |
| | 84.88 | 86.90 | | | VA-CNN (ResNeXt-101) | |
Continual Spatio-Temporal Graph Convolutional Networks | ✓ Link | 84.8 | 86.2 | | 32.4 | S-TR (2-stream) | 2022-03-21 |
Continual Spatio-Temporal Graph Convolutional Networks | ✓ Link | 84.8 | 86.1 | | 0.3 | CoS-TR* (2-stream) | 2022-03-21 |
Spatial Temporal Graph Convolutional Networks for Skeleton-Based Action Recognition | ✓ Link | 84.7 | 89.0 | | | ST-GCN [PYSKL, 2D Skeleton] | 2018-01-23 |
HYperbolic Self-Paced Learning for Self-Supervised Skeleton-based Action Representations | ✓ Link | 84.5 | 86.3 | | | 3s-HYSP | 2023-03-10 |
Continual Spatio-Temporal Graph Convolutional Networks | ✓ Link | 84.0 | 85.5 | | 0.32 | CoST-GCN* (2-stream) | 2022-03-21 |
Continual Spatio-Temporal Graph Convolutional Networks | ✓ Link | 84 | 85.4 | | 37.38 | AGCN (2-stream) | 2022-03-21 |
Continual Spatio-Temporal Graph Convolutional Networks | ✓ Link | 83.7 | 85.1 | | 33.46 | ST-GCN (2-stream) | 2022-03-21 |
Skeleton-based Action Recognition via Spatial and Temporal Transformer Networks | ✓ Link | 82.7 | 84.7 | | | ST-TR-agcn | 2020-08-17 |
HYperbolic Self-Paced Learning for Self-Supervised Skeleton-based Action Representations | ✓ Link | 81.4 | 82 | | | HYSP | 2023-03-10 |
Richly Activated Graph Convolutional Network for Robust Skeleton-based Action Recognition | ✓ Link | 81.1 | 82.7 | | | 3s RA-GCN | 2020-08-09 |
USDRL: Unified Skeleton-Based Dense Representation Learning with Multi-Grained Feature Decorrelation | ✓ Link | 80.6 | 79.3 | | | 3s-USDRL (DSTE) | 2024-12-12 |
Mix Dimension in Poincaré Geometry for 3D Skeleton-based Action Recognition | | 80.5 | 83.2 | | | Mix-Dimension | 2020-07-30 |
Continual Spatio-Temporal Graph Convolutional Networks | ✓ Link | 80.4 | 82 | | 0.44 | CoAGCN* (2-stream) | 2022-03-21 |
Continual Spatio-Temporal Graph Convolutional Networks | ✓ Link | 80.2 | 81.8 | | 16.2 | S-TR (1-stream) | 2022-03-21 |
Continual Spatio-Temporal Graph Convolutional Networks | ✓ Link | 79.7 | 81.7 | | 0.15 | CoS-TR* (1-stream) | 2022-03-21 |
Continual Spatio-Temporal Graph Convolutional Networks | ✓ Link | 79.7 | 80.7 | | 18.69 | AGCN (1-stream) | 2022-03-21 |
Continual Spatio-Temporal Graph Convolutional Networks | ✓ Link | 79.4 | 81.6 | | 0.16 | CoST-GCN* (1-stream) | 2022-03-21 |
Continual Spatio-Temporal Graph Convolutional Networks | ✓ Link | 79 | | | 16.73 | ST-GCN (1-stream) | 2022-03-21 |
Vertex Feature Encoding and Hierarchical Temporal Modeling in a Spatial-Temporal Graph Convolutional Network for Action Recognition | | 78.3 | 79.8 | | | GVFE + AS-GCN with DH-TCN | 2019-12-20 |
Continual Spatio-Temporal Graph Convolutional Networks | ✓ Link | 77.3 | 79.1 | | 0.22 | CoAGCN* (1-stream) | 2022-03-21 |
Gimme Signals: Discriminative signal encoding for multimodal activity recognition | ✓ Link | 70.8 | 71.6 | | | Gimme Signals (Skeleton, AIS) | 2020-03-13 |
Learning stochastic differential equations using RNN with log signature features | | 68.3 | 67.2 | | | Logsig-RNN | 2019-08-22 |
Skeleton Image Representation for 3D Action Recognition based on Tree Structure and Reference Joints | ✓ Link | 67.9 | 62.8 | | | TSRJI (Late Fusion) + HCN | 2019-09-11 |
SkeleMotion: A New Representation of Skeleton Joint Sequences Based on Motion Information for 3D Action Recognition | ✓ Link | 67.7 | 66.9 | | | SkeleMotion + Yang et al. (2018) | 2019-07-30 |
Skeleton Image Representation for 3D Action Recognition based on Tree Structure and Reference Joints | ✓ Link | 65.5 | 59.7 | | | TSRJI (Late Fusion) | 2019-09-11 |
Recognizing Human Actions as the Evolution of Pose Estimation Maps | | 64.6 | 66.9 | | | Body Pose Evolution Map | 2018-06-01 |
SkeleMotion: A New Representation of Skeleton Joint Sequences Based on Motion Information for 3D Action Recognition | ✓ Link | 62.9 | 63.0 | | | SkeleMotion [Magnitude-Orientation (TSA)] | 2019-07-30 |
Learning clip representations for skeleton-based 3d action recognition | | 62.2 | 61.8 | | | Multi-Task CNN with RotClips | 2018-03-05 |
Skeleton-Based Human Action Recognition with Global Context-Aware Attention LSTM Networks | | 61.2 | 63.3 | | | Two-Stream Attention LSTM | 2017-07-18 |
Enhanced skeleton visualization for view invariant human action recognition | | 60.3 | 63.2 | | | Skeleton Visualization (Single Stream) | 2017-08-01 |
Skeleton-Based Online Action Prediction Using Scale Selection Network | | 59.9 | 62.4 | | | FSNet | 2019-02-08 |
A New Representation of Skeleton Sequences for 3D Action Recognition | | 58.4 | 57.9 | | | Multi-Task Learning Network | 2017-03-09 |
Global Context-Aware Attention LSTM Networks for 3D Action Recognition | | 58.3 | 59.2 | | | GCA-LSTM | 2017-07-01 |
Skeleton-Based Action Recognition Using Spatio-Temporal LSTM Network with Trust Gates | | 58.2 | 60.9 | | | Internal Feature Fusion | 2017-06-26 |
Spatio-Temporal LSTM with Trust Gates for 3D Human Action Recognition | | 55.7 | 57.9 | | | Spatio-Temporal LSTM | 2016-07-24 |
Jointly learning heterogeneous features for rgb-d activity recognition | | 50.8 | 54.7 | | | Dynamic Skeletons | 2016-12-15 |
Early action prediction by soft regression | | 36.3 | 44.9 | | | Soft RNN | 2018-08-06 |
NTU RGB+D: A Large Scale Dataset for 3D Human Activity Analysis | ✓ Link | 25.5 | 26.3 | | | Part-Aware LSTM | 2016-04-11 |