ViTPose: Simple Vision Transformer Baselines for Human Pose Estimation | ✓ Link | 93.3 | 92.8 | ViTPose (ViTAE-G, GT bounding boxes) | 2022-04-26 |
UniHCP: A Unified Model for Human-Centric Perceptions | ✓ Link | 87.4 | | UniHCP (direct eval) | 2023-03-06 |
PoseBH: Prototypical Multi-Dataset Training Beyond Human Pose Estimation | ✓ Link | 87.0 | 86.0 | PoseBH-H | 2025-05-23 |
RTMPose: Real-Time Multi-Person Pose Estimation based on MMPose | ✓ Link | 80.3 | 80.5 | RTMPose (RTMPose-l, GT bounding boxes) | 2023-03-13 |
Detection, Pose Estimation and Segmentation for Multiple Bodies: Closing the Virtuous Circle | ✓ Link | 48.3 | 48.6 | BBox-Mask-Pose 2x | 2024-12-02 |
Rethinking pose estimation in crowds: overcoming the detection information-bottleneck and ambiguity | ✓ Link | 47.2 | 47.7 | BUCTD (CID-W32) | 2023-06-13 |
You Only Learn One Query: Learning Unified Human Query for Single-Stage Multi-Person Multi-Task Human-Centric Perception | ✓ Link | 45.6 | | HQNet (ViT-L) | 2023-12-09 |
Contextual Instance Decoupling for Robust Multi-Person Pose Estimation | ✓ Link | 45.0 | 46.1 | CID (HRNet-W48) | 2022-01-01 |
Detection, Pose Estimation and Segmentation for Multiple Bodies: Closing the Virtuous Circle | ✓ Link | 45.0 | 45.3 | MaskPose-b | 2024-12-02 |
Multi-Instance Pose Networks: Rethinking Top-Down Pose Estimation | ✓ Link | 42.5 | 42.0 | MIPNet (HRNet-W48) | 2021-01-27 |
You Only Learn One Query: Learning Unified Human Query for Single-Stage Multi-Person Multi-Task Human-Centric Perception | ✓ Link | 40.0 | | HQNet (ResNet-50) | 2023-12-09 |
Multi-Instance Pose Networks: Rethinking Top-Down Pose Estimation | ✓ Link | 37.2 | 37.8 | HRNet-W48 | 2021-01-27 |
Differentiable Hierarchical Graph Grouping for Multi-Person Pose Estimation | | 36.0 | 41.8 | HGG (AE+) | 2020-07-23 |
Simple Baselines for Human Pose Estimation and Tracking | ✓ Link | 33.3 | 41.0 | ResNet-152 | 2018-04-17 |
Associative Embedding: End-to-End Learning for Joint Detection and Grouping | ✓ Link | 32.8 | 40.0 | Associative Embedding+ | 2016-11-16 |
RMPE: Regional Multi-person Pose Estimation | ✓ Link | 30.7 | 38.8 | RMPE | 2016-12-01 |
Simple Baselines for Human Pose Estimation and Tracking | ✓ Link | 29.5 | 32.1 | ResNet-50 | 2018-04-17 |
Associative Embedding: End-to-End Learning for Joint Detection and Grouping | ✓ Link | 29.5 | 32.1 | Associative Embedding | 2016-11-16 |
TransPose: Keypoint Localization via Transformer | ✓ Link | | 62.3 | TransPose-H | 2020-12-28 |