InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions | ✓ Link | 97.2 | | | InternImage-H | 2022-11-10 |
When Pedestrian Detection Meets Multi-Modal Learning: Generalist Model and Benchmark Dataset | ✓ Link | 97.1 | 30.8 | | MMPedestron | 2024-07-14 |
Progressive End-to-End Object Detection in Crowded Scenes | ✓ Link | 94.1 | 37.7 | | Progressive DETR | 2022-03-15 |
Dense Distinct Query for End-to-End Object Detection | ✓ Link | 93.8 | 39.7 | 98.7 | DDQ DETR (R50) | 2023-03-22 |
Dense Distinct Query for End-to-End Object Detection | ✓ Link | 93.5 | 40.4 | 98.6 | DDQ R-CNN (R50) | 2023-03-22 |
Hulk: A Universal Knowledge Translator for Human-Centric Tasks | ✓ Link | 93 | 36.5 | | Hulk(Finetune, ViT-L) | 2023-12-04 |
Dense Distinct Query for End-to-End Object Detection | ✓ Link | 92.7 | 41.0 | 98.2 | DDQ FCN (R50 One-Stage) | 2023-03-22 |
UniHCP: A Unified Model for Human-Centric Perceptions | ✓ Link | 92.5 | 41.6 | | UniHCP (finetune) | 2023-03-06 |
Hulk: A Universal Knowledge Translator for Human-Centric Tasks | ✓ Link | 92.4 | 40.7 | | Hulk(Finetune, ViT-B) | 2023-12-04 |
V2F-Net: Explicit Decomposition of Occluded Pedestrian Detection | | 91.03 | 42.28 | 84.2 | V2F-Net | 2021-04-07 |
Detection in Crowded Scenes: One Proposal, Multiple Predictions | ✓ Link | 90.7 | 41.4 | | CrowdDet | 2020-03-20 |
Beta R-CNN: Looking into Pedestrian Detection from Another Perspective | | 89.6 | 40.3 | | Beta R-CNN | 2022-10-23 |
NOH-NMS: Improving Pedestrian Detection by Nearby Objects Hallucination | ✓ Link | 89.0 | 43.9 | | NOH-NMS | 2020-07-27 |
IterDet: Iterative Scheme for Object Detection in Crowded Environments | | 88.08 | 49.44 | | IterDet (Faster RCNN, ResNet50, 2 iterations) | 2020-05-12 |
PS-RCNN: Detecting Secondary Human Instances in a Crowd via Primary Object Suppression | | 87.94 | | | PS-RCNN (Faster RCNN, ResNet50, COCO Instance Masks) | 2020-03-16 |
PS-RCNN: Detecting Secondary Human Instances in a Crowd via Primary Object Suppression | | 86.05 | | | PS-RCNN (Faster RCNN, ResNet50) | 2020-03-16 |
CrowdHuman: A Benchmark for Detecting Human in a Crowd | ✓ Link | 84.95 | 50.49 | | Faster RCNN (ResNet50) | 2018-04-30 |
Adaptive NMS: Refining Pedestrian Detection in a Crowd | | 84.71 | 49.73 | | Adaptive NMS (Faster RCNN, ResNet50) | 2019-04-07 |
IterDet: Iterative Scheme for Object Detection in Crowded Environments | | 84.43 | 49.12 | | IterDet (Faster RCNN, ResNet50, 1 iteration) | 2020-05-12 |