Mr. DETR: Instructive Multi-Route Training for Detection Transformers | ✓ Link | 61.8 | 79.0 | 67.6 | 75.7 | 65.6 | 47.7 | | Mr. DETR (Swin-L, 1x, 5scale) | 2024-12-13 |
Mr. DETR: Instructive Multi-Route Training for Detection Transformers | ✓ Link | 58.4 | 76.3 | 63.9 | 75.3 | 62.8 | 40.8 | | Mr. DETR (Swin-L, 1x, 4scale) | 2024-12-13 |
MI-DETR: An Object Detection Model with Multi-time Inquiries Mechanism | ✓ Link | 58.2 | 76.5 | 63.4 | 74.6 | 62.8 | 42.5 | | MI-DETR (Swin-L 1x) | 2025-03-03 |
Relation DETR: Exploring Explicit Position Relation Prior for Object Detection | ✓ Link | 58.1 | 76.4 | 63.5 | 73.5 | 63.0 | 41.8 | | Relation-DETR (Swin-L 2x) | 2024-07-16 |
Relation DETR: Exploring Explicit Position Relation Prior for Object Detection | ✓ Link | 57.8 | 76.1 | 62.9 | 74.4 | 62.1 | 41.2 | | Relation-DETR (Swin-L 1x) | 2024-07-16 |
Salience DETR: Enhancing Detection Transformer with Hierarchical Salience Filtering Refinement | ✓ Link | 57.3 | 75.5 | 62.3 | 74.5 | 61.8 | 40.9 | 220M | Salience-DETR (Focal-L 1x) | 2024-03-24 |
YOLOv6 v3.0: A Full-Scale Reloading | ✓ Link | 57.2 | 74.5 | | | | | | YOLOv6-L6 (46 fps, V100, bs1) | 2023-01-13 |
Salience DETR: Enhancing Detection Transformer with Hierarchical Salience Filtering Refinement | ✓ Link | 56.5 | 75.0 | 61.5 | 72.8 | 61.2 | 40.2 | 210M | Salience-DETR (Swin-L 1x) | 2024-03-24 |
MogaNet: Multi-order Gated Aggregation Network | ✓ Link | 56.2 | | | | | | | MogaNet-XL (Cascade Mask R-CNN) | 2022-11-07 |
MogaNet: Multi-order Gated Aggregation Network | ✓ Link | 53.3 | | | | | | | MogaNet-L (Cascade Mask R-CNN) | 2022-11-07 |
MogaNet: Multi-order Gated Aggregation Network | ✓ Link | 52.6 | | | | | | | MogaNet-B (Cascade Mask R-CNN) | 2022-11-07 |
Relation DETR: Exploring Explicit Position Relation Prior for Object Detection | ✓ Link | 52.1 | 69.7 | 56.6 | 66.5 | 56.0 | 36.1 | | Relation-DETR (ResNet50 2x) | 2024-07-16 |
Relation DETR: Exploring Explicit Position Relation Prior for Object Detection | ✓ Link | 51.7 | 69.1 | 56.3 | 66.1 | 55.6 | 36.1 | | Relation-DETR (ResNet50 1x) | 2024-07-16 |
MogaNet: Multi-order Gated Aggregation Network | ✓ Link | 51.6 | | | | | | | MogaNet-S (Cascade Mask R-CNN) | 2022-11-07 |
RF-Next: Efficient Receptive Field Search for Convolutional Neural Networks | ✓ Link | 50.9 | | | | | | | RF-ConvNeXt-T Cascade R-CNN | 2022-06-14 |
Salience DETR: Enhancing Detection Transformer with Hierarchical Salience Filtering Refinement | ✓ Link | 50.0 | 67.7 | 54.2 | 64.4 | 54.4 | 33.3 | 56M | Salience-DETR (ResNet50 1x) | 2024-03-24 |
Enhanced Training of Query-Based Object Detection via Selective Query Recollection | ✓ Link | 49.8 | | | | | | | SQR-Adamixer-R101 | 2022-12-15 |
MogaNet: Multi-order Gated Aggregation Network | ✓ Link | 49.4 | | | | | | | MogaNet-L (Mask R-CNN 1x) | 2022-11-07 |
ViDT: An Efficient and Effective Fully Transformer-based Object Detector | ✓ Link | 49.2 | 69.4 | 53.1 | 66.9 | 52.6 | 30.6 | 100M | ViDT Swin-base | 2021-10-08 |
Enhanced Training of Query-Based Object Detection via Selective Query Recollection | ✓ Link | 48.9 | | | | | | | SQR-Adamixer-R50 | 2022-12-15 |
MogaNet: Multi-order Gated Aggregation Network | ✓ Link | 48.7 | | | | | | | MogaNet-L (RetinaNet 1x) | 2022-11-07 |
MogaNet: Multi-order Gated Aggregation Network | ✓ Link | 47.9 | | | | | | | MogaNet-B (Mask R-CNN 1x) | 2022-11-07 |
MogaNet: Multi-order Gated Aggregation Network | ✓ Link | 47.7 | | | | | | | MogaNet-B (RetinaNet 1x) | 2022-11-07 |
ViDT: An Efficient and Effective Fully Transformer-based Object Detector | ✓ Link | 47.5 | 67.7 | 51.4 | 64.8 | 50.7 | 29.2 | 61M | ViDT Swin-small | 2021-10-08 |
MogaNet: Multi-order Gated Aggregation Network | ✓ Link | 46.7 | | | | | | | MogaNet-S (Mask R-CNN 1x) | 2022-11-07 |
MogaNet: Multi-order Gated Aggregation Network | ✓ Link | 45.8 | | | | | | | MogaNet-S (RetinaNet 1x) | 2022-11-07 |
ViDT: An Efficient and Effective Fully Transformer-based Object Detector | ✓ Link | 44.8 | 64.5 | 48.7 | 62.1 | 47.6 | 25.9 | 38M | ViDT Swin-tiny | 2021-10-08 |
MogaNet: Multi-order Gated Aggregation Network | ✓ Link | 42.6 | | | | | | | MogaNet-T (Mask R-CNN 1x) | 2022-11-07 |
MogaNet: Multi-order Gated Aggregation Network | ✓ Link | 41.4 | | | | | | | MogaNet-T (RetinaNet 1x) | 2022-11-07 |
MogaNet: Multi-order Gated Aggregation Network | ✓ Link | 40.7 | | | | | | | MogaNet-XT (Mask R-CNN 1x) | 2022-11-07 |
ViDT: An Efficient and Effective Fully Transformer-based Object Detector | ✓ Link | 40.4 | 59.6 | 43.3 | 55.8 | 42.5 | 23.2 | 16M | ViDT Swin-nano | 2021-10-08 |
MogaNet: Multi-order Gated Aggregation Network | ✓ Link | 39.7 | | | | | | | MogaNet-XT (RetinaNet 1x) | 2022-11-07 |
Dynamic Head: Unifying Object Detection Heads with Attentions | ✓ Link | | 68.0 | 54.3 | 64.2 | | | | DyHead (Swin-T, multi scale) | 2021-06-15 |