SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory | ✓ Link | 81.7 | 92.2 | 76.9 | SAMURAI-L | 2024-11-18 |
A Distractor-Aware Memory for Visual Object Tracking with SAM2 | ✓ Link | 81.1 | | | DAM4SAM | 2024-11-26 |
SPMTrack: Spatio-Temporal Parameter-Efficient Fine-Tuning with Mixture of Experts for Scalable Visual Tracking | ✓ Link | 81.0 | 89.2 | 82.3 | SPMTrack-G | 2025-03-24 |
Integrating Boxes and Masks: A Multi-Object Framework for Unified Visual Tracking and Segmentation | ✓ Link | 80.4 | 89.8 | 75.8 | MITS | 2023-08-25 |
SPMTrack: Spatio-Temporal Parameter-Efficient Fine-Tuning with Mixture of Experts for Scalable Visual Tracking | ✓ Link | 80.0 | 89.4 | 79.9 | SPMTrack-L | 2025-03-24 |
Exploring Enhanced Contextual Information for Video-Level Object Tracking | ✓ Link | 80.0 | 88.5 | 80.2 | MCITrack-L384 | 2024-12-15 |
ARTrackV2: Prompting Autoregressive Tracker Where to Look and How to Describe | ✓ Link | 79.5 | 87.8 | 79.6 | ARTrackV2-L | 2023-12-28 |
Tracking Meets LoRA: Faster Training, Larger Model, Stronger Performance | ✓ Link | 78.9 | 87.8 | 80.7 | LoRAT-g-378 | 2024-03-08 |
Autoregressive Visual Tracking | ✓ Link | 78.5 | 87.4 | 77.8 | ARTrack-L | 2023-01-01 |
ODTrack: Online Dense Temporal Token Learning for Visual Tracking | ✓ Link | 78.2 | | | ODTrack-L | 2024-01-03 |
Exploring Enhanced Contextual Information for Video-Level Object Tracking | ✓ Link | 77.9 | 88.2 | 76.8 | MCITrack-B224 | 2024-12-15 |
RTracker: Recoverable Tracking via PN Tree Structured Memory | ✓ Link | 77.9 | 87.0 | 76.9 | RTracker-L | 2024-03-28 |
Tracking Meets LoRA: Faster Training, Larger Model, Stronger Performance | ✓ Link | 77.5 | 86.2 | 78.1 | LoRAT-L-378 | 2024-03-08 |
HIPTrack: Visual Tracking with Historical Prompts | ✓ Link | 77.4 | 88.0 | 74.5 | HIPTrack | 2023-11-03 |
ODTrack: Online Dense Temporal Token Learning for Visual Tracking | ✓ Link | 77.0 | | | ODTrack-B | 2024-01-03 |
Target-Aware Tracking with Long-term Context Attention | ✓ Link | 76.6 | 85.7 | 73.4 | TATrack-L-GOT | 2023-02-27 |
SPMTrack: Spatio-Temporal Parameter-Efficient Fine-Tuning with Mixture of Experts for Scalable Visual Tracking | ✓ Link | 76.5 | 85.9 | 76.3 | SPMTrack-B | 2025-03-24 |
DropMAE: Masked Autoencoders with Spatial-Attention Dropout for Tracking Tasks | ✓ Link | 75.9 | 86.8 | 72.0 | DropMAE | 2023-04-02 |
NeighborTrack: Improving Single Object Tracking by Bipartite Matching with Neighbor Tracklets | ✓ Link | 75.7 | 85.72 | 73.3 | NeighborTrack-OSTrack | 2022-11-12 |
MixFormer: End-to-End Tracking with Iterative Mixed Attention | ✓ Link | 75.7 | 85.3 | 75.1 | MixViT-L(ConvMAE) | 2023-02-06 |
MixFormer: End-to-End Tracking with Iterative Mixed Attention | ✓ Link | 75.6 | 85.73 | 72.8 | MixFormer-L | 2022-03-21 |
Unified Sequence-to-Sequence Learning for Single- and Multi-Modal Visual Object Tracking | ✓ Link | 74.8 | 81.9 | 72.2 | SeqTrack-L384 | 2023-04-27 |
Joint Feature Learning and Relation Modeling for Tracking: A One-Stream Framework | ✓ Link | 73.7 | 83.2 | 70.8 | OSTrack-384 | 2022-03-22 |
Revealing the Dark Secrets of Masked Image Modeling | ✓ Link | 72.9 | | | SwinV2-L 1K-MIM | 2022-05-26 |
MixFormer: End-to-End Tracking with Iterative Mixed Attention | ✓ Link | 71.2 | 79.9 | 65.8 | MixFormer-1k | 2022-03-21 |
Revealing the Dark Secrets of Masked Image Modeling | ✓ Link | 70.8 | | | SwinV2-B 1K-MIM | 2022-05-26 |
MixFormer: End-to-End Tracking with Iterative Mixed Attention | ✓ Link | 70.7 | 80.0 | 67.8 | MixFormer | 2022-03-21 |
AiATrack: Attention in Attention for Transformer Visual Tracking | ✓ Link | 69.6 | 80.0 | 63.2 | AiATrack | 2022-07-20 |
SwinTrack: A Simple and Strong Baseline for Transformer Tracking | ✓ Link | 69.4 | 78.0 | 64.3 | SwinTrack-B | 2021-12-02 |
Learning Spatio-Temporal Transformer for Visual Tracking | ✓ Link | 68.8 | 78.1 | | STARK | 2021-03-31 |
Towards Sequence-Level Training for Visual Tracking | ✓ Link | 67.5 | 76.8 | 60.3 | SLT-TransT | 2022-08-11 |
Target Transformed Regression for Accurate Tracking | ✓ Link | 66.8 | 77.8 | 57.2 | TREG | 2021-04-01 |
Siam R-CNN: Visual Tracking by Re-Detection | ✓ Link | 64.9 | 72.8 | | Siam R-CNN | 2019-11-28 |
FEAR: Fast, Efficient, Accurate and Robust Visual Tracker | ✓ Link | 64.5 | | | FEAR-L | 2021-12-15 |
STMTrack: Template-free Visual Tracking with Space-time Memory Networks | ✓ Link | 64.2 | 73.7 | 57.5 | STMTrack | 2021-04-01 |
FEAR: Fast, Efficient, Accurate and Robust Visual Tracker | ✓ Link | 62.3 | | | FEAR-M | 2021-12-15 |
FEAR: Fast, Efficient, Accurate and Robust Visual Tracker | ✓ Link | 61.9 | | | FEAR-XS | 2021-12-15 |
Tracking-by-Trackers with a Distilled and Reinforced Model | ✓ Link | 61.7 | 72.9 | | TRASFUST | 2020-07-08 |
Ocean: Object-aware Anchor-free Tracking | ✓ Link | 61.1 | 72.1 | | Ocean | 2020-06-18 |
Learning Discriminative Model Prediction for Tracking | ✓ Link | 61.1 | 71.7 | | DiMP | 2019-04-15 |
ATOM: Accurate Tracking by Overlap Maximization | ✓ Link | 61.0 | 74.2 | | ATOM | 2018-11-19 |
SiamFC++: Towards Robust and Accurate Visual Tracking with Target Estimation Guidelines | ✓ Link | 61.0 | 74.2 | | SiamFC++ | 2019-11-14 |