| | | 77.3 | 62.0 | FlexTrack | |
| SUTrack: Towards Simple and Unified Single Object Tracking | ✓ Link | 76.9 | 61.9 | SUTrack-L384 | 2024-12-26 |
| Unified Sequence-to-Sequence Learning for Single- and Multi-Modal Visual Object Tracking | ✓ Link | 76.7 | 61.0 | SeqTrackv2-L384 | 2023-04-27 |
| | | 76.2 | 60.7 | PromptTrack | |
| Exploiting Multimodal Spatial-temporal Patterns for Video Object Tracking | ✓ Link | 76.0 | 60.3 | STTrack | 2024-12-20 |
| Breaking Shallow Limits: Task-Driven Pixel Fusion for Gap-free RGBT Tracking | | 75.1 | 59.5 | TPF | 2025-03-14 |
| RGBT Tracking via All-layer Multimodal Interactions with Progressive Fusion Mamba | | 74.2 | 59.1 | AINet-B384 | 2024-08-16 |
| Adaptive Perception for Unified Visual Multi-modal Object Tracking | | 74.1 | 58.9 | APTrack | 2025-02-10 |
| Unified Sequence-to-Sequence Learning for Single- and Multi-Modal Visual Object Tracking | ✓ Link | 74.1 | 58.8 | SeqTrackv2-L256 | 2023-04-27 |
| Cross Fusion RGB-T Tracking with Bi-directional Adapter | | 73.2 | 58.4 | CFBT | 2024-08-30 |
| Breaking Modality Gap in RGBT Tracking: Coupled Knowledge Distillation | ✓ Link | 73.2 | 58.1 | CKD | 2024-10-15 |
| | | 73.0 | 58.8 | MST | |
| MambaVT: Spatio-Temporal Contextual Modeling for robust RGB-T Tracking | ✓ Link | 73.0 | 57.9 | MambaVT-S256 | 2024-08-15 |
| MambaVT: Spatio-Temporal Contextual Modeling for robust RGB-T Tracking | ✓ Link | 72.7 | 57.5 | MambaVT-M256 | 2024-08-15 |
| Revisiting RGBT Tracking Benchmarks from the Perspective of Modality Validity: A New Benchmark, Problem, and Method | ✓ Link | 72.1 | 57.8 | MoETrack | 2024-04-30 |
| RGB-T Tracking via Multi-Modal Mutual Prompt Learning | ✓ Link | 72.0 | 57.1 | MPLT | 2023-08-31 |
| Transformer-based RGB-T Tracking with Channel and Spatial Feature Fusion | ✓ Link | 71.5 | 57.2 | CSTNet | 2024-05-06 |
| Unified Sequence-to-Sequence Learning for Single- and Multi-Modal Visual Object Tracking | ✓ Link | 71.5 | 56.2 | SeqTrackv2-B384 | 2023-04-27 |
| From Two-Stream to One-Stream: Efficient RGB-T Tracking via Mutual Prompt Learning and Knowledge Distillation | | 71.4 | 56.7 | MMMP | 2024-03-25 |
| Generative-based Fusion Mechanism for Multi-Modal Tracking | ✓ Link | 70.7 | 56.6 | GMMT | 2023-09-04 |
| Unified Sequence-to-Sequence Learning for Single- and Multi-Modal Visual Object Tracking | ✓ Link | 70.4 | 55.8 | SeqTrackv2-B256 | 2023-04-27 |
| AFter: Attention-based Fusion Router for RGBT Tracking | ✓ Link | 70.3 | 55.1 | AFter | 2024-05-04 |
| Bridging Search Region Interaction With Template for RGB-T Tracking | ✓ Link | 70.2 | 56.5 | TBSI | 2023-01-01 |
| Bi-directional Adapter for Multi-modal Tracking | ✓ Link | 70.2 | 56.3 | BAT | 2023-12-17 |
| Temporal Adaptive RGBT Tracking with Modality Prompt | | 70.2 | 56.1 | TATrack | 2024-01-02 |
| Cross-modulated Attention Transformer for RGBT Tracking | | 70.0 | 55.6 | CAFormer | 2024-08-05 |
| Transformer RGBT Tracking with Spatio-Temporal Multimodal Tokens | | 67.4 | 53.7 | STMT | 2024-01-03 |
| Middle Fusion and Multi-Stage, Multi-Form Prompts for Robust RGB-T Tracking | | 67.3 | 54.2 | M3PT | 2024-03-27 |
| OneTracker: Unifying Visual Object Tracking with Foundation Models and Efficient Tuning | | 67.2 | 53.8 | OneTracker | 2024-03-14 |
| Single-Model and Any-Modality for Video Object Tracking | ✓ Link | 66.7 | 53.6 | Un-Track | 2023-11-27 |
| SDSTrack: Self-Distillation Symmetric Adapter Learning for Multi-Modal Visual Object Tracking | ✓ Link | 66.5 | 53.1 | SDSTrack | 2024-03-24 |
| Visual Prompt Multi-Modal Tracking | ✓ Link | 65.1 | 52.5 | ViPT | 2023-03-20 |
| Efficient RGB-T Tracking via Cross-Modality Distillation | | 59.0 | 46.6 | CMD | 2023-01-01 |
| Prompting for Multi-Modal Tracking | | 50.9 | 42.1 | ProTrack | 2022-07-29 |
| Attribute-Based Progressive Fusion Network for RGBT Tracking | ✓ Link | 50.0 | 36.2 | APFNet | 2022-01-26 |
| Duality-Gated Mutual Condition Network for RGBT Tracking | | 49.0 | 35.5 | DMCNet | 2020-11-14 |
| RGBT Tracking via Multi-Adapter Network with Hierarchical Divergence Loss | | 46.7 | 31.4 | MANet++ | 2020-11-14 |
| Challenge-Aware RGBT Tracking | | 45.0 | 31.4 | CAT | 2020-07-26 |
| Multi-Modal Fusion for End-to-End RGB-T Tracking | ✓ Link | 44.7 | 34.3 | mfDiMP | 2019-08-30 |