OpenCodePapers
Referring Video Object Segmentation on MeViS
Leaderboard
| Paper | Code | J&F | J | F | Model name | Release date |
|---|---|---|---|---|---|---|
| MPG-SAM 2: Adapting SAM 2 with Mask Priors and Global Context for Referring Video Object Segmentation | ✓ | 53.7 | 50.7 | 56.7 | MPG-SAM 2 | 2025-01-23 |
| Find First, Track Next: Decoupling Identification and Propagation in Referring Video Object Segmentation | ✓ | 53.2 | 50.5 | 55.9 | FindTrack | 2025-03-05 |
| GLUS: Global-Local Reasoning Unified into A Single Large Language Model for Video Segmentation | ✓ | 51.3 | 48.5 | 54.2 | GLUS | 2025-04-10 |
| The Devil is in Temporal Token: High Quality Video Reasoning Segmentation | ✓ | 50.9 | 48 | 53.7 | VRS-HQ (Chat-UniVi-13B) | 2025-01-15 |
| ReferDINO: Referring Video Object Segmentation with Visual Grounding Foundations | | 49.3 | 44.7 | 53.9 | ReferDINO (Swin-B) | 2025-01-24 |
| SAMWISE: Infusing Wisdom in SAM2 for Text-Driven Video Segmentation | ✓ | 48.3 | 45.4 | 51.2 | SAMWISE | 2024-11-26 |
| Multi-Context Temporal Consistent Modeling for Referring Video Object Segmentation | ✓ | 47.6 | 44.1 | 51.1 | DsHmp + MTCM | 2025-01-09 |
| Decoupling Static and Hierarchical Motion Perception for Referring Video Segmentation | ✓ | 46.4 | 43 | 49.8 | DsHmp | 2024-04-04 |
| Temporally Consistent Referring Video Object Segmentation with Hybrid Memory | ✓ | 42.7 | 39.9 | 45.5 | HTR | 2024-03-28 |
| MeViS: A Large-scale Benchmark for Video Segmentation with Motion Expressions | ✓ | 37.2 | 34.2 | 40.2 | LMPM | 2023-08-16 |
| VLT: Vision-Language Transformer and Query Generation for Referring Segmentation | ✓ | 35.5 | 33.6 | 37.3 | VLT+TC | 2022-10-28 |
| InternVideo2.5: Empowering Video MLLMs with Long and Rich Context Modeling | ✓ | 32 | | | InternVideo2.5 | 2025-01-21 |
| Language as Queries for Referring Video Object Segmentation | ✓ | 31.0 | 29.8 | 32.2 | ReferFormer | 2022-01-03 |
| End-to-End Referring Video Object Segmentation with Multimodal Transformers | ✓ | 30.0 | 28.8 | 31.2 | MTTR | 2021-11-29 |
| Language-Bridged Spatial-Temporal Interaction for Referring Video Object Segmentation | ✓ | 29.3 | 27.8 | 30.8 | LBDT | 2022-06-08 |
| URVOS: Unified Referring Video Object Segmentation Network with a Large-Scale Benchmark | ✓ | 27.8 | 25.7 | 29.9 | URVOS | |
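
The J&F score is conventionally the mean of region similarity (J) and contour accuracy (F), so each leaderboard row can be sanity-checked against its own J and F values. The sketch below does this for a few rows copied from the leaderboard. The helper names `j_and_f` and `is_consistent` are hypothetical, and the 0.06 tolerance is an assumption chosen to absorb rounding in scores published to one decimal place.

```python
# Rows copied from the leaderboard: (model, reported J&F, J, F).
ROWS = [
    ("MPG-SAM 2", 53.7, 50.7, 56.7),
    ("FindTrack", 53.2, 50.5, 55.9),
    ("GLUS", 51.3, 48.5, 54.2),
    ("URVOS", 27.8, 25.7, 29.9),
]


def j_and_f(j: float, f: float) -> float:
    """Return the mean of region similarity (J) and contour accuracy (F)."""
    return (j + f) / 2


def is_consistent(reported: float, j: float, f: float, tol: float = 0.06) -> bool:
    """Check a reported J&F against the mean of J and F.

    The tolerance is an assumption: published scores are rounded to one
    decimal place, so the recomputed mean can differ by about 0.05.
    """
    return abs(j_and_f(j, f) - reported) <= tol


if __name__ == "__main__":
    for model, reported, j, f in ROWS:
        computed = j_and_f(j, f)
        status = "ok" if is_consistent(reported, j, f) else "mismatch"
        print(f"{model}: reported {reported}, computed {computed:.2f} ({status})")
```

A row that reports only one score, such as InternVideo2.5, cannot be checked this way and would need to be skipped.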