OpenCodePapers

Referring Video Object Segmentation on MeViS

Leaderboard
| Paper | Code | J&F | J | F | Model Name | Release Date |
|---|---|---|---|---|---|---|
| MPG-SAM 2: Adapting SAM 2 with Mask Priors and Global Context for Referring Video Object Segmentation | ✓ | 53.7 | 50.7 | 56.7 | MPG-SAM 2 | 2025-01-23 |
| Find First, Track Next: Decoupling Identification and Propagation in Referring Video Object Segmentation | ✓ | 53.2 | 50.5 | 55.9 | FindTrack | 2025-03-05 |
| GLUS: Global-Local Reasoning Unified into A Single Large Language Model for Video Segmentation | ✓ | 51.3 | 48.5 | 54.2 | GLUS | 2025-04-10 |
| The Devil is in Temporal Token: High Quality Video Reasoning Segmentation | ✓ | 50.9 | 48 | 53.7 | VRS-HQ (Chat-UniVi-13B) | 2025-01-15 |
| ReferDINO: Referring Video Object Segmentation with Visual Grounding Foundations | | 49.3 | 44.7 | 53.9 | ReferDINO (Swin-B) | 2025-01-24 |
| SAMWISE: Infusing Wisdom in SAM2 for Text-Driven Video Segmentation | ✓ | 48.3 | 45.4 | 51.2 | SAMWISE | 2024-11-26 |
| Multi-Context Temporal Consistent Modeling for Referring Video Object Segmentation | ✓ | 47.6 | 44.1 | 51.1 | DsHmp + MTCM | 2025-01-09 |
| Decoupling Static and Hierarchical Motion Perception for Referring Video Segmentation | ✓ | 46.4 | 43 | 49.8 | DsHmp | 2024-04-04 |
| Temporally Consistent Referring Video Object Segmentation with Hybrid Memory | ✓ | 42.7 | 39.9 | 45.5 | HTR | 2024-03-28 |
| MeViS: A Large-scale Benchmark for Video Segmentation with Motion Expressions | ✓ | 37.2 | 34.2 | 40.2 | LMPM | 2023-08-16 |
| VLT: Vision-Language Transformer and Query Generation for Referring Segmentation | ✓ | 35.5 | 33.6 | 37.3 | VLT+TC | 2022-10-28 |
| InternVideo2.5: Empowering Video MLLMs with Long and Rich Context Modeling | ✓ | 32 | | | InternVideo2.5 | 2025-01-21 |
| Language as Queries for Referring Video Object Segmentation | ✓ | 31.0 | 29.8 | 32.2 | ReferFormer | 2022-01-03 |
| End-to-End Referring Video Object Segmentation with Multimodal Transformers | ✓ | 30.0 | 28.8 | 31.2 | MTTR | 2021-11-29 |
| Language-Bridged Spatial-Temporal Interaction for Referring Video Object Segmentation | ✓ | 29.3 | 27.8 | 30.8 | LBDT | 2022-06-08 |
| URVOS: Unified Referring Video Object Segmentation Network with a Large-Scale Benchmark | ✓ | 27.8 | 25.7 | 29.9 | URVOS | |
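
The J&F column appears to be the arithmetic mean of J (region similarity) and F (contour accuracy), as is conventional for video object segmentation benchmarks. The Python sketch below checks that relationship on three rows copied from the leaderboard. The helper name `j_and_f` and the 0.1 tolerance are assumptions for illustration, not part of the benchmark's evaluation code.

```python
# Hypothetical helper: J&F as the mean of J and F (assumed convention).
def j_and_f(j: float, f: float) -> float:
    return (j + f) / 2


# (model, reported J&F, J, F) copied from the leaderboard.
rows = [
    ("MPG-SAM 2", 53.7, 50.7, 56.7),
    ("FindTrack", 53.2, 50.5, 55.9),
    ("URVOS", 27.8, 25.7, 29.9),
]

# Assumed tolerance covering the rounding of published one-decimal values.
TOLERANCE = 0.1

for name, reported, j, f in rows:
    computed = j_and_f(j, f)
    consistent = abs(computed - reported) <= TOLERANCE
    print(f"{name}: computed {computed:.2f}, reported {reported}, consistent={consistent}")
```

A row that reports only J&F, such as InternVideo2.5, cannot be checked this way because its J and F values are not listed.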