OpenCodePapers

referring-expression-segmentation-on-refer-1

Referring Expression Segmentation
[Chart: Results over time]
Leaderboard
| Paper | Code | J&F | J | F | Model Name | Release Date |
|---|---|---|---|---|---|---|
| MPG-SAM 2: Adapting SAM 2 with Mask Priors and Global Context for Referring Video Object Segmentation | ✓ Link | 73.9 | 71.7 | 76.1 | MPG-SAM 2 | 2025-01-23 |
| The Devil is in Temporal Token: High Quality Video Reasoning Segmentation | ✓ Link | 71 | 69 | 73.1 | VRS-HQ (Chat-UniVi-13B) | 2025-01-15 |
| General Object Foundation Model for Images and Videos at Scale | ✓ Link | 70.6 | 68.2 | 72.9 | GLEE-Pro | 2023-12-14 |
| Universal Instance Perception as Object Discovery and Retrieval | ✓ Link | 70.1 | 67.6 | 72.7 | UNINEXT-H | 2023-03-12 |
| ReferDINO: Referring Video Object Segmentation with Visual Grounding Foundations | | 69.3 | 67.0 | 71.5 | ReferDINO (Swin-B) | 2025-01-24 |
| Referred by Multi-Modality: A Unified Temporal Transformer for Video Object Segmentation | ✓ Link | 68.4 | 66.4 | 70.4 | MUTR | 2023-05-25 |
| Harnessing Vision-Language Pretrained Models with Temporal-Aware Adaptation for Referring Video Object Segmentation | | 67.6 | 65.3 | 69.8 | VLP (VLMo-L) | 2024-05-17 |
| Segment Every Reference Object in Spatial and Temporal Spaces | | 67.4 | 65.5 | 69.2 | UniRef-L (Swin-L) | 2023-01-01 |
| SOC: Semantic-Assisted Object Cluster for Referring Video Object Segmentation | ✓ Link | 67.3±0.5 | 65.3 | 69.3 | SOC (Joint training, Video-Swin-B) | 2023-05-26 |
| Temporally Consistent Referring Video Object Segmentation with Hybrid Memory | ✓ Link | 67.1 | 65.3 | 68.9 | HTR (Pre-training) | 2024-03-28 |
| Decoupling Static and Hierarchical Motion Perception for Referring Video Segmentation | ✓ Link | 67.1 | 65 | 69.1 | DsHmp (Video-Swin-Base) | 2024-04-04 |
| UniRef++: Segment Every Reference Object in Spatial and Temporal Spaces | ✓ Link | 66.9 | 64.8 | 69.0 | UniRef++-L | 2023-12-25 |
| ViLLa: Video Reasoning Segmentation with Large Language Model | ✓ Link | 66.5 | 64.6 | 68.6 | ViLLa | 2024-07-18 |
| Tracking Anything with Decoupled Video Segmentation | ✓ Link | 66.0 | | | DEVA (ReferFormer) | 2023-09-07 |
| Spectrum-guided Multi-granularity Referring Video Object Segmentation | ✓ Link | 65.7 | 63.9 | 67.4 | SgMg (Pre-training) | 2023-07-25 |
| GroPrompt: Efficient Grounded Prompting and Adaptation for Referring Video Object Segmentation | | 65.5 | 64.1 | 66.9 | GroPrompt | 2024-06-18 |
| Expression Prompt Collaboration Transformer for Universal Referring Video Object Segmentation | | 65 | 62.9 | 67.2 | EPCFormer (ViT-H) | 2023-08-08 |
| Universal Segmentation at Arbitrary Granularity with Language Instruction | ✓ Link | 64.9 | 62.8 | 67.0 | UniLSeg-100 | 2023-12-04 |
| LoSh: Long-Short Text Joint Prediction Network for Referring Video Object Segmentation | ✓ Link | 64.2 | 62.5 | 66.0 | LoSh-R | 2023-06-14 |
| VLT: Vision-Language Transformer and Query Generation for Referring Segmentation | ✓ Link | 63.8 | 61.9 | 65.6 | VLT | 2022-10-28 |
| OnlineRefer: A Simple Online Baseline for Referring Video Object Segmentation | ✓ Link | 63.5 | 61.6 | 65.5 | OnlineRefer (Swin-L, online) | 2023-07-18 |
| Towards Robust Referring Video Object Segmentation with Cyclic Relational Consensus | ✓ Link | 61.3 | 59.6 | 63.1 | R2VOS (Video-Swin-T) | 2022-07-04 |
| SOC: Semantic-Assisted Object Cluster for Referring Video Object Segmentation | ✓ Link | 59.2 | 57.8 | 60.5 | SOC (Video-Swin-T) | 2023-05-26 |
| UniVS: Unified and Universal Video Segmentation with Prompts as Queries | ✓ Link | 58.0 | 56.8 | 59.5 | UniVS(Swin-L) | 2024-02-28 |
| Language as Queries for Referring Video Object Segmentation | ✓ Link | 57.3 | 56.1 | 58.4 | ReferFormer (ResNet-101) | 2022-01-03 |
| Multi-Attention Network for Compressed Video Referring Object Segmentation | ✓ Link | 55.63 | 54.75 | 56.51 | MANET | 2022-07-26 |
| Language as Queries for Referring Video Object Segmentation | ✓ Link | 55.6 | 54.8 | 56.6 | ReferFormer (ResNet-50) | 2022-01-03 |
| End-to-End Referring Video Object Segmentation with Multimodal Transformers | ✓ Link | 55.32 | 54.00 | 56.64 | MTTR (w=12) | 2021-11-29 |
| Local-Global Context Aware Transformer for Language-Guided Video Segmentation | ✓ Link | 50 | 48.8 | 51.1 | Locater | 2022-03-18 |
| Multi-Level Representation Learning With Semantic Alignment for Referring Video Object Segmentation | | 49.70 | 50.96 | 48.43 | MLRLSA | 2022-01-01 |
| Deeply Interleaved Two-Stream Encoder for Referring Video Segmentation | | 49.56 | 48.44 | 50.67 | VLIDE | 2022-03-30 |
| URVOS: Unified Referring Video Object Segmentation Network with a Large-Scale Benchmark | ✓ Link | 48.9 | 47.0 | 50.8 | URVOS | |
| InternVideo2.5: Empowering Video MLLMs with Long and Rich Context Modeling | ✓ Link | 34.2 | | | InternVideo2.5 | 2025-01-21 |
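
The page does not say how the three metric columns relate. In the usual video object segmentation protocol, J is region similarity (mask IoU), F is contour accuracy, and J&F is their arithmetic mean. Assuming that definition applies here, the sketch below checks a few rows copied from the leaderboard. The helper names `j_and_f` and `consistent` and the 0.1 rounding tolerance are illustrative choices, not part of any official evaluation toolkit.

```python
# Assumption: J&F is the arithmetic mean of J and F, as in the common
# video object segmentation protocol. The page itself does not state this.

# Rows copied from the leaderboard: (model, reported J&F, J, F)
rows = [
    ("MPG-SAM 2", 73.9, 71.7, 76.1),
    ("GLEE-Pro", 70.6, 68.2, 72.9),
    ("UNINEXT-H", 70.1, 67.6, 72.7),
    ("MTTR (w=12)", 55.32, 54.00, 56.64),
]


def j_and_f(j: float, f: float) -> float:
    """Mean of region similarity J and contour accuracy F (hypothetical helper)."""
    return (j + f) / 2


def consistent(reported: float, j: float, f: float, tol: float = 0.1) -> bool:
    """True if the reported J&F matches the mean of J and F within a rounding tolerance."""
    return abs(reported - j_and_f(j, f)) <= tol


for model, reported, j, f in rows:
    status = "ok" if consistent(reported, j, f) else "mismatch"
    print(f"{model}: reported={reported}, mean(J, F)={j_and_f(j, f):.2f} ({status})")
```

Rows that report only J&F (DEVA and InternVideo2.5) cannot be checked this way, since their J and F cells are empty.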