OpenCodePapers

referring-expression-segmentation-on-refcocog

Referring Expression Segmentation
Leaderboard
Paper | Code | Reported scores (among Overall IoU, Mean IoU, IoU, mIoU) | Model Name | Release Date
Multi-label Cluster Discrimination for Visual Representation Learning | ✓ | 79.9 | MLCD-Seg-7B | 2024-07-24
HyperSeg: Towards Universal Visual Segmentation with Large Language Model | ✓ | 79.4 | HyperSeg | 2024-11-26
Universal Segmentation at Arbitrary Granularity with Language Instruction | ✓ | 79.27 | UniLSeg-100 | 2023-12-04
Universal Segmentation at Arbitrary Granularity with Language Instruction | ✓ | 78.41 | UniLSeg-20 | 2023-12-04
EVF-SAM: Early Vision-Language Fusion for Text-Prompted Segment Anything Model | ✓ | 78.2 | EVF-SAM | 2024-06-28
SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories | ✓ | 75.11 | SegAgent | 2025-03-11
Densely Connected Parameter-Efficient Tuning for Referring Image Segmentation | ✓ | 74.6 | DETRIS | 2025-01-15
Multi-task Visual Grounding with Coarse-to-Fine Consistency Constraints | ✓ | 74.43 | C3VG | 2025-01-12
GROUNDHOG: Grounding Large Language Models to Holistic Segmentation | | 74.1 | GROUNDHOG | 2024-02-26
General Object Foundation Model for Images and Videos at Scale | ✓ | 72.9 | GLEE-Pro | 2023-12-14
SafaRi: Adaptive Sequence Transformer for Weakly Supervised Referring Expression Segmentation | | 70.48 | SafaRi-B | 2024-07-02
PolyFormer: Referring Image Segmentation as Sequential Polygon Generation | ✓ | 69.2, 71.15 | PolyFormer-L | 2023-02-14
MaskRIS: Semantic Distortion-aware Data Augmentation for Referring Image Segmentation | ✓ | 69.12 | MaskRIS (Swin-B, combined DB) | 2024-11-28
PolyFormer: Referring Image Segmentation as Sequential Polygon Generation | ✓ | 67.76, 69.36 | PolyFormer-B | 2023-02-14
MaskRIS: Semantic Distortion-aware Data Augmentation for Referring Image Segmentation | ✓ | 65.55, 69.31 | MaskRIS (Swin-B) | 2024-11-28
Mask Grounding for Referring Image Segmentation | ✓ | 65.36 | MagNet | 2023-12-19
Generalized Decoding for Pixel, Image, and Language | ✓ | 64.6 | X-Decoder (Davit-d5) | 2022-12-21
VLT: Vision-Language Transformer and Query Generation for Referring Segmentation | ✓ | 63.49 | VLT (Swin-B) | 2022-10-28
LAVT: Language-Aware Vision Transformer for Referring Image Segmentation | ✓ | 61.24 | LAVT | 2021-12-04
Vision-Language Transformer and Query Generation for Referring Segmentation | ✓ | 52.99 | VLT (Darknet53) | 2021-08-12
Comprehensive Multi-Modal Interactions for Referring Image Segmentation | ✓ | 49.90 | SHNet | 2021-04-21
DeRIS: Decoupling Perception and Cognition for Enhanced Referring Image Segmentation through Loopback Synergy | ✓ | 80.01 | DeRIS-L | 2025-07-02
Vision-Aware Text Features in Referring Image Segmentation: From Object Understanding to Context Understanding | ✓ | 0.7554, 69.73 | VATEX | 2024-04-12