OpenCodePapers

Referring Expression Segmentation on RefCOCO
[Figure: Results over time]
Leaderboard
Metric columns, in order: Overall IoU, mIoU, Precision@0.5, Precision@0.6, Precision@0.7, Precision@0.8, Precision@0.9, Mean IoU. Each row lists only the values reported for that model, in that order. A ✓ in the Code column means a code link is available.

| Paper | Code | Reported metric values | Model name | Release date |
| --- | --- | --- | --- | --- |
| DeRIS: Decoupling Perception and Cognition for Enhanced Referring Image Segmentation through Loopback Synergy | ✓ | 85.41, 85.72 | DeRIS-L | 2025-07-02 |
| HyperSeg: Towards Universal Visual Segmentation with Large Language Model | ✓ | 84.8 | HyperSeg | 2024-11-26 |
| PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model | ✓ | 83.6 | PSALM | 2024-03-21 |
| Multi-label Cluster Discrimination for Visual Representation Learning | ✓ | 83.6 | MLCD-Seg-7B | 2024-07-24 |
| Hierarchical Open-vocabulary Universal Image Segmentation | ✓ | 82.8 | HIPIE | 2023-07-03 |
| EVF-SAM: Early Vision-Language Fusion for Text-Prompted Segment Anything Model | ✓ | 82.4 | EVF-SAM | 2024-06-28 |
| Universal Instance Perception as Object Discovery and Retrieval | ✓ | 82.19 | UNINEXT-H | 2023-03-12 |
| Universal Segmentation at Arbitrary Granularity with Language Instruction | ✓ | 81.74 | UniLSeg-100 | 2023-12-04 |
| Densely Connected Parameter-Efficient Tuning for Referring Image Segmentation | ✓ | 81.0 | DETRIS | 2025-01-15 |
| Multi-task Visual Grounding with Coarse-to-Fine Consistency Constraints | ✓ | 80.89 | C3VG | 2025-01-12 |
| General Object Foundation Model for Images and Videos at Scale | ✓ | 80.0 | GLEE-Pro | 2023-12-14 |
| SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories | ✓ | 79.7 | SegAgent | 2025-03-11 |
| MaskRIS: Semantic Distortion-aware Data Augmentation for Referring Image Segmentation | ✓ | 78.71 | MaskRIS (Swin-B, combined DB) | 2024-11-28 |
| GROUNDHOG: Grounding Large Language Models to Holistic Segmentation | | 78.5 | GROUNDHOG | 2024-02-26 |
| SafaRi: Adaptive Sequence Transformer for Weakly Supervised Referring Expression Segmentation | | 77.21 | SafaRi-B | 2024-07-02 |
| MaskRIS: Semantic Distortion-aware Data Augmentation for Referring Image Segmentation | ✓ | 76.49, 78.35 | MaskRIS (Swin-B) | 2024-11-28 |
| PolyFormer: Referring Image Segmentation as Sequential Polygon Generation | ✓ | 75.96, 76.94 | PolyFormer-L | 2023-02-14 |
| Mask Grounding for Referring Image Segmentation | ✓ | 75.24 | MagNet | 2023-12-19 |
| PolyFormer: Referring Image Segmentation as Sequential Polygon Generation | ✓ | 74.82 | PolyFormer-B | 2023-02-14 |
| GRES: Generalized Referring Expression Segmentation | ✓ | 73.82 | ReLA | 2023-06-01 |
| Unleashing Text-to-Image Diffusion Models for Visual Perception | ✓ | 73.25 | VPD | 2023-03-03 |
| VLT: Vision-Language Transformer and Query Generation for Referring Segmentation | ✓ | 72.96 | VLT | 2022-10-28 |
| Bridging Vision and Language Encoders: Parameter-Efficient Tuning for Referring Image Segmentation | ✓ | 71.06 | ETRIS | 2023-07-21 |
| Referring Transformer: A One-step Approach to Multi-task Visual Grounding | ✓ | 70.56 | RefTR | 2021-06-06 |
| CRIS: CLIP-Driven Referring Image Segmentation | ✓ | 70.47 | CRIS | 2021-11-30 |
| MaIL: A Unified Mask-Image-Language Trimodal Network for Referring Image Segmentation | | 70.13 | MaIL | 2021-11-21 |
| Vision-Language Transformer and Query Generation for Referring Segmentation | ✓ | 65.65 | VLT | 2021-08-12 |
| Comprehensive Multi-Modal Interactions for Referring Image Segmentation | ✓ | 65.32, 75.18, 69.36, 61.21, 46.16, 16.23 | SHNet | 2021-04-21 |
| Referring Image Segmentation via Cross-Modal Progressive Comprehension | ✓ | 61.36 | CMPC | 2020-10-01 |
| Bi-Directional Relationship Inferring Network for Referring Image Segmentation | | 61.35 | BRINet | 2020-06-01 |
| RefVOS: A Closer Look at Referring Expressions for Video Object Segmentation | ✓ | 59.45 | RefVOS with BERT + MLM loss | 2020-10-01 |
| Referring Expression Object Segmentation with Caption-Aware Consistency | ✓ | 58.90 | LANG2SEG | 2019-10-10 |
| RefVOS: A Closer Look at Referring Expressions for Video Object Segmentation | ✓ | 58.65 | RefVOS with BERT Pre-train | 2020-10-01 |
| Cross-Modal Self-Attention Network for Referring Image Segmentation | ✓ | 58.32 | CMSA | 2019-04-09 |
| See-Through-Text Grouping for Referring Image Segmentation | | 56.58 | STEP (1-fold) | 2019-10-01 |
| MAttNet: Modular Attention Network for Referring Expression Comprehension | ✓ | 56.51 | MAttNet | 2018-01-24 |
| Vision-Aware Text Features in Referring Image Segmentation: From Object Understanding to Context Understanding | ✓ | 78.16 | VATEX | 2024-04-12 |
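
The leaderboard does not state how each metric is computed. As a rough sketch, assuming the conventions commonly used in referring image segmentation papers (individual papers may differ), Overall IoU divides the total intersection by the total union accumulated over all test samples, mean IoU averages the per-sample IoU, and Precision@X is the share of samples whose IoU exceeds X. The function names, the nested-list mask format, and the strict `>` comparison below are illustrative assumptions, not taken from any listed paper.

```python
from typing import Dict, Sequence, Tuple

Mask = Sequence[Sequence[int]]  # hypothetical format: binary mask as rows of 0/1


def intersection_and_union(pred: Mask, target: Mask) -> Tuple[int, int]:
    """Count intersection and union pixels of two same-sized binary masks."""
    inter = union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    return inter, union


def segmentation_metrics(
    preds: Sequence[Mask],
    targets: Sequence[Mask],
    thresholds: Sequence[float] = (0.5, 0.6, 0.7, 0.8, 0.9),
) -> Dict[str, float]:
    """Return Overall IoU, mean IoU and Precision@X, all as percentages."""
    if not preds or len(preds) != len(targets):
        raise ValueError("preds and targets must be non-empty and equally long")

    total_inter = total_union = 0
    per_sample_iou = []
    for pred, target in zip(preds, targets):
        inter, union = intersection_and_union(pred, target)
        total_inter += inter
        total_union += union
        # Assumption: empty prediction on an empty target counts as IoU = 1.
        per_sample_iou.append(inter / union if union else 1.0)

    n = len(per_sample_iou)
    metrics = {
        # Cumulative intersection over cumulative union (favors large objects).
        "overall_iou": 100.0 * total_inter / total_union if total_union else 100.0,
        # Every sample contributes equally, regardless of object size.
        "mean_iou": 100.0 * sum(per_sample_iou) / n,
    }
    for t in thresholds:
        # Assumption: strict comparison; some codebases use >= instead.
        metrics[f"precision@{t}"] = 100.0 * sum(iou > t for iou in per_sample_iou) / n
    return metrics
```

Because Overall IoU weights samples by area while mean IoU weights them equally, the two can differ noticeably for the same model, which is one reason values from different columns are not directly comparable.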