OpenCodePapers

open-vocabulary-semantic-segmentation-on-3

Open Vocabulary Semantic Segmentation
Leaderboard
| Paper | Code | mIoU | Model Name | Release Date |
|---|---|---|---|---|
| UMG-CLIP: A Unified Multi-Granularity Vision Generalist for Open-World Understanding | ✓ | 17.3 | UMG-CLIP-E/14 | 2024-01-12 |
| MaskCLIP++: A Mask-Based CLIP Fine-tuning Framework for Open-Vocabulary Image Segmentation | ✓ | 16.8 | MaskCLIP++ | 2024-12-16 |
| Mask-Adapter: The Devil is in the Masks for Open-Vocabulary Segmentation | ✓ | 16.2 | Mask-Adapter | 2024-12-05 |
| CAT-Seg: Cost Aggregation for Open-Vocabulary Semantic Segmentation | ✓ | 16.0 | CAT-Seg | 2023-03-21 |
| UMG-CLIP: A Unified Multi-Granularity Vision Generalist for Open-World Understanding | ✓ | 15.4 | UMG-CLIP-L/14 | 2024-01-12 |
| Collaborative Vision-Text Representation Optimizing for Open-Vocabulary Segmentation | ✓ | 15.1 | MAFT+ | 2024-08-01 |
| SILC: Improving Vision Language Pretraining with Self-Distillation | | 15.0 | SILC | 2023-10-20 |
| PosSAM: Panoptic Open-vocabulary Segment Anything | ✓ | 14.9 | PosSAM | 2024-03-14 |
| Convolutions Die Hard: Open-Vocabulary Segmentation with Single Frozen Convolutional CLIP | ✓ | 14.8 | FC-CLIP | 2023-08-04 |
| Open-Vocabulary Segmentation with Semantic-Assisted Calibration | ✓ | 14.0 | SCAN | 2023-12-07 |
| SED: A Simple Encoder-Decoder for Open-Vocabulary Semantic Segmentation | ✓ | 13.9 | SED | 2023-11-27 |
| Side Adapter Network for Open-Vocabulary Semantic Segmentation | ✓ | 13.7 | SAN | 2023-02-23 |
| Open-Vocabulary Semantic Segmentation with Image Embedding Balancing | ✓ | 13.7 | EBSeg-L | 2024-06-14 |
| CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction | ✓ | 12.4 | CLIPSelf | 2023-10-02 |
| Learning Mask-aware CLIP Representations for Zero-Shot Segmentation | ✓ | 12.1 | MAFT-ViTL | 2023-09-30 |
| Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models | ✓ | 11.1 | ODISE | 2023-03-08 |
| Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP | ✓ | 9 | OVSeg Swin-B | 2022-10-09 |
| Open-Vocabulary Universal Image Segmentation with MaskCLIP | ✓ | 8.2 | MaskCLIP | 2022-08-18 |
| A Simple Baseline for Open-Vocabulary Semantic Segmentation with Pre-trained Vision-language Model | ✓ | 7 | SimSeg | 2021-12-29 |