OpenCodePapers
Open Vocabulary Semantic Segmentation
Leaderboard
| Paper | Code | mIoU | Model name | Release date |
| --- | --- | --- | --- | --- |
| Mask-Adapter: The Devil is in the Masks for Open-Vocabulary Segmentation | ✓ | 38.2 | Mask-Adapter | 2024-12-05 |
| MaskCLIP++: A Mask-Based CLIP Fine-tuning Framework for Open-Vocabulary Image Segmentation | ✓ | 38.2 | MaskCLIP++ | 2024-12-16 |
| UMG-CLIP: A Unified Multi-Granularity Vision Generalist for Open-World Understanding | ✓ | 38.2 | UMG-CLIP-E/14 | 2024-01-12 |
| CAT-Seg: Cost Aggregation for Open-Vocabulary Semantic Segmentation | ✓ | 37.9 | CAT-Seg | 2023-03-21 |
| SILC: Improving Vision Language Pretraining with Self-Distillation | | 37.7 | SILC | 2023-10-20 |
| Collaborative Vision-Text Representation Optimizing for Open-Vocabulary Segmentation | ✓ | 36.1 | MAFT+ | 2024-08-01 |
| UMG-CLIP: A Unified Multi-Granularity Vision Generalist for Open-World Understanding | ✓ | 36.1 | UMG-CLIP-L/14 | 2024-01-12 |
| OpenDAS: Open-Vocabulary Domain Adaptation for 2D and 3D Segmentation | | 35.8 | OVSeg + OpenDAS | 2024-05-30 |
| SED: A Simple Encoder-Decoder for Open-Vocabulary Semantic Segmentation | ✓ | 35.2 | SED | 2023-11-27 |
| CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction | ✓ | 34.5 | CLIPSelf | 2023-10-02 |
| Convolutions Die Hard: Open-Vocabulary Segmentation with Single Frozen Convolutional CLIP | ✓ | 34.1 | FC-CLIP | 2023-08-04 |
| Open-Vocabulary Segmentation with Semantic-Assisted Calibration | ✓ | 33.5 | SCAN | 2023-12-07 |
| Open-Vocabulary Semantic Segmentation with Image Embedding Balancing | ✓ | 32.8 | EBSeg-L | 2024-06-14 |
| Learning Mask-aware CLIP Representations for Zero-Shot Segmentation | ✓ | 32.0 | MAFT-ViTL | 2023-09-30 |
| Open Vocabulary Semantic Segmentation with Patch Aligned Contrastive Learning | ✓ | 31.4 | PACL | 2022-12-09 |
| Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models | ✓ | 29.9 | ODISE | 2023-03-08 |
| Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP | ✓ | 29.6 | OVSeg Swin-B | 2022-10-09 |
| Open-Vocabulary Universal Image Segmentation with MaskCLIP | ✓ | 23.7 | MaskCLIP | 2022-08-18 |
| | | 20.7 | POMP | |
| A Simple Baseline for Open-Vocabulary Semantic Segmentation with Pre-trained Vision-language Model | ✓ | 20.5 | SimSeg | 2021-12-29 |
| TTD: Text-Tag Self-Distillation Enhancing Image-Text Alignment in CLIP to Alleviate Single Tag Bias | ✓ | 17.0 | TTD (TCL) | 2024-03-30 |
| In Defense of Lazy Visual Grounding for Open-Vocabulary Semantic Segmentation | ✓ | 15.8 | LaVG | 2024-08-09 |
| TTD: Text-Tag Self-Distillation Enhancing Image-Text Alignment in CLIP to Alleviate Single Tag Bias | ✓ | 12.7 | TTD (MaskCLIP) | 2024-03-30 |