Paper | Code | mIoU (%) | Model Name | Release Date |
---|---|---|---|---|
Convolutions Die Hard: Open-Vocabulary Segmentation with Single Frozen Convolutional CLIP | ✓ Link | 56.2 | FC-CLIP | 2023-08-04 |
A Simple Baseline for Open-Vocabulary Semantic Segmentation with Pre-trained Vision-language Model | ✓ Link | 34.5 | SimSeg | 2021-12-29 |
TTD: Text-Tag Self-Distillation Enhancing Image-Text Alignment in CLIP to Alleviate Single Tag Bias | ✓ Link | 32.0 | TTD (TCL) | 2024-03-30 |
A Closer Look at the Explainability of Contrastive Language-Image Pre-training | ✓ Link | 31.4 | CLIP Surgery (CLIP without any fine-tuning) | 2023-04-12 |
TTD: Text-Tag Self-Distillation Enhancing Image-Text Alignment in CLIP to Alleviate Single Tag Bias | ✓ Link | 27.0 | TTD (MaskCLIP) | 2024-03-30 |