OpenCodePapers

zero-shot-composed-image-retrieval-zs-cir-on-2

Composed Image Retrieval (CoIR) > Zero-Shot Composed Image Retrieval (ZS-CIR)
Leaderboard
| Paper | Code | (Recall@10 + Recall@50) / 2 | R@10 | R@50 | Model Name | Release Date |
|---|---|---|---|---|---|---|
| An Efficient Post-hoc Framework for Reducing Task Discrepancy of Text Encoders for Composed Image Retrieval | ✓ | 56.74 | | | RTD + LinCIR (CLIP G/14) | 2024-06-13 |
| Language-only Efficient Training of Zero-shot Composed Image Retrieval | ✓ | 55.40 | | | LinCIR (CLIP G/14) | 2023-12-04 |
| Semantic Editing Increment Benefits Zero-Shot Composed Image Retrieval | ✓ | 54.45 | | | SEIZE (CLIP G/14) | 2024-10-28 |
| CoLLM: A Large Language Model for Composed Image Retrieval | ✓ | 49.9 | 39.1 | 60.7 | CoLLM (finetuned - BLIP-L/16) | 2025-03-25 |
| SCOT: Self-Supervised Contrastive Pretraining For Zero-Shot Compositional Retrieval | | 49.24 | 38.45 | 60.03 | SCOT (WACV 2025) | 2025-01-12 |
| CoVR-2: Automatic Data Construction for Composed Video Retrieval | ✓ | 48.3 | 38.15 | 58.44 | CoVR-BLIP-2 | 2023-08-28 |
| MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions | ✓ | 48.1 | 38 | 58.2 | MagicLens (CoCa L) | 2024-03-28 |
| Reason-before-Retrieve: One-Stage Reflective Chain-of-Thoughts for Training-Free Zero-Shot Composed Image Retrieval | ✓ | 47.34 | | | OSrCIR (CLIP G/14) | 2024-12-15 |
| Training-free Zero-shot Composed Image Retrieval via Weighted Modality Fusion and Similarity | ✓ | 47.16 | | | WeiMoCIR (CLIP G/14) | 2024-09-07 |
| Pretrain like Your Inference: Masked Tuning Improves Zero-Shot Composed Image Retrieval | ✓ | 46.42 | | | MTCIR (CLIP L/14) | 2023-11-13 |
| CompoDiff: Versatile Composed Image Retrieval With Latent Diffusion | ✓ | 45.37 | | | CompoDiff (CLIP G/14) | 2023-03-21 |
| CoLLM: A Large Language Model for Composed Image Retrieval | ✓ | 45.3 | 34.6 | 56.0 | CoLLM (Pretrained - BLIP-L/16) | 2025-03-25 |
| MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions | ✓ | 45.3 | | | MagicLens (CoCa B) | 2024-03-28 |
| Zero-shot Composed Text-Image Retrieval | ✓ | 44.75 | | | TransAgg (Laion-CIR-Combined) | 2023-06-12 |
| Training-free Zero-shot Composed Image Retrieval via Weighted Modality Fusion and Similarity | ✓ | 44.58 | | | WeiMoCIR (CLIP H/14) | 2024-09-07 |
| CompoDiff: Versatile Composed Image Retrieval With Latent Diffusion | ✓ | 44.11 | | | CompoDiff (CLIP L/14) | 2023-03-21 |
| LDRE: LLM-based Divergent Reasoning and Ensemble for Zero-Shot Composed Image Retrieval | ✓ | 43.98 | | | LDRE (CLIP G/14) | 2024-07-11 |
| Reason-before-Retrieve: One-Stage Reflective Chain-of-Thoughts for Training-Free Zero-Shot Composed Image Retrieval | ✓ | 42.87 | | | OSrCIR (CLIP B/32) | 2024-12-15 |
| Reason-before-Retrieve: One-Stage Reflective Chain-of-Thoughts for Training-Free Zero-Shot Composed Image Retrieval | ✓ | 42.82 | | | OSrCIR (CLIP L/14) | 2024-12-15 |
| Vision-by-Language for Training-Free Compositional Image Retrieval | ✓ | 42.28 | | | CIReVL (CLIP G/14) | 2023-10-13 |
| MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions | ✓ | 41.6 | 30.7 | 52.5 | MagicLens (CLIP L) | 2024-03-28 |
| Training-free Zero-shot Composed Image Retrieval via Weighted Modality Fusion and Similarity | ✓ | 41.27 | | | WeiMoCIR (CLIP L/14) | 2024-09-07 |
| An Efficient Post-hoc Framework for Reducing Task Discrepancy of Text Encoders for Composed Image Retrieval | ✓ | 40.66 | | | RTD + LinCIR (CLIP L/14) | 2024-06-13 |
| Training-free Zero-shot Composed Image Retrieval via Weighted Modality Fusion and Similarity | ✓ | 39.84 | | | WeiMoCIR (CLIP B/32) | 2024-09-07 |
| CoLLM: A Large Language Model for Composed Image Retrieval | ✓ | 39.8 | 30.1 | 49.5 | CoLLM (Pretrained - CLIP-L/14) | 2025-03-25 |
| iSEARLE: Improving Textual Inversion for Zero-Shot Composed Image Retrieval | ✓ | 39.39 | | | iSEARLE-XL-OTI (CLIP L/14) | 2024-05-05 |
| Vision-by-Language for Training-Free Compositional Image Retrieval | ✓ | 38.82 | | | CIReVL (CLIP B/32) | 2023-10-13 |
| Vision-by-Language for Training-Free Compositional Image Retrieval | ✓ | 38.56 | | | CIReVL (CLIP L/14) | 2023-10-13 |
| Context-I2W: Mapping Images to Context-dependent Words for Accurate Zero-Shot Composed Image Retrieval | ✓ | 38.35 | | | Context-I2W (CLIP L/14) | 2023-09-28 |
| iSEARLE: Improving Textual Inversion for Zero-Shot Composed Image Retrieval | ✓ | 38.24 | | | iSEARLE-XL (CLIP L/14) | 2024-05-05 |
| Zero-Shot Composed Image Retrieval with Textual Inversion | ✓ | 37.76 | | | SEARLE-XL-OTI (CLIP L/14) | 2023-03-27 |
| MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions | ✓ | 36.85 | | | MagicLens (CLIP B) | 2024-03-28 |
| Language-only Efficient Training of Zero-shot Composed Image Retrieval | ✓ | 36.39 | | | LinCIR (CLIP L/14) | 2023-12-04 |
| Zero-Shot Composed Image Retrieval with Textual Inversion | ✓ | 35.90 | | | SEARLE-XL (CLIP L/14) | 2023-03-27 |
| iSEARLE: Improving Textual Inversion for Zero-Shot Composed Image Retrieval | ✓ | 34.93 | | | iSEARLE-OTI (CLIP B/32) | 2024-05-05 |
| iSEARLE: Improving Textual Inversion for Zero-Shot Composed Image Retrieval | ✓ | 34.60 | | | iSEARLE (CLIP B/32) | 2024-05-05 |
| Pic2Word: Mapping Pictures to Words for Zero-shot Composed Image Retrieval | ✓ | 34.20 | | | Pic2Word | 2023-02-06 |
| Zero-Shot Composed Image Retrieval with Textual Inversion | ✓ | 32.71 | | | SEARLE (CLIP B/32) | 2023-03-27 |
| Zero-Shot Composed Image Retrieval with Textual Inversion | ✓ | 32.39 | | | SEARLE-OTI (CLIP B/32) | 2023-03-27 |
| "This is my unicorn, Fluffy": Personalizing frozen vision-language representations | ✓ | 28.51 | | | PALAVRA | 2022-04-04 |
| ImageScope: Unifying Language-Guided Image Retrieval via Large Multimodal Model Collective Reasoning | ✓ | | 31.36 | 50.78 | ImageScope (CLIP-ViT-L/14) | 2025-03-13 |
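
The leaderboard ranks entries by (Recall@10+Recall@50)/2. As a rough illustration of how that figure relates to the two recall columns, here is a minimal Python sketch. It assumes each query has exactly one correct target image and that recall is reported as a percentage; the function names `recall_at_k` and `mean_recall`, and the toy rankings, are hypothetical and not taken from any of the listed papers or from the site.

```python
from typing import Sequence


def recall_at_k(rankings: Sequence[Sequence[str]], targets: Sequence[str], k: int) -> float:
    """Percentage of queries whose target appears among the top-k retrieved items.

    Assumption: one ground-truth target per query.
    """
    hits = sum(target in ranked[:k] for ranked, target in zip(rankings, targets))
    return 100.0 * hits / len(targets)


def mean_recall(r_at_10: float, r_at_50: float) -> float:
    """Average of Recall@10 and Recall@50, i.e. the ranking column."""
    return (r_at_10 + r_at_50) / 2


# Toy data (hypothetical): two queries, three gallery images each.
rankings = [["a", "b", "c"], ["c", "a", "b"]]
targets = ["b", "c"]
print(recall_at_k(rankings, targets, 1))  # only the second query hits at k=1
print(recall_at_k(rankings, targets, 2))  # both queries hit at k=2

# Rows from the leaderboard that report both R@10 and R@50.
reported = {
    "CoLLM (finetuned - BLIP-L/16)": (39.1, 60.7),
    "SCOT (WACV 2025)": (38.45, 60.03),
    "MagicLens (CLIP L)": (30.7, 52.5),
}
for name, (r10, r50) in reported.items():
    print(f"{name}: {mean_recall(r10, r50):.2f}")
```

For the rows that list both recall values, the averaged figure computed this way matches the ranking column to the precision shown.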