OpenCodePapers

zero-shot-composed-image-retrieval-zs-cir-on-1

Composed Image Retrieval (CoIR) / Zero-Shot Composed Image Retrieval (ZS-CIR)
Leaderboard
| Paper | Code | Reported metrics (in column order: R@1, R@5, R@10, R@50, Rsubset@1; empty columns omitted) | Model Name | Release Date |
| --- | --- | --- | --- | --- |
| CoLLM: A Large Language Model for Composed Image Retrieval | ✓ | 45.8, 84.7, 95.8 | CoLLM (finetuned - BLIP-L/16) | 2025-03-25 |
| CoVR-2: Automatic Data Construction for Composed Video Retrieval | ✓ | 43.74, 73.61, 83.95, 96.1 | CoVR-BLIP-2 | 2023-08-28 |
| ImageScope: Unifying Language-Guided Image Retrieval via Large Multimodal Model Collective Reasoning | ✓ | 39.37, 67.54, 78.05, 92.94 | ImageScope (CLIP-ViT-L/14) | 2025-03-13 |
| Imagine and Seek: Improving Composed Image Retrieval with an Imagined Proxy | | 39.25, 70.07, 80, 94.89 | IP-CIR + LDRE (CLIP G/14) | 2024-11-24 |
| Semantic Editing Increment Benefits Zero-Shot Composed Image Retrieval | ✓ | 38.87, 69.42, 79.42 | SEIZE (CLIP G/14) | 2024-10-28 |
| Zero-shot Composed Text-Image Retrieval | ✓ | 37.87, 68.88, 93.86 | TransAgg (Laion-CIR-Combined) | 2023-06-12 |
| Reason-before-Retrieve: One-Stage Reflective Chain-of-Thoughts for Training-Free Zero-Shot Composed Image Retrieval | ✓ | 37.26, 67.25, 77.33 | OSrCIR (CLIP G/14) | 2024-12-15 |
| SCOT: Self-Supervised Contrastive Pretraining For Zero-Shot Compositional Retrieval | | 36.82, 64.34, 74.48, 93.42 | SCOT (WACV 2025) | 2025-01-12 |
| CoLLM: A Large Language Model for Composed Image Retrieval | ✓ | 35.00, 78.6, 94.2 | CoLLM (Pretrained - BLIP-L/16) | 2025-03-25 |
| Vision-by-Language for Training-Free Compositional Image Retrieval | ✓ | 34.65, 64.29 | CIReVL (CLIP G/14) | 2023-10-13 |
| MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions | ✓ | 33.3, 67.0, 77.9, 94.4 | MagicLens (CoCa L) | 2024-03-28 |
| Training-free Zero-shot Composed Image Retrieval via Weighted Modality Fusion and Similarity | ✓ | 31.04, 60.41 | WeiMoCIR (CLIP G/14) | 2024-09-07 |
| Training-free Zero-shot Composed Image Retrieval via Weighted Modality Fusion and Similarity | ✓ | 30.94, 60.87 | WeiMoCIR (CLIP L/14) | 2024-09-07 |
| MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions | ✓ | 30.1, 61.7, 74.4, 92.6 | MagicLens (CLIP L) | 2024-03-28 |
| Imagine and Seek: Improving Composed Image Retrieval with an Imagined Proxy | | 29.76, 58.82, 71.21, 90.41 | IP-CIR + LDRE (CLIP L/14) | 2024-11-24 |
| CoLLM: A Large Language Model for Composed Image Retrieval | ✓ | 29.7, 72.8, 91.5 | CoLLM (Pretrained - CLIP-L/14) | 2025-03-25 |
| Reason-before-Retrieve: One-Stage Reflective Chain-of-Thoughts for Training-Free Zero-Shot Composed Image Retrieval | ✓ | 29.45, 57.68, 69.86 | OSrCIR (CLIP L/14) | 2024-12-15 |
| Training-free Zero-shot Composed Image Retrieval via Weighted Modality Fusion and Similarity | ✓ | 29.11, 59.76 | WeiMoCIR (CLIP H/14) | 2024-09-07 |
| Training-free Zero-shot Composed Image Retrieval via Weighted Modality Fusion and Similarity | ✓ | 26.31, 57.69 | WeiMoCIR (CLIP B/32) | 2024-09-07 |
| Reason-before-Retrieve: One-Stage Reflective Chain-of-Thoughts for Training-Free Zero-Shot Composed Image Retrieval | ✓ | 25.42, 54.54, 68.19 | OSrCIR (CLIP B/32) | 2024-12-15 |
| Vision-by-Language for Training-Free Compositional Image Retrieval | ✓ | 24.55, 52.31 | CIReVL (CLIP L/14) | 2023-10-13 |
| Vision-by-Language for Training-Free Compositional Image Retrieval | ✓ | 23.94, 52.51 | CIReVL (CLIP B/32) | 2023-10-13 |
| An Efficient Post-hoc Framework for Reducing Task Discrepancy of Text Encoders for Composed Image Retrieval | ✓ | 67.47 | RTD + LinCIR (CLIP G/14) | 2024-06-13 |
| LDRE: LLM-based Divergent Reasoning and Ensemble for Zero-Shot Composed Image Retrieval | ✓ | 66.39 | LDRE (CLIP G/14) | 2024-07-11 |
| Improving Composed Image Retrieval via Contrastive Learning with Scaling Positives and Negatives | ✓ | 65.42, 64.87 | SPN4CIR (SPN-CC) | 2024-04-17 |
| Language-only Efficient Training of Zero-shot Composed Image Retrieval | ✓ | 64.72 | LinCIR (CLIP G/14) | 2023-12-04 |
| MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions | ✓ | 64.0 | MagicLens (CoCa B) | 2024-03-28 |
| Pretrain like Your Inference: Masked Tuning Improves Zero-Shot Composed Image Retrieval | ✓ | 58.87 | MTCIR (BLIP B/16) | 2023-11-13 |
| MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions | ✓ | 58.0 | MagicLens (CLIP B) | 2024-03-28 |
| CompoDiff: Versatile Composed Image Retrieval With Latent Diffusion | ✓ | 57.61 | CompoDiff (CLIP G/14) | 2023-03-21 |
| Semantic Editing Increment Benefits Zero-Shot Composed Image Retrieval | ✓ | 57.42 | SEIZE (CLIP B/32) | 2024-10-28 |
| Semantic Editing Increment Benefits Zero-Shot Composed Image Retrieval | ✓ | 57.16 | SEIZE (CLIP L/14) | 2024-10-28 |
| An Efficient Post-hoc Framework for Reducing Task Discrepancy of Text Encoders for Composed Image Retrieval | ✓ | 56.17 | RTD + LinCIR (CLIP L/14) | 2024-06-13 |
| iSEARLE: Improving Textual Inversion for Zero-Shot Composed Image Retrieval | ✓ | 55.69 | iSEARLE (CLIP B/32) | 2024-05-05 |
| LDRE: LLM-based Divergent Reasoning and Ensemble for Zero-Shot Composed Image Retrieval | ✓ | 55.57 | LDRE (CLIP L/14) | 2024-07-11 |
| iSEARLE: Improving Textual Inversion for Zero-Shot Composed Image Retrieval | ✓ | 55.18 | iSEARLE-OTI (CLIP B/32) | 2024-05-05 |
| LDRE: LLM-based Divergent Reasoning and Ensemble for Zero-Shot Composed Image Retrieval | ✓ | 55.13 | LDRE (CLIP B/32) | 2024-07-11 |
| Context-I2W: Mapping Images to Context-dependent Words for Accurate Zero-Shot Composed Image Retrieval | ✓ | 55.1 | Context-I2W (CLIP L/14) | 2023-09-28 |
| Pretrain like Your Inference: Masked Tuning Improves Zero-Shot Composed Image Retrieval | ✓ | 54.58 | MTCIR (CLIP L/14) | 2023-11-13 |
| CompoDiff: Versatile Composed Image Retrieval With Latent Diffusion | ✓ | 54.36 | CompoDiff (CLIP L/14) | 2023-03-21 |
| iSEARLE: Improving Textual Inversion for Zero-Shot Composed Image Retrieval | ✓ | 54.05 | iSEARLE-XL-OTI (CLIP L/14) | 2024-05-05 |
| iSEARLE: Improving Textual Inversion for Zero-Shot Composed Image Retrieval | ✓ | 54.00 | iSEARLE-XL (CLIP L/14) | 2024-05-05 |
| Zero-Shot Composed Image Retrieval with Textual Inversion | ✓ | 53.42 | SEARLE | 2023-03-27 |
| Language-only Efficient Training of Zero-shot Composed Image Retrieval | ✓ | 53.25 | LinCIR (CLIP L/14) | 2023-12-04 |
| Zero-Shot Composed Image Retrieval with Textual Inversion | ✓ | 52.48 | SEARLE-XL | 2023-03-27 |
| Pic2Word: Mapping Pictures to Words for Zero-shot Composed Image Retrieval | ✓ | 51.70 | Pic2Word | 2023-02-06 |
| "This is my unicorn, Fluffy": Personalizing frozen vision-language representations | ✓ | 43.49 | PALAVRA | 2022-04-04 |