Paper | Code | Top-1 Accuracy | Model Name | Release Date |
---|---|---|---|---|
OmniVec2 - A Novel Transformer based Network for Large Scale Multimodal and Multitask Learning | | 53.1 | OmniVec2 | 2024-01-01 |
OmniVec: Learning robust representations with cross modal sharing | | 49.8 | OmniVec | 2023-11-07 |
CoCa: Contrastive Captioners are Image-Text Foundation Models | ✓ Link | 49.0 | CoCa (finetuned) | 2022-05-04 |
CoCa: Contrastive Captioners are Image-Text Foundation Models | ✓ Link | 47.4 | CoCa (frozen) | 2022-05-04 |