Paper | Code | CIDEr | SPICE | Model Name | Release Date |
---|---|---|---|---|---|
Prismer: A Vision-Language Model with Multi-Task Experts | ✓ Link | 107.9 | 14.8 | Prismer | 2023-03-04 |
Language Models are General-Purpose Interfaces | ✓ Link | 58.7 | 8.6 | MetaLM | 2022-06-13 |
Unifying Vision-and-Language Tasks via Text Generation | ✓ Link | 4.4 | 5.3 | VL-T5 | 2021-02-04 |