Paper | Code | mAP | Model Name | Release Date |
---|---|---|---|---|
ControlCap: Controllable Region-level Captioning | ✓ Link | 18.2 | ControlCap | 2024-01-31 |
GRiT: A Generative Region-to-text Transformer for Object Understanding | ✓ Link | 15.5 | GRiT (ViT-B) | 2022-12-01 |
Context and Attribute Grounded Dense Captioning | | 10.5 | CAG-Net | 2019-04-02 |
DenseCap: Fully Convolutional Localization Networks for Dense Captioning | ✓ Link | 5.4 | FCLN | 2015-11-24 |