Paper | Code | CIDEr | ROUGE-L | SPICE | Model Name | Release Date |
---|---|---|---|---|---|---|
NarrativeBridge: Enhancing Video Captioning with Causal-Temporal Narrative | | 63.51 | 31.46 | 19.25 | CEN | 2024-06-10 |
GiT: Towards Generalist Vision Transformer through Universal Language Interface | ✓ Link | 45.63 | 27.51 | 15.58 | GIT | 2024-03-14 |
SEM-POS: Grammatically and Semantically Correct Video Captioning | | 37.16 | 25.39 | 14.46 | SEM-POS | 2023-03-26 |
Action knowledge for video captioning with graph neural networks | ✓ Link | 35.08 | 25.11 | 14.55 | AKGNN | 2023-03-16 |