Two Heads are Better Than One: Hypergraph-Enhanced Graph Reasoning for Visual Event Ratiocination | | 16.7 | 14.1 | 37.8 | | | | | | | | HEGR | 2021-07-18 |
Imagine, Reason and Write: Visual Storytelling with Graph Knowledge and Relational Reasoning | | 15.4 | 11.0 | 35.6 | 66.7 | 41.6 | 25.0 | 29.6 | | | | IRW | 2021-05-18 |
Visual Storytelling with Hierarchical BERT Semantic Guidance | | 15.4 | | 36.5 | | | | | | | | HBSG | 2022-01-10 |
Coherent Visual Storytelling via Parallel Top-Down Visual and Topic Attention | | 15.2 | 11.5 | 36.5 | 67.5 | 42.7 | 25.3 | 30.8 | | | | CoVS | 2022-08-17 |
SentiStory: A Multi-Layered Sentiment-Aware Generative Model for Visual Storytelling | | 14.8 | 10.1 | 35.7 | 65.5 | 40.7 | 24.1 | 30.2 | | | | SentiStory | 2022-06-16 |
Diverse and Relevant Visual Storytelling with Scene Graph Embeddings | | 14.8 | 8.6 | 35.6 | 62.2 | 38.7 | 23.5 | 30.2 | | | | SGEmb | 2020-11-01 |
Hide-and-Tell: Learning to Bridge Photo Streams for Visual Storytelling | | 14.7 | 10 | 35.6 | 64.4 | 40.1 | 23.9 | 29.7 | | | | INet | 2020-02-03 |
Storytelling from an Image Stream Using Scene Graphs | | 14.7 | 9.8 | 35.8 | | | | 29.9 | | | | SGVST | 2020-04-03 |
Keep it Consistent: Topic-Aware Storytelling from an Image Stream via Iterative Multi-agent Communication | | 14.6 | 9.2 | 35.7 | 64.2 | 39.6 | 23.7 | 31 | | | | TAVST (RL) | 2019-11-11 |
What Makes A Good Story? Designing Composite Rewards for Visual Storytelling | ✓ Link | 14.4 | 6.7 | 35.2 | | | | 30.1 | 8.3 | | | BLEU-RL | 2019-09-11 |
Informative Visual Storytelling with Cross-modal Rules | ✓ Link | 14.3 | 9 | 35.5 | 63.8 | | | 30.2 | | | | VSCMR | 2019-07-07 |
What Makes A Good Story? Designing Composite Rewards for Visual Storytelling | ✓ Link | 14.3 | 7.2 | 34.8 | | | | 30 | 8.5 | | | MLE | 2019-09-11 |
No Metrics Are Perfect: Adversarial Reward Learning for Visual Storytelling | ✓ Link | 14.1 | 9.4 | 35 | 63.8 | 39.1 | 23.2 | 29.5 | | | | AREL-t-100 | 2018-04-24 |
Hierarchical memory decoder for visual narrating | | 14.1 | | 35.5 | | | | | | | | MemNet | 2020-09-01 |
Visual Storytelling via Predicting Anchor Word Embeddings in the Stories | | 14 | 9.9 | 35.5 | 65.1 | 40.0 | 23.4 | 30 | | | | StoryAnchor: w/ Predicted Nouns | 2020-01-13 |
No Metrics Are Perfect: Adversarial Reward Learning for Visual Storytelling | ✓ Link | 14 | 9 | 35 | 62.8 | 38.8 | 23.0 | 29.5 | | | | GAN | 2018-04-24 |
No Metrics Are Perfect: Adversarial Reward Learning for Visual Storytelling | ✓ Link | 13.7 | 8.7 | 34.8 | 62.3 | 38.2 | 22.5 | 29.7 | | | | XE-ss | 2018-04-24 |
What Makes A Good Story? Designing Composite Rewards for Visual Storytelling | ✓ Link | 13.6 | 9.1 | 35.2 | | | | 29.3 | 8.9 | | | AREL | 2019-09-11 |
Commonsense Knowledge Aware Concept Selection For Diverse and Informative Visual Storytelling | | 13 | 11 | 36.1 | | | 23.1 | 30.7 | | | | MCSM+RNN | 2021-02-05 |
AOG-LSTM: An adaptive attention neural network for visual storytelling | | 12.9 | 12.0 | 36.0 | 69 | 44 | 23.9 | 30.1 | | | | AOG + ARS | 2023-06-26 |
Knowledgeable Storyteller: A Commonsense-Driven Generative Model for Visual Storytelling | ✓ Link | 12.8 | 12.1 | 35.2 | | | | 29.9 | | | | K-Storyteller | 2019-05-04 |
Contextualize, Show and Tell: A Neural Visual Storyteller | ✓ Link | 12.7 | 5.1 | 34.4 | 60.1 | 36.5 | 21.1 | 29.2 | | | | CST | 2018-06-03 |
What Makes A Good Story? Designing Composite Rewards for Visual Storytelling | ✓ Link | 12.4 | 8.6 | 33.9 | | | | 29.9 | 8.3 | | | ReCo-RL | 2019-09-11 |
Hierarchically Structured Reinforcement Learning for Topically Coherent Visual Story Generation | | 12.32 | 10.71 | 35.23 | | | | 30.84 | 12.97 | | | HSRL w/ Joint Training | 2018-05-21 |
Vision Transformer Based Model for Describing a Set of Images as a Story | | 12.3 | 4.4 | 35.4 | 63 | 37.5 | 21.5 | 31 | | | | ViT-model | 2022-10-06 |
What Makes A Good Story? Designing Composite Rewards for Visual Storytelling | ✓ Link | 9.8 | 5.9 | 30.1 | | | | 25.1 | 7.5 | | | HSRL | 2019-09-11 |
Plot and Rework: Modeling Storylines for Visual Storytelling | ✓ Link | 7.65 | | 31.6 | | | | | | 1.37 | 45.79 | PR-VIST | 2021-05-14 |
Transitional Adaptation of Pretrained Models for Visual Storytelling | | | 13.8 | 37.2 | | | | 33.1 | | | | TAPM | 2021-06-19 |
BERT-hLSTMs: BERT and Hierarchical LSTMs for Visual Storytelling | | | 8.37 | | | | | | | | | BERT-hLSTMs | 2020-12-03 |
Transitional Adaptation of Pretrained Models for Visual Storytelling | | | 8.3 | 34.1 | | | | 30.2 | | | | TAPM (no V&L) | 2021-06-19 |
BERT-hLSTMs: BERT and Hierarchical LSTMs for Visual Storytelling | | | 7.98 | | | | | | | | | hLSTMs | 2020-12-03 |
Hierarchically-Attentive RNN for Album Summarization and Storytelling | | | 7.38 | 33.94 | | | 20.78 | 29.82 | | | | h-attn-rank | 2017-08-09 |
GLAC Net: GLocal Attention Cascading Networks for Multi-image Cued Story Generation | ✓ Link | | | 30.14 | | | | | | | | GLAC Net | 2018-05-28 |