OpenCodePapers
Text-to-Video Generation on MSR-VTT
Leaderboard
Paper
Code
FVD
CLIPSIM
CLIP-FID
FID
ModelName
ReleaseDate
Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis
104.0
0.2793
9.35
Snap Video (512×288)
2024-02-22
Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis
110.4
0.2793
8.48
Snap Video (288×288)
2024-02-22
Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization
✓ Link
188.36
0.3012
11.27
Video-LaVIT
2024-02-05
VideoPoet: A Large Language Model for Zero-Shot Video Generation
213
0.3123
VideoPoet
2023-12-21
Make Pixels Dance: High-Dynamic Video Generation
381
0.3125
PixelDance
2023-11-18
Hierarchical Spatio-temporal Decoupling for Text-to-Video Generation
✓ Link
406
0.2947
8.60
HiGen
2023-12-07
A Recipe for Scaling up Text-to-Video Generation with Text-free Videos
✓ Link
441
0.2991
8.19
TF-T2V
2023-12-25
Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation
✓ Link
538
0.3072
13.08
Show-1
2023-09-27
ModelScope Text-to-Video Technical Report
✓ Link
550
0.2930
11.09
ModelScopeT2V
2023-08-12
VideoComposer: Compositional Video Synthesis with Motion Controllability
✓ Link
580
0.2932
VideoComposer
2023-06-03
MagicVideo: Efficient Video Generation With Latent Diffusion Models
998
36.5
MagicVideo
2022-11-20
Make-A-Video: Text-to-Video Generation without Text-Video Data
✓ Link
0.3049
13.17
13.17
Make-A-Video
2022-09-29
Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models
✓ Link
0.2929
Video LDM
2023-04-18
Tell Me What Happened: Unifying Text-guided Video Completion via Multimodal Masked Video Generation
✓ Link
0.2644
23.4
MMVG
2022-11-23
Make-A-Video: Text-to-Video Generation without Text-Video Data
✓ Link
0.2631
23.59
23.59
CogVideo (English)
2022-09-29
Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models
✓ Link
0.2614
24.78
CogVideo (Chinese)
2023-04-18
NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion
✓ Link
0.2439
47.68
47.68
NUWA
2021-11-24
GODIVA: Generating Open-DomaIn Videos from nAtural Descriptions
✓ Link
0.2402
GODIVA
2021-04-30