OpenCodePapers

Text-to-Video Generation on UCF-101

Text-to-Video Generation
Leaderboard
| Paper | Code | FVD16 | Model Name | Release Date |
|---|---|---|---|---|
| Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis | | 200.2 | Snap Video (Zero-shot, 512x288) | 2024-02-22 |
| Make Pixels Dance: High-Dynamic Video Generation | | 242.82 | PixelDance (Zero-shot, 256x256) | 2023-11-18 |
| Photorealistic Video Generation with Diffusion Models | | 258.1 | W.A.L.T 3B | 2023-12-11 |
| Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis | | 260.1 | Snap Video (Zero-shot, 288×288) | 2024-02-22 |
| Lumiere: A Space-Time Diffusion Model for Video Generation | ✓ | 332.49 | Lumiere (Zero-shot, 1024x1024) | 2024-01-23 |
| VideoPoet: A Large Language Model for Zero-Shot Video Generation | | 355 | VideoPoet | 2023-12-21 |
| Preserve Your Own Correlation: A Noise Prior for Video Diffusion Models | | 355.19 | PYoCo (Zero-shot, 64x64) | 2023-05-17 |
| LAVIE: High-Quality Video Generation with Cascaded Latent Diffusion Models | ✓ | 526.30 | LAVIE (Zero-shot, 320x512) | 2023-09-26 |
| Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models | ✓ | 550.61 | Video LDM (Zero-shot, 320x512) | 2023-04-18 |
| MagicVideo: Efficient Video Generation With Latent Diffusion Models | | 699 | MagicVideo (Zero-shot, 256x256) | 2022-11-20 |