OpenCodePapers

Text-to-Video Generation on UCF-101

Text-to-Video Generation

Leaderboard

Paper	Code	FVD16 (lower is better)	Model Name	Release Date
Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis		200.2	Snap Video (Zero-shot, 512x288)	2024-02-22
Make Pixels Dance: High-Dynamic Video Generation		242.82	PixelDance (Zero-shot, 256x256)	2023-11-18
Photorealistic Video Generation with Diffusion Models		258.1	W.A.L.T 3B	2023-12-11
Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis		260.1	Snap Video (Zero-shot, 288x288)	2024-02-22
Lumiere: A Space-Time Diffusion Model for Video Generation	✓ Link	332.49	Lumiere (Zero-shot, 1024x1024)	2024-01-23
VideoPoet: A Large Language Model for Zero-Shot Video Generation		355	VideoPoet	2023-12-21
Preserve Your Own Correlation: A Noise Prior for Video Diffusion Models		355.19	PYoCo (Zero-shot, 64x64)	2023-05-17
LAVIE: High-Quality Video Generation with Cascaded Latent Diffusion Models	✓ Link	526.30	LAVIE (Zero-shot, 320x512)	2023-09-26
Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models	✓ Link	550.61	Video LDM (Zero-shot, 320x512)	2023-04-18
MagicVideo: Efficient Video Generation With Latent Diffusion Models		699	MagicVideo (Zero-shot, 256x256)	2022-11-20