Paper | Code | Total Score | Venue | Aesthetics | Motion | FaceSim | GmeScore | NexusScore | NaturalScore | Model Name | Release Date |
---|---|---|---|---|---|---|---|---|---|---|---|
Identity-Preserving Text-to-Video Generation by Frequency Decomposition | ✓ Link | 0.5446 | Closed-Source | 0.4460 | 0.4160 | 0.4010 | 0.6620 | 0.4592 | 0.7906 | Kling 1.6 | 2024-11-26 |
VACE: All-in-One Video Creation and Editing | ✓ Link | 0.5287 | Open-Source | 0.4721 | 0.1502 | 0.5509 | 0.6727 | 0.4420 | 0.7278 | Wan2.1-VACE-14B | 2025-03-10 |
Phantom: Subject-Consistent Video Generation via Cross-Modal Alignment | ✓ Link | 0.5232 | Open-Source | 0.4639 | 0.3342 | 0.5148 | 0.7065 | 0.3743 | 0.6866 | Phantom-Wan-14B | 2025-02-16 |
Phantom: Subject-Consistent Video Generation via Cross-Modal Alignment | ✓ Link | 0.5071 | Open-Source | 0.4667 | 0.1429 | 0.4855 | 0.6942 | 0.4244 | 0.7026 | Phantom-Wan-1.3B | 2025-02-16 |
SkyReels-A2: Compose Anything in Video Diffusion Transformers | ✓ Link | 0.4961 | Open-Source | 0.3940 | 0.2560 | 0.4595 | 0.6454 | 0.4377 | 0.6722 | SkyReels-A2-Wan2.1-14B-Preview | 2025-04-03 |
Identity-Preserving Text-to-Video Generation by Frequency Decomposition | ✓ Link | 0.4888 | Closed-Source | 0.4687 | 0.2470 | 0.3080 | 0.6921 | 0.4541 | 0.6979 | Pika 2.1 | 2024-11-26 |
MAGREF: Masked Guidance for Any-Reference Video Generation | ✓ Link | 0.4793 | Open-Source | 0.4502 | 0.2181 | 0.3083 | 0.7047 | 0.4304 | 0.6949 | MAGREF-480P | 2025-05-29 |
Identity-Preserving Text-to-Video Generation by Frequency Decomposition | ✓ Link | 0.4759 | Closed-Source | 0.4147 | 0.1352 | 0.3511 | 0.6757 | 0.4355 | 0.7144 | Vidu 2.0 | 2024-11-26 |
VACE: All-in-One Video Creation and Editing | ✓ Link | 0.4553 | Open-Source | 0.4824 | 0.1883 | 0.2058 | 0.7126 | 0.3795 | 0.7178 | Wan2.1-VACE-1.3B | 2025-03-10 |
VACE: All-in-One Video Creation and Editing | ✓ Link | 0.4395 | Open-Source | 0.4727 | 0.1203 | 0.1658 | 0.7138 | 0.4004 | 0.7056 | Wan2.1-VACE-1.3B-Preview | 2025-03-10 |