Paper | Code | FID | NFE | Inception score | Model | Date
Direct Discriminative Optimization: Your Likelihood-Based Visual Generative Model is Secretly a GAN Discriminator | ✓ Link | 1.21 | 50 | | EDM2-L + DDO (SD-VAE, 25 steps, DPM-Solver-v3) | 2025-03-03 |
Unified Continuous Generative Models | ✓ Link | 1.24 | 300 | | DDT-XL/2 + UCGM-S (SD-VAE + 150 sampling steps + CFG) | 2025-05-12 |
Unified Continuous Generative Models | ✓ Link | 1.25 | 200 | | DDT-XL/2 + UCGM-S (SD-VAE + 100 sampling steps + CFG) | 2025-05-12 |
Guiding a Diffusion Model with a Bad Version of Itself | ✓ Link | 1.25 | | | EDM2-XXL Autoguidance | 2024-06-04 |
DDT: Decoupled Diffusion Transformer | ✓ Link | 1.28 | 500 | 305 | DDT-XL/2 (22en6de 675M + guidance interval) | 2025-04-08 |
Guiding a Diffusion Model with a Bad Version of Itself | ✓ Link | 1.34 | | | EDM2-S Autoguidance (XS, T/16) | 2024-06-04 |
Adversarial Score identity Distillation: Rapidly Surpassing the Teacher in One Step | ✓ Link | 1.366 | 1 | | SiDA-EDM2-XXL (1.5B) | 2024-10-19 |
Adversarial Score identity Distillation: Rapidly Surpassing the Teacher in One Step | ✓ Link | 1.379 | 1 | | SiDA-EDM2-XL (1.1B) | 2024-10-19 |
Applying Guidance in a Limited Interval Improves Sample and Distribution Quality in Diffusion Models | ✓ Link | 1.40 | | | EDM2-XXL w/ guidance interval | 2024-04-11 |
Adversarial Score identity Distillation: Rapidly Surpassing the Teacher in One Step | ✓ Link | 1.413 | 1 | | SiDA-EDM2-L (777M) | 2024-10-19 |
Simpler Diffusion (SiD2): 1.5 FID on ImageNet512 with pixel-space diffusion | | 1.48 | | | SiD2 | 2024-10-25 |
Adversarial Score identity Distillation: Rapidly Surpassing the Teacher in One Step | ✓ Link | 1.488 | 1 | | SiDA-EDM2-M (498M) | 2024-10-19 |
Adversarial Score identity Distillation: Rapidly Surpassing the Teacher in One Step | ✓ Link | 1.669 | 1 | | SiDA-EDM2-S (280M) | 2024-10-19 |
Applying Guidance in a Limited Interval Improves Sample and Distribution Quality in Diffusion Models | ✓ Link | 1.68 | | | EDM2-S w/ guidance interval | 2024-04-11 |
Generative Modeling with Explicit Memory | ✓ Link | 1.71 | | | GMem | 2024-12-11 |
Deep Compression Autoencoder for Efficient High-Resolution Diffusion Models | ✓ Link | 1.72 | | | DC-AE-f32 + USiT-2B | 2024-10-14 |
Autoregressive Image Generation without Vector Quantization | ✓ Link | 1.73 | | | MAR-L, Diff Loss | 2024-06-17 |
Self-Improving Diffusion Models with Synthetic Data | | 1.73 | | | SIMS | 2024-08-29 |
PaGoDA: Progressive Growing of a One-Step Generator from a Low-Resolution Diffusion Teacher | ✓ Link | 1.80 | | | PaGoDA | 2024-05-23 |
Analyzing and Improving the Training Dynamics of Diffusion Models | ✓ Link | 1.81 | 126 | | EDM2-XXL | 2023-12-05 |
Analyzing and Improving the Training Dynamics of Diffusion Models | ✓ Link | 1.85 | 126 | | EDM2-XL | 2023-12-05 |
Analyzing and Improving the Training Dynamics of Diffusion Models | ✓ Link | 1.88 | 126 | | EDM2-L | 2023-12-05 |
Adversarial Score identity Distillation: Rapidly Surpassing the Teacher in One Step | ✓ Link | 1.888 | 1 | | SiD-EDM2-XL (1.1B) | 2024-10-19 |
Adversarial Score identity Distillation: Rapidly Surpassing the Teacher in One Step | ✓ Link | 1.907 | 1 | | SiD-EDM2-L (777M) | 2024-10-19 |
Language Model Beats Diffusion -- Tokenizer is Key to Visual Generation | ✓ Link | 1.91 | | 324.3 | MAGVIT-v2 | 2023-10-09 |
Adversarial Score identity Distillation: Rapidly Surpassing the Teacher in One Step | ✓ Link | 1.969 | 1 | | SiD-EDM2-XXL (1.5B) | 2024-10-19 |
Analyzing and Improving the Training Dynamics of Diffusion Models | ✓ Link | 2.01 | 126 | | EDM2-M | 2023-12-05 |
Adversarial Score identity Distillation: Rapidly Surpassing the Teacher in One Step | ✓ Link | 2.06 | 1 | | SiD-EDM2-M (498M) | 2024-10-19 |
An Image is Worth 32 Tokens for Reconstruction and Generation | ✓ Link | 2.13 | | | TiTok-B-128 | 2024-06-11 |
Adversarial Score identity Distillation: Rapidly Surpassing the Teacher in One Step | ✓ Link | 2.156 | 1 | | SiDA-EDM2-XS (125M) | 2024-10-19 |
Analyzing and Improving the Training Dynamics of Diffusion Models | ✓ Link | 2.23 | 126 | | EDM2-S | 2023-12-05 |
CADS: Unleashing the Diversity of Diffusion Models through Condition-Annealed Sampling | | 2.31 | | | DiT-XL/2 with CADS | 2023-10-26 |
StyleGAN-XL: Scaling StyleGAN to Large Diverse Datasets | ✓ Link | 2.40 | | | StyleGAN-XL | 2022-02-01 |
An Image is Worth 32 Tokens for Reconstruction and Generation | ✓ Link | 2.49 | | | TiTok-L-64 | 2024-06-11 |
DiffiT: Diffusion Vision Transformers for Image Generation | ✓ Link | 2.67 | | 252.12 | DiffiT | 2023-12-04 |
Adversarial Score identity Distillation: Rapidly Surpassing the Teacher in One Step | ✓ Link | 2.707 | 1 | | SiD-EDM2-S (280M) | 2024-10-19 |
SA-Solver: Stochastic Adams Solver for Fast Sampling of Diffusion Models | ✓ Link | 2.80 | | | DiT-XL/2 with SA-Solver | 2023-09-10 |
Alleviating Distortion in Image Generation via Multi-Resolution Diffusion Models and Time-Dependent Layer Normalization | ✓ Link | 2.89 | | | DiMR-XL/3R | 2024-06-13 |
Analyzing and Improving the Training Dynamics of Diffusion Models | ✓ Link | 2.91 | 126 | | EDM2-XS | 2023-12-05 |
GIVT: Generative Infinite-Vocabulary Transformers | ✓ Link | 2.92 | | | GIVT-Causal-L+A | 2023-12-04 |
Scalable Diffusion Models with Transformers | ✓ Link | 3.04 | | 240.82 | DiT-XL/2 | 2022-12-19 |
Language Model Beats Diffusion -- Tokenizer is Key to Visual Generation | ✓ Link | 3.07 | | 213.1 | MAGVIT-v2 (w/o guidance) | 2023-10-09 |
Adversarial Score identity Distillation: Rapidly Surpassing the Teacher in One Step | ✓ Link | 3.353 | 1 | | SiD-EDM2-XS (125M) | 2024-10-19 |
Discrete Predictor-Corrector Diffusion Models for Image Synthesis | | 3.54 | | 350.2 | DPC-U | 2022-09-29 |
High-Resolution Image Synthesis with Latent Diffusion Models | ✓ Link | 3.60 | | 247.67 | Latent Diffusion (LDM-4-G) | 2021-12-20 |
Polynomial Implicit Neural Representations For Large Diverse Datasets | ✓ Link | 3.81 | | | Poly-INR | 2023-03-20 |
Diffusion Models Beat GANs on Image Synthesis | ✓ Link | 3.85 | | 221.72 | ADM-G, ADM-U | 2021-05-11 |
Simple diffusion: End-to-end diffusion for high resolution images | ✓ Link | 4.28 | | 171 | simple diffusion (U-Net) | 2023-01-26 |
MaskGIT: Masked Generative Image Transformer | ✓ Link | 4.46 | | 342.0 | MaskGIT (a=0.05) | 2022-02-08 |
Simple diffusion: End-to-end diffusion for high resolution images | ✓ Link | 4.53 | | 205.3 | simple diffusion (U-ViT, L) | 2023-01-26 |
MaskGIT: Masked Generative Image Transformer | ✓ Link | 7.32 | | 156.0 | MaskGIT | 2022-02-08 |
Diffusion Models Beat GANs on Image Synthesis | ✓ Link | 7.72 | | 172.71 | ADM-G | 2021-05-11 |