OpenCodePapers

Image Generation on ImageNet 256x256

Image Generation
Leaderboard
| Paper | Code | FID | Inception score | NFE | Model name | Release date |
|---|---|---|---|---|---|---|
| Unified Continuous Generative Models | ✓ | 1.06 | | 80 | SiT-XL/2 + UCGM-S (E2E-VAE + 40 sampling steps + CFG) | 2025-05-12 |
| Unified Continuous Generative Models | ✓ | 1.21 | | 30 | UCGM-XL/2 (VA-VAE + 30 sampling steps, without guidance) | 2025-05-12 |
| Unified Continuous Generative Models | ✓ | 1.21 | | 40 | UCGM-XL/2 (E2E-VAE + 40 sampling steps, without guidance) | 2025-05-12 |
| Direct Discriminative Optimization: Your Likelihood-Based Visual Generative Model is Secretly a GAN Discriminator | ✓ | 1.21 | | 50 | EDM2-L + DDO (SD-VAE, 25 steps, DPM-Solver-v3) | 2025-03-03 |
| Unified Continuous Generative Models | ✓ | 1.21 | | 100 | LightningDiT + UCGM-S (VA-VAE + 50 sampling steps + CFG) | 2025-05-12 |
| Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generation | ✓ | 1.24 | | | xAR-H | 2025-02-27 |
| REPA-E: Unlocking VAE for End-to-End Tuning of Latent Diffusion Transformers | ✓ | 1.26 | 314.9 | | SiT-XL/2 + REPA-E | 2025-04-15 |
| DDT: Decoupled Diffusion Transformer | ✓ | 1.26 | 310.6 | | DDT-XL/2(22en6de 675M + guidance interval) | 2025-04-08 |
| Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generation | ✓ | 1.28 | | | xAR-L | 2025-02-27 |
| Flow-Anchored Consistency Models | ✓ | 1.32 | | 2 | FACM (2-step) | 2025-07-04 |
| Generative Modeling with Explicit Memory | ✓ | 1.32 | | | GMem (with the guidance interval) | 2024-12-11 |
| Diffusion Models without Classifier-free Guidance | ✓ | 1.34 | | | SiT-XL/2 + MG | 2025-02-17 |
| AliTok: Towards Sequence Modeling Alignment between Tokenizer and Autoregressive Model | ✓ | 1.35 | 318.8 | | AliTok-XL, autoregressive, 662M | 2025-06-05 |
| Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models | ✓ | 1.35 | | | LightningDiT + VA-VAE (with the guidance interval) | 2025-01-02 |
| Simpler Diffusion (SiD2): 1.5 FID on ImageNet512 with pixel-space diffusion | | 1.38 | | | SiD2 | 2024-10-25 |
| U-REPA: Aligning Diffusion U-Nets to ViTs | ✓ | 1.41 | | | SiT↓-XL/2+U-REPA (with the guidance interval) | 2025-03-24 |
| AliTok: Towards Sequence Modeling Alignment between Tokenizer and Autoregressive Model | ✓ | 1.42 | 326.6 | | AliTok-XL, autoregressive, 318M | 2025-06-05 |
| Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think | ✓ | 1.42 | | | SiT-XL/2 + REPA (with the guidance interval) | 2024-10-09 |
| Randomized Autoregressive Visual Generation | ✓ | 1.48 | | | RAR-XXL, autoregressive | 2024-11-01 |
| Randomized Autoregressive Visual Generation | ✓ | 1.50 | | | RAR-XL, autoregressive | 2024-11-01 |
| MaskBit: Embedding-free Image Generation via Bit Tokens | ✓ | 1.52 | | | MaskBit | 2024-09-24 |
| Generative Modeling with Explicit Memory | ✓ | 1.53 | | | GMem (w/o guidance) | 2024-12-11 |
| Elucidating the design space of language models for image generation | ✓ | 1.54 | | | ELM | 2024-10-21 |
| Autoregressive Image Generation without Vector Quantization | ✓ | 1.55 | | | MAR-H, Diff Loss | 2024-06-17 |
| PaGoDA: Progressive Growing of a One-Step Generator from a Low-Resolution Diffusion Teacher | ✓ | 1.56 | | | PaGoDA | 2024-05-23 |
| Efficient Diffusion Training via Min-SNR Weighting Strategy | ✓ | 1.57 | | | ViT-XL/2 with limited Interval Guidance | 2023-03-16 |
| MDTv2: Masked Diffusion Transformer is a Strong Image Synthesizer | ✓ | 1.58 | | | MDTv2 | 2023-03-25 |
| No Other Representation Component Is Needed: Diffusion Transformers Can Provide Representation Guidance by Themselves | ✓ | 1.58 | | | SiT-XL + SRA | 2025-05-05 |
| Robust Latent Matters: Boosting Image Generation with Sampling Error | ✓ | 1.60 | | | RobustTok-L | 2025-03-11 |
| Alleviating Distortion in Image Generation via Multi-Resolution Diffusion Models and Time-Dependent Layer Normalization | ✓ | 1.63 | | | DiMR-G/2R | 2024-06-13 |
| FlowAR: Scale-wise Autoregressive Image Generation Meets Flow Matching | ✓ | 1.65 | | | FlowAR | 2024-12-19 |
| Flow-Anchored Consistency Models | ✓ | 1.70 | | 1 | FACM (1-step) | 2025-07-04 |
| CADS: Unleashing the Diversity of Diffusion Models through Condition-Annealed Sampling | | 1.70 | | | DiT-XL/2 with CADS | 2023-10-26 |
| Alleviating Distortion in Image Generation via Multi-Resolution Diffusion Models and Time-Dependent Layer Normalization | ✓ | 1.70 | | | DiMR-XL/2R | 2024-06-13 |
| Randomized Autoregressive Visual Generation | ✓ | 1.70 | | | RAR-L, autoregressive | 2024-11-01 |
| DiffiT: Diffusion Vision Transformers for Image Generation | ✓ | 1.73 | | | DiffiT | 2023-12-04 |
| Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction | ✓ | 1.73 | | | VAR (Visual Autoregressive) | 2024-04-03 |
| Language Model Beats Diffusion -- Tokenizer is Key to Visual Generation | ✓ | 1.78 | | | MAGVIT-v2 | 2023-10-09 |
| Autoregressive Image Generation without Vector Quantization | ✓ | 1.78 | | | MAR-L, Diff Loss | 2024-06-17 |
| MDTv2: Masked Diffusion Transformer is a Strong Image Synthesizer | ✓ | 1.79 | | | MDT | 2023-03-25 |
| Refining Generative Process with Discriminator Guidance in Score-based Diffusion Models | ✓ | 1.83 | | | Discriminator Guidance | 2022-11-28 |
| Diffusion Models Need Visual Priors for Image Generation | | 1.83 | | | DoD-XL | 2024-10-11 |
| Robust Latent Matters: Boosting Image Generation with Sampling Error | ✓ | 1.83 | | | RobustTok-B | 2025-03-11 |
| Autoregressive Image Generation with Randomized Parallel Decoding | ✓ | 1.94 | | | ARPG-XXL | 2025-03-13 |
| Randomized Autoregressive Visual Generation | ✓ | 1.95 | | | RAR-B, autoregressive | 2024-11-01 |
| An Image is Worth 32 Tokens for Reconstruction and Generation | ✓ | 1.97 | | | TiTok-S-128 | 2024-06-11 |
| PixelFlow: Pixel-Space Generative Models with Flow | ✓ | 1.98 | | | PixelFlow | 2025-04-10 |
| Relay Diffusion: Unifying diffusion process across resolutions for image synthesis | ✓ | 1.99 | | | RDM | 2023-09-04 |
| FasterDiT: Towards Faster Diffusion Transformers Training without Architecture Modification | ✓ | 2.03 | | | FasterDiT-XL/2 | 2024-10-14 |
| Learning Stackable and Skippable LEGO Bricks for Efficient, Reconfigurable, and Variable-Resolution Diffusion Modeling | ✓ | 2.05 | 338.08 | | LEGO-XL | 2023-10-10 |
| Autoregressive Image Generation with Randomized Parallel Decoding | ✓ | 2.1 | | | ARPG-XL | 2025-03-13 |
| SAN: Inducing Metrizability of GAN with Discriminative Normalized Linear Layer | ✓ | 2.14 | | | StyleSAN-XL | 2023-01-30 |
| Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation | ✓ | 2.18 | | | LlamaGen | 2024-06-10 |
| Scalable Diffusion Models with Transformers | ✓ | 2.27 | | | DiT-XL/2 | 2022-12-19 |
| StyleGAN-XL: Scaling StyleGAN to Large Diverse Datasets | ✓ | 2.30 | | | StyleGAN-XL | 2022-02-01 |
| Autoregressive Image Generation without Vector Quantization | ✓ | 2.31 | | | MAR-B, Diff Loss | 2024-06-17 |
| Open-MAGVIT2: An Open-Source Project Toward Democratizing Auto-regressive Visual Generation | ✓ | 2.33 | | | Open-MAGVIT2-XL | 2024-09-06 |
| ACDiT: Interpolating Autoregressive Conditional Modeling and Diffusion Transformer | ✓ | 2.37 | | | ACDiT | 2024-12-10 |
| Autoregressive Image Generation with Randomized Parallel Decoding | ✓ | 2.44 | | | ARPG-L | 2025-03-13 |
| An Image is Worth 32 Tokens for Reconstruction and Generation | ✓ | 2.48 | | | TiTok-B-64 | 2024-06-11 |
| GIVT: Generative Infinite-Vocabulary Transformers | ✓ | 2.59 | | | GIVT-Causal-L+A | 2023-12-04 |
| | | 2.74 | | | Patch Diffusion | |
| An Image is Worth 32 Tokens for Reconstruction and Generation | ✓ | 2.77 | | | TiTok-B-32 | 2024-06-11 |
| Diffusion Models Need Visual Priors for Image Generation | | 2.79 | | | DoD-B | 2024-10-11 |
| Polynomial Implicit Neural Representations For Large Diverse Datasets | ✓ | 2.86 | | | Poly-INR | 2023-03-20 |
| MGVQ: Could VQ-VAE Beat VAE? A Generalizable Tokenizer with Multi-group Quantization | ✓ | 3.02 | 294.1 | | MGVQ | 2025-07-14 |
| Refining Generative Process with Discriminator Guidance in Score-based Diffusion Models | ✓ | 3.18 | | | ADM-G++ (FID) | 2022-11-28 |
| Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective | ✓ | 3.39 | 205.96 | | DiGIT-0.7B | 2024-10-16 |
| Draft-and-Revise: Effective Image Generation with Contextual RQ-Transformer | | 3.41 | | | Contextual RQ-Transformer | 2022-06-09 |
| Scaling up GANs for Text-to-Image Synthesis | ✓ | 3.45 | | | GigaGAN | 2023-03-09 |
| Return of Unconditional Generation: A Self-supervised Representation Generation Method | ✓ | 3.49 | | | RCG-L (w/o guidance) | 2023-12-06 |
| BIGRoC: Boosting Image Generation via a Robust Classifier | ✓ | 3.63 | | | BIGRoC-gt (Guided-Diffusion) | 2021-08-08 |
| Language Model Beats Diffusion -- Tokenizer is Key to Visual Generation | ✓ | 3.65 | | | MAGVIT-v2 (w/o guidance) | 2023-10-09 |
| BIGRoC: Boosting Image Generation via a Robust Classifier | ✓ | 3.69 | | | BIGRoC-pl (Guided-Diffusion) | 2021-08-08 |
| Simple diffusion: End-to-end diffusion for high resolution images | ✓ | 3.71 | | | simple diffusion (U-Net) | 2023-01-26 |
| Simple diffusion: End-to-end diffusion for high resolution images | ✓ | 3.75 | | | simple diffusion (U-ViT, L) | 2023-01-26 |
| Autoregressive Image Generation using Residual Quantization | ✓ | 3.83 | | | RQ-Transformer | 2022-03-03 |
| Diffusion Models Beat GANs on Image Synthesis | ✓ | 3.94 | | | ADM-G, ADM-U | 2021-05-11 |
| Entropy-driven Sampling and Training Scheme for Conditional Diffusion Generation | ✓ | 3.96 | | | ADM-G + EDS (ED-DPM, classifier_scale=0.75) | 2022-06-23 |
| MaskGIT: Masked Generative Image Transformer | ✓ | 4.02 | | | MaskGIT (a=0.05) | 2022-02-08 |
| Entropy-driven Sampling and Training Scheme for Conditional Diffusion Generation | ✓ | 4.09 | | | ADM-G + EDS + ECT (ED-DPM, classifier_scale=1.0) | 2022-06-23 |
| | | 4.29 | | | LDM | |
| Refining Generative Process with Discriminator Guidance in Score-based Diffusion Models | ✓ | 4.45 | | | ADM-G++ (Recall) | 2022-11-28 |
| Flow Matching in Latent Space | ✓ | 4.46 | | | LFM | 2023-07-17 |
| Scalable Adaptive Computation for Iterative Generation | ✓ | 4.51 | | | RIN | 2022-12-22 |
| Diffusion Models Beat GANs on Image Synthesis | ✓ | 4.59 | | | ADM-G | 2021-05-11 |
| Cascaded Diffusion Models for High Fidelity Image Generation | | 4.88 | | | CDM | 2021-05-30 |
| Taming Transformers for High-Resolution Image Synthesis | ✓ | 5.2 | | | VQGAN+Transformer (k=600, p=1.0, a=0.05) | 2020-12-17 |
| MaskGIT: Masked Generative Image Transformer | ✓ | 6.18 | | | MaskGIT | 2022-02-08 |
| Taming Transformers for High-Resolution Image Synthesis | ✓ | 6.59 | | | VQGAN+Transformer (k=mixed, p=1.0, a=0.005) | 2020-12-17 |
| Polarity Sampling: Quality and Diversity Control of Pre-Trained Generative Networks via Singular Values | ✓ | 6.82 | | | Polarity-BigGAN | 2022-03-03 |
| Large Scale GAN Training for High Fidelity Natural Image Synthesis | ✓ | 8.1 | | | BigGAN-deep | 2018-09-28 |
| | | 11.84 | | | ADM | |
| Improved Denoising Diffusion Probabilistic Models | ✓ | 12.3 | | | Improved DDPM | 2021-02-18 |