Diffusion
Faster Diffusion: Rethinking the Role of UNet Encoder in Diffusion
Models
Paper
• 2312.09608
• Published
• 16
CodeFusion: A Pre-trained Diffusion Model for Code Generation
Paper
• 2310.17680
• Published
• 74
ZeroNVS: Zero-Shot 360-Degree View Synthesis from a Single Real Image
Paper
• 2310.17994
• Published
• 8
Progressive Knowledge Distillation Of Stable Diffusion XL Using Layer
Level Loss
Paper
• 2401.02677
• Published
• 25
PIXART-δ: Fast and Controllable Image Generation with Latent
Consistency Models
Paper
• 2401.05252
• Published
• 49
InstantID: Zero-shot Identity-Preserving Generation in Seconds
Paper
• 2401.07519
• Published
• 57
Towards A Better Metric for Text-to-Video Generation
Paper
• 2401.07781
• Published
• 15
Quantum Denoising Diffusion Models
Paper
• 2401.07049
• Published
• 15
SiT: Exploring Flow and Diffusion-based Generative Models with Scalable
Interpolant Transformers
Paper
• 2401.08740
• Published
• 14
CustomVideo: Customizing Text-to-Video Generation with Multiple Subjects
Paper
• 2401.09962
• Published
• 9
DiffusionGPT: LLM-Driven Text-to-Image Generation System
Paper
• 2401.10061
• Published
• 32
ImageDream: Image-Prompt Multi-view Diffusion for 3D Generation
Paper
• 2312.02201
• Published
• 35
Clockwork Diffusion: Efficient Generation With Model-Step Distillation
Paper
• 2312.08128
• Published
• 13
Mastering Text-to-Image Diffusion: Recaptioning, Planning, and
Generating with Multimodal LLMs
Paper
• 2401.11708
• Published
• 30
Lumiere: A Space-Time Diffusion Model for Video Generation
Paper
• 2401.12945
• Published
• 87
Large-scale Reinforcement Learning for Diffusion Models
Paper
• 2401.12244
• Published
• 29
Diffuse to Choose: Enriching Image Conditioned Inpainting in Latent
Diffusion Models for Virtual Try-All
Paper
• 2401.13795
• Published
• 68
Deconstructing Denoising Diffusion Models for Self-Supervised Learning
Paper
• 2401.14404
• Published
• 18
BootPIG: Bootstrapping Zero-shot Personalized Image Generation
Capabilities in Pretrained Diffusion Models
Paper
• 2401.13974
• Published
• 14
Transfer Learning for Text Diffusion Models
Paper
• 2401.17181
• Published
• 17
Training-Free Consistent Text-to-Image Generation
Paper
• 2402.03286
• Published
• 67
Paper
• 2402.03570
• Published
• 8
λ-ECLIPSE: Multi-Concept Personalized Text-to-Image Diffusion
Models by Leveraging CLIP Latent Space
Paper
• 2402.05195
• Published
• 19
Implicit Diffusion: Efficient Optimization through Stochastic Sampling
Paper
• 2402.05468
• Published
• 6
Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation
Paper
• 2402.10210
• Published
• 35
Paper
• 2402.09470
• Published
• 13
DreamMatcher: Appearance Matching Self-Attention for
Semantically-Consistent Text-to-Image Personalization
Paper
• 2402.09812
• Published
• 16
Make a Cheap Scaling: A Self-Cascade Diffusion Model for
Higher-Resolution Adaptation
Paper
• 2402.10491
• Published
• 17
FiT: Flexible Vision Transformer for Diffusion Model
Paper
• 2402.12376
• Published
• 48
DiLightNet: Fine-grained Lighting Control for Diffusion-based Image
Generation
Paper
• 2402.11929
• Published
• 11
Paper
• 2402.13144
• Published
• 100
MVDiffusion++: A Dense High-resolution Multi-view Diffusion Model for
Single or Sparse-view 3D Object Reconstruction
Paper
• 2402.12712
• Published
• 18
SDXL-Lightning: Progressive Adversarial Diffusion Distillation
Paper
• 2402.13929
• Published
• 27
T-Stitch: Accelerating Sampling in Pre-Trained Diffusion Models with
Trajectory Stitching
Paper
• 2402.14167
• Published
• 11
Playground v2.5: Three Insights towards Enhancing Aesthetic Quality in
Text-to-Image Generation
Paper
• 2402.17245
• Published
• 11
Trajectory Consistency Distillation
Paper
• 2402.19159
• Published
• 16
DistriFusion: Distributed Parallel Inference for High-Resolution
Diffusion Models
Paper
• 2402.19481
• Published
• 22
RealCustom: Narrowing Real Text Word for Real-Time Open-Domain
Text-to-Image Customization
Paper
• 2403.00483
• Published
• 16
StableDrag: Stable Dragging for Point-based Image Editing
Paper
• 2403.04437
• Published
• 27
PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K
Text-to-Image Generation
Paper
• 2403.04692
• Published
• 40
Pix2Gif: Motion-Guided Diffusion for GIF Generation
Paper
• 2403.04634
• Published
• 17
Fast High-Resolution Image Synthesis with Latent Adversarial Diffusion
Distillation
Paper
• 2403.12015
• Published
• 70
AnimateDiff-Lightning: Cross-Model Diffusion Distillation
Paper
• 2403.12706
• Published
• 18
Be Yourself: Bounded Attention for Multi-Subject Text-to-Image
Generation
Paper
• 2403.16990
• Published
• 25
SDXS: Real-Time One-Step Latent Diffusion Models with Image Conditions
Paper
• 2403.16627
• Published
• 22
FlexEdit: Flexible and Controllable Diffusion-based Object-centric Image
Editing
Paper
• 2403.18605
• Published
• 11
Bigger is not Always Better: Scaling Properties of Latent Diffusion
Models
Paper
• 2404.01367
• Published
• 22
On the Scalability of Diffusion-based Text-to-Image Generation
Paper
• 2404.02883
• Published
• 19
InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image
Generation
Paper
• 2404.02733
• Published
• 22
Cross-Attention Makes Inference Cumbersome in Text-to-Image Diffusion
Models
Paper
• 2404.02747
• Published
• 13
Freditor: High-Fidelity and Transferable NeRF Editing by Frequency
Decomposition
Paper
• 2404.02514
• Published
• 11
Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale
Prediction
Paper
• 2404.02905
• Published
• 74
CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept
Matching
Paper
• 2404.03653
• Published
• 35
ByteEdit: Boost, Comply and Accelerate Generative Image Editing
Paper
• 2404.04860
• Published
• 25
UniFL: Improve Stable Diffusion via Unified Feedback Learning
Paper
• 2404.05595
• Published
• 24
BeyondScene: Higher-Resolution Human-Centric Scene Generation With
Pretrained Diffusion
Paper
• 2404.04544
• Published
• 23
Aligning Diffusion Models by Optimizing Human Utility
Paper
• 2404.04465
• Published
• 15
Diffusion-RWKV: Scaling RWKV-Like Architectures for Diffusion Models
Paper
• 2404.04478
• Published
• 13
SwapAnything: Enabling Arbitrary Object Swapping in Personalized Visual
Editing
Paper
• 2404.05717
• Published
• 26
RealmDreamer: Text-Driven 3D Scene Generation with Inpainting and Depth
Diffusion
Paper
• 2404.07199
• Published
• 27
Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse
Controls to Any Diffusion Model
Paper
• 2404.09967
• Published
• 21
Long-form music generation with latent diffusion
Paper
• 2404.10301
• Published
• 27
EdgeFusion: On-Device Text-to-Image Generation
Paper
• 2404.11925
• Published
• 23
Hyper-SD: Trajectory Segmented Consistency Model for Efficient Image
Synthesis
Paper
• 2404.13686
• Published
• 29
Align Your Steps: Optimizing Sampling Schedules in Diffusion Models
Paper
• 2404.14507
• Published
• 23
Revisiting Text-to-Image Evaluation with Gecko: On Metrics, Prompts, and
Human Ratings
Paper
• 2404.16820
• Published
• 17
Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation
Paper
• 2404.19752
• Published
• 24
StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video
Generation
Paper
• 2405.01434
• Published
• 56
Customizing Text-to-Image Models with a Single Image Pair
Paper
• 2405.01536
• Published
• 22
Diffusion for World Modeling: Visual Details Matter in Atari
Paper
• 2405.12399
• Published
• 30
EM Distillation for One-step Diffusion Models
Paper
• 2405.16852
• Published
• 12
Kaleido Diffusion: Improving Conditional Diffusion Models with
Autoregressive Latent Modeling
Paper
• 2405.21048
• Published
• 16
Step-aware Preference Optimization: Aligning Preference with Denoising
Performance at Each Step
Paper
• 2406.04314
• Published
• 30
BitsFusion: 1.99 bits Weight Quantization of Diffusion Model
Paper
• 2406.04333
• Published
• 38
MLCM: Multistep Consistency Distillation of Latent Diffusion Model
Paper
• 2406.05768
• Published
• 13
AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising
Paper
• 2406.06911
• Published
• 12
Interpreting the Weight Space of Customized Diffusion Models
Paper
• 2406.09413
• Published
• 20
Alleviating Distortion in Image Generation via Multi-Resolution
Diffusion Models
Paper
• 2406.09416
• Published
• 29
Make It Count: Text-to-Image Generation with an Accurate Number of
Objects
Paper
• 2406.10210
• Published
• 78
Exploring the Role of Large Language Models in Prompt Encoding for
Diffusion Models
Paper
• 2406.11831
• Published
• 22
Not All Prompts Are Made Equal: Prompt-based Pruning of Text-to-Image
Diffusion Models
Paper
• 2406.12042
• Published
• 8
Immiscible Diffusion: Accelerating Diffusion Training with Noise
Assignment
Paper
• 2406.12303
• Published
• 4
Invertible Consistency Distillation for Text-Guided Image Editing in
Around 7 Steps
Paper
• 2406.14539
• Published
• 27
Repulsive Score Distillation for Diverse Sampling of Diffusion Models
Paper
• 2406.16683
• Published
• 4
Aligning Diffusion Models with Noise-Conditioned Perception
Paper
• 2406.17636
• Published
• 27
Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion
Paper
• 2407.01392
• Published
• 44
RodinHD: High-Fidelity 3D Avatar Generation with Diffusion Models
Paper
• 2407.06938
• Published
• 25
Video Diffusion Alignment via Reward Gradients
Paper
• 2407.08737
• Published
• 49
MambaVision: A Hybrid Mamba-Transformer Vision Backbone
Paper
• 2407.08083
• Published
• 32
Live2Diff: Live Stream Translation via Uni-directional Attention in
Video Diffusion Models
Paper
• 2407.08701
• Published
• 13
DistilDIRE: A Small, Fast, Cheap and Lightweight Diffusion Synthesized
Deepfake Detection
Paper
• 2406.00856
• Published
• 12
Diffree: Text-Guided Shape Free Object Inpainting with Diffusion Model
Paper
• 2407.16982
• Published
• 42
BetterDepth: Plug-and-Play Diffusion Refiner for Zero-Shot Monocular
Depth Estimation
Paper
• 2407.17952
• Published
• 32
Diffusion Feedback Helps CLIP See Better
Paper
• 2407.20171
• Published
• 36
Diffusion Augmented Agents: A Framework for Efficient Exploration and
Transfer Learning
Paper
• 2407.20798
• Published
• 24
Tora: Trajectory-oriented Diffusion Transformer for Video Generation
Paper
• 2407.21705
• Published
• 27
TurboEdit: Text-Based Image Editing Using Few-Step Diffusion Models
Paper
• 2408.00735
• Published
• 16
Smoothed Energy Guidance: Guiding Diffusion Models with Reduced Energy
Curvature of Attention
Paper
• 2408.00760
• Published
• 7
ProCreate, Dont Reproduce! Propulsive Energy Diffusion for Creative
Generation
Paper
• 2408.02226
• Published
• 11
An Object is Worth 64x64 Pixels: Generating 3D Object via Image
Diffusion
Paper
• 2408.03178
• Published
• 40
Diffusion Models as Data Mining Tools
Paper
• 2408.02752
• Published
• 15
Transformer Explainer: Interactive Learning of Text-Generative Models
Paper
• 2408.04619
• Published
• 175
Img-Diff: Contrastive Data Synthesis for Multimodal Large Language
Models
Paper
• 2408.04594
• Published
• 14
Make-An-Agent: A Generalizable Policy Network Generator with
Behavior-Prompted Diffusion
Paper
• 2407.10973
• Published
• 11
Visual Text Generation in the Wild
Paper
• 2407.14138
• Published
• 9
Paper
• 2408.07009
• Published
• 62
DC3DO: Diffusion Classifier for 3D Objects
Paper
• 2408.06693
• Published
• 11
Paper
• 2408.07116
• Published
• 20