-
Efficient Diffusion Models: A Comprehensive Survey from Principles to Practices,
arXiv, 2410.11795
, arxiv, pdf, citation: -1 · Zhiyuan Ma, Yuzhu Zhang, Guoli Jia, ..., Jianjun Li, Bowen Zhou · (Efficient-DMs-Survey - ponyzym)
-
Decentralized Diffusion Models,
arXiv, 2501.05450
, arxiv, pdf, citation: -1 · David McAllister, Matthew Tancik, Jiaming Song, ..., Angjoo Kanazawa · (decentralizeddiffusion.github)
-
🌟 The GAN is dead; long live the GAN! A Modern GAN Baseline,
arXiv, 2501.05441
, arxiv, pdf, citation: -1 · Yiwen Huang, Aaron Gokaslan, Volodymyr Kuleshov, ..., James Tompkin · (R3GAN - brownvc)
-
Proactive Agents for Multi-Turn Text-to-Image Generation Under Uncertainty,
arXiv, 2412.06771
, arxiv, pdf, citation: -1 · Meera Hahn, Wenjun Zeng, Nithish Kannen, ..., Been Kim, Zi Wang · (proactive_t2i_agents - google-deepmind)
-
LoRA.rar: Learning to Merge LoRAs via Hypernetworks for Subject-Style Conditioned Image Generation,
arXiv, 2412.05148
, arxiv, pdf, citation: -1 · Donald Shenaj, Ondrej Bohdal, Mete Ozay, ..., Pietro Zanuttigh, Umberto Michieli
-
SnapGen: Taming High-Resolution Text-to-Image Models for Mobile Devices with Efficient Architectures and Training,
arXiv, 2412.09619
, arxiv, pdf, citation: -1 · Dongting Hu, Jierun Chen, Xijie Huang, ..., Yanwu Xu, Jian Ren · (snap-research.github)
-
A Noise is Worth Diffusion Guidance,
arXiv, 2412.03895
, arxiv, pdf, citation: -1 · Donghoon Ahn, Jiwon Kang, Sanghyun Lee, ..., Kyong Hwan Jin, Seungryong Kim · (cvlab-kaist.github)
-
Switti: Designing Scale-Wise Transformers for Text-to-Image Synthesis,
arXiv, 2412.01819
, arxiv, pdf, citation: -1 · Anton Voronov, Denis Kuznedelev, Mikhail Khoroshikh, ..., Valentin Khrulkov, Dmitry Baranchuk · (switti - yandex-research)
-
🌟 ChatGen: Automatic Text-to-Image Generation From FreeStyle Chatting,
arXiv, 2411.17176
, arxiv, pdf, citation: -1 · Chengyou Jia, Changliang Xia, Zhuohang Dang, ..., Hangwei Qian, Minnan Luo · (chengyou-jia.github)
-
One Diffusion to Generate Them All,
arXiv, 2411.16318
, arxiv, pdf, citation: -1 · Duong H. Le, Tuan Pham, Sangho Lee, ..., Ranjay Krishna, Jiasen Lu · (OneDiffusion - lehduong)
-
SketchAgent: Language-Driven Sequential Sketch Generation,
arXiv, 2411.17673
, arxiv, pdf, citation: -1 · Yael Vinker, Tamar Rott Shaham, Kristine Zheng, ..., Judith E Fan, Antonio Torralba
-
Edify Image: High-Quality Image Generation with Pixel Space Laplacian Diffusion Models,
arXiv, 2411.07126
, arxiv, pdf, citation: -1 · NVIDIA: Yuval Atzmon, ..., Yu Zeng, Qinsheng Zhang · (research.nvidia)
-
Monetico: An Efficient Reproduction of Meissonic for Text-to-Image Synthesis 🤗
· (Meissonic - viiika)
-
How Many Van Goghs Does It Take to Van Gogh? Finding the Imitation Threshold,
arXiv, 2410.15002
, arxiv, pdf, citation: -1 · Sahil Verma, Royi Rassin, Arnav Das, ..., Hannaneh Hajishirzi, Yanai Elazar · (MIMETIC-2.git - vsahil) · (how-many-van-goghs-does-it-take.github)
-
Multimodal Large Language Model is a Human-Aligned Annotator for Text-to-Image Generation,
arXiv, 2404.15100
, arxiv, pdf, citation: -1 · Xun Wu, Shaohan Huang, Furu Wei
-
GATE OpenING: A Comprehensive Benchmark for Judging Open-ended Interleaved Image-Text Generation,
arXiv, 2411.18499
, arxiv, pdf, citation: -1 · Pengfei Zhou, Xiaopeng Peng, Jiajun Song, ..., Wenqi Shao, Kaipeng Zhang · (opening-benchmark.github)
-
Interleaved Scene Graph for Interleaved Text-and-Image Generation Assessment,
arXiv, 2411.17188
, arxiv, pdf, citation: -1 · Dongping Chen, Ruoxi Chen, Shu Pu, ..., Pan Zhou, Ranjay Krishna · (interleave-eval.github) · (ISG - Dongping-Chen) · (arxiv) · (huggingface)
-
TypeScore: A Text Fidelity Metric for Text-to-Image Generative Models,
arXiv, 2411.02437
, arxiv, pdf, citation: -1 · Georgia Gabriela Sampaio, Ruixiang Zhang, Shuangfei Zhai, ..., Navdeep Jaitly, Yizhe Zhang
-
Diffusion Beats Autoregressive: An Evaluation of Compositional Generation in Text-to-Image Models,
arXiv, 2410.22775
, arxiv, pdf, citation: -1 · Arash Marioriyad, Parham Rezaei, Mahdieh Soleymani Baghshah, ..., Mohammad Hossein Rohban
-
SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models,
arXiv, 2411.05007
, arxiv, pdf, citation: -1 · Muyang Li, Yujun Lin, Zhekai Zhang, ..., Jun-Yan Zhu, Song Han · (deepcompressor - mit-han-lab) · (nunchaku - mit-han-lab) · (hanlab.mit) · (svdquant.mit) · (hanlab.mit)
-
SEE-DPO: Self Entropy Enhanced Direct Preference Optimization,
arXiv, 2411.04712
, arxiv, pdf, citation: -1 · Shivanshu Shekhar, Shreyas Singh, Tong Zhang
-
Simpler Diffusion (SiD2): 1.5 FID on ImageNet512 with pixel-space diffusion,
arXiv, 2410.19324
, arxiv, pdf, citation: -1 · Emiel Hoogeboom, Thomas Mensink, Jonathan Heek, ..., Ruiqi Gao, Tim Salimans · (x)
-
DynamicControl: Adaptive Condition Selection for Improved Text-to-Image Generation,
arXiv, 2412.03255
, arxiv, pdf, citation: -1 · Qingdong He, Jinlong Peng, Pengcheng Xu, ..., Xiangtai Li, Jiangning Zhang
-
From Elements to Design: A Layered Approach for Automatic Graphic Design Composition,
arXiv, 2412.19712
, arxiv, pdf, citation: -1 · Jiawei Lin, Shizhao Sun, Danqing Huang, ..., Ji Li, Jiang Bian
-
Learning Flow Fields in Attention for Controllable Person Image Generation,
arXiv, 2412.08486
, arxiv, pdf, citation: -1 · Zijian Zhou, Shikun Liu, Xiao Han, ..., Miaojing Shi, Sen He · (Leffa - franciszzj) · (huggingface) · (huggingface)
-
Steering Rectified Flow Models in the Vector Field for Controlled Image Generation,
arXiv, 2412.00100
, arxiv, pdf, citation: -1 · Maitreya Patel, Song Wen, Dimitris N. Metaxas, ..., Yezhou Yang · (flowchef.github)
-
Style-Friendly SNR Sampler for Style-Driven Generation,
arXiv, 2411.14793
, arxiv, pdf, citation: -1 · Jooyoung Choi, Chaehun Shin, Yeongtak Oh, ..., Heeseung Kim, Sungroh Yoon
-
OminiControl: Minimal and Universal Control for Diffusion Transformer,
arXiv, 2411.15098
, arxiv, pdf, citation: -1 · Zhenxiong Tan, Songhua Liu, Xingyi Yang, ..., Qiaochu Xue, Xinchao Wang · (huggingface) · (OminiControl - Yuanshi9815) · (arxiv)
-
🌟 ROICtrl: Boosting Instance Control for Visual Generation,
arXiv, 2411.17949
, arxiv, pdf, citation: -1 · Yuchao Gu, Yipin Zhou, Yunfan Ye, ..., Kevin Qinghong Lin, Mike Zheng Shou · (roictrl.github) · (ROICtrl - showlab)
-
Region-Aware Text-to-Image Generation via Hard Binding and Soft Refinement,
arXiv, 2411.06558
, arxiv, pdf, citation: -1 · Zhennan Chen, Yajie Li, Haofan Wang, ..., Jian Yang, Ying Tai · (RAG-Diffusion - NJU-PCALab)
-
text-to-pose - clement-bonnet
Improving Diffusion Model Control and Quality
-
In-Context LoRA for Diffusion Transformers
· (arxiv) · (In-Context-LoRA - ali-vilab) · (huggingface)
-
ControlNetPlus - xinsir6
All-in-one ControlNet for image generations and editing!
-
Controlling Language and Diffusion Models by Transporting Activations,
arXiv, 2410.23054
, arxiv, pdf, citation: -1 · Pau Rodriguez, Arno Blaas, Michal Klein, ..., Marco Cuturi, Xavier Suau
-
CtrLoRA: An Extensible and Efficient Framework for Controllable Image Generation,
arXiv, 2410.09400
, arxiv, pdf, citation: -1 · Yifeng Xu, Zhenliang He, Shiguang Shan, ..., Xilin Chen · (ctrlora - xyfJASON)
-
PipeFusion: Patch-level Pipeline Parallelism for Diffusion Transformers Inference,
arXiv, 2405.14430
, arxiv, pdf, citation: -1 · Jiarui Fang, Jinzhe Pan, Jiannan Wang, ..., Aoyu Li, Xibo Sun · (xDiT - xdit-project) · (models)
-
FiTv2: Scalable and Improved Flexible Vision Transformer for Diffusion Model,
arXiv, 2410.13925
, arxiv, pdf, citation: 1 · ZiDong Wang, Zeyu Lu, Di Huang, ..., Wanli Ouyang, Lei Bai
-
Taming Rectified Flow for Inversion and Editing,
arXiv, 2411.04746
, arxiv, pdf, citation: -1 · Jiangshan Wang, Junfu Pu, Zhongang Qi, ..., Xiu Li, Ying Shan · (RF-Solver-Edit - wangjiangshan0725)
-
Constant Acceleration Flow,
arXiv, 2411.00322
, arxiv, pdf, citation: -1 · Dogyun Park, Sojin Lee, Sihyeon Kim, ..., Youngjoon Hong, Hyunwoo J. Kim · (CAF - mlvlab)
-
Accelerated Diffusion Models via Speculative Sampling,
arXiv, 2501.05370
, arxiv, pdf, citation: -1 · Valentin De Bortoli, Alexandre Galashov, Arthur Gretton, ..., Arnaud Doucet
-
BiDM: Pushing the Limit of Quantization for Diffusion Models,
arXiv, 2412.05926
, arxiv, pdf, citation: -1 · Xingyu Zheng, Xianglong Liu, Yichen Bian, ..., Jinyang Guo, Haotong Qin
-
EfficientML.ai 2024 | Introduction to SVDQuant for 4-bit Diffusion Models 🎬
-
CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up,
arXiv, 2412.16112
, arxiv, pdf, citation: -1 · Songhua Liu, Zhenxiong Tan, Xinchao Wang · (CLEAR - Huage001)
-
SNOOPI: Supercharged One-step Diffusion Distillation with Proper Guidance,
arXiv, 2412.02687
, arxiv, pdf, citation: -1 · Viet Nguyen, Anh Nguyen, Trung Dao, ..., Toan Tran, Anh Tran · (snoopi-onestep.github)
-
NitroFusion: High-Fidelity Single-Step Diffusion through Dynamic Adversarial Training,
arXiv, 2412.02030
, arxiv, pdf, citation: -1 · Dar-Yen Chen, Hmrishav Bandyopadhyay, Kai Zou, ..., Yi-Zhe Song · (chendaryen.github) · (arxiv) · (huggingface) · (huggingface)
-
IntLoRA: Integral Low-rank Adaptation of Quantized Diffusion Models,
arXiv, 2410.21759
, arxiv, pdf, citation: -1 · Hang Guo, Yawei Li, Tao Dai, ..., Shu-Tao Xia, Luca Benini · (IntLoRA - csguoh)
-
SmoothCache: A Universal Inference Acceleration Technique for Diffusion Transformers,
arXiv, 2411.10510
, arxiv, pdf, citation: -1 · Joseph Liu, Joshua Geddes, Ziyu Guo, ..., Haomiao Jiang, Mahesh Kumar Nandwana · (SmoothCache - Roblox)
-
Inconsistencies In Consistency Models: Better ODE Solving Does Not Imply Better Samples,
arXiv, 2411.08954
, arxiv, pdf, citation: -1 · Noël Vouitsis, Rasa Hosseinzadeh, Brendan Leigh Ross, ..., Jesse C. Cresswell, Gabriel Loaiza-Ganem
-
Simplifying, Stabilizing and Scaling Continuous-Time Consistency Models,
arXiv, 2410.11081
, arxiv, pdf, citation: -1 · Cheng Lu, Yang Song · (openai)
-
Nested Attention: Semantic-aware Attention Values for Concept Personalization,
arXiv, 2501.01407
, arxiv, pdf, citation: -1 · Or Patashnik, Rinon Gal, Daniil Ostashev, ..., Kfir Aberman, Daniel Cohen-Or
-
Dense-Face: Personalized Face Generation Model via Dense Annotation Prediction,
arXiv, 2412.18149
, arxiv, pdf, citation: -1 · Xiao Guo, Manh Tran, Jiaxin Cheng, ..., Xiaoming Liu
-
ObjectMate: A Recurrence Prior for Object Insertion and Subject-Driven Generation,
arXiv, 2412.08645
, arxiv, pdf, citation: -1 · Daniel Winter, Asaf Shul, Matan Cohen, ..., Alex Rav-Acha, Yedid Hoshen · (object-mate)
-
🌟 Large-Scale Text-to-Image Model with Inpainting is a Zero-Shot Subject-Driven Image Generator,
arXiv, 2411.15466
, arxiv, pdf, citation: -1 · Chaehun Shin, Jooyoung Choi, Heeseung Kim, ..., Sungroh Yoon · (diptychprompting.github)
-
Diffusion Self-Distillation for Zero-Shot Customized Image Generation,
arXiv, 2411.18616
, arxiv, pdf, citation: -1 · Shengqu Cai, Eric Chan, Yunzhi Zhang, ..., Jiajun Wu, Gordon Wetzstein · (primecai.github)
-
consistory - NVlabs
Training-Free Consistent Text-to-Image Generation
-
MagicTailor: Component-Controllable Personalization in Text-to-Image Diffusion Models,
arXiv, 2410.13370
, arxiv, pdf, citation: -1 · Donghao Zhou, Jiancheng Huang, Jinbin Bai, ..., Xiaowei Hu, Pheng-Ann Heng · (correr-zhou.github) · (MagicTailor - correr-zhou)
-
· (mp.weixin.qq) · (SDXL_EcomID_ComfyUI - alimama-creative)
-
TryOffAnyone: Tiled Cloth Generation from a Dressed Person,
arXiv, 2412.08573
, arxiv, pdf, citation: -1 · Ioannis Xarchakos, Theodoros Koukopoulos
-
FashionComposer: Compositional Fashion Image Generation,
arXiv, 2412.14168
, arxiv, pdf, citation: -1 · Sihui Ji, Yiyang Wang, Xi Chen, ..., Hao Luo, Hengshuang Zhao · (sihuiji.github)
-
AnyDressing: Customizable Multi-Garment Virtual Dressing via Latent Diffusion Models,
arXiv, 2412.04146
, arxiv, pdf, citation: -1 · Xinghui Li, Qichao Sun, Pengze Zhang, ..., Songtao Zhao, Qian He · (crayon-shinchan.github) · (AnyDressing - Crayon-Shinchan)
-
TryOffDiff: Virtual-Try-Off via High-Fidelity Garment Reconstruction using Diffusion Models,
arXiv, 2411.18350
, arxiv, pdf, citation: -1 · Riza Velioglu, Petra Bevandic, Robin Chan, ..., Barbara Hammer · (rizavelioglu.github) · (tryoffdiff - rizavelioglu)
-
FitDiT: Advancing the Authentic Garment Details for High-fidelity Virtual Try-on,
arXiv, 2411.10499
, arxiv, pdf, citation: -1 · Boyuan Jiang, Xiaobin Hu, Donghao Luo, ..., Yunsheng Wu, Yanwei Fu · (FitDiT - BoyuanJiang)
-
Fashion-VDM: Video Diffusion Model for Virtual Try-On,
arXiv, 2411.00225
, arxiv, pdf, citation: -1 · Johanna Karras, Yingwei Li, Nan Liu, ..., Chris Lee, Ira Kemelmacher-Shlizerman · (johannakarras.github) · (arxiv)
-
KlingAI-Virtual-Try-On - AtaUllahB
-
Hidden in the Noise: Two-Stage Robust Watermarking for Images,
arXiv, 2412.04653
, arxiv, pdf, citation: -1 · Kasra Arabi, Benjamin Feuer, R. Teal Witter, ..., Chinmay Hegde, Niv Cohen
-
Detecting Human Artifacts from Text-to-Image Models,
arXiv, 2411.13842
, arxiv, pdf, citation: -1 · Kaihong Wang, Lingzhi Zhang, Jianming Zhang · (HADM - wangkaihong)
-
Fine-Tuned Vision Transformer (ViT) for NSFW Image Classification 🤗
-
Watermark Anything with Localized Messages,
arXiv, 2411.07231
, arxiv, pdf, citation: -1 · Tom Sander, Pierre Fernandez, Alain Durmus, ..., Teddy Furon, Matthijs Douze · (watermark-anything - facebookresearch)
-
🌟 Parallelized Autoregressive Visual Generation,
arXiv, 2412.15119
, arxiv, pdf, citation: -1 · Yuqing Wang, Shuhuai Ren, Zhijie Lin, ..., Jiashi Feng, Xihui Liu · (epiphqny.github) · (PAR - Epiphqny)
-
Next Patch Prediction for Autoregressive Visual Generation,
arXiv, 2412.15321
, arxiv, pdf, citation: -1 · Yatian Pang, Peng Jin, Shuo Yang, ..., Harry Yang, Li Yuan · (Next-Patch-Prediction - PKU-YuanGroup)
-
Distilled Decoding 1: One-step Sampling of Image Auto-regressive Models with Flow Matching,
arXiv, 2412.17153
, arxiv, pdf, citation: -1 · Enshu Liu, Xuefei Ning, Yu Wang, ..., Zinan Lin · (imagination-research.github)
-
Causal Diffusion Transformers for Generative Modeling,
arXiv, 2412.12095
, arxiv, pdf, citation: -1 · Chaorui Deng, Deyao Zhu, Kunchang Li, ..., Shi Guang, Haoqi Fan · (causalfusion. - causalfusion)
-
ACDiT: Interpolating Autoregressive Conditional Modeling and Diffusion Transformer,
arXiv, 2412.07720
, arxiv, pdf, citation: -1 · Jinyi Hu, Shengding Hu, Yuxuan Song, ..., Wei-Ying Ma, Maosong Sun · (ACDiT - thunlp)
-
Infinity: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis,
arXiv, 2412.04431
, arxiv, pdf, citation: -1 · Jian Han, Jinlai Liu, Yi Jiang, ..., Bingyue Peng, Xiaobing Liu
-
🌟 X-Prompt: Towards Universal In-Context Image Generation in Auto-Regressive Vision Language Foundation Models,
arXiv, 2412.01824
, arxiv, pdf, citation: -1 · Zeyi Sun, Ziyang Chu, Pan Zhang, ..., Dahua Lin, Jiaqi Wang · (X-Prompt - SunzeY)
-
Collaborative Decoding Makes Visual Auto-Regressive Modeling Efficient,
arXiv, 2411.17787
, arxiv, pdf, citation: -1 · Zigeng Chen, Xinyin Ma, Gongfan Fang, ..., Xinchao Wang · (CoDe - czg1225) · (czg1225.github) · (huggingface)
-
🌟 JetFormer: An Autoregressive Generative Model of Raw Images and Text,
arXiv, 2411.19722
, arxiv, pdf, citation: -1 · Michael Tschannen, André Susano Pinto, Alexander Kolesnikov · (𝕏)
-
High-Resolution Image Synthesis via Next-Token Prediction,
arXiv, 2411.14808
, arxiv, pdf, citation: -1 · Dengsheng Chen, Jie Hu, Tiezhu Yue, ..., Xiaoming Wei · (d-jepa.github)
-
Factorized Visual Tokenization and Generation,
arXiv, 2411.16681
, arxiv, pdf, citation: -1 · Zechen Bai, Jianxiong Gao, Ziteng Gao, ..., Tong He, Mike Zheng Shou · (showlab.github)
-
M-VAR: Decoupled Scale-wise Autoregressive Modeling for High-Quality Image Generation,
arXiv, 2411.10433
, arxiv, pdf, citation: -1 · Sucheng Ren, Yaodong Yu, Nataniel Ruiz, ..., Alan Yuille, Cihang Xie · (MVAR - OliverRensu)
-
Bag of Design Choices for Inference of High-Resolution Masked Generative Transformer,
arXiv, 2411.10781
, arxiv, pdf, citation: -1 · Shitong Shao, Zikai Zhou, Tian Ye, ..., Zhiqiang Xu, Zeke Xie
-
Continuous Speculative Decoding for Autoregressive Image Generation,
arXiv, 2411.11925
, arxiv, pdf, citation: -1 · Zili Wang, Robert Zhang, Kun Ding, ..., Fei Li, Shiming Xiang · (CSpD - MarkXCloud)
-
🌟 JanusFlow: Harmonizing Autoregression and Rectified Flow for Unified Multimodal Understanding and Generation,
arXiv, 2411.07975
, arxiv, pdf, citation: -1 · Yiyang Ma, Xingchao Liu, Xiaokang Chen, ..., Jiaying Liu, Chong Ruan · (Janus - deepseek-ai)
-
🌟 Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models,
arXiv, 2411.04996
, arxiv, pdf, citation: -1 · Weixin Liang, Lili Yu, Liang Luo, ..., Luke Zettlemoyer, Xi Victoria Lin
-
Randomized Autoregressive Visual Generation,
arXiv, 2411.00776
, arxiv, pdf, citation: -1 · Qihang Yu, Ju He, Xueqing Deng, ..., Xiaohui Shen, Liang-Chieh Chen · (yucornetto.github)
-
Customize Your Visual Autoregressive Recipe with Set Autoregressive Modeling,
arXiv, 2410.10511
, arxiv, pdf, citation: 1 · Wenze Liu, Le Zhuo, Yi Xin, ..., Peng Gao, Xiangyu Yue · (SAR - poppuppy)
-
Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective,
arXiv, 2410.12490
, arxiv, pdf, citation: -1 · Yongxin Zhu, Bocheng Li, Hang Zhang, ..., Linli Xu, Lidong Bing · (DiGIT - DAMO-NLP-SG) · (x)
-
PUMA: Empowering Unified MLLM with Multi-granular Visual Generation,
arXiv, 2410.13861
, arxiv, pdf, citation: -1 · Rongyao Fang, Chengqi Duan, Kun Wang, ..., Hongsheng Li, Xihui Liu
-
HART: Efficient Visual Generation with Hybrid Autoregressive Transformer,
arXiv, 2410.10812
, arxiv, pdf, citation: -1 · Haotian Tang, Yecheng Wu, Shang Yang, ..., Yao Lu, Song Han · (hart.mit)
-
Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens,
arXiv, 2410.13863
, arxiv, pdf, citation: -1 · Lijie Fan, Tianhong Li, Siyang Qin, ..., Kaiming He, Yonglong Tian
-
Janus: Decoupling Visual Encoding for Unified Multimodal Understanding and Generation,
arXiv, 2410.13848
, arxiv, pdf, citation: -1 · Chengyue Wu, Xiaokang Chen, Zhiyu Wu, ..., Chong Ruan, Ping Luo
-
Edicho: Consistent Image Editing in the Wild,
arXiv, 2412.21079
, arxiv, pdf, citation: -1 · Qingyan Bai, Hao Ouyang, Yinghao Xu, ..., Yujun Shen, Qifeng Chen
-
FreeScale: Unleashing the Resolution of Diffusion Models via Tuning-Free Scale Fusion,
arXiv, 2412.09626
, arxiv, pdf, citation: -1 · Haonan Qiu, Shiwei Zhang, Yujie Wei, ..., Yingya Zhang, Ziwei Liu · (haonanqiu) · (FreeScale - ali-vilab)
-
Arbitrary-steps Image Super-resolution via Diffusion Inversion,
arXiv, 2412.09013
, arxiv, pdf, citation: -1 · Zongsheng Yue, Kang Liao, Chen Change Loy · (InvSR - zsyOAOA)
-
BrushEdit: All-In-One Image Inpainting and Editing,
arXiv, 2412.10316
, arxiv, pdf, citation: -1 · Yaowei Li, Yuxuan Bian, Xuan Ju, ..., Yuexian Zou, Qiang Xu · (huggingface) · (huggingface) · (BrushEdit - TencentARC) · (liyaowei-stu.github)
-
facechain - modelscope
-
UniReal: Universal Image Generation and Editing via Learning Real-world Dynamics,
arXiv, 2412.07774
, arxiv, pdf, citation: -1 · Xi Chen, Zhifei Zhang, He Zhang, ..., Zhe Lin, Hengshuang Zhao · (xavierchen34.github)
-
SwiftEdit: Lightning Fast Text-Guided Image Editing via One-Step Diffusion,
arXiv, 2412.04301
, arxiv, pdf, citation: -1 · Trong-Tung Nguyen, Quang Nguyen, Khoi Nguyen, ..., Anh Tran, Cuong Pham · (swift-edit.github)
-
OSDFace - jkwang28
One-Step Diffusion Model for Face Restoration
-
OmniCreator: Self-Supervised Unified Generation with Universal Editing,
arXiv, 2412.02114
, arxiv, pdf, citation: -1 · Haodong Chen, Lan Wang, Harry Yang, ..., Ser-Nam Lim · (haroldchen19.github)
-
IC-Light - lllyasviel
· (huggingface)
-
Pathways on the Image Manifold: Image Editing via Video Generation,
arXiv, 2411.16819
, arxiv, pdf, citation: -1 · Noam Rotstein, Gal Yona, Daniel Silver, ..., David Bensaïd, Ron Kimmel
-
clarity-upscaler - philz1337x
-
Add-it: Training-Free Object Insertion in Images With Pretrained Diffusion Models,
arXiv, 2411.07232
, arxiv, pdf, citation: -1 · Yoad Tewel, Rinon Gal, Dvir Samuel, ..., Lior Wolf, Gal Chechik · (research.nvidia)
-
OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision,
arXiv, 2411.07199
, arxiv, pdf, citation: -1 · Cong Wei, Zheyang Xiong, Weiming Ren, ..., Ge Zhang, Wenhu Chen · (tiger-ai-lab.github)
-
SeedEdit: Align Image Re-Generation to Image Editing,
arXiv, 2411.06686
, arxiv, pdf, citation: -1 · Yichun Shi, Peng Wang, Weilin Huang · (team.doubao)
-
MagicQuill: An Intelligent Interactive Image Editing System,
arXiv, 2411.09703
, arxiv, pdf, citation: -1 · Zichen Liu, Yue Yu, Hao Ouyang, ..., Qifeng Chen, Yujun Shen · (magicquill - magic-quill)
-
Training-free Regional Prompting for Diffusion Transformers,
arXiv, 2411.02395
, arxiv, pdf, citation: -1 · Anthony Chen, Jianjin Xu, Wenzhao Zheng, ..., Haofan Wang, Shanghang Zhang · (Regional-Prompting-FLUX - instantX-research)
-
DreamClear: High-Capacity Real-World Image Restoration with Privacy-Safe Dataset Curation,
arXiv, 2410.18666
, arxiv, pdf, citation: -1 · Yuang Ai, Xiaoqiang Zhou, Huaibo Huang, ..., Quanzeng You, Hongxia Yang · (DreamClear - shallowdream204)
-
SGEdit: Bridging LLM with Text2Image Generative Model for Scene Graph-based Image Editing
-
InstantIR: Blind Image Restoration with Instant Generative Reference,
arXiv, 2410.06551
, arxiv, pdf, citation: -1 · Jen-Yuan Huang, Haofan Wang, Qixun Wang, ..., Peng Xing, Jen-Tse Huang
-
PMRF - ohayonguy
Towards Minimum MSE Photo-Realistic Image Restoration
-
Semantic Image Inversion and Editing using Rectified Stochastic Differential Equations,
arXiv, 2410.10792
, arxiv, pdf, citation: -1 · Litu Rout, Yujia Chen, Nataniel Ruiz, ..., Sanjay Shakkottai, Wen-Sheng Chu · (rf-inversion.github)
-
VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation,
arXiv, 2412.21059
, arxiv, pdf, citation: -1 · Jiazheng Xu, Yu Huang, Jiale Cheng, ..., Jie Tang, Yuxiao Dong · (VisionReward - THUDM)
-
Open Preference Dataset for Text-to-Image Generation by the 🤗 Community
-
Scalable Ranked Preference Optimization for Text-to-Image Generation,
arXiv, 2410.18013
, arxiv, pdf, citation: -1 · Shyamgopal Karthik, Huseyin Coskun, Zeynep Akata, ..., Jian Ren, Anil Kag
-
Improving Long-Text Alignment for Text-to-Image Diffusion Models,
arXiv, 2410.11817
, arxiv, pdf, citation: -1 · Luping Liu, Chao Du, Tianyu Pang, ..., Chongxuan Li, Dong Xu · (LongAlign - luping-liu)
-
LightningDiT - hustvl
Taming Optimization Dilemma in Latent Diffusion Models
-
🌟 Sana - NVlabs
Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer · (arxiv)
-
ComfyUI-Miaoshouai-Tagger - miaoshouai
-
ComfyUI-Image-Filters - spacepxl
-
Stable Diffusion 3.5 Large is a Multimodal Diffusion Transformer (MMDiT) text-to-image model 🤗
· (huggingface) · (sd3.5 - Stability-AI)
-
State-of-the-art video and image generation with Veo 2 and Imagen 3
-
Frames: An image generation model offering unprecedented stylistic control. 𝕏
-
Survey of User Interface Design and Interaction Techniques in Generative AI Applications,
arXiv, 2410.22370
, arxiv, pdf, citation: -1 · Reuben Luera, Ryan A. Rossi, Alexa Siu, ..., Puneet Mathur, Nedim Lipka
-
Indie Game Studio Creates Custom Stable Diffusion Based Texturing Pipeline 𝕏