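🚀 Feishu wiki: "SimpleSDXL Creative Image Generation Guide", covering how to quickly download, install, and run SimpleSDXL, how to generate creative images, and how to use SimpleSDXL in different application scenarios.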
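- [2024-09-16] Unlocked LoRA support for Flux and Kolors. Moved the auto-retouch switch into the Enhance tab. Added a prompt panel switch that opens the batch wildcard panel on click. Moved the two tabs for prompt reverse-inference and image parameter extraction into the parameter settings bar. Added base-model filtering based on presets. Fixed bugs in the skip and interrupt logic of the Comfyd engine. Optimized preset parameters and preset navigation. Flux models now adapt automatically to the hardware environment. The Hyp8Q5KM model is preferred; it supports Flux LoRA and balances speed and quality. Added two presets for seamless tiling textures. Upgraded comfyd to the latest version. Optimized the download, installation, and startup flow, with mandatory base-package detection and a model-package installation script.
- [2024.08.20] Further optimized the new architecture, improving compatibility on Windows and reducing the resource cost of switching between the Fooocus and Comfy backends. Optimized Flux image generation to run with as little as 6GB of VRAM, providing two presets, the quality-first Fluxdev and the speed-first Flux+, which adapt automatically to system resources. Synced with mainline v2.5.5 and improved the Enhance UI to better match Fooocus interaction conventions.
- [2024.07.31] Optimized the new architecture for better stability and speed. Added support for the Kuaishou Kolors model, so that besides SDXL, SimpleSDXL2 supports Pony v6, Playground-v2.5, SD3m, Hunyuan, Kolors, and other models on GPUs with 6GB of VRAM, covering more scenarios. Synced with mainline v2.5.2, and optimized and adapted the retouching interface to make it easier for Chinese users to understand and use.
- [2024.06.30] Extended the architecture with a new Comfy backend, a full upgrade to SimpleSDXL2. Supports local SDXL, Hunyuan, SD3, and Playground-v2.5 models with a minimum of 6GB of VRAM, while keeping Fooocus's simple, efficient, and stable generation style. Added a blend-and-relight module that can generate foregrounds and masks on its own, and can automatically cut out product or portrait images and blend them into a new scene. Upgraded the OBP one-click prompt generator to the latest version. Overall UI optimization.
- [2024.05.28] Synced with mainline v2.4.3, adding NSFW filtering and other features.
- [2024.04.23] Upgraded OBP to the latest version and integrated the Superprompt extension, which adds descriptive detail to prompts. Added an interface to the SD3 generation engine: apply for a free membership at stability.ai, obtain an API key, and connect seamlessly to the new SD3 engine to generate images. UI improvements include merging the OBP and Superprompt entries into the prompt box, a floating hint for preset navigation, a token counter for the prompt box, and moving several image-to-image parameters up to the main operation page.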
Note: If this project has brought you convenience and value, please don't forget to give it a star "⭐️" to help it grow. Thanks! 😜
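For download, installation, and usage, see the wiki: "SimpleSDXL Creative Image Generation Guide", covering how to quickly download, install, and run SimpleSDXL, how to generate creative images, and how to use SimpleSDXL in different application scenarios.
- Full package of the standalone SimpleSDXL1 branch, including environment, program, and default models; it will receive bug fixes only, no new features: SimpleSDXL1_win64_all.zip (30G)
- Simplicity: AI should, in essence, make the complex simple, with more concise operations and ideas that are easier to realize. SimpleSDXL keeps the ease of use of Fooocus, centers on the SDXL model ecosystem, and moves further toward being open-source and controllable, simple to use, and feature-complete.
- Chinese adaptation: The Chinese environment differs from the English one in many ways, not only in language but also in habits of thought, ways of operating, and network conditions. Making the tool simpler and more enjoyable for Chinese users is the original motivation of SimpleSDXL.
- Scenario customization: Text-to-image and image-to-image have a great many use cases and need better configuration and customization. Building on presets and parameter-embedded images, SimpleSDXL makes Fooocus more open and customizable for specific scenarios, bringing out the full power of SDXL.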
Enhanced features built on Fooocus, with seamless upgrades, synchronized iteration, and the ability to run both versions in parallel. It is also adapted for mobile, so PC and phone can be used in sync.
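Choose freely between offline and online translation; translated text can be edited afterwards, which suits prompt writing better.
- Mixed Chinese-English editing: Prompt text is split into Chinese and English segments, which are translated separately and then merged, to suit the way prompts are written.
- Online and offline translators: An offline large translation model and a small slim model can be installed automatically, or a third-party translation API can be selected. Offline models need local compute; third-party APIs are convenient and cheap to connect but add an external dependency. Users can configure whichever suits their situation.
- Edit after translation: The quality of machine translation cannot be controlled, and a poor translation can skew the generated content. Editing after translation makes the translation quality visible and gives users room to refine it.
- Random choice among major providers: Stable APIs from major Chinese providers (Baidu, Alibaba, and Sogou) are used. One is chosen at random on each startup and stays fixed while running, which avoids overloading any single API while keeping translations consistent.
- Custom private translation API: A private API can be configured, making it easy to use the translation ability of large language models such as OpenAI's.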
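The mixed Chinese-English handling in the list above can be illustrated with a minimal Python sketch. It is not SimpleSDXL's actual code: the function names, the Unicode range used to detect Chinese, and the `translate` callback are all assumptions made for illustration.

```python
import re

# Assumption: "Chinese" means a run of CJK Unified Ideographs (U+4E00..U+9FFF).
# Everything else (English words, punctuation, weights) is left untouched.
_SEGMENT = re.compile(r"[\u4e00-\u9fff]+|[^\u4e00-\u9fff]+")
_CHINESE = re.compile(r"[\u4e00-\u9fff]")


def split_mixed(text):
    """Split text into (is_chinese, segment) pairs, preserving order."""
    return [(bool(_CHINESE.match(seg)), seg) for seg in _SEGMENT.findall(text)]


def translate_mixed(text, translate):
    """Translate only the Chinese segments and merge them back in place.

    `translate` is a hypothetical callback (offline model or online API).
    """
    return "".join(
        translate(seg) if is_chinese else seg
        for is_chinese, seg in split_mixed(text)
    )


if __name__ == "__main__":
    # A tiny lookup table stands in for a real translator.
    fake = {"一只猫": "a cat"}
    print(translate_mixed("一只猫, cinematic lighting", lambda s: fake.get(s, s)))
```

Because the English segments pass through unchanged, prompt syntax such as weights or tags written in ASCII is not disturbed by the translator.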
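Multiple cutout algorithms with semantic recognition that can automatically generate masks, making it easy to composite and process generated images.
- Algorithmic cutout: Segments images based on u2net, separating foreground from background or isolating the human subject in the image to be inpainted, and generates the corresponding mask for inpainting.
- Semantic cutout: Based on bert+Sam, recognizes image content through semantic understanding, then segments automatically and generates a mask for inpainting.
- Click-to-select cutout: Click a region of the image, and the Sam algorithm automatically identifies and segments the subject at that point, generating a mask for inpainting.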
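Supports wildcard phrase expressions with trigger-based display, and can randomly batch-generate a set of images under the same seed.
- Phrase syntax: Supports [Words] phrase groups, each a list of words separated by ",". Under the same seed, one word is drawn from each group and the draws are combined to batch-generate images. Each combination yields one image, so the total is the product of the word counts of all groups; the number actually required takes precedence and is not limited by the image number parameter.
- Wildcard phrases: Define a phrase group with a wildcard, in the format:
[__wildcard__:R|Lnumber:start]
R means draw at random and L means draw in order, default R; number is how many to draw, default 1; start is the position to start from when drawing in order, default 1. See the wildcard ReadMe for the full syntax.
- Automatic input trigger: Typing '[' or '_' in the prompt box automatically opens the wildcard input tool, where wildcards can be selected and appended to the prompt box.
- Nesting and dynamic loading: Supports multi-level nesting and dynamic loading of wildcards, making them more expressive.
- Customization and sharing: Supports custom wildcard shortcuts that can be shared with friends.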
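To make the "one image per combination" rule for [Words] groups concrete, here is a small Python sketch. It only illustrates the Cartesian-product idea; the function name is hypothetical, it does not handle the `[__wildcard__:...]` form, and SimpleSDXL's real parser may differ.

```python
import itertools
import re

# A [Words] group: square brackets around a comma-separated word list.
GROUP = re.compile(r"\[([^\[\]]+)\]")


def expand_groups(prompt):
    """Return one prompt per combination of words drawn from each group."""
    # re.split with a capturing group alternates literal text and group bodies.
    pieces = GROUP.split(prompt)
    literals = pieces[0::2]
    groups = [[w.strip() for w in body.split(",")] for body in pieces[1::2]]

    results = []
    for combo in itertools.product(*groups):
        parts = [literals[0]]
        for word, literal in zip(combo, literals[1:]):
            parts.append(word)
            parts.append(literal)
        results.append("".join(parts))
    return results


if __name__ == "__main__":
    for p in expand_groups("a [red, blue] [cat, dog, fox] in a garden"):
        print(p)
```

With a two-word group and a three-word group the sketch yields 2 x 3 = 6 prompts, matching the product rule described above.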
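Presets can be switched and generated through the UI, and model downloads automatically choose between domestic and international sources based on the access IP.
- Preset navigation: Preset configuration files in the presets directory are turned into entries in the top navigation bar. When the user clicks an entry, the corresponding configuration file is loaded, resetting the generation parameters and related configuration.
- Generate presets: Packages the current generation parameters and saves them as a new preset file in the presets directory, which is automatically added to the top navigation.
- Extended preset parameters: Extends the range of mainline preset parameters, adding developer-mode parameters as well as style definitions and wildcard definitions. See the preset ReadMe for the supported parameters.
- Unified model ID and download: Connects to a model information repository and uses a unified model ID (MUID) based on the hash of the model file. The availability of a preset's generation environment can be checked automatically, and missing model files are downloaded automatically.
- Generation protection: While the system is generating, the top navigation is disabled, so loading a preset cannot disrupt the generation environment.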
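The idea of identifying models by file hash and detecting missing ones can be sketched as follows. This is an assumption-laden illustration: the text above does not specify how a MUID is derived, so plain SHA-256 is used here as a stand-in, and both function names are hypothetical.

```python
import hashlib


def file_sha256(path, chunk_size=1 << 20):
    """Hash a (possibly multi-gigabyte) model file in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def missing_models(required, local_hashes):
    """List the models a preset needs that are not available locally.

    required:     {file name: expected hash} taken from a preset (assumed shape)
    local_hashes: set of hashes computed for files already on disk
    """
    return [name for name, expected in required.items()
            if expected not in local_hashes]
```

Matching on content hash rather than file name means a renamed model file is still recognized, and only the names returned by `missing_models` would need to be downloaded.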
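The original Fooocus can only browse the currently generated image set; management of finished images is very basic.
- Finished image search: Finished images can be looked up by generation date. If a single day has too many images, they are grouped into sub-indexes sized to the screen, so the gallery component is not overloaded.
- Finished image deletion: Broken images can be deleted on the spot; the corresponding entry in the parameter log is deleted with them, keeping images and logs consistent.
- Automatic prompt backfill: While browsing finished images, you can choose to backfill each image's prompt automatically, making it easy to compare and revise prompts and to regenerate images.
- Gallery interaction: The index bar of the finished-image gallery adapts to its state, collapsing and adjusting automatically so that a long directory list does not crowd the page and interfere with image generation.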
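Enhanced parameter management: view parameters instantly, embed them into images, and extract them to backfill the UI for a second generation.
- View parameters: Extracts the generation parameters of the current image from the generation log file and shows them in full in a floating layer. The layer's contents follow as you switch images in the gallery.
- Extract and regenerate: Overrides the default preset's parameters with those of the current image and backfills the prompt, so you can modify the parameters or prompt and generate again.
- Parameter-embedded images: If system-wide embedding is not enabled, the parameters of the current image can be packed and embedded into it, and the result saved to a dedicated directory for parameter-embedded images. The image description tool can extract parameters from such an image to form a new generation configuration.
- Cloud adaptation: Adds a startup parameter for the access root path, --webroot. When deployed on a cloud server behind a forwarding proxy, the root path must be configured to avoid confused URL paths.
- Cloud compute: With the front end and back end separated, the local generation back end can serve remote front-end calls, separating front-end control from generation compute so that devices without a GPU can also generate images with SDXL models.
- Mainline sync: SimpleSDXL's enhancement code is kept well structured and stays compatible with, and extensible against, the Fooocus mainline, so new mainline features and bug fixes can be synced promptly.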
Non-cherry-picked random batch by just typing two words "forest elf",
without any parameter tweaking, without any strange prompt tags.
See also non-cherry-picked generalization and diversity tests here and here and here and here.
In the entire open source community, only Fooocus can achieve this level of non-cherry-picked quality.
Fooocus is an image generating software (based on Gradio).
Fooocus is a rethinking of Stable Diffusion and Midjourney’s designs:
- Learned from Stable Diffusion, the software is offline, open source, and free.
- Learned from Midjourney, the manual tweaking is not needed, and users only need to focus on the prompts and images.
Fooocus has included and automated lots of inner optimizations and quality improvements. Users can forget all those difficult technical parameters, and just enjoy the interaction between human and computer to "explore new mediums of thought and expanding the imaginative powers of the human species" [1].
Fooocus has simplified the installation. Between pressing "download" and generating the first image, the number of needed mouse clicks is strictly limited to less than 3. Minimal GPU memory requirement is 4GB (Nvidia).
[1] David Holz, 2019.
Recently many fake websites exist on Google when you search “fooocus”. Do not trust those – here is the only official source of Fooocus.
Using Fooocus is as easy as (probably easier than) Midjourney – but this does not mean we lack functionality. Below are the details.
Midjourney | Fooocus |
---|---|
High-quality text-to-image without needing much prompt engineering or parameter tuning. (Unknown method) | High-quality text-to-image without needing much prompt engineering or parameter tuning. (Fooocus has an offline GPT-2 based prompt processing engine and lots of sampling improvements so that results are always beautiful, no matter if your prompt is as short as “house in garden” or as long as 1000 words) |
V1 V2 V3 V4 | Input Image -> Upscale or Variation -> Vary (Subtle) / Vary (Strong) |
U1 U2 U3 U4 | Input Image -> Upscale or Variation -> Upscale (1.5x) / Upscale (2x) |
Inpaint / Up / Down / Left / Right (Pan) | Input Image -> Inpaint or Outpaint -> Inpaint / Up / Down / Left / Right (Fooocus uses its own inpaint algorithm and inpaint models so that results are more satisfying than all other software that uses standard SDXL inpaint method/model) |
Image Prompt | Input Image -> Image Prompt (Fooocus uses its own image prompt algorithm so that result quality and prompt understanding are more satisfying than all other software that uses standard SDXL methods like standard IP-Adapters or Revisions) |
--style | Advanced -> Style |
--stylize | Advanced -> Advanced -> Guidance |
--niji | Multiple launchers: "run.bat", "run_anime.bat", and "run_realistic.bat". Fooocus supports SDXL models on Civitai (You can google search “Civitai” if you do not know about it) |
--quality | Advanced -> Quality |
--repeat | Advanced -> Image Number |
Multi Prompts (::) | Just use multiple lines of prompts |
Prompt Weights | You can use " I am (happy:1.5)". Fooocus uses A1111's reweighting algorithm so that results are better than ComfyUI if users directly copy prompts from Civitai. (Because if prompts are written in ComfyUI's reweighting, users are less likely to copy prompt texts as they prefer dragging files) To use embedding, you can use "(embedding:file_name:1.1)" |
--no | Advanced -> Negative Prompt |
--ar | Advanced -> Aspect Ratios |
InsightFace | Input Image -> Image Prompt -> Advanced -> FaceSwap |
Describe | Input Image -> Describe |
We also have a few things borrowed from the best parts of LeonardoAI:
LeonardoAI | Fooocus |
---|---|
Prompt Magic | Advanced -> Style -> Fooocus V2 |
Advanced Sampler Parameters (like Contrast/Sharpness/etc) | Advanced -> Advanced -> Sampling Sharpness / etc |
User-friendly ControlNets | Input Image -> Image Prompt -> Advanced |
Fooocus also developed many "fooocus-only" features for advanced users to get perfect results. Click here to browse the advanced features.
You can directly download Fooocus with:
>>> Click here to download <<<
After you download the file, please uncompress it and then run the "run.bat".
The first time you launch the software, it will automatically download models:
- It will download default models to the folder "Fooocus\models\checkpoints" given different presets. You can download them in advance if you do not want automatic download.
- Note that if you use inpaint, the first time you inpaint an image, it will download Fooocus's own inpaint control model from here as the file "Fooocus\models\inpaint\inpaint_v26.fooocus.patch" (the size of this file is 1.28GB).
After Fooocus 2.1.60, you will also have run_anime.bat and run_realistic.bat. They are different model presets (and require different models, but they will be automatically downloaded). Check here for more details.
After Fooocus 2.3.0 you can also switch presets directly in the browser. Keep in mind to add these arguments if you want to change the default behavior:
- Use --disable-preset-selection to disable preset selection in the browser.
- Use --always-download-new-model to download missing models on preset switch. Default is fallback to previous_default_models defined in the corresponding preset, also see terminal output.
If you already have these files, you can copy them to the above locations to speed up installation.
Note that if you see "MetadataIncompleteBuffer" or "PytorchStreamReader", then your model files are corrupted. Please download models again.
Below is a test on a relatively low-end laptop with 16GB System RAM and 6GB VRAM (Nvidia 3060 laptop). The speed on this machine is about 1.35 seconds per iteration. Pretty impressive – nowadays laptops with a 3060 are usually available at a very acceptable price.
Besides, recently many other software projects have reported that Nvidia drivers above 532 are sometimes 10x slower than Nvidia driver 531. If your generation time is very long, consider downloading Nvidia Driver 531 Laptop or Nvidia Driver 531 Desktop.
Note that the minimal requirement is 4GB Nvidia GPU memory (4GB VRAM) and 8GB system memory (8GB RAM). This requires using Microsoft’s Virtual Swap technique, which is automatically enabled by your Windows installation in most cases, so you often do not need to do anything about it. However, if you are not sure, or if you manually turned it off (would anyone really do that?), or if you see any "RuntimeError: CPUAllocator", you can enable it here:
Click here to see the image instructions.
And make sure that you have at least 40GB free space on each drive if you still see "RuntimeError: CPUAllocator" !
Please open an issue if you use similar devices but still cannot achieve acceptable performances.
Note that the minimal requirement for different platforms is different.
See also the common problems and troubleshooting here.
(Last tested - 2024 Aug 12 by mashb1t)
Colab | Info |
---|---|
Fooocus Official |
In Colab, you can modify the last line to !python entry_with_update.py --share --always-high-vram or !python entry_with_update.py --share --always-high-vram --preset anime or !python entry_with_update.py --share --always-high-vram --preset realistic for Fooocus Default/Anime/Realistic Edition.
You can also change the preset in the UI. Please be aware that this may lead to timeouts after 60 seconds. If this is the case, please wait until the download has finished, change the preset to initial and back to the one you've selected or reload the page.
Note that this Colab will disable refiner by default because Colab free's resources are relatively limited (and some "big" features like image prompt may cause free-tier Colab to disconnect). We make sure that basic text-to-image is always working on free-tier Colab.
Using --always-high-vram shifts resource allocation from RAM to VRAM and achieves the overall best balance between performance, flexibility and stability on the default T4 instance. Please find more information here.
Thanks to camenduru for the template!
If you want to use Anaconda/Miniconda, you can
git clone https://github.com/lllyasviel/Fooocus.git
cd Fooocus
conda env create -f environment.yaml
conda activate fooocus
pip install -r requirements_versions.txt
Then download the models: download default models to the folder "Fooocus\models\checkpoints". Or let Fooocus automatically download the models using the launcher:
conda activate fooocus
python entry_with_update.py
Or, if you want to open a remote port, use
conda activate fooocus
python entry_with_update.py --listen
Use python entry_with_update.py --preset anime or python entry_with_update.py --preset realistic for Fooocus Anime/Realistic Edition.
Your Linux needs to have Python 3.10 installed, and let's say your Python can be called with the command python3 with your venv system working; you can
git clone https://github.com/lllyasviel/Fooocus.git
cd Fooocus
python3 -m venv fooocus_env
source fooocus_env/bin/activate
pip install -r requirements_versions.txt
See the above sections for model downloads. You can launch the software with:
source fooocus_env/bin/activate
python entry_with_update.py
Or, if you want to open a remote port, use
source fooocus_env/bin/activate
python entry_with_update.py --listen
Use python entry_with_update.py --preset anime or python entry_with_update.py --preset realistic for Fooocus Anime/Realistic Edition.
If you know what you are doing, and your Linux already has Python 3.10 installed, and your Python can be called with the command python3 (and Pip with pip3), you can
git clone https://github.com/lllyasviel/Fooocus.git
cd Fooocus
pip3 install -r requirements_versions.txt
See the above sections for model downloads. You can launch the software with:
python3 entry_with_update.py
Or, if you want to open a remote port, use
python3 entry_with_update.py --listen
Use python entry_with_update.py --preset anime or python entry_with_update.py --preset realistic for Fooocus Anime/Realistic Edition.
Note that the minimal requirement for different platforms is different.
Same with the above instructions. You need to change torch to the AMD version
pip uninstall torch torchvision torchaudio torchtext functorch xformers
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm5.6
AMD is not intensively tested, however. The AMD support is in beta.
Use python entry_with_update.py --preset anime or python entry_with_update.py --preset realistic for Fooocus Anime/Realistic Edition.
Note that the minimal requirement for different platforms is different.
Same as Windows. Download the software and edit the content of run.bat as:
.\python_embeded\python.exe -m pip uninstall torch torchvision torchaudio torchtext functorch xformers -y
.\python_embeded\python.exe -m pip install torch-directml
.\python_embeded\python.exe -s Fooocus\entry_with_update.py --directml
pause
Then run the run.bat.
AMD is not intensively tested, however. The AMD support is in beta.
For AMD, use .\python_embeded\python.exe entry_with_update.py --directml --preset anime or .\python_embeded\python.exe entry_with_update.py --directml --preset realistic for Fooocus Anime/Realistic Edition.
Note that the minimal requirement for different platforms is different.
Mac is not intensively tested. Below is an unofficial guideline for using Mac. You can discuss problems here.
You can install Fooocus on Apple Mac silicon (M1 or M2) with macOS 'Catalina' or a newer version. Fooocus runs on Apple silicon computers via PyTorch MPS device acceleration. Mac Silicon computers don't come with a dedicated graphics card, resulting in significantly longer image processing times compared to computers with dedicated graphics cards.
- Install the conda package manager and pytorch nightly. Read the Accelerated PyTorch training on Mac Apple Developer guide for instructions. Make sure pytorch recognizes your MPS device.
- Open the macOS Terminal app and clone this repository with git clone https://github.com/lllyasviel/Fooocus.git.
- Change to the new Fooocus directory, cd Fooocus.
- Create a new conda environment, conda env create -f environment.yaml.
- Activate your new conda environment, conda activate fooocus.
- Install the packages required by Fooocus, pip install -r requirements_versions.txt.
- Launch Fooocus by running python entry_with_update.py. (Some Mac M2 users may need python entry_with_update.py --disable-offload-from-vram to speed up model loading/unloading.) The first time you run Fooocus, it will automatically download the Stable Diffusion SDXL models and will take a significant amount of time, depending on your internet connection.
Use python entry_with_update.py --preset anime or python entry_with_update.py --preset realistic for Fooocus Anime/Realistic Edition.
See docker.md
See the guidelines here.
Below is the minimal requirement for running Fooocus locally. If your device capability is lower than this spec, you may not be able to use Fooocus locally. (Please let us know, in any case, if your device capability is lower but Fooocus still works.)
Operating System | GPU | Minimal GPU Memory | Minimal System Memory | System Swap | Note |
---|---|---|---|---|---|
Windows/Linux | Nvidia RTX 4XXX | 4GB | 8GB | Required | fastest |
Windows/Linux | Nvidia RTX 3XXX | 4GB | 8GB | Required | usually faster than RTX 2XXX |
Windows/Linux | Nvidia RTX 2XXX | 4GB | 8GB | Required | usually faster than GTX 1XXX |
Windows/Linux | Nvidia GTX 1XXX | 8GB (* 6GB uncertain) | 8GB | Required | only marginally faster than CPU |
Windows/Linux | Nvidia GTX 9XX | 8GB | 8GB | Required | faster or slower than CPU |
Windows/Linux | Nvidia GTX < 9XX | Not supported | / | / | / |
Windows | AMD GPU | 8GB (updated 2023 Dec 30) | 8GB | Required | via DirectML (* ROCm is on hold), about 3x slower than Nvidia RTX 3XXX |
Linux | AMD GPU | 8GB | 8GB | Required | via ROCm, about 1.5x slower than Nvidia RTX 3XXX |
Mac | M1/M2 MPS | Shared | Shared | Shared | about 9x slower than Nvidia RTX 3XXX |
Windows/Linux/Mac | only use CPU | 0GB | 32GB | Required | about 17x slower than Nvidia RTX 3XXX |
* AMD GPU ROCm (on hold): AMD is still working on supporting ROCm on Windows.
* Nvidia GTX 1XXX 6GB uncertain: Some people report 6GB success on GTX 10XX, but some other people report failure cases.
Note that Fooocus is only for extremely high quality image generating. We will not support smaller models to reduce the requirement and sacrifice result quality.
See the common problems here.
Given different goals, the default models and configs of Fooocus are different:
Task | Windows | Linux args | Main Model | Refiner | Config |
---|---|---|---|---|---|
General | run.bat | | juggernautXL_v8Rundiffusion | not used | here |
Realistic | run_realistic.bat | --preset realistic | realisticStockPhoto_v20 | not used | here |
Anime | run_anime.bat | --preset anime | animaPencilXL_v500 | not used | here |
Note that the download is automatic - you do not need to do anything if the internet connection is okay. However, you can download the models manually (or move them from somewhere else) if you prefer to prepare them yourself.
In addition to running on localhost, Fooocus can also expose its UI in two ways:
- Local UI listener: use --listen (specify port e.g. with --port 8888).
- API access: use --share (registers an endpoint at .gradio.live).
In both ways the access is unauthenticated by default. You can add basic authentication by creating a file called auth.json in the main directory, which contains a list of JSON objects with the keys user and pass (see example in auth-example.json).
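As an illustration of the auth.json structure described above (a list of JSON objects with the keys user and pass), the following Python sketch builds such a file's content. The helper name and the sample credentials are hypothetical; auth-example.json in the repository remains the authoritative reference.

```python
import json


def build_auth_json(credentials):
    """Serialize (user, password) pairs as a list of {"user", "pass"} objects."""
    entries = [{"user": user, "pass": password} for user, password in credentials]
    return json.dumps(entries, indent=4)


if __name__ == "__main__":
    # Placeholder credentials for illustration only; choose your own.
    print(build_auth_json([("alice", "change-me"), ("bob", "change-me-too")]))
```

Saving the printed text as auth.json in the main directory should give the layout the paragraph describes.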
List of "Hidden" Tricks
The below things are already inside the software, and users do not need to do anything about these.
- GPT2-based prompt expansion as a dynamic style "Fooocus V2". (similar to Midjourney's hidden pre-processing and "raw" mode, or the LeonardoAI's Prompt Magic).
- Native refiner swap inside one single k-sampler. The advantage is that the refiner model can now reuse the base model's momentum (or ODE's history parameters) collected from k-sampling to achieve more coherent sampling. In Automatic1111's high-res fix and ComfyUI's node system, the base model and refiner use two independent k-samplers, which means the momentum is largely wasted, and the sampling continuity is broken. Fooocus uses its own advanced k-diffusion sampling that ensures seamless, native, and continuous swap in a refiner setup. (Update Aug 13: Actually, I discussed this with Automatic1111 several days ago, and it seems that the “native refiner swap inside one single k-sampler” is merged into the dev branch of webui. Great!)
- Negative ADM guidance. Because the highest resolution level of XL Base does not have cross attentions, the positive and negative signals for XL's highest resolution level cannot receive enough contrasts during the CFG sampling, causing the results to look a bit plastic or overly smooth in certain cases. Fortunately, since the XL's highest resolution level is still conditioned on image aspect ratios (ADM), we can modify the adm on the positive/negative side to compensate for the lack of CFG contrast in the highest resolution level. (Update Aug 16, the IOS App Draw Things will support Negative ADM Guidance. Great!)
- We implemented a carefully tuned variation of Section 5.1 of "Improving Sample Quality of Diffusion Models Using Self-Attention Guidance". The weight is set to very low, but this is Fooocus's final guarantee to make sure that the XL will never yield an overly smooth or plastic appearance (examples here). This can almost eliminate all cases for which XL still occasionally produces overly smooth results, even with negative ADM guidance. (Update 2023 Aug 18, the Gaussian kernel of SAG is changed to an anisotropic kernel for better structure preservation and fewer artifacts.)
- We modified the style templates a bit and added the "cinematic-default".
- We tested the "sd_xl_offset_example-lora_1.0.safetensors" and it seems that when the lora weight is below 0.5, the results are always better than XL without lora.
- The parameters of samplers are carefully tuned.
- Because XL uses positional encoding for generation resolution, images generated by several fixed resolutions look a bit better than those from arbitrary resolutions (because the positional encoding is not very good at handling int numbers that are unseen during training). This suggests that the resolutions in UI may be hard coded for best results.
- Separated prompts for two different text encoders seem unnecessary. Separated prompts for the base model and refiner may work, but the effects are random, and we refrain from implementing this.
- The DPM family seems well-suited for XL since XL sometimes generates overly smooth texture, but the DPM family sometimes generates overly dense detail in texture. Their joint effect looks neutral and appealing to human perception.
- A carefully designed system for balancing multiple styles as well as prompt expansion.
- Using automatic1111's method to normalize prompt emphasizing. This significantly improves results when users directly copy prompts from civitai.
- The joint swap system of the refiner now also supports img2img and upscale in a seamless way.
- CFG Scale and TSNR correction (tuned for SDXL) when CFG is bigger than 10.
After the first time you run Fooocus, a config file will be generated at Fooocus\config.txt. This file can be edited to change the model path or default parameters.
For example, an edited Fooocus\config.txt (this file will be generated after the first launch) may look like this:
{
    "path_checkpoints": "D:\\Fooocus\\models\\checkpoints",
    "path_loras": "D:\\Fooocus\\models\\loras",
    "path_embeddings": "D:\\Fooocus\\models\\embeddings",
    "path_vae_approx": "D:\\Fooocus\\models\\vae_approx",
    "path_upscale_models": "D:\\Fooocus\\models\\upscale_models",
    "path_inpaint": "D:\\Fooocus\\models\\inpaint",
    "path_controlnet": "D:\\Fooocus\\models\\controlnet",
    "path_clip_vision": "D:\\Fooocus\\models\\clip_vision",
    "path_fooocus_expansion": "D:\\Fooocus\\models\\prompt_expansion\\fooocus_expansion",
    "path_outputs": "D:\\Fooocus\\outputs",
    "default_model": "realisticStockPhoto_v10.safetensors",
    "default_refiner": "",
    "default_loras": [["lora_filename_1.safetensors", 0.5], ["lora_filename_2.safetensors", 0.5]],
    "default_cfg_scale": 3.0,
    "default_sampler": "dpmpp_2m",
    "default_scheduler": "karras",
    "default_negative_prompt": "low quality",
    "default_positive_prompt": "",
    "default_styles": [
        "Fooocus V2",
        "Fooocus Photograph",
        "Fooocus Negative"
    ]
}
Many other keys, formats, and examples are in Fooocus\config_modification_tutorial.txt (this file will be generated after the first launch).
Consider twice before you really change the config. If you find yourself breaking things, just delete Fooocus\config.txt. Fooocus will go back to default.
A safer way is just to try "run_anime.bat" or "run_realistic.bat" - they should already be good enough for different tasks.
Note that user_path_config.txt is deprecated and will be removed soon. (Edit: it is already removed.)
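If you prefer to change config.txt from a script instead of by hand, a minimal Python sketch could look like the following. It assumes the file is plain JSON, as in the example above; the helper name `update_config` is hypothetical, and the keys used are the ones shown in that example.

```python
import json


def update_config(config_text, **overrides):
    """Parse config.txt content, apply overrides, and return the new text."""
    config = json.loads(config_text)
    config.update(overrides)
    return json.dumps(config, indent=4)


if __name__ == "__main__":
    original = '{"default_cfg_scale": 3.0, "default_sampler": "dpmpp_2m"}'
    print(update_config(original, default_cfg_scale=4.0))
```

Parsing and re-serializing with the json module also catches syntax slips (a missing comma or quote) before Fooocus tries to read the file.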
entry_with_update.py [-h] [--listen [IP]] [--port PORT]
[--disable-header-check [ORIGIN]]
[--web-upload-size WEB_UPLOAD_SIZE]
[--hf-mirror HF_MIRROR]
[--external-working-path PATH [PATH ...]]
[--output-path OUTPUT_PATH]
[--temp-path TEMP_PATH] [--cache-path CACHE_PATH]
[--in-browser] [--disable-in-browser]
[--gpu-device-id DEVICE_ID]
[--async-cuda-allocation | --disable-async-cuda-allocation]
[--disable-attention-upcast]
[--all-in-fp32 | --all-in-fp16]
[--unet-in-bf16 | --unet-in-fp16 | --unet-in-fp8-e4m3fn | --unet-in-fp8-e5m2]
[--vae-in-fp16 | --vae-in-fp32 | --vae-in-bf16]
[--vae-in-cpu]
[--clip-in-fp8-e4m3fn | --clip-in-fp8-e5m2 | --clip-in-fp16 | --clip-in-fp32]
[--directml [DIRECTML_DEVICE]]
[--disable-ipex-hijack]
[--preview-option [none,auto,fast,taesd]]
[--attention-split | --attention-quad | --attention-pytorch]
[--disable-xformers]
[--always-gpu | --always-high-vram | --always-normal-vram | --always-low-vram | --always-no-vram | --always-cpu [CPU_NUM_THREADS]]
[--always-offload-from-vram]
[--pytorch-deterministic] [--disable-server-log]
[--debug-mode] [--is-windows-embedded-python]
[--disable-server-info] [--multi-user] [--share]
[--preset PRESET] [--disable-preset-selection]
[--language LANGUAGE]
[--disable-offload-from-vram] [--theme THEME]
[--disable-image-log] [--disable-analytics]
[--disable-metadata] [--disable-preset-download]
[--disable-enhance-output-sorting]
[--enable-auto-describe-image]
[--always-download-new-model]
[--rebuild-hash-cache [CPU_NUM_THREADS]]
Example prompt: __color__ flower
Processed for positive and negative prompt.
Selects a random wildcard from a predefined list of options, in this case the wildcards/color.txt file.
The wildcard will be replaced with a random color (randomness based on seed).
You can also disable randomness and process a wildcard file from top to bottom by enabling the checkbox Read wildcards in order in Developer Debug Mode.
Wildcards can be nested and combined, and multiple wildcards can be used in the same prompt (example see wildcards/color_flower.txt).
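The wildcard behavior described above (seed-based random choice, optional in-order reading, nesting) can be sketched in a few lines of Python. This is an illustrative approximation, not the Fooocus implementation: the function name, the `index` parameter, and the in-memory dictionary standing in for the wildcards/*.txt files are assumptions.

```python
import random
import re

# __name__ tokens; the name is matched lazily so "__color_flower__" works too.
WILDCARD = re.compile(r"__([\w-]+?)__")


def apply_wildcards(prompt, wildcards, seed, in_order=False, index=0, max_depth=8):
    """Replace __name__ tokens with entries from wildcards[name].

    wildcards: {name: [lines]} standing in for wildcards/<name>.txt (assumed)
    seed:      makes the random choice reproducible
    in_order:  pick line `index` (wrapping around) instead of a random line
    """
    rng = random.Random(seed)

    def pick(match):
        options = wildcards.get(match.group(1))
        if not options:
            return match.group(0)  # unknown wildcard: leave the token as-is
        if in_order:
            return options[index % len(options)]
        return rng.choice(options)

    # Repeat so that wildcards nested inside wildcard entries are expanded too.
    for _ in range(max_depth):
        expanded = WILDCARD.sub(pick, prompt)
        if expanded == prompt:
            break
        prompt = expanded
    return prompt


if __name__ == "__main__":
    files = {"color": ["red", "green", "blue"],
             "color_flower": ["__color__ rose", "__color__ tulip"]}
    print(apply_wildcards("__color_flower__ in a vase", files, seed=1234))
```

Calling the function twice with the same seed gives the same prompt, which mirrors the "randomness based on seed" note above.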
Example prompt: [[red, green, blue]] flower
Processed only for positive prompt.
Processes the array from left to right, generating a separate image for each element in the array. In this case 3 images would be generated, one for each color. Increase the image number to 3 to generate all 3 variants.
Arrays cannot be nested, but multiple arrays can be used in the same prompt. Inline LoRAs are supported as array elements.
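A rough Python sketch of array processing follows. For a single array it picks elements left to right by image index, as described above. How several arrays in one prompt are combined is not stated here, so the mixed-radix scheme below (first array varies slowest) is an assumption, as is the function name.

```python
import re

# [[a, b, c]] arrays; nesting is not supported, matching the note above.
ARRAY = re.compile(r"\[\[(.*?)\]\]")


def apply_arrays(prompt, index):
    """Return the prompt for image number `index` (0-based)."""
    arrays = [[w.strip() for w in body.split(",")] for body in ARRAY.findall(prompt)]
    if not arrays:
        return prompt

    # Total number of distinct variants across all arrays.
    total = 1
    for words in arrays:
        total *= len(words)
    index %= total

    # Decompose the index into one position per array (assumed ordering).
    choices = []
    for words in arrays:
        total //= len(words)
        choices.append(words[index // total])
        index %= total

    remaining = iter(choices)
    return ARRAY.sub(lambda m: next(remaining), prompt)


if __name__ == "__main__":
    for i in range(3):
        print(apply_arrays("[[red, green, blue]] flower", i))
```

With the image number set to 3, indexes 0 to 2 cover all three colors of the example prompt.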
Example prompt: flower <lora:sunflowers:1.2>
Processed only for positive prompt.
Applies a LoRA to the prompt. The LoRA file must be located in the models/loras directory.
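For readers curious how an inline tag such as <lora:sunflowers:1.2> can be separated from the rest of the prompt, here is a small Python sketch. The regular expression and the function name are assumptions made for illustration; Fooocus's own parser may accept a different grammar.

```python
import re

# <lora:NAME:WEIGHT> where WEIGHT is a plain decimal number (assumed grammar).
LORA = re.compile(r"<lora:([^:>]+):(-?\d+(?:\.\d+)?)>")


def extract_loras(prompt):
    """Return (prompt without LoRA tags, [(name, weight), ...])."""
    loras = [(name, float(weight)) for name, weight in LORA.findall(prompt)]
    cleaned = re.sub(r"\s+", " ", LORA.sub("", prompt)).strip()
    return cleaned, loras


if __name__ == "__main__":
    print(extract_loras("flower <lora:sunflowers:1.2>"))
```

The extracted name would then be resolved against the files in the models/loras directory mentioned above.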
Click here to browse the advanced features.
Fooocus also has many community forks, just like SD-WebUI's vladmandic/automatic and anapnoe/stable-diffusion-webui-ux, for enthusiastic users who want to try!
Fooocus' forks |
---|
fenneishi/Fooocus-Control runew0lf/RuinedFooocus MoonRide303/Fooocus-MRE metercai/SimpleSDXL mashb1t/Fooocus and so on ... |
See also About Forking and Promotion of Forks.
Special thanks to twri and 3Diva and Marc K3nt3L for creating additional SDXL styles available in Fooocus. Thanks daswer123 for contributing the Canvas Zoom!
The log is here.
We need your help! Please help translate Fooocus into international languages.
You can put json files in the language folder to translate the user interface.
For example, below is the content of Fooocus/language/example.json:
{
    "Generate": "生成",
    "Input Image": "入力画像",
    "Advanced": "고급",
    "SAI 3D Model": "SAI 3D Modèle"
}
If you add --language example arg, Fooocus will read Fooocus/language/example.json to translate the UI.
For example, you can edit the ending line of Windows run.bat as
.\python_embeded\python.exe -s Fooocus\entry_with_update.py --language example
Or run_anime.bat as
.\python_embeded\python.exe -s Fooocus\entry_with_update.py --language example --preset anime
Or run_realistic.bat as
.\python_embeded\python.exe -s Fooocus\entry_with_update.py --language example --preset realistic
For practical translation, you may create your own file like Fooocus/language/jp.json or Fooocus/language/cn.json and then use flag --language jp or --language cn. Apparently, these files do not exist now. We need your help to create these files!
Note that if no --language is given and at the same time Fooocus/language/default.json exists, Fooocus will always load Fooocus/language/default.json for translation. By default, the file Fooocus/language/default.json does not exist.
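The lookup rule above (use the file named by --language, otherwise fall back to default.json if it exists) can be sketched in Python as follows. The function names are hypothetical and this is not the Fooocus source; it only mirrors the behavior described in this section.

```python
import json
import os


def resolve_language_file(language_dir, language=None):
    """Return the JSON path to load, or None when no translation applies."""
    name = language if language else "default"
    path = os.path.join(language_dir, name + ".json")
    return path if os.path.isfile(path) else None


def load_translations(language_dir, language=None):
    """Load the label -> translation table, or an empty table if none exists."""
    path = resolve_language_file(language_dir, language)
    if path is None:
        return {}
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def translate(label, table):
    """Fall back to the original English label when no translation exists."""
    return table.get(label, label)
```

Because untranslated labels fall back to English, a partial language file is still usable while a translation is in progress.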