
How to convert Stable Diffusion models to Core ML

notapreppie edited this page Jan 4, 2025 · 12 revisions

Overview

Mochi Diffusion works with MLMODELC files, which are native to Apple's Core ML. To obtain an MLMODELC file, you first need to convert the original Stable Diffusion model (CKPT or SafeTensors) to the Diffusers format, and then convert the Diffusers model to MLMODELC.

Requirements

  1. Install Homebrew and remember to follow the instructions under "Next steps"

    /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
  2. Install Wget

    brew install wget
  3. Download and install Xcode

  4. Select Xcode as the active Command Line Tools provider.

    There are two ways to achieve this.

    1. In Terminal, run the following command:

      sudo xcode-select -s /Applications/Xcode.app
    2. Or open Xcode, go to the Xcode menu / Settings... / Locations and select your Xcode version in the "Command Line Tools" picker.

  5. Download and Install Miniconda

  6. Once done, run the commands below in the order shown

    git clone https://github.com/apple/ml-stable-diffusion.git
    conda create -n coreml_stable_diffusion python=3.8 -y
    conda activate coreml_stable_diffusion
    cd ml-stable-diffusion
    pip install -e .
    pip install omegaconf
    pip install safetensors
  7. Download the convert_original_stable_diffusion_to_diffusers.py Python script (used in the next section) and place it in the same folder as the model
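
Before moving on, it can help to confirm that the tools installed above are reachable from the Terminal. The following is a minimal sketch, not part of the official setup: the tool list is an assumption drawn from the commands on this page, and `missing_tools` is a hypothetical helper name.

```python
import shutil

# Assumption: these are the command names the steps on this page rely on.
REQUIRED_TOOLS = ["brew", "wget", "git", "conda", "xcrun"]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All required tools were found.")
```

If a tool is reported as missing right after installation, opening a new Terminal window is often enough for the updated PATH to take effect.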


SD model → Diffusers

This process takes ~1min to complete.

  1. Activate the Conda environment

    conda activate coreml_stable_diffusion
  2. Navigate to the folder where the script is located via cd /<YOUR-PATH> (you can also type cd and then drag the folder into the Terminal app)

  3. Now you have two options:

    1. If your model is in CKPT format, run

      python convert_original_stable_diffusion_to_diffusers.py --checkpoint_path <MODEL-NAME>.ckpt --device cpu --extract_ema --dump_path <MODEL-NAME>_diffusers
    2. If your model is in SafeTensors format, run

      python convert_original_stable_diffusion_to_diffusers.py --checkpoint_path <MODEL-NAME>.safetensors --from_safetensors --device cpu --extract_ema --dump_path <MODEL-NAME>_diffusers
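
The two commands above differ only in the --from_safetensors flag, so the choice can be derived from the file extension. The sketch below assembles the same argument list; `build_diffusers_command` is a hypothetical helper, not part of the conversion script, and the optional `sdxl` switch simply appends the SDXL flag described in the notes that follow.

```python
from pathlib import Path

SCRIPT = "convert_original_stable_diffusion_to_diffusers.py"


def build_diffusers_command(model_file, sdxl=False):
    """Build the argument list for the CKPT/SafeTensors -> Diffusers step."""
    model = Path(model_file)
    suffix = model.suffix.lower()
    if suffix not in (".ckpt", ".safetensors"):
        raise ValueError(f"Unsupported model format: {suffix or '(none)'}")

    cmd = ["python", SCRIPT, "--checkpoint_path", str(model)]
    if suffix == ".safetensors":
        cmd.append("--from_safetensors")
    # The output folder sits next to the model: <MODEL-NAME>_diffusers
    dump_path = model.with_name(model.stem + "_diffusers")
    cmd += ["--device", "cpu", "--extract_ema", "--dump_path", str(dump_path)]
    if sdxl:
        cmd += ["--pipeline_class_name", "StableDiffusionXLPipeline"]
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(" ".join(build_diffusers_command("my_model.safetensors")))
```

The list can be passed to `subprocess.run(cmd, check=True)` from the folder that contains both the script and the model.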

Important Notes

  • Only when converting SDXL 1.0 models, be sure to include the following flag: --pipeline_class_name StableDiffusionXLPipeline
  • Starting with diffusers 0.29.0, there is a default max_shard_size of 10GB. If your model is large (SDXL, Pony, etc.), the UNet files exceed this limit and are split into several shards. The next conversion step cannot handle sharded files and fails with an error. To work around this, you can either...
    • ... use diffusers 0.28.2 (pip install diffusers==0.28.2) if you don't need the functions/features of the newer versions.
    • ... add the --half flag to the commands above if you can accept the loss in precision.
    • ... or edit line 188 of the conversion script to increase the max_shard_size.
      • ORIGINAL: pipe.save_pretrained(args.dump_path, safe_serialization=args.to_safetensors)
      • UPDATED: pipe.save_pretrained(args.dump_path, safe_serialization=args.to_safetensors, max_shard_size="15GB")
        • (15GB is just an example, use whatever size is appropriate for your task and hardware capabilities)
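
To decide whether the sharding note applies to your environment, you can compare the installed diffusers version against 0.29.0. This is a rough sketch: `parse_version` and `shards_by_default` are hypothetical helpers, and the simple parser only looks at the leading numeric parts of the version string.

```python
from importlib import metadata


def parse_version(version):
    """Turn '0.29.0' into (0, 29, 0); stops at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    while len(parts) < 3:  # pad so that '0.29' compares like '0.29.0'
        parts.append(0)
    return tuple(parts)


def shards_by_default(diffusers_version):
    """True if this diffusers version applies the default 10GB shard size."""
    return parse_version(diffusers_version) >= (0, 29, 0)


if __name__ == "__main__":
    try:
        installed = metadata.version("diffusers")
    except metadata.PackageNotFoundError:
        print("diffusers is not installed in this environment")
    else:
        print(installed, "-> shards by default:", shards_by_default(installed))
```

If the check reports True and your model is large, pick one of the three workarounds listed above before converting.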


Diffusers → MLMODELC

This process takes ~25 minutes to complete.

Each command below runs the converter twice to produce two variants of the UNet: one with ControlNet support and one without. This enables the converted models to work both with and without the ControlNet feature.

If you're doing this right after the previous step, skip steps 1 and 2.

  1. Activate the Conda environment

    conda activate coreml_stable_diffusion
  2. Navigate to the folder where the script is located via cd /<YOUR-PATH> (you can also type cd and then drag the folder into the Terminal app)

  3. Now you have two options:

    1. SPLIT_EINSUM, which is compatible with all compute units

      python -m python_coreml_stable_diffusion.torch2coreml --convert-vae-decoder --convert-vae-encoder --convert-unet --unet-support-controlnet --convert-text-encoder --model-version <MODEL-NAME>_diffusers --bundle-resources-for-swift-cli --attention-implementation SPLIT_EINSUM -o <MODEL-NAME>_split-einsum && python -m python_coreml_stable_diffusion.torch2coreml --convert-unet --model-version <MODEL-NAME>_diffusers --bundle-resources-for-swift-cli --attention-implementation SPLIT_EINSUM -o <MODEL-NAME>_split-einsum
    2. ORIGINAL, which is only compatible with CPU & GPU

      python -m python_coreml_stable_diffusion.torch2coreml --compute-unit CPU_AND_GPU --convert-vae-decoder --convert-vae-encoder --convert-unet --unet-support-controlnet --convert-text-encoder --model-version <MODEL-NAME>_diffusers --bundle-resources-for-swift-cli --attention-implementation ORIGINAL -o <MODEL-NAME>_original && python -m python_coreml_stable_diffusion.torch2coreml --compute-unit CPU_AND_GPU --convert-unet --model-version <MODEL-NAME>_diffusers --bundle-resources-for-swift-cli --attention-implementation ORIGINAL -o <MODEL-NAME>_original
      1. Only with the ORIGINAL implementation can you change the output image size, by adding the --latent-w <SIZE> and --latent-h <SIZE> flags. For example:

        python -m python_coreml_stable_diffusion.torch2coreml --latent-w 64 --latent-h 96 --compute-unit CPU_AND_GPU --convert-vae-decoder --convert-vae-encoder --convert-unet --unet-support-controlnet --convert-text-encoder --model-version <MODEL-NAME>_diffusers --bundle-resources-for-swift-cli --attention-implementation ORIGINAL -o <MODEL-NAME>_original_512x768 && python -m python_coreml_stable_diffusion.torch2coreml --latent-w 64 --latent-h 96 --compute-unit CPU_AND_GPU --convert-unet --model-version <MODEL-NAME>_diffusers --bundle-resources-for-swift-cli --attention-implementation ORIGINAL -o <MODEL-NAME>_original_512x768

        Both the width and the height of the chosen image size must be divisible by 64, and each flag takes the pixel value divided by 8 (e.g. 768/8=96).
        In the example above, the model will always output images at a resolution of 512x768

  4. The needed files will be created in the Resources folder inside the output folder given with -o (for example <MODEL-NAME>_split-einsum/Resources). Everything else can be discarded
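
The long commands in step 3 follow a fixed pattern: a first pass that converts every component with a ControlNet-capable UNet, and a second pass that converts only the plain UNet into the same output folder. The sketch below rebuilds that pattern as argument lists; `build_coreml_commands` and `run_in_order` are hypothetical helpers, and only flags that already appear on this page are used.

```python
import subprocess

MODULE = "python_coreml_stable_diffusion.torch2coreml"


def build_coreml_commands(model_name, attention="SPLIT_EINSUM", sdxl=False):
    """Return the two converter invocations for one attention implementation."""
    if attention not in ("SPLIT_EINSUM", "ORIGINAL"):
        raise ValueError(f"Unknown attention implementation: {attention}")

    suffix = "split-einsum" if attention == "SPLIT_EINSUM" else "original"
    base = ["python", "-m", MODULE]
    if attention == "ORIGINAL":
        # ORIGINAL is only compatible with CPU & GPU
        base += ["--compute-unit", "CPU_AND_GPU"]

    tail = [
        "--model-version", f"{model_name}_diffusers",
        "--bundle-resources-for-swift-cli",
        "--attention-implementation", attention,
        "-o", f"{model_name}_{suffix}",
    ]
    if sdxl:
        tail.append("--xl-version")

    # Pass 1: all components, with a UNet that supports ControlNet
    first = base + [
        "--convert-vae-decoder", "--convert-vae-encoder",
        "--convert-unet", "--unet-support-controlnet",
        "--convert-text-encoder",
    ] + tail
    # Pass 2: the plain UNet only, written to the same output folder
    second = base + ["--convert-unet"] + tail
    return [first, second]


def run_in_order(commands):
    """Run each command and stop at the first failure, like `&&` in the shell."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

Calling `run_in_order(build_coreml_commands("my_model"))` from the folder that contains my_model_diffusers would mirror the SPLIT_EINSUM command above, with my_model standing in for <MODEL-NAME>.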

Important Notes

  • Only when converting SDXL 1.0 models, be sure to include the following flag: --xl-version
  • At the time of writing, ORIGINAL models with output sizes larger than 512x768 or 768x512 run slowly on lower-performance machines or do not run at all. 768x768 models were tested at ~1min/step on an M1 (with some kernel panics) and ~1s/step on an M1 Max with 32 GPU cores, while 1024x1024 models cannot be run at all (MPSNDArray error: product of dimension sizes > 2**31).
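
The size rule for the --latent-w and --latent-h flags (pixel dimensions divisible by 64, passed divided by 8) is easy to get wrong by hand. The following sketch encodes it; `latent_size` and `latent_flags` are hypothetical helper names used only for illustration.

```python
def latent_size(width_px, height_px):
    """Convert an output size in pixels to the latent size the flags expect."""
    for name, value in (("width", width_px), ("height", height_px)):
        if value <= 0 or value % 64 != 0:
            raise ValueError(f"{name} must be a positive multiple of 64, got {value}")
    return width_px // 8, height_px // 8


def latent_flags(width_px, height_px):
    """Return the flags to add to the ORIGINAL conversion command."""
    latent_w, latent_h = latent_size(width_px, height_px)
    return ["--latent-w", str(latent_w), "--latent-h", str(latent_h)]


if __name__ == "__main__":
    # 512x768 pixels corresponds to the example command on this page
    print(" ".join(latent_flags(512, 768)))
```

A size such as 500x768 is rejected with a ValueError instead of producing flags for the converter.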


Troubleshooting

Miniconda

  • This package is incompatible with this version of macOS: after the "Software Licence Agreement" step, click on "Change Install Location..." and select "Install for me only"

Terminal errors

  • xcrun: error: unable to find utility "coremlcompiler", not a developer tool or in PATH: open Xcode and go to "Settings..." → "Locations" then click on the "Command Line Tools" drop-down menu and reselect the Command Line Tools version

  • ModuleNotFoundError: No module named 'pytorch_lightning': while the conda coreml_stable_diffusion environment is active, run

    pip install pytorch_lightning

    Every time you see a similar message, you can solve it by installing what is requested via pip install <NAME>

  • zsh: killed python: your Mac has run out of memory. Close any memory-hungry applications and run the conversion again. Still not working? Reboot. Still not working? Prefix the command with nice -n 10. Still not working? SPLIT_EINSUM conversions tend to be the more demanding ones, so close all other apps while converting and leave your Mac to melt alone
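
Instead of discovering missing modules one ModuleNotFoundError at a time, you can check for them up front while the coreml_stable_diffusion environment is active. This is a small sketch: `missing_modules` is a hypothetical helper, and the module list is an assumption based on the packages mentioned on this page.

```python
import importlib.util

# Assumption: top-level modules named in the install steps and in the error above.
MODULES_TO_CHECK = ["omegaconf", "safetensors", "pytorch_lightning"]


def missing_modules(names):
    """Return the top-level module names that cannot be imported."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    for name in missing_modules(MODULES_TO_CHECK):
        print(f"Missing: {name} (try: pip install {name})")
```

Keep in mind that for some packages the name used with pip install differs from the module name shown in the error.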

Terminal warnings

  • If you get any of these

    TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
    
    WARNING:__main__:Casted the `beta`(value=0.0) argument of `baddbmm` op from int32 to float32 dtype for conversion!
    WARNING:coremltools:Tuple detected at graph output. This will be flattened in the converted model.
    WARNING:coremltools:Saving value type of int64 into a builtin type of int32, might lose precision!

    You're fine: these warnings can be ignored


Resources

Scripts

VAEs
