Support ControlNet #153
🎉 nice!
@@ -194,6 +208,9 @@ def bundle_resources_for_swift_cli(args):
    ("unet", "Unet"),
    ("unet_chunk1", "UnetChunk1"),
    ("unet_chunk2", "UnetChunk2"),
    ("control-unet", "ControledUnet"),
NIT: Could we please change this to ControlledUnet?
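To make the effect of the rename concrete, here is a minimal sketch of how the (source suffix, bundled name) pairs from the hunk above could map to compiled model paths. The helper `bundled_target_paths` and the `Resources` directory argument are hypothetical and only for illustration; the real `bundle_resources_for_swift_cli` may work differently.

```python
from pathlib import Path

# (source suffix, bundled name) pairs. The first three come from the diff above;
# the last one uses the corrected spelling requested in this review.
RESOURCE_NAMES = [
    ("unet", "Unet"),
    ("unet_chunk1", "UnetChunk1"),
    ("unet_chunk2", "UnetChunk2"),
    ("control-unet", "ControlledUnet"),
]


def bundled_target_paths(resources_dir, names=RESOURCE_NAMES):
    """Hypothetical helper: map each source suffix to its bundled .mlmodelc path."""
    root = Path(resources_dir)
    return {source: root / f"{target}.mlmodelc" for source, target in names}


paths = bundled_target_paths("Resources")
print(paths["control-unet"].name)
```

Because the Swift side looks the model up by this bundled name, a misspelling here would have to be repeated everywhere the file is loaded, which is presumably why the nit is worth fixing early.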
var destinationG = try vImage_Buffer(width: Int(width), height: Int(height), bitsPerPixel: 8 * UInt32(MemoryLayout<Float>.size))
var destinationB = try vImage_Buffer(width: Int(width), height: Int(height), bitsPerPixel: 8 * UInt32(MemoryLayout<Float>.size))

var minFloat: [Float] = [-1.0, -1.0, -1.0, -1.0]
The diff in this file looks unexpectedly large, could you please verify that the only changes are related to minFloat and maxFloat vars?
Amazing work @ryu38! I left a few comments that you could hopefully address. Do you mind adding the new CLI args (Python and Swift) in the README?
for n in 0..<results.count {
    let result = results.features(at: n)
    if currentOutputs.count < results.count {
        let initOutput = result.featureNames.reduce(into: [String: MLMultiArray]()) { output, k in
Let's use MLShapedArray instead of MLMultiArray
let result = results.features(at: n)
if currentOutputs.count < results.count {
    let initOutput = result.featureNames.reduce(into: [String: MLMultiArray]()) { output, k in
        output[k] = MLMultiArray(
This would be a lot faster if we could pre-allocate the output with the expected size.
Is this suggesting that we should pre-allocate MLShapedArray with a specific shape in output dictionary? If we do this before allocating model results, would we create an MLShapedArray filled with zero values?
Yes, create it with the right size and fill with zeros.
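The pre-allocation idea discussed above can be sketched in a language-neutral way. The Python below is only an illustration, not the Swift code under review: it assumes that the outputs of several ControlNets are summed per feature name, and the function `accumulate_residuals` and its `sizes` argument are hypothetical.

```python
def accumulate_residuals(results, sizes):
    """Sum per-model outputs into zero-filled buffers allocated up front.

    results: list of dicts mapping feature name -> flat list of floats
    sizes:   dict mapping feature name -> expected element count (assumed known)
    """
    # Allocate every output once, with the expected size, filled with zeros.
    outputs = {name: [0.0] * count for name, count in sizes.items()}
    for result in results:
        for name, values in result.items():
            buffer = outputs[name]
            if len(values) != len(buffer):
                raise ValueError(f"unexpected size for {name}")
            # Accumulate in place instead of creating a new array per result.
            for i, value in enumerate(values):
                buffer[i] += value
    return outputs


summed = accumulate_residuals(
    [{"residual_0": [1.0, 2.0]}, {"residual_0": [0.5, 0.5]}],
    {"residual_0": 2},
)
print(summed)
```

Starting from zeros means the first result needs no special case: every result, including the first, is simply added into the existing buffer.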
    let fileName = model + ".mlmodelc"
    return urls.controlNetDirURL.appending(path: fileName)
}
if (!controlNetURLs.isEmpty) {
- if (!controlNetURLs.isEmpty) {
+ if !controlNetURLs.isEmpty {
let unetURL: URL, unetChunk1URL: URL, unetChunk2URL: URL

// if ControlNet available, Unet supports additional inputs from ControlNet
if (controlNet == nil) {
- if (controlNet == nil) {
+ if controlNet == nil {
    "timestep" : MLMultiArray(t),
    "encoder_hidden_states": MLMultiArray(hiddenStates)
]
additionalResiduals?[$0.offset].forEach { (k, v) in
- additionalResiduals?[$0.offset].forEach { (k, v) in
+ for (k, v) in additionalResiduals?[$0.offset] {
@@ -29,6 +33,10 @@ public extension StableDiffusionPipeline {
    safetyCheckerURL = baseURL.appending(path: "SafetyChecker.mlmodelc")
    vocabURL = baseURL.appending(path: "vocab.json")
    mergesURL = baseURL.appending(path: "merges.txt")
    controlNetDirURL = baseURL.appending(path: "Controlnet")
Since torch2coreml seems to export to the controlnet directory, it seems like a good idea to start with lower case here as well.
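The directory casing matters because the loader builds each ControlNet path from the directory name plus the model name and the `.mlmodelc` extension, as in the Swift snippets above. A minimal Python sketch of that lookup follows; the function `controlnet_urls` and the model name `my-controlnet` are hypothetical.

```python
from pathlib import Path


def controlnet_urls(base_dir, model_names, dir_name="controlnet"):
    """Hypothetical helper: build <base>/<dir_name>/<model>.mlmodelc for each model."""
    root = Path(base_dir) / dir_name
    return [root / f"{name}.mlmodelc" for name in model_names]


urls = controlnet_urls("Resources", ["my-controlnet"])
print(urls[0])
```

On a case-sensitive file system, `Controlnet` and `controlnet` are different directories, so the name used by the loader has to match what the converter writes.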
Thanks for your great contribution!
Thank you for your reviews! I'll check or fix them one by one. I'll also update the README to cover the new args.
@ryu38 I see that you have pushed some commits addressing the feedback. Please let me know when you would like me to re-review :)
Update: I am running the final tests and I will merge this PR when they pass. The latest commit seems to have addressed all the feedback but I will do one more visual pass just in case.
@atiorh I apologize for pushing a new commit just before the branch was merged. This commit addressed the remaining feedback and improved inference speed in ControlNet.swift.
Just wow! Well done all. Can't wait to dig into this! 🎉🎉🎉🎉
Just realized the extra commit, this is my bad too! I don't have concerns with the diff though. Thanks for the contribution @ryu38 !
@atiorh Thank you for your confirmation!
Excuse me, I ran the following command:
python -m python_coreml_stable_diffusion.torch2coreml \
--convert-vae-decoder --convert-vae-encoder --convert-unet \
--unet-support-controlnet --convert-text-encoder \
--model-version runwayml/stable-diffusion-v1-5 \
--bundle-resources-for-swift-cli \
--quantize-nbits 6 \
--attention-implementation SPLIT_EINSUM_V2 \
-o ~/MochiDiffusion/models && \
python -m python_coreml_stable_diffusion.torch2coreml \
--convert-unet --unet-support-controlnet \
--model-version runwayml/stable-diffusion-v1-5 \
--bundle-resources-for-swift-cli \
--quantize-nbits 6 \
--attention-implementation SPLIT_EINSUM_V2 \
-o ~/MochiDiffusion/models
but only these files were generated, no Unet:
If I want a runnable model that supports ControlNet, what commands should I run?
The files you ended up with are a working model when used along with a ControlNet model, but they won't work without one. That is, they won't work for regular inference or for Image2Image. To also get the The Note: I believe that you will also need to use the
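One way to see which kind of Unet a conversion run produced is to inspect the bundled resources directory. The sketch below is an assumption-laden illustration: the helper `available_unet_variants` is hypothetical, the file names follow the bundled names shown in the diff earlier in this thread (with the `ControlledUnet` spelling requested in review), and it assumes a regular Unet is present either as a single model or as two chunks.

```python
from pathlib import Path


def available_unet_variants(resources_dir):
    """Hypothetical check: report which Unet variants exist in a resources directory."""
    root = Path(resources_dir)

    def has(name):
        # Compiled Core ML models are directories ending in .mlmodelc.
        return (root / f"{name}.mlmodelc").exists()

    return {
        # Assumed: regular inference needs Unet, or both chunks.
        "regular": has("Unet") or (has("UnetChunk1") and has("UnetChunk2")),
        # Assumed: ControlNet inference needs the ControlNet-aware Unet.
        "controlnet": has("ControlledUnet"),
    }


print(available_unet_variants("Resources"))
```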
I added ControlNet feature in model conversion and inference.
New Files
controlnet.py
ControlNet.swift
Main Changes
torch2coreml.py
--convert-controlnet
--convert-controlnet lllyasviel/sd-controlnet-mlsd lllyasviel/sd-controlnet-canny
ControlNet_lllyasviel_sd-controlnet-mlsd.mlpackage
--unet-support-controlnet
*_control-unet.mlpackage
unet.py and UNet.swift
pipeline.py
--controlnet
--convert-controlnet
option in torch2coreml.py
--controlnet-inputs
--controlnet
StableDiffusionCLI
--controlnet
(enter model file names in Resources/controlnet without extension)
--controlnet-inputs
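The example above pairs the model identifier `lllyasviel/sd-controlnet-mlsd` with the package `ControlNet_lllyasviel_sd-controlnet-mlsd.mlpackage`. A minimal sketch of a naming rule consistent with that single example is shown below; the function `controlnet_package_name` is hypothetical, and the real converter may derive the name differently.

```python
def controlnet_package_name(model_version):
    """Hypothetical rule: prefix with ControlNet_ and replace '/' with '_'."""
    return "ControlNet_" + model_version.replace("/", "_") + ".mlpackage"


print(controlnet_package_name("lllyasviel/sd-controlnet-mlsd"))
# prints: ControlNet_lllyasviel_sd-controlnet-mlsd.mlpackage
```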