Thank you for your great work! I have a few questions. 1. If I need to train on a new dataset, how should I format it? 2. Compared with the original ControlNet, is the only change replacing the original text-encoder input with an image VAE input?
@yejy53 Training data can be constructed with HuggingFace datasets. Each sample should contain three columns: blueprint (the line drawing), image_prompt (the reference image), and image (the image expected to be generated). This is covered in the training section of the README. The reference image, i.e. the image prompt, is encoded by ViT and used for cross-attention; it does not go through the VAE. The blueprint is injected into the UNet through additional convolutional layers. The VAE input has not been replaced: it is still the image expected to be generated.
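As a rough illustration of the three-column layout described above, here is a minimal sketch in plain Python. The column names (`blueprint`, `image_prompt`, `image`), the helper names, and the file paths are all assumptions for illustration, not the repo's actual code; with the `datasets` library installed, a list of such dicts could be turned into a dataset via `datasets.Dataset.from_list(...)`.

```python
# Hypothetical sketch of one training sample; column names are assumed.
REQUIRED_COLUMNS = ("blueprint", "image_prompt", "image")

def make_sample(blueprint_path, image_prompt_path, image_path):
    """Bundle one training example from three image-file paths."""
    return {
        "blueprint": blueprint_path,        # line drawing, injected into the UNet via extra conv layers
        "image_prompt": image_prompt_path,  # reference image, ViT-encoded for cross-attention
        "image": image_path,                # target image; this is also what the VAE encodes
    }

def validate(sample):
    """Check that a sample carries all three required columns."""
    missing = [c for c in REQUIRED_COLUMNS if c not in sample]
    if missing:
        raise ValueError(f"sample is missing columns: {missing}")
    return True

# Example paths are placeholders.
sample = make_sample("lines/0001.png", "refs/0001.png", "targets/0001.png")
assert validate(sample)
```

The key point the sketch encodes in its comments: only `image` ever reaches the VAE; `image_prompt` takes the ViT/cross-attention path and `blueprint` the convolutional injection path.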
Thank you for your outstanding contributions. Could you kindly provide your email address? I have several specific inquiries that require your insight.