Thank you for your great work! I have a few questions. 1. If I need to train on a new dataset, how should I format it? 2. Compared with the original ControlNet, is the only change replacing the original text-encoder input with an image VAE input?
@yejy53 Training data can be constructed with HuggingFace datasets. Each sample should contain three columns: blueprint (the line drawing), image_prompt (the reference image), and image (the image expected to be generated). This is covered in the training section of the README. The reference image, i.e. the image prompt, is encoded by ViT and used for cross-attention; it does not go through the VAE. The blueprint is injected into the UNet through additional convolutional layers. The VAE input has not been replaced: it is still the image expected to be generated.
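As a rough illustration of the three-column layout described above, here is a minimal sketch in plain Python. The column names (`blueprint`, `image_prompt`, `image`), the helper names, and the file paths are all assumptions for illustration, not the repo's actual code; with the `datasets` library installed, a list of such dicts could be turned into a dataset via `datasets.Dataset.from_list(...)`.

```python
# Hypothetical sketch of one training sample; column names are assumed.
REQUIRED_COLUMNS = ("blueprint", "image_prompt", "image")

def make_sample(blueprint_path, image_prompt_path, image_path):
    """Bundle one training example from three image-file paths."""
    return {
        "blueprint": blueprint_path,        # line drawing, injected into the UNet via extra conv layers
        "image_prompt": image_prompt_path,  # reference image, ViT-encoded for cross-attention
        "image": image_path,                # target image; this is also what the VAE encodes
    }

def validate(sample):
    """Check that a sample carries all three required columns."""
    missing = [c for c in REQUIRED_COLUMNS if c not in sample]
    if missing:
        raise ValueError(f"sample is missing columns: {missing}")
    return True

# Example paths are placeholders.
sample = make_sample("lines/0001.png", "refs/0001.png", "targets/0001.png")
assert validate(sample)
```

The key point the sketch encodes in its comments: only `image` ever reaches the VAE; `image_prompt` takes the ViT/cross-attention path and `blueprint` the convolutional injection path.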
Thank you for your outstanding contributions. Could you kindly provide your email address? I have several specific inquiries that require your insight.