
few questions. #7

Open
yejy53 opened this issue Jan 14, 2024 · 3 comments

Comments


yejy53 commented Jan 14, 2024

Thank you for your great work, but I have a few questions:

1. If I need to train on a new dataset, how should I format the data?
2. Compared with the original ControlNet, is the only change replacing the original text encoder with VAE-encoded image input?

aihao2000 (Owner) commented

@yejy53 Training data can be constructed with Hugging Face datasets. Each sample should contain three data columns: blueprint (the line drawing), image prompt (the reference image), and image (the image expected to be generated). The training section of the README should introduce this. The reference image, i.e. the image prompt, is encoded by ViT and used for cross-attention; VAE is not used for it. The blueprint is injected into the UNet through additional convolutional layers. The VAE input has not been replaced and is still the image expected to be generated.

yejy53 commented Jan 15, 2024

Thank you for your outstanding contributions. Could you kindly provide your email address? I have several specific inquiries that require your insight.

aihao2000 (Owner) commented

@yejy53 Of course, [email protected]
