Hello @oooolga!
Hello everyone,
I am currently exploring the Hugging Face Diffusers library and came across the `TextImageProjection` class (from `diffusers.models.embeddings`). I'm having a bit of difficulty understanding the specific methodology it employs, particularly how it combines text embeddings with image embeddings.

From what I gather, it concatenates the text and image embeddings after projecting them. I'm curious about the details of this process.
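To make my current understanding concrete, here is a minimal shape-level sketch of what I think "projection plus concatenation" means. It is not the library's code: the function name `concat_projection_sketch`, the parameter names, the random weights standing in for learned linear layers, and the choice to place image tokens before text tokens are all my own assumptions, and I use NumPy only to keep the example small.

```python
import numpy as np


def concat_projection_sketch(text_embeds, image_embeds,
                             cross_attention_dim=8, num_image_tokens=4, seed=0):
    """Hypothetical illustration of projecting and concatenating embeddings.

    text_embeds:  (batch, seq_len, text_dim)
    image_embeds: (batch, image_dim)
    returns:      (batch, num_image_tokens + seq_len, cross_attention_dim)
    """
    rng = np.random.default_rng(seed)
    batch, seq_len, text_dim = text_embeds.shape
    image_dim = image_embeds.shape[-1]

    # Random stand-ins for learned linear-layer weights (biases omitted).
    w_text = rng.standard_normal((text_dim, cross_attention_dim))
    w_image = rng.standard_normal((image_dim, num_image_tokens * cross_attention_dim))

    # Project every text token into the shared width.
    text_tokens = text_embeds @ w_text

    # Project the single image vector, then split it into several tokens.
    image_tokens = (image_embeds @ w_image).reshape(
        batch, num_image_tokens, cross_attention_dim
    )

    # Concatenate along the sequence axis (assumed order: image, then text).
    return np.concatenate([image_tokens, text_tokens], axis=1)


text = np.ones((2, 5, 16))   # 2 samples, 5 text tokens, width 16
image = np.ones((2, 12))     # 2 samples, one image vector of width 12
out = concat_projection_sketch(text, image)
print(out.shape)  # (2, 9, 8)
```

If this picture is roughly right, the output is a single sequence of 4 image tokens followed by 5 text tokens, all of the same width. Please correct me where the actual implementation differs.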
I would greatly appreciate it if someone could provide a detailed explanation or point me toward any resources or documentation that might help clarify these aspects.
Thank you in advance for your help!