Why is CLIP used to calculate the Gram matrix instead of VGG? #12

Open
Jamie-Cheung opened this issue Aug 8, 2023 · 2 comments

Comments

@Jamie-Cheung
Your work is very impressive and interests me a great deal. I have a question about the description of the Gram matrix in the paper: why is CLIP, rather than VGG, used to extract the feature vectors from which the Gram matrix is calculated?

@HolmesShuan
I believe CLIP may be more robust than VGG to images containing artifacts generated by diffusion models, since VGG is pre-trained on natural images. CLIP is trained on a larger dataset and leverages its text embedding space, rather than relying solely on category labels. However, this hypothesis still needs to be verified experimentally.
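
For readers following the thread, the Gram matrix in question is the matrix of inner products between feature channels, and it can be computed from the activations of any backbone, whether VGG or CLIP. Below is a minimal NumPy sketch, assuming a feature map of shape `(C, H, W)`. The function name `gram_matrix` and the normalization by `H * W` are illustrative assumptions, not the code from this repository or the paper.

```python
import numpy as np


def gram_matrix(features: np.ndarray) -> np.ndarray:
    """Return the (C, C) Gram matrix of a (C, H, W) feature map.

    Hypothetical helper for illustration only. The normalization by the
    number of spatial positions is an assumption; other conventions exist.
    """
    if features.ndim != 3:
        raise ValueError("expected features of shape (C, H, W)")
    c, h, w = features.shape
    # Flatten the spatial dimensions so each row holds one channel.
    flat = features.reshape(c, h * w)
    # Inner products between every pair of channels, averaged over positions.
    return flat @ flat.T / (h * w)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-in for activations from some encoder layer (VGG, CLIP, ...).
    feats = rng.standard_normal((8, 4, 4))
    gram = gram_matrix(feats)
    print(gram.shape)  # (8, 8)
```

Because the function only sees an array of activations, swapping the backbone changes the features being compared but not the Gram computation itself.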
