
Why can the two scale models be used directly together without training? #2

Open
IzumiKDl opened this issue Dec 9, 2024 · 1 comment


IzumiKDl commented Dec 9, 2024

Thank you so much for your amazing work! When running the code, we noticed that two VAR models of different scales are directly employed as the drafter and refiner models without additional training. This raises the question: why can these two models be used together seamlessly without further fine-tuning?

czg1225 (Owner) commented Dec 9, 2024

Hi @IzumiKDl,
Thanks for your interest! Although the drafter and refiner are two different VAR models, they use the same VQVAE encoder during the training phase, which means they share a discrete latent space. As a result, they can perform collaborative decoding without any additional training. It is worth mentioning that fine-tuning the models on their specialized scales further improves generation quality.
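To make the idea concrete, here is a minimal sketch (not the repository's actual API; all names, sizes, and the stub models are hypothetical) of why the handoff works: both models emit indices into the same VQVAE codebook, so the refiner can simply continue from the coarse token maps the drafter produced, and a single shared decoder turns the combined maps into an image.

```python
# Minimal sketch of drafter/refiner collaborative decoding over a shared codebook.
# All names below are illustrative stand-ins, not the repo's real functions.
import torch

V = 4096                                       # assumed shared codebook size
SCALES = [1, 2, 3, 4, 5, 6, 8, 10, 13, 16]     # VAR-style multi-scale token maps
DRAFT_STEPS = 6                                # drafter handles the coarse scales

def dummy_var():
    """Stand-in for a VAR model: predicts one token map per scale."""
    def step(prefix_token_maps, scale):
        # A real model would condition on the prefix token maps;
        # here we just sample random indices from the shared codebook.
        return torch.randint(0, V, (scale, scale))
    return step

drafter = dummy_var()   # small, fast VAR
refiner = dummy_var()   # large, accurate VAR

token_maps = []
for i, s in enumerate(SCALES):
    model = drafter if i < DRAFT_STEPS else refiner
    token_maps.append(model(token_maps, s))

# Because both models were trained against the same VQVAE encoder, every index
# refers to the same codebook entry, so one shared decoder can reconstruct the image.
def dummy_vqvae_decode(maps):
    return torch.zeros(3, 256, 256)            # placeholder for the decoded image

image = dummy_vqvae_decode(token_maps)
print([m.shape for m in token_maps], image.shape)
```

The handoff point (DRAFT_STEPS above) is the knob that trades speed for quality: more scales for the drafter means faster generation, more scales for the refiner means higher fidelity.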
