
ValueError: The dtype of q torch.bfloat16 does not match the q_data_type torch.float16 specified in plan function. #638

Open
Godlovecui opened this issue Nov 25, 2024 · 1 comment

Comments

@Godlovecui

Hello,
I installed flashinfer via AOT. Where should I modify q_data_type to torch.bfloat16 in the plan function?
[screenshot: ValueError traceback reporting that the dtype of q (torch.bfloat16) does not match the q_data_type (torch.float16) specified in the plan function]

Thank you~
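
For context, a minimal sketch of where the dtype is declared in the newer plan-style API. The wrapper class, argument order, the `q_data_type` keyword, and the dummy page-table tensors below are assumptions based on recent flashinfer releases and may not match the installed version exactly:

```python
import torch
import flashinfer

# Sketch only: argument names are assumptions and may differ from the installed flashinfer version.
num_qo_heads, num_kv_heads, head_dim, page_size = 32, 8, 128, 16

# Dummy single-sequence page table: one page holding 4 valid tokens.
kv_indptr = torch.tensor([0, 1], dtype=torch.int32, device="cuda")
kv_indices = torch.tensor([0], dtype=torch.int32, device="cuda")
kv_last_page_len = torch.tensor([4], dtype=torch.int32, device="cuda")

workspace = torch.empty(128 * 1024 * 1024, dtype=torch.uint8, device="cuda")
wrapper = flashinfer.BatchDecodeWithPagedKVCacheWrapper(workspace, "NHD")

wrapper.plan(
    kv_indptr,
    kv_indices,
    kv_last_page_len,
    num_qo_heads,
    num_kv_heads,
    head_dim,
    page_size,
    q_data_type=torch.bfloat16,  # must match the dtype of the q tensor passed at run time
)
```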

@yzh119
Collaborator

yzh119 commented Nov 25, 2024

I think vllm currently uses the v0.1.5-style API, so you can specify q_data_type in the begin_forward function.
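
A hedged sketch of what that could look like with the v0.1.5-style wrapper, reusing the wrapper and page-table tensors from the sketch above. The exact begin_forward argument list varies between flashinfer versions, so the names here are illustrative:

```python
# Illustrative v0.1.5-style call; argument names are assumptions and may not
# match the installed flashinfer version exactly.
wrapper.begin_forward(
    kv_indptr,
    kv_indices,
    kv_last_page_len,
    num_qo_heads,
    num_kv_heads,
    head_dim,
    page_size,
    data_type=torch.float16,     # KV-cache dtype
    q_data_type=torch.bfloat16,  # query dtype, so q can be torch.bfloat16 at forward() time
)
```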
