feat: support huggingface transformer style rope interface #568

Merged: 1 commit, Oct 29, 2024

@yzh119 yzh119 commented Oct 29, 2024

Previously, our RoPE APIs assumed that the position indices of each request are contiguous, which is not appropriate for applications such as speculative decoding. This PR fixes the issue by supporting the Hugging Face transformers-style API, which uses a pos_ids argument to specify positions.
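
To illustrate the difference, here is a minimal pure-Python sketch of rotary embedding driven by explicit per-token positions. It is not the flashinfer implementation or API: the function names, the `indptr`/`offsets` parameters, the `rope_theta` default, and the rotate-half (non-interleaved) layout are all assumptions made for the example.

```python
import math


def apply_rope_pos_ids(x, pos_ids, rope_theta=10000.0):
    """Rotate each token vector in x by the angles for its own position.

    Hypothetical reference helper. Uses the rotate-half layout, where
    dimension i is paired with dimension i + d/2.
    """
    out = []
    for vec, pos in zip(x, pos_ids):
        d = len(vec)
        half = d // 2
        rotated = [0.0] * d
        for i in range(half):
            # Inverse frequency for pair i, then the rotation angle.
            angle = pos * rope_theta ** (-2.0 * i / d)
            c, s = math.cos(angle), math.sin(angle)
            a, b = vec[i], vec[i + half]
            rotated[i] = a * c - b * s
            rotated[i + half] = a * s + b * c
        out.append(rotated)
    return out


def contiguous_pos_ids(indptr, offsets):
    """Expand the contiguous assumption into explicit positions.

    Request r owns tokens indptr[r]:indptr[r + 1] and is assumed to
    occupy positions offsets[r], offsets[r] + 1, ...
    """
    pos = []
    for r in range(len(indptr) - 1):
        n = indptr[r + 1] - indptr[r]
        pos.extend(range(offsets[r], offsets[r] + n))
    return pos


q = [[1.0, 0.0, 0.0, 1.0] for _ in range(3)]

# Contiguous case: one request of 3 tokens starting at position 5.
contiguous = contiguous_pos_ids([0, 3], [5])

# Speculative decoding case: two sibling draft tokens share position 6,
# which a (start offset, length) description cannot express.
tree = [5, 6, 6]
q_rope = apply_rope_pos_ids(q, tree)
```

With explicit positions, the contiguous layout is just one special case, while tree-shaped drafts only need a different `pos_ids` list.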

This PR implements part of the features requested in #530; the other requests will come in later PRs.

cc @dreaming-panda @abcdabcd987 @ByronHsu

@yzh119 yzh119 merged commit 4f40420 into main Oct 29, 2024
yzh119 pushed a commit that referenced this pull request Oct 30, 2024
Fix after changes made in #568 

torch.compile doesn't like ops that return their input arguments, so
change the return type of the pybind functions to `void`, given that
the op is already in-place.

The PyTorch library annotation is applied to a wrapper function for
each pybind op. The Python API doesn't change: both the in-place and
non-in-place versions call the annotated wrapper function.
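
The wrapper layering described in this commit message can be sketched in plain Python. This is only an illustration of the call structure: every name below is hypothetical, lists stand in for tensors, and the actual PyTorch library annotation is not shown.

```python
def _compiled_op(buf, scale):
    """Stand-in for the pybind op: mutates buf and returns nothing."""
    for i in range(len(buf)):
        buf[i] *= scale


def _annotated_wrapper(buf, scale) -> None:
    # In the real code the PyTorch library annotation would be applied
    # to this function (assumption: omitted here to stay dependency-free).
    _compiled_op(buf, scale)


def op_inplace(buf, scale) -> None:
    """Public in-place API: mutates buf, returns None."""
    _annotated_wrapper(buf, scale)


def op(buf, scale):
    """Public out-of-place API: copies first, then reuses the same wrapper."""
    out = list(buf)
    _annotated_wrapper(out, scale)
    return out


data = [1.0, 2.0, 3.0]
copied = op(data, 2.0)                 # data is left untouched
inplace_result = op_inplace(data, 2.0)  # data is modified, nothing is returned
```

Because the annotated function never returns one of its own inputs, only the thin public wrappers differ between the two variants.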
tsu-bin added a commit to tsu-bin/flashinfer_dev that referenced this pull request Nov 5, 2024
tsu-bin added a commit to tsu-bin/flashinfer_dev that referenced this pull request Nov 5, 2024
yzh119 pushed a commit that referenced this pull request Nov 5, 2024
Hi, the TVM wrapper build now fails because of #568.
I noticed that the new `BatchQKApplyRotary` interface removed the
`__restrict__` modifier from `DType* q, DType* k, DType* q_rope, DType*
k_rope`, so the fix is simply to add one adapter function.

Co-authored-by: tsu-bin <[email protected]>
@yzh119 yzh119 mentioned this pull request Nov 10, 2024
@yzh119 yzh119 deleted the rope_pos_ids branch November 10, 2024 08:47