[LlamaTokenizer] tokenize nits. #25793

Merged: 5 commits, Aug 29, 2023

ArthurZucker
Collaborator

What does this PR do?

Fixes #25769 by making sure that the empty string "" is encoded to an empty list [] instead of raising an error.
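
For readers who have not seen the linked issue, here is a minimal sketch of the failure mode and the guard, in plain Python. It is not the actual `tokenization_llama.py` code: `tokenize_buggy` and `tokenize_fixed` are hypothetical stand-ins, and whitespace splitting is only a placeholder for the real tokenization. It assumes the error came from a local `tokens` variable that was never assigned on the empty-input path, which is what the issue title and the "return when length is zero" commit suggest.

```python
from typing import List


def tokenize_buggy(text: str) -> List[str]:
    # Hypothetical stand-in for the old behavior: `tokens` is only
    # assigned when the input is non-empty.
    if len(text) > 0:
        tokens = text.split()  # placeholder for real tokenization
    # For text == "", `tokens` was never bound, so this line raises
    # UnboundLocalError.
    return tokens


def tokenize_fixed(text: str) -> List[str]:
    # Return early when the length is zero, so "" maps to [].
    if len(text) == 0:
        return []
    tokens = text.split()  # placeholder for real tokenization
    return tokens


if __name__ == "__main__":
    try:
        tokenize_buggy("")
    except UnboundLocalError as err:
        print("buggy version raised:", type(err).__name__)
    print("fixed version returned:", tokenize_fixed(""))
```

The early return keeps the empty-input case out of the main code path entirely, so nothing further down has to handle an unassigned variable.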

Collaborator

amyeroberts left a comment

LGTM - thanks for fixing this!

Member

LysandreJik left a comment

This looks good to me! Thanks @ArthurZucker.

ArthurZucker merged commit 5b5ee23 into huggingface:main Aug 29, 2023
parambharat pushed a commit to parambharat/transformers that referenced this pull request Sep 26, 2023
* return when length is zero

* Add tests

Co-authored-by: Avnish Narayan <[email protected]>

* Co-authored-by: avnishn <[email protected]>

* codeLlama doc should not be on Main

* update test

---------

Co-authored-by: Avnish Narayan <[email protected]>
blbadger pushed a commit to blbadger/transformers that referenced this pull request Nov 8, 2023
EduardoPach pushed a commit to EduardoPach/transformers that referenced this pull request Nov 18, 2023
Successfully merging this pull request may close these issues.

Local variable 'tokens' referenced before assignment error in tokenization_llama.py