Enable using kwargs for selecting pad-to-max-length strategy for tokenizer in embeddings #393
Thanks for the contribution! I'm going to defer full review to @gkumbhat, but there's one little formatting NIT I noticed when glancing through it.
Thanks for the contribution.
Looks like the DCO check is failing on this PR. Can you commit with --signoff and push again?
@gkumbhat I also wrote a unit test, so let me push that as well. I suspect the … In my test case, I originally wanted to show that the results stay the same, but there will be a change to the tokenizer …
…nizer in embeddings Signed-off-by: kcirred <[email protected]>
Looks good to me! Thanks for making updates
Signed-off-by: kcirred <[email protected]>
Looks good to me!
Looks good!
Allows users to pass tokenizer keyword arguments to select the desired tokenizer settings. In this case, it enables the pad_to_max_length strategy.
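
For readers who want a feel for the pattern, here is a minimal sketch of forwarding keyword arguments to a tokenizer to pick a padding strategy. It is not the code from this PR: `toy_tokenize`, `embed_lengths`, `PAD_ID`, and the `padding` / `max_length` parameter names are all hypothetical stand-ins, loosely modeled on the `padding="max_length"` convention used by Hugging Face tokenizers.

```python
from typing import Dict, List

PAD_ID = 0  # hypothetical padding token id


def toy_tokenize(
    texts: List[str],
    max_length: int = 8,
    padding: str = "longest",
) -> Dict[str, List[List[int]]]:
    """Whitespace tokenizer stand-in; NOT the real tokenizer from this PR."""
    if padding not in ("longest", "max_length"):
        raise ValueError(f"unsupported padding strategy: {padding!r}")
    # Map each word to a fake id (its character count), truncating to max_length.
    ids = [[len(w) for w in t.split()][:max_length] for t in texts]
    # "longest" pads to the longest row in the batch; "max_length" pads to max_length.
    target = max_length if padding == "max_length" else max(map(len, ids), default=0)
    input_ids = [row + [PAD_ID] * (target - len(row)) for row in ids]
    attention_mask = [[1] * len(row) + [0] * (target - len(row)) for row in ids]
    return {"input_ids": input_ids, "attention_mask": attention_mask}


def embed_lengths(texts: List[str], **tokenizer_kwargs) -> List[int]:
    """Hypothetical caller that forwards keyword arguments to the tokenizer."""
    encoded = toy_tokenize(texts, **tokenizer_kwargs)
    return [len(row) for row in encoded["input_ids"]]


texts = ["hello world", "a slightly longer sentence here"]
print(embed_lengths(texts))  # default: pad to the longest row in the batch
print(embed_lengths(texts, padding="max_length", max_length=8))  # pad to max_length
```

With the default, every row is padded only as far as the longest input in the batch; with `padding="max_length"`, every row has the same fixed length regardless of batch contents, which is presumably why the tokenizer output can change even when the embedding results are expected to stay the same.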