add initial design for uniform processors + align model #31197
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
cc @amyeroberts for a review with a narrower scope than the parent PR 😁
cc @amyeroberts, sorry to divide my pings 😅 The PR is big and I wanted to split it up; this one should be mergeable and serve as the basis for the rest, and I'll rebase afterwards (@qubvel welcome if you want to take a look!). It includes the kwargs merging just mentioned in the other PR, moved to processing common!
Thanks!
Same comment as for the other PR - it would be good to move the kwarg prep logic out of the config and tests, to make sure we can properly control tokenizer kwargs with tokenizer.init_kwargs and the input kwargs.
cc @qubvel
Nice to see _merge_kwargs as a separate method, this is exactly what came to my mind while reviewing the first PR 🙂
a lot cleaner!
(The review thread below is on the _defaults = { line of the processor kwargs class.)
```
_defaults = {
    "padding": "max_length",
    "max_length": 64,
}
```
should work, no? Or does it not update the default for type hints?
yes it works for sure, this was to have a structured dict for defaults. Can change :)
ah, now I remember, it actually can't work like that since TypedDicts don't support default values; they are made to hold the layout. They can have arbitrary attributes, but those won't be passed on as default values the way a dataclass would (and with a dataclass we'd lose the typing), hence the manual operation
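For illustration, a minimal sketch of that point, with hypothetical class names (TextKwargs and MyProcessorKwargs are not the exact classes from this PR):

```python
from typing import TypedDict

# A TypedDict only declares the layout of the dict; instances do not pick
# up class-level attributes as default values.
class TextKwargs(TypedDict, total=False):
    padding: str
    max_length: int

kwargs: TextKwargs = {}
print(kwargs)  # {} -- no defaults were applied for us

# Hence defaults live in a plain class-level dict and are merged manually,
# which keeps the typing of TextKwargs intact.
class MyProcessorKwargs:
    _defaults = {
        "text_kwargs": {"padding": "max_length", "max_length": 64},
    }

print(MyProcessorKwargs._defaults["text_kwargs"])
# {'padding': 'max_length', 'max_length': 64}
```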
ok got it, thanks! Let's maybe add a comment about this!
Do we have a comment for future code inspectors? I'm assuming here isn't the best place (we don't want it for all models), but I didn't find a corresponding one elsewhere on a quick skim.
On that: there's documentation in processing_utils.ProcessingKwargs; I added a comment nudging users to check there!
I have updated the PR description to be more self-contained.
Looks great!
Just a few small comments
@amyeroberts FYI: kept digging into the kwargs merging logic and found an edge case that was giving unreliable results in the tokenizer. Refactored and doubled the number of tests to avoid further trickery (including an edge case found earlier by @qubvel); logic should be easier to read now. Nothing else changed, and tests should pass reliably.
add initial design for uniform processors + align model (#31197)

* add initial design for uniform processors + align model
* fix mutable default 👀
* add configuration test
* handle structured kwargs w defaults + add test
* protect torch-specific test
* fix style
* fix
* fix assertEqual
* move kwargs merging to processing common
* rework kwargs for type hinting
* just get Unpack from extensions
* run-slow[align]
* handle kwargs passed as nested dict
* add from_pretrained test for nested kwargs handling
* [run-slow]align
* update documentation + imports
* update audio inputs
* protect audio types, silly
* try removing imports
* make things simpler
* simplerer
* move out kwargs test to common mixin
* [run-slow]align
* skip tests for old processors
* [run-slow]align, clip
* !$#@!! protect imports, darn it
* [run-slow]align, clip
* [run-slow]align, clip
* update doc
* improve documentation for default values
* add model_max_length testing. This parameter depends on tokenizers received.
* Raise if kwargs are specified in two places
* fix
* expand VideoInput
* fix
* fix style
* remove defaults values
* add comment to indicate documentation on adding kwargs
* protect imports
* [run-slow]align
* fix
* remove set() that breaks ordering
* test more
* removed unused func
* [run-slow]align
What does this PR do?
Adds a uniform signature for processors. This PR adds the initial design + one model for the larger #30511.
Usage
As before, kwargs that are passed to processors at __call__ time take priority. However, per-modality processors can be instantiated with their own kwargs, and if those are not overridden at call time, they will serve as defaults. Type hinting of kwargs is preserved if they are passed as structured dictionary entries, for example:
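A hedged sketch of the structured call (the checkpoint name and the exact nested-kwargs argument names are assumptions for illustration, not verbatim from this PR):

```python
import numpy as np
from PIL import Image
from transformers import AlignProcessor

processor = AlignProcessor.from_pretrained("kakaobrain/align-base")
image = Image.fromarray(np.zeros((224, 224, 3), dtype=np.uint8))

# Structured (nested) kwargs: each modality gets its own typed bucket, so
# the type hints on padding / max_length are preserved.
outputs = processor(
    text="a photo of a cat",
    images=image,
    text_kwargs={"padding": "max_length", "max_length": 64},
    images_kwargs={"do_rescale": False},
)
```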
It also works with kwargs passed without nesting:
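A sketch under the same assumptions as above; the merging logic routes each flat kwarg to the right modality bucket:

```python
# Flat kwargs: the same options without nesting.
outputs = processor(
    text="a photo of a cat",
    images=image,
    padding="max_length",
    max_length=64,
    do_rescale=False,
)
```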
Merging of kwargs and handling of the priority order is done in processing_utils through a dedicated method. The order of operations is as follows: kwargs passed at __call__ time have the highest priority, then kwargs set when the per-modality processors were instantiated (e.g. tokenizer.init_kwargs), and finally the _defaults declared on the processor's kwargs class, which have the lowest priority. A sketch of this merging is shown below.
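A minimal sketch of such a merge, with a hypothetical function name (merge_kwargs is illustrative, not the real method; the priority order is the one inferred from the description above):

```python
def merge_kwargs(defaults, init_kwargs, call_kwargs):
    """Merge per-modality kwargs; later sources override earlier ones."""
    merged = {}
    for source in (defaults, init_kwargs, call_kwargs):  # lowest -> highest priority
        for modality, kwargs in source.items():
            merged.setdefault(modality, {}).update(kwargs)
    return merged

merged = merge_kwargs(
    defaults={"text_kwargs": {"padding": "max_length", "max_length": 64}},
    init_kwargs={"text_kwargs": {"max_length": 128}},     # set at init time
    call_kwargs={"text_kwargs": {"padding": "longest"}},  # passed at __call__ time
)
print(merged)  # {'text_kwargs': {'padding': 'longest', 'max_length': 128}}
```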
Missing: