enable correct padding_idx for embedding layers #1527
Merged
Re: the `resize_position_embeddings` call for a HF model (i.e., adding new tokens to the vocab): if the `pad_token` already exists for the config, you need to ensure that it gets set correctly post the `init_weights` call. `pad_token_id`, if not specified by the config, defaults to `None` (see https://github.com/huggingface/transformers/blob/174890280b340b89c5bfa092f6b4fb0e2dc2d7fc/src/transformers/configuration_utils.py#L326), and `padding_idx` is correspondingly set to `None`.
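A minimal sketch of the behavior under discussion, assuming a HF-style config (the `build_embedding` helper and the use of `gpt2` here are illustrative, not this PR's code): `pad_token_id` falls back to `None` when the config does not define it, `padding_idx` mirrors that value, and after resizing the token embeddings the pad index may need to be re-asserted.

```python
import torch.nn as nn
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical helper: padding_idx mirrors config.pad_token_id, which
# HF configs default to None when unset.
def build_embedding(config):
    pad_token_id = getattr(config, "pad_token_id", None)  # unset -> None
    # nn.Embedding accepts padding_idx=None; in that case no row is
    # zeroed out and the pad row is trained like any other token.
    return nn.Embedding(config.vocab_size, config.hidden_size,
                        padding_idx=pad_token_id)

# After adding a pad token and resizing, re-assert padding_idx once the
# new embedding has been initialized, since the rebuilt module may not
# carry it over (sketch, not the library's internals):
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer.add_special_tokens({"pad_token": "<pad>"})
model.config.pad_token_id = tokenizer.pad_token_id
model.resize_token_embeddings(len(tokenizer))
emb = model.get_input_embeddings()
if model.config.pad_token_id is not None:
    emb.padding_idx = model.config.pad_token_id
```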