TF: GPT2 with native embedding layers #23436
Conversation
The documentation is not available anymore as the PR was closed or merged.
Nice - thanks for adding!
Looks clean to me!
I noticed this change one year later, while doing a demo of LoRA in the embedding layer :-D Nice, but I am always afraid of doing TensorFlow operations outside of Keras layers. Usually masks are lost, as well as any subtle behavior that depends on chaining layers. It is not a problem in transformers because the -100 trick is the standard mask and the loss is aware of it.
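To illustrate the point about masks, here is a minimal sketch (assuming tf.keras / TF 2.x masking semantics; the shapes and values are made up, not taken from this PR):

```python
import tensorflow as tf

# A Keras layer with mask_zero=True attaches a mask to its output...
embedding = tf.keras.layers.Embedding(input_dim=10, output_dim=4, mask_zero=True)
token_ids = tf.constant([[5, 3, 0, 0]])           # trailing zeros are padding
hidden = embedding(token_ids)
print(getattr(hidden, "_keras_mask", None))       # [[True True False False]]

# ...but a plain tf.* op outside any Keras layer returns a fresh tensor
# without the mask attribute, so downstream layers no longer see it.
scaled = hidden * 2.0
print(getattr(scaled, "_keras_mask", None))       # None

# transformers does not rely on Keras masks for the LM loss: ignored label
# positions are set to -100 and filtered out explicitly (the "-100 trick").
labels = tf.constant([[5, 3, -100, -100]])
logits = tf.random.normal((1, 4, 10))
active = tf.not_equal(tf.reshape(labels, (-1,)), -100)
flat_labels = tf.boolean_mask(tf.reshape(labels, (-1,)), active)
flat_logits = tf.boolean_mask(tf.reshape(logits, (-1, 10)), active)
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
print(loss_fn(flat_labels, flat_logits))
```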
What does this PR do?
This PR continues the (paused) goal of deprecating our custom TF embedding layers and related code. Previously, we converted the encoder-decoder models (e.g. here), removing `TFSharedEmbeddings` there and making the necessary adaptations.

In this PR, I make the corresponding adaptations for GPT2. The goal is for you, the reviewers, to raise objections in this PR :D All slow tests for TF GPT2 are passing.
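For readers not familiar with the change, the general pattern being adopted looks roughly like the sketch below. This is not the actual diff: the class, shapes, and the missing transformer blocks are illustrative, with only the `wte` name borrowed from GPT2.

```python
import tensorflow as tf

class TinyTiedLM(tf.keras.layers.Layer):
    """Toy decoder-style model showing the native-embedding + weight-tying idea."""

    def __init__(self, vocab_size, hidden_size, **kwargs):
        super().__init__(**kwargs)
        # Native Keras embedding in place of the custom TFSharedEmbeddings layer.
        self.wte = tf.keras.layers.Embedding(vocab_size, hidden_size, name="wte")

    def call(self, input_ids):
        hidden_states = self.wte(input_ids)  # (batch, seq, hidden)
        # ... transformer blocks would run here ...
        # Weight tying: reuse the embedding matrix as the LM head, i.e.
        # logits = hidden_states @ wte^T, with no separate output projection.
        return tf.einsum("bsh,vh->bsv", hidden_states, self.wte.embeddings)

lm = TinyTiedLM(vocab_size=100, hidden_size=16)
print(lm(tf.constant([[1, 2, 3]])).shape)  # (1, 3, 100)
```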
Then, the following sequence of PRs will be opened:
1. Remove `TFSharedEmbeddings` from the other decoder-only models
2. Remove the remaining uses of `TFSharedEmbeddings` in the codebase (e.g. in tests)
3. Remove `resize_token_embeddings` and all related functions (it is only used to resize our models' embeddings instantiated with `TFSharedEmbeddings`; see the usage sketch after this list)

`test_save_load_after_resize_token_embeddings` will be fixed as a consequence of these changes 🙌
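For context, the user-facing operation behind that resizing machinery is embedding resizing after growing the tokenizer. The sketch below only shows standard public transformers usage; which internal helpers are actually removed is left to the follow-up PRs.

```python
from transformers import GPT2Tokenizer, TFGPT2LMHeadModel

# Standard transformers usage: grow the (tied) embedding matrix after adding
# tokens, so input ids beyond the original vocab size have rows to look up.
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = TFGPT2LMHeadModel.from_pretrained("gpt2")

tokenizer.add_special_tokens({"pad_token": "[PAD]"})
model.resize_token_embeddings(len(tokenizer))
```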