Added resources on albert model #20697
Conversation
Successfully raising errors and exceptions on the revised code in test_modeling_distilbert.py. Co-credit: @Batese2001
…y to defined condition that asserts statements (Co-author: Batese2001)
… having the even number of multi heads
Co-authored-by: [email protected]
Co-authored-by: Adia Wu <[email protected]>
Thanks so much @JuheonChu for adding the resources for ALBERT!
I left a couple of comments, the main ones being about reverting the changes you made by mistake in modeling_distilbert.py. Also make sure that the text you added renders correctly! Let us know if you need any help.
_Increasing model size when pretraining natural language representations often results in improved performance on
downstream tasks. However, at some point further model increases become harder due to GPU/TPU memory limitations,
longer training times, and unexpected model degradation. To address these problems, we present two parameter-reduction
techniques to lower memory consumption and increase the training speed of BERT. Comprehensive empirical evidence shows
that our proposed methods lead to models that scale much better compared to the original BERT. We also use a
self-supervised loss that focuses on modeling inter-sentence coherence, and show it consistently helps downstream tasks
with multi-sentence inputs. As a result, our best model establishes new state-of-the-art results on the GLUE, RACE, and
SQuAD benchmarks while having fewer parameters compared to BERT-large.*
SQuAD benchmarks while having fewer parameters compared to BERT-large._
Could you revert these changes? 🙏
Does that mean deleting "_"?
It means you can leave the asterisks * instead of using an underscore _.
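For reference, a minimal illustration of the two Markdown emphasis styles being discussed (the sentences below are placeholders, not text from albert.mdx; both render as italics, but the existing docs use the asterisk form):

```md
*This abstract is wrapped in asterisks and renders in italics.*
_This abstract is wrapped in underscores and also renders in italics, but it is not the style used in this file._
```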
# Have an even number of multi heads that divide the dimensions
if self.dim % self.n_heads != 0:
    # Raise value errors for even multi-head attention nodes
    raise ValueError(f"self.n_heads: {self.n_heads} must divide self.dim: {self.dim} evenly")
assert self.dim % self.n_heads == 0
Could you also revert the changes here? It seems that you have deleted this by mistake.
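For context, here is a minimal standalone sketch of the divisibility check being discussed. The class name, attributes, and config values mirror the snippet above but are assumptions for illustration, not the actual contents of modeling_distilbert.py:

```python
class MultiHeadSelfAttention:
    """Sketch of a multi-head attention module that validates head/dimension compatibility."""

    def __init__(self, dim: int, n_heads: int):
        self.dim = dim
        self.n_heads = n_heads
        # Each head receives dim // n_heads features, so dim must divide evenly by n_heads.
        if self.dim % self.n_heads != 0:
            raise ValueError(f"self.n_heads: {self.n_heads} must divide self.dim: {self.dim} evenly")
        self.head_dim = self.dim // self.n_heads


# 768 hidden dimensions split across 12 heads -> 64 features per head.
attention = MultiHeadSelfAttention(dim=768, n_heads=12)
print(attention.head_dim)  # 64

# A mismatched configuration raises a ValueError with a descriptive message.
try:
    MultiHeadSelfAttention(dim=768, n_heads=10)
except ValueError as err:
    print(err)
```

As a general design note, raising a ValueError is usually preferred over a bare assert for user-facing configuration checks, since asserts are stripped when Python runs with the -O flag.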
Co-authored-by: Younes Belkada <[email protected]>
Co-authored-by: Adia Wu <[email protected]> Co-authored-by: mollerup23 <[email protected]>
…JuheonChu/transformers into added-resources-on-ALBERT-model
Thank you @younesbelkada! Would you mind if I ask you how I can pass the …? I tried …
Thanks for your PR! Could you focus it solely on the new resources added? There are multiple changes that are not desired.
Thank you! Will try!
Thanks for your contribution!
I think it might be easier to open a new PR with changes only to the albert.mdx file, because right now modeling_distilbert.py has been deleted and we don't want that!
@@ -67,104 +110,84 @@ This model was contributed by [lysandre](https://huggingface.co/lysandre). This

## AlbertModel

[[autodoc]] AlbertModel
    - forward
[[autodoc]] AlbertModel - forward
You can leave this alone as well and allow the forward method to be listed under the AlbertModel object. Same comment applies to all the other objects changed below :)
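For reference, a minimal sketch of the layout being asked for in albert.mdx, based on the diff above (surrounding file content omitted). Keeping forward on its own indented line lets it be rendered under the AlbertModel entry:

```md
## AlbertModel

[[autodoc]] AlbertModel
    - forward
```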
Do you mind if I open a new Pull Request that contains only the meaningful commits?
Yes please, that'd be great!
What does this PR do?
Co-author: @adia Wu [email protected]
Fixes #20055
Before submitting
Did you read the contributor guideline, Pull Request section?
Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Who can review?
@stevhliu @younesbelkada