Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update with forked from #1

Merged
merged 956 commits into from
Mar 20, 2023

Conversation

oushu1zhangxiangxuan1
Copy link
Owner

What does this PR do?

Fixes # (issue)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

ydshieh and others added 30 commits February 23, 2023 09:41
* Update expect output values - as Hub repo. files are updated

* Update expect output values - as librosa is from 0.9.2 to 0.10.0 on CI docker

* fix

* update one more

---------

Co-authored-by: ydshieh <[email protected]>
* fix bug

* forward contrib credits from discussions

* change logic

---------

Co-authored-by: edbeeching <[email protected]>
* Ran Black formatting

* Added imports and reformatted

* Update src/transformers/models/encoder_decoder/modeling_tf_encoder_decoder.py

---------

Co-authored-by: Matt <[email protected]>
* troubleshooting guide: added an error description for missing auto-mapping

* minor polishing

* changed the example

* Apply suggestions from code review

Co-authored-by: Steven Liu <[email protected]>

* Update docs/source/en/troubleshooting.mdx

Co-authored-by: Sylvain Gugger <[email protected]>

---------

Co-authored-by: Steven Liu <[email protected]>
Co-authored-by: Sylvain Gugger <[email protected]>
* [deepspeed tests] fix issues introduced by #21700

* fix

* fix
* Removed useless check for backend

* fix style check for graphormer

* Reverted change and corrected requires_backend for cython

* code qual
#21612)

* fix: Change is_last chunk calc and add conditional break

* format fix

* account for 0 and full stride_rights, add comment

* add new test

* make style

* update slow whisper asr test timestamps

* use nested_simplify on output and round timestamp to hundreths place
* [flax] adding support for batch norm layers

* fixing bugs related to pt+flax integration

* cleanup, batchnorm support in sharded pt to flax

* support for batchnorm tests in pt+flax integration

* simplifying checking batch norm layer
…1756)

* [Examples] Generalise run audio classification for log-mel models

* batch feature extractor

* make style
* Different behavior in DistilBERT when using "inputs_embeds"
Fixes #21089

* fix failing test
* Return and rescale attention_mask

* Add SpecAugment to Whisper modeling

* Fix test

* Update docstring

* Add SpecAug related parameters to model config

* Add the _mask_input_features function to doc

* Fix quality

* Apply suggestions from code review

Co-authored-by: Arthur <[email protected]>

* Remove dev comments

* Add test

* Resolve conflict

* feat: mask {feature, time} prob fast tests

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <[email protected]>

---------

Co-authored-by: Arthur <[email protected]>
Co-authored-by: sanchit-gandhi <[email protected]>
Co-authored-by: Sylvain Gugger <[email protected]>
* fix history

* input_features instead of input ids for TFWhisport doctest

* use translate intead of transcribe
* updated expected

* prediction_length fix

* prediction_length default value

* default prediction_length 24

* revert back prediction_length default

* move prediction_length test
* fix gradient checkpointing bug

* fix gradient checkpointing bug

* ran make fix-copies

* fixed bug

* fixed bug
* Fix resume_from_checkpoint for deepspeed

Fix resume_from_checkpoint for deepspeed, by ensuring that the deepspeed engine is the one to load the checkpoint.

* Empty commit to trigger CI

* Removed deepspeed skipping 

Removed deepspeed skipping inside the _load_from_checkpoint function, as it is obsolete

* another adjustment

* Trigger CI

* trigger circleci

* style

---------

Co-authored-by: ydshieh <[email protected]>
Co-authored-by: Stas Bekman <[email protected]>
* Override the decoding parameters of Seq2SeqTrainer

* Fix quality

* Fix max_length parameter

* Fix quality

* Remove redundant parameter max_length

* Separate the preprocess of train and validation to use different max_target_length
* fix wrong url

* typos in english documentation
make concrete_args from outside available
* add pipeline

* update init

* add zero shot to init

* update inits and correct checkpoints

* update base to support input features

* add tests

* Update src/transformers/pipelines/zero_shot_audio_classification.py

Co-authored-by: Younes Belkada <[email protected]>

* Update src/transformers/pipelines/zero_shot_audio_classification.py

Co-authored-by: Younes Belkada <[email protected]>

* update pieline code

* use tiny checkpoint

* nits and expected value with tiny model

* style

* last nit on tests values

* fix styling

* fix collate fn that was casting t float

* update

---------

Co-authored-by: Younes Belkada <[email protected]>
* uint8 -> bool

* fix copies

* style

* update test modeling commen when checking attention buffers

* style

* use logical not on random mask instead of subtraction with 1

* remove torch uint8

* quality

* remove modified modeling utils

* Update based on review

Co-authored-by: sgugger <[email protected]>

---------

Co-authored-by: sgugger <[email protected]>
* add `accelerate` marker

* add to docs

* Update docs/source/en/testing.mdx
fix nn.init.trunc_normal_ call on half data
* Fix gradient checkpointing bug in gptneox

* Remove use_cache block
amyeroberts and others added 29 commits March 15, 2023 18:37
* Fix regression in pipeline when device=-1 is passed

* Add regression test
* Use return_loss for BridgeTowerForContrastiveLearning, add example

* fix tests

* Update example in BridgeTowerForContrastiveLearning

* Update test_modeling_bridgetower.py

* update model output format

* minor update

* Update src/transformers/models/bridgetower/modeling_bridgetower.py

* make style

---------

Co-authored-by: Tiep Le <[email protected]>
Co-authored-by: Tiep Le <[email protected]>
Co-authored-by: Yih-Dar <[email protected]>
Co-authored-by: ydshieh <[email protected]>
* t5 remove data dependency

* make style

* make fix-copies

---------

Co-authored-by: Prathik Rao <[email protected]>
* Deal with torch-tensorrt

---------

Co-authored-by: ydshieh <[email protected]>
Fix align docs typo
* Tranlstion Italian: migration

* Update migration.mdx

minor fixes

* Update _toctree.yml

* Delete migration.mdx

* Add italian translation of migration.mdx

* Update of migration.mdx translation and toctree
* LLaMA

* sharding and docs

* tweak

* black

* inits

* ruff

* LLAMA_PRETRAINED_CONFIG_ARCHIVE_MAP

* init

* no checkpoint

* docs

* ruff

* type_vocab_size

* tokenizer fixes

* tokenizer fixes

* Update tokenization_llama.py

* Update tokenization_llama.py

* Update configuration_llama.py

* Update modeling_llama.py

* tokenizer add_bos by default

* licenses

* remove decoder

* norms and mlp

* rope overhaul

* tweaks

* black

* mention OPT implementation

* off-by-one naming

* typo

* fix

* tokenization fix and slicing bug

* padding config

* cleanup

* black

* update tests

* undo typo

* fix vocab caching logic

* ruff

* docbuilder

* attn fix from BlackSamorez

* initial feedback

* typo

* docs

* llama case

* llama case

* load checkpoint docs

* comment about tokenizer

* tokenizer defaults

* clear past_key_values if use_cache=False

* last tweaks

* last tweaks

* last tweaks

* last tweaks

---------

Co-authored-by: Stella Biderman <[email protected]>
* LLaMA

* sharding and docs

* tweak

* black

* inits

* ruff

* LLAMA_PRETRAINED_CONFIG_ARCHIVE_MAP

* init

* no checkpoint

* docs

* ruff

* type_vocab_size

* tokenizer fixes

* tokenizer fixes

* Update tokenization_llama.py

* Update tokenization_llama.py

* Update configuration_llama.py

* Update modeling_llama.py

* tokenizer add_bos by default

* licenses

* remove decoder

* norms and mlp

* rope overhaul

* tweaks

* black

* mention OPT implementation

* off-by-one naming

* typo

* fix

* tokenization fix and slicing bug

* padding config

* cleanup

* black

* update tests

* undo typo

* fix vocab caching logic

* ruff

* docbuilder

* attn fix from BlackSamorez

* initial feedback

* typo

* docs

* llama case

* llama case

* load checkpoint docs

* comment about tokenizer

* tokenizer defaults

* clear past_key_values if use_cache=False

* last tweaks

* last tweaks

* last tweaks

* last tweaks

---------

Co-authored-by: Stella Biderman <[email protected]>
* Update UNCONVERTIBLE_MODEL_ARCHITECTURES

* Deal with 2 model tester classes in single test file

* Deal with 2 model tester classes in single test file

* Deal with 2 model tester classes in single test file

* make style and quality

---------

Co-authored-by: ydshieh <[email protected]>
* Temporarily fix https://github.com/microsoft/onnx-converters-private/issues/143

* Reduced column width

* Fix formatting.

* Revert "Temporarily fix https://github.com/microsoft/onnx-converters-private/issues/143"

This reverts commit 6e95a10.

* Fix export error.

* Revert "Fix formatting."

This reverts commit 8310f60.

* Propagated changes made in SwinV2 to Swin2SR
* add `accelerate` support for XGLM

* fix order
* py38 + torch 2

* increment cache versions

---------

Co-authored-by: ydshieh <[email protected]>
Use dash 2.8.1 for now

Co-authored-by: ydshieh <[email protected]>
* added doc to toc, auto tip with  supported models, mention of task guide in model docs

* make style

* removed "see also"

* minor fix
* LLaMA house-keeping

* Doc links
* fix AutoTP in deepspeed could not work for bloom

Signed-off-by: Wang, Yi A <[email protected]>

* add a method in BloomModel to build ailib

Signed-off-by: Wang, Yi A <[email protected]>

---------

Signed-off-by: Wang, Yi A <[email protected]>
* Add LlamaForSequenceClassification

* Update src/transformers/models/llama/modeling_llama.py

Co-authored-by: Younes Belkada <[email protected]>

* Update src/transformers/models/llama/modeling_llama.py

Co-authored-by: Younes Belkada <[email protected]>

* Add docstring

* Add test

* Add input embedding getter and setter

* Remove dead code

---------

Co-authored-by: Younes Belkada <[email protected]>
fix(docs): task guide links in model docs
* Add kernel size to NATTEN's QK arguments.

The new NATTEN 0.14.5 supports PyTorch 2.0, but also adds an additional
argument to the QK operation to allow optional RPBs.

This ends up failing NATTEN tests.

This commit adds NATTEN back to circleci and adds the arguments to get
it working again.

* Force NATTEN >= 0.14.5
Revert "Use `dash==2.8.1` for now for daily CI (#22227)"

This reverts commit 5321867.
@oushu1zhangxiangxuan1 oushu1zhangxiangxuan1 merged commit 22ead54 into oushu1zhangxiangxuan1:main Mar 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.