[MusicGen] Fix integration tests #25169

Merged: 4 commits, Jul 28, 2023

@sanchit-gandhi sanchit-gandhi commented Jul 28, 2023

What does this PR do?

Fixes the integration tests for MusicGen:

  1. Places all input tensors on the correct device
  2. Updates the expected values to those obtained on CUDA
  3. Fixes fp16 generation
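
As a rough illustration of the first item, the sketch below moves every input tensor to the model's device before the forward pass. It is not the test code from this PR: the `Linear` model, the input names, and the CPU fallback are assumptions chosen so the snippet runs anywhere.

```python
import torch

# Use CUDA when available, otherwise fall back to CPU so the sketch still runs.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical stand-in for the model under test.
model = torch.nn.Linear(4, 2).to(device)

# Hypothetical inputs, created on CPU as a processor or tokenizer would return them.
inputs = {"input_values": torch.ones(1, 4), "attention_mask": torch.ones(1, 4)}

# Move every tensor to the model's device before the forward pass.
inputs = {name: tensor.to(device) for name, tensor in inputs.items()}

with torch.no_grad():
    output = model(inputs["input_values"])
```

Leaving any one input on CPU while the model sits on a GPU would raise a device-mismatch error, which is the kind of failure that only shows up on a CUDA runner.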

cc @ydshieh

HuggingFaceDocBuilderDev commented Jul 28, 2023

The documentation is not available anymore as the PR was closed or merged.

@ydshieh ydshieh left a comment

Thanks a lot @sanchit-gandhi

Confirmed it works on CI runner now.

@@ -773,7 +773,7 @@ def forward(
         past_key_values_length = past_key_values[0][0].shape[2] if past_key_values is not None else 0

         if inputs_embeds is None:
-            inputs_embeds = torch.zeros((bsz, seq_len, self.d_model), device=input_ids.device)
+            inputs_embeds = torch.zeros((bsz, seq_len, self.d_model), dtype=self.dtype, device=input_ids.device)

I am not sure about this, but @sgugger knows everything.

This will take the dtype of a random weight (not really random but the first one). Might be better to look for the dtype of the embeddings instead, no?

Refactored to avoid having to specify any dtype / device arguments: 6118fbf
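
To make the dtype concern concrete, here is a minimal sketch of why a `torch.zeros` buffer created without `dtype` is a problem in fp16, and one way to avoid passing `dtype` / `device` at all. It assumes per-codebook embedding tables summed into one tensor; the sizes and variable names are hypothetical, and this is not necessarily what commit 6118fbf does.

```python
import torch
from torch import nn

# Hypothetical sizes, chosen only for illustration.
num_codebooks, vocab_size, d_model = 4, 16, 8
bsz, seq_len = 2, 3

# One embedding table per codebook, stored in half precision.
embed_tokens = nn.ModuleList(
    [nn.Embedding(vocab_size, d_model) for _ in range(num_codebooks)]
).to(torch.float16)

input_ids = torch.randint(0, vocab_size, (bsz, num_codebooks, seq_len))

with torch.no_grad():
    # Accumulating into a buffer created with the default dtype:
    # the in-place add keeps the buffer's dtype, so it stays float32.
    acc = torch.zeros((bsz, seq_len, d_model), device=input_ids.device)
    for codebook in range(num_codebooks):
        acc += embed_tokens[codebook](input_ids[:, codebook])

    # Summing the embedding outputs directly: the result inherits the
    # dtype and device of the embedding weights, with no extra arguments.
    summed = sum(
        embed_tokens[codebook](input_ids[:, codebook])
        for codebook in range(num_codebooks)
    )

print(acc.dtype, summed.dtype)
```

The second form also sidesteps the reviewer's question about which weight `self.dtype` reads from, since the dtype comes from the embedding outputs themselves.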

@sanchit-gandhi sanchit-gandhi merged commit 03f98f9 into huggingface:main Jul 28, 2023
@sanchit-gandhi sanchit-gandhi deleted the musicgen-tests branch July 28, 2023 17:51
zachares added a commit to nplan-io/transformers that referenced this pull request Aug 11, 2023
* Enable `ZeroShotAudioClassificationPipelineTests::test_small_model_pt` (#24882)

fix

Co-authored-by: ydshieh <[email protected]>

* Add DINOv2 (#24016)

* First draft

* More improvements

* Convert patch embedding layer

* Convert all weights

* Make conversion work

* Improve conversion script

* Fix style

* Make all tests pass

* Add image processor to auto mapping

* Add swiglu ffn

* Add image processor to conversion script

* Fix conversion of giant model

* Fix documentation

* Fix style

* Fix tests

* Address comments

* Address more comments

* Remove unused arguments

* Remove more arguments

* Rename parameters

* Include mask token

* Address comments

* Add docstring

* Transfer checkpoints

* Empty commit

* [`InstructBlip`] Fix int8/fp4 issues (#24888)

* fix dtype issue

* revert `.float()`

* fix copies

* [`Blip`] Fix blip output name (#24889)

* fix blip output name

* add property

* oops

* fix failing test

* check if eval dataset is dict (#24877)

* check if eval dataset is dict

* formatting

* Separate CircleCI cache between `main` and `pull` (or other branches) (#24886)

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <[email protected]>

* [`Llama2`]  Add support for Llama 2 (#24891)

* add llama

* add other readmes

* update padding id in readme

* add link to paper

* fix paths and tokenizer

* more nits

* styling

* fit operation in 2 lines when possible

* nits

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <[email protected]>

* add form

* update reademe

* update readme, we don't have a default pad token

* update test and tokenization

* LLaMA instead of Llama

* nits

* add expected text

* add greeedy output

* styling

* Update src/transformers/models/llama/modeling_llama.py

Co-authored-by: Sylvain Gugger <[email protected]>

* sequential device map

* skip relevant changes

---------

Co-authored-by: Sylvain Gugger <[email protected]>

* Disable ipex env var if false (#24885)

Disable ipex if in use

* Check for accelerate env var when doing CPU only (#24890)

Check for use-cpu

* Avoid some pipeline tasks to use `use_cache=True` (#24893)

* fix

* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <[email protected]>

* Update tested versions in READMEs (#24895)

* Update supported Python and PyTorch versions in readme

* Update Python, etc. versions in non-English readmes

These were more out of date than in the English readme. This
updates all the versions the readmes claim the repository is tested
with to the same versions stated in the English readme.

Those versions are current at least in the case of the Python and
PyTorch versions (and less out of date for the others).

* Propagate trailing whitespace fix to model list

This runs "make fix-copies". The only change is the removal of
whitespace. No actual information or wording is changed.

* Update tested TensorFlow to 2.6 in all readmes

Per pinning in setup.py

Unlike Python and PyTorch, the minimum supported TensorFlow version
has not very recently changed, but old versions were listed in all
READMEs.

* Fix `test_model_parallelism` for `FalconModel` (#24914)

* fix

* fix

---------

Co-authored-by: ydshieh <[email protected]>

* Fixed issue where ACCELERATE_USE_CPU="False" results in bool(True) (#24907)

- This results in cpu mode on Apple Silicon mps

* fix typo in BARK_PRETRAINED_MODEL_ARCHIVE_LIST (#24902)

fix typo in BARK_PRETRAINED_MODEL_ARCHIVE_LIST

suno/barh should be suno/bark

* Fix minor llama2.md model doc typos (#24909)

Update llama2.md

 Fix typos in the llama2 model doc

* [`Llama2`] replace `self.pretraining_tp` with `self.config.pretraining_tp` (#24906)

* add possibility to disable TP

* fixup

* adapt from offline discussions

* [doc] `image_processing_vilt.py` wrong default documented (#24931)

[doc] image_processing_vilt.py wrong default

* 🌐 [i18n-KO] Translated`tasks/document_question_answering.md` to Korean (#24588)

* docs: ko: `document_question_answering.md`

* fix: resolve suggestions

Co-authored-by: Sohyun Sim <[email protected]>

* fix: resolve suggestions

Co-authored-by: Hyeonseo Yun <[email protected]>
Co-authored-by: Sohyun Sim <[email protected]>

---------

Co-authored-by: Sohyun Sim <[email protected]>
Co-authored-by: Hyeonseo Yun <[email protected]>

* Add multi-label text classification support to pytorch example (#24770)

* Add text classification example

* set the problem type and finetuning task

* ruff reformated

* fix bug for unseting label_to_id for regression

* update README.md

* fixed finetuning task

* update comment

* check if label exists in feature before removing

* add useful logging

* Deprecate unused OpenLlama architecture (#24922)

* Resolve typo in check_repo.py

* Specify encoding when opening modeling files

* Deprecate the OpenLlama architecture

* Add disclaimer pointing to Llama

I'm open to different wordings here

* Match the capitalisation of LLaMA

* replace no_cuda with use_cpu in test_pytorch_examples (#24944)

* replace no_cuda with use_cpu in test_pytorch_examples

* remove codes that never be used

* fix style

* Generate: sequence bias can handle same terminations (#24822)

* Bump pygments from 2.11.2 to 2.15.0 in /examples/research_projects/decision_transformer (#24949)

Bump pygments in /examples/research_projects/decision_transformer

Bumps [pygments](https://github.com/pygments/pygments) from 2.11.2 to 2.15.0.
- [Release notes](https://github.com/pygments/pygments/releases)
- [Changelog](https://github.com/pygments/pygments/blob/master/CHANGES)
- [Commits](https://github.com/pygments/pygments/compare/2.11.2...2.15.0)

---
updated-dependencies:
- dependency-name: pygments
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Update processing_vision_text_dual_encoder.py (#24950)

Fixing small typo: kwrags -> kwargs

* Fix `main_input_name` in `src/transformers/keras_callbacks.py` (#24916)

fix

Co-authored-by: ydshieh <[email protected]>

* [DOCS] Example for `LogitsProcessor` class (#24848)

* make docs

* fixup

* resolved

* remove debugs

* Revert "fixup"

This reverts commit 5e0f636aae0bf8707bc8bdaa6a9427fbf66834ed.

* prev (ignore)

* fixup broke some files

* remove files

* reverting modeling_reformer

* lang fix

* fix type annotations for arguments in training_args (#24550)

* testing

* example script

* fix typehinting

* some tests

* make test

* optional update

* Union of arguments

* does this fix the issue

* remove reports

* set default to False

* documentation change

* None support

* does not need None

* Fix typing annotations for FSDP and DeepSpeed in TrainingArguments (#24549)

* Fix typing annotations for FSDP and DeepSpeed in TrainingArguments

* Change dict to Dict

* Revert "Fix typing annotations for FSDP and DeepSpeed in TrainingArguments" (#24574)

Revert "Fix typing annotations for FSDP and DeepSpeed in TrainingArguments (#24549)"

This reverts commit c5e29d4381d4b9739e6cb427adbca87fbb43a3ad.

* Fix typing annotations for FSDP and DeepSpeed in TrainingArguments (#24549)

* Fix typing annotations for FSDP and DeepSpeed in TrainingArguments

* Change dict to Dict

* merge

* hacky fix

* fixup

---------

Co-authored-by: Max Ryabinin <[email protected]>
Co-authored-by: Sylvain Gugger <[email protected]>

* Bump aiohttp from 3.8.1 to 3.8.5 in /examples/research_projects/decision_transformer (#24954)

Bump aiohttp in /examples/research_projects/decision_transformer

Bumps [aiohttp](https://github.com/aio-libs/aiohttp) from 3.8.1 to 3.8.5.
- [Release notes](https://github.com/aio-libs/aiohttp/releases)
- [Changelog](https://github.com/aio-libs/aiohttp/blob/v3.8.5/CHANGES.rst)
- [Commits](https://github.com/aio-libs/aiohttp/compare/v3.8.1...v3.8.5)

---
updated-dependencies:
- dependency-name: aiohttp
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* [`RWKV`] Add Gradient Checkpointing support for RWKV (#24955)

add GC support for RWKV

* Change logic for logging in the examples (#24956)

Change logic

* Contrastive Search peak memory reduction (#24120)

Co-authored-by: Joao Gante <[email protected]>

* Fallback for missing attribute `Parameter.ds_numel` (#24942)

* [trainer] fallback for deepspeed param count

* [trainer] more readable numel count

* fix fsdp checkpointing issues (#24926)

* fix fsdp load

* Update trainer.py

* remove saving duplicate state_dict

* fix: cast input pixels to appropriate dtype for image_to_text pipelines (#24947)

* fix: cast input pixels to appropriate dtype for image_to_text tasks

* fix: add casting to pixel inputs of additional models after running copy checks

* 🌐 [i18n-KO] Fixed Korean and English `quicktour.md` (#24664)

* fix: english/korean quicktour.md

* fix: resolve suggestions

Co-authored-by: Hyeonseo Yun <[email protected]>
Co-authored-by: Sohyun Sim <[email protected]>
Co-authored-by: Kihoon Son <[email protected]>

* fix: follow glossary

* 파인튜닝 -> 미세조정

---------

Co-authored-by: Hyeonseo Yun <[email protected]>
Co-authored-by: Sohyun Sim <[email protected]>
Co-authored-by: Kihoon Son <[email protected]>

* fsdp fixes and enhancements (#24980)

* fix fsdp prepare to remove the warnings and fix excess memory usage

* Update training_args.py

* parity for FSDP+XLA

* Update trainer.py

* Fix missing spaces in system prompt of Llama2 tokenizer (#24930)

* Update tokenization_llama.py

* Update tokenization_llama_fast.py

* Update src/transformers/models/llama/tokenization_llama_fast.py

Co-authored-by: Arthur <[email protected]>

* Update src/transformers/models/llama/tokenization_llama.py

Co-authored-by: Arthur <[email protected]>

* Update src/transformers/models/llama/tokenization_llama.py

Co-authored-by: Arthur <[email protected]>

* Update src/transformers/models/llama/tokenization_llama_fast.py

Co-authored-by: Arthur <[email protected]>

---------

Co-authored-by: Arthur <[email protected]>

* [`LlamaConfig`] Nit: pad token should be None by default (#24958)

* pad token should be None by default

* fix tests

* nits

* Remove tokenizers from the doc table (#24963)

* Avoid importing all models when instantiating a pipeline (#24960)

* Avoid importing all models when instantiating a pipeline

* Remove sums that don't work

* Fix type annotation for deepspeed training arg (#24988)

* Use main_input_name for include_inputs_for_metrics (#24993)

* Fix `llama` tokenization doctest (#24990)

fix

Co-authored-by: ydshieh <[email protected]>

* [`bnb`] Add simple check for bnb import (#24995)

add simple check for bnb

* [`Llama`] remove persistent  `inv_freq` tensor (#24998)

remove persistent tensor

* improve from_pretrained for zero3 multi gpus mode (#24964)

* improve from_pretrained for zero3 multi gpus mode

* Add check if torch.distributed.is_initialized

* Revert torch.distributed

---------

Co-authored-by: Stas Bekman <[email protected]>

* Move template doc file to md (#25004)

* 🌐 [i18n-KO] Updated Korean `serialization.md` (#24686)

fix: update ko/serialization.md

* chatgpt draft

* [check_config_docstrings.py] improve diagnostics (#25012)

* [check_config_docstrings.py] improve diagnostics

* style

* rephrase

* fix

* [`logging.py`] set default `stderr`  path if `None` (#25033)

set default logger

* fix(integrations): store serialized `TrainingArgs` to `wandb.config` without sanitization. (#25035)

fix: store training args to wandb config without sanitization.

Allows resuming runs by reusing the wandb config.

Co-authored-by: Bharat Ramanathan <[email protected]>

* [docs] Performance docs tidy up, part 1  (#23963)

* first pass at the single gpu doc

* overview: improved clarity and navigation

* WIP

* updated intro and deepspeed sections

* improved torch.compile section

* more improvements

* minor improvements

* make style

* Apply suggestions from code review

Co-authored-by: Steven Liu <[email protected]>

* feedback addressed

* mdx -> md

* link fix

* feedback addressed

---------

Co-authored-by: Steven Liu <[email protected]>

* Support GatedRepoError + use raise from (#25034)

* Support GatedRepoError + use raise from

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <[email protected]>

* Use token instead of use_auth_token in error messages

---------

Co-authored-by: Sylvain Gugger <[email protected]>

* Better handling missing SYS in llama conversation tokenizer (#24997)

* Better handling missing SYS in llama conversation tokenizer

The existing code failed to add SYS if the conversation has history
without SYS, but did modify the passed conversation as it did.

Rearrange the code so modification to the conversation object are taken
into account for token id generation.

* Fix formatting with black

* Avoid one-liners

* Also fix fast tokenizer

* Drop List decl

* 🌐[i18n-KO] Translated performance.md to Korean (#24883)

* dos: ko: performance.md

* feat: chatgpt draft

* fix: manual edits

* fix: manual edits

* Update docs/source/ko/performance.md

Co-authored-by: Kihoon Son <[email protected]>

* Update docs/source/ko/performance.md

---------

Co-authored-by: Kihoon Son <[email protected]>

* 🌐 [i18n-KO] Translated `testing.md` to Korean (#24900)

* docs: ko: testing.md

* feat: draft

* fix: manual edits

* fix: edit ko/_toctree.yml

* fix: manual edits

* fix: manual edits

* fix: manual edits

* fix: manual edits

* fix: resolve suggestions

* Add dispatch_batches to training arguments (#25038)

* Dispatch batches

* Copy items

* Fix typo in LlamaTokenizerFast docstring example (#25018)

* Make more test models smaller (#25005)

* Make more test models tiny

* Make more test models tiny

* More models

* More models

* Comment again print statement

* Pvt model (#24720)

* pull and push updates

* add docs

* fix modeling

* Add and run test

* make copies

* add task

* fix tests and fix small issues

* Checks on a Pull Request

* fix docs

* add desc pvt.md

* compute_loss in trainer failing to label shift for PEFT model when label smoothing enabled. (#25044)

* added PeftModelForCausalLM to MODEL_FOR_CAUSAL_LM_MAPPING_NAMES dict

* check for PEFT model in compute_loss section

---------

Co-authored-by: Nathan Brake <[email protected]>

* [`8bit`] Fix 8bit corner case with Blip2 8bit (#25047)

fix 8bit corner case with Blip2 8bit

* 🌐 [i18n-KO] Translated `perf_train_cpu.md` to Korean (#24911)

* dos: ko: perf_train_cpu.md

* feat: chatgpt draft

* fix: manual edits

* fix: resolve suggestions

* fix: manual edits

Co-authored-by: Haewon Kim <[email protected]>

---------

Co-authored-by: Haewon Kim <[email protected]>

* Better error message when signal is not supported on OS (#25049)

* Better error message when signal is not supported on OS

* Address review comments

* [`RWKV`] Add note in doc on `RwkvStoppingCriteria` (#25055)

* Add note in doc on `RwkvStoppingCriteria`

* give some breathing space to the code

* Generate - add beam indices output in contrained beam search (#25042)

* [Docs] fix rope_scaling doc string (#25072)

fix rope_scaling doc string

* 🌐 [i18n-KO] Translated `<tf_xla>.md` to Korean (#24904)

* docs: ko: tf_xla.md

* feat: chatgpt draft

* fix: manual edits

* fix: manual edits

* fix: manual edits

* fix: resolve suggestions

* 🌐 [i18n-KO] Translated `perf_hardware.md` to Korean (#24966)

* docs: ko: perf_hardware.md

* feat: nmt draft

* fix: manual edits

* fix: resolve suggestions

Co-authored-by: Hyeonseo Yun <[email protected]>

* fix: resolve suggestions

Co-authored-by: Hyeonseo Yun <[email protected]>

* fix: resolve suggestions

Co-authored-by: Hyeonseo Yun <[email protected]>

* fix: resolve suggestions

Co-authored-by: Hyeonseo Yun <[email protected]>

* fix: resolve suggestions

Co-authored-by: Hyeonseo Yun <[email protected]>

* fix: resolve suggestions

Co-authored-by: Hyeonseo Yun <[email protected]>

* fix: resolve suggestions

Co-authored-by: Hyeonseo Yun <[email protected]>

* fix: resolve suggestions

Co-authored-by: Haewon Kim <[email protected]>

* Fix: manual edits

* fix: manual edits

* fix: manual edits

* fix: manual edits

* fix: fix rendering error of perf_hardware.md

---------

Co-authored-by: Hyeonseo Yun <[email protected]>
Co-authored-by: Haewon Kim <[email protected]>

* Fix last models for common tests that are too big. (#25058)

* Fix last models for common tests that are too big.

* Remove print statement

* fix: add TOC anchor link (#25066)

* Set `TF32` flag for PyTorch cuDNN backend (#25075)

* Fix broken link in README_hd.md (#25067)

Update README_hd.md

* replace `per_gpu_eval_batch_size` with `per_device_eval_batch_size` in readme of multiple-choice task (#25078)

replace `per_gpu_eval_batch_size` with `per_device_eval_batch_size`
in readme of multiple-choice

* [`generate`]  Only warn users if the `generation_config`'s `max_length` is set to the default value (#25030)

* check max length is default

* nit

* update warning: no-longer deprecate

* comment in the configuration_utils in case max length's default gets changed in the futur

* 🌐 [i18n-KO] Translated `hpo_train.md` to Korean (#24968)

* dos: ko: hpo_train.mdx

* feat: chatgpt draft

* fix: manual edits

* fix: resolve suggestions

* Fix: repeat per sample for SAM image embeddings (#25074)

Repeat per sample for SAM image embeddings

* [`MPT`] Add MosaicML's `MPT` model to transformers (#24629)

* draft add new model like

* some cleaning of the config

* nits

* add nested configs

* nits

* update

* update

* added layer norms + triton kernels

* consider only LPLayerNorm for now.

* update

* all keys match.

* Update

* fixing nits here and there

* working forward pass.

* removed einops dependency

* nits

* format

* add alibi

* byebye head mask

* refactor attention

* nits.

* format

* fix nits.

* nuke ande updates

* nuke tokenizer test

* don't reshape query with kv heads

* added a bit of documentation.

* remove unneeded things

* nuke more stuff

* nit

* logits match - same generations

* rm unneeded methods

* 1 remaining failing CI test

* nit

* fix nits

* fix docs

* fix docs

* rm tokenizer

* fixup

* fixup

* fixup and fix tests

* fixed configuration object.

* use correct activation

* few minor fixes

* clarify docs a bit

* logits match à 1e-12

* skip and unskip a test

* added some slow tests.

* fix readme

* add more details

* Update docs/source/en/model_doc/mpt.md

Co-authored-by: Arthur <[email protected]>

* Apply suggestions from code review

Co-authored-by: Arthur <[email protected]>

* fix configuration issues

* more fixes in config

* added more models

* Apply suggestions from code review

Co-authored-by: Arthur <[email protected]>

* remove unneeded position ids

* fix some  comments

* Apply suggestions from code review

Co-authored-by: Arthur <[email protected]>

* revert suggestion

* mpt alibi + added batched generation

* Update src/transformers/models/mpt/__init__.py

Co-authored-by: Arthur <[email protected]>

* remove init config

* Update src/transformers/models/mpt/configuration_mpt.py

Co-authored-by: Arthur <[email protected]>

* fix nit

* add another slow test

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <[email protected]>

* fits in one line

* some refactor because make fixup doesn't pass

* add ft notebook

* update md

* correct doc path

---------

Co-authored-by: younesbelkada <[email protected]>
Co-authored-by: Younes Belkada <[email protected]>
Co-authored-by: Sylvain Gugger <[email protected]>

* [DOCS] add example NoBadWordsLogitsProcessor (#25046)

* add example NoBadWordsLogitsProcessor

* fix L764 & L767

* make style

* 🌐 [i18n-KO] Translated `perf_infer_cpu.md` to Korean (#24920)

* docs: ko: perf_infer_cpu.md

* feat: chatgpt draft

* fix: manual edits

* Update docs/source/ko/_toctree.yml

* Update docs/source/ko/perf_infer_cpu.md

* Update docs/source/ko/perf_infer_cpu.md

이 부분은 저도 걸리적거렸던 부분입니다. 반영하겠습니다!

Co-authored-by: Wonhyeong Seo <[email protected]>

* Update docs/source/ko/perf_infer_cpu.md

동의합니다! 제가 원본에 너무 얽매여 있었네요!

Co-authored-by: Wonhyeong Seo <[email protected]>

* Update docs/source/ko/perf_infer_cpu.md

말씀하신대로 원문에 너무 집착했던것 같습니다

Co-authored-by: Wonhyeong Seo <[email protected]>

* Update docs/source/ko/perf_infer_cpu.md

더 나은 어휘 사용에 감사드립니다!

Co-authored-by: Wonhyeong Seo <[email protected]>

* Update docs/source/ko/perf_infer_cpu.md

이 당시 '주기'란 용어를 생각해내질 못했네요...

Co-authored-by: Wonhyeong Seo <[email protected]>

* Update docs/source/ko/perf_infer_cpu.md

좀 더 자연스러운 문맥이 됐네요!

Co-authored-by: Wonhyeong Seo <[email protected]>

* Update docs/source/ko/perf_infer_cpu.md

굳이 원본 형식에 얽매일 필요가 없군요!

Co-authored-by: Wonhyeong Seo <[email protected]>

* Update docs/source/ko/perf_infer_cpu.md

Co-authored-by: Wonhyeong Seo <[email protected]>

---------

Co-authored-by: Wonhyeong Seo <[email protected]>

* Allow generic composite models to pass more kwargs (#24927)

* fix

* Update src/transformers/generation/utils.py

Co-authored-by: Joao Gante <[email protected]>

* update

---------

Co-authored-by: ydshieh <[email protected]>
Co-authored-by: Joao Gante <[email protected]>

* [ `ForSequenceClassification`] Support `left` padding (#24979)

* support left padding

* nit

* Update src/transformers/models/gpt_neox/modeling_gpt_neox.py

* Update src/transformers/models/gpt_neox/modeling_gpt_neox.py

* [`TF`]  Also apply patch to support left padding (#25085)

* tf versions

* apply changes to other models

* 3 models slipped through the cracks

* Edit err message and comment in `test_model_is_small` (#25087)

* Edit err message and comment in

* put back 80M comment

* [ `PreTrainedTokenizerFast`] Keep properties from fast tokenizer (#25053)

* draft solution

* use `setdefault`

* nits

* add tests and fix truncation issue

* fix test

* test passes locally

* quality

* updates

* update tsets

* Hotfix for failing `MusicgenForConditionalGeneration` tests (#25091)

Co-authored-by: ydshieh <[email protected]>

* [`T5`, `MT5`, `UMT5`] Add [T5, MT5, UMT5]ForSequenceClassification (#24726)

* Initial addition of t5forsequenceclassification

* Adding imports and adding tests

* Formatting

* Running make fix-copies

* Adding mt5forseq

* Formatting

* run make fix-copies

* Adding to docs

* Add model_parallel

* Fix bug

* Fix

* Remove TODO

* Fixing tests for T5ForSequenceClassification

* Undo changes to dependency_versions_table.py

* Change classification head to work with T5Config directly

* Change seq length to let tests pass

* PR comments for formatting

* Formatting

* Initial addition of UMT5ForSequenceClassification

* Adding to inits and formatting

* run make fix-copies

* Add doc for UMT5ForSeqClass

* Update UMT5 config

* Fix docs

* Skip torch fx test for SequenceClassification

* Formatting

* Add skip to UMT5 tests as well

* Fix umt5 tests

* Running make fix-copies

* PR comments

* Fix for change to sentence_representation

* Rename seq_len to hidden_size since that's what it is

* Use base_model to follow format of the rest of the library

* Update docs

* Extract the decoder_input_ids changes and make one liner

* Make one-liner

* Fix doctest (#25031)

fix

Co-authored-by: ydshieh <[email protected]>

* Bump certifi from 2022.12.7 to 2023.7.22 in /examples/research_projects/lxmert (#25096)

Bump certifi in /examples/research_projects/lxmert

Bumps [certifi](https://github.com/certifi/python-certifi) from 2022.12.7 to 2023.7.22.
- [Commits](https://github.com/certifi/python-certifi/compare/2022.12.07...2023.07.22)

---
updated-dependencies:
- dependency-name: certifi
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump certifi from 2022.12.7 to 2023.7.22 in /examples/research_projects/decision_transformer (#25098)

Bump certifi in /examples/research_projects/decision_transformer

Bumps [certifi](https://github.com/certifi/python-certifi) from 2022.12.7 to 2023.7.22.
- [Commits](https://github.com/certifi/python-certifi/compare/2022.12.07...2023.07.22)

---
updated-dependencies:
- dependency-name: certifi
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump certifi from 2022.12.7 to 2023.7.22 in /examples/research_projects/visual_bert (#25097)

Bump certifi in /examples/research_projects/visual_bert

Bumps [certifi](https://github.com/certifi/python-certifi) from 2022.12.7 to 2023.7.22.
- [Commits](https://github.com/certifi/python-certifi/compare/2022.12.07...2023.07.22)

---
updated-dependencies:
- dependency-name: certifi
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* fix tied_params for meta tensor (#25101)

* fix tied_params for meta tensor

* remove duplicate

* documentation for llama2 models (#25102)

* fix documentation

* changes

* 🌐[i18n-KO] Translated pipeline_webserver.md to Korean (#24828)

* translated pipeline_webserver.md

Co-Authored-By: Hyeonseo Yun <[email protected]>
Co-Authored-By: Wonhyeong Seo <[email protected]>
Co-Authored-By: Sohyun Sim <[email protected]>
Co-Authored-By: Gabriel Yang <[email protected]>
Co-Authored-By: Nayeon Han <[email protected]>
Co-Authored-By: Jungnerd <[email protected]>

* Update pipeline_webserver.md

* Apply suggestions from code review

Co-authored-by: Hyeonseo Yun <[email protected]>
Co-authored-by: Sangam Lee <[email protected]>
Co-authored-by: Kim haewon <[email protected]>

---------

Co-authored-by: Hyeonseo Yun <[email protected]>
Co-authored-by: Wonhyeong Seo <[email protected]>
Co-authored-by: Sohyun Sim <[email protected]>
Co-authored-by: Gabriel Yang <[email protected]>
Co-authored-by: Nayeon Han <[email protected]>
Co-authored-by: Jungnerd <[email protected]>
Co-authored-by: Sangam Lee <[email protected]>
Co-authored-by: Kim haewon <[email protected]>

* Fix `PvtModelIntegrationTest::test_inference_fp16` (#25106)

update

Co-authored-by: ydshieh <[email protected]>

* Add descriptive docstring to TemperatureLogitsWarper (#24892)

* Add descriptive docstring to TemperatureLogitsWarper

It addresses https://github.com/huggingface/transformers/issues/24783

* Remove niche features

Co-authored-by: Joao Gante <[email protected]>

* Commit suggestion

Co-authored-by: Joao Gante <[email protected]>

* Refactor the examples to simpler ones

* Add a missing comma

Co-authored-by: Joao Gante <[email protected]>

* Make args description more compact

Co-authored-by: Joao Gante <[email protected]>

* Remove extra text after making description more compact

Co-authored-by: Joao Gante <[email protected]>

* Fix linter

---------

Co-authored-by: Joao Gante <[email protected]>

* fix "UserWarning: Creating a tensor from a list of numpy.ndarrays is … (#24772)

fix "UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor."

Co-authored-by: 刘长伟 <[email protected]>

* update `use_auth_token` -> `token` (#25083)

* update

---------

Co-authored-by: ydshieh <[email protected]>

* Fix past CI after #24334 (#25113)

update

Co-authored-by: ydshieh <[email protected]>

* Move common image processing methods to BaseImageProcessor (#25089)

Move out common methods

* Fix ViT docstring regarding default dropout values. (#25118)

Fix docstring for dropout.

* MaskFormer - enable return_dict in order to compile (#25052)

* Enable return_dict in order to compile

* Update tests

* Move center_crop to BaseImageProcessor (#25122)

* fix deepspeed load best model at end when the model gets sharded (#25057)

* fix delete all checkpoints when save_total_limit is set to 1 (#25136)

* [`T5/LlamaTokenizer`] default legacy to `None` to not always warn (#25131)

default legacy to None

* Clarify 4/8 bit loading log message (#25134)

* clarify 4/8 bit loading log message

* make style

* 🚨🚨🚨Change default from `adamw_hf` to `adamw_torch` 🚨🚨🚨 (#25109)

* Change defaults

* Sylvain's comments

* [`MptConfig`] support from pretrained args (#25116)

* support from pretrained args

* draft addition of tests

* update test

* use parrent assert true

* Update src/transformers/models/mpt/configuration_mpt.py

Co-authored-by: Younes Belkada <[email protected]>

---------

Co-authored-by: Younes Belkada <[email protected]>

* Add offload support to Bark (#25037)

* initial Bark offload proposal

* use hooks instead of manually offloading

* add test of bark offload to cpu feature

* Apply nit suggestions from code review

Co-authored-by: Sylvain Gugger <[email protected]>

* Update docstrings of offload

Co-authored-by: Sanchit Gandhi <[email protected]>

* remove unecessary set_seed in Bark tests

---------

Co-authored-by: Sylvain Gugger <[email protected]>
Co-authored-by: Sanchit Gandhi <[email protected]>

* More `token` things (#25146)

* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <[email protected]>

* Add bloom flax (#25094)

* First commit

* step 1 working

* add alibi

* placeholder for `scan`

* add matrix mult alibi

* beta scaling factor for bmm

* working v1 - simple forward pass

* move layer_number from attribute to arg in call

* partial functioning scan

* hacky working scan

* add more modifs

* add test

* update scan for new kwarg order

* fix position_ids problem

* fix bug in attention layer

* small fix

- do the alibi broadcasting only once

* prelim refactor

* finish refactor

* alibi shifting

* incorporate dropout_add to attention module

* make style

* make padding work again

* update

* remove bogus file

* up

* get generation to work

* clean code a bit

* added small tests

* adding alibi test

* make CI tests pass:

- change init weight
- add correct tuple for output attention
- add scan test
- make CI tests work

* fix few nits

* fix nit onnx

* fix onnx nit

* add missing dtype args to nn.Modules

* remove debugging statements

* fix scan generate

* Update modeling_flax_bloom.py

* Update test_modeling_flax_bloom.py

* Update test_modeling_flax_bloom.py

* Update test_modeling_flax_bloom.py

* fix small test issue + make style

* clean up

* Update tests/models/bloom/test_modeling_flax_bloom.py

Co-authored-by: Sanchit Gandhi <[email protected]>

* fix function name

* small fix test

* forward contrib credits from PR17761

* Fix failing test

* fix small typo documentation

* fix non passing test

- remove device from build alibi

* refactor call

- refactor `FlaxBloomBlockCollection` module

* make style

* upcast to fp32

* cleaner way to upcast

* remove unused args

* remove layer number

* fix scan test

* make style

* fix i4 casting

* fix slow test

* Update src/transformers/models/bloom/modeling_flax_bloom.py

Co-authored-by: Sanchit Gandhi <[email protected]>

* remove `layer_past`

* refactor a bit

* fix `scan` slow test

* remove useless import

* major changes

- remove unused code
- refactor a bit
- revert import `torch`

* major refactoring

- change build alibi

* remove scan

* fix tests

* make style

* clean-up alibi

* add integration tests

* up

* fix batch norm conversion

* style

* style

* update pt-fx cross tests

* update copyright

* Update src/transformers/modeling_flax_pytorch_utils.py

Co-authored-by: Sylvain Gugger <[email protected]>

* per-weight check

* style

* line formats

---------

Co-authored-by: younesbelkada <[email protected]>
Co-authored-by: Patrick von Platen <[email protected]>
Co-authored-by: Younes Belkada <[email protected]>
Co-authored-by: haileyschoelkopf <[email protected]>
Co-authored-by: Sylvain Gugger <[email protected]>

* Add new model in doc table of content (#25148)

* Fix `.push_to_hub` and cleanup `get_full_repo_name` usage (#25120)

* Fix .push_to_hub and cleanup get_full_repo_name usage

* Do not rely on Python bool conversion magic

* request changes

* Add test when downloading from gated repo (#25039)

* override .cuda() to check if model is already quantized (#25166)

* Represent query_length in a different way to solve jit issue (#25164)

Fix jit trace

* make run_generation more generic for other devices (#25133)

* make run_generation more generic for other devices

* use Accelerate to support any device type it supports.

* make style

* fix error usage of accelerator.prepare_model

* use `PartialState` to make sure everything is running on the right device

---------

Co-authored-by: statelesshz <[email protected]>

* added compiled model support for inference (#25124)

* added compiled model support for inference

* linter

* Fix tests

* linter

* linter

* remove inference mode from pipelines

* Linter

---------

Co-authored-by: amarkov <[email protected]>

* Update `use_auth_token` -> `token` in example scripts (#25167)

* pytorch examples

* tensorflow examples

* flax examples

---------

Co-authored-by: ydshieh <[email protected]>

* [`Mpt`] Fix mpt slow test (#25170)

fix mpt slow test

* [`InstructBlip`] Fix instructblip slow test (#25171)

* fix instruct blip slow test

* Update tests/models/instructblip/test_modeling_instructblip.py

* 🌐 [i18n-KO] Translated `transformers_agents.md` to Korean (#24881)

* docs: ko: transformers_agents.md

* docs: ko: transformers_agents.md

* feat: deepl draft

* fix: manual edits

* fix: resolve suggestions

Co-authored-by: Juntae <[email protected]>
Co-authored-by: Injin Paek <[email protected]>

---------

Co-authored-by: Juntae <[email protected]>
Co-authored-by: Injin Paek <[email protected]>

* Fix beam search to sample at least 1 non eos token (#25103) (#25115)

* [MusicGen] Fix integration tests (#25169)

* move to device

* update with cuda values

* fix fp16

* more rigorous

* 🚨🚨🚨  Fix rescale ViVit Efficientnet (#25174)

* Fix rescaling bug

* Add tests

* Update integration tests

* Fix up

* Update src/transformers/image_transforms.py

* Update test - new possible order in list

* Musicgen: CFG is manually added  (#25173)

* Better error message in `_prepare_output_docstrings` (#25202)

fix

Co-authored-by: ydshieh <[email protected]>

* [`PreTrainedModel`] Wrap `cuda` and `to` method correctly (#25206)

wrap `cuda` and `to` method correctly

* Fix `all_model_classes` in `FlaxBloomGenerationTest` (#25211)

fix

Co-authored-by: ydshieh <[email protected]>

* [quantization.md] fix (#25190)

Update quantization.md

* [`pipeline`] revisit device check for pipeline (#25207)

* revisit device check for pipeline

* let's raise an error.

* Update tiny model info. and pipeline testing (#25213)

* update tiny_model_summary.json

* update

* update

* update

---------

Co-authored-by: ydshieh <[email protected]>

* Fix docker image build failure (#25214)

fix

Co-authored-by: ydshieh <[email protected]>

* make build_mpt_alibi_tensor a method of MptModel so that deepspeed co… (#25193)

make build_mpt_alibi_tensor a method of MptModel so that deepspeed could override it to make autoTP work

Signed-off-by: Wang, Yi A <[email protected]>

* [`Pix2Struct`] Fix pix2struct cross attention (#25200)

* fix pix2struct cross attention

* fix torchscript slow test

* [`Docs`/`quantization`] Clearer explanation on how things work under the hood. + remove outdated info (#25216)

* clearer explanation on how things work under the hood.

* Update docs/source/en/main_classes/quantization.md

Co-authored-by: Steven Liu <[email protected]>

* Update docs/source/en/main_classes/quantization.md

Co-authored-by: amyeroberts <[email protected]>

* add `load_in_4bit` in `from_pretrained`

---------

Co-authored-by: Steven Liu <[email protected]>
Co-authored-by: amyeroberts <[email protected]>

* [`MPT`] Add  `require_bitsandbytes` on MPT integration tests (#25201)

* add  `require_bitsandbytes` on MPT integration tests

* add it on mpt as well

* [`Detr`] Fix detr BatchNorm replacement issue (#25230)

* fix detr weird issue

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: Sylvain Gugger <[email protected]>

* fix copies

* fix copies

---------

Co-authored-by: Sylvain Gugger <[email protected]>

* Move rescale dtype recasting to match torchvision ToTensor (#25229)

Move dtype recasting to match torchvision ToTensor

* Fix set of model parallel in the Trainer when no GPUs are available (#25239)

* fix get_keys_to_not_convert() to return correct modules for full precision inference (#25105)

* add test for `get_keys_to_not_convert`

* add minimum patch to keep mpt lm_head from 8bit quantization

* add revision to

* add pathname and line number to logging formatter in debug mode (#25203)

* add pathname and lineno to logging formatter in debug mode

* use TRANSFORMERS_VERBOSITY="detail" to print pathname and lineno
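
As a rough illustration of what a "detail" log format adds, here is a minimal sketch using only the standard `logging` module. The `DETAIL_FORMAT` name and the format string are assumptions made for this example; they are not taken from the library.

```python
import logging

# Assumed format for illustration: level, source file path and line number, then the message.
DETAIL_FORMAT = "[%(levelname)s|%(pathname)s:%(lineno)d] %(message)s"

formatter = logging.Formatter(DETAIL_FORMAT)

# Build a record by hand so the example does not depend on where it is run from.
record = logging.LogRecord(
    name="example",
    level=logging.WARNING,
    pathname="/tmp/example.py",
    lineno=12,
    msg="loading %s",
    args=("weights",),
    exc_info=None,
)

formatted = formatter.format(record)
print(formatted)
```

With the path and line number in every message, a warning can be traced back to the call site without searching the code base for its text.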

* Add `token` argument in example scripts (#25172)

* fix

* fix

* fix

* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <[email protected]>

* resolving zero3 init when using accelerate config with Trainer (#25227)

* resolving zero3 init when using accelerate config with Trainer

* refactor

* fix

* fix import

* Update rescale tests - cast to float after rescaling to reflect #25229 (#25259)

Rescale tests - cast to float after rescaling to reflect #25229

* Fix some bugs for two stage training of deformable detr (#25045)

* Update modeling_deformable_detr.py

Fix bugs for two stage training

* Update modeling_deformable_detr.py

* Add test_two_stage_training to DeformableDetrModelTest

---------

Co-authored-by: yupeng.jia <[email protected]>

* [DOCS] Add example and modified docs of EtaLogitsWarper (#25125)

* added example and modified docs for EtaLogitsWarper

* make style

* fixed styling issue on 544

* removed error info and added set_seed

* Update src/transformers/generation/logits_process.py

Co-authored-by: amyeroberts <[email protected]>

* Update src/transformers/generation/logits_process.py

Co-authored-by: amyeroberts <[email protected]>

* updated the results

---------

Co-authored-by: amyeroberts <[email protected]>

* Fix return_dict_in_generate bug in InstructBlip generate function (#25246)

Fix bug in InstructBlip generate function

Previously, the postprocessing conducted on generated sequences in InstructBlip's generate function assumed these sequences were tensors (i.e. that `return_dict_in_generate == False`).

This commit checks whether the result of the call to the wrapped language model `generate()` is a tensor, and if not attempts to postprocess the sequence attribute of the returned results object.
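
The check described above can be sketched without any framework dependency. This is an illustration under stated assumptions: plain nested lists stand in for tensors, `GenerateResult` is a hypothetical stand-in for the results object, and `remap_ids` (replacing token id 0 with 2) is a placeholder for whatever postprocessing the real function applies.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class GenerateResult:
    """Hypothetical results object, as returned when return_dict_in_generate is True."""
    sequences: List[List[int]]


def remap_ids(sequences: List[List[int]], old_id: int = 0, new_id: int = 2) -> List[List[int]]:
    # Placeholder postprocessing step: swap one token id for another.
    return [[new_id if tok == old_id else tok for tok in seq] for seq in sequences]


def postprocess(outputs: Union[List[List[int]], GenerateResult]):
    # Bare sequences (the stand-in for a tensor): postprocess them directly.
    if isinstance(outputs, list):
        return remap_ids(outputs)
    # Otherwise assume a results object and postprocess its `sequences` attribute in place.
    outputs.sequences = remap_ids(outputs.sequences)
    return outputs


plain = postprocess([[0, 5, 7]])
wrapped = postprocess(GenerateResult(sequences=[[0, 1], [3, 0]]))
```

Both call styles go through the same postprocessing, so the caller gets back the same kind of object that the wrapped `generate()` returned.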

* Remove `pytest_options={"rA": None}` in CI (#25263)

fix

Co-authored-by: ydshieh <[email protected]>

* 🌐 [i18n-KO] Translated `perf_infer_gpu_many.md` to Korean (#24943)

* doc: ko: perf_infer_gpu_many.mdx

* feat: chatgpt draft

* fix: manual edits

* Update docs/source/ko/perf_infer_gpu_many.md

Co-authored-by: Jungnerd <[email protected]>

---------

Co-authored-by: Jungnerd <[email protected]>

* recommend DeepSpeed's Argument Parsing documentation (#25268)

* [MMS] Fix mms (#25267)

* [MMS] Fix mms

* [MMS] Fix mms

* fix mms loading

* Apply suggestions from code review

* make style

* Update tests/models/wav2vec2/test_modeling_wav2vec2.py

* CI with `num_hidden_layers=2` 🚀🚀🚀 (#25266)

* CI with layers=2

---------

Co-authored-by: ydshieh <[email protected]>

* CI with `pytest_num_workers=8` for torch/tf jobs (#25274)

n8

Co-authored-by: ydshieh <[email protected]>

* Docs: Update list of `report_to` logging integrations in docstring (#25281)

* Update list of logging integrations in docstring

Also update type hint

* Also add 'flyte' to report_to callback list

* Revert 'report_to' type hint update

Due to CLI breaking

* Update InstructBLIP & Align values after rescale update (#25209)

* Update InstructBLIP values
Note: the tests are not independent. Running a test on its own produces different logits than running all the integration tests together.

* Update test values after rescale update

* Remove left over commented out code

* Revert to previous rescaling logic

* Update rescale tests

* Docs: separate generate section (#25235)

Separate generate doc section

* Update bark doc (#25234)

* add mention to optimization in Bark docs

* add offload mention in docs

* Apply suggestions from code review

Co-authored-by: Sanchit Gandhi <[email protected]>

* Update bark docs.

* Update bark.md

---------

Co-authored-by: Sanchit Gandhi <[email protected]>

* add generate method to SpeechT5ForTextToSpeech (#25233)

* add generate method to SpeechT5ForTextToSpeech

* update speecht5forTTS docstrings

* Remove defaults to None in generate docstrings

Co-authored-by: Sylvain Gugger <[email protected]>

---------

Co-authored-by: Sylvain Gugger <[email protected]>

* Add timeout parameter to load_image function (#25184)

* Add timeout parameter to load_image function.

* Remove line.

* Reformat code

Co-authored-by: amyeroberts <[email protected]>

* Add parameter to docs.

---------

Co-authored-by: amyeroberts <[email protected]>

* [JAX] Bump min version (#25286)

* [JAX] Bump min version

* make fixup

* [small] llama2.md typo (#25295)

`groupe` -> `grouped`

* Fix typo: Roberta -> RoBERTa (#25302)

* Move usage of deprecated logging.warn to logging.warning (#25310)

The former spelling is deprecated and has been discouraged for a
while. The latter spelling seems to be more common in this project
anyway, so this change ought to be safe.

Fixes https://github.com/huggingface/transformers/issues/25283
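
For reference, a minimal sketch of the preferred spelling with the standard `logging` module. The logger name, the `ListHandler` class, and the message are made up for this example.

```python
import logging


class ListHandler(logging.Handler):
    """Collects records in memory so the example can inspect what was logged."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


logger = logging.getLogger("warning_spelling_example")
logger.setLevel(logging.WARNING)
logger.propagate = False  # keep the example output out of the root logger

handler = ListHandler()
logger.addHandler(handler)

# Preferred spelling: `warning`, not the deprecated `warn` alias.
logger.warning("disk offload is using %d MB", 512)
```

Both spellings log at the same level; the change only removes the deprecated alias.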

* Give more memory in test_disk_offload (#25315)

* Generate: get generation mode as an enum (#25292)

* Add offline mode for agents (#25226)

* Add offline mode for agents

* Disable second check too

* Deal with nested configs better in base class (#25237)

* Deal better with nested configs

* Fixes

* More fixes

* Fix last test

* Clean up existing configs

* Remove hack in MPT Config

* Update src/transformers/configuration_utils.py

Co-authored-by: Younes Belkada <[email protected]>

* Fix setting a nested config via dict in the kwargs

* Adapt common test

* Add test for nested config load with dict

---------

Co-authored-by: Younes Belkada <[email protected]>

* Document check copies (#25291)

* Document check copies better and add tests

* Include header in check for copies

* Manual fixes

* Try autofix

* Fixes

* Clean tests

* Finalize doc

* Remove debug print

* More fixes

* Make `bark` could have tiny model (#25290)

* temp

* update

* update

* update

* small dim

* small dim

* small dim

* fix

* update

* fix

* fix

* fix

* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <[email protected]>

* Document toc check and doctest check scripts (#25319)

* Clean doc toc check and make doctest list better

* Add to Makefile

* [Whisper] Better error message for outdated generation config (#25298)

* Remove jnp.DeviceArray since it is deprecated. (#24875)

* Remove jnp.DeviceArray since it is deprecated.

* Replace all instances of jnp.DeviceArray with jax.Array

* Update src/transformers/models/bert/modeling_flax_bert.py

---------

Co-authored-by: Sanchit Gandhi <[email protected]>

* add CFG for .generate() (#24654)

* 🌐 [i18n-KO] Translated `perf_infer_gpu_one.md` to Korean (#24978)

* docs: ko: perf_infer_gpu_one

* feat: chatgpt draft

* fix: manual edits

* fix: manual edits

* fix: resolve suggestions

Co-authored-by: Sohyun Sim <[email protected]>
Co-authored-by: TaeYupNoh <[email protected]>

* fix: resolve suggestions

* fix: resolve suggestions

Co-authored-by: Younes Belkada <[email protected]>

---------

Co-authored-by: Sohyun Sim <[email protected]>
Co-authored-by: TaeYupNoh <[email protected]>
Co-authored-by: Younes Belkada <[email protected]>

* Update TF pin in docker image (#25343)

fix

Co-authored-by: ydshieh <[email protected]>

* Generalize CFG to allow for positive prompts (#25339)

* Generalize CFG to allow for positive prompts

* Add documentation, fix the correct class

* Loosen output shape restrictions on GPT-style models (#25188)

* Loosen output shape restrictions on GPT-style models

* Use more self-explanatory variables

* Revert "Use more self-explanatory variables"

This reverts commit 5fd9ab39119558b7e750f61aa4a19014dccc5ed5.

* Allow `trust_remote_code` in example scripts (#25248)

* pytorch examples

* pytorch mim no trainer

* cookiecutter

* flax examples

* missed line in pytorch run_glue

* tensorflow examples

* tensorflow run_clip

* tensorflow run_mlm

* tensorflow run_ner

* tensorflow run_clm

* pytorch example from_configs

* pytorch no trainer examples

* Revert "tensorflow run_clip"

This reverts commit 261f86ac1f1c9e05dd3fd0291e1a1f8e573781d5.

* fix: duplicated argument

* Generate: remove Marian hack (#25294)

Remove Marian hack

* Fix more offload edge cases (#25342)

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <[email protected]>

* Migrate Trainer from `Repository` to `upload_folder` (#25095)

* First draft

* Deal with progress bars

* Update src/transformers/utils/hub.py

Co-authored-by: Lucain <[email protected]>

* Address review comments

* Forgot one

* Pin hf_hub

* Add argument for push all and fix tests

* Fix tests

* Address review comments

---------

Co-authored-by: Lucain <[email protected]>

* Adding more information in help parser on train_file and validation_file (#25324)

chorse: adding new doc on train and val

* [DOCS] Add `NoRepeatNGramLogitsProcessor` Example for `LogitsProcessor` class (#25186)

* Add Description And Example to Docstring

* make style corrections

* make style

* Doc Style Consistent With HF

* Apply make style

* Modify Docstring

* Edit Type in Docstring

* Feedback Incorporated

* Edit Docstring

* make style

* Post Review Changes

* Review Feedback Incorporated

* Styling

* Formatting

* make style

* pep8

* Docs: Added benchmarks for `torch.compile()` for vision models (#24748)

* added benchmarks for compile

* Update docs/source/en/perf_torch_compile.md

Co-authored-by: Steven Liu <[email protected]>

* Update docs/source/en/perf_torch_compile.md

Co-authored-by: Steven Liu <[email protected]>

* Update docs/source/en/perf_torch_compile.md

Co-authored-by: Steven Liu <[email protected]>

* Update docs/source/en/perf_torch_compile.md

Co-authored-by: Steven Liu <[email protected]>

* Update docs/source/en/perf_torch_compile.md

Co-authored-by: Steven Liu <[email protected]>

* Update docs/source/en/perf_torch_compile.md

Co-authored-by: Steven Liu <[email protected]>

* Update docs/source/en/perf_torch_compile.md

Co-authored-by: Steven Liu <[email protected]>

* Update docs/source/en/perf_torch_compile.md

Co-authored-by: Steven Liu <[email protected]>

* Update docs/source/en/perf_torch_compile.md

Co-authored-by: Sayak Paul <[email protected]>

* Update docs/source/en/perf_torch_compile.md

Co-authored-by: amyeroberts <[email protected]>

* Update docs/source/en/perf_torch_compile.md

Co-authored-by: amyeroberts <[email protected]>

* added more models

* added more models fr

* added visualizations

* minor fix

* Update docs/source/en/perf_torch_compile.md

Co-authored-by: Steven Liu <[email protected]>

* Update docs/source/en/perf_torch_compile.md

Co-authored-by: amyeroberts <[email protected]>

* Update docs/source/en/perf_torch_compile.md

Co-authored-by: amyeroberts <[email protected]>

* Added links to models and put charts side by side

* Added batch comparisons

* Added more comparisons

* Fix table

* Added link to wheel

* Update perf_torch_compile.md

---------

Co-authored-by: Steven Liu <[email protected]>
Co-authored-by: Sayak Paul <[email protected]>
Co-authored-by: amyeroberts <[email protected]>

* Add mask2former fp16 support (#25093)

* Add mask2former fp16 support

* Clear consistency/quality issues

* Fix consistency/quality (2)

* Add integration test for mask2former (fp16 case)

* Fix code quality

* Add integration test for maskformer (fp16 case)

* Add integration test for oneformer (fp16 case)

* Remove slow decorator from fp16 tests

* Fix lint

* Remove usage of full inference and value checks for fp16

* Temporarily comment slow for {mask, mask2, one}former

* Add fp16 support to oneformer

* Revert "Temporarily comment slow for {mask, mask2, one}former"

This reverts commit e5371edabd301cf56079def0421a0a87df307cb0.

* Remove dtype conversion noop

* [DOCS] Add descriptive docstring to MinNewTokensLength (#25196)

* Add descriptive docstring to MinNewTokensLength

It addresses https://github.com/huggingface/transformers/issues/24783

* Refine the differences between `min_length` and `min_new_tokens`

* Remove extra line

* Remove extra arguments in generate

* Add a missing space

Co-authored-by: amyeroberts <[email protected]>

* Run the linter

* Add clarification comments

---------

Co-authored-by: amyeroberts <[email protected]>

* Register ModelOutput subclasses as supported torch.utils._pytree nodes (#25358)

* Register ModelOutput subclasses as supported torch.utils._pytree nodes

Fixes #25357 where DDP with static_graph=True does not sync gradients when calling backward() over tensors contained in ModelOutput subclasses

* Add test for torch pytree ModelOutput serialization and deserialization

* Fix `test_model_parallelism` (#25359)

* fix

* fix

---------

Co-authored-by: ydshieh <[email protected]>

* Add warning for missing attention mask when pad tokens are detected (#25345)

* Add attention mask and pad token warning to many of the models

* Remove changes under examples/research_projects

These files are not maintained by HG.

* Skip the warning check during torch.fx or JIT tracing

* Switch ordering for the warning and input shape assignment

This ordering is a little cleaner for some of the cases.

* Add missing line break in one of the files

* [ASR Pipeline] Clarify return timestamps (#25344)

* [ASR Pipeline] Clarify return timestamps

* fix indentation

* fix ctc check

* fix ctc error message!

* fix test

* fix other test

* add new tests

* final comment

* MaskFormer, Mask2Former - replace einsum for tracing (#25297)

* Replace einsum with ops for tracing

* Fix comment

* Load state in else (#25318)

* Load else

* New approach

* Propagate

* Fix `token` in example template (#25351)

fix

Co-authored-by: ydshieh <[email protected]>

* Enable tests to run on third-party devices (#25327)

* enable unit tests to run on third-party devices other than CUDA and CPU.

* remove the modification that enabled ut on MPS

* control test on third-party device by env variable

* update

---------

Co-authored-by: statelesshz <[email protected]>

* 🌐 [i18n-KO] Translated `add_tensorflow_model.md` to Korean (#25017)

* docs: ko: add_tensorflow_model.md

* feat: chatgpt draft

* fix: manual edits

* fix: manual edits

* fix: resolve suggestions

* fix: manual edits

* Fix `torch_job` worker(s) crashing (#25374)

fix

Co-authored-by: ydshieh <[email protected]>

* Generate: add config-level validation (#25381)

* Fix missing usage of `token` (#25382)

* add missing tokens

* fix

---------

Co-authored-by: ydshieh <[email protected]>

* Use small config for `OneFormerModelTest.test_model_with_labels` (#25383)

fix

Co-authored-by: ydshieh <[email protected]>

* Add copied from for image processor methods (#25121)

* Add copied from statements for image processors

* Move out rescale and normalize to base image processor

* Remove rescale and normalize from vit (post rebase)

* Update docstrings and tidy up

* PR comments

* change version (#25387)

* [DOCS] Add example for `TopPLogitsWarper`  (#25361)

* [DOCS] Add example for `TopPLogitsWarper`

* fix typo

* address review feedback

* address review nits

* 🌐 [i18n-KO] Translated `perf_train_cpu_many.md` to Korean (#24923)

* docs: ko: perf_train_cpu_many.md

* feat: chatgpt draft

* fix: manual edits

* fix: resolve suggestions

Co-authored-by: Jungnerd <[email protected]>

---------

Co-authored-by: Jungnerd <[email protected]>

* 16059 - Add missing type hints for ASTModel (#25364)

* 16059 - Add missing type hints for ASTModel

* Add an additional type hint

Co-authored-by: Matt <[email protected]>

---------

Co-authored-by: Matt <[email protected]>

* rm useless condition since the previous condition contains it. (#25403)

* Fix path for dynamic module creation (#25402)

* YOLOS - Revert default return_pixel_mask value (#25404)

Revert default return_pixel_mask value

* Docs: introduction to generation with LLMs (#25240)

Co-authored-by: amyeroberts <[email protected]>
Co-authored-by: Steven Liu <[email protected]>

* Generate: length validation (#25384)

* Improve training args (#25401)

* enhanced tips for some training args

* make style

* Generate: generation config validation fixes in docs (#25405)

* 16059 - Add extra type hints for AltCLIPModel (#25399)

* Generate: lower severity of parameterization checks (#25407)

* VQA task guide (#25244)

* initial commit

* semi-finished task guide draft

* image link

* Apply suggestions from code review

Co-authored-by: Steven Liu <[email protected]>

* Update docs/source/en/tasks/visual_question_answering.md

Co-authored-by: NielsRogge <[email protected]>

* feedback addressed

* Apply suggestions from code review

Co-authored-by: amyeroberts <[email protected]>

* nits addressed

---------

Co-authored-by: Steven Liu <[email protected]>
Co-authored-by: NielsRogge <[email protected]>
Co-authored-by: amyeroberts <[email protected]>

* 🌐 [i18n-KO] Translated `add_new_model.md` to Korean (#24957)

* docs: ko: add_new_model.md

* feat: chatgpt draft

* fix: manual edits

* fix: change document title

* fix: edit with reviewers

Co-authored-by: Jungnerd <[email protected]>

* fix: edit with reviewers

Co-authored-by: Jungnerd <[email protected]>

* fix: edit with reviewers

Co-authored-by: Jungnerd <[email protected]>

* fix: edit with reviewers

Co-authored-by: Jungnerd <[email protected]>

* fix: edit with reviewers

Co-authored-by: SeongWooChoi <[email protected]>

* fix: edit with reviewers

Co-authored-by: SeongWooChoi <[email protected]>

* fix: edit with reviewers

Co-authored-by: SeongWooChoi <[email protected]>

* fix: edit with reviewers

Co-authored-by: Jungnerd <[email protected]>

* fix: add anchor to header

* Update docs/source/ko/add_new_model.md

Co-authored-by: 이서정 <[email protected]>

* Update docs/source/ko/add_new_model.md

Co-authored-by: 이서정 <[email protected]>

* Update docs/source/ko/add_new_model.md

Co-authored-by: 이서정 <[email protected]>

* fix: edit with reviews

* feat: edit toctree

---------

Co-authored-by: Wonhyeong Seo <[email protected]>
Co-authored-by: Jungnerd <[email protected]>
Co-authored-by: SeongWooChoi <[email protected]>
Co-authored-by: 이서정 <[email protected]>

* 🌐 [i18n-KO] Translated `model_summary.md` to Korean (#24625)

* docs: ko: model_summary.md

* feat: nmt and manual edit model_summary.mdx

* fix: resolve suggestions

Co-authored-by: Sohyun Sim <[email protected]>
Co-authored-by: Wonhyeong Seo <[email protected]>

* fix: resolve suggestions2

Co-authored-by: Sohyun Sim <[email protected]>

---------

Co-authored-by: Sohyun Sim <[email protected]>
Co-authored-by: Wonhyeong Seo <[email protected]>

* Update Bark generation configs and tests (#25409)

* update bark generation configs for more coherent parameter

* make style

* update bark hub repo

* aligned sample_beam output selection with beam_search (#25375)

* aligned sample_beam specs with beam_search

* pull origin main

* Revert "pull origin main"

This reverts commit 06d356f1137bb52272e120a03636598c44449cf3.

* update test_utils.py

* fix format

* remove comment

---------

Co-authored-by: Shogo Fujita <[email protected]>

* Enable passing number of channels when inferring data format (#25412)

* Bark: flexible generation config overload (#25414)

* [DINOv2] Update pooler output (#25392)

Update pooler output

* 🌐 [i18n-KO] Translated `philosophy.md` to Korean (#25010)

* docs: ko: philosophy.md

* feat: chatgpt draft

* fix: manual edits

* fix: resolve suggestions

* Doc checks (#25408)

* Document check_dummies

* Type hints and doc in other files

* Document check inits

* Add documentation to

* Address review comments

* Generation: strict generation config validation at save time (#25411)

* strict gen config save; Add tests

* add note that the warning will be an exception in v4.34

* [WavLM] Fix Arxiv link and authors (#25415)

* [WavLM] Fix Arxiv link and authors

* make style

* Generate: Load generation config when `device_map` is passed (#25413)

* Fix rendering for `torch.compile()` docs (#25432)

fix rendering

* Add `examples`  to tests to run when `setup.py` is modified (#25437)

fix

Co-authored-by: ydshieh <[email protected]>

* Fix issue with ratio evaluation steps and auto find batch size (#25436)

* Fully rebased solution

* 500

* docs: add LLaMA-Efficient-Tuning to awesome-transformers (#25441)

Co-authored-by: statelesshz <[email protected]>

* GPTQ integration (#25062)

* GPTQ integration

* Add tests for gptq

* support for more quantization model

* fix style

* typo

* fix method

* Update src/transformers/modeling_utils.py

Co-authored-by: Sylvain Gugger <[email protected]>

* add dataclass and fix quantization_method

* fix doc

* Update tests/quantization/gptq/test_gptq.py

Co-authored-by: Younes Belkada <[email protected]>

* Apply suggestions from code review

Co-authored-by: Younes Belkada <[email protected]>

* modify dataclass

* add gtpqconfig import

* fix typo

* fix tests

* remove dataset as req arg

* remove tokenizer import

* add offload cpu quantization test

* fix check dataset

* modify dockerfile

* protect trainer

* style

* test for config

* add more log

* overwrite torch_dtype

* draft doc

* modify quantization_config docstring

* fix class name in docstring

* Apply suggestions from code review

Co-authored-by: Y…
blbadger pushed a commit to blbadger/transformers that referenced this pull request Nov 8, 2023
* move to device

* update with cuda values

* fix fp16

* more rigorous
zachares added a commit to nplan-io/transformers that referenced this pull request Nov 17, 2023
…xt2graph) (#8)

* [`Llama2`]  Add support for Llama 2 (#24891)

* add llama

* add other readmes

* update padding id in readme

* add link to paper

* fix paths and tokenizer

* more nits

* styling

* fit operation in 2 lines when possible

* nits

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <[email protected]>

* add form

* update readme

* update readme, we don't have a default pad token

* update test and tokenization

* LLaMA instead of Llama

* nits

* add expected text

* add greedy output

* styling

* Update src/transformers/models/llama/modeling_llama.py

Co-authored-by: Sylvain Gugger <[email protected]>

* sequential device map

* skip relevant changes

---------

Co-authored-by: Sylvain Gugger <[email protected]>

* Disable ipex env var if false (#24885)

Disable ipex if in use

* Check for accelerate env var when doing CPU only (#24890)

Check for use-cpu

* Avoid some pipeline tasks to use `use_cache=True` (#24893)

* fix

* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <[email protected]>

* Update tested versions in READMEs (#24895)

* Update supported Python and PyTorch versions in readme

* Update Python, etc. versions in non-English readmes

These were more out of date than in the English readme. This
updates all the versions the readmes claim the repository is tested
with to the same versions stated in the English readme.

Those versions are current at least in the case of the Python and
PyTorch versions (and less out of date for the others).

* Propagate trailing whitespace fix to model list

This runs "make fix-copies". The only change is the removal of
whitespace. No actual information or wording is changed.

* Update tested TensorFlow to 2.6 in all readmes

Per pinning in setup.py

Unlike Python and PyTorch, the minimum supported TensorFlow version
has not very recently changed, but old versions were listed in all
READMEs.

* Fix `test_model_parallelism` for `FalconModel` (#24914)

* fix

* fix

---------

Co-authored-by: ydshieh <[email protected]>

* Fixed issue where ACCELERATE_USE_CPU="False" results in bool(True) (#24907)

- This results in cpu mode on Apple Silicon mps
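
The underlying pitfall is that any non-empty string is truthy in Python, so `bool("False")` evaluates to `True`. A minimal sketch of parsing such a flag explicitly follows; the `env_flag` helper and its accepted spellings are assumptions made for this example, not the function used in the fix.

```python
import os

# Accepted spellings are an assumption for this sketch.
_TRUE_VALUES = {"1", "true", "yes", "on", "y", "t"}
_FALSE_VALUES = {"0", "false", "no", "off", "n", "f", ""}


def env_flag(name, default=False, environ=os.environ):
    """Read a boolean flag from an environment-like mapping by parsing its text."""
    raw = environ.get(name)
    if raw is None:
        return default
    value = raw.strip().lower()
    if value in _TRUE_VALUES:
        return True
    if value in _FALSE_VALUES:
        return False
    raise ValueError(f"Cannot interpret {name}={raw!r} as a boolean")


# Naive conversion: the string "False" is non-empty, hence truthy.
naive = bool("False")

# Explicit parsing gives the intended result.
parsed = env_flag("ACCELERATE_USE_CPU", environ={"ACCELERATE_USE_CPU": "False"})
```

Passing the mapping as a parameter keeps the helper easy to test without modifying the real process environment.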

* fix typo in BARK_PRETRAINED_MODEL_ARCHIVE_LIST (#24902)

fix typo in BARK_PRETRAINED_MODEL_ARCHIVE_LIST

suno/barh should be suno/bark

* Fix minor llama2.md model doc typos (#24909)

Update llama2.md

 Fix typos in the llama2 model doc

* [`Llama2`] replace `self.pretraining_tp` with `self.config.pretraining_tp` (#24906)

* add possibility to disable TP

* fixup

* adapt from offline discussions

* [doc] `image_processing_vilt.py` wrong default documented (#24931)

[doc] image_processing_vilt.py wrong default

* 🌐 [i18n-KO] Translated `tasks/document_question_answering.md` to Korean (#24588)

* docs: ko: `document_question_answering.md`

* fix: resolve suggestions

Co-authored-by: Sohyun Sim <[email protected]>

* fix: resolve suggestions

Co-authored-by: Hyeonseo Yun <[email protected]>
Co-authored-by: Sohyun Sim <[email protected]>

---------

Co-authored-by: Sohyun Sim <[email protected]>
Co-authored-by: Hyeonseo Yun <[email protected]>

* Add multi-label text classification support to pytorch example (#24770)

* Add text classification example

* set the problem type and finetuning task

* ruff reformatted

* fix bug for unsetting label_to_id for regression

* update README.md

* fixed finetuning task

* update comment

* check if label exists in feature before removing

* add useful logging

* Deprecate unused OpenLlama architecture (#24922)

* Resolve typo in check_repo.py

* Specify encoding when opening modeling files

* Deprecate the OpenLlama architecture

* Add disclaimer pointing to Llama

I'm open to different wordings here

* Match the capitalisation of LLaMA

* replace no_cuda with use_cpu in test_pytorch_examples (#24944)

* replace no_cuda with use_cpu in test_pytorch_examples

* remove code that is never used

* fix style

* Generate: sequence bias can handle same terminations (#24822)

* Bump pygments from 2.11.2 to 2.15.0 in /examples/research_projects/decision_transformer (#24949)

Bump pygments in /examples/research_projects/decision_transformer

Bumps [pygments](https://github.com/pygments/pygments) from 2.11.2 to 2.15.0.
- [Release notes](https://github.com/pygments/pygments/releases)
- [Changelog](https://github.com/pygments/pygments/blob/master/CHANGES)
- [Commits](https://github.com/pygments/pygments/compare/2.11.2...2.15.0)

---
updated-dependencies:
- dependency-name: pygments
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Update processing_vision_text_dual_encoder.py (#24950)

Fixing small typo: kwrags -> kwargs

* Fix `main_input_name` in `src/transformers/keras_callbacks.py` (#24916)

fix

Co-authored-by: ydshieh <[email protected]>

* [DOCS] Example for `LogitsProcessor` class (#24848)

* make docs

* fixup

* resolved

* remove debugs

* Revert "fixup"

This reverts commit 5e0f636aae0bf8707bc8bdaa6a9427fbf66834ed.

* prev (ignore)

* fixup broke some files

* remove files

* reverting modeling_reformer

* lang fix

* fix type annotations for arguments in training_args (#24550)

* testing

* example script

* fix typehinting

* some tests

* make test

* optional update

* Union of arguments

* does this fix the issue

* remove reports

* set default to False

* documentation change

* None support

* does not need None

* Fix typing annotations for FSDP and DeepSpeed in TrainingArguments (#24549)

* Fix typing annotations for FSDP and DeepSpeed in TrainingArguments

* Change dict to Dict

* Revert "Fix typing annotations for FSDP and DeepSpeed in TrainingArguments" (#24574)

Revert "Fix typing annotations for FSDP and DeepSpeed in TrainingArguments (#24549)"

This reverts commit c5e29d4381d4b9739e6cb427adbca87fbb43a3ad.

* Fix typing annotations for FSDP and DeepSpeed in TrainingArguments (#24549)

* Fix typing annotations for FSDP and DeepSpeed in TrainingArguments

* Change dict to Dict

* merge

* hacky fix

* fixup

---------

Co-authored-by: Max Ryabinin <[email protected]>
Co-authored-by: Sylvain Gugger <[email protected]>

* Bump aiohttp from 3.8.1 to 3.8.5 in /examples/research_projects/decision_transformer (#24954)

Bump aiohttp in /examples/research_projects/decision_transformer

Bumps [aiohttp](https://github.com/aio-libs/aiohttp) from 3.8.1 to 3.8.5.
- [Release notes](https://github.com/aio-libs/aiohttp/releases)
- [Changelog](https://github.com/aio-libs/aiohttp/blob/v3.8.5/CHANGES.rst)
- [Commits](https://github.com/aio-libs/aiohttp/compare/v3.8.1...v3.8.5)

---
updated-dependencies:
- dependency-name: aiohttp
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* [`RWKV`] Add Gradient Checkpointing support for RWKV (#24955)

add GC support for RWKV

* Change logic for logging in the examples (#24956)

Change logic

* Contrastive Search peak memory reduction (#24120)

Co-authored-by: Joao Gante <[email protected]>

* Fallback for missing attribute `Parameter.ds_numel` (#24942)

* [trainer] fallback for deepspeed param count

* [trainer] more readable numel count

* fix fsdp checkpointing issues (#24926)

* fix fsdp load

* Update trainer.py

* remove saving duplicate state_dict

* fix: cast input pixels to appropriate dtype for image_to_text pipelines (#24947)

* fix: cast input pixels to appropriate dtype for image_to_text tasks

* fix: add casting to pixel inputs of additional models after running copy checks

* 🌐 [i18n-KO] Fixed Korean and English `quicktour.md` (#24664)

* fix: english/korean quicktour.md

* fix: resolve suggestions

Co-authored-by: Hyeonseo Yun <[email protected]>
Co-authored-by: Sohyun Sim <[email protected]>
Co-authored-by: Kihoon Son <[email protected]>

* fix: follow glossary

* 파인튜닝 -> 미세조정 (use the glossary term for "fine-tuning")

---------

Co-authored-by: Hyeonseo Yun <[email protected]>
Co-authored-by: Sohyun Sim <[email protected]>
Co-authored-by: Kihoon Son <[email protected]>

* fsdp fixes and enhancements (#24980)

* fix fsdp prepare to remove the warnings and fix excess memory usage

* Update training_args.py

* parity for FSDP+XLA

* Update trainer.py

* Fix missing spaces in system prompt of Llama2 tokenizer (#24930)

* Update tokenization_llama.py

* Update tokenization_llama_fast.py

* Update src/transformers/models/llama/tokenization_llama_fast.py

Co-authored-by: Arthur <[email protected]>

* Update src/transformers/models/llama/tokenization_llama.py

Co-authored-by: Arthur <[email protected]>

* Update src/transformers/models/llama/tokenization_llama.py

Co-authored-by: Arthur <[email protected]>

* Update src/transformers/models/llama/tokenization_llama_fast.py

Co-authored-by: Arthur <[email protected]>

---------

Co-authored-by: Arthur <[email protected]>

* [`LlamaConfig`] Nit: pad token should be None by default (#24958)

* pad token should be None by default

* fix tests

* nits

* Remove tokenizers from the doc table (#24963)

* Avoid importing all models when instantiating a pipeline (#24960)

* Avoid importing all models when instantiating a pipeline

* Remove sums that don't work

* Fix type annotation for deepspeed training arg (#24988)

* Use main_input_name for include_inputs_for_metrics (#24993)

* Fix `llama` tokenization doctest (#24990)

fix

Co-authored-by: ydshieh <[email protected]>

* [`bnb`] Add simple check for bnb import (#24995)

add simple check for bnb

* [`Llama`] remove persistent  `inv_freq` tensor (#24998)

remove persistent tensor

* improve from_pretrained for zero3 multi gpus mode (#24964)

* improve from_pretrained for zero3 multi gpus mode

* Add check if torch.distributed.is_initialized

* Revert torch.distributed

---------

Co-authored-by: Stas Bekman <[email protected]>

* Move template doc file to md (#25004)

* 🌐 [i18n-KO] Updated Korean `serialization.md` (#24686)

fix: update ko/serialization.md

* chatgpt draft

* [check_config_docstrings.py] improve diagnostics (#25012)

* [check_config_docstrings.py] improve diagnostics

* style

* rephrase

* fix

* [`logging.py`] set default `stderr`  path if `None` (#25033)

set default logger

* fix(integrations): store serialized `TrainingArgs` to `wandb.config` without sanitization. (#25035)

fix: store training args to wandb config without sanitization.

Allows resuming runs by reusing the wandb config.

Co-authored-by: Bharat Ramanathan <[email protected]>

* [docs] Performance docs tidy up, part 1  (#23963)

* first pass at the single gpu doc

* overview: improved clarity and navigation

* WIP

* updated intro and deepspeed sections

* improved torch.compile section

* more improvements

* minor improvements

* make style

* Apply suggestions from code review

Co-authored-by: Steven Liu <[email protected]>

* feedback addressed

* mdx -> md

* link fix

* feedback addressed

---------

Co-authored-by: Steven Liu <[email protected]>

* Support GatedRepoError + use raise from (#25034)

* Support GatedRepoError + use raise from

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <[email protected]>

* Use token instead of use_auth_token in error messages

---------

Co-authored-by: Sylvain Gugger <[email protected]>

* Better handling missing SYS in llama conversation tokenizer (#24997)

* Better handling missing SYS in llama conversation tokenizer

The existing code failed to add SYS if the conversation had history
without SYS, yet it still modified the passed conversation along the way.

Rearrange the code so modifications to the conversation object are taken
into account for token id generation.

* Fix formatting with black

* Avoid one-liners

* Also fix fast tokenizer

* Drop List decl

* 🌐[i18n-KO] Translated performance.md to Korean (#24883)

* docs: ko: performance.md

* feat: chatgpt draft

* fix: manual edits

* fix: manual edits

* Update docs/source/ko/performance.md

Co-authored-by: Kihoon Son <[email protected]>

* Update docs/source/ko/performance.md

---------

Co-authored-by: Kihoon Son <[email protected]>

* 🌐 [i18n-KO] Translated `testing.md` to Korean (#24900)

* docs: ko: testing.md

* feat: draft

* fix: manual edits

* fix: edit ko/_toctree.yml

* fix: manual edits

* fix: manual edits

* fix: manual edits

* fix: manual edits

* fix: resolve suggestions

* Add dispatch_batches to training arguments (#25038)

* Dispatch batches

* Copy items

* Fix typo in LlamaTokenizerFast docstring example (#25018)

* Make more test models smaller (#25005)

* Make more test models tiny

* Make more test models tiny

* More models

* More models

* Comment again print statement

* Pvt model (#24720)

* pull and push updates

* add docs

* fix modeling

* Add and run test

* make copies

* add task

* fix tests and fix small issues

* Checks on a Pull Request

* fix docs

* add desc pvt.md

* compute_loss in trainer failing to label shift for PEFT model when label smoothing enabled. (#25044)

* added PeftModelForCausalLM to MODEL_FOR_CAUSAL_LM_MAPPING_NAMES dict

* check for PEFT model in compute_loss section

---------

Co-authored-by: Nathan Brake <[email protected]>

* [`8bit`] Fix 8bit corner case with Blip2 8bit (#25047)

fix 8bit corner case with Blip2 8bit

* 🌐 [i18n-KO] Translated `perf_train_cpu.md` to Korean (#24911)

* docs: ko: perf_train_cpu.md

* feat: chatgpt draft

* fix: manual edits

* fix: resolve suggestions

* fix: manual edits

Co-authored-by: Haewon Kim <[email protected]>

---------

Co-authored-by: Haewon Kim <[email protected]>

* Better error message when signal is not supported on OS (#25049)

* Better error message when signal is not supported on OS

* Address review comments

* [`RWKV`] Add note in doc on `RwkvStoppingCriteria` (#25055)

* Add note in doc on `RwkvStoppingCriteria`

* give some breathing space to the code

* Generate - add beam indices output in constrained beam search (#25042)

* [Docs] fix rope_scaling doc string (#25072)

fix rope_scaling doc string

* 🌐 [i18n-KO] Translated `<tf_xla>.md` to Korean (#24904)

* docs: ko: tf_xla.md

* feat: chatgpt draft

* fix: manual edits

* fix: manual edits

* fix: manual edits

* fix: resolve suggestions

* 🌐 [i18n-KO] Translated `perf_hardware.md` to Korean (#24966)

* docs: ko: perf_hardware.md

* feat: nmt draft

* fix: manual edits

* fix: resolve suggestions

Co-authored-by: Hyeonseo Yun <[email protected]>

* fix: resolve suggestions

Co-authored-by: Hyeonseo Yun <[email protected]>

* fix: resolve suggestions

Co-authored-by: Hyeonseo Yun <[email protected]>

* fix: resolve suggestions

Co-authored-by: Hyeonseo Yun <[email protected]>

* fix: resolve suggestions

Co-authored-by: Hyeonseo Yun <[email protected]>

* fix: resolve suggestions

Co-authored-by: Hyeonseo Yun <[email protected]>

* fix: resolve suggestions

Co-authored-by: Hyeonseo Yun <[email protected]>

* fix: resolve suggestions

Co-authored-by: Haewon Kim <[email protected]>

* Fix: manual edits

* fix: manual edits

* fix: manual edits

* fix: manual edits

* fix: fix rendering error of perf_hardware.md

---------

Co-authored-by: Hyeonseo Yun <[email protected]>
Co-authored-by: Haewon Kim <[email protected]>

* Fix last models for common tests that are too big. (#25058)

* Fix last models for common tests that are too big.

* Remove print statement

* fix: add TOC anchor link (#25066)

* Set `TF32` flag for PyTorch cuDNN backend (#25075)

* Fix broken link in README_hd.md (#25067)

Update README_hd.md

* replace `per_gpu_eval_batch_size` with `per_device_eval_batch_size` in readme of multiple-choice task (#25078)

replace `per_gpu_eval_batch_size` with `per_device_eval_batch_size`
in readme of multiple-choice

* [`generate`]  Only warn users if the `generation_config`'s `max_length` is set to the default value (#25030)

* check max length is default

* nit

* update warning: no-longer deprecate

* comment in the configuration_utils in case max length's default gets changed in the future

* 🌐 [i18n-KO] Translated `hpo_train.md` to Korean (#24968)

* docs: ko: hpo_train.mdx

* feat: chatgpt draft

* fix: manual edits

* fix: resolve suggestions

* Fix: repeat per sample for SAM image embeddings (#25074)

Repeat per sample for SAM image embeddings

* [`MPT`] Add MosaicML's `MPT` model to transformers (#24629)

* draft add new model like

* some cleaning of the config

* nits

* add nested configs

* nits

* update

* update

* added layer norms + triton kernels

* consider only LPLayerNorm for now.

* update

* all keys match.

* Update

* fixing nits here and there

* working forward pass.

* removed einops dependency

* nits

* format

* add alibi

* byebye head mask

* refactor attention

* nits.

* format

* fix nits.

* nuke ande updates

* nuke tokenizer test

* don't reshape query with kv heads

* added a bit of documentation.

* remove unneeded things

* nuke more stuff

* nit

* logits match - same generations

* rm unneeded methods

* 1 remaining failing CI test

* nit

* fix nits

* fix docs

* fix docs

* rm tokenizer

* fixup

* fixup

* fixup and fix tests

* fixed configuration object.

* use correct activation

* few minor fixes

* clarify docs a bit

* logits match to 1e-12

* skip and unskip a test

* added some slow tests.

* fix readme

* add more details

* Update docs/source/en/model_doc/mpt.md

Co-authored-by: Arthur <[email protected]>

* Apply suggestions from code review

Co-authored-by: Arthur <[email protected]>

* fix configuration issues

* more fixes in config

* added more models

* Apply suggestions from code review

Co-authored-by: Arthur <[email protected]>

* remove unneeded position ids

* fix some  comments

* Apply suggestions from code review

Co-authored-by: Arthur <[email protected]>

* revert suggestion

* mpt alibi + added batched generation

* Update src/transformers/models/mpt/__init__.py

Co-authored-by: Arthur <[email protected]>

* remove init config

* Update src/transformers/models/mpt/configuration_mpt.py

Co-authored-by: Arthur <[email protected]>

* fix nit

* add another slow test

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <[email protected]>

* fits in one line

* some refactor because make fixup doesn't pass

* add ft notebook

* update md

* correct doc path

---------

Co-authored-by: younesbelkada <[email protected]>
Co-authored-by: Younes Belkada <[email protected]>
Co-authored-by: Sylvain Gugger <[email protected]>

* [DOCS] add example NoBadWordsLogitsProcessor (#25046)

* add example NoBadWordsLogitsProcessor

* fix L764 & L767

* make style

* 🌐 [i18n-KO] Translated `perf_infer_cpu.md` to Korean (#24920)

* docs: ko: perf_infer_cpu.md

* feat: chatgpt draft

* fix: manual edits

* Update docs/source/ko/_toctree.yml

* Update docs/source/ko/perf_infer_cpu.md

* Update docs/source/ko/perf_infer_cpu.md

This part was bothering me too. I'll apply the change!

Co-authored-by: Wonhyeong Seo <[email protected]>

* Update docs/source/ko/perf_infer_cpu.md

I agree! I was too tied to the original!

Co-authored-by: Wonhyeong Seo <[email protected]>

* Update docs/source/ko/perf_infer_cpu.md

As you said, I think I was too fixated on the original text.

Co-authored-by: Wonhyeong Seo <[email protected]>

* Update docs/source/ko/perf_infer_cpu.md

Thank you for the better word choice!

Co-authored-by: Wonhyeong Seo <[email protected]>

* Update docs/source/ko/perf_infer_cpu.md

I couldn't come up with the term '주기' ("cycle") at the time...

Co-authored-by: Wonhyeong Seo <[email protected]>

* Update docs/source/ko/perf_infer_cpu.md

The context reads more naturally now!

Co-authored-by: Wonhyeong Seo <[email protected]>

* Update docs/source/ko/perf_infer_cpu.md

There's no need to be tied to the original format!

Co-authored-by: Wonhyeong Seo <[email protected]>

* Update docs/source/ko/perf_infer_cpu.md

Co-authored-by: Wonhyeong Seo <[email protected]>

---------

Co-authored-by: Wonhyeong Seo <[email protected]>

* Allow generic composite models to pass more kwargs (#24927)

* fix

* Update src/transformers/generation/utils.py

Co-authored-by: Joao Gante <[email protected]>

* update

---------

Co-authored-by: ydshieh <[email protected]>
Co-authored-by: Joao Gante <[email protected]>

* [ `ForSequenceClassification`] Support `left` padding (#24979)

* support left padding

* nit

* Update src/transformers/models/gpt_neox/modeling_gpt_neox.py

* Update src/transformers/models/gpt_neox/modeling_gpt_neox.py

* [`TF`]  Also apply patch to support left padding (#25085)

* tf versions

* apply changes to other models

* 3 models slipped through the cracks

* Edit err message and comment in `test_model_is_small` (#25087)

* Edit err message and comment in

* put back 80M comment

* [ `PreTrainedTokenizerFast`] Keep properties from fast tokenizer (#25053)

* draft solution

* use `setdefault`

* nits

* add tests and fix truncation issue

* fix test

* test passes locally

* quality

* updates

* update tests

* Hotfix for failing `MusicgenForConditionalGeneration` tests (#25091)

Co-authored-by: ydshieh <[email protected]>

* [`T5`, `MT5`, `UMT5`] Add [T5, MT5, UMT5]ForSequenceClassification (#24726)

* Initial addition of t5forsequenceclassification

* Adding imports and adding tests

* Formatting

* Running make fix-copies

* Adding mt5forseq

* Formatting

* run make fix-copies

* Adding to docs

* Add model_parallel

* Fix bug

* Fix

* Remove TODO

* Fixing tests for T5ForSequenceClassification

* Undo changes to dependency_versions_table.py

* Change classification head to work with T5Config directly

* Change seq length to let tests pass

* PR comments for formatting

* Formatting

* Initial addition of UMT5ForSequenceClassification

* Adding to inits and formatting

* run make fix-copies

* Add doc for UMT5ForSeqClass

* Update UMT5 config

* Fix docs

* Skip torch fx test for SequenceClassification

* Formatting

* Add skip to UMT5 tests as well

* Fix umt5 tests

* Running make fix-copies

* PR comments

* Fix for change to sentence_representation

* Rename seq_len to hidden_size since that's what it is

* Use base_model to follow format of the rest of the library

* Update docs

* Extract the decoder_input_ids changes and make one liner

* Make one-liner

* Fix doctest (#25031)

fix

Co-authored-by: ydshieh <[email protected]>

* Bump certifi from 2022.12.7 to 2023.7.22 in /examples/research_projects/lxmert (#25096)

Bump certifi in /examples/research_projects/lxmert

Bumps [certifi](https://github.com/certifi/python-certifi) from 2022.12.7 to 2023.7.22.
- [Commits](https://github.com/certifi/python-certifi/compare/2022.12.07...2023.07.22)

---
updated-dependencies:
- dependency-name: certifi
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump certifi from 2022.12.7 to 2023.7.22 in /examples/research_projects/decision_transformer (#25098)

Bump certifi in /examples/research_projects/decision_transformer

Bumps [certifi](https://github.com/certifi/python-certifi) from 2022.12.7 to 2023.7.22.
- [Commits](https://github.com/certifi/python-certifi/compare/2022.12.07...2023.07.22)

---
updated-dependencies:
- dependency-name: certifi
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump certifi from 2022.12.7 to 2023.7.22 in /examples/research_projects/visual_bert (#25097)

Bump certifi in /examples/research_projects/visual_bert

Bumps [certifi](https://github.com/certifi/python-certifi) from 2022.12.7 to 2023.7.22.
- [Commits](https://github.com/certifi/python-certifi/compare/2022.12.07...2023.07.22)

---
updated-dependencies:
- dependency-name: certifi
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* fix tied_params for meta tensor (#25101)

* fix tied_params for meta tensor

* remove duplicate

* documentation for llama2 models (#25102)

* fix documentation

* changes

* 🌐[i18n-KO] Translated pipeline_webserver.md to Korean (#24828)

* translated pipeline_webserver.md

Co-Authored-By: Hyeonseo Yun <[email protected]>
Co-Authored-By: Wonhyeong Seo <[email protected]>
Co-Authored-By: Sohyun Sim <[email protected]>
Co-Authored-By: Gabriel Yang <[email protected]>
Co-Authored-By: Nayeon Han <[email protected]>
Co-Authored-By: Jungnerd <[email protected]>

* Update pipeline_webserver.md

* Apply suggestions from code review

Co-authored-by: Hyeonseo Yun <[email protected]>
Co-authored-by: Sangam Lee <[email protected]>
Co-authored-by: Kim haewon <[email protected]>

---------

Co-authored-by: Hyeonseo Yun <[email protected]>
Co-authored-by: Wonhyeong Seo <[email protected]>
Co-authored-by: Sohyun Sim <[email protected]>
Co-authored-by: Gabriel Yang <[email protected]>
Co-authored-by: Nayeon Han <[email protected]>
Co-authored-by: Jungnerd <[email protected]>
Co-authored-by: Sangam Lee <[email protected]>
Co-authored-by: Kim haewon <[email protected]>

* Fix `PvtModelIntegrationTest::test_inference_fp16` (#25106)

update

Co-authored-by: ydshieh <[email protected]>

* Add descriptive docstring to TemperatureLogitsWarper (#24892)

* Add descriptive docstring to TemperatureLogitsWarper

It addresses https://github.com/huggingface/transformers/issues/24783

* Remove niche features

Co-authored-by: Joao Gante <[email protected]>

* Commit suggestion

Co-authored-by: Joao Gante <[email protected]>

* Refactor the examples to simpler ones

* Add a missing comma

Co-authored-by: Joao Gante <[email protected]>

* Make args description more compact

Co-authored-by: Joao Gante <[email protected]>

* Remove extra text after making description more compact

Co-authored-by: Joao Gante <[email protected]>

* Fix linter

---------

Co-authored-by: Joao Gante <[email protected]>

* fix "UserWarning: Creating a tensor from a list of numpy.ndarrays is … (#24772)

fix "UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor."

Co-authored-by: 刘长伟 <[email protected]>

* update `use_auth_token` -> `token` (#25083)

* update

---------

Co-authored-by: ydshieh <[email protected]>

* Fix past CI after #24334 (#25113)

update

Co-authored-by: ydshieh <[email protected]>

* Move common image processing methods to BaseImageProcessor (#25089)

Move out common methods

* Fix ViT docstring regarding default dropout values. (#25118)

Fix docstring for dropout.

* MaskFormer - enable return_dict in order to compile (#25052)

* Enable return_dict in order to compile

* Update tests

* Move center_crop to BaseImageProcessor (#25122)

* fix deepspeed load best model at end when the model gets sharded (#25057)

* fix delete all checkpoints when save_total_limit is set to 1 (#25136)

* [`T5/LlamaTokenizer`] default legacy to `None` to not always warn (#25131)

default legacy to None

* Clarify 4/8 bit loading log message (#25134)

* clarify 4/8 bit loading log message

* make style

* 🚨🚨🚨Change default from `adamw_hf` to `adamw_torch` 🚨🚨🚨 (#25109)

* Change defaults

* Sylvain's comments

* [`MptConfig`] support from pretrained args (#25116)

* support from pretrained args

* draft addition of tests

* update test

* use parent assert true

* Update src/transformers/models/mpt/configuration_mpt.py

Co-authored-by: Younes Belkada <[email protected]>

---------

Co-authored-by: Younes Belkada <[email protected]>

* Add offload support to Bark (#25037)

* initial Bark offload proposal

* use hooks instead of manually offloading

* add test of bark offload to cpu feature

* Apply nit suggestions from code review

Co-authored-by: Sylvain Gugger <[email protected]>

* Update docstrings of offload

Co-authored-by: Sanchit Gandhi <[email protected]>

* remove unnecessary set_seed in Bark tests

---------

Co-authored-by: Sylvain Gugger <[email protected]>
Co-authored-by: Sanchit Gandhi <[email protected]>

* More `token` things (#25146)

* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <[email protected]>

* Add bloom flax (#25094)

* First commit

* step 1 working

* add alibi

* placeholder for `scan`

* add matrix mult alibi

* beta scaling factor for bmm

* working v1 - simple forward pass

* move layer_number from attribute to arg in call

* partial functioning scan

* hacky working scan

* add more modifs

* add test

* update scan for new kwarg order

* fix position_ids problem

* fix bug in attention layer

* small fix

- do the alibi broadcasting only once

* prelim refactor

* finish refactor

* alibi shifting

* incorporate dropout_add to attention module

* make style

* make padding work again

* update

* remove bogus file

* up

* get generation to work

* clean code a bit

* added small tests

* adding alibi test

* make CI tests pass:

- change init weight
- add correct tuple for output attention
- add scan test
- make CI tests work

* fix few nits

* fix nit onnx

* fix onnx nit

* add missing dtype args to nn.Modules

* remove debugging statements

* fix scan generate

* Update modeling_flax_bloom.py

* Update test_modeling_flax_bloom.py

* Update test_modeling_flax_bloom.py

* Update test_modeling_flax_bloom.py

* fix small test issue + make style

* clean up

* Update tests/models/bloom/test_modeling_flax_bloom.py

Co-authored-by: Sanchit Gandhi <[email protected]>

* fix function name

* small fix test

* forward contrib credits from PR17761

* Fix failing test

* fix small typo documentation

* fix non passing test

- remove device from build alibi

* refactor call

- refactor `FlaxBloomBlockCollection` module

* make style

* upcast to fp32

* cleaner way to upcast

* remove unused args

* remove layer number

* fix scan test

* make style

* fix i4 casting

* fix slow test

* Update src/transformers/models/bloom/modeling_flax_bloom.py

Co-authored-by: Sanchit Gandhi <[email protected]>

* remove `layer_past`

* refactor a bit

* fix `scan` slow test

* remove useless import

* major changes

- remove unused code
- refactor a bit
- revert import `torch`

* major refactoring

- change build alibi

* remove scan

* fix tests

* make style

* clean-up alibi

* add integration tests

* up

* fix batch norm conversion

* style

* style

* update pt-fx cross tests

* update copyright

* Update src/transformers/modeling_flax_pytorch_utils.py

Co-authored-by: Sylvain Gugger <[email protected]>

* per-weight check

* style

* line formats

---------

Co-authored-by: younesbelkada <[email protected]>
Co-authored-by: Patrick von Platen <[email protected]>
Co-authored-by: Younes Belkada <[email protected]>
Co-authored-by: haileyschoelkopf <[email protected]>
Co-authored-by: Sylvain Gugger <[email protected]>

* Add new model in doc table of content (#25148)

* Fix `.push_to_hub` and cleanup `get_full_repo_name` usage (#25120)

* Fix .push_to_hub and cleanup get_full_repo_name usage

* Do not rely on Python bool conversion magic

* request changes

* Add test when downloading from gated repo (#25039)

* override .cuda() to check if model is already quantized (#25166)

* Represent query_length in a different way to solve jit issue (#25164)

Fix jit trace

* make run_generation more generic for other devices (#25133)

* make run_generation more generic for other devices

* use Accelerate to support any device type it supports.

* make style

* fix error usage of accelerator.prepare_model

* use `PartialState` to make sure everything is running on the right device

---------

Co-authored-by: statelesshz <[email protected]>

* added compiled model support for inference (#25124)

* added compiled model support for inference

* linter

* Fix tests

* linter

* linter

* remove inference mode from pipelines

* Linter

---------

Co-authored-by: amarkov <[email protected]>

* Update `use_auth_token` -> `token` in example scripts (#25167)

* pytorch examples

* tensorflow examples

* flax examples

---------

Co-authored-by: ydshieh <[email protected]>

* [`Mpt`] Fix mpt slow test (#25170)

fix mpt slow test

* [`InstructBlip`] Fix instructblip slow test (#25171)

* fix instruct blip slow test

* Update tests/models/instructblip/test_modeling_instructblip.py

* 🌐 [i18n-KO] Translated `transformers_agents.md` to Korean (#24881)

* docs: ko: transformers_agents.md

* docs: ko: transformers_agents.md

* feat: deepl draft

* fix: manual edits

* fix: resolve suggestions

Co-authored-by: Juntae <[email protected]>
Co-authored-by: Injin Paek <[email protected]>

---------

Co-authored-by: Juntae <[email protected]>
Co-authored-by: Injin Paek <[email protected]>

* Fix beam search to sample at least 1 non eos token (#25103) (#25115)

* [MusicGen] Fix integration tests (#25169)

* move to device

* update with cuda values

* fix fp16

* more rigorous

* 🚨🚨🚨  Fix rescale ViVit Efficientnet (#25174)

* Fix rescaling bug

* Add tests

* Update integration tests

* Fix up

* Update src/transformers/image_transforms.py

* Update test - new possible order in list

* Musicgen: CFG is manually added  (#25173)

* Better error message in `_prepare_output_docstrings` (#25202)

fix

Co-authored-by: ydshieh <[email protected]>

* [`PreTrainedModel`] Wrap `cuda` and `to` method correctly (#25206)

wrap `cuda` and `to` method correctly

* Fix `all_model_classes` in `FlaxBloomGenerationTest` (#25211)

fix

Co-authored-by: ydshieh <[email protected]>

* [quantization.md] fix (#25190)

Update quantization.md

* [`pipeline`] revisit device check for pipeline (#25207)

* revisit device check for pipeline

* let's raise an error.

* Update tiny model info. and pipeline testing (#25213)

* update tiny_model_summary.json

* update

* update

* update

---------

Co-authored-by: ydshieh <[email protected]>

* Fix docker image build failure (#25214)

fix

Co-authored-by: ydshieh <[email protected]>

* make build_mpt_alibi_tensor a method of MptModel so that deepspeed co… (#25193)

make build_mpt_alibi_tensor a method of MptModel so that deepspeed could override it to make autoTP work

Signed-off-by: Wang, Yi A <[email protected]>

* [`Pix2Struct`] Fix pix2struct cross attention (#25200)

* fix pix2struct cross attention

* fix torchscript slow test

* [`Docs`/`quantization`] Clearer explanation on how things work under the hood. + remove outdated info (#25216)

* clearer explanation on how things work under the hood.

* Update docs/source/en/main_classes/quantization.md

Co-authored-by: Steven Liu <[email protected]>

* Update docs/source/en/main_classes/quantization.md

Co-authored-by: amyeroberts <[email protected]>

* add `load_in_4bit` in `from_pretrained`

---------

Co-authored-by: Steven Liu <[email protected]>
Co-authored-by: amyeroberts <[email protected]>

* [`MPT`] Add  `require_bitsandbytes` on MPT integration tests (#25201)

* add  `require_bitsandbytes` on MPT integration tests

* add it on mpt as well

* [`Detr`] Fix detr BatchNorm replacement issue (#25230)

* fix detr weird issue

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: Sylvain Gugger <[email protected]>

* fix copies

* fix copies

---------

Co-authored-by: Sylvain Gugger <[email protected]>

* Move rescale dtype recasting to match torchvision ToTensor (#25229)

Move dtype recasting to match torchvision ToTensor

* Fix set of model parallel in the Trainer when no GPUs are available (#25239)

* fix get_keys_to_not_convert() to return correct modules for full precision inference (#25105)

* add test for `get_keys_to_not_convert`

* add minimum patch to keep mpt lm_head from 8bit quantization

* add revision to

* add pathname and line number to logging formatter in debug mode (#25203)

* add pathname and lineno to logging formatter in debug mode

* use TRANSFORMERS_VERBOSITY="detail" to print pathname and lineno
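
As a rough illustration of the change above, the following sketch uses only the standard `logging` module to switch to a formatter that prints the file path and line number when a "detail" verbosity is requested. The format strings and the helper names `build_formatter` and `make_logger` are assumptions made for this example; they are not taken from the library.

```python
import io
import logging

# Assumed format strings for illustration; the library's actual strings may differ.
DEFAULT_FORMAT = "%(levelname)s: %(message)s"
DETAIL_FORMAT = "[%(levelname)s|%(pathname)s:%(lineno)s] %(message)s"


def build_formatter(verbosity: str) -> logging.Formatter:
    """Return a formatter that adds file path and line number in "detail" mode."""
    fmt = DETAIL_FORMAT if verbosity == "detail" else DEFAULT_FORMAT
    return logging.Formatter(fmt)


def make_logger(name: str, verbosity: str, stream) -> logging.Logger:
    """Hypothetical helper: build an isolated logger writing to `stream`."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    logger.propagate = False  # keep the demo output out of the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(build_formatter(verbosity))
    logger.addHandler(handler)
    return logger


# "detail" mode: the record is prefixed with the calling file and line number.
detail_stream = io.StringIO()
make_logger("demo.detail", "detail", detail_stream).debug("loading weights")
detail_line = detail_stream.getvalue()

# Any other verbosity: only the level and the message are printed.
plain_stream = io.StringIO()
make_logger("demo.plain", "info", plain_stream).debug("loading weights")
plain_line = plain_stream.getvalue()
```

In practice the verbosity string would come from the `TRANSFORMERS_VERBOSITY` environment variable rather than being passed in by hand.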

* Add `token` argument in example scripts (#25172)

* fix

* fix

* fix

* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <[email protected]>

* resolving zero3 init when using accelerate config with Trainer (#25227)

* resolving zero3 init when using accelerate config with Trainer

* refactor

* fix

* fix import

* Update rescale tests - cast to float after rescaling to reflect #25229 (#25259)

Rescale tests - cast to float after rescaling to reflect #25229

* Fix some bugs for two stage training of deformable detr (#25045)

* Update modeling_deformable_detr.py

Fix bugs for two stage training

* Update modeling_deformable_detr.py

* Add test_two_stage_training to DeformableDetrModelTest

---------

Co-authored-by: yupeng.jia <[email protected]>

* [DOCS] Add example and modified docs of EtaLogitsWarper (#25125)

* added example and modified docs for EtaLogitsWarper

* make style

* fixed styling issue on 544

* removed error info and added set_seed

* Update src/transformers/generation/logits_process.py

Co-authored-by: amyeroberts <[email protected]>

* Update src/transformers/generation/logits_process.py

Co-authored-by: amyeroberts <[email protected]>

* updated the results

---------

Co-authored-by: amyeroberts <[email protected]>

* Fix return_dict_in_generate bug in InstructBlip generate function (#25246)

Fix bug in InstructBlip generate function

Previously, the postprocessing conducted on generated sequences in InstructBlip's generate function assumed these sequences were tensors (i.e. that `return_dict_in_generate == False`).

This commit checks whether the result of the call to the wrapped language model `generate()` is a tensor, and if not attempts to postprocess the sequence attribute of the returned results object.
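
The check described above can be sketched without any framework dependency. In this minimal example plain nested lists stand in for tensors, `GenerateOutput` is a hypothetical stand-in for the results object returned when `return_dict_in_generate` is true, and `prepend_token` is a placeholder for whatever postprocessing the model applies; none of these names come from the library.

```python
from dataclasses import dataclass
from typing import List, Union

Sequences = List[List[int]]  # stand-in for a batch of token-id tensors


@dataclass
class GenerateOutput:
    """Hypothetical results object exposing a `sequences` attribute."""
    sequences: Sequences


def prepend_token(sequences: Sequences, token_id: int) -> Sequences:
    """Placeholder postprocessing step: put `token_id` in front of every sequence."""
    return [[token_id] + seq for seq in sequences]


def postprocess(outputs: Union[Sequences, GenerateOutput], token_id: int = 0):
    """Postprocess raw sequences directly, or the `sequences` attribute otherwise."""
    if isinstance(outputs, list):
        # Raw sequences (the `return_dict_in_generate == False` case).
        return prepend_token(outputs, token_id)
    # Results object: only the sequences are touched, other fields are kept.
    outputs.sequences = prepend_token(outputs.sequences, token_id)
    return outputs


raw_result = postprocess([[5, 6]], token_id=1)
dict_result = postprocess(GenerateOutput(sequences=[[5, 6]]), token_id=1)
```

Both call styles end up with the same postprocessed sequences; only the container differs.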

* Remove `pytest_options={"rA": None}` in CI (#25263)

fix

Co-authored-by: ydshieh <[email protected]>

* 🌐 [i18n-KO] Translated `perf_infer_gpu_many.md` to Korean (#24943)

* doc: ko: perf_infer_gpu_many.mdx

* feat: chatgpt draft

* fix: manual edits

* Update docs/source/ko/perf_infer_gpu_many.md

Co-authored-by: Jungnerd <[email protected]>

---------

Co-authored-by: Jungnerd <[email protected]>

* recommend DeepSpeed's Argument Parsing documentation (#25268)

* [MMS] Fix mms (#25267)

* [MMS] Fix mms

* [MMS] Fix mms

* fix mms loading

* Apply suggestions from code review

* make style

* Update tests/models/wav2vec2/test_modeling_wav2vec2.py

* CI with `num_hidden_layers=2` 🚀🚀🚀 (#25266)

* CI with layers=2

---------

Co-authored-by: ydshieh <[email protected]>

* CI with `pytest_num_workers=8` for torch/tf jobs (#25274)

n8

Co-authored-by: ydshieh <[email protected]>

* Docs: Update list of `report_to` logging integrations in docstring (#25281)

* Update list of logging integrations in docstring

Also update type hint

* Also add 'flyte' to report_to callback list

* Revert 'report_to' type hint update

Due to CLI breaking

* Update InstructBLIP & Align values after rescale update (#25209)

* Update InstructBLIP values
Note: the tests are not independent. Running a test independently produces different logits compared to running all the integration tests.

* Update test values after rescale update

* Remove left over commented out code

* Revert to previous rescaling logic

* Update rescale tests

* Docs: separate generate section (#25235)

Separate generate doc section

* Update bark doc (#25234)

* add mention to optimization in Bark docs

* add offload mention in docs

* Apply suggestions from code review

Co-authored-by: Sanchit Gandhi <[email protected]>

* Update bark docs.

* Update bark.md

---------

Co-authored-by: Sanchit Gandhi <[email protected]>

* add generate method to SpeechT5ForTextToSpeech (#25233)

* add generate method to SpeechT5ForTextToSpeech

* update speecht5forTTS docstrings

* Remove defaults to None in generate docstrings

Co-authored-by: Sylvain Gugger <[email protected]>

---------

Co-authored-by: Sylvain Gugger <[email protected]>

* Add timeout parameter to load_image function (#25184)

* Add timeout parameter to load_image function.

* Remove line.

* Reformat code

Co-authored-by: amyeroberts <[email protected]>

* Add parameter to docs.

---------

Co-authored-by: amyeroberts <[email protected]>

* [JAX] Bump min version (#25286)

* [JAX] Bump min version

* make fixup

* [small] llama2.md typo (#25295)

`groupe` -> `grouped`

* Fix typo: Roberta -> RoBERTa (#25302)

* Move usage of deprecated logging.warn to logging.warning (#25310)

The former spelling is deprecated and has been discouraged for a
while. The latter spelling seems to be more common in this project
anyway, so this change ought to be safe.

Fixes https://github.com/huggingface/transformers/issues/25283
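
For reference, a small standard-library sketch of the preferred spelling. The logger name and the message are made up for this example; the deprecated call is shown only as a comment.

```python
import io
import logging

stream = io.StringIO()
logger = logging.getLogger("warn_demo")
logger.setLevel(logging.WARNING)
logger.propagate = False  # keep the demo output out of the root logger

handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s:%(message)s"))
logger.addHandler(handler)

# Deprecated spelling, kept here only as a comment:
# logger.warn("offload folder is missing")

# Preferred spelling:
logger.warning("offload folder is missing")

captured = stream.getvalue()
```

The record level is `WARNING` with either spelling, so the switch does not change what gets logged.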

* Give more memory in test_disk_offload (#25315)

* Generate: get generation mode as an enum (#25292)

* Add offline mode for agents (#25226)

* Add offline mode for agents

* Disable second check too

* Deal with nested configs better in base class (#25237)

* Deal better with nested configs

* Fixes

* More fixes

* Fix last test

* Clean up existing configs

* Remove hack in MPT Config

* Update src/transformers/configuration_utils.py

Co-authored-by: Younes Belkada <[email protected]>

* Fix setting a nested config via dict in the kwargs

* Adapt common test

* Add test for nested config load with dict

---------

Co-authored-by: Younes Belkada <[email protected]>

* Document check copies (#25291)

* Document check copies better and add tests

* Include header in check for copies

* Manual fixes

* Try autofix

* Fixes

* Clean tests

* Finalize doc

* Remove debug print

* More fixes

* Allow `bark` to have a tiny model (#25290)

* temp

* update

* update

* update

* small dim

* small dim

* small dim

* fix

* update

* fix

* fix

* fix

* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <[email protected]>

* Document toc check and doctest check scripts (#25319)

* Clean doc toc check and make doctest list better

* Add to Makefile

* [Whisper] Better error message for outdated generation config (#25298)

* Remove jnp.DeviceArray since it is deprecated. (#24875)

* Remove jnp.DeviceArray since it is deprecated.

* Replace all instances of jnp.DeviceArray with jax.Array

* Update src/transformers/models/bert/modeling_flax_bert.py

---------

Co-authored-by: Sanchit Gandhi <[email protected]>

* add CFG for .generate() (#24654)

* 🌐 [i18n-KO] Translated `perf_infer_gpu_one.md` to Korean (#24978)

* docs: ko: perf_infer_gpu_one

* feat: chatgpt draft

* fix: manual edits

* fix: manual edits

* fix: resolve suggestions

Co-authored-by: Sohyun Sim <[email protected]>
Co-authored-by: TaeYupNoh <[email protected]>

* fix: resolve suggestions

* fix: resolve suggestions

Co-authored-by: Younes Belkada <[email protected]>

---------

Co-authored-by: Sohyun Sim <[email protected]>
Co-authored-by: TaeYupNoh <[email protected]>
Co-authored-by: Younes Belkada <[email protected]>

* Update TF pin in docker image (#25343)

fix

Co-authored-by: ydshieh <[email protected]>

* Generalize CFG to allow for positive prompts (#25339)

* Generalize CFG to allow for positive prompts

* Add documentation, fix the correct class

* Loosen output shape restrictions on GPT-style models (#25188)

* Loosen output shape restrictions on GPT-style models

* Use more self-explanatory variables

* Revert "Use more self-explanatory variables"

This reverts commit 5fd9ab39119558b7e750f61aa4a19014dccc5ed5.

* Allow `trust_remote_code` in example scripts (#25248)

* pytorch examples

* pytorch mim no trainer

* cookiecutter

* flax examples

* missed line in pytorch run_glue

* tensorflow examples

* tensorflow run_clip

* tensorflow run_mlm

* tensorflow run_ner

* tensorflow run_clm

* pytorch example from_configs

* pytorch no trainer examples

* Revert "tensorflow run_clip"

This reverts commit 261f86ac1f1c9e05dd3fd0291e1a1f8e573781d5.

* fix: duplicated argument

* Generate: remove Marian hack (#25294)

Remove Marian hack

* Fix more offload edge cases (#25342)

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <[email protected]>

* Migrate Trainer from `Repository` to `upload_folder` (#25095)

* First draft

* Deal with progress bars

* Update src/transformers/utils/hub.py

Co-authored-by: Lucain <[email protected]>

* Address review comments

* Forgot one

* Pin hf_hub

* Add argument for push all and fix tests

* Fix tests

* Address review comments

---------

Co-authored-by: Lucain <[email protected]>

* Adding more information in help parser on train_file and validation_file (#25324)

chore: adding new doc on train and val

* [DOCS] Add `NoRepeatNGramLogitsProcessor` Example for `LogitsProcessor` class (#25186)

* Add Description And Example to Docstring

* make style corrections

* make style

* Doc Style Consistent With HF

* Apply make style

* Modify Docstring

* Edit Type in Docstring

* Feedback Incorporated

* Edit Docstring

* make style

* Post Review Changes

* Review Feedback Incorporated

* Styling

* Formatting

* make style

* pep8

* Docs: Added benchmarks for `torch.compile()` for vision models (#24748)

* added benchmarks for compile

* Update docs/source/en/perf_torch_compile.md

Co-authored-by: Steven Liu <[email protected]>

* Update docs/source/en/perf_torch_compile.md

Co-authored-by: Steven Liu <[email protected]>

* Update docs/source/en/perf_torch_compile.md

Co-authored-by: Steven Liu <[email protected]>

* Update docs/source/en/perf_torch_compile.md

Co-authored-by: Steven Liu <[email protected]>

* Update docs/source/en/perf_torch_compile.md

Co-authored-by: Steven Liu <[email protected]>

* Update docs/source/en/perf_torch_compile.md

Co-authored-by: Steven Liu <[email protected]>

* Update docs/source/en/perf_torch_compile.md

Co-authored-by: Steven Liu <[email protected]>

* Update docs/source/en/perf_torch_compile.md

Co-authored-by: Steven Liu <[email protected]>

* Update docs/source/en/perf_torch_compile.md

Co-authored-by: Sayak Paul <[email protected]>

* Update docs/source/en/perf_torch_compile.md

Co-authored-by: amyeroberts <[email protected]>

* Update docs/source/en/perf_torch_compile.md

Co-authored-by: amyeroberts <[email protected]>

* added more models

* added more models fr

* added visualizations

* minor fix

* Update docs/source/en/perf_torch_compile.md

Co-authored-by: Steven Liu <[email protected]>

* Update docs/source/en/perf_torch_compile.md

Co-authored-by: amyeroberts <[email protected]>

* Update docs/source/en/perf_torch_compile.md

Co-authored-by: amyeroberts <[email protected]>

* Added links to models and put charts side by side

* Added batch comparisons

* Added more comparisons

* Fix table

* Added link to wheel

* Update perf_torch_compile.md

---------

Co-authored-by: Steven Liu <[email protected]>
Co-authored-by: Sayak Paul <[email protected]>
Co-authored-by: amyeroberts <[email protected]>

* Add mask2former fp16 support (#25093)

* Add mask2former fp16 support

* Clear consistency/quality issues

* Fix consistency/quality (2)

* Add integration test for mask2former (fp16 case)

* Fix code quality

* Add integration test for maskformer (fp16 case)

* Add integration test for oneformer (fp16 case)

* Remove slow decorator from fp16 tests

* Fix lint

* Remove usage of full inference and value checks for fp16

* Temporarily comment slow for {mask, mask2, one}former

* Add fp16 support to oneformer

* Revert "Temporarily comment slow for {mask, mask2, one}former"

This reverts commit e5371edabd301cf56079def0421a0a87df307cb0.

* Remove dtype conversion noop

* [DOCS] Add descriptive docstring to MinNewTokensLength (#25196)

* Add descriptive docstring to MinNewTokensLength

It addresses https://github.com/huggingface/transformers/issues/24783

* Refine the differences between `min_length` and `min_new_tokens`

* Remove extra line

* Remove extra arguments in generate

* Add a missing space

Co-authored-by: amyeroberts <[email protected]>

* Run the linter

* Add clarification comments

---------

Co-authored-by: amyeroberts <[email protected]>

* Register ModelOutput subclasses as supported torch.utils._pytree nodes (#25358)

* Register ModelOutput subclasses as supported torch.utils._pytree nodes

Fixes #25357 where DDP with static_graph=True does not sync gradients when calling backward() over tensors contained in ModelOutput subclasses

* Add test for torch pytree ModelOutput serialization and deserialization

* Fix `test_model_parallelism` (#25359)

* fix

* fix

---------

Co-authored-by: ydshieh <[email protected]>

* Add warning for missing attention mask when pad tokens are detected (#25345)

* Add attention mask and pad token warning to many of the models

* Remove changes under examples/research_projects

These files are not maintained by HF.

* Skip the warning check during torch.fx or JIT tracing

* Switch ordering for the warning and input shape assignment

This ordering is a little cleaner for some of the cases.

* Add missing line break in one of the files

* [ASR Pipeline] Clarify return timestamps (#25344)

* [ASR Pipeline] Clarify return timestamps

* fix indentation

* fix ctc check

* fix ctc error message!

* fix test

* fix other test

* add new tests

* final comment

* MaskFormer, Mask2Former - replace einsum for tracing (#25297)

* Replace einsum with ops for tracing

* Fix comment

* Load state in else (#25318)

* Load else

* New approach

* Propagate

* Fix `token` in example template (#25351)

fix

Co-authored-by: ydshieh <[email protected]>

* Enable tests to run on third-party devices (#25327)

* enable unit tests to run on third-party devices other than CUDA and CPU.

* remove the modification that enabled ut on MPS

* control test on third-party device by env variable

* update

---------

Co-authored-by: statelesshz <[email protected]>

* 🌐 [i18n-KO] Translated `add_tensorflow_model.md` to Korean (#25017)

* docs: ko: add_tensorflow_model.md

* feat: chatgpt draft

* fix: manual edits

* fix: manual edits

* fix: resolve suggestions

* fix: manual edits

* Fix `torch_job` worker(s) crashing (#25374)

fix

Co-authored-by: ydshieh <[email protected]>

* Generate: add config-level validation (#25381)

* Fix missing usage of `token` (#25382)

* add missing tokens

* fix

---------

Co-authored-by: ydshieh <[email protected]>

* Use small config for `OneFormerModelTest.test_model_with_labels` (#25383)

fix

Co-authored-by: ydshieh <[email protected]>

* Add copied from for image processor methods (#25121)

* Add copied from statements for image processors

* Move out rescale and normalize to base image processor

* Remove rescale and normalize from vit (post rebase)

* Update docstrings and tidy up

* PR comments

* change version (#25387)

* [DOCS] Add example for `TopPLogitsWarper`  (#25361)

* [DOCS] Add example for `TopPLogitsWarper`

* fix typo

* address review feedback

* address review nits

* 🌐 [i18n-KO] Translated `perf_train_cpu_many.md` to Korean (#24923)

* docs: ko: perf_train_cpu_many.md

* feat: chatgpt draft

* fix: manual edits

* fix: resolve suggestions

Co-authored-by: Jungnerd <[email protected]>

---------

Co-authored-by: Jungnerd <[email protected]>

* 16059 - Add missing type hints for ASTModel (#25364)

* 16059 - Add missing type hints for ASTModel

* Add an additional type hint

Co-authored-by: Matt <[email protected]>

---------

Co-authored-by: Matt <[email protected]>

* rm useless condition since the previous condition contains it. (#25403)

* Fix path for dynamic module creation (#25402)

* YOLOS - Revert default return_pixel_mask value (#25404)

Revert default return_pixel_mask value

* Docs: introduction to generation with LLMs (#25240)

Co-authored-by: amyeroberts <[email protected]>
Co-authored-by: Steven Liu <[email protected]>

* Generate: length validation (#25384)

* Improve training args (#25401)

* enhanced tips for some training args

* make style

* Generate: generation config validation fixes in docs (#25405)

* 16059 - Add extra type hints for AltCLIPModel (#25399)

* Generate: lower severity of parameterization checks (#25407)

* VQA task guide (#25244)

* initial commit

* semi-finished task guide draft

* image link

* Apply suggestions from code review

Co-authored-by: Steven Liu <[email protected]>

* Update docs/source/en/tasks/visual_question_answering.md

Co-authored-by: NielsRogge <[email protected]>

* feedback addressed

* Apply suggestions from code review

Co-authored-by: amyeroberts <[email protected]>

* nits addressed

---------

Co-authored-by: Steven Liu <[email protected]>
Co-authored-by: NielsRogge <[email protected]>
Co-authored-by: amyeroberts <[email protected]>

* 🌐 [i18n-KO] Translated `add_new_model.md` to Korean (#24957)

* docs: ko: add_new_model.md

* feat: chatgpt draft

* fix: manual edits

* fix: change document title

* fix: edit with reviewers

Co-authored-by: Jungnerd <[email protected]>

* fix: edit with reviewers

Co-authored-by: Jungnerd <[email protected]>

* fix: edit with reviewers

Co-authored-by: Jungnerd <[email protected]>

* fix: edit with reviewers

Co-authored-by: Jungnerd <[email protected]>

* fix: edit with reviewers

Co-authored-by: SeongWooChoi <[email protected]>

* fix: edit with reviewers

Co-authored-by: SeongWooChoi <[email protected]>

* fix: edit with reviewers

Co-authored-by: SeongWooChoi <[email protected]>

* fix: edit with reviewers

Co-authored-by: Jungnerd <[email protected]>

* fix: add anchor to header

* Update docs/source/ko/add_new_model.md

Co-authored-by: 이서정 <[email protected]>

* Update docs/source/ko/add_new_model.md

Co-authored-by: 이서정 <[email protected]>

* Update docs/source/ko/add_new_model.md

Co-authored-by: 이서정 <[email protected]>

* fix: edit with reviews

* feat: edit toctree

---------

Co-authored-by: Wonhyeong Seo <[email protected]>
Co-authored-by: Jungnerd <[email protected]>
Co-authored-by: SeongWooChoi <[email protected]>
Co-authored-by: 이서정 <[email protected]>

* 🌐 [i18n-KO] Translated `model_summary.md` to Korean (#24625)

* docs: ko: model_summary.md

* feat: nmt and manual edit model_summary.mdx

* fix: resolve suggestions

Co-authored-by: Sohyun Sim <[email protected]>
Co-authored-by: Wonhyeong Seo <[email protected]>

* fix: resolve suggestions2

Co-authored-by: Sohyun Sim <[email protected]>

---------

Co-authored-by: Sohyun Sim <[email protected]>
Co-authored-by: Wonhyeong Seo <[email protected]>

* Update Bark generation configs and tests (#25409)

* update bark generation configs for more coherent parameter

* make style

* update bark hub repo

* aligned sample_beam output selection with beam_search (#25375)

* aligned sample_beam specs with beam_search

* pull origin main

* Revert "pull origin main"

This reverts commit 06d356f1137bb52272e120a03636598c44449cf3.

* update test_utils.py

* fix format

* remove comment

---------

Co-authored-by: Shogo Fujita <[email protected]>

* Enable passing number of channels when inferring data format (#25412)

* Bark: flexible generation config overload (#25414)

* [DINOv2] Update pooler output (#25392)

Update pooler output

* 🌐 [i18n-KO] Translated `philosophy.md` to Korean (#25010)

* docs: ko: philosophy.md

* feat: chatgpt draft

* fix: manual edits

* fix: resolve suggestions

* Doc checks (#25408)

* Document check_dummies

* Type hints and doc in other files

* Document check inits

* Add documentation to

* Address review comments

* Generation: strict generation config validation at save time (#25411)

* strict gen config save; Add tests

* add note that the warning will be an exception in v4.34

* [WavLM] Fix Arxiv link and authors (#25415)

* [WavLM] Fix Arxiv link and authors

* make style

* Generate: Load generation config when `device_map` is passed (#25413)

* Fix rendering for `torch.compile()` docs (#25432)

fix rendering

* Add `examples`  to tests to run when `setup.py` is modified (#25437)

fix

Co-authored-by: ydshieh <[email protected]>

* Fix issue with ratio evaluation steps and auto find batch size (#25436)

* Fully rebased solution

* 500

* docs: add LLaMA-Efficient-Tuning to awesome-transformers (#25441)

Co-authored-by: statelesshz <[email protected]>

* GPTQ integration (#25062)

* GPTQ integration

* Add tests for gptq

* support for more quantization model

* fix style

* typo

* fix method

* Update src/transformers/modeling_utils.py

Co-authored-by: Sylvain Gugger <[email protected]>

* add dataclass and fix quantization_method

* fix doc

* Update tests/quantization/gptq/test_gptq.py

Co-authored-by: Younes Belkada <[email protected]>

* Apply suggestions from code review

Co-authored-by: Younes Belkada <[email protected]>

* modify dataclass

* add gptqconfig import

* fix typo

* fix tests

* remove dataset as req arg

* remove tokenizer import

* add offload cpu quantization test

* fix check dataset

* modify dockerfile

* protect trainer

* style

* test for config

* add more log

* overwrite torch_dtype

* draft doc

* modify quantization_config docstring

* fix class name in docstring

* Apply suggestions from code review

Co-authored-by: Younes Belkada <[email protected]>

* more warning

* fix 8bit kwargs tests

* peft compatibility

* remove var

* fix is_gptq_quantized

* remove is_gptq_quantized

* fix wrap

* Update src/transformers/modeling_utils.py

Co-authored-by: Younes Belkada <[email protected]>

* add exllama

* skip test

* overwrite float16

* style

* fix skip test

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <[email protected]>

* fix docstring formatting

* add doc

* better test

---------

Co-authored-by: Sylvain Gugger <[email protected]>
Co-authored-by: Younes Belkada <[email protected]>

* Fix for #25437 (#25454)

* fix

* fix

* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <[email protected]>

* not debugged code

* reference code so nothing is lost

* novelty

* added docstrings

* fixed some relative import errors

* fixed small bugs

* added linear layers to bloom

* removed impossible embedding method

* Update src/transformers/models/bloom/desequence_graph_ids.py

Co-au…