Update with forked from #1 (Merged)
oushu1zhangxiangxuan1 merged 956 commits into oushu1zhangxiangxuan1:main from huggingface:main on Mar 20, 2023.
Conversation
* Update expected output values, as Hub repo files were updated * Update expected output values, as librosa was bumped from 0.9.2 to 0.10.0 on the CI docker image * fix * update one more --------- Co-authored-by: ydshieh <[email protected]>
* fix bug * forward contrib credits from discussions * change logic --------- Co-authored-by: edbeeching <[email protected]>
* Ran Black formatting * Added imports and reformatted * Update src/transformers/models/encoder_decoder/modeling_tf_encoder_decoder.py --------- Co-authored-by: Matt <[email protected]>
* troubleshooting guide: added an error description for missing auto-mapping
* minor polishing
* changed the example
* Apply suggestions from code review
Co-authored-by: Steven Liu <[email protected]>
* Update docs/source/en/troubleshooting.mdx
Co-authored-by: Sylvain Gugger <[email protected]>
---------
Co-authored-by: Steven Liu <[email protected]>
Co-authored-by: Sylvain Gugger <[email protected]>
* Removed useless check for backend * fix style check for graphormer * Reverted change and corrected requires_backend for cython * code quality
#21612)
* fix: change is_last chunk calculation and add conditional break
* format fix
* account for 0 and full stride_rights, add comment
* add new test
* make style
* update slow whisper asr test timestamps
* use nested_simplify on output and round timestamp to hundredths place
* [flax] adding support for batch norm layers * fixing bugs related to pt+flax integration * cleanup, batchnorm support in sharded pt to flax * support for batchnorm tests in pt+flax integration * simplifying checking batch norm layer
…1756) * [Examples] Generalise run audio classification for log-mel models * batch feature extractor * make style
* Different behavior in DistilBERT when using "inputs_embeds" Fixes #21089 * fix failing test
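For reference, a minimal sketch of the inputs_embeds path that commit touches: embeddings are looked up manually and passed in place of token ids (the checkpoint name is just an example).

```python
from transformers import AutoTokenizer, DistilBertModel

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = DistilBertModel.from_pretrained("distilbert-base-uncased")

input_ids = tokenizer("Hello world", return_tensors="pt").input_ids
# Look up word embeddings manually, then bypass the embedding lookup inside the model.
embeds = model.get_input_embeddings()(input_ids)
outputs = model(inputs_embeds=embeds)
print(outputs.last_hidden_state.shape)
```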
* Return and rescale attention_mask
* Add SpecAugment to Whisper modeling
* Fix test
* Update docstring
* Add SpecAug related parameters to model config
* Add the _mask_input_features function to doc
* Fix quality
* Apply suggestions from code review
Co-authored-by: Arthur <[email protected]>
* Remove dev comments
* Add test
* Resolve conflict
* feat: mask {feature, time} prob fast tests
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <[email protected]>
---------
Co-authored-by: Arthur <[email protected]>
Co-authored-by: sanchit-gandhi <[email protected]>
Co-authored-by: Sylvain Gugger <[email protected]>
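A hedged sketch of what the new config options enable, assuming the Wav2Vec2-style parameter names this commit describes (apply_spec_augment, mask_time_prob, mask_feature_prob); verify the exact names against your installed transformers version.

```python
from transformers import WhisperConfig, WhisperForConditionalGeneration

# Assumed parameter names; SpecAugment masking applies only in training mode.
config = WhisperConfig.from_pretrained(
    "openai/whisper-tiny",
    apply_spec_augment=True,
    mask_time_prob=0.05,     # fraction of time steps picked as mask starts
    mask_feature_prob=0.05,  # fraction of mel bins picked as mask starts
)
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-tiny", config=config)
model.train()  # _mask_input_features is a no-op in eval mode
```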
* fix history * input_features instead of input_ids for TFWhisper doctest * use translate instead of transcribe
…g and gradient checkpointing (#21759)
* updated expected * prediction_length fix * prediction_length default value * default prediction_length 24 * revert back prediction_length default * move prediction_length test
* fix gradient checkpointing bug * fix gradient checkpointing bug * ran make fix-copies * fixed bug * fixed bug
* Fix resume_from_checkpoint for deepspeed, by ensuring that the deepspeed engine is the one to load the checkpoint.
* Empty commit to trigger CI
* Removed deepspeed skipping inside the _load_from_checkpoint function, as it is obsolete
* another adjustment
* Trigger CI
* trigger circleci
* style
---------
Co-authored-by: ydshieh <[email protected]>
Co-authored-by: Stas Bekman <[email protected]>
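A sketch of the resume flow this commit fixes, assuming a valid ds_config.json and an existing checkpoint directory; the model name, config path, dataset, and checkpoint path are placeholders.

```python
from transformers import AutoModelForSequenceClassification, Trainer, TrainingArguments

model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased")
args = TrainingArguments(output_dir="out", deepspeed="ds_config.json")  # placeholder config path
trainer = Trainer(model=model, args=args, train_dataset=my_dataset)  # my_dataset: placeholder
# With deepspeed configured, the DeepSpeed engine (not Trainer) restores the checkpoint state.
trainer.train(resume_from_checkpoint="out/checkpoint-500")  # placeholder checkpoint path
```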
* Override the decoding parameters of Seq2SeqTrainer * Fix quality * Fix max_length parameter * Fix quality * Remove redundant parameter max_length * Separate the preprocessing of train and validation to use different max_target_length
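In practice the override looks like the sketch below: evaluate() accepts max_length and num_beams, so validation can decode with a different max_target_length than training (model and dataset are placeholders).

```python
from transformers import Seq2SeqTrainer, Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(output_dir="out", predict_with_generate=True)
trainer = Seq2SeqTrainer(model=model, args=args, eval_dataset=eval_dataset)  # placeholders
# Per-call decoding overrides, independent of the values used during training.
metrics = trainer.evaluate(max_length=128, num_beams=4)
```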
Fix docstring gpt2 config
* fix wrong url * typos in English documentation
make concrete_args available from outside
* add pipeline
* update init
* add zero shot to init
* update inits and correct checkpoints
* update base to support input features
* add tests
* Update src/transformers/pipelines/zero_shot_audio_classification.py
Co-authored-by: Younes Belkada <[email protected]>
* Update src/transformers/pipelines/zero_shot_audio_classification.py
Co-authored-by: Younes Belkada <[email protected]>
* update pipeline code
* use tiny checkpoint
* nits and expected value with tiny model
* style
* last nit on tests values
* fix styling
* fix collate fn that was casting to float
* update
---------
Co-authored-by: Younes Belkada <[email protected]>
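Usage sketch for the new pipeline; the CLAP checkpoint name is an example, and any compatible model should work.

```python
import numpy as np
from transformers import pipeline

classifier = pipeline("zero-shot-audio-classification", model="laion/clap-htsat-unfused")
audio = np.random.rand(48000).astype(np.float32)  # 1 s of dummy audio at 48 kHz
print(classifier(audio, candidate_labels=["dog barking", "vacuum cleaner"]))
```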
* uint8 -> bool
* fix copies
* style
* update test modeling comment when checking attention buffers
* style
* use logical not on random mask instead of subtraction with 1
* remove torch uint8
* quality
* remove modified modeling utils
* Update based on review
Co-authored-by: sgugger <[email protected]>
---------
Co-authored-by: sgugger <[email protected]>
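A minimal torch sketch of both changes: mask buffers held as bool instead of uint8, and logical_not instead of arithmetic inversion.

```python
import torch

seq_len = 4
# Causal mask stored as bool rather than uint8.
causal_mask = torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))
random_mask = torch.rand(seq_len, seq_len) > 0.5
inverted = torch.logical_not(random_mask)  # instead of (1 - mask) on uint8
scores = torch.randn(seq_len, seq_len)
scores = scores.masked_fill(~causal_mask, float("-inf"))  # bool masks work directly
print(scores)
```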
…21787) * fix perceiver fp16 * hopefully fix tests
fix nn.init.trunc_normal_ call on half data
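The usual workaround, sketched below: trunc_normal_ depends on ops historically unimplemented for float16 on CPU, so initialize a float32 buffer and copy it back.

```python
import torch
import torch.nn as nn

weight = torch.empty(4, 4, dtype=torch.float16)
tmp = torch.empty_like(weight, dtype=torch.float32)
nn.init.trunc_normal_(tmp, std=0.02)  # runs in full precision
weight.copy_(tmp)  # copy_ casts back to float16
print(weight.dtype)
```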
* Fix gradient checkpointing bug in gptneox * Remove use_cache block
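The pattern these gradient-checkpointing fixes enforce, sketched with an example GPT-NeoX checkpoint (model name is illustrative): checkpointing and use_cache are incompatible, so caching is turned off for training.

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-70m")  # example GPT-NeoX model
model.gradient_checkpointing_enable()
model.config.use_cache = False  # caching past key/values defeats checkpointing
```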
Revert changes
* Fix regression in pipeline when device=-1 is passed * Add regression test
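The regression case, for reference: device=-1 must keep the pipeline on CPU.

```python
from transformers import pipeline

pipe = pipeline(
    "text-classification",
    model="distilbert-base-uncased-finetuned-sst-2-english",
    device=-1,  # -1 selects CPU; non-negative integers select a GPU
)
print(pipe("transformers pipelines are easy to use"))
```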
* Use return_loss for BridgeTowerForContrastiveLearning, add example
* fix tests
* Update example in BridgeTowerForContrastiveLearning
* Update test_modeling_bridgetower.py
* update model output format
* minor update
* Update src/transformers/models/bridgetower/modeling_bridgetower.py
* make style
---------
Co-authored-by: Tiep Le <[email protected]>
Co-authored-by: Tiep Le <[email protected]>
Co-authored-by: Yih-Dar <[email protected]>
Co-authored-by: ydshieh <[email protected]>
* t5 remove data dependency * make style * make fix-copies --------- Co-authored-by: Prathik Rao <[email protected]>
* Deal with torch-tensorrt --------- Co-authored-by: ydshieh <[email protected]>
Fix align docs typo
Update values Co-authored-by: ydshieh <[email protected]>
* Italian translation: migration * Update migration.mdx minor fixes * Update _toctree.yml * Delete migration.mdx * Add Italian translation of migration.mdx * Update of migration.mdx translation and toctree
* LLaMA
* sharding and docs
* tweak
* black
* inits
* ruff
* LLAMA_PRETRAINED_CONFIG_ARCHIVE_MAP
* init
* no checkpoint
* docs
* ruff
* type_vocab_size
* tokenizer fixes
* tokenizer fixes
* Update tokenization_llama.py
* Update tokenization_llama.py
* Update configuration_llama.py
* Update modeling_llama.py
* tokenizer add_bos by default
* licenses
* remove decoder
* norms and mlp
* rope overhaul
* tweaks
* black
* mention OPT implementation
* off-by-one naming
* typo
* fix
* tokenization fix and slicing bug
* padding config
* cleanup
* black
* update tests
* undo typo
* fix vocab caching logic
* ruff
* docbuilder
* attn fix from BlackSamorez
* initial feedback
* typo
* docs
* llama case
* llama case
* load checkpoint docs
* comment about tokenizer
* tokenizer defaults
* clear past_key_values if use_cache=False
* last tweaks
* last tweaks
* last tweaks
* last tweaks
---------
Co-authored-by: Stella Biderman <[email protected]>
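Basic usage of the new classes, sketched with a placeholder path, since the official weights must be converted locally.

```python
from transformers import LlamaForCausalLM, LlamaTokenizer

tokenizer = LlamaTokenizer.from_pretrained("path/to/converted/llama")  # placeholder path
model = LlamaForCausalLM.from_pretrained("path/to/converted/llama")

inputs = tokenizer("The capital of France is", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```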
* Update UNCONVERTIBLE_MODEL_ARCHITECTURES * Deal with 2 model tester classes in single test file * Deal with 2 model tester classes in single test file * Deal with 2 model tester classes in single test file * make style and quality --------- Co-authored-by: ydshieh <[email protected]>
* Temporarily fix https://github.com/microsoft/onnx-converters-private/issues/143
* Reduced column width
* Fix formatting.
* Revert "Temporarily fix https://github.com/microsoft/onnx-converters-private/issues/143"
This reverts commit 6e95a10.
* Fix export error.
* Revert "Fix formatting."
This reverts commit 8310f60.
* Propagated changes made in SwinV2 to Swin2SR
* add `accelerate` support for XGLM * fix order
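What accelerate support enables in practice, assuming the accelerate package is installed: weights can be dispatched automatically across available devices.

```python
from transformers import AutoModelForCausalLM

# device_map="auto" requires the accelerate package.
model = AutoModelForCausalLM.from_pretrained("facebook/xglm-564M", device_map="auto")
```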
* fixes a typo * .
* py38 + torch 2 * increment cache versions --------- Co-authored-by: ydshieh <[email protected]>
fix Co-authored-by: ydshieh <[email protected]>
Co-authored-by: yue kun <[email protected]>
Use dash 2.8.1 for now Co-authored-by: ydshieh <[email protected]>
* added doc to toc, auto tip with supported models, mention of task guide in model docs * make style * removed "see also" * minor fix
* LLaMA house-keeping * Doc links
* fix AutoTP in deepspeed not working for bloom Signed-off-by: Wang, Yi A <[email protected]> * add a method in BloomModel to build alibi Signed-off-by: Wang, Yi A <[email protected]> --------- Signed-off-by: Wang, Yi A <[email protected]>
* Add LlamaForSequenceClassification * Update src/transformers/models/llama/modeling_llama.py Co-authored-by: Younes Belkada <[email protected]> * Update src/transformers/models/llama/modeling_llama.py Co-authored-by: Younes Belkada <[email protected]> * Add docstring * Add test * Add input embedding getter and setter * Remove dead code --------- Co-authored-by: Younes Belkada <[email protected]>
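A config-only sketch of the new head (random weights, tiny dimensions), since real LLaMA checkpoints are not hosted on the Hub.

```python
import torch
from transformers import LlamaConfig, LlamaForSequenceClassification

config = LlamaConfig(
    vocab_size=100, hidden_size=64, intermediate_size=128,
    num_hidden_layers=2, num_attention_heads=4, num_labels=2,
)
model = LlamaForSequenceClassification(config)
logits = model(input_ids=torch.randint(0, 100, (1, 8))).logits
print(logits.shape)  # (batch, num_labels)
```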
removed .mdx extension
fix(docs): task guide links in model docs
* Add kernel size to NATTEN's QK arguments. The new NATTEN 0.14.5 supports PyTorch 2.0, but also adds an additional argument to the QK operation to allow optional RPBs. This ends up failing NATTEN tests. This commit adds NATTEN back to circleci and adds the arguments to get it working again. * Force NATTEN >= 0.14.5
[trainer] param count for zero3
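Context for the fix, as a hedged sketch: under ZeRO-3 each rank holds only a shard of every parameter, so p.numel() undercounts; DeepSpeed attaches the full, unpartitioned size as ds_numel.

```python
def get_parameter_count(model):
    # Prefer DeepSpeed's ds_numel (full, unpartitioned size) when present.
    return sum(
        p.ds_numel if hasattr(p, "ds_numel") else p.numel()
        for p in model.parameters()
    )
```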
What does this PR do?
Fixes # (issue)

Before submitting
- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.

Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.