
Add Musicgen #24109

Merged 154 commits, Jun 29, 2023

Commits (154)
- ced7cb5: Add Audiocraft (Jun 8, 2023)
- 55821f7: add cross attention (Jun 8, 2023)
- 55a21f5: style (Jun 8, 2023)
- 56cfadb: add for lm (Jun 9, 2023)
- 8a9ddbf: convert and verify (Jun 9, 2023)
- 9e70dfb: introduce t5 (Jun 9, 2023)
- aa64846: split configs (Jun 9, 2023)
- 9ec9c37: load t5 + lm (Jun 9, 2023)
- 06e3538: clean conversion (Jun 10, 2023)
- 5a6b390: copy from t5 (Jun 10, 2023)
- 79c3f0e: style (Jun 10, 2023)
- 2a2b6e3: start pattern provider (Jun 10, 2023)
- 2a16867: make generation work (Jun 11, 2023)
- 07668b6: style (Jun 11, 2023)
- 5db67b6: fix pos embs (Jun 12, 2023)
- 8cdf496: propagate shape changes (Jun 12, 2023)
- a7b0c75: propagate shape changes (Jun 12, 2023)
- d84e1a5: style (Jun 12, 2023)
- 04c1e49: delay pattern: pad tokens at end (Jun 12, 2023)
- c7a7833: audiocraft -> musicgen (Jun 12, 2023)
- aab9c0a: fix inits (Jun 12, 2023)
- 0575c8c: add mdx (Jun 12, 2023)
- 7fec1a2: style (Jun 12, 2023)
- 20e1682: fix pad token in processor (Jun 12, 2023)
- 526a99d: override generate and add todos (Jun 12, 2023)
- a3811e7: add init to test (Jun 12, 2023)
- b8e313b: undo pattern delay mask after gen (Jun 12, 2023)
- 414b3d1: remove cfg logits processor (Jun 12, 2023)
- 94dea27: remove cfg logits processor (Jun 12, 2023)
- 04f8d58: remove logits processor in favour of mask (Jun 12, 2023)
- 1a22c70: clean pos embs (Jun 12, 2023)
- 1fe75b2: make fix copies (Jun 12, 2023)
- 06ceac7: update readmes (Jun 12, 2023)
- 1e4d56d: clean pos emb (Jun 13, 2023)
- 9b7e66e: refactor encoder/decoder (Jun 13, 2023)
- e900cc3: make fix copies (Jun 13, 2023)
- df6e911: update conversion (Jun 13, 2023)
- 30684ae: fix config imports (Jun 13, 2023)
- f4344ec: update config docs (Jun 13, 2023)
- e7a740f: make style (Jun 13, 2023)
- 358e611: send pattern mask to device (Jun 13, 2023)
- e9a3739: pattern mask with delay (Jun 13, 2023)
- bd330e7: recover prompted audio tokens (Jun 13, 2023)
- 3eb434c: fix docstrings (Jun 13, 2023)
- e837fe9: laydown test file (Jun 13, 2023)
- 0ee70a2: pattern edge case (Jun 13, 2023)
- e29b7a1: remove t5 ref (Jun 13, 2023)
- d75cbd4: add processing class (Jun 14, 2023)
- 75088f6: config refactor (Jun 14, 2023)
- faf84a2: better pattern comment (Jun 14, 2023)
- 7a8a03a: check if mask is not present (Jun 14, 2023)
- 8e62225: check if mask is not present (Jun 14, 2023)
- bfbf325: refactor to auto class (Jun 14, 2023)
- 7d0562c: remove encoder configs (Jun 14, 2023)
- a17ca0c: fix processor (Jun 14, 2023)
- 4c56042: processor import (Jun 14, 2023)
- 583d994: start updating conversion (Jun 14, 2023)
- f20b654: start updating tests (Jun 14, 2023)
- 2ae374d: make style (Jun 15, 2023)
- c74afc6: convert t5, encodec, lm (Jun 15, 2023)
- d710b73: convert as composite (Jun 15, 2023)
- 8f8aa33: also convert processor (Jun 15, 2023)
- acb3b72: run generate (Jun 15, 2023)
- 368a8a6: classifier free gen (Jun 16, 2023)
- aaeeff7: comments and clean up (Jun 16, 2023)
- 89f061a: make style (Jun 16, 2023)
- fab7922: docs for logit proc (Jun 16, 2023)
- 1cf8649: docstring for uncond gen (Jun 16, 2023)
- 54b663d: start lm tests (Jun 16, 2023)
- a3c3cef: work tests (Jun 16, 2023)
- ac07aac: let the lm generate (Jun 16, 2023)
- 34b52c9: refactor: reshape inside forward (Jun 19, 2023)
- 242b2bc: undo greedy loop changes (Jun 19, 2023)
- 293f44b: from_enc_dec -> from_sub_model (Jun 19, 2023)
- ccd3a99: fix input id shapes in docstrings (Jun 19, 2023)
- 7d6ea11: Apply suggestions from code review (sanchit-gandhi, Jun 19, 2023)
- b8d09cb: undo generate changes (Jun 19, 2023)
- 84da8ac: from sub model config (Jun 20, 2023)
- 6a76bb9: Update src/transformers/models/musicgen/modeling_musicgen.py (sanchit-gandhi, Jun 19, 2023)
- da611d0: make generate work again (Jun 20, 2023)
- 251a1b5: generate uncond -> get uncond inputs (Jun 20, 2023)
- 5f78277: remove prefix allowed tokens fn (Jun 20, 2023)
- 3b246c1: better error message (Jun 20, 2023)
- 7d26e3e: logit proc checks (Jun 20, 2023)
- f7dd1c1: Apply suggestions from code review (sanchit-gandhi, Jun 20, 2023)
- fe7d7b4: make decoder only tests work (Jun 20, 2023)
- 445698c: composite fast tests (Jun 22, 2023)
- 0cfe68f: make style (Jun 22, 2023)
- eb49e4a: uncond generation (Jun 22, 2023)
- 35c1f36: feat extr padding (Jun 22, 2023)
- 9f2a030: make audio prompt work (Jun 22, 2023)
- 15cff55: fix inputs docstrings (Jun 23, 2023)
- db2b3b6: unconditional inputs: dict -> model output (Jun 23, 2023)
- 0e73197: clean up tests (Jun 23, 2023)
- 2a783de: more clean up tests (Jun 23, 2023)
- 8fed510: make style (Jun 23, 2023)
- 00640ee: t5 encoder -> auto text encoder (Jun 23, 2023)
- f1e51ba: remove comments (Jun 23, 2023)
- 938e7c1: deal with frames (Jun 23, 2023)
- 8475bb6: fix auto text (Jun 23, 2023)
- 727d30b: slow tests (Jun 23, 2023)
- 8152142: nice mdx (Jun 23, 2023)
- 1e99722: remove can generate (Jun 23, 2023)
- e232053: todo - hub id (Jun 23, 2023)
- feabb25: convert m/l (Jun 23, 2023)
- f5ebc62: make fix copies (Jun 23, 2023)
- c905bf0: only import generation with torch (Jun 23, 2023)
- 9115d0c: ignore decoder from tests (Jun 23, 2023)
- 2be768c: don't wrap uncond inputs (Jun 23, 2023)
- cb0273d: make style (Jun 23, 2023)
- 857e2c0: cleaner uncond inputs (Jun 23, 2023)
- 637960e: add example to musicgen forward (Jun 23, 2023)
- d7f4538: fix docs (Jun 23, 2023)
- 53de80f: ignore MusicGen Model/ForConditionalGeneration in auto mapping (Jun 23, 2023)
- 6ac1ec8: add doc section to toctree (Jun 26, 2023)
- 1a52522: add to doc tests (Jun 26, 2023)
- 2f450dc: add processor tests (Jun 26, 2023)
- 5a01be4: fix push to hub in conversion (Jun 26, 2023)
- 2a93c32: tips for decoder only loading (Jun 26, 2023)
- 04c19d3: Apply suggestions from code review (sanchit-gandhi, Jun 26, 2023)
- f8fc3ae: fix conversion for s / m / l checkpoints (Jun 26, 2023)
- 4ea636c: Merge remote-tracking branch 'origin/audiocraft' into audiocraft (Jun 26, 2023)
- a1bf8b0: import stopping criteria from module (Jun 26, 2023)
- e5607b8: remove from pipeline tests (Jun 26, 2023)
- 9268223: fix uncond docstring (Jun 26, 2023)
- 928813d: decode audio method (Jun 27, 2023)
- 48e044f: fix docs (Jun 27, 2023)
- 829f724: org: sanchit-gandhi -> facebook (Jun 27, 2023)
- 082257b: fix max pos embeddings (Jun 27, 2023)
- 523dd4a: remove auto doc (not compatible with shapes) (Jun 27, 2023)
- 59fb3c1: bump max pos emb (Jun 27, 2023)
- 11e62da: make style (Jun 27, 2023)
- f57b114: fix doc (Jun 27, 2023)
- eb6eb3d: fix config doc (Jun 27, 2023)
- 90917c3: fix config doc (Jun 27, 2023)
- 2e1ac25: ignore musicgen config from docstring (Jun 27, 2023)
- 89ef58d: make style (Jun 27, 2023)
- dbd6c4d: fix config (Jun 27, 2023)
- 7a47757: fix config for doctest (Jun 27, 2023)
- 0e86fcd: consistent from_sub_models (Jun 27, 2023)
- 789fd4d: don't automap decoder (Jun 27, 2023)
- 20f3c35: Merge branch 'main' into audiocraft (sanchit-gandhi, Jun 27, 2023)
- 5132a88: fix mdx save audio file (Jun 28, 2023)
- 0788d33: fix mdx save audio file (Jun 28, 2023)
- c0235d3: processor batch decode for audio (Jun 29, 2023)
- 1e4b914: remove keys to ignore (Jun 29, 2023)
- 27af227: update doc md (Jun 29, 2023)
- edb461e: update generation config (Jun 29, 2023)
- af2e3b2: allow changes for default generation config (Jun 29, 2023)
- 028b8ef: update tests (Jun 29, 2023)
- 1913771: make style (Jun 29, 2023)
- aa1eccd: fix docstring for uncond (Jun 29, 2023)
- c1a64ac: fix processor test (Jun 29, 2023)
- e87842e: fix processor test (Jun 29, 2023)
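The "classifier free gen" and "generate uncond -> get uncond inputs" commits implement classifier-free guidance: the model is run once on the text-conditioned inputs and once on unconditional (null-prompt) inputs, and the two sets of logits are merged before sampling. A sketch of the standard merge rule under that assumption; `cfg_logits` is an illustrative name, not the function added by the PR:

```python
def cfg_logits(cond_logits, uncond_logits, guidance_scale):
    """Classifier-free guidance merge: uncond + scale * (cond - uncond).
    A scale of 1 recovers the conditional logits unchanged; larger scales
    push generation harder toward the text prompt."""
    return [u + guidance_scale * (c - u)
            for c, u in zip(cond_logits, uncond_logits)]


cond = [1.0, 2.0, 0.5]
uncond = [0.5, 0.5, 0.5]
assert cfg_logits(cond, uncond, 1.0) == cond
assert cfg_logits(cond, uncond, 3.0) == [2.0, 5.0, 0.5]
```

The "remove cfg logits processor" commits suggest this merge moved between a standalone logits processor and the model's generate loop during review.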
1 change: 1 addition & 0 deletions README.md
@@ -411,6 +411,7 @@ Current number of checkpoints: ![](https://img.shields.io/endpoint?url=https://h
1. **[MobileViTV2](https://huggingface.co/docs/transformers/model_doc/mobilevitv2)** (from Apple) released with the paper [Separable Self-attention for Mobile Vision Transformers](https://arxiv.org/abs/2206.02680) by Sachin Mehta and Mohammad Rastegari.
1. **[MPNet](https://huggingface.co/docs/transformers/model_doc/mpnet)** (from Microsoft Research) released with the paper [MPNet: Masked and Permuted Pre-training for Language Understanding](https://arxiv.org/abs/2004.09297) by Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-Yan Liu.
1. **[MT5](https://huggingface.co/docs/transformers/model_doc/mt5)** (from Google AI) released with the paper [mT5: A massively multilingual pre-trained text-to-text transformer](https://arxiv.org/abs/2010.11934) by Linting Xue, Noah Constant, Adam Roberts, Mihir Kale, Rami Al-Rfou, Aditya Siddhant, Aditya Barua, Colin Raffel.
1. **[MusicGen](https://huggingface.co/docs/transformers/main/model_doc/musicgen)** (from Meta) released with the paper [Simple and Controllable Music Generation](https://arxiv.org/abs/2306.05284) by Jade Copet, Felix Kreuk, Itai Gat, Tal Remez, David Kant, Gabriel Synnaeve, Yossi Adi and Alexandre Défossez.
1. **[MVP](https://huggingface.co/docs/transformers/model_doc/mvp)** (from RUC AI Box) released with the paper [MVP: Multi-task Supervised Pre-training for Natural Language Generation](https://arxiv.org/abs/2206.12131) by Tianyi Tang, Junyi Li, Wayne Xin Zhao and Ji-Rong Wen.
1. **[NAT](https://huggingface.co/docs/transformers/model_doc/nat)** (from SHI Labs) released with the paper [Neighborhood Attention Transformer](https://arxiv.org/abs/2204.07143) by Ali Hassani, Steven Walton, Jiachen Li, Shen Li, and Humphrey Shi.
1. **[Nezha](https://huggingface.co/docs/transformers/model_doc/nezha)** (from Huawei Noah’s Ark Lab) released with the paper [NEZHA: Neural Contextualized Representation for Chinese Language Understanding](https://arxiv.org/abs/1909.00204) by Junqiu Wei, Xiaozhe Ren, Xiaoguang Li, Wenyong Huang, Yi Liao, Yasheng Wang, Jiashu Lin, Xin Jiang, Xiao Chen and Qun Liu.
1 change: 1 addition & 0 deletions README_es.md
@@ -386,6 +386,7 @@ Número actual de puntos de control: ![](https://img.shields.io/endpoint?url=htt
1. **[MobileViTV2](https://huggingface.co/docs/transformers/model_doc/mobilevitv2)** (from Apple) released with the paper [Separable Self-attention for Mobile Vision Transformers](https://arxiv.org/abs/2206.02680) by Sachin Mehta and Mohammad Rastegari.
1. **[MPNet](https://huggingface.co/docs/transformers/model_doc/mpnet)** (from Microsoft Research) released with the paper [MPNet: Masked and Permuted Pre-training for Language Understanding](https://arxiv.org/abs/2004.09297) by Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-Yan Liu.
1. **[MT5](https://huggingface.co/docs/transformers/model_doc/mt5)** (from Google AI) released with the paper [mT5: A massively multilingual pre-trained text-to-text transformer](https://arxiv.org/abs/2010.11934) by Linting Xue, Noah Constant, Adam Roberts, Mihir Kale, Rami Al-Rfou, Aditya Siddhant, Aditya Barua, Colin Raffel.
1. **[MusicGen](https://huggingface.co/docs/transformers/main/model_doc/musicgen)** (from Meta) released with the paper [Simple and Controllable Music Generation](https://arxiv.org/abs/2306.05284) by Jade Copet, Felix Kreuk, Itai Gat, Tal Remez, David Kant, Gabriel Synnaeve, Yossi Adi and Alexandre Défossez.
1. **[MVP](https://huggingface.co/docs/transformers/model_doc/mvp)** (from RUC AI Box) released with the paper [MVP: Multi-task Supervised Pre-training for Natural Language Generation](https://arxiv.org/abs/2206.12131) by Tianyi Tang, Junyi Li, Wayne Xin Zhao and Ji-Rong Wen.
1. **[NAT](https://huggingface.co/docs/transformers/model_doc/nat)** (from SHI Labs) released with the paper [Neighborhood Attention Transformer](https://arxiv.org/abs/2204.07143) by Ali Hassani, Steven Walton, Jiachen Li, Shen Li, and Humphrey Shi.
1. **[Nezha](https://huggingface.co/docs/transformers/model_doc/nezha)** (from Huawei Noah’s Ark Lab) released with the paper [NEZHA: Neural Contextualized Representation for Chinese Language Understanding](https://arxiv.org/abs/1909.00204) by Junqiu Wei, Xiaozhe Ren, Xiaoguang Li, Wenyong Huang, Yi Liao, Yasheng Wang, Jiashu Lin, Xin Jiang, Xiao Chen and Qun Liu.
1 change: 1 addition & 0 deletions README_hd.md
@@ -358,6 +358,7 @@ conda install -c huggingface transformers
1. **[MobileViTV2](https://huggingface.co/docs/transformers/model_doc/mobilevitv2)** (Apple से) Sachin Mehta and Mohammad Rastegari. द्वाराअनुसंधान पत्र [Separable Self-attention for Mobile Vision Transformers](https://arxiv.org/abs/2206.02680) के साथ जारी किया गया
1. **[MPNet](https://huggingface.co/docs/transformers/model_doc/mpnet)** (from Microsoft Research) released with the paper [MPNet: Masked and Permuted Pre-training for Language Understanding](https://arxiv.org/abs/2004.09297) by Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-Yan Liu.
1. **[MT5](https://huggingface.co/docs/transformers/model_doc/mt5)** (Google AI से) साथ वाला पेपर [mT5: एक व्यापक बहुभाषी पूर्व-प्रशिक्षित टेक्स्ट-टू-टेक्स्ट ट्रांसफॉर्मर]( https://arxiv.org/abs/2010.11934) लिंटिंग ज़ू, नोआ कॉन्सटेंट, एडम रॉबर्ट्स, मिहिर काले, रामी अल-रफू, आदित्य सिद्धांत, आदित्य बरुआ, कॉलिन रैफेल द्वारा पोस्ट किया गया।
1. **[MusicGen](https://huggingface.co/docs/transformers/main/model_doc/musicgen)** (from Meta) released with the paper [Simple and Controllable Music Generation](https://arxiv.org/abs/2306.05284) by Jade Copet, Felix Kreuk, Itai Gat, Tal Remez, David Kant, Gabriel Synnaeve, Yossi Adi and Alexandre Défossez.
1. **[MVP](https://huggingface.co/docs/transformers/model_doc/mvp)** (from RUC AI Box) released with the paper [MVP: Multi-task Supervised Pre-training for Natural Language Generation](https://arxiv.org/abs/2206.12131) by Tianyi Tang, Junyi Li, Wayne Xin Zhao and Ji-Rong Wen.
1. **[NAT](https://huggingface.co/docs/transformers/model_doc/nat)** (from SHI Labs) released with the paper [Neighborhood Attention Transformer](https://arxiv.org/abs/2204.07143) by Ali Hassani, Steven Walton, Jiachen Li, Shen Li, and Humphrey Shi.
1. **[Nezha](https://huggingface.co/docs/transformers/model_doc/nezha)** (हुआवेई नूह के आर्क लैब से) साथ में कागज़ [NEZHA: चीनी भाषा समझ के लिए तंत्रिका प्रासंगिक प्रतिनिधित्व](https :/ /arxiv.org/abs/1909.00204) जुन्किउ वेई, ज़ियाओज़े रेन, ज़िआओगुआंग ली, वेनयोंग हुआंग, यी लियाओ, याशेंग वांग, जियाशू लिन, शिन जियांग, जिओ चेन और कुन लियू द्वारा।
1 change: 1 addition & 0 deletions README_ja.md
@@ -420,6 +420,7 @@ Flax、PyTorch、TensorFlowをcondaでインストールする方法は、それ
1. **[MobileViTV2](https://huggingface.co/docs/transformers/model_doc/mobilevitv2)** (Apple から) Sachin Mehta and Mohammad Rastegari. から公開された研究論文 [Separable Self-attention for Mobile Vision Transformers](https://arxiv.org/abs/2206.02680)
1. **[MPNet](https://huggingface.co/docs/transformers/model_doc/mpnet)** (Microsoft Research から) Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-Yan Liu から公開された研究論文: [MPNet: Masked and Permuted Pre-training for Language Understanding](https://arxiv.org/abs/2004.09297)
1. **[MT5](https://huggingface.co/docs/transformers/model_doc/mt5)** (Google AI から) Linting Xue, Noah Constant, Adam Roberts, Mihir Kale, Rami Al-Rfou, Aditya Siddhant, Aditya Barua, Colin Raffel から公開された研究論文: [mT5: A massively multilingual pre-trained text-to-text transformer](https://arxiv.org/abs/2010.11934)
1. **[MusicGen](https://huggingface.co/docs/transformers/main/model_doc/musicgen)** (from Meta) released with the paper [Simple and Controllable Music Generation](https://arxiv.org/abs/2306.05284) by Jade Copet, Felix Kreuk, Itai Gat, Tal Remez, David Kant, Gabriel Synnaeve, Yossi Adi and Alexandre Défossez.
1. **[MVP](https://huggingface.co/docs/transformers/model_doc/mvp)** (RUC AI Box から) Tianyi Tang, Junyi Li, Wayne Xin Zhao and Ji-Rong Wen から公開された研究論文: [MVP: Multi-task Supervised Pre-training for Natural Language Generation](https://arxiv.org/abs/2206.12131)
1. **[NAT](https://huggingface.co/docs/transformers/model_doc/nat)** (SHI Labs から) Ali Hassani, Steven Walton, Jiachen Li, Shen Li, and Humphrey Shi から公開された研究論文: [Neighborhood Attention Transformer](https://arxiv.org/abs/2204.07143)
1. **[Nezha](https://huggingface.co/docs/transformers/model_doc/nezha)** (Huawei Noah’s Ark Lab から) Junqiu Wei, Xiaozhe Ren, Xiaoguang Li, Wenyong Huang, Yi Liao, Yasheng Wang, Jiashu Lin, Xin Jiang, Xiao Chen and Qun Liu から公開された研究論文: [NEZHA: Neural Contextualized Representation for Chinese Language Understanding](https://arxiv.org/abs/1909.00204)
1 change: 1 addition & 0 deletions README_ko.md
@@ -335,6 +335,7 @@ Flax, PyTorch, TensorFlow 설치 페이지에서 이들을 conda로 설치하는
1. **[MobileViTV2](https://huggingface.co/docs/transformers/model_doc/mobilevitv2)** (Apple 에서 제공)은 Sachin Mehta and Mohammad Rastegari.의 [Separable Self-attention for Mobile Vision Transformers](https://arxiv.org/abs/2206.02680)논문과 함께 발표했습니다.
1. **[MPNet](https://huggingface.co/docs/transformers/model_doc/mpnet)** (Microsoft Research 에서) Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-Yan Liu 의 [MPNet: Masked and Permuted Pre-training for Language Understanding](https://arxiv.org/abs/2004.09297) 논문과 함께 발표했습니다.
1. **[MT5](https://huggingface.co/docs/transformers/model_doc/mt5)** (Google AI 에서) Linting Xue, Noah Constant, Adam Roberts, Mihir Kale, Rami Al-Rfou, Aditya Siddhant, Aditya Barua, Colin Raffel 의 [mT5: A massively multilingual pre-trained text-to-text transformer](https://arxiv.org/abs/2010.11934) 논문과 함께 발표했습니다.
1. **[MusicGen](https://huggingface.co/docs/transformers/main/model_doc/musicgen)** (from Meta) released with the paper [Simple and Controllable Music Generation](https://arxiv.org/abs/2306.05284) by Jade Copet, Felix Kreuk, Itai Gat, Tal Remez, David Kant, Gabriel Synnaeve, Yossi Adi and Alexandre Défossez.
1. **[MVP](https://huggingface.co/docs/transformers/model_doc/mvp)** (RUC AI Box 에서) Tianyi Tang, Junyi Li, Wayne Xin Zhao and Ji-Rong Wen 의 [MVP: Multi-task Supervised Pre-training for Natural Language Generation](https://arxiv.org/abs/2206.12131) 논문과 함께 발표했습니다.
1. **[NAT](https://huggingface.co/docs/transformers/model_doc/nat)** (SHI Labs 에서) Ali Hassani, Steven Walton, Jiachen Li, Shen Li, and Humphrey Shi 의 [Neighborhood Attention Transformer](https://arxiv.org/abs/2204.07143) 논문과 함께 발표했습니다.
1. **[Nezha](https://huggingface.co/docs/transformers/model_doc/nezha)** (Huawei Noah’s Ark Lab 에서) Junqiu Wei, Xiaozhe Ren, Xiaoguang Li, Wenyong Huang, Yi Liao, Yasheng Wang, Jiashu Lin, Xin Jiang, Xiao Chen and Qun Liu 의 [NEZHA: Neural Contextualized Representation for Chinese Language Understanding](https://arxiv.org/abs/1909.00204) 논문과 함께 발표했습니다.
1 change: 1 addition & 0 deletions README_zh-hans.md
@@ -359,6 +359,7 @@ conda install -c huggingface transformers
1. **[MobileViTV2](https://huggingface.co/docs/transformers/model_doc/mobilevitv2)** (来自 Apple) 伴随论文 [Separable Self-attention for Mobile Vision Transformers](https://arxiv.org/abs/2206.02680) 由 Sachin Mehta and Mohammad Rastegari 发布。
1. **[MPNet](https://huggingface.co/docs/transformers/model_doc/mpnet)** (来自 Microsoft Research) 伴随论文 [MPNet: Masked and Permuted Pre-training for Language Understanding](https://arxiv.org/abs/2004.09297) 由 Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-Yan Liu 发布。
1. **[MT5](https://huggingface.co/docs/transformers/model_doc/mt5)** (来自 Google AI) 伴随论文 [mT5: A massively multilingual pre-trained text-to-text transformer](https://arxiv.org/abs/2010.11934) 由 Linting Xue, Noah Constant, Adam Roberts, Mihir Kale, Rami Al-Rfou, Aditya Siddhant, Aditya Barua, Colin Raffel 发布。
1. **[MusicGen](https://huggingface.co/docs/transformers/main/model_doc/musicgen)** (from Meta) released with the paper [Simple and Controllable Music Generation](https://arxiv.org/abs/2306.05284) by Jade Copet, Felix Kreuk, Itai Gat, Tal Remez, David Kant, Gabriel Synnaeve, Yossi Adi and Alexandre Défossez.
1. **[MVP](https://huggingface.co/docs/transformers/model_doc/mvp)** (来自 中国人民大学 AI Box) 伴随论文 [MVP: Multi-task Supervised Pre-training for Natural Language Generation](https://arxiv.org/abs/2206.12131) 由 Tianyi Tang, Junyi Li, Wayne Xin Zhao and Ji-Rong Wen 发布。
1. **[NAT](https://huggingface.co/docs/transformers/model_doc/nat)** (来自 SHI Labs) 伴随论文 [Neighborhood Attention Transformer](https://arxiv.org/abs/2204.07143) 由 Ali Hassani, Steven Walton, Jiachen Li, Shen Li, and Humphrey Shi 发布。
1. **[Nezha](https://huggingface.co/docs/transformers/model_doc/nezha)** (来自华为诺亚方舟实验室) 伴随论文 [NEZHA: Neural Contextualized Representation for Chinese Language Understanding](https://arxiv.org/abs/1909.00204) 由 Junqiu Wei, Xiaozhe Ren, Xiaoguang Li, Wenyong Huang, Yi Liao, Yasheng Wang, Jiashu Lin, Xin Jiang, Xiao Chen and Qun Liu 发布。
1 change: 1 addition & 0 deletions README_zh-hant.md
@@ -371,6 +371,7 @@ conda install -c huggingface transformers
1. **[MobileViTV2](https://huggingface.co/docs/transformers/model_doc/mobilevitv2)** (from Apple) released with the paper [Separable Self-attention for Mobile Vision Transformers](https://arxiv.org/abs/2206.02680) by Sachin Mehta and Mohammad Rastegari.
1. **[MPNet](https://huggingface.co/docs/transformers/model_doc/mpnet)** (from Microsoft Research) released with the paper [MPNet: Masked and Permuted Pre-training for Language Understanding](https://arxiv.org/abs/2004.09297) by Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-Yan Liu.
1. **[MT5](https://huggingface.co/docs/transformers/model_doc/mt5)** (from Google AI) released with the paper [mT5: A massively multilingual pre-trained text-to-text transformer](https://arxiv.org/abs/2010.11934) by Linting Xue, Noah Constant, Adam Roberts, Mihir Kale, Rami Al-Rfou, Aditya Siddhant, Aditya Barua, Colin Raffel.
1. **[MusicGen](https://huggingface.co/docs/transformers/main/model_doc/musicgen)** (from Meta) released with the paper [Simple and Controllable Music Generation](https://arxiv.org/abs/2306.05284) by Jade Copet, Felix Kreuk, Itai Gat, Tal Remez, David Kant, Gabriel Synnaeve, Yossi Adi and Alexandre Défossez.
1. **[MVP](https://huggingface.co/docs/transformers/model_doc/mvp)** (from RUC AI Box) released with the paper [MVP: Multi-task Supervised Pre-training for Natural Language Generation](https://arxiv.org/abs/2206.12131) by Tianyi Tang, Junyi Li, Wayne Xin Zhao and Ji-Rong Wen.
1. **[NAT](https://huggingface.co/docs/transformers/model_doc/nat)** (from SHI Labs) released with the paper [Neighborhood Attention Transformer](https://arxiv.org/abs/2204.07143) by Ali Hassani, Steven Walton, Jiachen Li, Shen Li, and Humphrey Shi.
1. **[Nezha](https://huggingface.co/docs/transformers/model_doc/nezha)** (from Huawei Noah’s Ark Lab) released with the paper [NEZHA: Neural Contextualized Representation for Chinese Language Understanding](https://arxiv.org/abs/1909.00204) by Junqiu Wei, Xiaozhe Ren, Xiaoguang Li, Wenyong Huang, Yi Liao, Yasheng Wang, Jiashu Lin, Xin Jiang, Xiao Chen and Qun Liu.
2 changes: 2 additions & 0 deletions docs/source/en/_toctree.yml
@@ -549,6 +549,8 @@
title: MCTCT
- local: model_doc/mms
title: MMS
- local: model_doc/musicgen
title: MusicGen
- local: model_doc/sew
title: SEW
- local: model_doc/sew-d
2 changes: 2 additions & 0 deletions docs/source/en/index.md
@@ -175,6 +175,7 @@ The documentation is organized into five sections:
1. **[MobileViTV2](model_doc/mobilevitv2)** (from Apple) released with the paper [Separable Self-attention for Mobile Vision Transformers](https://arxiv.org/abs/2206.02680) by Sachin Mehta and Mohammad Rastegari.
1. **[MPNet](model_doc/mpnet)** (from Microsoft Research) released with the paper [MPNet: Masked and Permuted Pre-training for Language Understanding](https://arxiv.org/abs/2004.09297) by Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-Yan Liu.
1. **[MT5](model_doc/mt5)** (from Google AI) released with the paper [mT5: A massively multilingual pre-trained text-to-text transformer](https://arxiv.org/abs/2010.11934) by Linting Xue, Noah Constant, Adam Roberts, Mihir Kale, Rami Al-Rfou, Aditya Siddhant, Aditya Barua, Colin Raffel.
1. **[MusicGen](model_doc/musicgen)** (from Meta) released with the paper [Simple and Controllable Music Generation](https://arxiv.org/abs/2306.05284) by Jade Copet, Felix Kreuk, Itai Gat, Tal Remez, David Kant, Gabriel Synnaeve, Yossi Adi and Alexandre Défossez.
1. **[MVP](model_doc/mvp)** (from RUC AI Box) released with the paper [MVP: Multi-task Supervised Pre-training for Natural Language Generation](https://arxiv.org/abs/2206.12131) by Tianyi Tang, Junyi Li, Wayne Xin Zhao and Ji-Rong Wen.
1. **[NAT](model_doc/nat)** (from SHI Labs) released with the paper [Neighborhood Attention Transformer](https://arxiv.org/abs/2204.07143) by Ali Hassani, Steven Walton, Jiachen Li, Shen Li, and Humphrey Shi.
1. **[Nezha](model_doc/nezha)** (from Huawei Noah’s Ark Lab) released with the paper [NEZHA: Neural Contextualized Representation for Chinese Language Understanding](https://arxiv.org/abs/1909.00204) by Junqiu Wei, Xiaozhe Ren, Xiaoguang Li, Wenyong Huang, Yi Liao, Yasheng Wang, Jiashu Lin, Xin Jiang, Xiao Chen and Qun Liu.
@@ -380,6 +381,7 @@ Flax), PyTorch, and/or TensorFlow.
| MobileViTV2 | ❌ | ❌ | ✅ | ❌ | ❌ |
| MPNet | ✅ | ✅ | ✅ | ✅ | ❌ |
| MT5 | ✅ | ✅ | ✅ | ✅ | ✅ |
| MusicGen | ❌ | ❌ | ✅ | ❌ | ❌ |
| MVP | ✅ | ✅ | ✅ | ❌ | ❌ |
| NAT | ❌ | ❌ | ✅ | ❌ | ❌ |
| Nezha | ❌ | ❌ | ✅ | ❌ | ❌ |