[Add Mamba] Adds support for the Mamba models #28094

Merged 123 commits on Mar 5, 2024
81c642f
initial-commit
ArthurZucker Dec 16, 2023
c50602b
Merge branch 'main' of github.com:huggingface/transformers into add-m…
ArthurZucker Jan 31, 2024
00d3a6c
start cleaning
ArthurZucker Jan 31, 2024
921bb24
small nits
ArthurZucker Feb 1, 2024
b3f216d
small nits
ArthurZucker Feb 3, 2024
7235b57
current updates
ArthurZucker Feb 3, 2024
7a407a7
add kernels
ArthurZucker Feb 5, 2024
9f2a982
small refactoring little step
ArthurZucker Feb 5, 2024
04c991a
add comments
ArthurZucker Feb 5, 2024
aa7e8d2
styling
ArthurZucker Feb 5, 2024
26748c4
nit
ArthurZucker Feb 5, 2024
75e376a
nits
ArthurZucker Feb 14, 2024
1c104b5
Style
ArthurZucker Feb 14, 2024
0e90dae
Merge
ArthurZucker Feb 14, 2024
a804466
Small changes
ArthurZucker Feb 14, 2024
6b87ad2
Push dummy mambda simple slow
ArthurZucker Feb 14, 2024
a7ec8d6
nit
ArthurZucker Feb 14, 2024
5046451
Use original names
ArthurZucker Feb 14, 2024
b5831e3
Use original names and remove norm
ArthurZucker Feb 15, 2024
e9a80ad
Updates for inference params
ArthurZucker Feb 15, 2024
ee4a7ef
Style nd updates
ArthurZucker Feb 15, 2024
d8c195f
nits
ArthurZucker Feb 15, 2024
e64fedc
Match logits
ArthurZucker Feb 16, 2024
aee558f
Add a test
ArthurZucker Feb 16, 2024
eae5f45
Add expected generated text
ArthurZucker Feb 16, 2024
1f8e8d0
nits doc, imports and styling
ArthurZucker Feb 16, 2024
3cc06e5
style
ArthurZucker Feb 16, 2024
5a5324c
oups
ArthurZucker Feb 16, 2024
325b66b
Merge branch 'main' of github.com:huggingface/transformers into add-m…
ArthurZucker Feb 16, 2024
81303f4
dont install kernels, invite users to install the required kernels
ArthurZucker Feb 19, 2024
1a10310
let use use the original packages
ArthurZucker Feb 19, 2024
89fb490
styling
ArthurZucker Feb 19, 2024
6cfe216
nits
ArthurZucker Feb 19, 2024
1ecbd22
fix some copieds
ArthurZucker Feb 19, 2024
b937122
update doc
ArthurZucker Feb 19, 2024
9752dd0
fix-copies
ArthurZucker Feb 19, 2024
a7881a3
styling done
ArthurZucker Feb 19, 2024
f445b0d
nits
ArthurZucker Feb 19, 2024
64ec8dd
fix import check
ArthurZucker Feb 19, 2024
e6e3ba8
run but wrong cuda ress
ArthurZucker Feb 19, 2024
ed4eb4c
mamba CUDA works :)
ArthurZucker Feb 19, 2024
4c8fc48
fix the fast path
ArthurZucker Feb 19, 2024
69e103f
config naming nits
ArthurZucker Feb 19, 2024
ba21ff2
conversion script is not required at this stage
ArthurZucker Feb 19, 2024
fe53728
finish fixing the fast path: generation make sense now!
ArthurZucker Feb 19, 2024
9411169
nit
ArthurZucker Feb 19, 2024
c2c7709
Let's start working on the CIs
ArthurZucker Feb 19, 2024
1e73ca9
style
ArthurZucker Feb 19, 2024
834f46f
git push Merge branch 'main' of github.com:huggingface/transformers i…
ArthurZucker Feb 19, 2024
a1a94f3
Merge branch 'main' of github.com:huggingface/transformers into add-m…
ArthurZucker Feb 20, 2024
2213222
better style
ArthurZucker Feb 20, 2024
2a02006
more nits
ArthurZucker Feb 20, 2024
8b0412f
test nit
ArthurZucker Feb 20, 2024
fbd6a2c
quick fix for now
ArthurZucker Feb 20, 2024
823f11a
nits
ArthurZucker Feb 20, 2024
88896a9
nit
ArthurZucker Feb 20, 2024
7f72ee8
nit
ArthurZucker Feb 21, 2024
0555247
Merge branch 'main' of github.com:huggingface/transformers into add-m…
ArthurZucker Feb 29, 2024
0072a6c
nit
ArthurZucker Feb 29, 2024
7f6c56f
nits
ArthurZucker Feb 29, 2024
f67c353
update test rest
ArthurZucker Feb 29, 2024
2ab5a86
fixup
ArthurZucker Feb 29, 2024
8920be3
update test
ArthurZucker Feb 29, 2024
87d0664
nit
ArthurZucker Feb 29, 2024
8b00d76
some fixes
ArthurZucker Feb 29, 2024
ca9835c
nits
ArthurZucker Feb 29, 2024
796ef3e
update test values
ArthurZucker Feb 29, 2024
170664a
fix styling
ArthurZucker Feb 29, 2024
92493a0
nit
ArthurZucker Feb 29, 2024
854ebad
support peft
ArthurZucker Feb 29, 2024
3bbd1b1
Merge branch 'main' of github.com:huggingface/transformers into add-m…
ArthurZucker Feb 29, 2024
aa0e6bb
integrations tests require torchg
ArthurZucker Feb 29, 2024
3c1537e
also add slow markers
ArthurZucker Feb 29, 2024
d06421a
styling
ArthurZucker Feb 29, 2024
5fb8062
chose forward wisely
ArthurZucker Feb 29, 2024
edb4e91
nits
ArthurZucker Feb 29, 2024
eb1fb64
update tests
ArthurZucker Feb 29, 2024
de4fe46
fix gradient checkpointing
ArthurZucker Feb 29, 2024
54ffaa3
fixup
ArthurZucker Feb 29, 2024
977d34f
nit
ArthurZucker Feb 29, 2024
0928453
fix doc
ArthurZucker Feb 29, 2024
2c90536
check copies
ArthurZucker Feb 29, 2024
4ba9c79
fix the docstring
ArthurZucker Feb 29, 2024
3651dba
fix some more tests
ArthurZucker Feb 29, 2024
426e6f3
style
ArthurZucker Feb 29, 2024
951b1aa
fix beam search
ArthurZucker Mar 1, 2024
4101369
add init schene
ArthurZucker Mar 1, 2024
65db96b
update
ArthurZucker Mar 1, 2024
0f3dfc7
nit
ArthurZucker Mar 1, 2024
f8bd0aa
fix
ArthurZucker Mar 1, 2024
b2bd0c7
fixup the doc
ArthurZucker Mar 1, 2024
cf58529
fix the doc
ArthurZucker Mar 1, 2024
e9c3447
fixup
ArthurZucker Mar 1, 2024
1282a75
tentative update but slow is no longer good
ArthurZucker Mar 1, 2024
fa561b2
nit
ArthurZucker Mar 1, 2024
91b8106
should we always use float32?
ArthurZucker Mar 1, 2024
e8142ca
nits
ArthurZucker Mar 1, 2024
623b636
revert wrong changes
ArthurZucker Mar 1, 2024
566c799
res in float32
ArthurZucker Mar 1, 2024
5d637d9
cleanup
ArthurZucker Mar 2, 2024
648a292
skip fmt for now
ArthurZucker Mar 2, 2024
e306e89
update generation values
ArthurZucker Mar 2, 2024
057d7a3
update test values running original model
ArthurZucker Mar 2, 2024
72f8936
fixup
ArthurZucker Mar 2, 2024
f415081
update tests + rename inference_params to cache_params + make sure tr…
ArthurZucker Mar 4, 2024
6bb659a
small nits
ArthurZucker Mar 4, 2024
178fe76
more nits
ArthurZucker Mar 4, 2024
3a46724
fix final CIs
ArthurZucker Mar 4, 2024
13204e0
style
ArthurZucker Mar 4, 2024
1608a90
nit doc
ArthurZucker Mar 4, 2024
99119ba
I hope final doc nits
ArthurZucker Mar 4, 2024
d6fb1ef
nit
ArthurZucker Mar 4, 2024
844530f
🫠
ArthurZucker Mar 4, 2024
52be018
final touch!
ArthurZucker Mar 4, 2024
d03de1c
fix torch import
ArthurZucker Mar 4, 2024
c0672a8
Apply suggestions from code review
ArthurZucker Mar 5, 2024
dfc1212
Apply suggestions from code review
ArthurZucker Mar 5, 2024
acd4ccf
fix fix and fix
ArthurZucker Mar 5, 2024
2ddd9aa
fix base model prefix!
ArthurZucker Mar 5, 2024
0c5d7ed
nit
ArthurZucker Mar 5, 2024
28e5ef0
Update src/transformers/models/mamba/__init__.py
ArthurZucker Mar 5, 2024
f963e38
Update docs/source/en/model_doc/mamba.md
ArthurZucker Mar 5, 2024
095dabd
nit
ArthurZucker Mar 5, 2024
1 change: 1 addition & 0 deletions README.md
@@ -415,6 +415,7 @@ Current number of checkpoints: ![](https://img.shields.io/endpoint?url=https://h
1. **[M-CTC-T](https://huggingface.co/docs/transformers/model_doc/mctct)** (from Facebook) released with the paper [Pseudo-Labeling For Massively Multilingual Speech Recognition](https://arxiv.org/abs/2111.00161) by Loren Lugosch, Tatiana Likhomanenko, Gabriel Synnaeve, and Ronan Collobert.
1. **[M2M100](https://huggingface.co/docs/transformers/model_doc/m2m_100)** (from Facebook) released with the paper [Beyond English-Centric Multilingual Machine Translation](https://arxiv.org/abs/2010.11125) by Angela Fan, Shruti Bhosale, Holger Schwenk, Zhiyi Ma, Ahmed El-Kishky, Siddharth Goyal, Mandeep Baines, Onur Celebi, Guillaume Wenzek, Vishrav Chaudhary, Naman Goyal, Tom Birch, Vitaliy Liptchinsky, Sergey Edunov, Edouard Grave, Michael Auli, Armand Joulin.
1. **[MADLAD-400](https://huggingface.co/docs/transformers/model_doc/madlad-400)** (from Google) released with the paper [MADLAD-400: A Multilingual And Document-Level Large Audited Dataset](https://arxiv.org/abs/2309.04662) by Sneha Kudugunta, Isaac Caswell, Biao Zhang, Xavier Garcia, Christopher A. Choquette-Choo, Katherine Lee, Derrick Xin, Aditya Kusupati, Romi Stella, Ankur Bapna, Orhan Firat.
1. **[Mamba](https://huggingface.co/docs/transformers/main/model_doc/mamba)** (from Albert Gu and Tri Dao) released with the paper [Mamba: Linear-Time Sequence Modeling with Selective State Spaces](https://arxiv.org/abs/2312.00752) by Albert Gu and Tri Dao.
1. **[MarianMT](https://huggingface.co/docs/transformers/model_doc/marian)** Machine translation models trained using [OPUS](http://opus.nlpl.eu/) data by Jörg Tiedemann. The [Marian Framework](https://marian-nmt.github.io/) is being developed by the Microsoft Translator Team.
1. **[MarkupLM](https://huggingface.co/docs/transformers/model_doc/markuplm)** (from Microsoft Research Asia) released with the paper [MarkupLM: Pre-training of Text and Markup Language for Visually-rich Document Understanding](https://arxiv.org/abs/2110.08518) by Junlong Li, Yiheng Xu, Lei Cui, Furu Wei.
1. **[Mask2Former](https://huggingface.co/docs/transformers/model_doc/mask2former)** (from FAIR and UIUC) released with the paper [Masked-attention Mask Transformer for Universal Image Segmentation](https://arxiv.org/abs/2112.01527) by Bowen Cheng, Ishan Misra, Alexander G. Schwing, Alexander Kirillov, Rohit Girdhar.
1 change: 1 addition & 0 deletions README_es.md
@@ -388,6 +388,7 @@ Número actual de puntos de control: ![](https://img.shields.io/endpoint?url=htt
1. **[M-CTC-T](https://huggingface.co/docs/transformers/model_doc/mctct)** (from Facebook) released with the paper [Pseudo-Labeling For Massively Multilingual Speech Recognition](https://arxiv.org/abs/2111.00161) by Loren Lugosch, Tatiana Likhomanenko, Gabriel Synnaeve, and Ronan Collobert.
1. **[M2M100](https://huggingface.co/docs/transformers/model_doc/m2m_100)** (from Facebook) released with the paper [Beyond English-Centric Multilingual Machine Translation](https://arxiv.org/abs/2010.11125) by Angela Fan, Shruti Bhosale, Holger Schwenk, Zhiyi Ma, Ahmed El-Kishky, Siddharth Goyal, Mandeep Baines, Onur Celebi, Guillaume Wenzek, Vishrav Chaudhary, Naman Goyal, Tom Birch, Vitaliy Liptchinsky, Sergey Edunov, Edouard Grave, Michael Auli, Armand Joulin.
1. **[MADLAD-400](https://huggingface.co/docs/transformers/model_doc/madlad-400)** (from Google) released with the paper [MADLAD-400: A Multilingual And Document-Level Large Audited Dataset](https://arxiv.org/abs/2309.04662) by Sneha Kudugunta, Isaac Caswell, Biao Zhang, Xavier Garcia, Christopher A. Choquette-Choo, Katherine Lee, Derrick Xin, Aditya Kusupati, Romi Stella, Ankur Bapna, Orhan Firat.
1. **[Mamba](https://huggingface.co/docs/transformers/main/model_doc/mamba)** (from Albert Gu and Tri Dao) released with the paper [Mamba: Linear-Time Sequence Modeling with Selective State Spaces](https://arxiv.org/abs/2312.00752) by Albert Gu and Tri Dao.
1. **[MarianMT](https://huggingface.co/docs/transformers/model_doc/marian)** Machine translation models trained using [OPUS](http://opus.nlpl.eu/) data by Jörg Tiedemann. The [Marian Framework](https://marian-nmt.github.io/) is being developed by the Microsoft Translator Team.
1. **[MarkupLM](https://huggingface.co/docs/transformers/model_doc/markuplm)** (from Microsoft Research Asia) released with the paper [MarkupLM: Pre-training of Text and Markup Language for Visually-rich Document Understanding](https://arxiv.org/abs/2110.08518) by Junlong Li, Yiheng Xu, Lei Cui, Furu Wei.
1. **[Mask2Former](https://huggingface.co/docs/transformers/model_doc/mask2former)** (from FAIR and UIUC) released with the paper [Masked-attention Mask Transformer for Universal Image Segmentation](https://arxiv.org/abs/2112.01527) by Bowen Cheng, Ishan Misra, Alexander G. Schwing, Alexander Kirillov, Rohit Girdhar.
1 change: 1 addition & 0 deletions README_fr.md
@@ -409,6 +409,7 @@ Nombre actuel de points de contrôle : ![](https://img.shields.io/endpoint?url=h
1. **[M-CTC-T](https://huggingface.co/docs/transformers/model_doc/mctct)** (de Facebook) a été publié dans l'article [Pseudo-Labeling For Massively Multilingual Speech Recognition](https://arxiv.org/abs/2111.00161) de Loren Lugosch, Tatiana Likhomanenko, Gabriel Synnaeve et Ronan Collobert.
1. **[M2M100](https://huggingface.co/docs/transformers/model_doc/m2m_100)** (de Facebook) a été publié dans l'article [Beyond English-Centric Multilingual Machine Translation](https://arxiv.org/abs/2010.11125) de Angela Fan, Shruti Bhosale, Holger Schwenk, Zhiyi Ma, Ahmed El-Kishky, Siddharth Goyal, Mandeep Baines, Onur Celebi, Guillaume Wenzek, Vishrav Chaudhary, Naman Goyal, Tom Birch, Vitaliy Liptchinsky, Sergey Edunov, Edouard Grave, Michael Auli, Armand Joulin.
1. **[MADLAD-400](https://huggingface.co/docs/transformers/model_doc/madlad-400)** (de Google) a été publié dans l'article [MADLAD-400 : Un ensemble de données multilingue et de niveau document](https://arxiv.org/abs/2309.04662) de Sneha Kudugunta, Isaac Caswell, Biao Zhang, Xavier Garcia, Christopher A. Choquette-Choo, Katherine Lee, Derrick Xin, Aditya Kusupati, Romi Stella, Ankur Bapna, Orhan Firat.
1. **[Mamba](https://huggingface.co/docs/transformers/main/model_doc/mamba)** (de Albert Gu and Tri Dao) publié dans l'article [Mamba: Linear-Time Sequence Modeling with Selective State Spaces](https://arxiv.org/abs/2312.00752) parAlbert Gu and Tri Dao.
1. **[MarianMT](https://huggingface.co/docs/transformers/model_doc/marian)** Des modèles de traduction automatique formés avec les données [OPUS](http://opus.nlpl.eu/) par Jörg Tiedemann. Le [cadre Marian](https://marian-nmt.github.io/) est en cours de développement par l'équipe Microsoft Translator.
1. **[MarkupLM](https://huggingface.co/docs/transformers/model_doc/markuplm)** (de Microsoft Research Asia) a été publié dans l'article [MarkupLM : Pré-entraînement de texte et de langage de balisage pour la compréhension visuellement riche de documents](https://arxiv.org/abs/2110.08518) de Junlong Li, Yiheng Xu, Lei Cui, Furu Wei.
1. **[Mask2Former](https://huggingface.co/docs/transformers/model_doc/mask2former)** (de FAIR et UIUC) a été publié dans l'article [Masked-attention Mask Transformer for Universal Image Segmentation](https://arxiv.org/abs/2112.01527) de Bowen Cheng, Ishan Misra, Alexander G. Schwing, Alexander Kirillov, Rohit Girdhar.
1 change: 1 addition & 0 deletions README_hd.md
@@ -362,6 +362,7 @@ conda install conda-forge::transformers
1. **[M-CTC-T](https://huggingface.co/docs/transformers/model_doc/mctct)** (from Facebook) released with the paper [Pseudo-Labeling For Massively Multilingual Speech Recognition](https://arxiv.org/abs/2111.00161) by Loren Lugosch, Tatiana Likhomanenko, Gabriel Synnaeve, and Ronan Collobert.
1. **[M2M100](https://huggingface.co/docs/transformers/model_doc/m2m_100)** (फेसबुक से) साथ देने वाला पेपर [बियॉन्ड इंग्लिश-सेंट्रिक मल्टीलिंगुअल मशीन ट्रांसलेशन](https://arxiv.org/एब्स/2010.11125) एंजेला फैन, श्रुति भोसले, होल्गर श्वेन्क, झी मा, अहमद अल-किश्की, सिद्धार्थ गोयल, मनदीप बैनेस, ओनूर सेलेबी, गुइल्लाम वेन्जेक, विश्रव चौधरी, नमन गोयल, टॉम बर्च, विटाली लिपचिंस्की, सर्गेई एडुनोव, एडौर्ड द्वारा ग्रेव, माइकल औली, आर्मंड जौलिन द्वारा पोस्ट किया गया।
1. **[MADLAD-400](https://huggingface.co/docs/transformers/model_doc/madlad-400)** (from Google) released with the paper [MADLAD-400: A Multilingual And Document-Level Large Audited Dataset](https://arxiv.org/abs/2309.04662) by Sneha Kudugunta, Isaac Caswell, Biao Zhang, Xavier Garcia, Christopher A. Choquette-Choo, Katherine Lee, Derrick Xin, Aditya Kusupati, Romi Stella, Ankur Bapna, Orhan Firat.
1. **[Mamba](https://huggingface.co/docs/transformers/main/model_doc/mamba)** (Albert Gu and Tri Dao से) Albert Gu and Tri Dao. द्वाराअनुसंधान पत्र [Mamba: Linear-Time Sequence Modeling with Selective State Spaces](https://arxiv.org/abs/2312.00752) के साथ जारी किया गया
1. **[MarianMT](https://huggingface.co/docs/transformers/model_doc/marian)** Jörg द्वारा [OPUS](http://opus.nlpl.eu/) डेटा से प्रशिक्षित मशीनी अनुवाद मॉडल पोस्ट किया गया टाइडेमैन द्वारा। [मैरियन फ्रेमवर्क](https://marian-nmt.github.io/) माइक्रोसॉफ्ट ट्रांसलेटर टीम द्वारा विकसित।
1. **[MarkupLM](https://huggingface.co/docs/transformers/model_doc/markuplm)** (माइक्रोसॉफ्ट रिसर्च एशिया से) साथ में पेपर [मार्कअपएलएम: विजुअली-रिच डॉक्यूमेंट अंडरस्टैंडिंग के लिए टेक्स्ट और मार्कअप लैंग्वेज का प्री-ट्रेनिंग](https://arxiv.org/abs/2110.08518) जुनलॉन्ग ली, यिहेंग जू, लेई कुई, फुरु द्वारा वी द्वारा पोस्ट किया गया।
1. **[Mask2Former](https://huggingface.co/docs/transformers/model_doc/mask2former)** (FAIR and UIUC से) Bowen Cheng, Ishan Misra, Alexander G. Schwing, Alexander Kirillov, Rohit Girdhar. द्वाराअनुसंधान पत्र [Masked-attention Mask Transformer for Universal Image Segmentation](https://arxiv.org/abs/2112.01527) के साथ जारी किया गया
1 change: 1 addition & 0 deletions README_ja.md
@@ -422,6 +422,7 @@ Flax、PyTorch、TensorFlowをcondaでインストールする方法は、それ
1. **[M-CTC-T](https://huggingface.co/docs/transformers/model_doc/mctct)** (Facebook から) Loren Lugosch, Tatiana Likhomanenko, Gabriel Synnaeve, and Ronan Collobert から公開された研究論文: [Pseudo-Labeling For Massively Multilingual Speech Recognition](https://arxiv.org/abs/2111.00161)
1. **[M2M100](https://huggingface.co/docs/transformers/model_doc/m2m_100)** (Facebook から) Angela Fan, Shruti Bhosale, Holger Schwenk, Zhiyi Ma, Ahmed El-Kishky, Siddharth Goyal, Mandeep Baines, Onur Celebi, Guillaume Wenzek, Vishrav Chaudhary, Naman Goyal, Tom Birch, Vitaliy Liptchinsky, Sergey Edunov, Edouard Grave, Michael Auli, Armand Joulin から公開された研究論文: [Beyond English-Centric Multilingual Machine Translation](https://arxiv.org/abs/2010.11125)
1. **[MADLAD-400](https://huggingface.co/docs/transformers/model_doc/madlad-400)** (from Google) released with the paper [MADLAD-400: A Multilingual And Document-Level Large Audited Dataset](https://arxiv.org/abs/2309.04662) by Sneha Kudugunta, Isaac Caswell, Biao Zhang, Xavier Garcia, Christopher A. Choquette-Choo, Katherine Lee, Derrick Xin, Aditya Kusupati, Romi Stella, Ankur Bapna, Orhan Firat.
1. **[Mamba](https://huggingface.co/docs/transformers/main/model_doc/mamba)** (Albert Gu and Tri Dao から) Albert Gu and Tri Dao. から公開された研究論文 [Mamba: Linear-Time Sequence Modeling with Selective State Spaces](https://arxiv.org/abs/2312.00752)
1. **[MarianMT](https://huggingface.co/docs/transformers/model_doc/marian)** Jörg Tiedemann から. [OPUS](http://opus.nlpl.eu/) を使いながら学習された "Machine translation" (マシントランスレーション) モデル. [Marian Framework](https://marian-nmt.github.io/) はMicrosoft Translator Team が現在開発中です.
1. **[MarkupLM](https://huggingface.co/docs/transformers/model_doc/markuplm)** (Microsoft Research Asia から) Junlong Li, Yiheng Xu, Lei Cui, Furu Wei から公開された研究論文: [MarkupLM: Pre-training of Text and Markup Language for Visually-rich Document Understanding](https://arxiv.org/abs/2110.08518)
1. **[Mask2Former](https://huggingface.co/docs/transformers/model_doc/mask2former)** (FAIR and UIUC から) Bowen Cheng, Ishan Misra, Alexander G. Schwing, Alexander Kirillov, Rohit Girdhar. から公開された研究論文 [Masked-attention Mask Transformer for Universal Image Segmentation](https://arxiv.org/abs/2112.01527)
1 change: 1 addition & 0 deletions README_ko.md
@@ -337,6 +337,7 @@ Flax, PyTorch, TensorFlow 설치 페이지에서 이들을 conda로 설치하는
1. **[M-CTC-T](https://huggingface.co/docs/transformers/model_doc/mctct)** (Facebook 에서) Loren Lugosch, Tatiana Likhomanenko, Gabriel Synnaeve, and Ronan Collobert 의 [Pseudo-Labeling For Massively Multilingual Speech Recognition](https://arxiv.org/abs/2111.00161) 논문과 함께 발표했습니다.
1. **[M2M100](https://huggingface.co/docs/transformers/model_doc/m2m_100)** (Facebook 에서) Angela Fan, Shruti Bhosale, Holger Schwenk, Zhiyi Ma, Ahmed El-Kishky, Siddharth Goyal, Mandeep Baines, Onur Celebi, Guillaume Wenzek, Vishrav Chaudhary, Naman Goyal, Tom Birch, Vitaliy Liptchinsky, Sergey Edunov, Edouard Grave, Michael Auli, Armand Joulin 의 [Beyond English-Centric Multilingual Machine Translation](https://arxiv.org/abs/2010.11125) 논문과 함께 발표했습니다.
1. **[MADLAD-400](https://huggingface.co/docs/transformers/model_doc/madlad-400)** (from Google) released with the paper [MADLAD-400: A Multilingual And Document-Level Large Audited Dataset](https://arxiv.org/abs/2309.04662) by Sneha Kudugunta, Isaac Caswell, Biao Zhang, Xavier Garcia, Christopher A. Choquette-Choo, Katherine Lee, Derrick Xin, Aditya Kusupati, Romi Stella, Ankur Bapna, Orhan Firat.
1. **[Mamba](https://huggingface.co/docs/transformers/main/model_doc/mamba)** (Albert Gu and Tri Dao 에서 제공)은 Albert Gu and Tri Dao.의 [Mamba: Linear-Time Sequence Modeling with Selective State Spaces](https://arxiv.org/abs/2312.00752)논문과 함께 발표했습니다.
1. **[MarianMT](https://huggingface.co/docs/transformers/model_doc/marian)** Machine translation models trained using [OPUS](http://opus.nlpl.eu/) data by Jörg Tiedemann. The [Marian Framework](https://marian-nmt.github.io/) is being developed by the Microsoft Translator Team.
1. **[MarkupLM](https://huggingface.co/docs/transformers/model_doc/markuplm)** (Microsoft Research Asia 에서) Junlong Li, Yiheng Xu, Lei Cui, Furu Wei 의 [MarkupLM: Pre-training of Text and Markup Language for Visually-rich Document Understanding](https://arxiv.org/abs/2110.08518) 논문과 함께 발표했습니다.
1. **[Mask2Former](https://huggingface.co/docs/transformers/model_doc/mask2former)** (FAIR and UIUC 에서 제공)은 Bowen Cheng, Ishan Misra, Alexander G. Schwing, Alexander Kirillov, Rohit Girdhar.의 [Masked-attention Mask Transformer for Universal Image Segmentation](https://arxiv.org/abs/2112.01527)논문과 함께 발표했습니다.
1 change: 1 addition & 0 deletions README_zh-hans.md
@@ -361,6 +361,7 @@ conda install conda-forge::transformers
1. **[M-CTC-T](https://huggingface.co/docs/transformers/model_doc/mctct)** (来自 Facebook) 伴随论文 [Pseudo-Labeling For Massively Multilingual Speech Recognition](https://arxiv.org/abs/2111.00161) 由 Loren Lugosch, Tatiana Likhomanenko, Gabriel Synnaeve, and Ronan Collobert 发布。
1. **[M2M100](https://huggingface.co/docs/transformers/model_doc/m2m_100)** (来自 Facebook) 伴随论文 [Beyond English-Centric Multilingual Machine Translation](https://arxiv.org/abs/2010.11125) 由 Angela Fan, Shruti Bhosale, Holger Schwenk, Zhiyi Ma, Ahmed El-Kishky, Siddharth Goyal, Mandeep Baines, Onur Celebi, Guillaume Wenzek, Vishrav Chaudhary, Naman Goyal, Tom Birch, Vitaliy Liptchinsky, Sergey Edunov, Edouard Grave, Michael Auli, Armand Joulin 发布。
1. **[MADLAD-400](https://huggingface.co/docs/transformers/model_doc/madlad-400)** (from Google) released with the paper [MADLAD-400: A Multilingual And Document-Level Large Audited Dataset](https://arxiv.org/abs/2309.04662) by Sneha Kudugunta, Isaac Caswell, Biao Zhang, Xavier Garcia, Christopher A. Choquette-Choo, Katherine Lee, Derrick Xin, Aditya Kusupati, Romi Stella, Ankur Bapna, Orhan Firat.
1. **[Mamba](https://huggingface.co/docs/transformers/main/model_doc/mamba)** (来自 Albert Gu and Tri Dao) 伴随论文 [Mamba: Linear-Time Sequence Modeling with Selective State Spaces](https://arxiv.org/abs/2312.00752) 由 Albert Gu and Tri Dao 发布。
1. **[MarianMT](https://huggingface.co/docs/transformers/model_doc/marian)** 用 [OPUS](http://opus.nlpl.eu/) 数据训练的机器翻译模型由 Jörg Tiedemann 发布。[Marian Framework](https://marian-nmt.github.io/) 由微软翻译团队开发。
1. **[MarkupLM](https://huggingface.co/docs/transformers/model_doc/markuplm)** (来自 Microsoft Research Asia) 伴随论文 [MarkupLM: Pre-training of Text and Markup Language for Visually-rich Document Understanding](https://arxiv.org/abs/2110.08518) 由 Junlong Li, Yiheng Xu, Lei Cui, Furu Wei 发布。
1. **[Mask2Former](https://huggingface.co/docs/transformers/model_doc/mask2former)** (来自 FAIR and UIUC) 伴随论文 [Masked-attention Mask Transformer for Universal Image Segmentation](https://arxiv.org/abs/2112.01527) 由 Bowen Cheng, Ishan Misra, Alexander G. Schwing, Alexander Kirillov, Rohit Girdhar 发布。