Add StableLM
#28810
Hey! Thanks for contributing!
Hello!
The model is already on the hub here but uses custom modeling code. Is your suggestion to simply rename the …
No, what I mean is I think it's fine to keep it on the hub! 🤗
Looks already mergeable! Good work there 😉
Hi, @ArthurZucker; thanks for the quick review! I'd like to point out that the recent commit 097272f removes a copied-from comment from …
Great work there!
Looks good to me. Would you mind if I merged #27931 first?
That might mean you have to use copied from Mistral instead of Llama. Otherwise I'll merge this one and rebase on my side!
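For readers unfamiliar with the "copied from" convention mentioned in the comment above, here is a rough sketch of what switching the source model of such a comment from Llama to Mistral might look like. The helper names (`parse_copied_from`, `retarget`), the example class `LlamaRotaryEmbedding`, and the target name `StableLm` are assumptions chosen for illustration; this is not the repository's own copy-checking script, and it only handles the simplest `with A->B` form.

```python
import re

# Hypothetical parser, for illustration only. It is NOT the repository's
# copy-checking utility and only understands "# Copied from <path> [with A->B]".
COPIED_FROM = re.compile(
    r"#\s*Copied from\s+(?P<source>[\w.]+)(?:\s+with\s+(?P<old>\w+)->(?P<new>\w+))?"
)


def parse_copied_from(line):
    """Return (source, old, new) for a `# Copied from` comment, or None."""
    match = COPIED_FROM.search(line)
    if match is None:
        return None
    return match.group("source"), match.group("old"), match.group("new")


def retarget(line, old_model, new_model):
    """Point a `# Copied from` comment at a different source model.

    Assumes module names are lowercase and class names are CamelCase.
    """
    # First the lowercase module path, then the CamelCase class prefix.
    line = line.replace(old_model.lower(), new_model.lower())
    return line.replace(old_model, new_model)


# Assumed example comment; the actual copied classes in this PR may differ.
llama_line = (
    "# Copied from transformers.models.llama.modeling_llama."
    "LlamaRotaryEmbedding with Llama->StableLm"
)
mistral_line = retarget(llama_line, "Llama", "Mistral")
print(mistral_line)
print(parse_copied_from(mistral_line))
```

The point of the sketch is only that the comment names both a source path and a rename pattern, so changing the source model means updating both parts consistently.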
logger = logging.get_logger(__name__)

STABLELM_PRETRAINED_CONFIG_ARCHIVE_MAP = {
    "jon-tow/stablelm-3b-4e1t-dev": "https://huggingface.co/jon-tow/stablelm-3b-4e1t-dev/resolve/main/config.json",
Let's not forget to use the original repo here! (Opening a PR to the repo to upload the new config, etc.)
Thanks for the reminder! I've opened draft PRs for the base models:
- https://huggingface.co/stabilityai/stablelm-3b-4e1t/discussions/10
- https://huggingface.co/stabilityai/stablelm-2-1_6b/discussions/6
At what point should these be merged? I assume after the next release of `transformers`?
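As a rough sketch of the change being discussed, the archive map could be rebuilt from the original repository ids once the hub PRs above are merged. The `config_url` helper is hypothetical and simply follows the URL pattern visible in the reviewed snippet; which repositories actually end up in the map is an assumption, not something this thread confirms.

```python
def config_url(repo_id, revision="main"):
    """Build a config.json URL following the pattern in the reviewed snippet.

    Hypothetical helper for illustration; not part of transformers.
    """
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/config.json"


# Temporary development repo used in the PR for testing.
DEV_REPO = "jon-tow/stablelm-3b-4e1t-dev"

# Original repos named in the draft hub PRs. Assumption: the final map may
# list only some of these.
ORIGINAL_REPOS = [
    "stabilityai/stablelm-3b-4e1t",
    "stabilityai/stablelm-2-1_6b",
]

STABLELM_PRETRAINED_CONFIG_ARCHIVE_MAP = {
    repo_id: config_url(repo_id) for repo_id in ORIGINAL_REPOS
}

for repo_id, url in STABLELM_PRETRAINED_CONFIG_ARCHIVE_MAP.items():
    print(repo_id, "->", url)
```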
Yes!
Could you rebase on main and make sure the CIs are all green? 🤗 I can help if you can't finish all of them.
(force-pushed from 8704fac to 21fe181)
Can you please help with the …
(force-pushed from 004afb6 to e923955)
(force-pushed from 9986ad1 to 6a6a0ca)
Double checked, LGTM!
* Add `StableLM`
* fix(model): re-create from `huggingface-cli add-new-model-like persimmon`
* fix: re-add changes to address comments
* fix(readme): add links to paper
* fix(tokenization_auto): remove `GPTNeoXTokenizerFastFast` ref
* fix(tests): re-add `@slow` decorator to integration tests
* fix(tests): import slow...
* fix(readme_hd): remove whitespace edit
* fix(tokenizer): auto tokenizer tuple
* skip doctests for `modeling_stablelm`
What does this PR do?
This PR adds modeling support for StableLM 3B 4E1T (as well as StableLM 2 1.6B) based models.
Who can review?
@ArthurZucker
Notes
TODO: The current online implementation uses an early naming scheme for the `model_type`.
I've temporarily created a development model repository, https://huggingface.co/jon-tow/stablelm-3b-4e1t-dev, for unit testing and the config archive mapping; both need to be updated before merging.
Is there a better way to handle this? I've noticed a similar issue in this Phi model PR discussion.