
MPT models on the Hub not working with transformers main #703

Closed
younesbelkada opened this issue Oct 30, 2023 · 5 comments · Fixed by #704
Labels
bug Something isn't working

Comments


younesbelkada commented Oct 30, 2023

Hi there!

Currently, with transformers main, loading MPT models from the Hub fails because the remote modeling code tries to import a private method (such as `_expand_mask`) that was recently removed: huggingface/transformers#27086

The simple loading script below should work:

from accelerate import init_empty_weights
from transformers import AutoModelForCausalLM, AutoConfig

model_id = "mosaicml/mpt-7b"
config = AutoConfig.from_pretrained(
    model_id, trust_remote_code=True
)
# Instantiate on the meta device so no weights are allocated;
# the failure happens while importing the remote modeling code.
with init_empty_weights():
    model = AutoModelForCausalLM.from_config(
        config, trust_remote_code=True
    )
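For context, one common way for remote modeling code to stay compatible when a private transformers helper like `_expand_mask` is removed is to guard the import and fall back to a local copy. The sketch below is only illustrative (the actual foundry fix is in #704, not necessarily this approach), and the import path shown is just an example; the fallback mirrors the behavior of the old transformers helper.

import torch

try:
    # Older transformers versions still expose the private helper.
    from transformers.models.llama.modeling_llama import _expand_mask
except ImportError:
    # Fallback: local re-implementation of the removed helper.
    def _expand_mask(mask: torch.Tensor, dtype: torch.dtype, tgt_len: int = None):
        # Expand a [bsz, src_len] padding mask to [bsz, 1, tgt_len, src_len],
        # with 0 where attention is allowed and dtype.min where it is masked.
        bsz, src_len = mask.size()
        tgt_len = tgt_len if tgt_len is not None else src_len
        expanded = mask[:, None, None, :].expand(bsz, 1, tgt_len, src_len).to(dtype)
        inverted = 1.0 - expanded
        return inverted.masked_fill(inverted.to(torch.bool), torch.finfo(dtype).min)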
@dakinggg
Collaborator

Thanks for letting us know, Younes. Will look into this ASAP.

@younesbelkada
Author

Thanks @dakinggg !

@dakinggg
Collaborator

@younesbelkada this should be resolved in the foundry code now, and I'm uploading the updated code to the hf hub as we speak.

@dakinggg
Collaborator

Ok, this should be resolved completely now. Let me know if you see otherwise! Thanks again for the report :)

@younesbelkada
Author

Works like a charm now! Thanks for the quick fix @dakinggg
