Loading a `.save()`'d model with `revision` and `trust_remote_code` re-downloads code at runtime? #2613
Hello! I see you've already gone through a lot of steps to figure out what's happening here. Indeed, when saving a model whose modeling/configuration/tokenizer files are custom and stored in a remote repository, these will not be saved as "local" in the resulting directory. Instead, the configuration files in this directory just point to the remote repository where these files can be found. This inhibits easy "out of the box" offline mode with these models, indeed. This is a design decision in the underlying Transformers library. For your specific case, we can resolve the problem like so:
With that recipe, you'll be able to "fully" download a model, after which I think you should be able to load it with `local_files_only=True`. Also, there are only two commonly used models that rely on custom modeling files: Jina and Nomic, so you won't encounter this often at all. Hopefully you can apply this recipe to get it working nicely for your use case.
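To make the "pointer" behaviour described above concrete, here is a small sketch of what the `auto_map` in a saved `config.json` can look like. The entries below are illustrative examples of the `<repo_id>--<module.Class>` convention, not copied from a real saved model:

```python
# A saved config.json for a model with remote code keeps "auto_map" values
# in the "<repo_id>--<module.Class>" form. Everything left of "--" names the
# Hub repository Transformers will fetch the .py files from at load time --
# that fetch is exactly what breaks offline usage.
saved_config = {
    "auto_map": {
        "AutoConfig": "nomic-ai/nomic-bert-2048--configuration_hf_nomic_bert.NomicBertConfig",
        "AutoModel": "nomic-ai/nomic-bert-2048--modeling_hf_nomic_bert.NomicBertModel",
    }
}

for name, value in saved_config["auto_map"].items():
    repo, _, class_path = value.partition("--")
    print(f"{name}: fetch {class_path} from {repo}")
```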
Due to a design choice in Transformers, remote code embedded in a model on HF is downloaded at runtime, and this cannot be disabled, despite our attempts to download it at build time. A practical result of this is that the built Docker container image needs network access to download the remote code, which is not ideal. This fix is a workaround to download the remote code at build time and embed it into the model by updating the `config.json` file appropriately. This should only be needed for Nomic-style models, apparently, since they are some of the only embedding models that use remote code. Many thanks to Tom Aarsen for providing me with the steps to fix this. Relevant issue: UKPLab/sentence-transformers#2613 (comment)
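The `config.json` rewrite this workaround describes boils down to stripping the `<repo_id>--` prefix from each `auto_map` entry so that Transformers resolves the `.py` files locally. A minimal sketch of that step (the entry values here are made-up examples, not from a real model):

```python
def localize_auto_map(config: dict) -> dict:
    """Strip the "<repo_id>--" prefix from auto_map entries in-place,
    so the custom .py modules are loaded from the local directory."""
    auto_map = config.get("auto_map", {})
    for name, value in auto_map.items():
        repo, sep, class_path = value.partition("--")
        if sep:  # value was "<repo_id>--<module.Class>"
            auto_map[name] = class_path
    return config

config = {"auto_map": {"AutoModel": "some-org/some-repo--modeling_custom.CustomModel"}}
print(localize_auto_map(config)["auto_map"]["AutoModel"])
# -> modeling_custom.CustomModel
```

Note that this only fixes the config; the `.py` files themselves still need to be copied next to the weights (e.g. via `snapshot_download`, as in the snippet further down the thread).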
Really appreciate the help and pointers; with your steps I was quickly able to fix this. I'll try to take some time and write up a feature request for Transformers.
Adding a simple code snippet implementing @tomaarsen's steps from above:

```python
import json
from pathlib import Path

from huggingface_hub import snapshot_download
from sentence_transformers import SentenceTransformer

model_name = 'jinaai/jina-embeddings-v3'

# steps from https://github.com/UKPLab/sentence-transformers/issues/2613#issuecomment-2076964416
# 1. download model
model = SentenceTransformer(model_name, trust_remote_code=True)

# 2. save the model
local_model_path = Path('hf-models') / model_name.replace('/', '-')
model.save(str(local_model_path))

# 3. check config files
for f_path in local_model_path.iterdir():
    if f_path.name in ['config.json', 'tokenizer_config.json']:
        with open(f_path, 'r') as f:
            config_dict = json.load(f)

        # 4a. get repository and class_path from configs
        auto_map_configs = config_dict.get('auto_map', {})
        for config_name, config_value in auto_map_configs.items():
            splits = config_value.split('--')
            if len(splits) == 2:
                repository, class_path = splits
            elif len(splits) == 1:
                repository = model_name
                class_path = splits[0]
            else:
                raise ValueError(f'strange config {config_value}')

            # 4b. download required configs from hf hub
            snapshot_download(repo_id=repository, local_dir=local_model_path)

            # 4c. update config files
            config_dict['auto_map'][config_name] = class_path

        with open(f_path, 'w') as f:
            json.dump(config_dict, f)
```

Sample test code for offline usage (recommended to test in a separate fresh environment, with a newly installed `sentence-transformers`):

```python
import os

os.environ['HF_HUB_OFFLINE'] = '1'

from pathlib import Path

from sentence_transformers import SentenceTransformer

model_name = 'jinaai/jina-embeddings-v3'
local_model_path = Path('hf-models') / model_name.replace('/', '-')
model = SentenceTransformer(str(local_model_path), trust_remote_code=True, local_files_only=True)

sentences = [
    "The weather is lovely today.",
    "It's so sunny outside!",
    "He drove to the stadium.",
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
```

Note: @tomaarsen, thank you very much for describing the steps to implement this! I would be happy to raise it there properly if so. I have a simple, real, popular use case: Kaggle Notebook environments used in competitions.
Nice work @VladKha! I think a feature request could still make some sense, but perhaps it's considered a bit too niche - very few models use third party repositories to host their code.
Hello,
Thanks for this library, it's pretty nice! I have a project using it that basically exposes sentence embeddings as a stateless HTTP API. In order to make distribution easy and consistent, it packages supported models directly with the code inside the distributed Docker images I build. I use `.save()` in order to save all the models in a first step, then put them in the Docker image in a second step. The intent is that the server will never need to download any new files and can run completely isolated from the public internet if you use the Docker image, and all model versions will always be consistent. https://github.com/thoughtpolice/embedding-server
This is currently using sentence-transformers 2.6.1
One of the models I'm using is `nomic-ai/nomic-embed-text-v1`, which requires `trust_remote_code=True`. In order to keep things hermetic, I also provide a `revision` argument for the HF repository to pin it to a specific revision.

However, it seems like the `.save()` call does not actually save the runtime configuration code appropriately or something, so every subsequent run of the Docker container immediately re-downloads the remote code and runs it.

For example, try the following command:
You'll see this:
Looking at the provided directory, neither of these files are there:
So this means the download is incomplete, for some reason, i.e. `.save()` does not save all the needed files, in particular the two `.py` files in the upstream HF repo: https://huggingface.co/nomic-ai/nomic-embed-text-v1/tree/main

This is especially misleading because the warning message suggests using `revision` to pin the version of the file, but it doesn't work in the case of `.save()`, I guess (maybe there are some call sites where the `revision` argument was forgotten?)

I actually tried to fix this but to no avail. It seems that just putting the files in the data directory next to the weights isn't enough; it will re-download those files to `$HF_HOME` instead and load them from there, so this fix does not work.

I'm afraid I'm not very good at Python and don't know if this is a bug in sentence-transformers, my code, or the distributed Nomic HF model.
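The "which files are missing after `.save()`" check can be scripted. A sketch, assuming the two module filenames from the upstream repo listing; the local directory path is a hypothetical example:

```python
from pathlib import Path

def missing_remote_code(model_dir: str, expected: list[str]) -> list[str]:
    """Return the expected custom .py files that are absent from a saved model directory."""
    saved = {p.name for p in Path(model_dir).glob("*.py")}
    return [name for name in expected if name not in saved]

# The upstream nomic-ai/nomic-embed-text-v1 repo ships these two custom modules:
expected = ["modeling_hf_nomic_bert.py", "configuration_hf_nomic_bert.py"]
print(missing_remote_code("data/nomic-embed-text-v1", expected))
```

If this prints a non-empty list for the directory your build step produced, the remote code was never written locally, matching the behaviour described above.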
The code for saving and loading models is done here. It basically just calls `.save()` or the constructor with the provided model arguments, and the build system calls `--save-models-to` in order to put them all in a directory. But I could have gotten it wrong: https://github.com/thoughtpolice/embedding-server/blob/main/embedding-server.py#L116-L148

This isn't a huge deal for me right now, but it would be nice if this could work, since I did like that the server could run 100% offline. Thanks.