[BUG] Error "model content changed" while loading pretrained models #1571
Comments
Dup of #844?
@austintlee Although the message is the same, I think the underlying issue is different. #844 only happens on macOS, when you try to redeploy the model. In this case, I was able to reproduce the issue on the managed service, but only when the instance capacity is small: https://forum.opensearch.org/t/error-when-loading-embedding-model-into-memory/15351/6 I didn't see this issue on bigger instances.
Using version 2.11, this only happens when I try to upload URL models. When I use pretrained models from Hugging Face, it works just fine.
For your case, you faced this issue
Duplicate of #844 (comment); closing this one.
What is the bug?
Loading a pretrained model on a smaller instance such as t3.small fails with the error "model content changed". The actual cause is insufficient memory.
How can one reproduce the bug?
Steps to reproduce the behavior:
Create an OpenSearch cluster with low-memory instances such as t3.small (2 GB of memory), then perform the steps below.
Step 1) Register the model. The response provides the ID of the registration task.
POST /_plugins/_ml/models/_upload
{
  "name": "huggingface/sentence-transformers/all-MiniLM-L12-v2",
  "version": "1.0.1",
  "model_format": "TORCH_SCRIPT"
}
Step 2) Verify that the model was registered. The task response provides the registered model_id.
GET /_plugins/_ml/tasks/<task_id>
Step 3) Deploy the model using its model_id. The response provides the ID of the deployment task.
POST /_plugins/_ml/models/<model_id>/_load
Step 4) Verify the deployment of the model. On a low-memory instance, this is where the "model content changed" error is reported.
GET /_plugins/_ml/tasks/<task_id>
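The four steps above can be scripted. The following is a minimal sketch rather than a verified reproduction script: the endpoints and request body are taken from the steps, while `BASE_URL`, the helper names (`send`, `wait_for_task`, `reproduce`), and the assumption that task responses carry `task_id`, `model_id`, and `state` fields (with `CREATED` and `RUNNING` as the only non-terminal states) are illustrative and should be checked against your cluster.

```python
import json
import time
import urllib.request

# Assumption: a local cluster without TLS or authentication.
BASE_URL = "http://localhost:9200"


def send(method, path, body=None):
    """Send one JSON request to the cluster and return the decoded response."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(
        BASE_URL + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_for_task(task_id, transport=send, poll_seconds=2.0, max_polls=60):
    """Poll the task endpoint (steps 2 and 4) until the task leaves a running state."""
    for _ in range(max_polls):
        task = transport("GET", f"/_plugins/_ml/tasks/{task_id}")
        # Assumption: CREATED and RUNNING are the only non-terminal states.
        if task.get("state") not in ("CREATED", "RUNNING"):
            return task
        time.sleep(poll_seconds)
    raise TimeoutError(f"task {task_id} did not finish after {max_polls} polls")


def reproduce(transport=send, poll_seconds=2.0):
    """Run steps 1 to 4 and return the final task document."""
    # Step 1: register the pretrained model.
    upload = transport("POST", "/_plugins/_ml/models/_upload", {
        "name": "huggingface/sentence-transformers/all-MiniLM-L12-v2",
        "version": "1.0.1",
        "model_format": "TORCH_SCRIPT",
    })
    # Step 2: wait for registration and read the model_id.
    registered = wait_for_task(upload["task_id"], transport, poll_seconds)
    if "model_id" not in registered:
        return registered  # registration itself failed; nothing to deploy
    # Step 3: deploy the model.
    load = transport("POST", f"/_plugins/_ml/models/{registered['model_id']}/_load")
    # Step 4: wait for deployment; a failure is expected to show up here.
    return wait_for_task(load["task_id"], transport, poll_seconds)
```

Calling `reproduce()` against a low-memory cluster and printing the returned task should show the failed deployment task; the `transport` parameter exists only so the flow can be exercised without a live cluster.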
What is the expected behavior?
The workaround is to upgrade to instances with more memory. However, the error message itself does not indicate the cause. It could be rephrased to something like "Insufficient memory" so that users know why the load failed.
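Until the message changes, a client can attach its own hint when it sees this error. The sketch below is a hypothetical client-side helper, not part of the plugin: the name `explain_task_error` is made up, the `state` and `error` fields are assumed to be present in the task response, and the link to low memory is based only on this report.

```python
def explain_task_error(task):
    """Return the task's error text, with a memory hint for the known message."""
    error = task.get("error") or ""
    # Assumption: failed tasks expose the message in an "error" field.
    if task.get("state") == "FAILED" and "model content changed" in error.lower():
        return (
            error
            + " (hint: on low-memory instances this may indicate insufficient "
            "memory; try an instance with more memory)"
        )
    return error
```

For example, passing the final task document from a failed deployment returns the original message followed by the hint, while tasks without an error yield an empty string.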
What is your host/environment?
Do you have any screenshots?
If applicable, add screenshots to help explain your problem.
Do you have any additional context?
Add any other context about the problem.