I'm encountering an issue while trying to load a text-generation model using the pipeline function. My code searches for models on Hugging Face by keyword, filters for the "text-generation", "transformers.js", and "onnx" tags, and attempts to load the resulting models. For certain models, an error with a numeric code (e.g., 3330359752) is thrown. The numeric code changes on each run, suggesting it does not hold any specific meaning.
Notably, smaller models seem to load successfully, while larger models seem to consistently throw this error, though I'm not entirely certain of the pattern.
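Roughly, the flow looks like this (a simplified sketch; the Hub query parameters, the search keyword, and the candidate handling are illustrative rather than my exact code):

```js
import { pipeline } from "@huggingface/transformers";

// Search the Hub for text-generation models tagged for transformers.js/ONNX.
// (Illustrative query; the keyword and filter syntax are assumptions here.)
const url =
  "https://huggingface.co/api/models" +
  "?search=llama" +
  "&filter=transformers.js,onnx" +
  "&pipeline_tag=text-generation";
const candidates = await (await fetch(url)).json();

for (const { id } of candidates) {
  try {
    // Attempt to load each candidate on WebGPU.
    const generator = await pipeline("text-generation", id, { device: "webgpu" });
    console.log("Loaded:", id);
  } catch (err) {
    // Larger models fail here with an error whose code looks like 3330359752
    // (the number changes between runs).
    console.error("Failed:", id, err);
  }
}
```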
While this might be an out-of-memory issue, the models have been tested and work in Node.js, so this may be a runtime error specific to WebGPU. cc @guschmue
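One way to sanity-check the out-of-memory theory is to look at the WebGPU buffer limits the browser actually grants, since the defaults are far smaller than total GPU memory (a rough sketch using the standard WebGPU adapter/device API):

```js
// Inspect the buffer-size limits granted by the browser's WebGPU implementation.
// If a model's largest weight buffer exceeds these, allocation would fail even
// though the machine has plenty of memory overall.
const adapter = await navigator.gpu.requestAdapter();
if (!adapter) throw new Error("WebGPU is not available in this browser");

console.log("maxBufferSize:", adapter.limits.maxBufferSize);
console.log("maxStorageBufferBindingSize:", adapter.limits.maxStorageBufferBindingSize);

// Higher limits can be requested explicitly when creating the device:
const device = await adapter.requestDevice({
  requiredLimits: {
    maxBufferSize: adapter.limits.maxBufferSize,
    maxStorageBufferBindingSize: adapter.limits.maxStorageBufferBindingSize,
  },
});
```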
Facing the same issue. I'm admittedly out of my depth in terms of figuring out the cause, but I've noticed that the dtype seems to be important.
e.g.
As per OP's message, Llama-3.2-1B-Instruct will not load, but the q4f16 version will.
onnx-community/AMD-OLMo-1B-SFT-DPO will not load automatically, but it will load when being explicit with { device: "webgpu", dtype: "q4" } (see the sketch below).
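Roughly what that looks like (a minimal sketch; the model ID is the one mentioned above and the dtype value is simply the one that happened to work for me):

```js
import { pipeline } from "@huggingface/transformers";

// Fails with the numeric-code error when loaded with default options:
// const generator = await pipeline(
//   "text-generation", "onnx-community/AMD-OLMo-1B-SFT-DPO", { device: "webgpu" });

// Loads once the quantization dtype is pinned explicitly:
const generator = await pipeline(
  "text-generation",
  "onnx-community/AMD-OLMo-1B-SFT-DPO",
  { device: "webgpu", dtype: "q4" },
);

const output = await generator("Hello, my name is", { max_new_tokens: 32 });
console.log(output);
```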
Additional Context:
Also on an Apple Silicon M1 chip.
I have been able to load even the Llama-3.1-8B-Instruct-q4f32_1-MLC model using WebLLM, which makes me think memory is maybe not the issue.
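For comparison, this is roughly how that WebLLM load looks (a sketch using the @mlc-ai/web-llm package; the model ID is the one mentioned above):

```js
import { CreateMLCEngine } from "@mlc-ai/web-llm";

// An 8B q4f32 model loads and generates fine on the same M1 machine via WebLLM,
// which suggests total GPU memory is not the limiting factor here.
const engine = await CreateMLCEngine("Llama-3.1-8B-Instruct-q4f32_1-MLC", {
  initProgressCallback: (report) => console.log(report.text),
});

const reply = await engine.chat.completions.create({
  messages: [{ role: "user", content: "Hello!" }],
});
console.log(reply.choices[0].message.content);
```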
System Info
Environment/Platform
Description
Models that loaded successfully:
Models that failed to load:
Reproduction