[js/web] revise backend registration #18715
As I commented in the PR, the description has a problem with EP lists like ['webnn', 'webgpu'].
If we create session with

The default EP list is registered from lib/index.ts, with the BUILD_DEFS set correspondingly. For example: `registerBackend('webgpu', wasmBackend, 5);` This registered object

Technically they are all the same except webgl. For the "backend" concept defined in onnxruntime-common, there are only 2 backends implemented in onnxruntime-web: the WebAssembly backend and the webgl backend. Now in this PR, backend name is added into the

The project history may explain why the backend registry/resolve concept is confusing: when we migrated onnx.js to onnxruntime-web, the new concept "execution provider" came in, and it is similar to, yet different from, the old "backend" concept. Now, "backend" is an internal concept, while "execution provider" is the public term used in the API. Also, since the new onnxruntime-web is based on WebAssembly, in the long term every backend will be the wasm backend.
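To make the "several EP names, one backend object" point concrete, here is a minimal, self-contained sketch. It is not the onnxruntime-common implementation: the `BackendSketch` type, the `registry` map, and every priority except the `5` quoted above for 'webgpu' are assumptions made for illustration, as is the rule that a higher priority replaces an earlier registration.

```typescript
// Hypothetical stand-in for a backend object; not the onnxruntime-common type.
interface BackendSketch {
  readonly label: string;
}

interface Registration {
  backend: BackendSketch;
  priority: number;
}

// Maps a public EP name to the internal backend that serves it.
const registry = new Map<string, Registration>();

// Assumed rule for this sketch: a later registration only replaces an
// earlier one under the same name when its priority is strictly higher.
function registerBackend(name: string, backend: BackendSketch, priority: number): void {
  const existing = registry.get(name);
  if (existing === undefined || priority > existing.priority) {
    registry.set(name, { backend, priority });
  }
}

const wasmBackend: BackendSketch = { label: 'wasm' };
const webglBackend: BackendSketch = { label: 'webgl' };

// Only the 'webgpu' line mirrors the comment above; the rest are illustrative.
registerBackend('webgpu', wasmBackend, 5);
registerBackend('webnn', wasmBackend, 5);
registerBackend('cpu', wasmBackend, 10);
registerBackend('wasm', wasmBackend, 10);
registerBackend('webgl', webglBackend, -10);
```

In this sketch, 'webgpu', 'webnn', 'cpu' and 'wasm' all resolve to the same object, and only 'webgl' resolves to a different one, which is the distinction between "execution provider" and "backend" drawn in the comment.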
This PR now includes all changes from #18756. Initialization is now split into 3 steps for wasm. The last step is the EP-specific initialization (currently webgpu only). The wasm initialization steps are all combined and put into backend.init(), guarded as call_once. The EP initialization step may be called multiple times, but for each EP name it will also run only once.
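The once-only guarantees described here can be sketched by caching promises. This is an illustration under assumptions, not the code in this PR: `initWasm`, `initEp`, and the counter are hypothetical names, and the actual initialization work is replaced by placeholders.

```typescript
// Counts how often the shared wasm initialization body actually runs.
let wasmInitRuns = 0;

// Cached promise for the shared wasm initialization (the "call_once" guard).
let wasmInitPromise: Promise<void> | undefined;

function initWasm(): Promise<void> {
  if (wasmInitPromise === undefined) {
    wasmInitPromise = (async () => {
      wasmInitRuns++;
      // Placeholder: the real wasm initialization steps would go here.
    })();
  }
  return wasmInitPromise;
}

// One cached promise per EP name, so EP-specific init runs once per name.
const epInitPromises = new Map<string, Promise<void>>();

function initEp(epName: string): Promise<void> {
  let pending = epInitPromises.get(epName);
  if (pending === undefined) {
    pending = (async () => {
      await initWasm(); // shared steps first, at most once overall
      // Placeholder: EP-specific initialization for `epName` would go here.
    })();
    epInitPromises.set(epName, pending);
  }
  return pending;
}
```

Calling `initEp('webgpu')` repeatedly returns the same promise, while a different EP name gets its own promise but still reuses the single shared wasm initialization.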
Nice work! I like this refactor.
LGTM with 2 nits. Thanks for working on this!
On top of this, we also need to implement `onnxruntime.get_available_providers()`, because depending on the provider the app might want to load a different model (say, a quantized model for wasm and an fp16 model for webgpu and webnn), so it needs to know which providers are available before it creates the session.
An EP being available doesn't mean it can fully support a specific model. We may assume that developers test their models with a specific version of onnxruntime-web and set their EP list accordingly.
Yes, and not all GPUs are equal, even if the model works well on GPU in general.
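The use case raised in this thread, choosing a model variant before the session is created, could look roughly like the sketch below. Since the thread asks for a provider query that is not yet implemented in onnxruntime-web, the list of available EPs is simply passed in as a parameter. The file names and the `chooseModel` helper are hypothetical.

```typescript
interface ModelChoice {
  ep: string;
  modelUrl: string;
}

// Hypothetical model variants per EP, following the example in the thread:
// fp16 for webgpu and webnn, quantized for wasm.
const modelByEp: Record<string, string> = {
  webgpu: 'model_fp16.onnx',
  webnn: 'model_fp16.onnx',
  wasm: 'model_quantized.onnx',
};

// Walks the preferred EPs in order and returns the first one that is
// both available and has a model variant.
function chooseModel(
  availableEps: readonly string[],
  preferredEps: readonly string[] = ['webgpu', 'webnn', 'wasm'],
): ModelChoice {
  for (const ep of preferredEps) {
    const modelUrl = modelByEp[ep];
    if (availableEps.indexOf(ep) !== -1 && modelUrl !== undefined) {
      return { ep, modelUrl };
    }
  }
  throw new Error('no preferred execution provider is available');
}
```

As the replies point out, availability alone does not guarantee that an EP fully supports a given model, so a choice like this would still need to be validated against the models the app actually ships.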
### Description

This PR revises the backend registration.

The following describes the expected behavior after this change: (**bolded are changed behavior**)

- (ort.min.js - built without webgpu support)
  - loading: do not register 'webgpu' backend
  - creating session without EP list: use default EP list ['webnn', 'cpu', 'wasm']
  - creating session with ['webgpu'] as EP list: should fail with backend not available
- (ort.webgpu.min.js - built with webgpu support)
  - loading: **always register 'webgpu' backend** (previous behavior: only register 'webgpu' backend when `navigator.gpu` is available)
  - creating session without EP list: use default EP list ['webgpu', 'webnn', 'cpu', 'wasm']
    - when WebGPU is available (win): use WebGPU backend
    - when WebGPU is unavailable (android): **should fail backend init,** and try to use next backend in the list, 'webnn' (previous behavior: does not fail backend init, but fail in JSEP init, which was too late to switch to next backend)
  - creating session with ['webgpu'] as EP list
    - when WebGPU is available (win): use WebGPU backend
    - when WebGPU is unavailable (android): **should fail backend init, and because no more EP listed, fail.**

related PRs: microsoft#18190 microsoft#18144
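The fallback behavior listed above can be sketched as a loop over the EP list. This is a simplified illustration, not the code changed by this PR: it is written synchronously for brevity, and `InitializableBackend`, `resolveBackend`, and the error text are hypothetical.

```typescript
// Hypothetical backend shape: init throws when the EP cannot be used.
interface InitializableBackend {
  init(epName: string): void;
}

// Tries each EP in order and returns the first name whose backend
// initializes. Fails only when every listed EP has failed.
function resolveBackend(
  epList: readonly string[],
  registered: Map<string, InitializableBackend>,
): string {
  const errors: string[] = [];
  for (const name of epList) {
    const backend = registered.get(name);
    if (backend === undefined) {
      // e.g. 'webgpu' requested from a build without webgpu support
      errors.push(name + ': backend not available');
      continue;
    }
    try {
      backend.init(name);
      return name;
    } catch (e) {
      // Failing here, at backend init, is early enough to try the next EP.
      errors.push(name + ': ' + (e instanceof Error ? e.message : String(e)));
    }
  }
  throw new Error('no usable backend in EP list. ' + errors.join('; '));
}
```

With a registry where 'webgpu' fails to initialize and 'webnn' succeeds, an EP list of ['webgpu', 'webnn'] resolves to 'webnn', while ['webgpu'] alone throws, matching the two "WebGPU is unavailable" cases in the list.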