[js/web] revise backend registration #18715

Merged
merged 14 commits into from
Dec 20, 2023

@fs-eire fs-eire commented Dec 6, 2023

Description

This PR revises the backend registration.

The following describes the expected behavior after this change (**bolded** items are changed behavior):

  • (ort.min.js - built without webgpu support)
    • loading: do not register 'webgpu' backend
    • creating session without EP list: use default EP list ['webnn', 'cpu', 'wasm']
    • creating session with ['webgpu'] as EP list: should fail with backend not available
  • (ort.webgpu.min.js - built with webgpu support)
    • loading: **always register 'webgpu' backend**
      (previous behavior: only register 'webgpu' backend when navigator.gpu is available)
    • creating session without EP list: use default EP list ['webgpu', 'webnn', 'cpu', 'wasm']
      • when WebGPU is available (win): use WebGPU backend
      • when WebGPU is unavailable (android): **should fail backend init,** and try to use next backend in the list, 'webnn'
        (previous behavior: does not fail backend init, but fails in JSEP init, which was too late to switch to the next backend)
    • creating session with ['webgpu'] as EP list
      • when WebGPU is available (win): use WebGPU backend
      • when WebGPU is unavailable (android): **should fail backend init, and because no more EPs are listed, fail.**
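
The fallback behavior in the list above can be sketched roughly as follows. This is a simplified illustration and not the onnxruntime-web implementation: `SketchBackend`, `makeWasmLikeBackend`, and `resolveFirstUsable` are hypothetical names, `init` is synchronous here for brevity, and every backend other than 'webgpu' is assumed to initialize successfully.

```typescript
// Hypothetical shape for illustration; not the onnxruntime-common Backend interface.
interface SketchBackend {
  // Throws when the backend cannot serve the given name in this environment.
  init(backendName: string): void;
}

// Stand-in for the wasm backend object; `hasWebGpu` mimics a `navigator.gpu` check.
function makeWasmLikeBackend(hasWebGpu: boolean): SketchBackend {
  return {
    init(backendName: string): void {
      if (backendName === 'webgpu' && !hasWebGpu) {
        throw new Error('WebGPU is not available');
      }
    },
  };
}

// Walks the EP list in order and returns the first name whose backend initializes.
function resolveFirstUsable(
  registry: ReadonlyMap<string, SketchBackend>,
  epList: readonly string[],
): string {
  const errors: string[] = [];
  for (const name of epList) {
    const backend = registry.get(name);
    if (backend === undefined) {
      errors.push(`${name}: backend not available`);
      continue;
    }
    try {
      backend.init(name);
      return name;
    } catch (e) {
      errors.push(`${name}: ${e instanceof Error ? e.message : String(e)}`);
    }
  }
  throw new Error(`no usable backend: ${errors.join('; ')}`);
}

// Simulates ort.webgpu.min.js loaded on a device without WebGPU.
const noGpuRegistry = new Map<string, SketchBackend>();
const noGpuBackend = makeWasmLikeBackend(false);
for (const name of ['webgpu', 'webnn', 'cpu', 'wasm']) {
  noGpuRegistry.set(name, noGpuBackend);
}

const fallbackChoice = resolveFirstUsable(noGpuRegistry, ['webgpu', 'webnn', 'cpu', 'wasm']);
// fallbackChoice is 'webnn'; resolveFirstUsable(noGpuRegistry, ['webgpu']) throws instead.
```

Failing inside `init` is what makes the fallback possible in this sketch: the error surfaces while the list is still being walked, so the next name can be tried.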

related PRs: #18190 #18144

gyagp commented Dec 6, 2023

As I commented in the PR, the description has a problem with EP lists like ['webnn', 'webgpu'].
So the only difference between ort.min.js and ort.webgpu.min.js is webgpu support, due to package size concerns. For any release, a list of EPs will be served (designated by default or provided by developers), and webgpu will only be initialized when it's chosen.
BTW, for the default EP list, do we have any spec that describes the order? What's the difference between the cpu EP and the wasm EP? Why do we put cpu before wasm?

fs-eire commented Dec 6, 2023

If we create a session with executionProviders, those names are used as the EP list.
If we create a session without executionProviders in the session options, the default EP list is used.
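
As a rough sketch of the rule above (the types and the function `effectiveEpNames` are hypothetical and only illustrate the selection logic, not the library's code): when the options carry `executionProviders`, their names are used; otherwise the default list applies. How an explicitly empty list is handled is not covered here.

```typescript
// Hypothetical shapes for illustration only.
type SketchEpConfig = string | { name: string };

interface SketchSessionOptions {
  executionProviders?: readonly SketchEpConfig[];
}

// Returns the EP names a session would try, in order.
function effectiveEpNames(
  options: SketchSessionOptions,
  defaultEpList: readonly string[],
): string[] {
  const eps = options.executionProviders;
  if (eps === undefined) {
    // No list given: fall back to the default EP list.
    return [...defaultEpList];
  }
  // A list was given: use its names as they are, in the given order.
  return eps.map((ep) => (typeof ep === 'string' ? ep : ep.name));
}

const sketchDefaults = ['webnn', 'cpu', 'wasm'];
const withoutList = effectiveEpNames({}, sketchDefaults);
const withList = effectiveEpNames({ executionProviders: ['webgpu', { name: 'wasm' }] }, sketchDefaults);
```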

The default EP list comes from the backends registered in lib/index.ts, depending on how BUILD_DEFS is set for the build.

For example:

    registerBackend('webgpu', wasmBackend, 5);

This registers the object wasmBackend under the backend name "webgpu" with priority 5. A lower number means higher priority.
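
To illustrate how such registrations could yield an ordered default EP list, here is a minimal sketch that follows the convention stated in this comment (lower number first). It is not the onnxruntime-common code: `registerSketchBackend` and `sketchDefaultEpList` are hypothetical, and the priorities for 'webnn', 'cpu', and 'wasm' are assumed values chosen only to reproduce the default list from the description.

```typescript
interface SketchEntry {
  name: string;
  backend: object;
  priority: number;
}

const sketchEntries: SketchEntry[] = [];

// Registers `backend` under `name`; re-registering a name replaces the earlier entry.
function registerSketchBackend(name: string, backend: object, priority: number): void {
  const existing = sketchEntries.findIndex((e) => e.name === name);
  if (existing !== -1) {
    sketchEntries.splice(existing, 1);
  }
  sketchEntries.push({ name, backend, priority });
  // Array.prototype.sort is stable, so equal priorities keep registration order.
  sketchEntries.sort((a, b) => a.priority - b.priority);
}

// The default EP list is simply the registered names in priority order.
function sketchDefaultEpList(): string[] {
  return sketchEntries.map((e) => e.name);
}

const sharedWasmBackend = {}; // one object registered under several names
registerSketchBackend('webgpu', sharedWasmBackend, 5);
registerSketchBackend('webnn', sharedWasmBackend, 5); // assumed priority
registerSketchBackend('cpu', sharedWasmBackend, 10); // assumed priority
registerSketchBackend('wasm', sharedWasmBackend, 10); // assumed priority

const sketchOrder = sketchDefaultEpList(); // ['webgpu', 'webnn', 'cpu', 'wasm']
```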

Technically they are all the same except webgl. For the "backend" concept defined in onnxruntime-common, there are only 2 backends implemented in onnxruntime-web: the WebAssembly backend and the webgl backend. In this PR, the backend name is passed to the init() function so that the WebAssembly backend can behave differently when it is called for a specific backend name.

The project history may explain why the backend registry/resolve concept is confusing: when we migrated onnx.js to onnxruntime-web, the new concept "execution provider" arrived, and it is similar to, yet different from, the old "backend" concept. Now "backend" is an internal concept, while "execution provider" is the public term used in the API. Also, since the new onnxruntime-web is based on WebAssembly, in the long term every backend will be a wasm backend.

fs-eire commented Dec 12, 2023

This PR now includes all changes from #18756.

Initialization is now split into 3 steps for wasm. The last step is the EP-specific initialization (currently webgpu).

The wasm initialization steps are all combined and put into backend.init(), guarded as call_once.

The EP initialization step may be called multiple times, but for each EP name it will run only once.
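
The two guards described above can be sketched as follows. This is an assumption-laden illustration, not the code from this PR: `SketchWasmBackend` is a hypothetical class, the wasm steps are placeholders recorded in a log, and `init` is synchronous here for brevity.

```typescript
class SketchWasmBackend {
  private wasmInitialized = false;
  private readonly initializedEps = new Set<string>();
  // Records which steps actually ran, so the guards can be observed.
  readonly log: string[] = [];

  init(backendName: string): void {
    // Shared wasm initialization: runs at most once, whichever name triggers it.
    if (!this.wasmInitialized) {
      this.log.push('wasm: step 1');
      this.log.push('wasm: step 2');
      this.wasmInitialized = true;
    }
    // EP-specific initialization: runs at most once per EP name.
    if (!this.initializedEps.has(backendName)) {
      if (backendName === 'webgpu') {
        this.log.push('ep: webgpu');
      }
      this.initializedEps.add(backendName);
    }
  }
}

const sketchBackend = new SketchWasmBackend();
sketchBackend.init('webgpu');
sketchBackend.init('webgpu'); // no new work: both guards are already set
sketchBackend.init('wasm'); // shared steps are skipped; nothing EP-specific to do
// sketchBackend.log is ['wasm: step 1', 'wasm: step 2', 'ep: webgpu']
```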

@qjia7 qjia7 left a comment

Nice work! I like this refactor.

js/web/lib/wasm/proxy-messages.ts (resolved)

@gyagp gyagp left a comment

LGTM with 2 nits. Thanks for working on this!

js/web/lib/wasm/jsep/init.ts (outdated, resolved)
js/web/lib/wasm/jsep/init.ts (outdated, resolved)
satyajandhyala previously approved these changes Dec 19, 2023

@guschmue guschmue left a comment

On top of this, we also need to implement onnxruntime.get_available_providers(), because depending on the provider the app might want to load a different model (say, quantized for wasm, fp16 for webgpu and webnn), so it needs to know before it creates the session.
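
To make the use case concrete, here is a small sketch of choosing a model variant per provider. Everything in it is hypothetical: the file names are invented, and the list of available providers is passed in by the caller because the thread only proposes an API for obtaining it.

```typescript
// Hypothetical model file names, keyed by EP name.
const modelForProvider = new Map<string, string>([
  ['webgpu', 'model.fp16.onnx'],
  ['webnn', 'model.fp16.onnx'],
  ['wasm', 'model.quantized.onnx'],
]);

interface ModelChoice {
  provider: string;
  model: string;
}

// Picks the first available provider that has a matching model variant.
function chooseModel(availableProviders: readonly string[]): ModelChoice | undefined {
  for (const provider of availableProviders) {
    const model = modelForProvider.get(provider);
    if (model !== undefined) {
      return { provider, model };
    }
  }
  return undefined;
}

const wasmOnlyChoice = chooseModel(['wasm']);
const gpuFirstChoice = chooseModel(['webgpu', 'wasm']);
```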

gyagp commented Dec 20, 2023

An EP being available doesn't mean it can fully support a specific model. We may assume that developers test their models with a specific version of onnxruntime-web and set their EP list correctly.

@fs-eire fs-eire merged commit 9a61388 into main Dec 20, 2023
92 of 100 checks passed
@fs-eire fs-eire deleted the fs-eire/allow-flexible-webgpu-backend-selection branch December 20, 2023 22:45
@guschmue

Yes, and not all GPUs are equal, even if the model works well on GPU in general.
Hard for developers to deal with this. Longer term, we should have some utility functions to help app developers with it.

siweic0 pushed a commit to siweic0/onnxruntime-web that referenced this pull request May 9, 2024