
Smart execution providers #35

Merged (6 commits merged into huggingface:main on Mar 22, 2023)

Conversation

DavidGOrtega (Contributor) commented Mar 20, 2023

The purpose of this PR is:

  • Best effort to use the GPU over the CPU; if neither is available, fall back to WASM.
  • Use onnxruntime-node when it is installed as a dependency in a Node app using transformers.js (roughly 5x faster than WASM). If it is not installed, fall back to the WASM provider.
  • Add the workaround for this issue by setting numThreads to 1.
  • Use the ONNX executionProviders fallback chain for faster inference in the browser when the model supports it (see the sketch after this list).
  • Wrap the multiple onnxruntime-web require dependencies under a single file.
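
Roughly, the browser-side idea looks like this (a simplified sketch using the standard onnxruntime-web API, not the exact code in this PR; the model path and provider list are placeholders):

const ort = require('onnxruntime-web');

// Workaround for the linked issue: force single-threaded WASM.
ort.env.wasm.numThreads = 1;

async function loadModel(modelPath) {
    // ONNX Runtime walks the provider list in order and falls back to the
    // next entry if a provider cannot be initialised.
    return ort.InferenceSession.create(modelPath, {
        executionProviders: ['webgl', 'wasm'],
    });
}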

DavidGOrtega marked this pull request as draft on March 20, 2023 at 18:47
DavidGOrtega (Contributor, Author) commented:

@xenova This should do the trick. I still need to test the web build and check that the fallback works correctly.

Collaborator commented:

dependency is correct :)

xenova (Collaborator) commented Mar 20, 2023

This is great! Thanks for putting the time in to get it working. I'll test on my side and merge as soon as I can :)

Collaborator commented:

Does the execution provider switch to wasm if the webgl backend fails? If so, then this is alright. If not, then I am slightly worried about using backends (webgl/cuda/webgpu) that do not fully support the necessary operations.

Can you confirm?

DavidGOrtega (Contributor, Author) commented Mar 21, 2023

ONNX will fall back to the next provider if one fails. However, this behaviour is flaky, and because of that I have not included all the backends. With the ones selected we should be OK.

Collaborator commented:
I see you're accessing ONNX through the tensor utils file, but I think it would be better if we create a separate file (e.g., backend.js or onnx.js) which handles the loading and fallbacks of the various imports.
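
Something along these lines, perhaps (just a rough sketch of the idea, not actual code from this repo):

// backend.js (hypothetical): resolve the ONNX runtime once and re-export it.
let ONNX;
try {
    // Native node bindings, if the consuming app has installed them.
    ONNX = require('onnxruntime-node');
} catch (err) {
    // Browser / WASM fallback.
    ONNX = require('onnxruntime-web');
}
module.exports = { ONNX };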

DavidGOrtega (Contributor, Author) commented:

I had the same feeling, but I did not want the PR to grow too big or go out of scope.
I also think we need to review tensor_utils; e.g. I do not think we have to implement softmax ourselves.

xenova (Collaborator) left a review comment:

Looks good overall - just some questions about how fallbacks are handled and a few organization details.

DavidGOrtega (Contributor, Author) commented:

@xenova I'm not totally happy with the PR unless we remove all the backends and allow the user to install the node bindings, which are much faster than WASM.

Also, the fallback does not work if the model does not support a layer 🤦

I think we should provide a way to expose the desired execution provider, so I would change the PR to do this:

let ONNX;
let executionProviders = [ 'wasm' ];

try {
    // Prefer the native node bindings when they are installed as a dependency.
    ONNX = require('onnxruntime-node');
    executionProviders = [ 'cuda', 'cpu' ];
} catch (err) {
    // Otherwise fall back to the web/WASM runtime.
    ONNX = require('onnxruntime-web');
    if (typeof process === 'object') {
        // https://github.com/microsoft/onnxruntime/issues/10311
        ONNX.env.wasm.numThreads = 1;
    }
}

With the code above, we have at least fixed a rough edge and allow the user to use the node bindings if desired.
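
For completeness, the resolved ONNX handle and executionProviders list would then feed into session creation roughly like this (just a sketch, not part of this PR; modelPath is a placeholder):

async function createSession(modelPath) {
    // Both onnxruntime-node and onnxruntime-web accept the same session options,
    // so the provider list chosen above can be passed straight through.
    return await ONNX.InferenceSession.create(modelPath, { executionProviders });
}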

What do you think?

xenova (Collaborator) commented Mar 21, 2023

Yeah I agree 👍 Once WebGPU releases, I'll be more focused on getting GPU support working (both for node and browser).

I'll merge main into this PR, make some edits, then merge the PR back into main (hopefully soon haha). Thanks again for your contributions!

xenova (Collaborator) commented Mar 22, 2023

I ran some tests to see what kind of speedup these changes make, and it's amazing!

Task | Speedup
Text classification | 1400%
Question answering | 600%
Image-to-text | 400%
Text-to-text generation | 350%
Code generation | 350%
Embeddings | 325%
Masked language modelling | 300%
Translation | 225%
Summarization | 200%
Zero-shot image classification | 200%
Image classification | 175%
Text generation | 150%

🎉

I'm doing some final merging and will hopefully get it published soon :)

xenova (Collaborator) commented Mar 22, 2023

@DavidGOrtega Can you grant me write access to your fork? I would like to push the changes without having to create a new fork and make a PR. (I think you can just add me as a collaborator?)

xenova added a commit that referenced this pull request Mar 22, 2023
Smart execution providers (Merges #35 into main)
xenova merged commit cd6aafe into huggingface:main on Mar 22, 2023
xenova (Collaborator) commented Mar 22, 2023

Got it working, PR merged! 🎉 Thanks again for your contributions!

Development

Successfully merging this pull request may close these issues.

Current use of execution providers is suboptimal