[V3 proposal] Improved defaults for quantization and device selection #960
Comments
Current logic for session selection: https://github.com/xenova/transformers.js/blob/6505abb164a3eea1dd5e80e56a72f7d805715f0a/src/models.js#L148-L262
Some thoughts:
Currently, the distinction between our I propose:
This class will encapsulate all of the logic wrt to devices and converting a The We should create a
Congrats on the v3 release! I have a couple of questions about WebNN, @xenova: I see that the respective device family is listed here: transformers.js/src/utils/devices.js, line 14 in 6505abb
@FL33TW00D A follow-up to my question above: there is no WebNN in your device selection diagram. Since it's highly specialized and can leverage both NPUs and GPUs, could it become the first choice in the future (once the API stabilizes and it is available in all mainstream browsers without flags)? What do you think? I'm asking because I have some tech sessions scheduled where I plan to present "client-side AI" in browsers, with a focus on WebNN, and I plan to use Transformers.js v3 for the demo (I started experimenting with WebNN in it right after it landed in the 3-alpha version). Thanks!
Feature request
Currently, Transformers.js V3 defaults to CPU (WASM) instead of GPU (WebGPU) due to lack of support and instability across browsers (specifically Firefox and Safari, and Chrome on Ubuntu). However, this provides a poor user experience, since performance is left on the table. As browser support for WebGPU increases (currently ~70%), this will become more important, because more users will get poor performance even though better settings are available to them.
A better proposal would be to use device: "auto" instead of device: null by default, which should select (1) quantization and (2) device based on the following:
Motivation
Improve user experience and performance with better defaults
Your contribution
Will work with @FL33TW00D on this
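To make the device: "auto" idea above a bit more concrete, here is a rough sketch of what the resolution step might look like. Everything in it is an assumption for illustration: resolveAutoDevice is a hypothetical helper (not part of Transformers.js), and the rules it applies (fall back to WASM with q8 when no WebGPU adapter is available, prefer fp16 on WebGPU when the adapter reports the "shader-f16" feature, otherwise fp32) are placeholders, not the decision diagram from this issue.

```javascript
// Hypothetical helper, NOT part of Transformers.js: resolves device "auto"
// to a concrete { device, dtype } pair. The decision rules are assumptions.
// `gpu` is injectable so the logic can be exercised outside a browser.
async function resolveAutoDevice(gpu = globalThis.navigator?.gpu) {
  // Assumed fallback: quantized weights on CPU via WASM.
  const fallback = { device: "wasm", dtype: "q8" };

  // No WebGPU entry point at all (e.g. unsupported browser).
  if (!gpu) return fallback;

  let adapter = null;
  try {
    // requestAdapter() resolves to null when no suitable adapter exists.
    adapter = await gpu.requestAdapter();
  } catch {
    // Treat a failing adapter request the same as "no adapter".
    adapter = null;
  }
  if (!adapter) return fallback;

  // "shader-f16" is the WebGPU feature name for half-precision shaders.
  const dtype = adapter.features?.has("shader-f16") ? "fp16" : "fp32";
  return { device: "webgpu", dtype };
}
```

The resolved pair could then be passed on as the device and dtype options when loading a model. Keeping the check in one small function also leaves room for other backends (for example WebNN) to be slotted in ahead of WebGPU later.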