```
Uncaught (in promise) Error: no available backend found. ERR: [webgpu]
TypeError: import() is disallowed on ServiceWorkerGlobalScope by the HTML specification.
See https://github.com/w3c/ServiceWorker/issues/1356.
```
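For context, the TypeError comes from the HTML spec's ban on dynamic `import()` inside service workers, which onnxruntime-web apparently relies on to load its webgpu/wasm backends. Here is a minimal sketch of the restriction; everything in it (including the hypothetical chunk path) is illustrative and not taken from the original report:

```js
// background.js registered as a module service worker (Manifest V3).
// Static imports at the top of the module are allowed:
import { env } from "@xenova/transformers";

// ...but dynamic import() anywhere in ServiceWorkerGlobalScope is rejected,
// regardless of the specifier ("./some-chunk.js" is a hypothetical path):
(async () => {
  try {
    await import("./some-chunk.js");
  } catch (e) {
    // TypeError: import() is disallowed on ServiceWorkerGlobalScope by the HTML specification.
    console.error(e);
  }
})();
```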
If I'm doing something wrong, I would love some help here; otherwise, this might be an issue with onnx or some other part of how transformers.js uses it.
Reproduction
You can put this in the extension's background.js; it is mostly copied from the Phi-3 WebGPU chat and extension examples:
```js
// background.js - Handles requests from the UI, runs the model, then sends back a response
import {
  pipeline,
  env,
  AutoModelForCausalLM,
  AutoTokenizer,
  TextStreamer,
  StoppingCriteria,
} from "@xenova/transformers";

// Skip initial check for local models, since we are not loading any local models.
env.allowLocalModels = false;

// Due to a bug in onnxruntime-web, we must disable multithreading for now.
// See https://github.com/microsoft/onnxruntime/issues/14445 for more information.
env.backends.onnx.wasm.numThreads = 1;
// env.backends.onnx.wasm.wasmPaths =
//   "https://cdn.jsdelivr.net/npm/[email protected]/dist/";

class CallbackTextStreamer extends TextStreamer {
  constructor(tokenizer, cb) {
    super(tokenizer, {
      skip_prompt: true,
      skip_special_tokens: true,
    });
    this.cb = cb;
  }

  on_finalized_text(text) {
    this.cb(text);
  }
}

class InterruptableStoppingCriteria extends StoppingCriteria {
  constructor() {
    super();
    this.interrupted = false;
  }

  interrupt() {
    this.interrupted = true;
  }

  reset() {
    this.interrupted = false;
  }

  _call(input_ids, scores) {
    return new Array(input_ids.length).fill(this.interrupted);
  }
}

const stopping_criteria = new InterruptableStoppingCriteria();

async function hasFp16() {
  try {
    const adapter = await navigator.gpu.requestAdapter();
    return adapter.features.has("shader-f16");
  } catch (e) {
    return false;
  }
}

class PipelineSingleton {
  static task = "feature-extraction";
  static model_id = "Xenova/Phi-3-mini-4k-instruct_fp16";
  static model = null;
  static instance = null;

  static async getInstance(progress_callback = null) {
    this.model_id ??= (await hasFp16())
      ? "Xenova/Phi-3-mini-4k-instruct_fp16"
      : "Xenova/Phi-3-mini-4k-instruct";

    this.tokenizer ??= AutoTokenizer.from_pretrained(this.model_id, {
      legacy: true,
      progress_callback,
    });

    this.model ??= AutoModelForCausalLM.from_pretrained(this.model_id, {
      dtype: "q4",
      device: "webgpu",
      use_external_data_format: true,
      progress_callback,
    });

    return Promise.all([this.tokenizer, this.model]);
  }
}

// Create generic classify function, which will be reused for the different types of events.
const classify = async (text) => {
  // Get the pipeline instance. This will load and build the model when run for the first time.
  const [tokenizer, model] = await PipelineSingleton.getInstance((data) => {
    // You can track the progress of the pipeline creation here.
    // e.g., you can send `data` back to the UI to indicate a progress bar
    console.log("progress", data);
    // data logs as this:
    /**
     * {
     *   "status": "progress",
     *   "name": "Xenova/Phi-3-mini-4k-instruct_fp16",
     *   "file": "onnx/model_q4.onnx",
     *   "progress": 99.80381792394503,
     *   "loaded": 836435968,
     *   "total": 838080131
     * }
     * when complete, last status will be 'done'
     */
  });

  const inputs = tokenizer.apply_chat_template(text, {
    add_generation_prompt: true,
    return_dict: true,
  });

  let startTime;
  let numTokens = 0;
  const cb = (output) => {
    startTime ??= performance.now();

    let tps;
    if (numTokens++ > 0) {
      tps = (numTokens / (performance.now() - startTime)) * 1000;
    }
    self.postMessage({
      status: "update",
      output,
      tps,
      numTokens,
    });
  };

  const streamer = new CallbackTextStreamer(tokenizer, cb);

  // Tell the main thread we are starting
  self.postMessage({ status: "start" });

  const outputs = await model.generate({
    ...inputs,
    max_new_tokens: 512,
    streamer,
    stopping_criteria,
  });
  const outputText = tokenizer.batch_decode(outputs, {
    skip_special_tokens: false,
  });

  // Send the output back to the main thread
  self.postMessage({ status: "complete", output: outputText });

  // Actually run the model on the input text
  // let result = await model(text);
  // return result;
};

//////////////////// 1. Context Menus ////////////////////
//
// Add a listener to create the initial context menu items,
// context menu items only need to be created at runtime.onInstalled
chrome.runtime.onInstalled.addListener(function () {
  // Register a context menu item that will only show up for selection text.
  chrome.contextMenus.create({
    id: "classify-selection",
    title: 'Classify "%s"',
    contexts: ["selection"],
  });
});

// Perform inference when the user clicks a context menu
chrome.contextMenus.onClicked.addListener(async (info, tab) => {
  // Ignore context menu clicks that are not for classifications (or when there is no input)
  if (info.menuItemId !== "classify-selection" || !info.selectionText) return;

  // Perform classification on the selected text
  let result = await classify(info.selectionText);

  // Do something with the result
  chrome.scripting.executeScript({
    target: { tabId: tab.id }, // Run in the tab that the user clicked in
    args: [result], // The arguments to pass to the function
    function: (result) => {
      // The function to run
      // NOTE: This function is run in the context of the web page, meaning that `document` is available.
      console.log("result", result);
      console.log("document", document);
    },
  });
});
//////////////////////////////////////////////////////////

//////////////////// 2. Message Events ////////////////////
//
// Listen for messages from the UI, process it, and send the result back.
chrome.runtime.onMessage.addListener((message, sender, sendResponse) => {
  console.log("sender", sender);
  if (message.action !== "classify") return; // Ignore messages that are not meant for classification.

  // Run model prediction asynchronously
  (async function () {
    // Perform classification
    let result = await classify(message.text);

    // Send response back to UI
    sendResponse(result);
  })();

  // return true to indicate we will send a response asynchronously
  // see https://stackoverflow.com/a/46628145 for more information
  return true;
});
```
It might also help to add that this originates in onnxruntime-web's resolveBackendAndExecutionProviders implementation in the InferenceSession class. This might ultimately boil down to an onnxruntime issue, though I haven't seen any issues raised in that repo around service workers.
It looks like this really might be an onnxruntime issue, and it's being worked on at that end; transformers.js might need a version bump once that gets resolved.
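One way to confirm this without transformers.js in the middle is to create an InferenceSession directly from the extension's service worker. This is a sketch under assumptions: the model path is a placeholder, and depending on the onnxruntime-web version the webgpu backend may need to be imported from "onnxruntime-web/webgpu" instead of the main entry point.

```js
import * as ort from "onnxruntime-web"; // or "onnxruntime-web/webgpu", version-dependent

(async () => {
  try {
    // "model.onnx" is a placeholder path; the provider order mirrors what
    // transformers.js requests when device: "webgpu" is set, with wasm as fallback.
    const session = await ort.InferenceSession.create("model.onnx", {
      executionProviders: ["webgpu", "wasm"],
    });
    console.log("session created:", session.inputNames);
  } catch (e) {
    // Expected today: resolveBackendAndExecutionProviders rejects because the backends
    // are loaded via dynamic import(), which service workers disallow.
    console.error("no available backend:", e);
  }
})();
```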
System Info
macOS 14.4.1, Chrome 125.
Environment/Platform
Browser extension (Chrome)
Description
I'm trying out the Phi-3 WebGPU chat example based on transformers.js v3, but inside the Chrome extension example.
But no matter what, I keep getting the error quoted at the top of this issue, which occurs after downloading the model, when onnxruntime-web resolves the backend. If I remove the device param, it tries to use [wasm] as the backend, but this also fails.
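For reference, this is the loading call in question, excerpted from the reproduction above (model_id and progress_callback are defined there); the only difference between the two attempts is the device option:

```js
// With device: "webgpu" the error is ERR: [webgpu]; removing that line makes
// onnxruntime-web fall back to the wasm backend, which fails the same way.
const model = await AutoModelForCausalLM.from_pretrained(model_id, {
  dtype: "q4",
  device: "webgpu", // remove this line to fall back to the wasm backend
  use_external_data_format: true,
  progress_callback,
});
```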
Chrome recently fixed this issue and made the WebGPU API available to service workers.
Here is an example extension from the mlc-ai/web-llm package that implements WebGPU usage in service workers successfully:
https://github.com/mlc-ai/web-llm/tree/main/examples/chrome-extension-webgpu-service-worker
Here is some further discussion on this new support from Google itself:
https://groups.google.com/a/chromium.org/g/chromium-extensions/c/ZEcSLsjCw84/m/WkQa5LAHAQAJ
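A quick way to confirm that WebGPU itself is reachable from the extension's service worker (a sketch; the version gating is per the linked discussion and not verified here):

```js
// background.js (sketch): check that the WebGPU API is exposed in this service worker.
chrome.runtime.onInstalled.addListener(async () => {
  console.log("navigator.gpu available:", "gpu" in navigator);
  const adapter = await navigator.gpu?.requestAdapter();
  // A non-null adapter means WebGPU works in this context; the failure is then in how
  // onnxruntime-web loads its backend, not in WebGPU availability itself.
  console.log("adapter:", adapter, "shader-f16:", adapter?.features?.has("shader-f16"));
});
```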