[Feature request] Add Support for vicuna-13b-delta-v1.1 #96

DK013 · 2023-04-21T06:25:42Z

**support for vicuna-13b-delta-v1.1
NOTE: It's not listed in transformers supported models list but it does work with transformers

Reason for request
With the upcoming WebGPU support in ONNXRuntime I believe it'll be really helpful to have an LLm support for browser based applications and this repo is the best solution we got so far.

Additional context
I've been working on an AI assistant made in electron and cordova for desktop and mobile platforms respectively. I'm already using TransformerJS with whisper for speech-to-text. I intend to switch to WebGPU with JSEP as soon it's available, so I can leverage the GPU compute capabilities to run larger models. I'm trying to build the project with as much opensource resources as possible and having an LLM support would be real nice instead of using openai apis. This keeps the project cost free for users and user's data-privacy is another benifit. I'm really looking forward to see if this is gonna be possible. I'm willing to contribute as much as I can being a complete novice to the ML community.

Thanks in advance

xenova · 2023-04-21T21:05:58Z

I believe it'll be really helpful to have an LLm support for browser based applications and this repo is the best solution we got so far.

Agreed! WebGPU for onnxruntime-web is almost here (see microsoft/onnxruntime#14579), and Transformers.js will support it when ready! There will be a massive announcement when it does drop! As for now, it is just a matter of waiting 😅 ...

DK013 · 2023-04-25T02:56:59Z

Agreed! WebGPU for onnxruntime-web is almost here (see microsoft/onnxruntime#14579), and Transformers.js will support it when ready! There will be a massive announcement when it does drop! As for now, it is just a matter of waiting 😅 ...

It's here !! They just merged the [js/web] WebGPU backend via JSEP #14579 few hours ago into the main brunch. No official release yet. Looks like @fs-eire openned another pull request for code cleanup and some small fixes. But we can build from the main brunch and start coding 😄

fs-eire · 2023-04-25T18:44:07Z

it takes some time and effort from enabling building from source, to including in the NPM package, to release as experimental feature, and then to final release.

I will keep working on stability, performance and coverage of the webgpu backend operator implementation in ort-web. this is going to be long-term work. Please feel free to "@" me or submit github issues to onnxruntime for feedback.

xenova · 2023-04-27T00:21:03Z

@fs-eire I have been following the build instructions from here, and I think I've sorted out most issues regarding dependencies, versions, etc.

However, when running in the browser, I get the error JS execution provider is not supported in this build. Understandably, the docs have not yet properly been updated, so, would it be possible for you to provide steps to build from source, including what build arguments I should use? Thanks!

DK013 · 2023-04-27T04:26:31Z

I've been looking at the updated files and under <ORT_ROOT>/tools/ci_build/build.py I can see there's arguments for use_jsep but it works within build_wasm from what I can tell checking the yml configs for CI.
So I guess the build command will look something like this: ./build.bat --build_wasm --enable_wasm_simd --use_jsep --target onnxruntime_webassembly.
However, if you've cloned the repo as instructed here, chances are you don't have the latest source and the --use_jsep argument will fail. Simply download zip from main brunch and replace the files with latest version in local.
@xenova Give it a try

[UPDATE]: Here's what worked for me...
Build 4 times with these args:

./build.bat --config Release --build_wasm --use_jsep --skip_tests --disable_wasm_exception_catching --disable_rtti
./build.bat --config Release --build_wasm --use_jsep --skip_tests --disable_wasm_exception_catching --disable_rtti --enable_wasm_threads
./build.bat --config Release --build_wasm --use_jsep --skip_tests --disable_wasm_exception_catching --disable_rtti --enable_wasm_simd
./build.bat --config Release --build_wasm --use_jsep --skip_tests --disable_wasm_exception_catching --disable_rtti --enable_wasm_threads --enable_wasm_simd

Then copy files as instructed in the documentation and build npm package.

I'm attaching my successful npm package build zip here just in case if you wanna be lazy. 😘
ort.zip

xenova · 2023-04-27T15:14:23Z

However, if you've cloned the repo as instructed here, chances are you don't have the latest source and the --use_jsep argument will fail. Simply download zip from main brunch and replace the files with latest version in local.

That was it! Commit history says that change was made yesterday, so I was one day out of date haha.

The package is being built now, and I will hopefully have more updates later today 🎉

I'm attaching my successful npm package build zip here just in case if you wanna be lazy. 😘
ort.zip

This will come in handy! Thanks!

xenova · 2023-04-27T17:00:49Z

So, I did that all, but it doesn't seem to run :/ Here's the error I get: microsoft/onnxruntime#15719

@DK013 If you're able to, could you try run the model linked here with the following input:

let input = {
    attention_mask: new Tensor(
        'int64',
        new BigInt64Array([1n, 1n, 1n, 1n, 1n, 1n, 1n, 1n, 1n, 1n, 1n, 1n]),
        [1, 12]
    ),
    input_ids: new Tensor(
        'int64',
        new BigInt64Array([13959n, 1566n, 12n, 2379n, 10n, 8774n, 6n, 149n, 33n, 25n, 58n, 1n]),
        [1, 12]
    )
}

or see here for a full demo to test with.

fs-eire · 2023-04-28T08:01:47Z

@DK013 please use the workaround as described in microsoft/onnxruntime#15719 (comment) . I am working on a solution to fix the issue.

DK013 · 2023-05-05T04:48:20Z

@xenova As I can see fs-eire is working on the issues we've encountered before, and I'm running behind schedule on my own project, I'm implementing transformers.js with cpu for now in my code. Mainly I need whisper (a little more than the base model hopefully) to work right now and a suitable LLM model later on. So I'm gonna go ahead and complete the basic codes for testing right now, and wait for once you guys are done with polishing webgpu.
If you need any input from me, just let me know. 😉

fs-eire · 2023-05-05T06:36:28Z

@DK013 The PR mentioned above is merged, and another bugfix is in PR: microsoft/onnxruntime#15819.

matthoffner · 2023-05-31T01:09:07Z

Catching up on this issue, does this mean there is conversational model support with the onnxruntime PRs? The README shows it as not supported yet. Thanks for any clarification!

xenova · 2023-05-31T13:25:52Z

does this mean there is conversational model support with the onnxruntime PRs?

The vicuna-13b-delta-v1.1 is categorized as a text-generation model (not a conversational model), which is supported by Transformers.js. The distinction (which mainly lies in how they are used) is subtle, as both can be used for "conversations". For more information, see:

DK013 · 2023-06-02T18:54:50Z

@xenova heyo, I've been a bit busy with my own projects and running the business and all. what's the status of webgpu? How are your tests going?

DK013 added the enhancement New feature or request label Apr 21, 2023

xenova mentioned this issue Apr 27, 2023

[Web] An error occurred during model execution: "TypeError: Cannot read properties of undefined (reading 'apply')". microsoft/onnxruntime#15719

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Feature request] Add Support for vicuna-13b-delta-v1.1 #96

[Feature request] Add Support for vicuna-13b-delta-v1.1 #96

DK013 commented Apr 21, 2023

xenova commented Apr 21, 2023

DK013 commented Apr 25, 2023 •

edited

Loading

fs-eire commented Apr 25, 2023

xenova commented Apr 27, 2023

DK013 commented Apr 27, 2023 •

edited

Loading

xenova commented Apr 27, 2023

xenova commented Apr 27, 2023 •

edited

Loading

fs-eire commented Apr 28, 2023

DK013 commented May 5, 2023

fs-eire commented May 5, 2023

matthoffner commented May 31, 2023

xenova commented May 31, 2023

DK013 commented Jun 2, 2023

[Feature request] Add Support for vicuna-13b-delta-v1.1 #96

[Feature request] Add Support for vicuna-13b-delta-v1.1 #96

Comments

DK013 commented Apr 21, 2023

xenova commented Apr 21, 2023

DK013 commented Apr 25, 2023 • edited Loading

fs-eire commented Apr 25, 2023

xenova commented Apr 27, 2023

DK013 commented Apr 27, 2023 • edited Loading

xenova commented Apr 27, 2023

xenova commented Apr 27, 2023 • edited Loading

fs-eire commented Apr 28, 2023

DK013 commented May 5, 2023

fs-eire commented May 5, 2023

matthoffner commented May 31, 2023

xenova commented May 31, 2023

DK013 commented Jun 2, 2023

DK013 commented Apr 25, 2023 •

edited

Loading

DK013 commented Apr 27, 2023 •

edited

Loading

xenova commented Apr 27, 2023 •

edited

Loading