
Web-LLM #10

Closed · pacoccino opened this issue Feb 13, 2024 · 7 comments

@pacoccino commented Feb 13, 2024

What about using Web-LLM instead of running an ollama server?

https://github.com/mlc-ai/web-llm

Runs models in the browser via WASM/WebGPU
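
For reference, a minimal sketch of the web-llm API (the method names and model ID are from its docs around this time and may have changed since):

```typescript
import * as webllm from "@mlc-ai/web-llm";

// ChatModule downloads the model weights on first use and runs
// inference on WebGPU, entirely in the browser.
const chat = new webllm.ChatModule();

// Model ID from the MLC prebuilt model list (example only).
await chat.reload("Llama-2-7b-chat-hf-q4f32_1");

const reply = await chat.generate("Why run LLMs in the browser?");
console.log(reply);
```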

pacoccino changed the title from web-llm to Web-LLM on Feb 13, 2024
@jacoblee93 (Owner) commented

Would be sweet! I tried it initially but had some technical issues around caching, plus the initial load time was pretty rough.

Would love to revisit now though, especially since stuff like Gemma is making waves.

@pacoccino (Author) commented

I've done a small experiment creating a Chrome extension that hosts and runs local models: https://github.com/pacoccino/ai-mask

What issues did you face with web-llm? I'll try forking your app and making it work with the extension.

@jacoblee93 (Owner) commented

Oh neat!! Had intended to try the same actually. Will check it out.

It was ~8 months ago now, but it had to do with a lack of caching/docs on how to set it up. Every time I refreshed the app it would redownload the model until my computer ran out of memory.

@pacoccino (Author) commented

I've drafted a PR just to show it works: #16

It works great, though I ran into some technical issues with the web worker. Caching works, and the extension stores the models once for any app that needs them!

AI-Mask is an experiment; I'd like to know what you think about it and whether it would be interesting to push it forward 🤔

@jacoblee93 (Owner) commented

Added separately! Thank you for the issue!

Re: AI-Mask - I have been meaning to try building something similar myself for a long time now, and I think WebLLM is getting good enough to be useful.

My thought would be to basically expose the equivalent of a LangServe endpoint in the Chrome extension:

https://github.com/langchain-ai/langserve

So then a web developer could use a remote runnable to build chains with the familiar invoke/batch/stream/streamLog API in LangChain.js:

https://js.langchain.com/docs/ecosystem/langserve
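
A rough sketch of what that could look like from the web developer's side (the endpoint URL here is hypothetical; it would be whatever the extension serves):

```typescript
import { RemoteRunnable } from "@langchain/core/runnables/remote";

// Hypothetical URL: whatever LangServe-compatible endpoint
// the extension exposes for the local model.
const model = new RemoteRunnable({
  url: "http://localhost:8000/chat/",
});

// The familiar Runnable API then just works against the local model.
const answer = await model.invoke("What is WebGPU?");
console.log(answer);

for await (const chunk of model.stream("Summarize Web-LLM in one sentence.")) {
  console.log(chunk);
}
```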

I don't expect you to add this to AI-mask but would definitely encourage you to keep it up! I think there's really something there.

@pacoccino (Author) commented Mar 22, 2024

I've opened a new PR #19 with better support for AI-Mask.

@jacoblee93 About your idea of exposing an equivalent of a LangServe endpoint from a Chrome extension, could you elaborate? What would be the difference for a web dev between using a RemoteRunnable and a lib like the ChatAIMask I'm using now?

@jacoblee93 (Owner) commented

Ah cool! Will try to take a look this weekend.

In general, it'd allow for some common operations we've seen folks want when building with LangChain. But actually, now that I think about it, the only important one beyond simple .invoke is .stream, since this would just make the model available!

By implementing the web client to the extension (like I think you've done with ChatAIMask?) as a runnable, you'd basically be able to swap out e.g. ChatOpenAI for ChatAIMask and build things like this completely locally:

https://js.langchain.com/docs/use_cases/question_answering/
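
For example, something like this (the ChatAIMask import path and constructor params are hypothetical; it assumes ChatAIMask implements LangChain's chat model interface):

```typescript
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { StringOutputParser } from "@langchain/core/output_parsers";
// Hypothetical import path; depends on how the ai-mask client is packaged.
import { ChatAIMask } from "@ai-mask/sdk";

const prompt = ChatPromptTemplate.fromTemplate(
  "Answer using only this context:\n{context}\n\nQuestion: {question}"
);

// Drop-in replacement for e.g. ChatOpenAI, but running fully locally
// through the extension. Constructor params are made up for illustration.
const model = new ChatAIMask({ modelId: "Mistral-7B-Instruct" });

const chain = prompt.pipe(model).pipe(new StringOutputParser());

// Streaming comes for free through the same Runnable API.
for await (const chunk of chain.stream({
  context: "AI-Mask is a Chrome extension that hosts local models.",
  question: "What is AI-Mask?",
})) {
  console.log(chunk);
}
```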

But yeah I think you are right in that the full LangServe suite is not necessary. Looking forward to digging in!
