
Web-LLM #10

Closed · pacoccino opened this issue Feb 13, 2024 · 7 comments

@pacoccino commented Feb 13, 2024

What about using Web-LLM instead of running an ollama server?

https://github.com/mlc-ai/web-llm

Runs models in the browser via WASM/WebGPU
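
For reference, a minimal sketch of the web-llm API (the method names and model ID are from its docs around this time and may have changed since):

```typescript
import * as webllm from "@mlc-ai/web-llm";

// ChatModule downloads the model weights on first use and runs
// inference on WebGPU, entirely in the browser.
const chat = new webllm.ChatModule();

// Model ID from the MLC prebuilt model list (example only).
await chat.reload("Llama-2-7b-chat-hf-q4f32_1");

const reply = await chat.generate("Why run LLMs in the browser?");
console.log(reply);
```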

pacoccino changed the title from web-llm to Web-LLM on Feb 13, 2024
@jacoblee93 (Owner) commented

Would be sweet! I tried it initially but had some technical issues around caching, plus the initial load time was pretty rough.

Would love to revisit now though, especially since stuff like Gemma is making waves.

@pacoccino (Author) commented

I've done a small experiment creating a Chrome extension that hosts and runs local models: https://github.com/pacoccino/ai-mask

What issues did you face with web-llm? I'll try forking your app and making it work with the extension.

@jacoblee93 (Owner) commented

Oh neat!! Had intended to try the same actually. Will check it out.

It was ~8 months ago now, but it had to do with a lack of caching/docs on how to set it up. Every time I refreshed the app it would redownload the model until my computer ran out of memory.

@pacoccino (Author) commented

I've drafted a PR just to show it works: #16

It works great, though I ran into some technical issues with the web worker. Caching works, and the extension stores the models once for any app that needs them!

AI-Mask is an experiment; I'd like to know what you think about it and whether it would be interesting to push it forward 🤔

@jacoblee93 (Owner) commented

Added separately! Thank you for the issue!

Re: AI-Mask - I have been meaning to try building something similar myself for a long time now, and I think WebLLM is getting good enough to be useful.

My thought would be to basically expose the equivalent of a LangServe endpoint in the Chrome extension:

https://github.com/langchain-ai/langserve

So then a web developer could use a remote runnable to build chains with the familiar invoke/batch/stream/streamLog API in LangChain.js:

https://js.langchain.com/docs/ecosystem/langserve
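
A rough sketch of what that could look like from the web developer's side (the endpoint URL here is hypothetical; it would be whatever the extension serves):

```typescript
import { RemoteRunnable } from "@langchain/core/runnables/remote";

// Hypothetical URL: whatever LangServe-compatible endpoint
// the extension exposes for the local model.
const model = new RemoteRunnable({
  url: "http://localhost:8000/chat/",
});

// The familiar Runnable API then just works against the local model.
const answer = await model.invoke("What is WebGPU?");
console.log(answer);

for await (const chunk of model.stream("Summarize Web-LLM in one sentence.")) {
  console.log(chunk);
}
```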

I don't expect you to add this to AI-mask but would definitely encourage you to keep it up! I think there's really something there.

@pacoccino (Author) commented Mar 22, 2024

I've opened a new PR #19 with better support for AI-Mask.

@jacoblee93 About your idea of exposing an equivalent of a LangServe endpoint from a Chrome extension, could you elaborate? What would be the difference for a web dev between using a RemoteRunnable and a lib like the ChatAIMask I'm using now?

@jacoblee93 (Owner) commented

Ah cool! Will try to take a look this weekend.

In general, it'd allow for some common operations we've seen folks want when building with LangChain. But actually, now that I think about it, the only important one beyond simple .invoke is .stream, since this would just make the model available!

By implementing the web client to the extension (like I think you've done with ChatAIMask?) as a runnable, you'd basically be able to swap out e.g. ChatOpenAI for ChatAIMask and build things like this completely locally:

https://js.langchain.com/docs/use_cases/question_answering/
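
For example, something like this (the ChatAIMask import path and constructor params are hypothetical; it assumes ChatAIMask implements LangChain's chat model interface):

```typescript
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { StringOutputParser } from "@langchain/core/output_parsers";
// Hypothetical import path; depends on how the ai-mask client is packaged.
import { ChatAIMask } from "@ai-mask/sdk";

const prompt = ChatPromptTemplate.fromTemplate(
  "Answer using only this context:\n{context}\n\nQuestion: {question}"
);

// Drop-in replacement for e.g. ChatOpenAI, but running fully locally
// through the extension. Constructor params are made up for illustration.
const model = new ChatAIMask({ modelId: "Mistral-7B-Instruct" });

const chain = prompt.pipe(model).pipe(new StringOutputParser());

// Streaming comes for free through the same Runnable API.
for await (const chunk of chain.stream({
  context: "AI-Mask is a Chrome extension that hosts local models.",
  question: "What is AI-Mask?",
})) {
  console.log(chunk);
}
```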

But yeah I think you are right in that the full LangServe suite is not necessary. Looking forward to digging in!
