RWKV-LM with ggml-js #1

Open
ansarizafar opened this issue Apr 28, 2023 · 13 comments

Comments

@ansarizafar

Is it possible to use RWKV-LM (https://github.com/BlinkDL/RWKV-LM) with ggml-js and Bun (https://bun.sh/)? Is there an example available?

@cztomsik
Owner

Not yet. The problem is that there are no tokenizers in the JS ecosystem yet. There are Node.js bindings for the Hugging Face tokenizers, but I was not able to install them, so I don't know.

I hope I won't need to re-implement another BPE tokenizer just to run the example, but I definitely want to run RWKV from Node. And if it runs in Node, it should also run in Bun, because it's just an N-API wrapper plus a little bit of plain JS (no build step, etc.).

@cztomsik
Owner

BTW: the example is here https://github.com/cztomsik/ggml-js/blob/main/examples/rwkv.js
I'm not sure if it actually works correctly, but I will figure it out eventually :)

@cztomsik
Owner

cztomsik commented Apr 28, 2023

Updated the example; it can now generate some text.

Screen.Recording.2023-04-28.at.14.24.30.mov

@ansarizafar
Author

The example code is too difficult to understand for developers new to ML and LLMs. Is it possible to create an abstraction/lib to make it easier to use?

@cztomsik
Owner

cztomsik commented Apr 29, 2023

Yeah, it's just a PoC :) Do you have any specific API in mind? What are you trying to do?

BTW: Bun does not work currently. It seems that napi_set_instance_data() is not implemented/exported in Bun yet.
oven-sh/bun#158 (comment)
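
Until that is resolved, a wrapper package could detect the runtime and fail early with a readable message. The following is a minimal sketch, not part of ggml-js: the function names `isBun` and `assertSupportedRuntime` are hypothetical, and it assumes the runtime reports itself through `process.versions.bun`, which Bun sets and Node.js leaves undefined.

```javascript
// Bun exposes its version as process.versions.bun; Node.js leaves it undefined.
// `versions` is injectable so the check can be exercised without running Bun.
function isBun(versions = process.versions) {
  return typeof versions.bun === 'string';
}

// Hypothetical guard (not part of ggml-js): fail early with a readable message
// instead of crashing while the native addon is being loaded.
function assertSupportedRuntime(versions = process.versions) {
  if (isBun(versions)) {
    throw new Error(
      `Bun ${versions.bun} detected: the native addon needs napi_set_instance_data(), ` +
      'which this Bun build may not provide. Try Node.js instead.'
    );
  }
}
```

Calling `assertSupportedRuntime()` before loading the addon turns a low-level loader failure into an actionable error.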

@ansarizafar
Author

I want to use RWKV with langchainjs (https://js.langchain.com/docs/), so an adapter for langchainjs would be great.

@cztomsik
Owner

cztomsik commented Apr 30, 2023

I see, but that's unlikely, at least not in the short term. I can move the RWKV model from examples to the main package, but a lot of functionality is still missing before it is practically useful (top_k/top_p sampling, repetition penalty, fixing mmap vs. no_alloc, async, etc.), and I would definitely rather focus on that.

Sorry. But you should be able to easily create your own package and use this lib as a dependency.
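
For readers unfamiliar with the top_k/top_p sampling mentioned above, here is a rough sketch of the idea. It is a generic illustration, not ggml-js code: the name `sampleTopKTopP`, the option names, and the default values are assumptions, and `logits` is assumed to be a plain array of numbers with one entry per vocabulary token.

```javascript
// Numerically stable softmax with temperature.
function softmax(logits, temperature = 1.0) {
  let max = -Infinity;
  for (const l of logits) if (l > max) max = l;
  const exps = logits.map((l) => Math.exp((l - max) / temperature));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Keep at most `topK` of the most likely tokens, cut that list further once
// its cumulative probability reaches `topP`, then sample from what is left.
// `random` is injectable so the function can be tested deterministically.
function sampleTopKTopP(logits, { topK = 40, topP = 0.9, temperature = 1.0, random = Math.random } = {}) {
  const probs = softmax(logits, temperature);
  // Token ids sorted by descending probability.
  const ids = probs.map((_, i) => i).sort((a, b) => probs[b] - probs[a]);

  const kept = [];
  let cumulative = 0;
  for (const id of ids.slice(0, topK)) {
    kept.push(id);
    cumulative += probs[id];
    if (cumulative >= topP) break;
  }

  // Sample proportionally to the kept probabilities.
  let r = random() * cumulative;
  for (const id of kept) {
    r -= probs[id];
    if (r <= 0) return id;
  }
  return kept[kept.length - 1];
}
```

With `topK: 1` this reduces to greedy (argmax) decoding, while larger `topK` and `topP` values allow more varied output.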

@BlinkDL

BlinkDL commented Apr 30, 2023

JS tokenizer here https://github.com/josephrocca/rwkv-v4-web

@cztomsik
Owner

cztomsik commented May 2, 2023

@BlinkDL thanks, but I couldn't get it working. I did a quick-and-dirty implementation here instead, and it seems to work. 🤷
https://github.com/cztomsik/ggml-js/blob/main/lib/tokenizer.js#L63
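
To illustrate what a BPE tokenizer does at its core, here is a minimal sketch of the merge loop. It is not the code in lib/tokenizer.js: the helpers `buildRanks` and `bpe` and the tiny merge table in the usage note are hypothetical, and a real tokenizer also needs pre-tokenization, byte-level handling, and a vocabulary that maps the resulting pieces to token ids.

```javascript
// Separator that cannot appear inside the toy tokens used here.
const SEP = '\u0000';

// Turn an ordered list of [left, right] merges into a lookup table where
// a lower rank means the merge was learned earlier and is applied first.
function buildRanks(merges) {
  const ranks = new Map();
  merges.forEach(([left, right], i) => ranks.set(left + SEP + right, i));
  return ranks;
}

// Split a word into characters, then repeatedly merge the adjacent pair
// with the lowest rank until no known pair is left.
function bpe(word, ranks) {
  const parts = Array.from(word);
  while (parts.length > 1) {
    let best = -1;
    let bestRank = Infinity;
    for (let i = 0; i < parts.length - 1; i++) {
      const rank = ranks.get(parts[i] + SEP + parts[i + 1]);
      if (rank !== undefined && rank < bestRank) {
        best = i;
        bestRank = rank;
      }
    }
    if (best === -1) break; // nothing left to merge
    parts.splice(best, 2, parts[best] + parts[best + 1]);
  }
  return parts;
}
```

For example, with the merges `[['l', 'o'], ['lo', 'w'], ['e', 'r']]`, calling `bpe('lower', buildRanks(merges))` first joins `l` and `o`, then `lo` and `w`, then `e` and `r`, leaving the pieces `low` and `er`.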

@BlinkDL

BlinkDL commented May 2, 2023

@cztomsik
Owner

cztomsik commented May 3, 2023

It's broken, thanks 😆

@cztomsik
Owner

cztomsik commented May 3, 2023

Tokenizer fixed, mmap/no_alloc fixed too (it can now load the 1B Raven model without having to copy the weights first).

image

Next up are sampling, stopping at the end token, and async (I'm not really sure how that will map to ggml).
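
One possible shape for a generation loop that stops at the end token and stays async-friendly is an async generator. This is only a sketch under assumptions, not the ggml-js API: `step` is a hypothetical function that feeds one token to the model (which keeps its recurrent state internally) and resolves to the logits for the next token, and the value of `endToken` depends on the tokenizer in use.

```javascript
// Index of the largest value (greedy decoding).
function argmax(xs) {
  let best = 0;
  for (let i = 1; i < xs.length; i++) if (xs[i] > xs[best]) best = i;
  return best;
}

// Feed the prompt token by token, then keep picking the next token until
// the end token is produced or `maxTokens` tokens have been generated.
// `step` and `pick` are hypothetical hooks; `pick` could be any sampler.
async function* generate(step, promptTokens, { endToken, maxTokens = 100, pick = argmax } = {}) {
  if (promptTokens.length === 0) throw new Error('prompt must contain at least one token');

  let logits;
  for (const token of promptTokens) logits = await step(token);

  for (let i = 0; i < maxTokens; i++) {
    const next = pick(logits);
    if (next === endToken) return; // stop without yielding the end token
    yield next;
    logits = await step(next);
  }
}
```

A caller can consume it with `for await (const token of generate(step, prompt, { endToken })) { ... }`, which lets tokens be streamed to the user as they are produced.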

@cztomsik
Owner

cztomsik commented May 4, 2023

The RWKV example now works with f16 matrices. Run `python rwkv-convert.py <model> --mtype f16` to generate a smaller file.

For example, this is the 3B Raven model.

Screen.Recording.2023-05-04.at.12.36.16.mov

Q4 and Q8 should work too, but they are not supported in the conversion script yet.

As you can see, the bigger problem right now is the sampling.
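
As background on what 8-bit quantization means here, the sketch below shows block-wise quantization with one scale per block. It is a simplified illustration, not the conversion script and not ggml's actual storage layout: the names `quantizeQ8` and `dequantizeQ8` are hypothetical and the block size of 32 is an assumption.

```javascript
// Quantize values in blocks: each block stores one float scale plus one
// signed 8-bit integer per value, so that value ≈ q * scale.
function quantizeQ8(values, blockSize = 32) {
  const blocks = [];
  for (let start = 0; start < values.length; start += blockSize) {
    const chunk = values.slice(start, start + blockSize);
    let amax = 0; // largest absolute value in the block
    for (const v of chunk) amax = Math.max(amax, Math.abs(v));
    const scale = amax / 127;
    const q = new Int8Array(chunk.length);
    for (let i = 0; i < chunk.length; i++) {
      q[i] = scale === 0 ? 0 : Math.round(chunk[i] / scale);
    }
    blocks.push({ scale, q });
  }
  return blocks;
}

// Reverse the mapping; the result differs from the input by at most
// half a quantization step (scale / 2) per value.
function dequantizeQ8(blocks) {
  const out = [];
  for (const { scale, q } of blocks) {
    for (const v of q) out.push(v * scale);
  }
  return out;
}
```

Each value then takes one byte plus a small per-block overhead for the scale, compared with two bytes for f16, at the cost of a bounded rounding error.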

3 participants