
Can I use SentencepieceTokenizer in C#? #468

Closed · tylike opened this issue Jun 9, 2023 · 5 comments

tylike commented Jun 9, 2023

Hi!

I have an LLM in ONNX format and a sentencepiece.model, and I used Hugging Face and SentencePiece together in Python. Now I plan to do inference in C# with ONNX Runtime, but I haven't found a suitable C# version of the SentencePiece library. I saw that there is a SentencepieceTokenizer here. Can I use SentencepieceTokenizer in C#?

My files were downloaded from here: https://huggingface.co/K024/ChatGLM-6b-onnx-u8s8/tree/main/chatglm-6b-int8-onnx-merged. Thank you.

wenbingl (Member) commented

Yes, the NuGet package can be found here: https://www.nuget.org/packages/Microsoft.ML.OnnxRuntime.Extensions/0.8.0
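
For example, it can be added from the .NET CLI (version pinned to match the link above):

dotnet add package Microsoft.ML.OnnxRuntime.Extensions --version 0.8.0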

tylike (Author) commented Jul 18, 2023

I only saw the C# demo code for registering the extension.
In Python, when using the LLM for inference I first need the tokenizer to turn the user's input text into IDs before doing the subsequent processing.
But I didn't find a class in this extension library that wraps this tokenizer for C#; did I misunderstand it?
Can you give some examples?

For example, I defined it like this in Python:
from sentencepiece import SentencePieceProcessor
sp_model = SentencePieceProcessor(model_file=model_path)
ids = sp_model.encode(s)

wenbingl (Member) commented

@sayanshaw24, can you add the SPM tokenizer to our C# example?

nshoman commented Sep 7, 2023

Perhaps this issue can be closed? I was looking for something similar on the decoder end and was able to develop what I think @tylike wanted.

I should note beforehand that this was done using v0.8.0; the function build_my_graph was renamed at some point to build_graph.

Here's a working solution I came up with:

## building the model
from onnxruntime_extensions._ortapi2 import make_onnx_model
from onnxruntime_extensions._cuops import SingleOpGraph
import onnx

# wrap the sentencepiece model file in a single-op ONNX graph
# (in v0.8.0 this method was still called build_my_graph)
kwargs = {'model': open('path/to/model', 'rb').read()}
graph = SingleOpGraph.build_graph('SentencepieceTokenizer', **kwargs)
model = make_onnx_model(graph)
onnx.save_model(model, '/outputpath/model.onnx')

## inference
import onnxruntime as _ort
from onnxruntime_extensions import get_library_path as _lib_path
import numpy as np

so = _ort.SessionOptions()
so.register_custom_ops_library(_lib_path())
sess = _ort.InferenceSession('/outputpath/model.onnx', so)

alpha = 0
nbest_size = 0
flags = 0

inp_dict = {'inputs': np.array(['your text here']),
            'nbest_size': np.array([nbest_size], dtype=np.int64),
            'alpha': np.array([alpha], dtype=np.float32),
            'add_bos': np.array([flags & 1], dtype=np.bool_),
            'add_eos': np.array([flags & 2], dtype=np.bool_),
            'reverse': np.array([flags & 4], dtype=np.bool_)}

outs = sess.run(None, input_feed=inp_dict)
token_array = outs[0]

While this isn't C#, hopefully it illustrates how to perform inference with the ONNX tokenizer; it should be relatively straightforward to port from the Python code.

Just make sure to load the extensions library when performing inference in C#:

SessionOptions options = new SessionOptions();
options.RegisterOrtExtensions();
var session = new InferenceSession(model, options);
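
Putting the pieces together, here is a minimal C# sketch of the same tokenizer call (a sketch, not an official sample: it assumes the model exported above sits at "model.onnx", that its input names match the Python inp_dict, and that the first output is the int32 token-id tensor; adjust names and types to your model):

using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML.OnnxRuntime;
using Microsoft.ML.OnnxRuntime.Tensors;

var options = new SessionOptions();
options.RegisterOrtExtensions();
using var session = new InferenceSession("model.onnx", options);

// Same feed as the Python inp_dict above; every input is a rank-1 tensor.
var inputs = new List<NamedOnnxValue>
{
    NamedOnnxValue.CreateFromTensor("inputs",
        new DenseTensor<string>(new[] { "your text here" }, new[] { 1 })),
    NamedOnnxValue.CreateFromTensor("nbest_size",
        new DenseTensor<long>(new long[] { 0L }, new[] { 1 })),
    NamedOnnxValue.CreateFromTensor("alpha",
        new DenseTensor<float>(new[] { 0f }, new[] { 1 })),
    NamedOnnxValue.CreateFromTensor("add_bos",
        new DenseTensor<bool>(new[] { false }, new[] { 1 })),
    NamedOnnxValue.CreateFromTensor("add_eos",
        new DenseTensor<bool>(new[] { false }, new[] { 1 })),
    NamedOnnxValue.CreateFromTensor("reverse",
        new DenseTensor<bool>(new[] { false }, new[] { 1 })),
};

using var results = session.Run(inputs);
// First output is assumed to hold the token ids (int32); adjust if your model differs.
var ids = results.First().AsEnumerable<int>().ToArray();
Console.WriteLine(string.Join(", ", ids));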

This test helped me quite a bit:

ofunc = OrtPyFunction.from_customop('SentencepieceDecoder', model=open(fullname, 'rb').read())

GeorgeS2019 commented

@wenbingl

Where is the folder for the C# examples? I see a java folder under the repository root, but no trace of a CSharp folder.

tylike closed this as completed May 20, 2024