Skip to content

Commit

Permalink
Update README.md
Browse files Browse the repository at this point in the history
Signed-off-by: Shengyu Fu <[email protected]>
  • Loading branch information
shengyfu authored Apr 16, 2024
1 parent da36634 commit 2d96434
Showing 1 changed file with 5 additions and 5 deletions.
10 changes: 5 additions & 5 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,16 +1,16 @@
# Tokenizer

This repo contains C# and Typescript implementation of byte pair encoding(BPE) tokenizer for OpenAI LLMs, it's based on open sourced rust implementation in the [OpenAI tiktoken](https://github.com/openai/tiktoken). Both implementation are valuable to run prompt tokenization in .NET and Nodejs environment before feeding prompt into a LLM.
This repo contains Typescript and C# implementation of byte pair encoding(BPE) tokenizer for OpenAI LLMs, it's based on open sourced rust implementation in the [OpenAI tiktoken](https://github.com/openai/tiktoken). Both implementation are valuable to run prompt tokenization in Nodejs and .NET environment before feeding prompt into a LLM.

## Typescript implementation

Please follow [README](tokenizer_ts/README.md).

## C# implementation

> [!IMPORTANT]
> Users of `Microsoft.DeepDev.TokenizerLib` should migrate to `Microsoft.ML.Tokenizers`. The functionality in `Microsoft.DeepDev.TokenizerLib` has been added to [`Microsoft.ML.Tokenizers`](https://www.nuget.org/packages/Microsoft.ML.Tokenizers). `Microsoft.ML.Tokenizers` is a tokenizer library being developed by the .NET team and going forward, the central place for tokenizer development in .NET. By using `Microsoft.ML.Tokenizers`, you should see improved performance over existing tokenizer library implementations, including `Microsoft.DeepDev.TokenizerLib`. A stable release of `Microsoft.ML.Tokenizers` is expected alongside the .NET 9.0 release (November 2024). Instructions for migration can be found at https://github.com/dotnet/machinelearning/blob/main/docs/code/microsoft-ml-tokenizers-migration-guide.md.
## Typescript implementation

Please follow [README](tokenizer_ts/README.md).

## Contributing

We welcome contributions. Please follow [this guideline](CONTRIBUTING.md).
Expand Down

0 comments on commit 2d96434

Please sign in to comment.