
Harrison/llamacpp #5402

Merged
merged 2 commits on May 29, 2023
Commits on May 29, 2023

  1. Add llama.cpp get_num_tokens support (#5352)

    # Adds support for counting tokens using the llama.cpp Python interface
    rather than the default Hugging Face transformers library
    
    The current implementation of the `LlamaCpp` LLM falls back to the base
    `LLM` class for token counting, which requires loading the Hugging Face
    transformers library.
    
    The llama.cpp Python interface provides a method for tokenizing a given
    string. This PR overrides the `get_num_tokens` method of the base class
    to use that method instead.
    
    Using the native tokenizer should yield more accurate token counts,
    since the counts reflect the tokenizer of the loaded model.
    
    For llama.cpp workflows this PR also reduces dependencies, since the
    transformers library no longer needs to be installed just for token
    counting.
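    
    For reference, here is a minimal sketch of native token counting with the
    llama-cpp-python bindings. The model path is a placeholder and the
    standalone helper is illustrative; in the PR this logic lives in the
    `get_num_tokens` override on `LlamaCpp`.
    
    ```python
    from llama_cpp import Llama
    
    # Load a local model; the path stands in for whatever weights file
    # the LlamaCpp LLM was configured with.
    client = Llama(model_path="./models/llama-7b.bin")
    
    def get_num_tokens(text: str) -> int:
        # Llama.tokenize expects bytes and returns a list of token ids,
        # so the token count is simply the length of that list.
        return len(client.tokenize(text.encode("utf-8")))
    
    print(get_num_tokens("Counting tokens with the native llama.cpp tokenizer"))
    ```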
    
    ## Before submitting
    
    Wasn't sure how to set up a test for this without spinning up a
    particular model, but I have tested it in a project.
    
    ## Who can review?
    
    Community members can review the PR once tests pass. Tag
    maintainers/contributors who might be interested:
    @hwchase17
    @agola11
    
    s7726 authored May 29, 2023 · commit 655007e
  2. cr

    hwchase17 committed May 29, 2023 · commit aeae8fe