feat: added tokenizer params to the listing #47

Merged
merged 15 commits into development on Nov 27, 2023

Conversation

adubovik
Contributor

@adubovik adubovik commented Nov 22, 2023

Added tokenization-related params to the listing as part of #54

DIAL core config:

  • features.tokenizeEndpoint (URL, optional) - the endpoint that tokenizes a prompt
  • features.truncatePromptEndpoint (URL, optional) - the endpoint that serves the context-trimming API call. It trims the chat history so that it fits the number of tokens specified in the request field max_prompt_tokens.
  • tokenizerModel (string, optional) - the reference model whose tokenization algorithm matches that of the given model. It allows a user to perform tokenization on their side, which is possible for models whose tokenization algorithm is publicly known and implemented in an SDK (e.g. GPT and Anthropic). Models fall into families that share the same tokenization algorithm; tokenizerModel points to a representative model from the corresponding family. As a rule of thumb, choose the oldest representative from a family. See the families below.
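For illustration, a hypothetical core config fragment with the three fields above might look like this (the nesting under `models` and the adapter URLs are assumptions; only the field names come from the description):

```yaml
models:
  gpt-4:
    features:
      # assumed adapter URLs, shown for shape only
      tokenizeEndpoint: "http://adapter/openai/deployments/gpt-4/tokenize"
      truncatePromptEndpoint: "http://adapter/openai/deployments/gpt-4/truncate_prompt"
    # oldest representative of the model's tokenization family (see below)
    tokenizerModel: "gpt-4-0314"
```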

Listing:

  • features.tokenize (boolean, optional) - true means that the core exposes the <server host>/v1/deployments/<deployment name>/tokenize endpoint
  • features.truncate_prompt (boolean, optional) - true means that the core exposes the <server host>/v1/deployments/<deployment name>/truncate_prompt endpoint
  • tokenizer_model (string, optional) - the same value as in the core config
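A client can use these flags to decide whether a per-deployment endpoint exists before calling it. The helper below is a minimal sketch (not part of DIAL core); the URL layout follows the listing description above:

```python
# Sketch: resolve a per-deployment feature endpoint from a listing entry.
# `endpoint_for` is a hypothetical helper; the URL shape
# <host>/v1/deployments/<deployment>/<feature> follows the listing fields above.
from typing import Optional


def endpoint_for(listing: dict, host: str, deployment: str, feature: str) -> Optional[str]:
    """Return the endpoint URL if the listing advertises the feature, else None."""
    if listing.get("features", {}).get(feature):
        return f"{host}/v1/deployments/{deployment}/{feature}"
    return None


listing = {
    "features": {"tokenize": True, "truncate_prompt": False},
    "tokenizer_model": "gpt-3.5-turbo-0613",
}

print(endpoint_for(listing, "https://dial.example", "gpt-4", "tokenize"))
# -> https://dial.example/v1/deployments/gpt-4/tokenize
print(endpoint_for(listing, "https://dial.example", "gpt-4", "truncate_prompt"))
# -> None
```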
Model families
```yaml
_tokenization_families:
  - gpt:
    - family_1:
      - gpt-3.5-turbo-0613
      - gpt-3.5-turbo-16k-0613
      - gpt-4-0314
      - gpt-4-32k-0314
      - gpt-4-0613
      - gpt-4-32k-0613
      # tokens_per_message = 3, tokens_per_name = 1
    - family_2:
      - gpt-3.5-turbo-0301
      # tokens_per_message = 4, tokens_per_name = -1
    - family_3:
      - text-embedding-ada-002
      # just a string
  - gpt_encoding: cl100k_base
  - gpt_refs:
    - https://github.com/openai/openai-cookbook/blob/main/examples/How_to_count_tokens_with_tiktoken.ipynb
    - https://platform.openai.com/docs/models
    - https://tiktokenizer.vercel.app/
  - palm:
    - family_1:
      - chat-bison@001
      - codechat-bison@001
      - textembedding-gecko@001
  - anthropic:
    - family_1:
      - anthropic.claude-instant-v1
      - anthropic.claude-v1
      - anthropic.claude-v2
  - anthropic_refs:
    - https://github.com/anthropics/anthropic-sdk-python/blob/main/src/anthropic/_tokenizers.py
```
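The `tokens_per_message` / `tokens_per_name` comments are what distinguish the GPT families: chat token counting adds a fixed per-message and per-name overhead on top of the encoded text, per the OpenAI cookbook linked above. A sketch of that formula, with the encoder injected (a real client would use tiktoken's cl100k_base; the whitespace splitter below is only a deterministic stand-in):

```python
# Sketch of the cookbook's chat token-counting formula. The encoder is a
# parameter; `fake_encode` below is a stand-in, NOT a real tokenizer.
def count_chat_tokens(messages, encode, tokens_per_message=3, tokens_per_name=1):
    """Count prompt tokens for a list of {"role", "content", ["name"]} messages."""
    total = 0
    for message in messages:
        total += tokens_per_message
        for key, value in message.items():
            total += len(encode(value))
            if key == "name":
                total += tokens_per_name
    return total + 3  # every reply is primed with <|start|>assistant<|message|>


fake_encode = lambda s: s.split()  # stand-in for a real BPE encoder

messages = [{"role": "user", "content": "hello there"}]
# tokens_per_message(3) + encode("user")(1) + encode("hello there")(2) + 3 = 9
print(count_chat_tokens(messages, fake_encode))  # -> 9
```

family_1 models use the defaults above; family_2 (gpt-3.5-turbo-0301) would be called with `tokens_per_message=4, tokens_per_name=-1`.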

@adubovik adubovik self-assigned this Nov 22, 2023
@adubovik adubovik added the enhancement New feature or request label Nov 22, 2023
artsiomkorzun previously approved these changes Nov 22, 2023
@astsiapanay astsiapanay self-requested a review November 22, 2023 16:20
@adubovik adubovik linked an issue Nov 24, 2023 that may be closed by this pull request
@adubovik adubovik merged commit 5a07076 into development Nov 27, 2023
5 checks passed
@adubovik adubovik deleted the feat/add-tokenizer-param-to-listing branch November 27, 2023 17:23
Development

Successfully merging this pull request may close these issues.

Extend model listing API with limits (tokenize, rate endpoint)
3 participants