Document and re-define naming conventions in v3 #1123

Open
dlqqq opened this issue Nov 27, 2024 · 0 comments
Labels: documentation (Improvements or additions to documentation), enhancement (New feature or request)
dlqqq commented Nov 27, 2024

Problem

It's unclear how local variables, functions, and classes should be named, because we lack established, documented naming conventions. Some of the existing conventions were poorly chosen and make existing code difficult to read.

Although this issue may seem trivial, I believe that good, well-documented naming conventions can save days of effort for contributors when measured across years of development.

This issue serves two purposes:

  1. To track progress on adding contributor documentation regarding naming conventions in v3.
  2. To track proposals for new naming conventions in v3.

Contributors are absolutely welcome to offer feedback and contribute suggestions! Please leave them as comments here.

Proposed name changes

New term: chat model

In v2, "language model" generally referred to the model used in the chat. With the introduction of completion models, we need to reconsider the name "language model", as it's ambiguous whether the term refers to the LLM used in chat or the LLM used in completions.

For v3, we should prefer "chat model" as much as possible for the sake of clarity.

New terms: model IDs and model UIDs

In v2, "model ID" ambiguously refers to either the value used by Jupyter AI (e.g. openai-chat:gpt-4o) or the argument accepted by a provider class (e.g. gpt-4o). To distinguish the two, we previously referred to the former as global model IDs (abbreviated as gmid or gid) and the latter as local model IDs (abbreviated as lid or lmid).

  • Furthermore, in v2, variables were named lm_id, lm_lid, or lm_gid to indicate that a model ID referred to a language model. The same pattern was used for embedding models (em_id, em_gid, em_lid) and completion models (cm_id, cm_gid, cm_lid).

I have found this very confusing (even though I set these conventions). The v2 definitions produce 9 different ways to label model IDs.

For v3, I propose new definitions to eliminate ambiguity in the term "model ID":

  • Model ID: the argument that identifies a model to a provider (e.g. gpt-4o).
  • Model UID (universal ID): the argument that identifies a model to Jupyter AI (e.g. openai-chat:gpt-4o).
  • The definition of provider ID remains unchanged.
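
To illustrate how the three terms relate, here is a minimal sketch. It assumes a model UID has the form `<provider_id>:<model_id>`, which is inferred only from the example openai-chat:gpt-4o above. The names `ModelUID`, `parse_model_uid`, and `build_model_uid` are hypothetical and are not part of Jupyter AI.

```python
from typing import NamedTuple


class ModelUID(NamedTuple):
    """Hypothetical container for the two parts of a model UID."""

    provider_id: str  # e.g. "openai-chat"
    model_id: str     # e.g. "gpt-4o" (the argument passed to the provider)


def parse_model_uid(model_uid: str) -> ModelUID:
    """Split a model UID into its provider ID and model ID.

    Assumption: the UID format is "<provider_id>:<model_id>". Only the
    first colon is treated as the separator, so a model ID that itself
    contains a colon is preserved intact.
    """
    provider_id, sep, model_id = model_uid.partition(":")
    if not sep or not provider_id or not model_id:
        raise ValueError(f"Not a valid model UID: {model_uid!r}")
    return ModelUID(provider_id, model_id)


def build_model_uid(provider_id: str, model_id: str) -> str:
    """Join a provider ID and a model ID into a model UID."""
    return f"{provider_id}:{model_id}"


chat_model_uid = "openai-chat:gpt-4o"
provider_id, chat_model_id = parse_model_uid(chat_model_uid)
```

With these names, a variable ending in `_uid` always holds the full Jupyter AI identifier, and a variable ending in `_id` always holds the provider-level argument.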

Local variables should be renamed accordingly:

  • lm_gid => chat_model_uid
  • lm_lid => chat_model_id
  • em_gid => embedding_model_uid
  • em_lid => embedding_model_id
  • etc.
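
As a rough aid for applying the renames listed above, the following sketch rewrites whole identifiers in a string of source code. The `RENAMES` table contains only the four pairs listed in this issue, and `apply_renames` is a hypothetical helper, not an existing tool in the repository. It is a plain text substitution, so it would also touch matching words in comments and strings.

```python
import re

# Only the four renames spelled out in this issue; the remaining
# ones ("etc.") are intentionally not guessed here.
RENAMES = {
    "lm_gid": "chat_model_uid",
    "lm_lid": "chat_model_id",
    "em_gid": "embedding_model_uid",
    "em_lid": "embedding_model_id",
}

# \b ensures only whole identifiers match. Underscores count as word
# characters, so names like "old_lm_gid" are left untouched.
_PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, RENAMES)) + r")\b")


def apply_renames(source: str) -> str:
    """Replace each old variable name in `source` with its v3 name."""
    return _PATTERN.sub(lambda match: RENAMES[match.group(1)], source)


before = "lm_gid = f'{provider_id}:{lm_lid}'"
after = apply_renames(before)
```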
@dlqqq dlqqq added documentation Improvements or additions to documentation enhancement New feature or request labels Nov 27, 2024
@dlqqq dlqqq added this to the v3.0.0 milestone Nov 27, 2024