Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added embedding endpoint and tests #36

Merged
merged 1 commit into from
Feb 3, 2023
Merged

Conversation

metjuperry
Copy link
Contributor

Added embedding endpoint and tests.

@gotmike

@gotmike
Copy link
Contributor

gotmike commented Feb 1, 2023

thx for the quick response. i'll be testing today...

@gotmike
Copy link
Contributor

gotmike commented Feb 1, 2023

some notes:

  • in the original repo, the namespace is not changed for folders. it uses the same namespace everywhere. i think that we should prob update to use namespaces that reflect the folder structure, so (for instance) classes inside the Completions folder would be in the OpenAI_API.Completions namespace rather than just the OpenAI_API namespace. this PR already does this for Embeddings and i think it should be replicated elsewhere.
  • instead of using string arrays to handle json arrays, prior code style was to use lists. prob better to stick with lists.
  • this pr includes text-embedding-ada-002 which is a 2nd gen model. not sure if we need to add the text-similarity, text-search and code-search models as well? i tested using a couple of these models and got valid results (albeit with diff float lengths), but they will need added to the Models class

passed all tests, i think this is good to go. i have some updates that go after this to fix a few minor issues.

@gotmike gotmike assigned gotmike and OkGoDoIt and unassigned gotmike Feb 1, 2023
@gotmike gotmike requested a review from OkGoDoIt February 1, 2023 16:35
@metjuperry
Copy link
Contributor Author

some notes:

  • in the original repo, the namespace is not changed for folders. it uses the same namespace everywhere. i think that we should prob update to use namespaces that reflect the folder structure, so (for instance) classes inside the Completions folder would be in the OpenAI_API.Completions namespace rather than just the OpenAI_API namespace. this PR already does this for Embeddings and i think it should be replicated elsewhere.
  • instead of using string arrays to handle json arrays, prior code style was to use lists. prob better to stick with lists.
  • this pr includes text-embedding-ada-002 which is a 2nd gen model. not sure if we need to add the text-similarity, text-search and code-search models as well? i tested using a couple of these models and got valid results (albeit with diff float lengths), but they will need added to the Models class

passed all tests, i think this is good to go. i have some updates that go after this to fix a few minor issues.

Thanks for the input! I have updated the array to list to keep the convention.

Regarding the model: According to this blog post the embedding V2 model should be used for majority of embedding use cases and is the one to use (It's supposed to be better AND cheaper, so I picked it as the main one). The previous models should still work as v1, but at the time writing this,I believe text-embedding-ada-002 is the only V2 model available.

@gotmike
Copy link
Contributor

gotmike commented Feb 1, 2023

ok makes sense. i didn't add them, i read through that post also and since it was all released at the same time (jan 25, 2023) wasn't entirely clear to me. not sure if we should add just for completeness or leave as-is with the v2. either way the one you used works well, i didn't actually test a "comparison" of two vectors though. that would be nice to have as part of the tests eventually.

@gotmike
Copy link
Contributor

gotmike commented Feb 1, 2023

after this is merged, i have a few minor tweaks, but will wait for merge first.

@OkGoDoIt
Copy link
Owner

OkGoDoIt commented Feb 3, 2023

Thanks @metjuperry for your help adding the embedding endpoint! I made some changes to align it with my updated endpoint and result inheritance style. Somehow it's now a new PR at #38 , I'm not sure why my additional commits didn't get added to this one.

@OkGoDoIt OkGoDoIt merged commit 030891a into OkGoDoIt:master Feb 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants