v1.0 drops embeddings_util.py breaking semantic text search #676
Comments
Thanks for calling this out @mrbullwinkle. We're working on updating the cookbook repository to include the functions provided in embeddings_utils.py directly, so that you can copy them into your own project. This is a better approach than the current embeddings_utils: you can include just the dependencies for the function you want, whereas with the current approach you have to install dependencies you'll never use. |
@RobertCraigie, makes sense. Thank you for the super fast response! |
@mrbullwinkle
import numpy as np

# Assumes `client` is an already-configured OpenAI (or AzureOpenAI) v1 client.

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def get_embedding(text, model="text-embedding-ada-002"):  # model = "deployment_name"
    return client.embeddings.create(input=[text], model=model).data[0].embedding

def search_docs(df, user_query, top_n=4, to_print=True):
    embedding = get_embedding(
        user_query,
        model="text-embedding-ada-002",  # model should be set to the deployment name you chose when you deployed the text-embedding-ada-002 (Version 2) model
    )
    # df.ada_v2 holds the precomputed embedding for each row.
    df["similarities"] = df.ada_v2.apply(lambda x: cosine_similarity(x, embedding))
    res = df.sort_values("similarities", ascending=False).head(top_n)
    if to_print:
        display(res)  # display() is provided by Jupyter/IPython
    return res

res = search_docs(df_bills, "Can I get information on cable company tax revenue?", top_n=4) |
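For anyone who wants to try the ranking step without an API call, here is a minimal offline sketch. It assumes the embeddings are already stored in a DataFrame column and that the query embedding has already been computed. The names `rank_by_similarity`, `toy_df`, `toy_query` and the `embedding` column are hypothetical, and the 3-dimensional vectors are toy stand-ins for real embeddings.

```python
import numpy as np
import pandas as pd


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def rank_by_similarity(df, query_embedding, column="embedding", top_n=4):
    """Return the top_n rows most similar to query_embedding.

    Works on a copy so the caller's DataFrame is not modified.
    """
    scored = df.copy()
    scored["similarities"] = scored[column].apply(
        lambda x: cosine_similarity(x, query_embedding)
    )
    return scored.sort_values("similarities", ascending=False).head(top_n)


# Toy data: hand-written vectors, not real model output.
toy_df = pd.DataFrame(
    {
        "text": ["cable tax bill", "school funding bill", "road repair bill"],
        "embedding": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.6, 0.8, 0.0]],
    }
)
toy_query = [1.0, 0.1, 0.0]

top = rank_by_similarity(toy_df, toy_query, top_n=2)
print(top[["text", "similarities"]])
```

Unlike `search_docs` above, this version scores a copy of the DataFrame, so repeated queries do not overwrite a `similarities` column on the original data.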
@CristianPQ you beat me to adding a ref to that code sample to this issue after I added it earlier today. I am the author of that code sample/article. |
Very unprofessional that such functions just get removed. It is a moving API one cannot rely on, very annoying. |
@RobertCraigie, has support been added in the cookbook repo for the functions provided in embeddings_utils.py? |
@logankilpatrick , can you help here? |
Hi folks - any updates on the cookbook yet? |
Still can't see the updates on the cookbook |
Hey folks, we migrated these over to the cookbook's own utils folder ~3 months ago: https://github.com/openai/openai-cookbook/blob/main/examples/utils/embeddings_utils.py. If you find any notebooks that are out of sync and not using the built-in utils, please open an issue on the cookbook repo. |
Why the frick would you guys remove this. This API is god awful. |
FWIW, if you're using embeddings from the OpenAI API, you can get cosine similarity with just |
Also BTW there is still a reference to |
Describe the bug
The previous version of the OpenAI Python library contained embeddings_utils.py, which provided functions like cosine_similarity that are used for semantic text search with embeddings. Without this functionality, existing code, including OpenAI's cookbook example (https://cookbook.openai.com/examples/semantic_text_search_using_embeddings), will fail due to this dependency.
Are there plans to add this support back in, or should we just create our own cosine_similarity function based on the one that was present in embeddings_utils:
To Reproduce
Cookbook example cannot be converted to use v1.0 without removing the dependency on embeddings_utils.py: https://cookbook.openai.com/examples/semantic_text_search_using_embeddings
Code snippets
OS
Windows
Python version
Python v3.10.11
Library version
openai-python==1.0.0rc2