Handling OpenAI 429's gracefully #4153

Closed
pascalwhoop opened this issue Jul 25, 2024 · 0 comments

Expected Behavior (Mandatory)

Ability to control the backoff strategy for OpenAI calls when making a large volume of embedding requests. This is standard practice in almost every library I've used, because we cannot assume our API providers have infinite capacity.
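To make the request concrete, here is a minimal sketch of the kind of backoff I mean: exponential delays with jitter, preferring a server-supplied retry hint when there is one. It is plain Python, not APOC code, and every name in it (`RateLimitError`, `backoff_delay`, `call_with_backoff`, and the `base`, `cap` and `max_retries` parameters) is hypothetical and only there for illustration.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for an HTTP 429 response from the API."""

    def __init__(self, retry_after=None):
        super().__init__("HTTP 429: too many requests")
        self.retry_after = retry_after  # seconds, if the server sent a hint


def backoff_delay(attempt, base=1.0, cap=60.0):
    """Exponential delay for a 0-based attempt number, capped at `cap` seconds."""
    return min(cap, base * (2 ** attempt))


def call_with_backoff(fn, max_retries=5, base=1.0, cap=60.0,
                      sleep=time.sleep, rng=random.random):
    """Call fn(); on RateLimitError wait and retry, up to max_retries retries."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimitError as err:
            if attempt == max_retries:
                raise  # retries exhausted: surface the 429 to the caller
            if err.retry_after is not None:
                delay = err.retry_after  # prefer the server's hint when present
            else:
                # "full jitter": random wait between 0 and the exponential delay
                delay = rng() * backoff_delay(attempt, base, cap)
            sleep(delay)
```

The `sleep` and `rng` arguments are injectable only so the behaviour can be checked without actually waiting. The knobs I would want exposed are roughly `max_retries`, `base` and `cap`.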

Actual Behavior (Mandatory)

] version=71, last transaction in previous log=5140, rotation took 51 millis, started after 7843 millis."}
{"time":"2024-07-24 22:57:06.966+0000","level":"WARN","category":"o.n.k.a.p.GlobalProcedures","message":"Error during iterate.commit:"}
{"time":"2024-07-24 22:57:06.966+0000","level":"WARN","category":"o.n.k.a.p.GlobalProcedures","message":"1887 times: org.neo4j.graphdb.QueryExecutionException: Failed to invoke procedure `apoc.ml.openai.embedding`: Caused by: java.io.IOException: Server returned HTTP response code: 429 for URL: https://api.openai.com/v1/embeddings"}
{"time":"2024-07-24 22:57:06.966+0000","level":"WARN","category":"o.n.k.a.p.GlobalProcedures","message":"332 times: org.neo4j.graphdb.QueryExecutionException: Failed to invoke procedure `apoc.ml.openai.embedding`: Caused by: java.net.SocketTimeoutException: Connect timed out"}
{"time":"2024-07-24 22:57:06.966+0000","level":"WARN","category":"o.n.k.a.p.GlobalProcedures","message":"Error during iterate.execute:"}
{"time":"2024-07-24 22:57:06.966+0000","level":"WARN","category":"o.n.k.a.p.GlobalProcedures","message":"332 times: Connect timed out"}
{"time":"2024-07-24 22:57:06.966+0000","level":"WARN","category":"o.n.k.a.p.GlobalProcedures","message":"1887 times: Server returned HTTP response code: 429 for URL: https://api.openai.com/v1/embeddings"}

How to Reproduce the Problem

Try embedding 5M nodes, batched at 2000 nodes per API request (to maximise throughput), so that you hit the 429 for too many tokens per minute.
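For a sense of scale, the arithmetic behind this can be sketched as below. The node count and batch size come from this report. The tokens-per-minute limit and the average tokens per node name are made-up placeholder values, since the real numbers depend on the account and the data. The function names are hypothetical too.

```python
import math


def plan_batches(total_nodes, batch_size):
    """Number of API requests needed to cover every node once."""
    return math.ceil(total_nodes / batch_size)


def max_requests_per_minute(tpm_limit, avg_tokens_per_item, batch_size):
    """Requests per minute that fit under a tokens-per-minute limit."""
    tokens_per_request = avg_tokens_per_item * batch_size
    return tpm_limit // tokens_per_request


def min_minutes(total_nodes, batch_size, tpm_limit, avg_tokens_per_item):
    """Lower bound on wall-clock minutes if the limit is never exceeded."""
    batches = plan_batches(total_nodes, batch_size)
    rpm = max_requests_per_minute(tpm_limit, avg_tokens_per_item, batch_size)
    return math.ceil(batches / rpm)


# From this report: 5M nodes, 2000 nodes per request.
batches = plan_batches(5_000_000, 2000)

# ASSUMED placeholder values, not real account limits or measured data.
rpm = max_requests_per_minute(tpm_limit=1_000_000,
                              avg_tokens_per_item=5,
                              batch_size=2000)
minutes = min_minutes(5_000_000, 2000, 1_000_000, 5)
print(batches, rpm, minutes)
```

Under those placeholder numbers the job needs 2500 requests and cannot finish in under 25 minutes without exceeding the limit, so sending requests faster than that only produces 429s.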

Specifications (Mandatory)

CALL apoc.periodic.iterate(
    'MATCH (p:`Entity`) RETURN p', 
    'CALL apoc.ml.openai.embedding([item in $_batch | item.p.name], $apiKey, {endpoint: $endpoint, model: $model}) YIELD index, text, embedding CALL apoc.create.setProperty($_batch[index].p, $attribute, embedding) YIELD node RETURN node', 
    {`batchMode`: 'BATCH_SINGLE', `batchSize`: 2000, `concurrency`: 50, `parallel`: 'true', `params`: {`apiKey`: 'KEY', `attribute`: 'embedding', `endpoint`: 'https://api.openai.com/v1', `model`: 'text-embedding-3-small'}}
) YIELD batch, operations

Currently used versions

# pypher
python-cypher==0.20.1

helm chart
- name: neo4j 
  version: 5.20.0
  repository: https://neo4j.github.io/helm-charts/

Versions

  • OS: GKE
  • Neo4j: 5.20.0
  • Neo4j-Apoc: 5.20.0

RobertoSannino pushed a commit that referenced this issue Dec 11, 2024
* Fixes #4153: Handling OpenAI 429's gracefully

* cleanup

* fix tests
vga91 added a commit that referenced this issue Dec 11, 2024
* Fixes #4153: Handling OpenAI 429's gracefully

* cleanup

* fix tests
@vga91 vga91 closed this as completed Dec 11, 2024
@github-project-automation github-project-automation bot moved this from In Progress to Done (check if cherry-pick) in APOC Extended Larus Dec 11, 2024