fix(backend): Fix conn_retry decorator possible incorrect behaviour on failed async function #8836

Merged
merged 3 commits into from
Nov 29, 2024

Conversation

majdyz

@majdyz majdyz commented Nov 28, 2024

This fix was prompted by an error observed when the DB connection to Supabase failed:

2024-11-28 07:45:24,724 INFO  [DatabaseManager] Starting...
2024-11-28 07:45:24,726 INFO  [PID-18|DatabaseManager|Prisma-7f32369c-6432-4edb-8e71-ef820332b9e4] Acquiring connection started...
2024-11-28 07:45:24,726 INFO  [PID-18|DatabaseManager|Prisma-7f32369c-6432-4edb-8e71-ef820332b9e4] Acquiring connection completed successfully.
{"is_panic":false,"message":"Can't reach database server at `...pooler.supabase.com:5432`\n\nPlease make sure your database server is running at `....pooler.supabase.com:5432`.","meta":{"database_host":"...pooler.supabase.com","database_port":5432},"error_code":"P1001"}
2024-11-28 07:45:35,153 INFO  [PID-18|DatabaseManager|Prisma-7f32369c-6432-4edb-8e71-ef820332b9e4] Acquiring connection failed: Could not connect to the query engine. Retrying now...
2024-11-28 07:45:36,155 INFO  [PID-18|DatabaseManager|Redis-e14a33de-2d81-4536-b48b-a8aa4b1f4766] Acquiring connection started...
2024-11-28 07:45:36,181 INFO  [PID-18|DatabaseManager|Redis-e14a33de-2d81-4536-b48b-a8aa4b1f4766] Acquiring connection completed successfully.
2024-11-28 07:45:36,183 INFO  [PID-18|DatabaseManager|Pyro-2722cd29-4dbd-4cf9-882f-73842658599d] Starting Pyro Service started...
2024-11-28 07:45:36,189 INFO  [DatabaseManager] Connected to Pyro; URI = PYRO:[email protected]:8005
2024-11-28 07:46:28,241 ERROR  Error in get_user_integrations: All connection attempts failed

Even though the log line

2024-11-28 07:45:35,153 INFO  [PID-18|DatabaseManager|Prisma-7f32369c-6432-4edb-8e71-ef820332b9e4] Acquiring connection failed: Could not connect to the query engine. Retrying now...

is present, the Redis connection still proceeds without waiting for the retry to complete. This was likely caused by Tenacity not fully awaiting the DB connection acquisition call.

Changes 🏗️

  • Add special handling for async functions so that the function's result is explicitly awaited on each retry.
  • Explicitly raise an exception in db.connect() if the DB is still not connected after the prisma.connect() call.
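
The first change can be illustrated with a minimal, stdlib-only sketch. This is not the code from this PR (which is built on Tenacity); the decorator name `conn_retry_sketch`, its parameters, and `flaky_connect` are hypothetical. The point it shows is that the `await` sits inside the `try` block of the async wrapper, so a failure raised by the coroutine is seen by the retry loop instead of escaping it.

```python
import asyncio
import functools
import inspect
import time


def conn_retry_sketch(max_attempts=3, delay=0.0):
    """Hypothetical retry decorator that handles sync and async callables."""

    def decorator(func):
        if inspect.iscoroutinefunction(func):

            @functools.wraps(func)
            async def async_wrapper(*args, **kwargs):
                for attempt in range(1, max_attempts + 1):
                    try:
                        # Await inside the try block so that an exception
                        # raised by the coroutine triggers a retry.
                        return await func(*args, **kwargs)
                    except Exception:
                        if attempt == max_attempts:
                            raise
                        await asyncio.sleep(delay)

            return async_wrapper

        @functools.wraps(func)
        def sync_wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    if attempt == max_attempts:
                        raise
                    time.sleep(delay)

        return sync_wrapper

    return decorator


calls = {"n": 0}


@conn_retry_sketch(max_attempts=3)
async def flaky_connect():
    # Hypothetical connection call that fails twice, then succeeds.
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("not ready")
    return "connected"


result = asyncio.run(flaky_connect())
```

Because the caller awaits `flaky_connect()` and the wrapper awaits every attempt, code that runs after the call (for example, acquiring the next connection) cannot start until the retries have either succeeded or been exhausted.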

Checklist 📋

For code changes:

  • I have clearly listed my changes in the PR description
  • I have made a test plan
  • I have tested my changes according to the test plan:
    • ...
Example test plan
  • Create from scratch and execute an agent with at least 3 blocks
  • Import an agent from file upload, and confirm it executes correctly
  • Upload agent to marketplace
  • Import an agent from marketplace and confirm it executes correctly
  • Edit an agent from monitor, and confirm it executes correctly

For configuration changes:

  • .env.example is updated or already compatible with my changes
  • docker-compose.yml is updated or already compatible with my changes
  • I have included a list of my configuration changes in the PR description (under Changes)
Examples of configuration changes
  • Changing ports
  • Adding new services that need to communicate with each other
  • Secrets or environment variable changes
  • New or changed infrastructure, such as databases

@majdyz majdyz requested review from ntindle and aarushik93 November 28, 2024 10:06
@majdyz majdyz requested a review from a team as a code owner November 28, 2024 10:06
@github-actions github-actions bot added platform/backend AutoGPT Platform - Back end size/l labels Nov 28, 2024

netlify bot commented Nov 28, 2024

Deploy Preview for auto-gpt-docs-dev canceled.

Name Link
🔨 Latest commit 96c6492
🔍 Latest deploy log https://app.netlify.com/sites/auto-gpt-docs-dev/deploys/674986923ba27500081d14a0


PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
🧪 PR contains tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Error Handling
The new connection validation checks could potentially mask the original connection error. Consider preserving and logging the original exception details when raising the new ConnectionError.
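
One way to act on this suggestion is exception chaining with `raise ... from ...`, which keeps the original error attached as `__cause__`. The sketch below is an illustration under assumptions, not this PR's code: `FakeClient`, `connect_or_raise`, and the error messages are hypothetical stand-ins for the real database client and `db.connect()`.

```python
import asyncio


class FakeClient:
    """Hypothetical stand-in for a database client."""

    def __init__(self, fail):
        self.fail = fail
        self.connected = False

    async def connect(self):
        if self.fail:
            raise OSError("can't reach database server")
        # Otherwise return without connecting, to simulate a connect()
        # call that completes but leaves the client disconnected.

    def is_connected(self):
        return self.connected


async def connect_or_raise(client):
    try:
        await client.connect()
    except Exception as exc:
        # Chain the original exception so its details are not masked.
        raise ConnectionError("Failed to connect to the database") from exc
    if not client.is_connected():
        raise ConnectionError("Client is not connected after connect()")


def outcome(client):
    """Return the error message and its chained cause, if any."""
    try:
        asyncio.run(connect_or_raise(client))
    except ConnectionError as err:
        return str(err), err.__cause__
    return None, None


message_a, cause_a = outcome(FakeClient(fail=True))
message_b, cause_b = outcome(FakeClient(fail=False))
```

In the first case the `ConnectionError` carries the underlying `OSError` as its cause; in the second there is no underlying exception, so only the validation message is available.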

Code Duplication
The sync_wrapper and async_wrapper functions contain duplicated logging and error handling logic. Consider extracting the common code into a shared helper function.
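
A rough sketch of what such a helper could look like follows. The names `logged` and `_log_outcome`, the logger name, and the message formats are assumptions for illustration; only `sync_wrapper` and `async_wrapper` come from the comment above. The two wrappers still differ in how they call the function (plain call versus `await`), but the outcome logging lives in one place.

```python
import asyncio
import functools
import inspect
import logging

logger = logging.getLogger("conn_retry_sketch")


def _log_outcome(action, error=None):
    """Shared by both wrappers so the message format lives in one place."""
    if error is None:
        logger.info("%s completed successfully.", action)
    else:
        logger.error("%s failed: %s", action, error)


def logged(action):
    """Hypothetical decorator that logs start, success, and failure."""

    def decorator(func):
        if inspect.iscoroutinefunction(func):

            @functools.wraps(func)
            async def async_wrapper(*args, **kwargs):
                logger.info("%s started...", action)
                try:
                    result = await func(*args, **kwargs)
                except Exception as exc:
                    _log_outcome(action, exc)
                    raise
                _log_outcome(action)
                return result

            return async_wrapper

        @functools.wraps(func)
        def sync_wrapper(*args, **kwargs):
            logger.info("%s started...", action)
            try:
                result = func(*args, **kwargs)
            except Exception as exc:
                _log_outcome(action, exc)
                raise
            _log_outcome(action)
            return result

        return sync_wrapper

    return decorator
```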

Test Coverage
The retry tests could be expanded to verify the exponential backoff behavior and validate the logging output.
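
As a sketch of how backoff could be verified without real waiting, the sleep function can be injected and the requested delays recorded. Everything here is hypothetical (`retry_with_backoff`, its parameters, and the 1, 2, 4 second schedule); the actual decorator's wait strategy and test layout may differ.

```python
import asyncio


async def retry_with_backoff(func, attempts=4, base=1.0, factor=2.0,
                             sleep=asyncio.sleep):
    """Hypothetical retry loop whose delay grows by `factor` each time."""
    delay = base
    for attempt in range(1, attempts + 1):
        try:
            return await func()
        except Exception:
            if attempt == attempts:
                raise
            await sleep(delay)
            delay *= factor


recorded = []


async def fake_sleep(seconds):
    # Record the requested delay instead of actually sleeping.
    recorded.append(seconds)


async def always_fails():
    raise ConnectionError("down")


try:
    asyncio.run(retry_with_backoff(always_fails, sleep=fake_sleep))
    raised = False
except ConnectionError:
    raised = True
```

With four attempts, three delays are requested before the final failure is re-raised, so `recorded` ends up as `[1.0, 2.0, 4.0]`.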


netlify bot commented Nov 28, 2024

Deploy Preview for auto-gpt-docs canceled.

Name Link
🔨 Latest commit 96c6492
🔍 Latest deploy log https://app.netlify.com/sites/auto-gpt-docs/deploys/67498692da4f340008a04f1a

@majdyz majdyz requested review from Swiftyos and Pwuts and removed request for ntindle November 29, 2024 01:03
@majdyz majdyz enabled auto-merge November 29, 2024 08:48
@majdyz majdyz added this pull request to the merge queue Nov 29, 2024
Merged via the queue into dev with commit 63af42d Nov 29, 2024
19 checks passed
@majdyz majdyz deleted the zamilmajdy/fix-retry-decorator-async branch November 29, 2024 09:44
aarushik93 pushed a commit that referenced this pull request Dec 1, 2024
…n failed async function (#8836)
