
fix: sqlite write contention #1045

Merged 2 commits into requests-cache:main from fix/sqlite-contention on Feb 7, 2025

Conversation

charles-cooper
Contributor

In the SQLite backend, the current strategy for writes is write-with-retry. However, this can fail because our connection could be starved of the write lock if we are racing with other processes and SQLite keeps granting the write to them.

This commit addresses the issue by acquiring an exclusive lock from SQLite before any write to the database.
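For readers unfamiliar with the approach, here is a minimal sketch of the idea using the stdlib sqlite3 module directly; the names and structure are illustrative only, not the actual requests-cache implementation:

```python
import sqlite3
from contextlib import contextmanager

@contextmanager
def acquire_write_lock(connection: sqlite3.Connection):
    """Hold an exclusive SQLite lock for the duration of a write."""
    # BEGIN EXCLUSIVE blocks (up to the connection's busy timeout) until the
    # lock is granted, so the writer is not starved by competing processes the
    # way a bare write-and-retry loop can be
    connection.execute('BEGIN EXCLUSIVE')
    try:
        yield connection
        connection.commit()
    except BaseException:
        connection.rollback()
        raise

# isolation_level=None puts sqlite3 in autocommit mode, so the explicit
# BEGIN EXCLUSIVE above is not preempted by an implicit transaction
conn = sqlite3.connect('example.db', timeout=30, isolation_level=None)
with acquire_write_lock(conn):
    conn.execute('CREATE TABLE IF NOT EXISTS kv (key TEXT PRIMARY KEY, value BLOB)')
    conn.execute('INSERT OR REPLACE INTO kv VALUES (?, ?)', ('k', b'v'))
```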

References:

  • https://stackoverflow.com/a/21888761
  • https://www.sqlite.org/lockingv3.html

If my understanding is correct, this closes #820.

Checklist

  • Added docstrings and type annotations
  • Added test coverage
  • Updated changelog to describe any user-facing features or changed behavior

@charles-cooper
Contributor Author

i'm looking at the test failures and they seem to be related to testing this specific functionality. @JWCook let me know if you would like me to fix them, but as i'm new to the test suite it might be better if you take over from here!

@JWCook
Member

JWCook commented Jan 16, 2025

Thank you for the PR! I think this will likely be a nice improvement.

Before I merge this, I do need to study up on this a bit more to refresh my memory on SQLite locking, transactions, and journaling modes. A few questions, if you happen to know:

  • How will this behave in WAL mode? Will the exclusive lock prevent concurrent reads while a write is in progress?
  • Some users set their own PRAGMA settings on the connection. Are there any PRAGMAs or other connection settings you're aware of that shouldn't be used with an exclusive lock?
  • In cases where the commit() fails with an OperationalError, is there a possibility of a rollback() failing as well? I seem to remember running into this situation in the past, but I can't remember any specifics.

For the test failures, one is because users can currently pass isolation_level as a cache backend kwarg. For example, this will now raise a TypeError

SQLiteCache('http_cache', isolation_level='EXCLUSIVE')

One way to handle this would be to log a warning and ignore isolation_level if it's explicitly passed to the backend. There is a slight chance that could be a breaking change for someone out there, but I think it would be reasonable.
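A rough sketch of that warn-and-ignore approach might look like the following; this is hypothetical and not the real SQLiteCache internals:

```python
import warnings

class SQLiteCache:
    """Hypothetical sketch: warn about and drop a user-supplied isolation_level."""

    def __init__(self, db_path: str = 'http_cache', **kwargs):
        if 'isolation_level' in kwargs:
            # The backend now manages its own locking, so an explicitly passed
            # isolation_level is ignored with a warning instead of raising a TypeError
            warnings.warn('isolation_level is managed by the SQLite backend and will be ignored')
            kwargs.pop('isolation_level')
        self.db_path = db_path
        self.connection_kwargs = kwargs

# SQLiteCache('http_cache', isolation_level='EXCLUSIVE') would now warn instead of failing
```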

For the other two failures (test_write_retry and test_write_retry__exceeded_retries), those were specific to the retry logic. You can remove those, but could you also add some new tests to cover the error handling in _acquire_sqlite_lock()?

@charles-cooper
Contributor Author

> Before I merge this, I do need to study up on this a bit more to refresh my memory on SQLite locking, transactions, and journaling modes. A few questions, if you happen to know:

>   • How will this behave in WAL mode? Will the exclusive lock prevent concurrent reads while a write is in progress?

no, iiuc it's the other way around -- an exclusive lock prevents concurrent reads unless WAL mode is set.
see https://www.sqlite.org/lockingv3.html#writer_starvation and https://www.sqlite.org/wal.html#concurrency
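For illustration, here is what opting into WAL mode looks like with the stdlib sqlite3 module (not requests-cache specific):

```python
import sqlite3

conn = sqlite3.connect('example.db')
# In the default rollback-journal mode, BEGIN EXCLUSIVE blocks readers until the
# transaction ends; in WAL mode it behaves like BEGIN IMMEDIATE, so readers keep
# reading a consistent snapshot while a single writer appends to the WAL
print(conn.execute('PRAGMA journal_mode=WAL').fetchone())  # -> ('wal',)
```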

>   • Some users set their own PRAGMA settings on the connection. Are there any PRAGMAs or other connection settings you're aware of that shouldn't be used with an exclusive lock?

no, but there are a lot of pragmas and i can't say i'm familiar with all of them. i think the thing to do here is to document that requests-cache acquires exclusive locks on the db during the normal course of its operation, and let users work out the implications for their own settings.
https://www.sqlite.org/pragma.html

>   • In cases where the commit() fails with an OperationalError, is there a possibility of a rollback() failing as well? I seem to remember running into this situation in the past, but I can't remember any specifics.

hmm i've never run into this, but it sounds plausible. i guess the commit might fail for some reason like a disk failure, which would presumably cause the rollback to fail as well. my instinct is to let the exception bubble up rather than catching it, since it's probably some unrecoverable error that the user should know about, but happy to go with whatever you want here.
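A sketch of the "let it bubble up" option under discussion, assuming an explicit BEGIN EXCLUSIVE transaction as in this PR (illustrative only, not the actual implementation):

```python
import sqlite3

def locked_write(conn: sqlite3.Connection, sql: str, params=()):
    conn.execute('BEGIN EXCLUSIVE')
    try:
        conn.execute(sql, params)
        conn.commit()
    except sqlite3.OperationalError:
        # If commit() failed because the disk or database is in a bad state,
        # rollback() may raise for the same reason; letting that propagate
        # surfaces the unrecoverable error instead of hiding it
        conn.rollback()
        raise
```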

> For the test failures, one is because users can currently pass isolation_level as a cache backend kwarg. For example, this will now raise a TypeError
>
> SQLiteCache('http_cache', isolation_level='EXCLUSIVE')
>
> One way to handle this would be to log a warning and ignore isolation_level if it's explicitly passed to the backend. There is a slight chance that could be a breaking change for someone out there, but I think it would be reasonable.
>
> For the other two failures (test_write_retry and test_write_retry__exceeded_retries), those were specific to the retry logic. You can remove those, but could you also add some new tests to cover the error handling in _acquire_sqlite_lock()?

i will check on these, probably this weekend!

@JWCook force-pushed the main branch 4 times, most recently from d82e01b to 1703d35 on January 17, 2025 at 19:04

codecov bot commented Jan 27, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.13%. Comparing base (9b1d269) to head (84a5cfb).
Report is 4 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1045      +/-   ##
==========================================
- Coverage   99.13%   99.13%   -0.01%     
==========================================
  Files          23       23              
  Lines        2084     2083       -1     
  Branches      332      331       -1     
==========================================
- Hits         2066     2065       -1     
  Misses          9        9              
  Partials        9        9              


@JWCook
Member

JWCook commented Feb 7, 2025

@charles-cooper Are there more changes you'd like to make here? Otherwise I should have some time this weekend to finish this up. I think all that's needed is a bit more error handling and test coverage.

@charles-cooper
Contributor Author

> @charles-cooper Are there more changes you'd like to make here? Otherwise I should have some time this weekend to finish this up. I think all that's needed is a bit more error handling and test coverage.

nope - have at it!

in the sqlite backend, the current strategy for writes is to
write-with-retry. however, this can fail because our connection could be
starved for a write if we are racing with other processes and sqlite
keeps giving the write to other processes.

this commit addresses the issue by acquiring an exclusive lock from
sqlite before any write to the database.

references:
- https://stackoverflow.com/a/21888761
- https://www.sqlite.org/lockingv3.html

update tests
@JWCook added this to the v1.3 milestone on Feb 7, 2025
@JWCook added the bug and backends labels on Feb 7, 2025
@JWCook force-pushed the fix/sqlite-contention branch from 219f0d0 to d38a56d on February 7, 2025 at 18:30
@JWCook JWCook merged commit 02397fe into requests-cache:main Feb 7, 2025
11 checks passed
Successfully merging this pull request may close these issues: deadlock with threads