Cache access fails after forking if multiple Cache instances are backed by the same database #325

Open
randomir opened this issue Jul 25, 2024 · 2 comments


@randomir

Running:

import os
import diskcache

a = diskcache.Cache(directory='/tmp/cache')
b = diskcache.Cache(directory='/tmp/cache')

os.fork()

a.get('key')

on a macOS machine fails with:

Traceback (most recent call last):
  File "/Users/distiller/project/fork.py", line 9, in <module>
    a.get('key')
  File "/Users/distiller/project/env/lib/python3.12/site-packages/diskcache/core.py", line 1165, in get
    rows = self._sql(select, (db_key, raw, time.time())).fetchall()
           ^^^^^^^^^
  File "/Users/distiller/project/env/lib/python3.12/site-packages/diskcache/core.py", line 648, in _sql
    return self._con.execute
           ^^^^^^^^^
  File "/Users/distiller/project/env/lib/python3.12/site-packages/diskcache/core.py", line 623, in _con
    con = self._local.con = sqlite3.connect(
                            ^^^^^^^^^^^^^^^^
sqlite3.OperationalError: disk I/O error

(tested on CircleCI M1 medium instance)

AFAICT, all of the following conditions have to be met:

  • two (or more) Cache instances that use the same directory
  • fork before Cache.get()
  • MacOS

If any of the above is removed, the snippet works as expected.
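To check each condition in isolation, it helps to run the failing call in a forked child and read its exit code from the parent, so one script can try several variants. The following is a minimal stdlib-only sketch; `run_in_fork` is a hypothetical helper name, not part of diskcache.

```python
import os
import sys
import traceback


def run_in_fork(fn):
    """Run fn() in a forked child; return the child's exit code (0 = success).

    Hypothetical helper for isolating fork-related failures.
    """
    pid = os.fork()
    if pid == 0:
        # Child: run the callable and report failure through the exit code.
        code = 0
        try:
            fn()
        except BaseException:
            traceback.print_exc()
            code = 1
        finally:
            sys.stderr.flush()
            # os._exit avoids running the parent's atexit handlers in the child.
            os._exit(code)
    # Parent: wait for the child and decode its wait status (Python 3.9+).
    _, status = os.waitpid(pid, 0)
    return os.waitstatus_to_exitcode(status)
```

With the two `Cache` instances from the snippet above already created, `run_in_fork(lambda: a.get('key'))` would presumably return 1 on macOS and 0 on Linux, going by the reports in this thread; that outcome is an expectation, not something verified here.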

The SQLite threading mode (sqlite3.threadsafety) is multi-thread ("Threads may share the module, but not connections"), but I don't think that is the cause, because diskcache already reconnects after forking.

$ python
Python 3.12.4 (main, Jul 18 2024, 14:14:06) [Clang 14.0.0 (clang-1400.0.29.202)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import sqlite3
>>> sqlite3.threadsafety
1
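For reference, the value printed above can be mapped to SQLite's threading mode names. This is a small sketch; `describe_threadsafety` is a hypothetical helper, and the mapping follows the Python `sqlite3` documentation (before Python 3.11 the attribute was hard-coded to 1).

```python
import sqlite3

# DB-API threadsafety level -> SQLite threading mode, per the sqlite3 docs.
THREADSAFETY_MODES = {
    0: "single-thread",  # threads may not share the module
    1: "multi-thread",   # threads may share the module, but not connections
    3: "serialized",     # threads may share the module and connections
}


def describe_threadsafety(level=None):
    """Return the mode name for a level (defaults to this interpreter's value)."""
    if level is None:
        level = sqlite3.threadsafety
    return THREADSAFETY_MODES.get(level, "unknown")


print(sqlite3.threadsafety, describe_threadsafety())
```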

Possibly related to #266.

@ddorian
Contributor

ddorian commented Aug 9, 2024

I tested your code on Ubuntu 22.04 (Python 3.12, x86) and it worked fine. This may be related to how fork works underneath in Python, though I forced the same start method (fork):

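import multiprocessing
import os

import diskcache

# Force the "fork" start method and confirm it took effect.
# Note: this setting only affects multiprocessing; os.fork() below always forks.
multiprocessing.set_start_method("fork", force=True)
print(multiprocessing.get_start_method())

a = diskcache.Cache(directory="/tmp/cache")
b = diskcache.Cache(directory="/tmp/cache")

os.fork()

a.get("key")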

@randomir
Author

randomir commented Aug 9, 2024

@ddorian, exactly, this works perfectly on Linux (as everything does, right?). Maybe I wasn't clear enough above, but MacOS is a necessary condition for reproduction.


2 participants