Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

java.sql.SQLException: database is locked #105

Closed
mpenet opened this issue Nov 24, 2012 · 5 comments
Closed

java.sql.SQLException: database is locked #105

mpenet opened this issue Nov 24, 2012 · 5 comments

Comments

@mpenet
Copy link

mpenet commented Nov 24, 2012

This happened once earlier, retrying just "fixed" it, but this might worth a report.

Here is the log:

Welcome to Clojars, mpenet!

Deploying jayq/jayq 0.3.0
Error: database is locked
java.sql.SQLException: database is locked
    at org.sqlite.DB.throwex(DB.java:399)
    at org.sqlite.DB.exec(DB.java:79)
    at org.sqlite.Conn.commit(Conn.java:390)
    at com.mchange.v2.c3p0.impl.NewProxyConnection.commit(NewProxyConnection.java:803)
    at clojure.java.jdbc.internal$transaction_STAR_.invoke(internal.clj:209)
    at clojars.db$add_jar$fn__1136.invoke(db.clj:287)
    at clojure.java.jdbc.internal$with_connection_STAR_.invoke(internal.clj:188)
    at clojars.db$add_jar.doInvoke(db.clj:287)
    at clojure.lang.RestFn.invoke(RestFn.java:425)
    at clojars.scp$finish_deploy.invoke(scp.clj:112)
    at clojars.scp$nail.invoke(scp.clj:140)
    at clojars.scp$_nailMain.invoke(scp.clj:158)
    at clojars.scp.nailMain(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at com.martiansoftware.nailgun.NGSession.run(Unknown Source)
@technomancy
Copy link
Collaborator

While switching to a DB that supports concurrency is probably the right long-term approach, a low-fi short-term fix might be to use a clojure.core/locking call around each place the promote field is read from or written to. Nothing granular, just a top-level lock.

If we were at all concerned about overall throughput this would be a disastrous suggestion, but as a mechanism for avoiding nasty exceptions while we regroup and come up with a better long-term solution, what do you think, @ato?

@pjstadig
Copy link

If it works on a retry, then another possibility might be adding automatic retries to promotion.

@technomancy
Copy link
Collaborator

OK, upon further digging it looks like SQLite can do the right thing without any special locking; it's just a case of bad defaults: http://sqlite.org/c3ref/busy_timeout.html

Unfortunately the while jdbc driver has a setBusyTimeout method, I can't figure out how get ahold of a Database object. The first step is to neuter Korma's annoying automatic connection pooling:

(swap! korma.db/_default assoc :pool (delay (:db config)))

But even at that point all we have is an org.sqlite.Conn connection object that neither exposes timeout methods nor a mechanism for getting at the underlying org.sqlite.Database.

http://sqlitejdbc.sourceforge.jp/org/sqlite/Database.html#setBusyTimeout(int)

Will continue digging.

@technomancy
Copy link
Collaborator

The plot thickens: there are actually two separate SQLite JDBC drivers that both use the org.sqlite package; the link above is to the one we're not using.

The one we are using does not expose a .setBusyTimeout method, but it does implement .setQueryTimeout incorrectly in a way that claimes to set the busy timeout:

https://bitbucket.org/xerial/sqlite-jdbc/issue/27/allow-user-to-specify-busy-timeout-when

However, setting it has no effect on the timeout; we still get a locking error about 50% of the time well before the timeout has expired.

@technomancy
Copy link
Collaborator

Aha; so it turns out that while .getQueryTimeout returns milliseconds, the .setQueryTimeout method interprets its arguments in terms of seconds. And if you set it to too large of a value (like 30,000 seconds) it's a no-op, but setting it to 30 actually makes the problem disappear for the repro case I manually constructed.

swr1bm86 pushed a commit to swr1bm86/clojars-web that referenced this issue Oct 3, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants