Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move away from sqlite #307

Closed
tobias opened this issue Mar 17, 2015 · 12 comments
Closed

Move away from sqlite #307

tobias opened this issue Mar 17, 2015 · 12 comments

Comments

@tobias
Copy link
Member

tobias commented Mar 17, 2015

This is to track moving away from sqlite to something we can access
concurrently. We've had quite a few bugs related to concurrency issues
with sqlite (#266, #115, #105), and we can't do automatic promotion
with it. It's also possible to lock the database from outside of the
app when doing maintenance (as touched on in #226).

@technomancy and @xeqi have done quite a bit of work on using a
log-based event stream system. The event logging is already happening
in production, and there is work on the less-sql branch to use the log
instead of sqlite for queries. You can read about that work on
the mailing list. If we want to continue with the event log approach,
at least the following things need to happen:

  • the less-sql branch needs to be brought up to date
  • command-line maintenance needs to write entries to the log - the
    current shell scripts only alter the sqlite db, so no deletion
    events get written to the log. It may be worthwhile to have lein
    aliases to write these deletion events to the log.
  • if we do have processes other than the app itself writing to the
    logs, we'll need to move to file-based locking instead of using
    clojure's locking macro to prevent log corruption
  • we'll also need some way to tell the app to reload the logs if other
    processes can write to them (a file watcher perhaps?)

We don't currently have the backup instance running, but if we do plan
to bring that back, we need to take that in to consideration (file
locking may be enough here if both instances share the same log file
set.

An alternative approach would be to move to a different database that
better handles concurrency (postgres, etc). That delegates locking and
corruption issues to the db, but adds infrastructure complexity.

@tobias
Copy link
Member Author

tobias commented Mar 23, 2015

I've rebased the work from the less-sql branch, and updated it to the point where all of the tests are at least passing. I've pushed that work to the event-log branch. It wasn't a clean rebase, given that the less-sql branch was a couple of years old, and lot has changed since then. So even though the tests are passing, my current confidence in this branch is low.

@tobias
Copy link
Member Author

tobias commented Apr 20, 2015

Other options:

  • Move to h2 or some other embedded java db that supports concurrency. This may be doable with minimal changes, but we would still need some way make admin updates. I believe h2 has a CLI client that can connect to the embedded instance.
  • Serialize writes to sqlite via an agent, queue, or similar. We would need a way to wait for and receive the result of the operation.

@tobias tobias self-assigned this Apr 27, 2015
tobias added a commit that referenced this issue Apr 29, 2015
This forces all updates/inserts to the db to go through a
single-threaded executor, which prevents concurrent db access and the
resulting lock exceptions. This is hopefully a stopgap measure until we
get rid of sqlite entirely.
@tobias
Copy link
Member Author

tobias commented Apr 29, 2015

That commit serializes db writes through a single thread, and 7fa5980 moves the admin script functionality to the clojars.admin ns so we won't collide with the app's db writes when doing admin tasks. Note that serializing the writes is a stop-gap until we move away from sqlite completely.

@thiagofm
Copy link

Why can't we use just a normal RDBMS as postgres? I find that more straightforward and extensible.

@tobias
Copy link
Member Author

tobias commented Jul 30, 2015

Postgres would be nice, but there are costs associated with moving to it:

  • it's an additional process to monitor on the server
  • dev environments would need a postgres running
  • backups may be a little trickier

The serialized-writes solution that we have currently is a bit gross,
but is working currently, and solves all of the issues we had with
talking directly to sqlite, so I'm hesitant to bring in a more complex
solution right now.

swr1bm86 pushed a commit to swr1bm86/clojars-web that referenced this issue Oct 3, 2015
…ojars#307]

This forces all updates/inserts to the db to go through a
single-threaded executor, which prevents concurrent db access and the
resulting lock exceptions. This is hopefully a stopgap measure until we
get rid of sqlite entirely.
@kirillsalykin
Copy link

@tobias what do you think about using datomic free (with h2 as store)?

@tobias
Copy link
Member Author

tobias commented Nov 21, 2015

Thanks for the suggestion, but our data isn't something that we need to maintain history of, generally, and I don't want to add the complexity of running a transactor.

@jkutner
Copy link

jkutner commented Dec 17, 2015

PR: #443

@thestinger
Copy link

You could at least turn on SQLite's WAL mode so readers aren't blocked by writers and vice versa: https://www.sqlite.org/wal.html. The timeout handling is configurable: https://sqlite.org/faq.html#q5. SQLite isn't going to be fast with more than one concurrent writer but there's no reason to be hacking around it: it does work fine, just perhaps not how you want it to by default.

@tobias
Copy link
Member Author

tobias commented Feb 25, 2016

@thestinger - we'd happily take a pull request that enables WAL via the java SQLite driver.

@danielcompton
Copy link
Member

danielcompton commented Jul 30, 2016

Just a followup on this, I think there were (at least!) two more reasons why we were getting concurrency errors:

1: jdbc-sqlite was compiled without HAVE_USLEEP up until xerial/sqlite-jdbc@4b2b2c3 (xerial/sqlite-jdbc#104).

Running

(require '[clojure.java.jdbc :as j])

(dorun (pmap #(time (let [conn (get-connection)]
                      (dotimes [n 20]
                        (j/with-db-transaction [conn conn]
                                               (j/insert! conn :test {:rand (* % n)}))))) (range 5)))

I get

"Elapsed time: 17.946152 msecs"
"Elapsed time: 1019.802661 msecs"
"Elapsed time: 2022.392828 msecs"
"Elapsed time: 3027.563515 msecs"
SQLException [SQLITE_BUSY]  The database file is locked (database is locked)  org.sqlite.core.DB.newSQLException (DB.java:890)

This looks like the same issue as http://beets.io/blog/sqlite-nightmare.html, where the sqlite driver is sleeping for 1 second at a time before retrying, meaning that concurrent writes could stack up, especially combined with:

2: There is also xerial/sqlite-jdbc#59 which would allow us to use sqlite with two different threading options: multi-threaded, or single threaded. Both of these would probably be faster than the default serialised option.

There hasn't been a release of jdbc-sqlite for a while, but we should look at upgrading and taking advantage of these features when there is.

@tobias
Copy link
Member Author

tobias commented Dec 31, 2019

We've moved to postgres (see #736), which has been released as part of Clojars 79.

@tobias tobias closed this as completed Dec 31, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants