Improve perf by compiling with no locking and allowing users to configure it #59
Comments
That's a good catch. OK. I'll create a non-thread-safe version only for Win/Mac/Linux (x86, x86_64). We need to wait for a contribution for the other platforms. |
Uploaded a snapshot jar: https://oss.sonatype.org/content/repositories/snapshots/org/xerial/sqlite-jdbc/3.9.0-SNAPSHOT/, which is built with SQLITE_THREADSAFE=0. We also need to add a configuration method that calls the sqlite3_config(flags) function in sqlite3 to select the thread-safe mode. |
@headius By the way, did you try If |
@xerial I will give it a try on the benchmark from jruby/jruby#3398. |
@xerial Unfortunately, SQLITE_THREADSAFE=2 seems to have the same issues as =1 and the default mode (which I believe is also =1). It doesn't seem right that sqlite blocks all threads regardless of what database they're connecting to, but that seems to be the case. |
Thanks for testing. Then we should build multiple versions of the native libraries with different compilation options. The default should use a thread-safe library, and through some configuration we need to be able to use the non-thread-safe version. A possible approach is using a JVM system property (e.g., |
A property would work but the problem with properties is making sure they get set early enough. A typical JRuby user just runs a Ruby script, which loads all the libraries needed. Those libraries would need a way to set the property such that it's picked up by the driver. Can we make it a parameter of the connection? |
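To make the timing concern concrete, here is a minimal sketch of a property lookup that is deferred until it is needed rather than done in a static initializer. The property name `example.sqlite.threadsafe` and the class `ThreadingModeProperty` are hypothetical, invented for this illustration; nothing in this thread settles on a name, and this is not the driver's actual code.

```java
/** Hypothetical sketch: resolve the requested threading mode from a JVM system property. */
final class ThreadingModeProperty {
    // Hypothetical property name, chosen for this sketch only.
    static final String KEY = "example.sqlite.threadsafe";

    /**
     * Reads the property at call time (not in a static initializer), so a
     * script can still set it at any point before the first connection opens.
     * Any value other than "false" keeps the thread-safe default.
     */
    static boolean threadSafeRequested() {
        return !"false".equalsIgnoreCase(System.getProperty(KEY, "true"));
    }

    public static void main(String[] args) {
        System.out.println(threadSafeRequested());
        // A library loaded by a script could opt out like this, as long as it
        // runs before the driver reads the property.
        System.setProperty(KEY, "false");
        System.out.println(threadSafeRequested());
    }
}
```

Reading the value lazily narrows the window in which the property must be set, but it does not remove it: whatever value is visible when the native library is first loaded is the one that sticks.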
Once the native code is loaded by the JVM, we have no way to unload it. So a connection parameter would take effect only for the first connection; every later connection would have to use the same mode (thread-safe or unsafe) as the first. This behavior might be acceptable, since an application should not mix multiple running modes. |
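The "first connection decides" behavior described above can be sketched as a small latch. This is only an illustration under stated assumptions: `NativeModeLatch`, its `Mode` enum, and both methods are hypothetical names, and the actual library loading is left out because the native library names are not given in this thread.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical sketch: the first requested threading mode wins for the whole JVM. */
final class NativeModeLatch {
    enum Mode { THREAD_SAFE, SINGLE_THREAD }

    // Empty until the first connection picks a mode and never reset afterwards,
    // mirroring the fact that a loaded native library cannot be unloaded.
    private static final AtomicReference<Mode> LOADED = new AtomicReference<>();

    /** Returns the mode actually in effect, which may differ from the request. */
    static Mode acquire(Mode requested) {
        // compareAndSet succeeds only for the very first caller.
        if (LOADED.compareAndSet(null, requested)) {
            // A real driver would load the matching native build here.
            return requested;
        }
        return LOADED.get();
    }

    /** Strict variant: fail loudly instead of silently using another mode. */
    static void require(Mode requested) {
        Mode actual = acquire(requested);
        if (actual != requested) {
            throw new IllegalStateException(
                "native library already loaded in " + actual
                    + " mode; cannot switch to " + requested);
        }
    }

    public static void main(String[] args) {
        System.out.println(acquire(Mode.SINGLE_THREAD));
        // The second request is ignored; the first mode stays in effect.
        System.out.println(acquire(Mode.THREAD_SAFE));
    }
}
```

Whether a mismatched request should be ignored (`acquire`) or rejected (`require`) is a design choice; rejecting makes the "do not mix modes" rule visible to the application instead of hiding it.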
@xerial Ahh yes, I understand. I have another suggestion that may be too complex to support...two libraries. If we had two jni libraries with different symbols and different thread-safe modes, it would simply be a matter of calling the correct one. So, two JNI interfaces, two C files, two native libraries. I can understand if that's too much change to introduce. The property will work ok, but we'll have to document it well (somewhere in JRuby or related libraries) so people aren't constantly reporting performance bugs to us. |
@xerial I think the property will be fine. My current mystery is why other Ruby implementations do not seem to have this threading bottleneck with the sqlite3-ruby gem. |
I have sent an email to the sqlite-users list to ask about threadsafety and why we might be seeing serialized performance regardless of mode or database independence. |
I'm reluctant to provide two types of API (because it increases the maintenance cost); I would rather prepare two types of packages: sqlite-jdbc.jar and sqlite-jdbc-singlethread.jar. |
@xerial The two-package approach might actually be the best. All jars in a typical JRuby application get loaded dynamically at runtime rather than statically on a classpath. |
OK. I'll tweak packaging scripts so that we can prepare two types of binaries (normal jar and single-thread one). |
28 days later... :-) Anything I can do to help you with this? |
Alternatively, you could change SQLITE_DEFAULT_MEMSTATUS. Inside sqlite.c, there is a comment saying:
|
@headius I am using stock JRuby 9.0.4.0 and sqlite-jdbc from master.
|
For whatever reason, on my machine, the JRuby test didn't show as much difference between all of them. I did a basic test: https://gist.github.com/patcheng/28580eed030361dc0f90 This shows better results:
|
xerial#59 disabled memstatus and make the library thread-safe
How many cores do you run on? |
I think that commit a6a559e is bogus. If multiple connections are created to the same database file, the database may get corrupted if SQLite has been compiled with I propose to use |
@headius My tests were performed on a MacBook Pro with a 2.8 GHz Core i7: 4 real cores and 4 hyperthreaded cores. @mkauf I think it's safer to stick with When compiling |
What's the status of this? It looks like either mode 1 or 2 will work with |
@patcheng Can you turn your changes into a PR? I'd like to see the full patch and have some of our users test it out. |
@patcheng Never mind, I realized it was trivial :-)

diff --git a/Makefile b/Makefile
index de1f1d3..df440a4 100644
--- a/Makefile
+++ b/Makefile
@@ -62,6 +62,8 @@ $(SQLITE_OUT)/sqlite3.o : $(SQLITE_UNPACKED)
 	    -DSQLITE_ENABLE_FTS3_PARENTHESIS \
 	    -DSQLITE_ENABLE_RTREE \
 	    -DSQLITE_ENABLE_STAT2 \
+	    -DSQLITE_THREADSAFE=1 \
+	    -DSQLITE_DEFAULT_MEMSTATUS=0 \
 	    $(SQLITE_FLAGS) \
 	    $(SQLITE_OUT)/sqlite3.c

@chuckremes Maybe you could build the driver with this patch and see how perf looks for you? I'd like to put this one to bed. |
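For anyone testing a patched build, one way to confirm which options the native library was compiled with is SQLite's `PRAGMA compile_options`, which can be run over any open JDBC connection. The sketch below is a rough illustration, not part of the patch: the class `CompileOptionsCheck` and its method names are hypothetical, and which options a given SQLite version reports (in particular whether `DEFAULT_MEMSTATUS` shows up) is an assumption to verify against your build.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Hypothetical helper for inspecting how the loaded SQLite library was built. */
final class CompileOptionsCheck {

    /** Collects the rows of PRAGMA compile_options, e.g. "THREADSAFE=1". */
    static List<String> compileOptions(Connection conn) throws SQLException {
        List<String> options = new ArrayList<>();
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery("PRAGMA compile_options")) {
            while (rs.next()) {
                options.add(rs.getString(1));
            }
        }
        return options;
    }

    /**
     * Looks up the value of a NAME=VALUE option. Names are given without the
     * SQLITE_ prefix, which is how the pragma reports them.
     */
    static Optional<String> optionValue(List<String> options, String name) {
        String prefix = name + "=";
        for (String option : options) {
            if (option.startsWith(prefix)) {
                return Optional.of(option.substring(prefix.length()));
            }
        }
        return Optional.empty();
    }
}
```

With a connection in hand, `optionValue(compileOptions(conn), "THREADSAFE")` should show whether the jar on the classpath is really the patched build before a long benchmark run is started.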
@headius I will try to do so today. All I can say is "wow" when looking at all the followup you have performed on this issue since Oct 2015. Thank you very much. |
Testing this patched version in a production app right now. I'll let you know what kind of performance increase I see. Last run with un-patched driver took 8 hours and 50 minutes. A good chunk of that was DB access. |
Surprisingly, the latest run just finished. It took 59m 56s to complete. So this change basically shaved 8 hours off of a 9 hour run. |
Well I'm convinced! How about you, @xerial? |
Fixes #59: Set THREADSAFE=1 and DEFAULT_MEMSTATUS=0
We were investigating a concurrent performance issue with JRuby at jruby/jruby#3398 and, after a few hours of investigation, figured out that sqlite-jdbc compiles sqlite with the default configuration, which is always thread-safe. Unfortunately, that default mode does not seem to give separate database connections (to separate databases) their own mutexes. As a result, parallel performance did not scale properly even when inserting into completely separate databases.
The fix, which is trivial, was to compile sqlite with SQLITE_THREADSAFE=0.
Obviously this is not sufficient for the library. The full fix would be to build sqlite this way but when opening connections default to a thread-safe setup. As it turns out, sqlite can upgrade connections to be more threadsafe, but it can't downgrade them. The current configuration sets them by default to serialize all accesses.
The full doco for this is here: https://www.sqlite.org/threadsafe.html
Basically, sqlite-jdbc needs to configure the library to be thread-unsafe as a minimum, and then by default specify thread-safety when opening the connection. Users can then specify configuration parameters to open the connection as thread-unsafe (single thread access) giving them better concurrency if inserting into isolated databases.
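The per-connection choice described above maps onto the mutex flags accepted by `sqlite3_open_v2`. The sketch below only computes the flag word; the constants are copied from `sqlite3.h`, while the class `OpenFlags` and the method `flagsFor` are hypothetical names for this illustration. One caveat, as far as the SQLite documentation reads: these flags are honored only when the library has not been put into single-thread mode at compile time or start time, so treat the "compile thread-unsafe, upgrade at open" idea as something to check against https://www.sqlite.org/threadsafe.html rather than as settled.

```java
/** Hypothetical sketch: pick sqlite3_open_v2 flags, defaulting to the safest mode. */
final class OpenFlags {
    // Values copied from sqlite3.h.
    static final int SQLITE_OPEN_READWRITE = 0x00000002;
    static final int SQLITE_OPEN_CREATE    = 0x00000004;
    static final int SQLITE_OPEN_NOMUTEX   = 0x00008000; // "multi-thread" mode
    static final int SQLITE_OPEN_FULLMUTEX = 0x00010000; // "serialized" mode

    /**
     * Serialized access is the default; a caller who guarantees that each
     * connection is used by only one thread at a time can opt out of the
     * per-connection mutex.
     */
    static int flagsFor(boolean oneThreadPerConnection) {
        int base = SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE;
        int mutex = oneThreadPerConnection ? SQLITE_OPEN_NOMUTEX : SQLITE_OPEN_FULLMUTEX;
        return base | mutex;
    }

    public static void main(String[] args) {
        System.out.printf("default: 0x%x%n", flagsFor(false));
        System.out.printf("opt-out: 0x%x%n", flagsFor(true));
    }
}
```

Keeping the safe mode as the default means existing applications see no behavior change, and only users who explicitly ask for it take on the responsibility of not sharing a connection across threads.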
I don't have a full patch for sqlite-jdbc, but I wanted to start the dialog now. We have a large number of JRuby users hitting sqlite via your library in heavily concurrent applications, and this would give them a better chance of having good scaling characteristics.