SQLite "cannot commit - no transaction is active" on large data sets #83

Closed
jetzerb opened this issue Oct 9, 2019 · 7 comments

@jetzerb
Contributor

jetzerb commented Oct 9, 2019

When a sufficiently large amount of data is passed to trdsql configured to use SQLite, an error is thrown indicating that there is no active transaction to commit.

Here's a demonstration. Although the data comes from /dev/urandom, the behavior was consistent: it worked fine for any byte count less than or equal to 78768, and threw the error for larger counts. (I would not be surprised if the cutoff were different in every environment, though.)

for byteCnt in 78768 78769
do
	dd if=/dev/urandom bs=$byteCnt count=1 |od -tx1 |sed 's/ /\x0a/g;' |
	trdsql -oat 'select count(*) as RecCnt, c1 as Val from - group by c1 order by count(*) desc limit 5';
done;
1+0 records in
1+0 records out
78768 bytes (79 kB, 77 KiB) copied, 0.0055778 s, 14.1 MB/s
+--------+-----+
| RecCnt | Val |
+--------+-----+
|    353 | a1  |
|    348 |  52 |
|    348 | d8  |
|    347 | 8a  |
|    345 |  81 |
+--------+-----+
1+0 records in
1+0 records out
78769 bytes (79 kB, 77 KiB) copied, 0.0052549 s, 15.0 MB/s
+--------+-----+
| RecCnt | Val |
+--------+-----+
+--------+-----+
2019/10/09 16:26:31 ERROR(COMMIT):cannot commit - no transaction is active

I looked briefly at the relevant code but didn't see anything alarming...

  • trdsql.go: Exec()
  • importer.go: ImportFile()
  • database.go: Import(), insertImport()
@noborus
Owner

noborus commented Oct 10, 2019

Thank you for the good issue report.

I think this error occurs because the query becomes too long: the upper limit for bulk inserts was set too high. I think it will be fixed in the next patch.

diff --git a/database.go b/database.go
index 20bbb7f..c74dbbe 100644
--- a/database.go
+++ b/database.go
@@ -61,7 +61,7 @@ func Connect(driver, dsn string) (*DB, error) {
        switch driver {
        case "sqlite3":
                db.quote = "`"
-               db.maxBulk = 500
+               db.maxBulk = 400
        case "mysql":
                db.quote = "`"
                db.maxBulk = 1000

I will verify and set the correct value.

noborus added the bug label Oct 10, 2019
@noborus
Owner

noborus commented Oct 10, 2019

I'm sorry, that was a misunderstanding on my part.
I will check again.

@jetzerb
Contributor Author

jetzerb commented Oct 10, 2019

I added some debugging within database.go:

@@ -193,7 +193,7 @@ func (db *DB) insertImport(table *Table, reader Reader) error {
        bulk := make([]interface{}, 0, table.maxCap)

        pRows := reader.PreReadRow()
-       for eof := false; !eof; {
+       for eof, cnt := false, 0; !eof; {
                if len(pRows) > 0 {
                        for (table.count * len(table.row)) < table.maxCap {
                                if len(pRows) == 0 {
@@ -216,6 +216,8 @@ func (db *DB) insertImport(table *Table, reader Reader) error {
                        }
                }
                stmt, err = db.bulkStmtOpen(table, stmt)
+               cnt++
+               debug.Printf("Bulk Insert %4d of %d records (cap = %d)",cnt,table.count, table.maxCap)
                if err != nil {
                        return err
                }

and called trdsql with the -debug flag in the test script from above. Output looks like this:

2019/10/10 07:46:14 [select][ ][count(*)][ ][as][ ][RecCnt][,][ ][c1][ ][as][ ][Val][ ][from][ ][-][ ][group][ ][by][ ][c1][ ][order][ ][by][ ][count(*)][ ][desc][ ][limit][ ][5]                                                            
2019/10/10 07:46:14 Set in CSV because the extension is unknown: [-]
2019/10/10 07:46:14 Column Names: [c1]
2019/10/10 07:46:14 Column Types: [text]
2019/10/10 07:46:14 CREATE TEMPORARY TABLE `-` ( `c1` text );
2019/10/10 07:46:14 INSERT INTO `-` (`c1`) VALUES (?)
2019/10/10 07:46:14 Bulk Insert    1 of 1 records (cap = 500)
2019/10/10 07:46:14 INSERT INTO `-` (`c1`) VALUES (?),(?),(?). . . 500 total placeholders . . .
2019/10/10 07:46:14 Bulk Insert    2 of 500 records (cap = 500)                                                                                                                                                                               
2019/10/10 07:46:14 Bulk Insert    3 of 500 records (cap = 500)                                                                                                                                                                               
. . . snip . . .
2019/10/10 07:46:14 Bulk Insert  167 of 500 records (cap = 500)                                                                                                                                                                               
2019/10/10 07:46:14 Bulk Insert  168 of 500 records (cap = 500)                                                                                                                                                                               
2019/10/10 07:46:14 INSERT INTO `-` (`c1`) VALUES (?),(?),(?),. . . 193 total placeholders . . .
2019/10/10 07:46:14 Bulk Insert  169 of 193 records (cap = 500)
2019/10/10 07:46:14 select count(*) as RecCnt, c1 as Val from `-` group by c1 order by count(*) desc limit 5
+--------+-----+
| RecCnt | Val |
+--------+-----+
+--------+-----+
2019/10/10 07:46:14 ERROR(COMMIT):cannot commit - no transaction is active

The only difference between the first query that works and the second one that doesn't is the number of records inserted on the final iteration (Bulk insert 169).

This might be an issue with the go-sqlite3 driver.
The error message itself is coming out of the large switch statement in sqlite3-binding.c (search for "Opcode: AutoCommit P1 P2").

Question: Do you need to bother with a transaction at all for SQLite? Every execution of trdsql creates a new in-memory DB, right? So no need to worry about any other users.

@noborus
Owner

noborus commented Oct 10, 2019

Can you try this patch?

diff --git a/importer.go b/importer.go
index f97a4ac..d2da0c2 100644
--- a/importer.go
+++ b/importer.go
@@ -1,9 +1,11 @@
 package trdsql
 
 import (
+	"bufio"
 	"compress/gzip"
 	"fmt"
 	"io"
+	"io/ioutil"
 	"log"
 	"os"
 	"path/filepath"
@@ -258,7 +260,8 @@ func importFileOpen(tableName string) (io.ReadCloser, error) {
 
 func singleFileOpen(fileName string) (io.ReadCloser, error) {
 	if len(fileName) == 0 || fileName == "-" || strings.ToLower(fileName) == "stdin" {
-		return os.Stdin, nil
+		reader := bufio.NewReader(os.Stdin)
+		return ioutil.NopCloser(reader), nil
 	}
 	fileName = trimQuote(fileName)
 	file, err := os.Open(fileName)

I don't know why yet, but I noticed that it works fine for every input except stdin.
I think this happens when reads from stdin are unstable.

@noborus
Owner

noborus commented Oct 10, 2019

Question: Do you need to bother with a transaction at all for SQLite? Every execution of trdsql creates a new in-memory DB, right? So no need to worry about any other users.

Yes, SQLite requires a transaction here.
Without one, go-sqlite3 may execute the SQL statements in different sessions, each of which sees its own in-memory database.
Therefore, either cache mode must be set to shared, or everything must run in the same transaction.

@jetzerb
Contributor Author

jetzerb commented Oct 10, 2019

Your patch works!

  • I verified your findings: I only got the error when piping data to trdsql; when querying a file, I did not receive the error.
  • After applying your patch, I no longer get the error. Here are my results from a 10M row query:
dd if=/dev/urandom bs=10000000 count=1 |od -tx1 |sed 's/ /\x0a/g;' |
trdsql -oat 'select count(*) as RecCnt, c1 as Val from - group by c1 order by count(*) desc limit 5'

Results:

1+0 records in
1+0 records out
10000000 bytes (10 MB, 9.5 MiB) copied, 8.87951 s, 1.1 MB/s
+--------+-----+
| RecCnt | Val |
+--------+-----+
|  39544 |  60 |
|  39543 |  82 |
|  39536 |  94 |
|  39527 |  08 |
|  39496 | aa  |
+--------+-----+

@noborus
Owner

noborus commented Oct 10, 2019

Thank you for trying!
Thank you for the report!

noborus added a commit that referenced this issue Oct 11, 2019
Wrap os.Stdin with bufio.NewReader (Fix #83).