Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fails to open database if opened at any point with a third party tool #119

Closed
coltoneshaw opened this issue Oct 2, 2023 · 10 comments
Closed

Comments

@coltoneshaw
Copy link

This may be an issue listed, i searched and couldn't find anything exact. Also posted in the sqlite library I use that pulls this package - m4heshd/better-sqlite3-multiple-ciphers#62

I seem to be having a problem when connecting to the database via SQLiteStudio, it breaks something. I assume it's my WAL archive.

Config when opening the db:

PRAGMA cipher='chacha20'
PRAGMA journal_mode = WAL
PRAGMA temp_store = memory
PRAGMA mmap_size = 30000000000
PRAGMA page_size = 32768
PRAGMA vacuum

Steps to break it:

  1. Start the electron application
  2. Set the password for the database
  3. Refresh the electron application
  4. Type the password in to unlock it, all works.
  5. close the electron application
  6. Connect to the database via SQLiteStudio
  7. Make a few queries on SQLiteStudio
  8. Disconnect from database in SQLiteStudio
  9. Start application again
  10. Type password to unlock (same as step 4 above)
  11. all queries from the application result in error: SqliteError: database disk image is malformed

Alternate steps to break (if the top has a fix, this is not relevant)

  1. Modify the config and remove the journal_mode = WAL
  2. Delete the database file, so it creates a new one
  3. Start the electron application
  4. Set the password, it fails here with error: SqliteError: database disk image is malformed`

Here it does not fail instantly, though. It actually does half the queries and then starts to fail.

BEGIN
[1] 12:08:43.585 › create table `peoplePossibleMatches` (`personMatchId` text not null, `peopleId` text not null, `confidence` integer not null);
[1] 12:08:43.585 › create unique index `peoplepossiblematches_personmatchid_peopleid_unique` on `peoplePossibleMatches` (`personMatchId`, `peopleId`)
[1] 12:08:43.586 › COMMIT
[1] 12:08:43.587 › select `name` from `sqlite_master` where `type` = 'table'
[1] 12:08:43.587 › select * from `migrations`
[1] 12:08:43.588 › [Database Knex Query] error: SqliteError: database disk image is malformed - Query String: 'select * from `migrations`'

Questions:

  1. How can I fix the above error, if it happens?
  2. Is there a way to not have SQLiteStudio break the database?
@utelle
Copy link
Owner

utelle commented Oct 2, 2023

This may be an issue listed, i searched and couldn't find anything exact. Also posted in the sqlite library I use that pulls this package - m4heshd/better-sqlite3-multiple-ciphers#62

Well, with several different components involved it is often difficult to track down where exactly the problem lies and where it needs to be resolved.

I seem to be having a problem when connecting to the database via SQLiteStudio, it breaks something. I assume it's my WAL archive.

Using WAL mode can be a bit tricky sometimes. First question: which version of SQLiteStudio are you using? There were WAL related problems in early SQLite3MultipleCiphers versions. Unfortunately, SQLiteStudio updated its SQLite3MultipleCiphers component only in version 3.4.0 (released in November 2022). Prior versions of SQLiteStudio used a SQLite3MultipleCiphers version with broken WAL support. That is, in case you still use an old SQLiteStudio version upgrading to the latest version maybe solves the problem.

Config when opening the db:

PRAGMA cipher='chacha20'
PRAGMA journal_mode = WAL
PRAGMA temp_store = memory
PRAGMA mmap_size = 30000000000
PRAGMA page_size = 32768
PRAGMA vacuum

Hm, there exists no PRAGMA vacuum inSQLite. Did you intend to set auto_vacuum mode?

Steps to break it:

1. Start the electron application

2. Set the password for the database

3. Refresh the electron application

4. Type the password in to unlock it, all works.

5. **close the electron application**

6. Connect to the database via SQLiteStudio

7. Make a few queries on SQLiteStudio

8. Disconnect from database in SQLiteStudio

9. Start application again

10. Type password to unlock (same as step 4 above)

11. all queries from the application result in `error: SqliteError: database disk image is malformed`

What happens, if you reopen the database repeatedly from your application (without using any other application to access the database)? Does that work?

In one of your posts in m4heshd/better-sqlite3-multiple-ciphers#62 you say that the .db-shm and .db-wal still exist after you closed your application. Usually those files are deleted as soon as the last connection to the database is closed. If you are sure that all applications have closed their database connections, but the WAL files still exist, maybe one of the closing operations failed for some reason. Details about WAL files can be found here.

Alternate steps to break (if the top has a fix, this is not relevant)

1. Modify the config and **remove** the `journal_mode = WAL`

2. Delete the database file, so it creates a new one

3. Start the electron application

4. Set the password, it fails here with error: SqliteError: database disk image is malformed`

This is a bit strange. Usually this only happens, if a database file was created without encryption and you then set a password for an already existing unencrypted database. Of course, this will fail, because no password is required to access an unencrypted database file. To encrypt an unencrypted database file you have to use PRAGMA rekey.

Here it does not fail instantly, though. It actually does half the queries and then starts to fail.

This is really weird. AFAIK this can only happen if different database connections interfer with each other.

BEGIN
[1] 12:08:43.585 › create table `peoplePossibleMatches` (`personMatchId` text not null, `peopleId` text not null, `confidence` integer not null);
[1] 12:08:43.585 › create unique index `peoplepossiblematches_personmatchid_peopleid_unique` on `peoplePossibleMatches` (`personMatchId`, `peopleId`)
[1] 12:08:43.586 › COMMIT
[1] 12:08:43.587 › select `name` from `sqlite_master` where `type` = 'table'
[1] 12:08:43.587 › select * from `migrations`
[1] 12:08:43.588 › [Database Knex Query] error: SqliteError: database disk image is malformed - Query String: 'select * from `migrations`'

Questions:

1. How can I fix the above error, if it happens?

2. Is there a way to not have SQLiteStudio break the database?

At the moment I have not enough information to answer these questions.

@coltoneshaw
Copy link
Author

Thanks for your reply! I'm not sure what layer the problem exists at, or if it's just me being silly. However, I very much so appreciate your help! I've spent way too much time trying to solve this before opening github issues.

I'm on v3.4.4 of SQLiteStudio.

Hm, there exists no PRAGMA vacuum inSQLite. Did you intend to set auto_vacuum mode?

I guess I need to validate things more. I was going based on https://gist.github.com/phiresky/978d8e204f77feaa0ab5cca08d2d5b27 which suggests pragma vacuum to actually run a vacuum.

I modified the statements to have pragma auto_vacuum = FULL and to run a vacuum on start.

So, I believe I've solved that?

In one of your posts in m4heshd/better-sqlite3-multiple-ciphers#62 you say that the .db-shm and .db-wal still exist after you closed your application. Usually those files are deleted as soon as the last connection to the database is closed. If you are sure that all applications have closed their database connections, but the WAL files still exist, maybe one of the closing operations failed for some reason. Details about WAL files can be found here.

I managed to fix the persistence of .db-shm and .db-wal to correctly remove these files. I believe this may have been an alternate issue within the application that may have been a contributing factor here in masking the issue originally. If you don't remove those files ever, everything just works.

What's interesting is I may have uncovered some additional things that are breaking this, that didn't break it when I was using the non-crypto lib of SQLite. It could be all user error.

When starting the application, I run page_size = 32768, mmap_size = 300000000, and temp_store = memory every time I start a connection. Specifically, in the order I sent above. If you reorder these and have page_size and mmap_size come first BEFORE you set the journal_mode it seems to work, but not consistently.

So, based on the above, I made a change to how the application opens. It only runs the commands if it's a new database and has not had journal_mode = WAL set yet. This seems to have fixed the problem, but it's interesting because it did not happen in the original SQLite3 build I was using.

After a bit more testing, it seems that with the original SQLite3 build they were just ignored because they were set after the journal_mode, but now it's causing some kind of inconsistency. I reordered the way they're set, because Google seems to say you cannot set these AFTER the journal_mode has been set to wal.

mmap_size is ignored, page_size is set properly this way and persists on reconnecting to the database.

    const pragma = this.db.pragma('journal_mode');
    if (pragma !== 'wal') {
      this.db.pragma('mmap_size = 3000000');
      this.db.pragma('page_size = 32768');
      this.db.pragma('temp_store = memory');
      this.db.pragma('journal_mode = WAL');
    }

This is a bit strange. Usually this only happens, if a database file was created without encryption and you then set a password for an already existing unencrypted database. Of course, this will fail, because no password is required to access an unencrypted database file. To encrypt an unencrypted database file you have to use PRAGMA rekey.

Good note on this, I actually have to build a migration to move existing databases to this encrypted format.

@coltoneshaw
Copy link
Author

I THINK I'VE FOUND IT.

A bit more testing, the mmap_size seems to exclusively be the issue. But it's ONLY an issue when paired with encryption.

Breaks

Generate a new primary.db file. I let the application do this, just to keep everything the same. I then close the application and manually set the below commands, NO WAL involved here to remove that variable.

PRAGMA cipher='chacha20';
PRAGMA key ='test';
PRAGMA mmap_size = 30000000000;
PRAGMA page_size = 32768;
PRAGMA temp_store = memory;
PRAGMA auto_vacuum = FULL;

Then I run the below query, and it crashes SQLiteStudio and prevents the db from ever being read again.

select `name` from `sqlite_master` where `type` = 'table'

When you close / reopen the file it returns:

[14:25:58] Could not add database ./primary.db: unsupported file format

Works

Same as above, but removing the mmap_size

PRAGMA cipher='chacha20';
PRAGMA key ='test';
PRAGMA page_size = 32768;
PRAGMA temp_store = memory;
PRAGMA auto_vacuum = FULL;

So, I guess my question really boils down to am i using mmap_size wrong or is it not supported?

@utelle
Copy link
Owner

utelle commented Oct 2, 2023

Thanks for your reply! I'm not sure what layer the problem exists at, or if it's just me being silly.

I'm always glad to get feedback and/or bug reports - it's quite often a chance to enhance the software or to fix subtle bugs. Yes, sometimes it's simply a misuse of a feature, but sometimes we find real bugs.

I'm on v3.4.4 of SQLiteStudio.

Good. So this should not be the cause of trouble.

Hm, there exists no PRAGMA vacuum inSQLite. Did you intend to set auto_vacuum mode?

I guess I need to validate things more. I was going based on https://gist.github.com/phiresky/978d8e204f77feaa0ab5cca08d2d5b27 which suggests pragma vacuum to actually run a vacuum.

Maybe a typo. Just VACUUM would be correct.

I modified the statements to have pragma auto_vacuum = FULL and to run a vacuum on start.

Personally, I would leave the auto commit mode at its default. If you have lots of data and massive changes (really many insertions and deletions) then it is better to explicitly run VACUUM from time to time.

I managed to fix the persistence of .db-shm and .db-wal to correctly remove these files. I believe this may have been an alternate issue within the application that may have been a contributing factor here in masking the issue originally. If you don't remove those files ever, everything just works.

There are occasions when WAL files remain on disk - for example, if an application crashes without closing. Removing those WAL files is calling for trouble. You may end up with corrupted database files or lost data.

What's interesting is I may have uncovered some additional things that are breaking this, that didn't break it when I was using the non-crypto lib of SQLite. It could be all user error.

Well, there may exist SQLite features that can't be combined with encryption or need special handling. There are so many SQLite options that it is hardly possible to test all possible combinations.

When starting the application, I run page_size = 32768, mmap_size = 300000000, and temp_store = memory every time I start a connection. Specifically, in the order I sent above. If you reorder these and have page_size and mmap_size come first BEFORE you set the journal_mode it seems to work, but not consistently.

I have to admit that I have never tested PRAGMA mmap_size in conjunction with encryption. It may well be that this combination is simply not possible. I will check the code whether enabling memory mapping could cause problems for encrypted databases.

So, based on the above, I made a change to how the application opens. It only runs the commands if it's a new database and has not had journal_mode = WAL set yet. This seems to have fixed the problem, but it's interesting because it did not happen in the original SQLite3 build I was using.

WAL mode is persistent. That is, if it was set once, it will automatically set on opening the database file again. Which journal mode is active, can be queried with PRAGMA journal_mode;.

After a bit more testing, it seems that with the original SQLite3 build they were just ignored because they were set after the journal_mode, but now it's causing some kind of inconsistency. I reordered the way they're set, because Google seems to say you cannot set these AFTER the journal_mode has been set to wal.

That's correct. For example, the page size can't be changed when WAL mode is enabled. That is, the order of pragmas matters.

@utelle
Copy link
Owner

utelle commented Oct 2, 2023

I THINK I'VE FOUND IT.

A bit more testing, the mmap_size seems to exclusively be the issue. But it's ONLY an issue when paired with encryption.

As stated in my previous post it may be that it is not possible to combine encryption with memory mapping. I will have to check the code and will report my findings here. However, this may take some time.

Breaks

Generate a new primary.db file. I let the application do this, just to keep everything the same. I then close the application and manually set the below commands, NO WAL involved here to remove that variable.

PRAGMA cipher='chacha20';
PRAGMA key ='test';
PRAGMA mmap_size = 30000000000;
PRAGMA page_size = 32768;
PRAGMA temp_store = memory;
PRAGMA auto_vacuum = FULL;

Then I run the below query, and it crashes SQLiteStudio and prevents the db from ever being read again.

select `name` from `sqlite_master` where `type` = 'table'

When you close / reopen the file it returns:

[14:25:58] Could not add database ./primary.db: unsupported file format

Works

Same as above, but removing the mmap_size

PRAGMA cipher='chacha20';
PRAGMA key ='test';
PRAGMA page_size = 32768;
PRAGMA temp_store = memory;
PRAGMA auto_vacuum = FULL;

So, I guess my question really boils down to am i using mmap_size wrong or is it not supported?

Yes, that's indeed the question. Looking at your experiments chances are high that PRAGMA mmap_size is actually not supported if encryption is enabled. Whether it will be possible to support it together with encryption, I don't know yet.

@coltoneshaw
Copy link
Author

As stated in my previous post it may be that it is not possible to combine encryption with memory mapping. I will have to check the code and will report my findings here. However, this may take some time.

No rush, if it's not supported it's not supported! I just didn't know that, and it was a part of my config and really throwing me off here. I spent way too much time figuring that out!

Personally, I would leave the auto commit mode at its default. If you have lots of data and massive changes (really many insertions and deletions) then it is better to explicitly run VACUUM from time to time.

Any specific reason why? I know it's not related to this issue, but i'm curious now.

@utelle
Copy link
Owner

utelle commented Oct 3, 2023

As stated in my previous post it may be that it is not possible to combine encryption with memory mapping. I will have to check the code and will report my findings here. However, this may take some time.

No rush, if it's not supported it's not supported! I just didn't know that, and it was a part of my config and really throwing me off here. I spent way too much time figuring that out!

I made a quick check of the old SQLite code (when the encryption API was still included in it). And my suspicion was confirmed that memory mapping can't be combined with encryption.

I will try to add a check in the current implementation, so that memory mapping can be used for unencrypted databases, but not for encrypted ones.

Personally, I would leave the auto commit mode at its default. If you have lots of data and massive changes (really many insertions and deletions) then it is better to explicitly run VACUUM from time to time.

Any specific reason why? I know it's not related to this issue, but i'm curious now.

Well, performing a VACUUM operation is an expensive operation - in time and space. The gain is often only small, because SQLite does a fairly good job in managing the database pages. And on SSD storage (that is very common nowadays) fragmentation plays only a minor role. Therefore enabling auto vacuum brings more costs than gain. It is even possible that fragmentation increases.

For databases with a high frequency of inserts and deletes it can still make sense to manually perform a VACUUM from time to time to compact the database file.

@coltoneshaw
Copy link
Author

I will try to add a check in the current implementation, so that memory mapping can be used for unencrypted databases, but not for encrypted ones.

Thanks!

For databases with a high frequency of inserts and deletes it can still make sense to manually perform a VACUUM from time to time to compact the database file.

This makes sense, I have a large amount of changes that happen while the db is open. When closing I run optimize and on opening I run vacuum (correctly now). So, it sounds like I've solved what I need on that!


I appreciate your help on pinning this issue! Feel free to leave this open to track adding mmap or close it out if that's your process.

@utelle
Copy link
Owner

utelle commented Oct 3, 2023

I appreciate your help on pinning this issue! Feel free to leave this open to track adding mmap or close it out if that's your process.

I leave this issue open, until it is resolved.

utelle added a commit that referenced this issue Oct 3, 2023
- Fixed issue #118 - tvOS/watchOS compilation errors. On Apple platforms the function SecRandomCopyBytes() will now be used instead of getentropy().
- Fixed issue #119 - "PRAGMA mmap_size" conflicts with encrypted databases, a check has been added to allow this pragma for unencrypted databases.
- Added "PRAGMA memory_security" to allow to clear memory before it is freed. This feature can have a considerable impact on performance and is therefore disabled by default.
@utelle
Copy link
Owner

utelle commented Oct 3, 2023

I released version 1.7.0. If PRAGMA mmap_size is used, this version will check whether the database is encrypted or not. If it is encrypted, memory mapping will not be used. However, unencrypted databases can still gain from memory mapping.

So, I'm closing this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants