Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Encrypting temporary files #148

Closed
ncruces opened this issue Apr 19, 2024 · 25 comments
Closed

Encrypting temporary files #148

ncruces opened this issue Apr 19, 2024 · 25 comments

Comments

@ncruces
Copy link

ncruces commented Apr 19, 2024

Hi. I'm trying to develop my own encrypting VFS (n the context of developing a Go driver). I was looking into your code for inspiration, and have a few comments/concerns.

Have you tested this branch?

else if (flags & SQLITE_OPEN_SUBJOURNAL)
{
const char* dbFileName = sqlite3_filename_database(zName);
mcFile->pMainDb = mcFindDbMainFileName(mcFile->pVfsMC, dbFileName);
mcFile->zFileName = zName;
SQLITE3MC_DEBUG_LOG("mcVfsOpen SUB Journal: mcFile=%p fileName=%s dbFileName=%s\n", mcFile, mcFile->zFileName, dbFileName);
}

I… don't think you can do this: sqlite3_filename_database is only safe to call with a main database, journal, or WAL name? None of the other kinds of files are safe to call this, and can corrupt memory…

@ncruces
Copy link
Author

ncruces commented Apr 19, 2024

That said, encrypting these files is important.

SQLITE_OPEN_SUBJOURNAL is used for SAVEPOINTs, and statement journals.

SQLITE_OPEN_TRANSIENT_DB are (I think) used for materialization of queries, transient indices and transient databases for VACUUM.

SQLITE_OPEN_TEMP_JOURNAL are used for journals of TEMP databases, but are also used by the VDBE sorter (in a format called PMA that's nothing like a journal).

So all of these potentially store database data, though not necessarily in the form of pages (and with reserved bytes at the end).

@utelle
Copy link
Owner

utelle commented Apr 20, 2024

Have you tested this branch?

else if (flags & SQLITE_OPEN_SUBJOURNAL)
{
const char* dbFileName = sqlite3_filename_database(zName);
mcFile->pMainDb = mcFindDbMainFileName(mcFile->pVfsMC, dbFileName);
mcFile->zFileName = zName;
SQLITE3MC_DEBUG_LOG("mcVfsOpen SUB Journal: mcFile=%p fileName=%s dbFileName=%s\n", mcFile, mcFile->zFileName, dbFileName);
}

I… don't think you can do this: sqlite3_filename_database is only safe to call with a main database, journal, or WAL name? None of the other kinds of files are safe to call this, and can corrupt memory…

Up to now I haven't experienced any memory corruption, and I haven't received any bug reports either. However, it can indeed happen that zName is a NULL pointer. I have to further investigate how to handle that situation.

That said, encrypting these files is important.

SQLITE_OPEN_SUBJOURNAL is used for SNAPSHOTs, and statement journals.

SQLITE_OPEN_TRANSIENT_DB are (I think) used for materialization of queries, transient indices and transient databases for VACUUM.

SQLITE_OPEN_TEMP_JOURNAL are used for journals of TEMP databases, but are also used by the VBDE sorter (in a format called PMA that's nothing like a journal).

So all of these potentially store database data, though not necessarily in the form of pages (and with reserved bytes at the end).

Currently SQLite3 Multiple Ciphers has the same limitations as the official SQLite Encryption Extension: temporary databases and in-memory databases are not encrypted.

I'll look into the issue and check if and when there are relevant effects.

@ncruces
Copy link
Author

ncruces commented Apr 20, 2024

You should look at the implementation of these functions:
https://github.com/sqlite/sqlite/blob/0526db32e2657b2b16fa3333e5b8c6e24754eaf2/src/main.c#L4845

They literally just walk back the entire memory until they find 4 consecutive 0 bytes:
https://github.com/sqlite/sqlite/blob/0526db32e2657b2b16fa3333e5b8c6e24754eaf2/src/main.c#L4710

The only reason this works for database, journal, and WAL file names is that the pager allocates memory this way for compatibility purposes. But it's a huge hack.
https://github.com/sqlite/sqlite/blob/0526db32e2657b2b16fa3333e5b8c6e24754eaf2/src/pager.c#L4797

@utelle
Copy link
Owner

utelle commented Apr 20, 2024

You should look at the implementation of these functions. [...]

I'm aware of these functions. And I verified that the function sqlite3_filename_database is always useable for the filename passed to the VFS Open method. However, there is the case of a NULL pointer passed in as the filename. Function sqlite3_filename_database returns NULL in that case. As a consequence it is impossible to locate the codec structure in that case, and any data will be written to the file unencrypted. This may impose a problem.

The only reason this works for database, journal, and WAL file names is that the pager allocates memory this way for compatibility purposes. But it's a huge hack. https://github.com/sqlite/sqlite/blob/0526db32e2657b2b16fa3333e5b8c6e24754eaf2/src/pager.c#L4797

AFAICT this shouldn't impose a problem, because my code doesn't rely on this specific memory layout. Instead the function sqlite3_filename_database is used.

As said only the case that the filename is a NULL pointer may be a problem. I will check whether we can do anything about this use case.

@ncruces
Copy link
Author

ncruces commented Apr 20, 2024

So the reason this might be safe is that SQLITE_OPEN_SUBJOURNAL is always called with a NULL database filename:
https://github.com/sqlite/sqlite/blob/0526db32e2657b2b16fa3333e5b8c6e24754eaf2/src/pager.c#L4496

Then sqlite3_filename_database always returns NULL:
https://github.com/sqlite/sqlite/blob/0526db32e2657b2b16fa3333e5b8c6e24754eaf2/src/main.c#L4845

And you always get the key for the main database of the connection:

static sqlite3mc_file* mcFindDbMainFileName(sqlite3mc_vfs* mcVfs, const char* zFileName)
{
sqlite3mc_file* pDb;
sqlite3_mutex_enter(mcVfs->mutex);
for (pDb = mcVfs->pMain; pDb && pDb->zFileName != zFileName; pDb = pDb->pMainNext){}
sqlite3_mutex_leave(mcVfs->mutex);
return pDb;
}

I don't know if that's the intended behaviour. But all it takes for a crash/corruption is for SQLITE_OPEN_SUBJOURNAL to be used with a named database.

Edit: you're assuming SQLite won't ever do this. That's a fair assumption. A VFS in a shim chain might, but it'd be violating some reasonable assumptions, so that's OK.

Currently SQLite3 Multiple Ciphers has the same limitations as the official SQLite Encryption Extension: temporary databases and in-memory databases are not encrypted.

Without looking at their source code, it's hard to know what this really means. But TEMP databases (and tables) use SQLITE_OPEN_TEMP_DB, which I didn't mention. SQLITE_OPEN_TRANSIENT_DB is used for “main” database data, that I have little doubt about. What exactly SQLITE_OPEN_TEMP_JOURNAL stores when used by the VDBE sorter I'm not sure; it's definitely not a journal, though.

My current thinking is to encrypt temporary files (no file name, and delete-on-close) with a random key. But I use whole file encryption, and don't depend on the file format (database, journal, wal) to identify page boundaries (and page reserved space).

Feel free to close though. I hope this was useful, and not a waste of your time!

@utelle
Copy link
Owner

utelle commented Apr 20, 2024

So the reason this might be safe is that SQLITE_OPEN_SUBJOURNAL is always called with a NULL database filename: https://github.com/sqlite/sqlite/blob/0526db32e2657b2b16fa3333e5b8c6e24754eaf2/src/pager.c#L4496

Right. But this means that page data will be written to disk unencrypted.

This should be avoided, of course. Actually, the intention is to use the encryption key of the main database for all related journal files, but without the filename of the main database file, it is not possible to determine the encryption key of the main database.

Then sqlite3_filename_database always returns NULL: https://github.com/sqlite/sqlite/blob/0526db32e2657b2b16fa3333e5b8c6e24754eaf2/src/main.c#L4845

And you always get the key for the main database of the connection:

No, unfortunately not. A NULL pointer will be returned.

I don't know if that's the intended behaviour. But all it takes for a crash/corruption is for SQLITE_OPEN_SUBJOURNAL to be used with a named database.

The SQLite core uses only proper filenames or NULL.

Edit: you're assuming SQLite won't ever do this. That's a fair assumption. A VFS in a shim chain might, but it'd be violating some reasonable assumptions, so that's OK.

Right. Function sqlite3_filename_database is also only guaranteed to work, if the filename was provided by the pager (or the SQLite core).

So, if a developer violates the rules, he/she will be to blame for any misbehaviour.

Currently SQLite3 Multiple Ciphers has the same limitations as the official SQLite Encryption Extension: temporary databases and in-memory databases are not encrypted.

Without looking at their source code, it's hard to know what this really means.

Of course, this is correct. Without access to the SEE source code it is hard to tell how encryption is actually handled. I suspect that it is still done in that way that it is not done via a VFS. The VFS API lacks an official way to determine the SQLite connection to which the file belongs.

But TEMP databases (and tables) use SQLITE_OPEN_TEMP_DB, which I didn't mention. SQLITE_OPEN_TRANSIENT_DB is used for “main” database data, that I have little doubt about. What exactly SQLITE_OPEN_TEMP_JOURNAL stores when used by the VDBE sorter I'm not sure; it's definitely not a journal, though.

If no page data are stored then we would be on the safe side.

My current thinking is to encrypt temporary files (no file name, and delete-on-close) with a random key.

This is certainly an option, if you store the key in the SQLite file descriptor.

But I use whole file encryption, and don't depend on the file format (database, journal, wal) to identify page boundaries (and page reserved space).

In the beginning I had thought about such an approach, too, but decided against it.

Feel free to close though. I hope this was useful, and not a waste of your time!

For now, I will keep the issue open, because the handling of NULL filenames is certainly suboptimal - to say the least.

Thanks for sharing your thoughts.

@ncruces
Copy link
Author

ncruces commented Apr 21, 2024

If no page data are stored then we would be on the safe side.

It's not page data, but it's at a minimum:

  1. data from the columns being sorted, and
  2. either data for all output columns, or the primary key for each row.

This is certainly an option, if you store the key in the SQLite file descriptor.

Encrypting temporary files with a random key is what I settled on. But I benefit from not trying to be the default VFS: people are explicitly saying they want encryption for this connection. I also refuse to open the connection if there's no encryption key.

I guess if you're trying to be the default VFS, and both unencrypted and encrypted databases, the best thing you can do is advise users to set PRAGMA temp_store=memory. Which is also what SQLCipher advises.

Well, thanks for listening! And if you're curious, this is my attempt:
https://github.com/ncruces/go-sqlite3/tree/main/vfs/adiantum

@utelle
Copy link
Owner

utelle commented Apr 21, 2024

If no page data are stored then we would be on the safe side.

It's not page data, but it's at a minimum:

  1. data from the columns being sorted, and
  2. either data for all output columns, or the primary key for each row.

Well, as soon as database content data are involved, it would certainly make sense to encrypt such data if they are written to disk. Therefore it is important to hold temporary data in memory using compile time option SQLITE_TEMP_STORE or PRAGMA temp_store=memory.

The build files coming with SQLite3 Multiple Ciphers use the above compile time option. However, the behaviour can be changed at the user's discretion with the PRAGMA command.

Looking at the following quote from the SEE documentation ...

The data is encrypted in both the main database and in the rollback journal or WAL file but is unencrypted when held in memory.

... I get the impression that the official SQLite encryption extension does not encrypt temporary data.

BTW, do you happen to have SQL command samples at hand to trigger the various temporary data types in the VFS Open method? I would like to further investigate which temporary data are potentially written to disk.

This is certainly an option, if you store the key in the SQLite file descriptor.

Encrypting temporary files with a random key is what I settled on. But I benefit from not trying to be the default VFS: people are explicitly saying they want encryption for this connection. I also refuse to open the connection if there's no encryption key.

Yes, SQLite3 Multiple Ciphers makes its VFS the default, but that can be changed.

I guess if you're trying to be the default VFS, and both unencrypted and encrypted databases,

Actually, the SQLite3 Multiple Ciphers VFS is a VFS shim which can be combined with any real VFS. So, the user can decide which VFS to use and whether encryption should be enabled or not.

the best thing you can do is advise users to set PRAGMA temp_store=memory. Which is also what SQLCipher advises.

Yes, temporary data should be kept in memory. However, this is not always possible in case of memory constraints.

Therefore it could be useful to allow encrypting temporary data, too.

Well, thanks for listening! And if you're curious, this is my attempt: https://github.com/ncruces/go-sqlite3/tree/main/vfs/adiantum

I'll certainly take a look. Thanks for the pointer.

@ncruces
Copy link
Author

ncruces commented Apr 21, 2024

BTW, do you happen to have SQL command samples at hand to trigger the various temporary data types in the VFS Open method? I would like to further investigate which temporary data are potentially written to disk.

I can trigger most of these by running mptest and speedtest1 (which are part of the SQLite test suite), but I'll try to isolate the behaviours in question.

@utelle
Copy link
Owner

utelle commented Apr 21, 2024

I can trigger most of these by running mptest and speedtest1 (which are part of the SQLite test suite),

I have used speedtest1 in the past for some experiments, but not for an in-depth analysis. I will take another look at those tests.

but I'll try to isolate the behaviours in question.

TIA.

@ncruces
Copy link
Author

ncruces commented Apr 29, 2024

The example is in Go, but the relevant part is SQL, so hopefully it helps.

func Test_files(t *testing.T) {
	file := filepath.Join(t.TempDir(), "test.db")

	db, err := Open(file)
	if err != nil {
		t.Fatal(err)
	}
	defer db.Close()

	err = db.Exec(`
		-- Setup
		PRAGMA cache_size=10;
		PRAGMA cache_spill=0;
		PRAGMA temp_store=file;

		BEGIN;

		-- SQLITE_OPEN_MAIN_DB
		-- SQLITE_OPEN_MAIN_JOURNAL

		CREATE TABLE t1 (a, b);
		INSERT INTO t1 VALUES (0, zeroblob(1048576));
		INSERT INTO t1 VALUES (1, zeroblob(1048576));

		-- SQLITE_OPEN_TEMP_DB
		-- SQLITE_OPEN_TEMP_JOURNAL

		CREATE TABLE temp.t2 (c, d);
		INSERT INTO temp.t2 VALUES (0, zeroblob(1048576));
		INSERT INTO temp.t2 VALUES (1, zeroblob(1048576));

		COMMIT;
	`)
	if err != nil {
		t.Fatal(err)
	}

	err = db.Exec(`BEGIN`)
	if err != nil {
		t.Fatal(err)
	}

	for i := 0; i < 1000; i++ {
		err = db.Exec(`
			-- SQLITE_OPEN_SUBJOURNAL
			SAVEPOINT savept;

			INSERT INTO t1 VALUES (random(), zeroblob(1024));
			INSERT INTO temp.t2 VALUES (random(), zeroblob(1024));
		`)
		if err != nil {
			t.Fatal(err)
		}
	}
	err = db.Exec(`COMMIT`)
	if err != nil {
		t.Fatal(err)
	}

	err = db.Exec(`
		-- SQLITE_OPEN_TRANSIENT_DB
		SELECT * FROM t1, t2 WHERE a=c;
	`)
	if err != nil {
		t.Fatal(err)
	}

	err = db.Exec(`
		-- SQLITE_OPEN_TEMP_JOURNAL
		SELECT * FROM t1 ORDER BY a;
	`)
	if err != nil {
		t.Fatal(err)
	}
}

Except for SQLITE_OPEN_MAIN_DB and SQLITE_OPEN_MAIN_JOURNAL (and SQLITE_OPEN_MAIN_WAL/OPEN_SUPER_JOURNAL, which I didn't test here), all other files: have null name, and flags are create, exclusive, delete-on-close. So, basically, they're temporary files.

From reading the magic numbers that are written into offset zero of the files:

  • SQLITE_OPEN_TEMP_DB is a database, like SQLITE_OPEN_MAIN_DB
  • SQLITE_OPEN_TEMP_JOURNAL is a rollback journal, like SQLITE_OPEN_MAIN_JOURNAL when used as such
  • SQLITE_OPEN_TRANSIENT_DB seems to be something else, but may be a database without a header
  • SQLITE_OPEN_SUBJOURNAL seems to be something else, but may be a journal without a header
  • SQLITE_OPEN_TEMP_JOURNAL is something else when used by the VDBE sorter

I've also confirmed that all of SQLITE_OPEN_TRANSIENT_DB, SQLITE_OPEN_SUBJOURNAL and SQLITE_OPEN_TEMP_JOURNAL can contain database data (by writing a specific string into a table and testing if it shows up in an xWrite call). SQLITE_OPEN_TEMP_DB and SQLITE_OPEN_TEMP_JOURNAL will also have whatever you write to the temp database.

@utelle
Copy link
Owner

utelle commented Apr 29, 2024

The example is in Go, but the relevant part is SQL, so hopefully it helps.

Thanks for providing the example code.

Except for SQLITE_OPEN_MAIN_DB and SQLITE_OPEN_MAIN_JOURNAL (and SQLITE_OPEN_MAIN_WAL/OPEN_SUPER_JOURNAL, which I didn't test here), all other files: have null name, and flags are create, exclusive, delete-on-close. So, basically, they're temporary files.

The main database and journals are properly encrypted with SQLite3 Multiple Ciphers.

All files with a null filename will not be encrypted, because it is impossible to identify the codec linked to the main database file. Unfortunately, a VFS has no means to identify the database connection of the file, thus making it impossible to locate the codec of the database connection.

From reading the magic numbers that are written into offset zero of the files:

  • SQLITE_OPEN_TEMP_DB is a database, like SQLITE_OPEN_MAIN_DB
  • SQLITE_OPEN_TEMP_JOURNAL is a rollback journal, like SQLITE_OPEN_MAIN_JOURNAL when used as such

In accordance with the official SQLite Encryption Extension (SEE) temporary databases (and temporary journals related to them) will not be encrypted. Therefore it is strongly recommended to store temporary data in memory.

  • SQLITE_OPEN_TRANSIENT_DB seems to be something else, but may be a database without a header

If PRAGMA temp_store=memory is used, I assume that the data are all stored in memory. However, I have to analyze under which circumstances a transient database is written to file.

  • SQLITE_OPEN_SUBJOURNAL seems to be something else, but may be a journal without a header

In principal, subjournals will be encrypted. But only, if the codec could be located. This is a use case I have to further investigate.

  • SQLITE_OPEN_TEMP_JOURNAL is something else when used by the VDBE sorter

I assume the this journal type will reside in memory. But I have to check in detail.

I've also confirmed that all of SQLITE_OPEN_TRANSIENT_DB, SQLITE_OPEN_SUBJOURNAL and SQLITE_OPEN_TEMP_JOURNAL can contain database data (by writing a specific string into a table and testing if it shows up in an xWrite call).

Yes, if temporary data are written to file this would be a problem with SQLite3 Multiple Ciphers, because no encryption occurs.

SQLITE_OPEN_TEMP_DB and SQLITE_OPEN_TEMP_JOURNAL will also have whatever you write to the temp database.

That is clear. However, SQLite3 Multiple Ciphers never claimed to encrypt temporary database objects.

IMHO the behaviour of SQLite3 Multiple Ciphers is ok, as long as temporary data are kept in memory.

If an attacker has access to the memory of a system, it actually doesn't matter whether temporary data are encrypted or not, because those data will be accessible in SQLite3's cache anyway.

@ncruces
Copy link
Author

ncruces commented Apr 29, 2024

All these files have null filenames (except main database, journal, and WAL, as well as super journals which have no data).

So it is really important that they are kept in memory. And they will, by default, with SQLITE_TEMP_STORE. So SQLite3 Multiple Ciphers should be OK, I agree.

At most, I think you should tweak the website because I don't think the claim about subjournals holds. They're not encrypted… but they should also stay in memory with the default settings.

PS: to be clear: with PRAGMA temp_store=memory none of these temporary files are created so, since that's the default for this build, you're safe, IMO; it's just important that this is made clear.

@utelle
Copy link
Owner

utelle commented May 2, 2024

At most, I think you should tweak the website because I don't think the claim about subjournals holds. They're not encrypted… but they should also stay in memory with the default settings.

Yes, subjournals impose a problem due to the null filename, unless they are kept in memory.

PS: to be clear: with PRAGMA temp_store=memory none of these temporary files are created so, since that's the default for this build, you're safe, IMO; it's just important that this is made clear.

I will try to improve the documentation in this respect.

@utelle
Copy link
Owner

utelle commented May 2, 2024

I updated the documentation mentioning the importance of using SQLITE_TEMP_STORE=2 (or SQLITE_TEMP_STORE=3) and PRAGMA temp_store=MEMORY;.

The previous encryption implementation encrypted subjournal files if they were written to disk. Unfortunately, the current VFS based encryption implementation is not able to support this, because subjournal files are opened with a null filename. However, this is not a real problem as long as temporary data are always kept in memory (as is the current default).

Your approach of using whole file encryption allows to encrypt temporary files, too. However, this approach has its drawbacks:

  1. File access is most likely inefficient quite often, because you may have to encrypt/decrypt multiple 4k blocks if a read/write request crosses a block boundary (as will be the case almost always for journal files due to their structure).
  2. The level of security is most likely lower, because nonces can't be used easily.

@ncruces
Copy link
Author

ncruces commented May 2, 2024

I agree with you on both points.

Your point about performance will most likely affect the WAL, as it's read more often. Unaligned writes also need read-update-write, though these can mostly be optimized (at the cost of complexity).

Nounces (and MACs) can be introduced (to the database file) in a backwards compatible way, following the footsteps of the checksum VFS.

Still, I don't think HBSH scheme benefits as much from these as other encryption schemes.

Anyway this wasn't meant as a ding on your encryption package. I learned a lot from it (thanks!) and wanted to contribute to make it better.

@utelle
Copy link
Owner

utelle commented May 3, 2024

Your point about performance will most likely affect the WAL, as it's read more often. Unaligned writes also need read-update-write, though these can mostly be optimized (at the cost of complexity).

WAL journal mode would be affected more than standard journal mode, because content data can reside in the journal files for quite a long time.

Another effect of using journal file structure agnostic encryption would be that several of SQLite's tools (like showjournal or showwal) would be useless.

Nounces (and MACs) can be introduced (to the database file) in a backwards compatible way, following the footsteps of the checksum VFS.

I'm not sure whether introducing nonce in that way makes sense, because it actually wouldn't change the encryption process (if I understand it correctly).

One problem you may want to consider in your encryption scheme is that a SQLite database file starts with a fixed 16 byte string (SQLite format 3\0). If you include this header in the encryption process, you give an attacker a lever to break the scheme.

Still, I don't think HBSH scheme benefits as much from these as other encryption schemes.

For classical encryption schemes nonces add considerably to the security. Not sure if and how a HBSH scheme could benefit.

Anyway this wasn't meant as a ding on your encryption package. I learned a lot from it (thanks!) and wanted to contribute to make it better.

I'm always grateful for feedback, especially if it helps to further improve the project.

@ncruces
Copy link
Author

ncruces commented May 4, 2024

I'm not sure whether introducing nonce in that way makes sense, because it actually wouldn't change the encryption process (if I understand it correctly).

Adiantum (or any HBSH) is designed to be "wide-block", where any change in the plaintext causes the entire ciphertext block to change. As long as the page size matches the block size, adding some entropy to the page is enough. Authentication is more complicated, because it's MAC-then-encrypt, which is less than ideal.

One problem you may want to consider in your encryption scheme is that a SQLite database file starts with a fixed 16 byte string (SQLite format 3\0). If you include this header in the encryption process, you give an attacker a lever to break the scheme.

There are no known chosen plaintext attacks on ChaCha or Adiantum. This is important for disk encryption as "all zero" sectors are common enough that it'd be "game over" if you could leverage those to get the key.

That said the algorithm can be easily adapted to not encrypt the first N bytes of database files, and that's something I have kept in mind for extensibility. Adding nounces and MACs requires block size to match page size, and that's best implemented by keeping page size unencrypted in the header.

@utelle
Copy link
Owner

utelle commented May 5, 2024

Adiantum (or any HBSH) is designed to be "wide-block", where any change in the plaintext causes the entire ciphertext block to change. As long as the page size matches the block size, adding some entropy to the page is enough.

Yes, you are right.

Authentication is more complicated, because it's MAC-then-encrypt, which is less than ideal.

Authentication is useful to detect modifications early. However, SQLite itself is usually able to detect corruption. So, authentication is not a critical issue in my opinion.

There are no known chosen plaintext attacks on ChaCha or Adiantum.

That's right, but that might change someday in the future.

That said the algorithm can be easily adapted to not encrypt the first N bytes of database files, and that's something I have kept in mind for extensibility. Adding nounces and MACs requires block size to match page size, and that's best implemented by keeping page size unencrypted in the header.

Some OSes inspect the first header bytes of a SQLite database file to handle it appropriately. So, not encrypting the first (at least) 24 bytes is sometimes essential.

For now, I will close this issue, because after analyzing once again the handling of temporary files in SQLite, I came to the conclusion that it would require several further patches to the original SQLite source code to handle encryption of temporary data in my implementation. And I'd like to avoid further patches. I already have to apply too many patches for my liking.

@utelle utelle closed this as completed May 5, 2024
@utelle
Copy link
Owner

utelle commented May 5, 2024

I'd like to add a further comment, because I had a new idea how encryption of temporary files could be handled.

It is in the nature of temporary files that they exist usually only for a short period of time and certainly not across sessions. So, in fact the passphrase used for encryption has to be known only for the lifetime of a temporary file. That is, the passphrase can actually be chosen at random - without the user even knowing it. That is, random encryption could simply be established on opening a temporary file and terminated on closing/deleting it.

However, I would prefer that encryption for temporary files is only used, if the main database is encrypted. The problem is that a temporary sqlite3_file handle does not know to which main database (or to which SQLite connection) it belongs. At the moment I have no idea how to pass the information, whether encryption should be used or not, to the temporary file handle.

@ncruces
Copy link
Author

ncruces commented May 5, 2024

Exactly. That's what I do, but that's because my VFS is never the default VFS, and I error out if there's no key. So if you're open a connection with my VFS, you opt into encryption of everything. But at least I can confirm that random keys work fine for temporary files, if you're considering them.

Again: I think your solution is perfectly reasonable, given the design constraints. As long as users know they should keep temporary files in memory, and that's the default behavior, all is well.

@utelle
Copy link
Owner

utelle commented May 5, 2024

Exactly. That's what I do, but that's because my VFS is never the default VFS, and I error out if there's no key. So if you're open a connection with my VFS, you opt into encryption of everything.

In this respect your approach is easier/simpler, because encryption is always enabled, if a user chooses to use your VFS.

Allowing to enable or disable encryption, and to select the encryption scheme per database file as in my implementation complicates matters.

But at least I can confirm that random keys work fine for temporary files,

Good to know. Thanks.

if you're considering them.

Yes, I do.

Again: I think your solution is perfectly reasonable, given the design constraints. As long as users know they should keep temporary files in memory, and that's the default behavior, all is well.

Yes, I know. Nevertheless I consider to support encryption for temporary files. That would make this project more flexible - for example, if it is used in an environment with limited memory resources.

However, I'll have to find a way to enable or disable encryption of temporary files per database connection.

@ncruces
Copy link
Author

ncruces commented May 5, 2024

I'm not sure this works with temporary files, but for some files, SQLite calls xFileControl with SQLITE_FCNTL_PDB code and the sqlite3* connection.

From there you can use sqlite3_db_filename(db, NULL) to get the main database filename, and:

  1. sqlite3_uri_parameter to get URI parameters
  2. sqlite3_database_file_object to get the sqlite3_file*.

@utelle
Copy link
Owner

utelle commented May 5, 2024

I'm not sure this works with temporary files, but for some files, SQLite calls xFileControl with SQLITE_FCNTL_PDB code and the sqlite3* connection.

Unfortunately, this call is only done for ordinary, temporary, and transient database files. Not for temporary journal files or subjournal files. That is, the problem is solved at most partially.

@utelle
Copy link
Owner

utelle commented May 7, 2024

I'm not sure this works with temporary files, but for some files, SQLite calls xFileControl with SQLITE_FCNTL_PDB code and the sqlite3* connection.

Unfortunately, this call is only done for ordinary, temporary, and transient database files.

After performing a few further tests I have to correct the above statement. In general, SQLite does not create temporary or transient database files immediately, but will start with data in memory. Only if more than a certain amount of memory will be required to hold the temporary database, actually a disk file will be created, but the code creating the file in that case does not call xFileControl with SQLITE_FCNTL_PDB.

That is, for my implementation it would require most likely 4-6 additional SQLite source code patches to properly handle encryption for temporary files - unless temporary files would be encrypted unconditionally.

I see the following 3 options for my implementation:

  1. Encrypt temporary files unconditionally, but allow to enable or disable this option at compile time (or even at runtime).
  2. Apply the required additional patches to handle encryption of temporary files.
  3. Leave it as is - that is, never encrypt temporary files.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants