-
Notifications
You must be signed in to change notification settings - Fork 11
Compression
JSON documents can be big! When JSON is stored as text (in an nvarchar(max)
column), Nevermore comes up with different strategies for reading it, based on whether the table always has smaller documents (<1K) or not.
However, Nevermore also supports compressing the JSON. You can use this if you use SQL Server 2016 or higher.
Much of the data in JSON documents is repetitive. Property names, types and ID's are repeated over and over.
In Octofront, for example, the LicenseKeys
table keeps a history of the changes made to each license key. Each history entry has a type, date, and some properties. Those strings are repeated potentially hundreds of times. Compressing this data reduces the size of the table at the time of writing from 435 MB to 118 MB.
Good example to try in Octopus would be Variables
, DeploymentProcess
. Bigger documents with collections as properties.
The benefits of compression are:
- Less space used in the database
- Less space used in data transfer
The downside, of course, is CPU time spent compressing and decompressing when reading data.
In testing with (local)\SQLEXPRESS and a 120K document with a 50% compression ratio, these benefits canceled each other out.
GZIP | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
---|---|---|---|---|---|---|---|
False | 1.560 ms | 0.2761 ms | 0.0151 ms | 64.4531 | 29.2969 | - | 357.26 KB |
True | 1.703 ms | 0.3508 ms | 0.0192 ms | 66.4063 | 31.2500 | - | 356.59 KB |
But SQL Server rarely runs on the same machine. In those cases, compression makes a big difference. Here's the same test running when using NetLimiter to limit bandwidth in/out of SQLEXPRESS to 100mbps:
GZIP | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
---|---|---|---|---|---|---|---|
False | 21.242 ms | 1.637 ms | 0.0898 ms | 62.5000 | 31.2500 | - | 357.29 KB |
True | 7.756 ms | 1.099 ms | 0.0602 ms | 62.5000 | 23.4375 | - | 356.6 KB |
100mbps is probably a real-world best case too. In Octofront, it's unlikely we get that when talking from our web apps to the SQL Azure database for example. I suspect the same will be true for Octopus Cloud. Customer networks may be different, but I'd expect it might make a considerable impact for many of them.
To use compression, you need to:
- Change the JSON column type to
varbinary(max)
. (See tips on migrations below). - In your
DocumentMap
, setCompressed
totrue
.
That's it!
When inserting or updating data, Nevermore will use GZIP to write compressed data to the table. When reading, Nevermore will use GZIP to read it back.
When you enable compression, Nevermore specifically uses Unicode encoding and GZIP because these are compatible with the COMPRESS
and DECOMPRESS
functions in SQL Server 2016.
If you are querying in SQL, and want to use JSON_VALUE
for example, you still can. Just call the DECOMPRESS
function in SQL 2016 or newer, and cast it to a string.
select cast(DECOMPRESS([JSON]) as nvarchar(max)) from dbo.BigObjectCompressed
If you enable compression on a table that already has data, you need to migrate the data - you can't have some records uncompressed.
The COMPRESS
function in SQL Server is helpful for the migration. The basic plan would be:
- Create a new table with all the same columns (
VariableSet2
), but make the JSON columnvarbinary(max)
- Do an insert into/select from, calling
COMPRESS
over the[JSON]
column - Drop the old table and any indexes on it, and rename the new table
- Add indexes to the new table
Example of the insert into/select from:
insert into dbo.LicenseKeys2
(
Id,
Col1,
Col2,
....
[JSON]
)
select Id, Col1, Col2, .... COMPRESS([JSON]) from dbo.LicenseKey
We could have chosen other approaches for compression - using BSON instead of JSON, using UTF8 (support coming in SQL 2019) rather than Unicode, and using a better compression algorithm than GZIP. However, those would make it difficult to use any of the data from within SQL directly. The compatibility and migration benefits of this setup seem to be the optimal balance.
Further reading:
Overview
Getting started
Extensibility
Misc