
Compression


JSON documents can be big! When JSON is stored as text (in an nvarchar(max) column), Nevermore uses different strategies for reading it, depending on whether the table's documents are always small (<1K) or not.

However, Nevermore also supports compressing the JSON, which you can take advantage of if you're running SQL Server 2016 or higher.

Why use compression?

Much of the data in JSON documents is repetitive. Property names, types, and IDs are repeated over and over.

In Octofront, for example, the LicenseKeys table keeps a history of the changes made to each license key. Each history entry has a type, date, and some properties, and those strings are repeated potentially hundreds of times. Compressing this data reduced the size of the table, at the time of writing, from 435 MB to 118 MB.
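If you want to measure this on your own tables, sp_spaceused reports a table's size:

exec sp_spaceused 'dbo.LicenseKeys'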

Good examples to try in Octopus would be Variables and DeploymentProcess: bigger documents with collections as properties.

The benefits of compression are:

  1. Less space used in the database
  2. Less data transferred over the network

The downside, of course, is the CPU time spent compressing data when writing and decompressing it when reading.

In testing against (local)\SQLEXPRESS with a 120K document and a 50% compression ratio, the CPU cost and the bandwidth savings cancelled each other out:

GZIP   Mean      Error      StdDev     Gen 0    Gen 1    Gen 2  Allocated
False  1.560 ms  0.2761 ms  0.0151 ms  64.4531  29.2969  -      357.26 KB
True   1.703 ms  0.3508 ms  0.0192 ms  66.4063  31.2500  -      356.59 KB

But SQL Server rarely runs on the same machine as the application. In those cases, compression makes a big difference. Here's the same test, using NetLimiter to limit bandwidth in/out of SQLEXPRESS to 100 Mbps:

GZIP   Mean       Error     StdDev     Gen 0    Gen 1    Gen 2  Allocated
False  21.242 ms  1.637 ms  0.0898 ms  62.5000  31.2500  -      357.29 KB
True   7.756 ms   1.099 ms  0.0602 ms  62.5000  23.4375  -      356.6 KB

100 Mbps is probably a real-world best case too. In Octofront, for example, it's unlikely we get that between our web apps and the SQL Azure database. I suspect the same will be true for Octopus Cloud. Customer networks may be different, but I'd expect compression to make a considerable impact for many of them.

Enabling compression

To use compression, you need to:

  1. Change the JSON column type to varbinary(max), as sketched below (see also the migration tips further down).
  2. In your DocumentMap, set Compressed to true.

That's it!
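For a new table, step 1 just means declaring the JSON column as varbinary(max) from the start. A minimal sketch, with hypothetical column names, matching the table queried later on this page:

create table dbo.BigObjectCompressed (
  [Id] nvarchar(50) not null primary key,
  [Name] nvarchar(200) not null,
  [JSON] varbinary(max) not null -- varbinary(max) instead of nvarchar(max)
)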

When inserting or updating data, Nevermore uses GZIP to compress the document before writing it to the table. When reading, it uses GZIP to decompress it again.

Querying compressed data

When you enable compression, Nevermore specifically uses Unicode encoding and GZIP because these are compatible with the COMPRESS and DECOMPRESS functions in SQL Server 2016.
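You can see that compatibility for yourself: COMPRESS gzips the UTF-16 bytes of an nvarchar value, and DECOMPRESS reverses it, so this round-trip returns the original string:

select cast(DECOMPRESS(COMPRESS(N'{"Name":"test"}')) as nvarchar(max))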

If you are querying in SQL and want to use JSON_VALUE, for example, you still can. Just call the DECOMPRESS function (SQL Server 2016 or newer) and cast the result to a string:

select cast(DECOMPRESS([JSON]) as nvarchar(max)) from dbo.BigObjectCompressed
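From there, the decompressed text works with the built-in JSON functions as usual. A sketch, assuming the documents have a Name property:

select JSON_VALUE(cast(DECOMPRESS([JSON]) as nvarchar(max)), '$.Name')
from dbo.BigObjectCompressed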

Migration

If you enable compression on a table that already has data, you need to migrate the existing data; you can't have some records compressed and others not.

The COMPRESS function in SQL Server is helpful for the migration. The basic plan would be:

  1. Create a new table with all the same columns (e.g. LicenseKeys2), but make the JSON column varbinary(max)
  2. Do an insert into/select from, calling COMPRESS over the [JSON] column
  3. Drop the old table and any indexes on it, and rename the new table
  4. Add indexes to the new table

Example of the insert into/select from:

insert into dbo.LicenseKeys2
 (
   Id, 
   Col1,
   Col2,
   ....
   [JSON]
 )
select Id, Col1, Col2, .... COMPRESS([JSON]) from dbo.LicenseKey
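And a sketch of steps 3 and 4, with a hypothetical index:

drop table dbo.LicenseKey
exec sp_rename 'dbo.LicenseKeys2', 'LicenseKey'
create index IX_LicenseKey_Name on dbo.LicenseKey ([Name]) -- recreate whatever indexes the old table had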

Why not...

We could have chosen other approaches to compression: using BSON instead of JSON, using UTF-8 (supported from SQL Server 2019) rather than Unicode, or using a better compression algorithm than GZIP. However, those would make it difficult to use any of the data directly from within SQL. The compatibility and migration benefits of this setup seem like the optimal balance.
