Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make zap files smaller #992

Merged
merged 18 commits into from
Apr 18, 2023
Merged

Conversation

tecimovic
Copy link
Collaborator

@tecimovic tecimovic commented Apr 11, 2023

This PR addresses the problem of zap files being very large.

Following is done:
0. fileFormat is put into the ZAP files. It is set to 1 from this point onward. Lack of fileFormat assumes fileFormat 0.

  1. Zap can read old format or a new format. Makes no difference when files are being read in.
  2. When zap writes files, it ALWAYS writes file format 1, unless you override this with --saveFileFormat command line argument, or ZAP_SAVE_FILE_FORMAT environment variable.
  3. The file format is just a simple post-processing (upon write) or pre-processing (upon read) of the data from JSON file. This change is not "deep" inside zap. It literally just produces JS as it did earlier, and then does some "packing/unpacking" on the JSON data, converting some objects into Strings and the other way around. There is NO CHANGE to database queries, or anything else. It's just an ouput/input filter at the point where data is writen-to/read-from the file itself.

@codecov-commenter
Copy link

codecov-commenter commented Apr 11, 2023

Codecov Report

Merging #992 (fa2136c) into master (9b04799) will increase coverage by 0.16%.
The diff coverage is 81.57%.

@@            Coverage Diff             @@
##           master     #992      +/-   ##
==========================================
+ Coverage   66.87%   67.03%   +0.16%     
==========================================
  Files         156      157       +1     
  Lines       16985    17171     +186     
  Branches     3690     3750      +60     
==========================================
+ Hits        11358    11511     +153     
- Misses       5627     5660      +33     
Impacted Files Coverage Δ
src-electron/db/query-command.js 69.37% <ø> (ø)
src-electron/main-process/startup.js 43.38% <ø> (ø)
src-electron/importexport/file-format.js 80.23% <80.23%> (ø)
src-electron/importexport/export.js 97.11% <81.81%> (-1.84%) ⬇️
.../matter/app/zap-templates/templates/chip/helper.js 53.35% <100.00%> (ø)
src-electron/importexport/import.js 95.12% <100.00%> (+0.25%) ⬆️
src-electron/util/types.js 79.62% <100.00%> (+0.80%) ⬆️

... and 1 file with indirect coverage changes

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@tecimovic tecimovic force-pushed the zap-file-minimization branch from 94da7c8 to 000c658 Compare April 15, 2023 16:20
@tecimovic tecimovic marked this pull request as ready for review April 16, 2023 13:35
@tecimovic tecimovic linked an issue Apr 16, 2023 that may be closed by this pull request
@@ -159,7 +159,7 @@ test(
x = await testQuery.selectCountFrom(db, 'ENDPOINT_TYPE_CLUSTER')
expect(x).toBe(27)
x = await testQuery.selectCountFrom(db, 'ENDPOINT_TYPE_COMMAND')
expect(x).toBe(28)
expect(x).toBe(26)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did this change?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should update the developer documentation with examples for an easy reference in the future.

Copy link
Contributor

@bzbarsky-apple bzbarsky-apple left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to see a summary of the specific changes being made, apart from the attribute/command bits I already saw above. Is anything else changing?

What actual problem are we trying to solve? If it's just the size of the .zap file on disk, then using gzipped json would be a much simpler fix, I expect, that would give even better space savings. If it's not size on disk, what is the problem being solved?

"side": "server",
"enabled": 1,
"attributes": [
"+ | 0x0000 | | server | RAM | | | 0x08 | 0 | 0 | 65534 | 0 => ZCL version [int8u]",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the main change here is that attributes, instead of being stored as actual objects in the JSON, are now being stored as some sort of string that ZAP will parse?

This seems like a significant step backwards to me. It includes even more information in the .zap file that duplicates the XML (e.g. the attribute types) in a way that makes it even harder to remove. And there's no obvious documentation for the fields here, which makes working with the .zap files without using the ZAP UI (a common ask) more complicated.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are correct that gzipping the JSON would be a much more saving. However this is attempting to find a ballance between:

  • make files smaller
    AND
  • keep them in text format, readable and writable by humans in text editor.

The main problem is, that storing an attribute takes 16 lines of text:

{
            "name": "ClusterRevision",
            "code": 65533,
            "mfgCode": null,
            "side": "client",
            "type": "int16u",
            "included": 1,
            "storageOption": "RAM",
            "singleton": 0,
            "bounded": 0,
            "defaultValue": "2",
            "reportable": 1,
            "minInterval": 1,
            "maxInterval": 65534,
            "reportableChange": 0
}

So if you go scrolling over the text file with 6 blocks like this, you get lost in this.
A block of few attributes like this, is arguably easier to peruse:

"attributes": [
            "+ | 0x0000 |        | server | RAM |           |       |                 0x00 | 1 | 0 | 65344 | 0 => SceneCount [int8u]",
            "+ | 0x0001 |        | server | RAM |           |       |                 0x00 | 1 | 0 | 65344 | 0 => CurrentScene [int8u]",
            "+ | 0x0002 |        | server | RAM |           |       |               0x0000 | 1 | 0 | 65344 | 0 => CurrentGroup [group_id]",
            "+ | 0x0003 |        | server | RAM |           |       |                 0x00 | 1 | 0 | 65344 | 0 => SceneValid [boolean]",
            "+ | 0x0004 |        | server | RAM |           |       |                      | 1 | 0 | 65344 | 0 => NameSupport [bitmap8]",
            "+ | 0xfffd |        | server | RAM |           |       |                    4 | 1 | 0 | 65344 | 0 => ClusterRevision [int16u]"
]

But you are correct, that this now requires custom parsing, so this is 100% detrimental to scripted post-processing of zap files. Zap, though assumes that you are doing all the .zap file processing through zap itself. But obviously, that's a "wish" and possibly not a reality.

Note that there is NO MORE information, it's just differently formatted. We already had types and names there that served nothing.

I dunno.... I am not going to merge this in as a default file format, if you are strongly unhappy about it.

I can add file-format 2, which would be gzipped files?

Any other idea?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly, doing gunzip, edit, gzip (or unzip, edit, zip) is a lot simpler than figuring out the custom syntax even for hand-editing...

And yes, there are already tools out there that do .zap file processing that are not ZAP itself. I guess those could be supported if there were a way to take the new format, convert to old, modify, convert to new.

Note that there is NO MORE information, it's just differently formatted.

Right, but it seems like it makes it harder to remove the redundant information that shouldn't really be there, if we wanted to do that in the future, since it bakes that information into the syntax.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. Fair enough.

How about this:
1.) I will add a commit on top of this, that will keep the "default save file format" to 0, which means that none of this format will be in effect, unless someone specifically sets the ZAP_SAVE_FILE_FORMAT environment variable, and/or uses the --saveFileFormat command line argument.
2.) The will make happy people who choose to use this, probably not on Matter SDK.
3.) Everyone else, including Matter SDK for now, according to your feedback, will not even notice anything happened, since unless they go dig for it, they will simply continue using the old format.

Does that work for you now, temporarily? I can then merge this PR, people who care about this new file format can play with it, and everyone else is not affected?

This also gives us framework that if we want to add file format 3, 4, 5 or 6, we can easily add them on top of it, even if the file format 1 will continue proving to be unpopular.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds great, thank you!

"side": "client",
"enabled": 1,
"commands": [
"0x0000 | | client | 1 | 1 => RequestInformation",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar comments as for attributes.

* @param {*} fileFormat
*/
function convertToFile(state) {
if (state.fileFormat && state.fileFormat > 0) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

&& the same thing

@tecimovic tecimovic force-pushed the zap-file-minimization branch from c894323 to caf8642 Compare April 18, 2023 13:34
@tecimovic tecimovic merged commit 313e73a into project-chip:master Apr 18, 2023
@tecimovic tecimovic deleted the zap-file-minimization branch April 18, 2023 21:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants