New file format (stabilization) #1173
Comments
For the ink serialized format, no floats are used to store ink strokes; there's only a minimum fixed size (like 1/1000th of a cm) and quantization is used. Hence everything uses ints. The other trick is that instead of saving the data as
It's saved as
Hence the first element, then the first derivative (difference), and then only second derivatives. The general idea is that the second derivatives are usually very small, so additional compression can be applied on top of this (Huffman coding or bit-packing, as we expect low values that can be represented in very few bits). It'd be great though to still have an easy way for people to obtain a human-readable or JSON version of the file and/or the file specification. I don't want vendor lock-in for files because of undocumented binary formats, or documented ones without readily available readers that most people can use, having been a victim of this myself and of the sheer insanity of trying to get the data out. For partial file loading, the natural thing would be to have something based on pages (separate files in a zipped folder seems common), but this wouldn't be enough because strokes can span more than one page. Maybe having optional files corresponding to blocks of 2^n * 2^n pages, each including only the strokes that can't fit into a smaller child block, would work. This would make it relatively easy to test for strokes that cross pages, something that would be useful for any page management functionality (in addition to having some concept of a page).
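To make the encoding described above concrete, here is a minimal Rust sketch of quantization followed by second-difference encoding. The function names (`quantize`, `encode`, `decode`) and the use of `i64` are assumptions for illustration only; they are not taken from the ink serialized format or from Rnote.

```rust
/// Quantize a coordinate given in cm to integer steps of `1 / steps_per_cm` cm.
/// (Hypothetical helper; e.g. `steps_per_cm = 1000.0` for 1/1000th of a cm.)
fn quantize(value_cm: f64, steps_per_cm: f64) -> i64 {
    (value_cm * steps_per_cm).round() as i64
}

/// Encode as [first element, first difference, second differences...].
fn encode(values: &[i64]) -> Vec<i64> {
    let mut out = Vec::with_capacity(values.len());
    let mut prev = 0i64;
    let mut prev_delta = 0i64;
    for (i, &v) in values.iter().enumerate() {
        let delta = v - prev;
        match i {
            0 => out.push(v),                  // first element, stored as-is
            1 => out.push(delta),              // first difference
            _ => out.push(delta - prev_delta), // second difference
        }
        prev = v;
        prev_delta = delta;
    }
    out
}

/// Inverse of `encode`: rebuild the original values by summing twice.
fn decode(encoded: &[i64]) -> Vec<i64> {
    let mut out = Vec::with_capacity(encoded.len());
    let mut prev = 0i64;
    let mut prev_delta = 0i64;
    for (i, &e) in encoded.iter().enumerate() {
        // From index 2 onwards the stored value is a change in the delta.
        let delta = if i < 2 { e } else { prev_delta + e };
        let v = if i == 0 { e } else { prev + delta };
        out.push(v);
        prev = v;
        prev_delta = delta;
    }
    out
}
```

For a smoothly accelerating stroke such as `[1000, 1012, 1025, 1039, 1054]`, `encode` yields `[1000, 12, 1, 1, 1]`: after the first two entries only small values remain, which is what makes a later Huffman or bit-packing stage worthwhile.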
Alright, here's what I think:
I think a bigger talking point would be the format of the data itself. Maybe for now we can stick to what we are already doing by just using what serde gives us - the strokes store without any modifications to its data layout - but I think it is worth discussing how the serialization layout can differ from the data layout of the application itself, especially with regard to partial file loading.
I think the compression being in the header was to accommodate the possibility of saving to and loading from the raw JSON encoding (so that you can uncompress the data a little more easily if needed). I think it was also a way to more easily test different compression methods.
Good point, keeping the version separate from the header was done for flexibility (i.e. renaming fields without having to specify the old alias, changing the serialization method for the header, adding new fields that do not need to implement Default, etc.), but I could include a more complete version inside the prelude to allow the use of 'Prerelease' and 'BuildMetadata' (i.e. [u64, u64, u64, Prerelease size (u8/u16), Prerelease (str), BuildMetadata size (u8/u16), BuildMetadata (str)]). edit: b8062ec
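As an illustration of the prelude version layout listed above, here is a rough Rust sketch of writing and reading such a record. The struct name `PreludeVersion`, the little-endian byte order, and the choice of `u16` for both size fields are assumptions made for this example; the actual implementation in the referenced commit may differ.

```rust
/// Hypothetical in-memory form of the version stored in the prelude.
#[derive(Debug, Clone, PartialEq, Eq)]
struct PreludeVersion {
    major: u64,
    minor: u64,
    patch: u64,
    pre: String,   // pre-release identifier, empty if absent
    build: String, // build metadata, empty if absent
}

impl PreludeVersion {
    /// Layout: [u64, u64, u64, pre size (u16), pre, build size (u16), build].
    /// Returns `None` if a string is too long for its u16 size field.
    fn to_bytes(&self) -> Option<Vec<u8>> {
        if self.pre.len() > u16::MAX as usize || self.build.len() > u16::MAX as usize {
            return None;
        }
        let mut out = Vec::new();
        for n in &[self.major, self.minor, self.patch] {
            out.extend_from_slice(&n.to_le_bytes());
        }
        for s in &[&self.pre, &self.build] {
            out.extend_from_slice(&(s.len() as u16).to_le_bytes());
            out.extend_from_slice(s.as_bytes());
        }
        Some(out)
    }

    /// Parses a version and returns it with the number of bytes consumed,
    /// so the caller knows where the rest of the file starts.
    fn from_bytes(bytes: &[u8]) -> Option<(PreludeVersion, usize)> {
        let mut pos = 0;
        let major = read_u64(bytes, &mut pos)?;
        let minor = read_u64(bytes, &mut pos)?;
        let patch = read_u64(bytes, &mut pos)?;
        let pre = read_str(bytes, &mut pos)?;
        let build = read_str(bytes, &mut pos)?;
        Some((PreludeVersion { major, minor, patch, pre, build }, pos))
    }
}

/// Returns the next `len` bytes and advances `pos`, or `None` if truncated.
fn take<'a>(bytes: &'a [u8], pos: &mut usize, len: usize) -> Option<&'a [u8]> {
    let end = pos.checked_add(len)?;
    let slice = bytes.get(*pos..end)?;
    *pos = end;
    Some(slice)
}

fn read_u64(bytes: &[u8], pos: &mut usize) -> Option<u64> {
    let mut buf = [0u8; 8];
    buf.copy_from_slice(take(bytes, pos, 8)?);
    Some(u64::from_le_bytes(buf))
}

fn read_str(bytes: &[u8], pos: &mut usize) -> Option<String> {
    let mut buf = [0u8; 2];
    buf.copy_from_slice(take(bytes, pos, 2)?);
    let len = u16::from_le_bytes(buf) as usize;
    String::from_utf8(take(bytes, pos, len)?.to_vec()).ok()
}
```

Because the three numeric fields have a fixed size and each string carries its own length, a reader can parse the version before it knows anything about how the header that follows is serialized.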
What @Doublonmousse said is correct, furthermore this allows users to specify a compression level (transmitted to SavePrefs), and makes it trivial to add and test different compression and serialization methods.
I haven't worked on this aspect, however the 'body' of the proposed file format is a vector of u8 instead of the current ijson value, which makes it very flexible. For completeness, here's the latest excalidraw image:
In addition to the version, you may want to add feature flags. The idea is to have 3 sets of flags:
The use of feature flags is to allow more flexibility to communicate between versions of For example, imagine that in version Using versions alone it is not possible (or harder) to know for Of course, this idea is not mine, I borrowed from
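The three sets of flags are not reproduced in this thread, so the following Rust sketch assumes one well-known arrangement, the compatible / read-only-compatible / incompatible split used in the ext4 superblock. The names `FeatureFlags`, `OpenMode` and `open_mode`, and the `u32` width, are hypothetical and only illustrate how a reader could decide what to do with a file that uses features it does not know.

```rust
/// Hypothetical flag sets stored in the file (one bit per feature).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct FeatureFlags {
    compat: u32,    // unknown bits here can be ignored safely
    ro_compat: u32, // unknown bits here: reading is safe, writing is not
    incompat: u32,  // unknown bits here: the file cannot be interpreted
}

#[derive(Debug, PartialEq, Eq)]
enum OpenMode {
    ReadWrite,
    ReadOnly,
    Refuse,
}

/// Decide how to open a file, given the flags it declares and the
/// flags this build of the application understands.
fn open_mode(file: FeatureFlags, supported: FeatureFlags) -> OpenMode {
    if (file.incompat & !supported.incompat) != 0 {
        OpenMode::Refuse
    } else if (file.ro_compat & !supported.ro_compat) != 0 {
        OpenMode::ReadOnly
    } else {
        OpenMode::ReadWrite
    }
}
```

Under this assumption, an older build that meets an unknown read-only-compatible bit can still display the document but would refuse to overwrite it, which a version number alone does not tell it.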
edit: I accidentally used an older version of the drawing to update the prelude version text
Let's track what Rnote's new file format should look like here.
There are already some ideas and implementation for improvements floating around.
Ideally backwards compatibility is kept which I think is doable.
Improvements are:
Things to consider are: how to continue making upgrade paths relatively easy to implement - currently JSON's untyped `Value`s are used, which won't work with other formats. I can see two possibilities: limit additions and changes to what is backwards compatible from the perspective of the serde-derived `Serialize` and `Deserialize` traits, and if that's not possible, start implementing both traits manually. This can be quite tedious in some cases though.