Skip to content
This repository was archived by the owner on Sep 1, 2021. It is now read-only.

add a script to convert files to formatted TOML #14

Merged
merged 1 commit into from
Oct 8, 2020

Conversation

maltek
Copy link
Contributor

@maltek maltek commented Oct 7, 2020

For rpm-software-management/rpmlint#439 we need to switch to TOML. To be able to keep working without interruption this PR adds a script that converts the current JSON to decently formatted TOML. The script is a bit hacky, but it makes sure that - except for comments - the parsed TOML files are identical to the JSON input.

The output of this script are TOML files looking like this:

#### sarg ####
  [sarg.audits."bsc#1150554"]
  # builds statistics based on Squid logfile metadata. include sarg-reports which is SUSE specific and important for privilege dropping.

    [sarg.audits."bsc#1150554".digests]
    "/etc/cron.daily/suse.de-sarg" = "sha256:d536dc68e198189149048a907ea6d56a7ee9fc732ae8fec5a4072ad06640e359"
    "/etc/cron.monthly/suse.de-sarg" = "sha256:d536dc68e198189149048a907ea6d56a7ee9fc732ae8fec5a4072ad06640e359"
    "/etc/cron.weekly/suse.de-sarg" = "sha256:d536dc68e198189149048a907ea6d56a7ee9fc732ae8fec5a4072ad06640e359"
    "/usr/sbin/sarg-reports" = "sha256:00ad25400bdc2031cd09f9b8f9e56c448c93b6b89a702d36dce6a385d79e637c"
#### filesystem ####
  [filesystem.audits."bsc#1174642"]
  # Public standard sticky-bit directories

    [filesystem.audits.'bsc#1174642'.meta]
        '/tmp' = { type = 'd', mode = '1777', owner = 'root:root' }
        '/var/tmp' = { type = 'd', mode = '1777', owner = 'root:root' }
        '/var/spool/mail' = { type = 'd', mode = '1777', owner = 'root:root' }
        '/tmp/.X11-unix' = { type = 'd', mode = '1777', owner = 'root:root' }
        '/tmp/.ICE-unix' = { type = 'd', mode = '1777', owner = 'root:root' }

As far as I am concerned, we could also drop the dictionary nesting level "audits", since it's the only sub-key to the package name that exists. Technically, "digests" and "meta" are also redundant (the rpmlint check already knows which of these it expects).

@maltek maltek requested a review from mgerstner October 7, 2020 14:16
@maltek maltek changed the title add a script to convert files to formatted JSON add a script to convert files to formatted TOML Oct 7, 2020
@mgerstner
Copy link
Collaborator

but it makes sure that - except for comments - the parsed TOML files are identical to the JSON input.

what exactly does that mean "except for comments"?

As far as I am concerned, we could also drop the dictionary nesting level "audits", since it's the only sub-key to the package name that exists.

I'm not particularly attached to this nesting but it seemed a good idea to leave room for extension should further needs arise in the future.

Technically, "digests" and "meta" are also redundant (the rpmlint check already knows which of these it expects).

Here I'm inclined to keep them. It makes it more clear when reading the data structure IMO. Also it is less error prone / explicitly marked as a certain type of entry.

@maltek
Copy link
Contributor Author

maltek commented Oct 8, 2020

what exactly does that mean "except for comments"?

As you can see in the see sample output, the comments are turned into TOML comments. So they still are in the output file, but them being comments means they are not visible after parsing this TOML into a Python data structure.


I'll remove the "audits" and keep "meta"/"digests".

@mgerstner
Copy link
Collaborator

I won't do a picky review review for this - hopefully - only temporarily used script. Only the user interface concerns me a bit.

The assert makes sure that the deserialized data is equal so there shouldn't be much to worry about regarding the correctness of the conversion.

  [filesystem.audits."bsc#1174642"]
  # Public standard sticky-bit directories

    [filesystem.audits.'bsc#1174642'.meta]

This repetitiveness surprises me a bit. Is this really the only way to nest in TOML?

@maltek
Copy link
Contributor Author

maltek commented Oct 8, 2020

Only the user interface concerns me a bit.

Yup, that's really the discussion I wanted to have with this PR.

This repetitiveness surprises me a bit. Is this really the only way to nest in TOML?

The only other way to have nested key-value pairs (besides INI-style sections) are "inline tables" like I'm using for the file meta data. They can't span more than a single-line, so that's no good. So repeating e.g. the package name all the time is unavoidable.

The whole [filesystem.audits."bsc#1174642"] line isn't technically necessary, I just wanted to make it clear where the comment belongs to. We could as well move the comments below the meta/digests header, my worry was just that this turns out weird if we ever add another sub-key in the same kind of file.

(Also, the indentation doesn't do anything, I just prefer it that way.)

@maltek
Copy link
Contributor Author

maltek commented Oct 8, 2020

So, to be clear: we could do something like this:

#### nscd ####
    [nscd.'bsc#1174642'.meta]
        # nss caching daemon socket. is packaged as %ghost, therefore 'appears' to be a regular file.
        '/run/nscd/socket' = { type = '-', mode = '0666', owner = 'root:root' }


#### patch2mail ####
    [patch2mail.'bsc#1150552'.digests]
        # looks up pending zypper patches and sens them out by mail
        '/etc/cron.daily/patch2mail' = 'sha256:2db3aaa7addba83e60e24fdb868dd6f353bfb76b005fd2a2fdaf079fb54fe597'

Copy link
Collaborator

@mgerstner mgerstner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, to be clear: we could do something like this:

Looks concise, would be fine for me.

@maltek
Copy link
Contributor Author

maltek commented Oct 8, 2020

OK, I've updated the script. I've done one more change: instead of using a comment to start the section for a package, I use an empty INI-style header:

[bind-chrootenv]
    [bind-chrootenv.'bsc#1174642'.meta]
        # Chroot duplicates of some uncritical character devices. urandom was historically packaged non-world-writable, probably to avoid the rpmlint error triggering.
        '/var/lib/named/dev/null' = { type = 'c', mode = '0666', dev = '1,3', owner = 'root:root' }

Somehow I thought before that wouldn't work. There can't be more than one section with the same name in a file, so at least this prevents multiple sections about a package in the same file. (Nesting isn't enforced though, so audits for a package could in theory be put into the wrong section.)

@mgerstner mgerstner merged commit 51854a1 into openSUSE:master Oct 8, 2020
@marxin
Copy link

marxin commented Oct 22, 2020

@maltek Maybe a silly question, but have you tried using toml.dumps instead of manually writing of a TOML file?
Link to the documentation: https://pypi.org/project/toml/

@maltek
Copy link
Contributor Author

maltek commented Oct 22, 2020

@marxin that works to correctly preserve the contents, but not to get nice readable formatting.

@marxin
Copy link

marxin commented Oct 23, 2020

@marxin that works to correctly preserve the contents, but not to get nice readable formatting.

I see. Anyway, once the rpmlint pull request is merged, I would really recommend migrating all the configuration to TOML files and remove the JSON format. Then you can update the TOML files and no conversion will be needed.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants