libselinux: rework selabel_file(5) database · cgzones/selinux@92306da

libselinux: rework selabel_file(5) database

Currently the database for file backend of selabel stores the file
context specifications in a single long array.  This array is sorted by
special precedence rules, e.g. regular expressions without meta
characters first, ordered by length, and the remaining regular
expressions ordered by stem (the prefix part of the regular expressions
without meta characters) length.

This results in suboptimal lookup performance for two reasons:

  * File context specifications without any meta characters (e.g.
    '/etc/passwd') are still matched via an expensive regular expression
    match operation.
  * All such trivial regular expressions are matched against before any
    non-trivial regular expression, resulting in thousands of regex match
    operations for lookups of paths not matching any of the trivial ones.

Rework the internal representation of the database in two ways:

  * Convert regular expressions without any meta characters and
    containing only supported escaped characters (e.g.
    '/etc/rc\.d/init\.d') into literal strings, which get compared via
    strcmp(3) later on.
  * Store the specifications in a tree structure to reduce the number of
    specifications that need to be checked.
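
The first conversion can be illustrated with a minimal sketch. It is not the code from this commit: the helper name `regex_to_literal`, the set of meta characters, and the rule that only escaped meta characters count as "supported" escapes are all assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not the libselinux implementation).
 * Copies `regex` into `out` as a plain string if it contains no
 * unescaped meta characters and only escapes of meta characters
 * (assumed here to be the "supported" escapes).  Returns false if
 * the specification has to stay a regular expression or does not fit.
 */
static bool regex_to_literal(const char *regex, char *out, size_t outlen)
{
	static const char meta[] = ".^$?*+|[](){}\\";
	size_t n = 0;

	if (outlen == 0)
		return false;

	for (const char *p = regex; *p != '\0'; p++) {
		char c = *p;

		if (c == '\\') {
			/* Look at the escaped character. */
			c = *++p;
			/* Reject a trailing backslash and classes like \d or \w. */
			if (c == '\0' || strchr(meta, c) == NULL)
				return false;
		} else if (strchr(meta, c) != NULL) {
			/* Unescaped meta character: a real regular expression. */
			return false;
		}

		/* Keep one byte free for the terminating NUL. */
		if (n + 1 >= outlen)
			return false;
		out[n++] = c;
	}

	out[n] = '\0';
	return true;
}
```

With this sketch, '/etc/rc\.d/init\.d' becomes the literal '/etc/rc.d/init.d', which can then be compared with strcmp(3), while `/usr/(.*/)?bin(/.*)?` is rejected and remains a regular expression.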

Since the internal representation is completely rewritten, introduce a
new compiled file context file format mirroring the tree structure.
The new format also stores all multi-byte data in network byte-order, so
that such compiled files can be cross-compiled, e.g. for embedded
devices with read-only filesystems (except for the regular expressions,
which are still architecture-dependent, but ignored on architecture
mismatch).
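
As a rough illustration of what storing multi-byte data in network byte-order means, the sketch below serializes a 32-bit value with htonl(3) and reads it back with ntohl(3). The helper names `put_u32` and `get_u32` are hypothetical and do not come from the commit; the actual file layout is not shown here.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: store `value` at `buf` in network (big-endian)
 * byte order, independent of the host architecture.
 * Returns the number of bytes written.
 */
static size_t put_u32(unsigned char *buf, uint32_t value)
{
	uint32_t be = htonl(value);

	memcpy(buf, &be, sizeof(be));
	return sizeof(be);
}

/* Hypothetical helper: read back a value written by put_u32(). */
static uint32_t get_u32(const unsigned char *buf)
{
	uint32_t be;

	/* memcpy avoids unaligned access on strict architectures. */
	memcpy(&be, buf, sizeof(be));
	return ntohl(be);
}
```

A file written this way on a little-endian build host yields the same bytes as one written on a big-endian target, which is what makes the non-regex part of a compiled file portable.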

The improved lookup performance will also benefit SELinux-aware daemons,
which create files with their default context, e.g. systemd.

Fedora 41 (pre-compiled regular expressions are omitted on Fedora):
    file_contexts.bin:           567248  ->   413191  (bytes)
    file_contexts.homedirs.bin:   20677  ->    13107  (bytes)

Debian Sid (pre-compiled regular expressions are included):
    file_contexts.bin:          7790690  ->  3646256  (bytes)
    file_contexts.homedirs.bin:  835950  ->   708793  (bytes)

(selabel_lookup -b file -k /bin/bash)

Fedora 41 in VM:
    text:      time:       7.2 ms  ->   3.5 ms
               peak heap:   2.33M  ->    1.81M
               peak rss:    6.64M  ->    6.37M
    compiled:  time:       5.9 ms  ->   1.6 ms
               peak heap:   2.14M  ->    1.23M
               peak rss:    6.76M  ->    5.91M

Debian Sid on Raspberry Pi 3:
    text:      time:      33.4 ms  ->  21.2 ms
               peak heap:  10.59M  ->  607.32K
               peak rss:    6.55M  ->    4.46M
    compiled:  time:      38.3 ms  ->  23.5 ms
               peak heap:  13.28M  ->    2.00M
               peak rss:   12.21M  ->    7.60M

(restorecon -vRn /)

Fedora 41 in VM:
       9.6 s  ->   1.3 s
Debian Sid on Raspberry Pi 3:
      94.6 s  ->  12.1 s

(restorecon -vRn -T0 /)

Fedora 39 in VM (8 cores):
      10.9 s  ->   1.0 s
Debian Sid on Raspberry Pi 3 (4 cores):
      58.9 s  ->  12.6 s

(note: I am unsure why the parallel runs on Fedora are slower)

There might be subtle differences in lookup results which evaded my
testing, because some precedence rules are oblique.  For example
`/usr/(.*/)?lib(/.*)?` has to have a higher precedence than
`/usr/(.*/)?bin(/.*)?` to match the current Fedora behavior.  Please
report any behavior changes.

The maximum node depth in the database is set to 3, which seems to give
the best ratio of performance to memory usage.  It might need tweaking
for systems with different filesystem hierarchies (Android?).
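
To make the depth limit concrete, the following sketch collects at most three leading directory components of an absolute path, which is one plausible way to choose the tree nodes visited during a lookup. It is an assumption for illustration only: the names `MAX_DEPTH`, `struct component` and `leading_components` are hypothetical, and the commit's actual node selection may differ.

```c
#include <stddef.h>
#include <string.h>

#define MAX_DEPTH 3	/* depth limit named in the commit message */

/* A path component referenced in place, without copying. */
struct component {
	const char *start;
	size_t len;
};

/*
 * Hypothetical helper: collect up to MAX_DEPTH leading directory
 * components of an absolute path.  The final path element is treated
 * as the file itself and never becomes a node.
 * Returns the number of components stored in `out`.
 */
static size_t leading_components(const char *path,
				 struct component out[MAX_DEPTH])
{
	size_t n = 0;
	const char *p = path;

	while (n < MAX_DEPTH && *p == '/') {
		const char *start = p + 1;
		const char *end = strchr(start, '/');

		if (end == NULL)	/* last element: the file itself */
			break;
		if (end != start) {	/* skip empty components ("//") */
			out[n].start = start;
			out[n].len = (size_t)(end - start);
			n++;
		}
		p = end;
	}

	return n;
}
```

Under these assumptions, '/usr/lib/systemd/system/foo.service' yields the components 'usr', 'lib' and 'systemd', so only specifications stored along that branch would need to be checked, while '/etc/passwd' yields just 'etc'.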

I am not that familiar with the selabel_partial_match(3),
selabel_get_digests_all_partial_matches(3) and
selabel_hash_all_partial_matches(3) related interfaces, so I only did
some rudimentary tests for them.

CC: Petr Lautrbach <[email protected]>
CC: James Carter <[email protected]>
CC: Stephen Smalley <[email protected]>
Signed-off-by: Christian Göttsche <[email protected]>
Acked-by: James Carter <[email protected]>

cgzones authored and jwcart2 committed Nov 15, 2024

1 parent 90b1c23 commit 92306da

libselinux/src/label_backends_android.c

            
    @@ -91,7 +91,7 @@ static int process_line(struct selabel_handle *rec,
     	unsigned int nspec = data->nspec;
     	const char *errbuf = NULL;
     
    -	items = read_spec_entries(line_buf, &errbuf, 2, &prop, &context);
    +	items = read_spec_entries(line_buf, strlen(line_buf), &errbuf, 2, &prop, &context);
     	if (items < 0) {
     		if (errbuf) {
     			selinux_log(SELINUX_ERROR,
