[21.05] full-disk encryption: finalisation, improvements, checks, routine tooling #1087

Merged
merged 13 commits into from
Aug 22, 2024


osnyx
Member

@osnyx osnyx commented Aug 21, 2024

@flyingcircusio/release-managers

This wraps up PL-131325 by adding supportive tooling for maintenance tasks:

  • fc-luks keystore test-open for regularly verifying the correct key assignment of volumes

The following sensu checks are added:

  • swap is off
  • fc-luks check: LUKS header parameter check

The following smaller regression fixes and cleanups are included:

  • test successful ceph service activation after bootup PL-132687
  • remove dead code: the python27 environment of previous ceph releases

Release process

Impact: internal

Changelog:

PR release workflow (internal)

  • PR has internal ticket
  • internal issue ID (PL-…) part of branch name
  • internal issue ID mentioned in PR description text
  • ticket is on Platform agile board
  • ticket state set to Pull request ready
  • if ticket is more urgent than within the next few days, directly contact a member of the Platform team

Design notes

  • Provide a feature toggle if the change might need to be adjusted/reverted quickly depending on context. Consider whether the default should be on or off. Example: rate limiting.
  • All customer-facing features and (NixOS) options need to be discoverable from documentation. Add or update relevant documentation such that hosted and guided customers can understand it as well.

Security implications

  • Security requirements defined? (WHERE)
    • must not introduce new known regressions
    • expected critical properties of our disk encryption need to be checked regularly (swap off, LUKS parameters)
    • cover new tooling with unit and integration tests
    • regular maintenance tasks can be streamlined with supporting tooling to reduce potential errors and mental load
  • Security requirements tested? (EVIDENCE)
    • automated tests still pass
    • sensu-checks of test host are green
    • manually tested new fc-luks subcommands and edge cases
    • extended unit tests for fc-luks check
    • the following commands are at least covered by NixOS smoke tests to verify interaction with potential changes in called system commands: fc-luks keystore test-open, fc-luks check

osnyx and others added 9 commits July 31, 2024 11:13
For streamlining the regular task of verifying whether (all) volumes of
a host can still be opened with the admin key and device key, I
implemented the `fc-luks keystore test-open` subcommand.

As this relies on the same volume discovery logic as the `reencrypt`
command, I factored that discovery logic out into a separate method and
reused it.

Also includes a NixOS integration test for this, but no unit tests.
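
As a rough illustration of what such a test-open pass might look like, here is a minimal sketch. It is not the actual `fc-luks` implementation: the function names, the key names, and all paths are hypothetical, and it assumes the check is done with `cryptsetup open --test-passphrase --key-file`, which verifies a key without activating the volume.

```python
import subprocess
from typing import Callable, Iterable, Mapping


def build_test_open_command(device: str, key_file: str) -> list[str]:
    """Build a cryptsetup call that only tests the key, without mapping the device."""
    return ["cryptsetup", "open", "--test-passphrase", "--key-file", key_file, device]


def _run(cmd: list[str]) -> int:
    """Run a command and return its exit code (0 means the key opened the volume)."""
    return subprocess.run(cmd, check=False, capture_output=True).returncode


def find_unopenable(
    devices: Iterable[str],
    key_files: Mapping[str, str],
    run: Callable[[list[str]], int] = _run,
) -> list[tuple[str, str]]:
    """Return (device, key name) pairs for which the test-open failed.

    `devices` stands in for the result of the shared volume discovery logic;
    `key_files` maps a hypothetical key name (e.g. "admin") to its key file.
    """
    failures = []
    for device in devices:
        for key_name, key_file in key_files.items():
            if run(build_test_open_command(device, key_file)) != 0:
                failures.append((device, key_name))
    return failures
```

Passing the runner in as a parameter keeps the loop testable without root privileges or real LUKS devices.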
Reactivating the already-implemented but commented-out test for
successful daemon activation after reboots of encrypted ceph hosts.
The test was broken by stateful modifications of the key material in
earlier parts of the test; moving the subtest ahead of these
modifications resolved it.

PL-132687
The python27 environment has been unused since we dropped Ceph Jewel.

Piggybacking this onto this branch, as FDE work is loosely related to
storage and ceph.
With full-disk encryption, data that is stored encrypted on disk is
still held in memory in decrypted form. If those memory pages are
swapped out, the plaintext data can end up persisted on disk,
circumventing the goal of the encryption.
As we do not want to use swap on physical machines anymore, we can just
monitor and warn when swap is unexpectedly used.

PL-131325
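
A minimal sketch of how a "swap is off" check could work, assuming it inspects the contents of `/proc/swaps` and reports with the usual Nagios-style exit codes that sensu understands (0 = OK, 1 = warning). The function names and the choice of warning severity are assumptions, not the check that was actually merged.

```python
def active_swap_devices(proc_swaps: str) -> list[str]:
    """Return the swap devices listed in the text of /proc/swaps.

    The first line of /proc/swaps is a column header; every following
    non-empty line describes one active swap area, device name first.
    """
    lines = proc_swaps.splitlines()
    return [line.split()[0] for line in lines[1:] if line.strip()]


def check_swap_off(proc_swaps: str) -> tuple[int, str]:
    """Return an (exit code, message) pair in Nagios/sensu style."""
    devices = active_swap_devices(proc_swaps)
    if devices:
        return 1, "WARNING: swap is active on " + ", ".join(devices)
    return 0, "OK: no active swap"
```

A check script would read `/proc/swaps`, print the message, and exit with the returned code.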
Establish a regular sensu check on physical hosts that checks certain
LUKS header parameters of all encrypted devices, to discover unexpected
deviations.

- tie together the individual condition checks
- integrate checks into fc-luks tooling with discovery of encrypted
  volumes
- add sensu check
- extend checks with plausibility checks for proper dump input
- add unit and integration tests for checks with proper mocking
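
The steps listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the `fc-luks check` code: the function names and the expected parameter values are hypothetical, and the plausibility check assumes that `cryptsetup luksDump` output starts with a `LUKS header information` line.

```python
def parse_luks_dump(dump: str) -> dict[str, str]:
    """Parse `key: value` lines from luksDump output into a dict.

    Plausibility check: refuse input that does not start with the expected
    header line. Only the first occurrence of each key is kept, since some
    keys can repeat in later sections of the dump.
    """
    lines = dump.splitlines()
    if not lines or not lines[0].startswith("LUKS header information"):
        raise ValueError("input does not look like `cryptsetup luksDump` output")
    fields: dict[str, str] = {}
    for line in lines[1:]:
        key, sep, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if sep and key and value and key not in fields:
            fields[key] = value
    return fields


def check_header(dump: str, expected: dict[str, str]) -> list[str]:
    """Tie the individual conditions together; an empty list means all passed."""
    try:
        fields = parse_luks_dump(dump)
    except ValueError as error:
        return [str(error)]
    problems = []
    for key, wanted in expected.items():
        found = fields.get(key)
        if found != wanted:
            problems.append(f"{key}: expected {wanted!r}, found {found!r}")
    return problems
```

Because the checks operate on the dump text only, they can be unit-tested against mock outputs, while an integration test exercises the real `cryptsetup` binary.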
It makes sense to expose the check to potential changes in the output
format of `cryptsetup luksDump`.
Detailed unit tests are done against mock outputs only.
Not all physical machines actually have encrypted volumes as of now,
e.g. KVM hosts. The check does not need to run on them.
@osnyx osnyx force-pushed the PL-131325-fde-finalise branch from 69b1841 to 6de0aee Compare August 21, 2024 21:59
osnyx added 4 commits August 22, 2024 14:01
When fc-luks commands run as a sensu check, they no longer inherit the
full system PATH, which breaks access to required external tools such
as `lvs`.
We can adopt the `fc-ceph.conf` approach already used by `fc-ceph`
subsystems; for now, even the default PATH is sufficient.

Requires some extra mocking in unit tests; the NixOS integration test
now explicitly empties the PATH beforehand.
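
One way to make tool lookup independent of the inherited environment is to resolve external commands against an explicitly configured search path. The sketch below is an assumption about how that could look; the `[tools]` section, the `path` option, and the function names are hypothetical and do not reflect the actual `fc-ceph.conf` layout.

```python
import configparser
import shutil


def configured_path(config_text: str, fallback: str) -> str:
    """Read the tool search path from INI-style config text.

    Falls back to `fallback` if the (hypothetical) [tools] section or its
    `path` option is missing.
    """
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    return parser.get("tools", "path", fallback=fallback)


def resolve_tool(name: str, search_path: str) -> str:
    """Find an external tool in `search_path` only, ignoring the inherited PATH."""
    found = shutil.which(name, path=search_path)
    if found is None:
        raise FileNotFoundError(f"{name} not found in {search_path!r}")
    return found
```

Resolving tools this way gives the same result whether the command is started from an interactive shell or from a monitoring agent with a reduced environment.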
@osnyx osnyx force-pushed the PL-131325-fde-finalise branch from 8400bfc to f5c854e Compare August 22, 2024 12:13
@osnyx osnyx marked this pull request as ready for review August 22, 2024 12:24
@ctheune ctheune merged commit 1d6e41d into fc-21.05-dev Aug 22, 2024
2 checks passed
@ctheune ctheune deleted the PL-131325-fde-finalise branch August 22, 2024 14:22