Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update SecureDrop workstation to support 4.2 and use an install-time base template. #970

Merged
merged 13 commits into from
Apr 9, 2024

Conversation

zenmonkeykstop
Copy link
Contributor

@zenmonkeykstop zenmonkeykstop commented Apr 2, 2024

Status

Work in progress:

  • sdw-admin --apply command detects existing appVM dependencies on debian-12-minimal and alerts the user, but should also reinstall debian-12-minimal if necessary
  • some tests failing - needs a grsec kernel with the expected identifier (including -workstation-grsec) and needs updates to the RPC checks.
  • launcher tests are failing in CI

Description of Changes

Fixes #969 .

Testing

Much larger plan TK, but roughly

  • Provision dev env via make clone, make dev, verify that the latter completes successfully
  • reboot
  • Verify that make test passes
  • Verify that the launcher is triggered and update is successful
  • Verify basic client functionality (log in, retrieve and view submissions,, reply, export/print)

Deployment

Migration from 4.1 is not supported - fresh installs on 4.2 only.

Checklist

If you have made changes to the provisioning logic

  • All tests (make test) pass in dom0

If you have added or removed files

vi

  • I have updated MANIFEST.in and rpm-build/SPECS/securedrop-workstation-dom0-config.spec

If documentation is required

  • I have opened a PR in the docs repo for these changes, or will do so later
  • I would appreciate help with the documentation

@zenmonkeykstop zenmonkeykstop marked this pull request as draft April 2, 2024 14:15
Makefile Outdated Show resolved Hide resolved
Makefile Outdated
TERM=xterm-256color $(CONTAINER) bash -c "sudo ln -s $$PWD/scripts/fake-setarch.py /usr/local/bin/setarch && sudo reprotest 'make build-rpm' 'rpm-build/RPMS/noarch/*.rpm' --variations '+all,+kernel,-fileordering,-domain_host'"
TERM=xterm-256color $(CONTAINER) bash -c "sudo ln -s $$PWD/scripts/fake-setarch.py /usr/local/bin/setarch && sudo reprotest 'make build-rpm' 'rpm-build/RPMS/noarch/*.rpm' --variations '+all,+kernel,-time,-fileordering,-domain_host'"
@echo
@echo Warning! Temporarily removed time variations for reprotest.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a blocker. Or if its just reprotest/libfaketime we can switch to the approach of doing two builds and then diffing them.

sd-base-template-install-additional-packages:
pkg.installed:
- pkgs:
- qubes-core-agent-passwordless-root
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of these should be dependencies in -config or other packages. But I'm not sure why we're not starting with the GNOME (or even XFCE) templates to begin with, I thought that was what we had discussed initially?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we start with -gnome or -xfce, then the base template will include more than we want. The initial discussion was actually around using -minimal, I did flip to -xfce temporarily, but the resulting attack surface in terms of unused packages is too great, and removing them would be more hassle than starting with minimal and adding just what we need.

Copy link
Contributor Author

@zenmonkeykstop zenmonkeykstop Apr 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I'm on board with adding them in the securedrop-workstation-config package.

@@ -53,6 +51,9 @@ configuration over time.

%install
%{python3} -m pip install --no-compile --no-index --no-build-isolation --root %{buildroot} .
# direct_url.json is is not reproducible and not strictly needed
rm %{buildroot}/%{python3_sitelib}/*%{version}.dist-info/direct_url.json
sed -i "/\.dist-info\/direct_url\.json,/d" %{buildroot}/%{python3_sitelib}/*%{version}.dist-info/RECORD
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from the client:

override_dh_strip_nondeterminism:
	find ./debian/ -type f -name '*.pyc' -delete
	find ./debian/ -type d -name '__pycache__' -delete
	find ./debian/ -type f -name 'pip-selfcheck.json' -delete
	find ./debian/ -type f -name 'direct_url.json' -delete
	find ./debian/ -type f -name 'RECORD' -delete
	dh_strip_nondeterminism $@

let's just delete these instead of trying to fix them up

@zenmonkeykstop zenmonkeykstop force-pushed the 4_2-compat-deb12-base branch 2 times, most recently from a38de98 to 503d73d Compare April 2, 2024 15:16
@legoktm legoktm force-pushed the 4_2-compat-deb12-base branch 3 times, most recently from dd82553 to 0f3bb34 Compare April 2, 2024 19:22
@legoktm
Copy link
Member

legoktm commented Apr 2, 2024

Summary of the dom0 test failures:

  • test_sd_client_apparmor - apparently not set to enforce
  • test_log_dirs_properly_named / test_logs_are_flowing - unclear what the error is, but probably related to the next two
  • test_redis_service_running - it's not running
  • test_sd_log_service_running - it's not running
  • test_sd_proxy_rpc_spec - file doesn't exist maybe?
  • test_Policies - RPC policy doesn't exist
  • test_sd_workstation_template - template doesn't exist
  • test_all_fedora_vms_uptodate - Command 'sudo dnf check-update' returned non-zero exit status 100.
  • test_gpg_domain_configured - returned '' instead of sd-gpg
  • test_logging_configured, and a number of related dependency-related tests all saying not installed
  • test_mime_types - selecting wrong opener
  • _check_kernel - known that the kernel is reporting the wrong version
  • test_all_sd_vm_apt_sources - missing apt source

@@ -146,10 +70,23 @@ dom0-rpc-qubes.r5-format-ask-allow:
securedrop.Log * sd-log sd-log deny notify=no
securedrop.Log * @tag:sd-workstation sd-log allow

securedrop.Proxy * sd-app sd-proxy allow

qubes.Gpg * @tag:sd-client sd-gpg allow
Copy link
Contributor

@rocodes rocodes Apr 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(are we missing the allow rule for GpgImportKey? qubes.GpgImportKey * @tag:sd-client sd-gpg allow)


qubes.USB * sd-devices sys-usb allow

# TODO: should this be handled with the new Global Config UI instead?
Copy link
Contributor

@rocodes rocodes Apr 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is appropriate for this policy file, so that the config is set by files we control (and if uninstall is run it will be removed cleanly)

@@ -1,7 +1,8 @@
# -*- coding: utf-8 -*-
# vim: set syntax=yaml ts=2 sw=2 sts=2 et :

{% if grains['id'] in ["securedrop-workstation-bullseye", "sd-small-bullseye-template", "sd-large-bullseye-template"] %}
# TODO: parametrise this
{% if grains['id'] in ["sd-small-bookworm-template", "sd-large-bookworm-template"] %}
Copy link
Contributor

@rocodes rocodes Apr 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: if we check against {{ sdvars.distribution }} , we will more towards one source of truth for debian version name (maybe that's what you meant by parameterise?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup that's what I meant.

@@ -22,7 +22,7 @@ function request_migration() {

# Template consolidation. If old template names are found,
# then we must rerun the full states to re-apply.
if [[ -n "$(qvm-ls --tags sd-workstation --raw-list | perl -nE '/sd-(?!small|large).*-template/ and print $_')" ]] ; then
if [[ -n "$(qvm-ls --tags sd-workstation --raw-list | perl -nE '/sd-(?!small|large|base).*-template/ and print $_')" ]] ; then
reason="template-consolidation"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we remove this, since we're starting clean on 4.2?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, I think so.


dom0-install-securedrop-workstation-template:
# Ensure debian-12-minimal is present for use as base template
dom0-install-debian-minimal-template:
cmd.run:
Copy link
Contributor

@rocodes rocodes Apr 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can just use qvm.template_installed instead of cmd.run. If memory serves, we had an issue with the qvm-template tooling because we were using the same repo to serve templates and the dom0 config rpm, and qvm-template was upset by this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can probably be an issue.

@legoktm legoktm force-pushed the 4_2-compat-deb12-base branch from 7e867f3 to 34e59ff Compare April 2, 2024 22:36
@legoktm
Copy link
Member

legoktm commented Apr 2, 2024

I squashed it down from 41 commits to 11. 3 are annotated as FIXMES, I think we'd want to drop them before merging. I'll look into the rpm reproducibility issue tomorrow, and the shutdown VMs one can be dropped once freedomofpress/securedrop-workstation-ci#44 is deployed. I'm not sure about the qvm-check one.

@legoktm legoktm marked this pull request as ready for review April 2, 2024 22:48
@mig5
Copy link
Contributor

mig5 commented Apr 3, 2024

the shutdown VMs one can be dropped once freedomofpress/securedrop-workstation-ci#44 is deployed.

freedomofpress/securedrop-workstation-ci#44 is now merged and deployed to the runner bastion, should now be active for all future CI runs.

# Finally, remove the old supported fedora DVM we created. We won't uninstall
# the template, in case it's being used elsewhere, but the `sd-` VMs we can
# reasonably manage (remove) ourselves.
# TODO: does this need to be updated to refer to sd-fedora-<previous>-dvm? or was it
# a one-time thing?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, if we're still going with sd-fedora-xx-dvm for the sys-usb dvm template, it would be friendly to clean up the old ones when we're not using them anymore. But since we're installing clean on 4.2, users won't have an out-of-date sd-fedora dvm on their machines, so this won't be relevant til the next fedora base template upgrade.

"SecureDrop Workstation should be installed on a fresh Qubes OS install.\n"
"The installation process will overwrite any user modifications to the\n"
f"{BASE_TEMPLATE} TemplateVM, and will disable old-format qubes-rpc\n"
"policy directives.\n"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this latter part is true anymore - I think we have decided that we won't disable them, we'll just make sure all our rules take precedence before the old-style rules can even load.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sdw-admin will get another pass to clean up output and add more stringent prod checks, we can bake that final decision in (and maybe check the rpc rules config as well as checking for VMs) when we do.

@legoktm legoktm force-pushed the 4_2-compat-deb12-base branch from 34e59ff to f5cf6da Compare April 3, 2024 18:10
@@ -9,7 +9,7 @@
BULLSEYE_STRING = "Debian GNU/Linux 11 (bullseye)"
Copy link
Contributor

@deeplow deeplow Apr 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be removed, I think.

@legoktm
Copy link
Member

legoktm commented Apr 3, 2024

I don't want to disuade people from pointing out things that can safely be removed, because I think there's a bunch, but I think we can defer them for follow-up PRs that will be trivial to merge?


In my testing locally, with the two commits I just pushed, the only test failures I have left are related to the kernel version. There are a few different failures remaining in CI that I wonder if they're related to the VM shutdown issue (but I can see they're being shut down in the log?)

@zenmonkeykstop
Copy link
Contributor Author

I don't want to disuade people from pointing out things that can safely be removed, because I think there's a bunch, but I think we can defer them for follow-up PRs that will be trivial to merge?

Yup, I agree - I'd like to get a functional 4.2 main ASAP and work on fixes/major changes/tweaks, in that order.

@legoktm
Copy link
Member

legoktm commented Apr 4, 2024

We're down to two failing tests (excluding kernel version suffix):

INFO:[2024-04-04-00:02:31:790536] ======================================================================
INFO:[2024-04-04-00:02:31:790573] ERROR: test_all_fedora_vms_uptodate (test_vms_platform.SD_VM_Platform_Tests.test_all_fedora_vms_uptodate)
INFO:[2024-04-04-00:02:31:790610] Asserts that all Fedora-based templates, such as sys-net, have all
INFO:[2024-04-04-00:02:31:790647] ----------------------------------------------------------------------
INFO:[2024-04-04-00:02:31:790682] Traceback (most recent call last):
INFO:[2024-04-04-00:02:31:790718]   File "/home/user/securedrop-workstation/tests/test_vms_platform.py", line 174, in test_all_fedora_vms_uptodate
INFO:[2024-04-04-00:02:31:790752]     self._ensure_packages_up_to_date(vm, fedora=True)
INFO:[2024-04-04-00:02:31:790788]   File "/home/user/securedrop-workstation/tests/test_vms_platform.py", line 104, in _ensure_packages_up_to_date
INFO:[2024-04-04-00:02:31:790821]     stdout, stderr = vm.run(cmd)
INFO:[2024-04-04-00:02:31:790858]                      ^^^^^^^^^^^
INFO:[2024-04-04-00:02:31:790891]   File "/usr/lib/python3.11/site-packages/qubesadmin/vm/__init__.py", line 320, in run
INFO:[2024-04-04-00:02:31:790927]     raise e
INFO:[2024-04-04-00:02:31:790960]   File "/usr/lib/python3.11/site-packages/qubesadmin/vm/__init__.py", line 316, in run
INFO:[2024-04-04-00:02:31:790996]     return self.run_service_for_stdio('qubes.VMShell',
INFO:[2024-04-04-00:02:31:791029]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
INFO:[2024-04-04-00:02:31:791064]   File "/usr/lib/python3.11/site-packages/qubesadmin/vm/__init__.py", line 296, in run_service_for_stdio
INFO:[2024-04-04-00:02:31:791098]     raise exc
INFO:[2024-04-04-00:02:31:791135] subprocess.CalledProcessError: Command 'sudo dnf check-update' returned non-zero exit status 100.

and

INFO:[2024-04-04-00:02:31:792645] ======================================================================
INFO:[2024-04-04-00:02:31:792681] FAIL: test_all_sd_vm_apt_sources (test_vms_platform.SD_VM_Platform_Tests.test_all_sd_vm_apt_sources)
INFO:[2024-04-04-00:02:31:792716] Test all VMs fpf apt source list iteratively.
INFO:[2024-04-04-00:02:31:792753] ----------------------------------------------------------------------
INFO:[2024-04-04-00:02:31:792787] Traceback (most recent call last):
INFO:[2024-04-04-00:02:31:792823]   File "/home/user/securedrop-workstation/tests/test_vms_platform.py", line 221, in test_all_sd_vm_apt_sources
INFO:[2024-04-04-00:02:31:792857]     self._validate_apt_sources(vm)
INFO:[2024-04-04-00:02:31:792893]   File "/home/user/securedrop-workstation/tests/test_vms_platform.py", line 83, in _validate_apt_sources
INFO:[2024-04-04-00:02:31:792927]     self.assertTrue(self.apt_url in contents)
INFO:[2024-04-04-00:02:31:792963] AssertionError: False is not true

I pushed some better error messages to surface which VMs are at fault. I suspect it's sys-net, etc. that have pending Fedora updates because we're not restarting them (since they don't have the sd-workstation tag). I also don't know if we can restart them, because we need a network connection to communicate with the ws-ci-bastion. If this is the case I would suggest we skip these tests if we're in the CI environment and leave a fixme for a better solution.

@zenmonkeykstop
Copy link
Contributor Author

zenmonkeykstop commented Apr 4, 2024

One option (for the system template updates at least) might be to trigger the cli updater as a first/early step in the install process. For the pilot we told folk to manually update before kicking things off, but if we can do it for them so much the better.

@legoktm
Copy link
Member

legoktm commented Apr 4, 2024

Also the VM shutdown we did in the CI repo doesn't seem sufficient, I think we might need to properly wait until they're shutdown. So we can probably ship the current FIXME commit (with a different summary) and revert it once CI is fixed.

@legoktm legoktm force-pushed the 4_2-compat-deb12-base branch 3 times, most recently from 426d6c1 to 1e36c74 Compare April 5, 2024 03:19
@cfm cfm mentioned this pull request Apr 8, 2024
4 tasks
eaon and others added 13 commits April 9, 2024 10:20
Primarily changing the base dev containers to Fedora 37 and Python 3.11.
Suspecting upstream issues in rpm land is causing issues with 1 file's
modification time not being clamped correctly only in a reprotest
environment. However, there's no actual issue with reproducible builds,
the packaging correctly clamps times.
Note that it uses the `fedora-XX-xfce` naming pattern now.
We run under "set -e", which will abort the entire script when
"qvm-check --running foo" returns nonzero when foo isn't running.  Here
we simplify the logic and wrap it in a function to enter with "set +e"
and exit reinstating "set -e".
This is needed for CI/tests to pass, a number of dependencies were
too old.

For the launcher, remove black (the root-installed black already
formats the launcher) and gitpython (no longer needed by bandit).
This is needed to make workstation CI happy.
Instead of shipping our own template, we now pull the debian-12-minimal template,
clone it, install some base packages, then clone it again into the small and
large variants.

Take the opportunity to switch the kernel to use PVH instead of HVM.

While people should only be using fresh Qubes installs for SDW, a check is added
to ensure there are no other VMs using the debian-12-minimal template, since we
need it to be pristine.
Just verify the new policy files are installed, don't do any verification
of contents yet.
While we do recommend a reboot, just preemptively shutdown all of
our VMs after provisioning to ensure anything that follows (e.g. CI
runs) will be from a cleaner state.

We tried to implement this in workstation-ci itself (<freedomofpress/securedrop-workstation-ci#44>),
but it isn't working as we desired, so we'll do this for now and revert
once CI is fixed.
@zenmonkeykstop zenmonkeykstop force-pushed the 4_2-compat-deb12-base branch from 1e36c74 to 941f051 Compare April 9, 2024 14:20
@legoktm
Copy link
Member

legoktm commented Apr 9, 2024

I'll look into the rpm reproducibility issue tomorrow

I forgot to report back that I did investigate the reproducibility issue - the rpms are reproducible except when built under reprotest (I suspect this is some libfaketime interaction gone wrong). I've updated the commit/comments to reflect this and don't consider it a blocker to moving forwards.

Copy link
Member

@cfm cfm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to stress-test things, uninstalling an existing installation on Qubes 4.2 from 4d6ca1a...5873c63 and then installing from here:

  • Provision dev env via make clone, make dev, verify that the latter completes successfully
[user@dom0 ~] sdw-admin --uninstall
[user@dom0 ~] cd securedrop-workstation
[user@dom0 securedrop-workstation] make clone
[user@dom0 securedrop-workstation] make dev

Failed because the signing key has changed per #972. Apply that locally, then:

[user@dom0 securedrop-workstation] sdw-admin --uninstall  # our templates deleted: ✓
[user@dom0 securedrop-workstation] make clone
[user@dom0 securedrop-workstation] make dev
  • reboot

Updates, and reboot again.

  • Verify that make test passes

Exceptions:

@@ -17,7 +17,7 @@
{% if d.environment == "dev" %}
# use apt-test and nightlies
{% set sdvars = sdvars_defaults["test"] %}
{% set _ = sdvars.update({"component": "nightlies"}) %}
{% set _ = sdvars.update({"component": "main nightlies"}) %}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This breaks:

else:
self.apt_url = FPF_APT_TEST_SOURCES.format(dist=dist, component="nightlies")

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yupp - to fix, we can either update that test or push the kernels+extras into the nightlies channel as well. (they would not need to be built nightly, just pushed when updated). I think the latter is the better option as we want dev instances to always be on nightly deb packages.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the kernel string test failure will be addressed with next 6.6 kernels, @legoktm already fixed that in a PR pending to kernel-builder.

Copy link
Contributor Author

@zenmonkeykstop zenmonkeykstop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@legoktm's changes LGTM. I can't approve having created the PR tho, someone else will need to.

with io.open(filepath, "r") as f:
data = yaml.safe_load(f)
return data
def test_policies(self):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good first step IMO

@@ -47,3 +47,8 @@ qvm-check --quiet sd-fedora-39-dvm 2> /dev/null && \
sudo qubesctl --show-output --skip-dom0 --targets sd-fedora-39-dvm state.highstate && \
qvm-shutdown --wait sys-usb && qvm-start sys-usb || \
sudo qubesctl --show-output --skip-dom0 --targets sys-usb state.highstate

# Shut down all VMs to ensure new configuration takes place, if the user doesn't reboot.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we actually need a reboot, we could probably change this to a user Y/N prompt - "The system must be rebooted to complete the installation. Reboot now? Y/N" or similar. If we don't actually need a reboot then kicking the VMs is fine.

Copy link
Member

@cfm cfm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving based on all testing and review so far, per #970 (review).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

Drop securedrop-workstation-bullseye base template, use a debian-12 system template
7 participants