Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sssd should run under unprivileged user #3412

Closed
sssd-bot opened this issue May 2, 2020 · 25 comments
Closed

sssd should run under unprivileged user #3412

sssd-bot opened this issue May 2, 2020 · 25 comments
Assignees
Labels
Bugzilla Closed: Fixed Issue was closed as fixed.

Comments

@sssd-bot
Copy link

sssd-bot commented May 2, 2020

Cloned from Pagure issue: https://pagure.io/SSSD/sssd/issue/2370


Ticket was cloned from Red Hat Bugzilla (product Red Hat Enterprise Linux 7): Bug 1113783

Please note that this Bug is private and may not be accessible as it contains confidential Red Hat customer information.

Description of problem:
sssd is currently running as root and should be reimplemented to run under an
unprivileged user. 

Version-Release number of selected component (if applicable):
sssd-1.11.2-65

How reproducible:
100%

Steps to Reproduce:
1. systemctl sssd start
2. ps axZ | grep sssd

Actual results:
sssd and all children running as root

Expected results:
sssd and all children running under unprivileged user.

Additional info:

Comments


Comment from dpal at 2014-07-03 16:02:10

Fields changed

blockedby: =>
blocking: =>
changelog: =>
coverity: =>
design: =>
design_review: => 0
feature_milestone: =>
fedora_test_page: =>
milestone: NEEDS_TRIAGE => SSSD 1.12.1
review: True => 0
selected: =>
testsupdated: => 0


Comment from jhrozek at 2014-07-08 10:11:36

Fields changed

priority: major => blocker


Comment from jhrozek at 2014-07-29 17:23:19

Fields changed

design: => https://fedorahosted.org/sssd/wiki/DesignDocs/NotRootSSSD


Comment from jhrozek at 2014-07-31 13:32:11

Fields changed

owner: somebody => jhrozek
status: new => assigned


Comment from jhrozek at 2014-09-08 20:08:41

Mass-moving all tickets that didn't make 1.12.1 into 1.12.2

milestone: SSSD 1.12.1 => SSSD 1.12.2


Comment from jhrozek at 2014-09-30 19:06:10

We need to do a release as requested by downstream. Moving tickets that are not fixed already or very close to acking to 1.12.3

milestone: SSSD 1.12.2 => SSSD 1.12.3


Comment from jhrozek at 2014-10-20 21:50:00

First round of patches:

mark: => 0


Comment from jhrozek at 2014-10-30 12:12:36

Fields changed

patch: 0 => 1


Comment from jhrozek at 2014-11-05 20:12:26

ldap_child and krb5_child changes:
- f3a2594
- 77b1337
- 06f10b2
- 9369407
- 5eef3da
- 0348c74
- 45414c1


Comment from jhrozek at 2014-11-18 20:44:12

More Kerberos changes:


Comment from jhrozek at 2014-11-18 20:55:22

Build fixes:
- d167039
- 0a039d5
- f9ac9aa


Comment from jhrozek at 2014-11-18 20:58:03

sssd_be privilege drop patch:


Comment from jhrozek at 2014-11-25 20:02:18

Most of the work is done, so I'm closing this ticket. There are some additional enhancements tracked by individual tickets. See e.g. the design page for more details.

resolution: => fixed
status: assigned => closed


Comment from jhrozek at 2017-02-24 14:35:03

Metadata Update from @jhrozek:

@sssd-bot sssd-bot added Bugzilla Closed: Fixed Issue was closed as fixed. labels May 2, 2020
@sssd-bot sssd-bot closed this as completed May 2, 2020
@cgwalters
Copy link
Contributor

Copy-pasting from coreos/coreos-assembler#2046 (comment)

suid binaries are really just a dangerous idea in general. They're a huge attack surface. For this use case instead, the monitor process can expose an internal API (e.g. over a socketpair) that runs something privileged and passes its output to the requesting unprivileged process.

@adelton
Copy link

adelton commented Feb 17, 2021

I guess some description of the role and architecture of the monitor process would be useful.

Is it there to gather and provide some data from the individual SSSD processes (in which case, does it need to be run as root at all?) or is it there to for example watch for failed SSSD processes and restart them (in which case, systemd which already runs as root and with full capabilities maybe already provides some mechanism to do that)?

@alexey-tikhonov
Copy link
Member

I guess some description of the role and architecture of the monitor process would be useful.

Is it there to gather and provide some data from the individual SSSD processes (in which case, does it need to be run as root at all?) or is it there to for example watch for failed SSSD processes and restart them (in which case, systemd which already runs as root and with full capabilities maybe already provides some mechanism to do that)?

Mostly the latter.

And yes, this is a long standing idea to get rid of monitor and rely on systemd entirely.

So from my point of view, extending monitor is not a way to get rid of suid bits form *_child helpers.

@cgwalters
Copy link
Contributor

So from my point of view, extending monitor is not a way to get rid of suid bits form *_child helpers.

The way I think of this is: Yes you're running more code as non-root, but by adding suid binaries, you have increased the risk to the system as a whole.

A separate privileged process that communicates over e.g. private socketpair() with other children is a well-used Unix design pattern. It doesn't actually mean that process has to be the parent (or monitor) of the other processes.

In other words, you can run multiple systemd units - one of which is e.g. a socket-activated sssd-priv-helper.service that exposes a Unix domain socket that you authenticate by e.g. validating the uid of the connecting process is sssd.

@alexey-tikhonov
Copy link
Member

The way I think of this is: Yes you're running more code as non-root, but by adding suid binaries, you have increased the risk to the system as a whole.

Might be. Overall `SSSD running as non-root user" isn't "production ready" yet, that's clear.

I think we will fix attrs (i.e. get rid of suid) for a case where sssd is built with --with-sssd-user=root (this is Fedora for example) shortly.

Proper solution is more involved though.

@cgwalters
Copy link
Contributor

@alexey-tikhonov
Copy link
Member

I think we will fix attrs (i.e. get rid of suid) for a case where sssd is built with --with-sssd-user=root (this is Fedora for example) shortly.

#5510

@alexey-tikhonov
Copy link
Member

So from my point of view, extending monitor is not a way to get rid of suid bits form *_child helpers.

The way I think of this is: Yes you're running more code as non-root, but by adding suid binaries, you have increased the risk to the system as a whole.

JFTR:

  • Fedora is going to be built '--with-sssd-user=sssd' starting, probably, F-41 (sssd-2.10)
  • SUID bit was replaced with fine grained file capabilities:
    %global child_capabilities cap_chown,cap_dac_override,cap_setuid,cap_setgid=ep
  • binaries in question are:
-rwxr-x---. 1 root sssd 141K Jan 21  2021 krb5_child
-rwxr-x---. 1 root sssd  52K Jan 21  2021 ldap_child
-rwxr-x---. 1 root sssd  32K Jan 21  2021 selinux_child

-- so executable only by 'root' and service user thus I don't think it "increases the risk to the system as a whole".

@cgwalters
Copy link
Contributor

-- so executable only by 'root'

Yes, that's a significant mitigation indeed. But still, I feel pretty confident will show up in scans/audits and will cause you more pain down the line than the "privileged monitor process" approach or another similar one.

Maybe e.g. running with PrivateTmp= or so, and have a (privileged) ExecStartPre=- run code that copies the binaries into that namespace, and only makes them suid there.

BTW if you're doing new work to run unprivileged I highly recommend using DynamicUser=yes instead of having a fixed uid done via e.g. %post useradd or so. It has numerous advantages, including working much better with image-based update systems and avoiding uid/gid drift in those scenarios.

@adelton
Copy link

adelton commented Mar 22, 2024

BTW if you're doing new work to run unprivileged I highly recommend using DynamicUser=yes instead of having a fixed uid done via e.g. %post useradd or so. It has numerous advantages, including working much better with image-based update systems and avoiding uid/gid drift in those scenarios.

I'm of quite the opposite opinion. By using DynamicUser=yes, what you get is uncertainty and the complexity of chowns logic possibly needed upon startup.

Can you elaborate what the benefits for the image-based systems (or services running in containers) would be exactly?

@alexey-tikhonov
Copy link
Member

-- so executable only by 'root'

Yes, that's a significant mitigation indeed. But still, I feel pretty confident will show up in scans/audits and will cause you more pain down the line

Do you have an example where this could really increase risk / attack surface?
It looks for me as "running privileged process all the time" vs "running privileged helper occasionally (by restricted users only)".
So far I don't understand how the latter could be worse.

Additional note: a set of file capabilities (that replaced full blown SUID bit) is going to be restricted further, and other measures to constrain helpers will be considered (like selinux policies).

Maybe e.g. running with PrivateTmp= or so, and have a (privileged) ExecStartPre=- run code that copies the binaries into that namespace, and only makes them suid there.

Interesting option.
But if this creates a new namespace, then will privileged helper run within this namespace have privileges in the host namespace? I'm not sure what is "file system namespace".

@cgwalters
Copy link
Contributor

I'm of quite the opposite opinion. By using DynamicUser=yes, what you get is uncertainty and the complexity of chowns logic possibly needed upon startup.

What is uncertain?

the complexity of chowns logic possibly needed upon startup.

It's pretty well optimized.

The thing is...ah yes, I see it is a floating user/group:
https://src.fedoraproject.org/rpms/sssd/blob/rawhide/f/sssd.spec#_992

For image based updates it is really, really helpful to minimize the number of dynamic/floating users that are part of lower levels in the OS.

Please please consider using DynamicUser=yes as it seems nearly ideal for this use case as sssd is just a cache.

@alexey-tikhonov
Copy link
Member

The thing is...ah yes, I see it is a floating user/group:

JFTR, it will use %sysusers_create_compat starting f-41. But that's probably doesn't change much.

Please please consider using DynamicUser=yes as it seems nearly ideal for this use case as sssd is just a cache.

And where this cache should be kept / what should be ownership and ACL? So that it would be persistent between service restarts / machine reboots?

@cgwalters
Copy link
Contributor

And where this cache should be kept / what should be ownership and ACL?

Use StateDirectory=sssd in the unit - when used in combination with DynamicUser=yes, this ensures ownership is handled, and it persists across reboots as /var does in general.

@travier
Copy link
Contributor

travier commented Mar 25, 2024

JFTR, it will use %sysusers_create_compat starting f-41. But that's probably doesn't change much.

Yes please 🙂 https://docs.fedoraproject.org/en-US/packaging-guidelines/UsersAndGroups/#_allocation_strategies

@travier
Copy link
Contributor

travier commented Mar 25, 2024

%global child_capabilities cap_chown,cap_dac_override,cap_setuid,cap_setgid=ep

That's a bit better but that's still equivalent to full root in most cases: https://grsecurity.net/false_boundaries_and_arbitrary_code_execution

Are any users expected to be part of the sssd group or is it only used for the daemon?

@alexey-tikhonov
Copy link
Member

alexey-tikhonov commented Mar 25, 2024

%global child_capabilities cap_chown,cap_dac_override,cap_setuid,cap_setgid=ep

That's a bit better but that's still equivalent to full root in most cases: https://grsecurity.net/false_boundaries_and_arbitrary_code_execution

But should be better than to run entire daemon unrestricted all the time (as it's the case now).
And as I mentioned, set will be restricted further.

Are any users expected to be part of the sssd group or is it only used for the daemon?

The latter.

@cgwalters
Copy link
Contributor

I'm of quite the opposite opinion. By using DynamicUser=yes, what you get is uncertainty and the complexity of chowns logic possibly needed upon startup.

Sorry to really belabor this but given sssd's critical architectural role a central point around user management...what is done here partially sets a "tone" and example for overall system architecture around what other services should do. If you have concerns about DynamicUser=yes - please be concrete.

Tangentially related to this I just added a large blurb on this over in https://containers.github.io/bootc/building/users-and-groups.html and it touches on the fact that the "classic RPM %pre floating allocation" pattern will generally cause uid/gid drift for image based systems unless it's manually fixated. DynamicUser=yes is again just significantly better from that PoV.

@adelton
Copy link

adelton commented Mar 25, 2024

SSSD is (among other functions) a name service so it may be a source of uids for processes. As such it should be very certain about its own identity and uid clashes should be ruled out. The fact that the daemon with DynamicUser=yes can run as the same uid as regular user processes compromises the whole uid resolution, and it is a nightmare from the auditing point of view. Systemd cannot prevent the daemon from starting as such uid because at the time of the daemon start, the remote uids are not yet being resolved by SSSD so systemd has no way of knowing that there might be conflicting uids.

I dread the idea of a system where the uid ranges are not reasonably separated / isolated. Ability to audit operations and their authors is one of the strict SFRs of Common Criteria certification, among others.

The DynamicUser=yes explicitly uses uids in the range 61184–65519, so above the SYS_UID_MAX logic of login.defs. To me that is a clear indication that this mechanism should not be used for system services and system daemons.

I simply fail to see what value DynamicUser=yes provides to SSSD, or in general to daemons packaged in OSes. I don't see what benefit this would bring over the soft static allocation (https://docs.fedoraproject.org/en-US/packaging-guidelines/UsersAndGroups/#_soft_static_allocation). Just pick a uid a go with that. No drifts, not issues upon upgrades, no chowns. Uid from a range designated for system accounts.

For example, Apache's uid 48 has been stable for ages on rpm-based systems. I don't see a reason why the same couldn't be done for SSSD.

By the way, it seems FPC has already acked the soft static allocation for SSSD in https://pagure.io/packaging-committee/issue/570 -- we just never followed up with the actual allocation in uidgid.

@cgwalters
Copy link
Contributor

The fact that the daemon with DynamicUser=yes can run as the same uid as regular user processes compromises the whole uid resolution,

One of two things is true: DynamicUser=yes works or it doesn't. If you're arguing that there are issues, then either those issues need to be fixed, or it can't be used in general (right?).

so systemd has no way of knowing that there might be conflicting uids. [...] The DynamicUser=yes explicitly uses uids in the range 61184–65519, so above the SYS_UID_MAX logic of login.defs. To me that is a clear indication that this mechanism should not be used for system services and system daemons.

I think the rationale for that is exactly that it's distinct from system allocations and human users. What else would it do?

I simply fail to see what value DynamicUser=yes provides to SSSD, or in general to daemons packaged in OSes. I don't see what benefit this would bring over the soft static allocation

It doesn't scale to have a centralized registry for uids, and there's only so much space under 1000. (It also doesn't scale to have a centralized registry for software, hence containers...)

For example, Apache's uid 48 has been stable for ages on rpm-based systems. I don't see a reason why the same couldn't be done for SSSD. By the way, it seems FPC has already acked the soft static allocation for SSSD

Yes, that's obviously fine, I won't argue against it. But still, there will be a big need to have something like DynamicUser=yes in the general case because of the centralization problem above. So if you have concerns about potential overlaps or other design issues, I don't see a way around fixing that in general.

@adelton
Copy link

adelton commented Mar 26, 2024

One of two things is true: DynamicUser=yes works or it doesn't. If you're arguing that there are issues, then either those issues need to be fixed, or it can't be used in general (right?).

Why do you view it as black or white? The DynamicUser=yes feature likely works as described. To me it seems like a quick way to un-privilege your services. (I wanted to say quick'n'dirty but I didn't want to start another round of justifying my words -- this is not a review of the DynamicUser=yes feature. :-) )

That does not mean everything should be migrated to it en masse. Even https://fedoraproject.org/wiki/Changes/SystemdSecurityHardening does not tout it as a one-size-fits-all security mechanism:

We have to choose between DynamicUser=yes or User if either is feasible for the service to use.

Different types of workloads and services have different security ramifications. Risks that you might tolerate with some user service with no impact/side-effect on the rest of the system is vastly different from risks that you can tolerate from a daemon that is a basis of identity, authentication, and authorization in the system. Note I also have a problem with classifying SSSD as "just a cache".

I think the rationale for that is exactly that it's distinct from system allocations and human users. What else would it do?

I don't necessarily blame it. It does not mean I want to see SSSD use it.

It doesn't scale to have a centralized registry for uids, and there's only so much space under 1000. (It also doesn't scale to have a centralized registry for software, hence containers...)

I've heard the argument about 1000 for many years. And yet there are still tons of gaps in uidgid. The current soft static allocation process actually seems like a fairly reasonable gatekeeper.

Yes, that's obviously fine, I won't argue against it. But still, there will be a big need to have something like DynamicUser=yes in the general case because of the centralization problem above. So if you have concerns about potential overlaps or other design issues, I don't see a way around fixing that in general.

There likely isn't, there are inherent risks and uids are hard. (And even if you get uids in sync for example across systems, your subuids might be out of sync -- just recently we had a containerized FreeIPA user/admin hit that.)

If we are talking in general, the fact that systemd does not seem to leave any auditing trail about the uid it used for the DynamicUser=yes service is scary and should be No. 1 thing to address. I wouldn't like to be in a position to explain to evaluators why we have no idea where a given process and its uid came from or why there are chown operations from one "random" uid to another audited.

@cgwalters
Copy link
Contributor

Just going to re-bump this point: #3412 (comment)

I still think sssd adding new setcaps binaries is the wrong direction. There's many viable patterns for dropping privileges without adding new setcap/setuid binaries.

What I'd really like to be able to enable is people running from pid 1 on down with NoNewPrivileges=yes - things like run0 are a big help towards that. Lennart (systemd) I believe agrees.

I have not tried to audit the sssd setuid binaries so I'm not claiming there is actually any new vulnerabilities.

But this is also about setting the long term direction for the base OS, and because sssd is part of that the behavior chosen here also acts as a reference point, and I think it's not a good reference.

@alexey-tikhonov
Copy link
Member

alexey-tikhonov commented Jan 17, 2025

I still think sssd adding new setcaps binaries is the wrong direction.

For some systems, like RHEL, that's not an addition but a replacement of a full blown suid binaries with fine grained capabilities.

There's many viable patterns for dropping privileges without adding new setcap/setuid binaries.

So far every alternative boils down to "have a privileged process [running all the time or socket activated]" and communicate with it from unprivileged code via some kind of IPC.
But this is exactly what SSSD is doing.
The difference is a way privileged process gets its privileges: suid, file capabilities or started privileged by another privileged software (like polkit or systemd).

What I'd really like to be able to enable is people running from pid 1 on down with NoNewPrivileges=yes - things like run0 are a big help towards that. Lennart (systemd) I believe agrees.

We do cater as well and we do care as well of systems that do not run systemd.

It would be nice to support polkit instead of file capabilities via ./configure switch but that's quite of work for very low benefit.

People who would like to run entire user space with NoNewPrivileges=yes most probably don't need/use SSSD.
SSSD is about support of centrally managed user identities, including privileges of those identities via sudo/hbac/etc rules.
NoNewPrivileges=yes means sudo-less system. IIUC, this means "network" (LDAP) admins can't be local admins?

And while Lennart writes "which think that things like pluggable client-side modules, LDAP and so on are actually a good thing, even though I would vehemently disagree with that" that doesn't mean nobody needs it.

I have not tried to audit the sssd setuid binaries so I'm not claiming there is actually any new vulnerabilities.

One needs to package SSSD carefully, making sure random user can't run privileged SSSD binaries.

Once this condition is met, one can reasonably claim there are no new vulnerabilities in the sense that having credentials of 'sssd' service user means to be able to write SSSD cache and this means to be able to craft identities, rewrite password hash, create sudo rules, etc. I.e. ability to run SSSD privileged helpers means system is already severely compromised.

This is very different to a situation of empowered binaries that can be run by unprivileged users.

@alexey-tikhonov
Copy link
Member

alexey-tikhonov commented Jan 17, 2025

It would be nice to support polkit instead of file capabilities via ./configure switch but that's quite of work for very low benefit.

I feel I need to elaborate on this.

What I mean is the following:
if

  • a program runs with elevated privileges
  • and there is a bug in this program that allows privilege escalation for otherwise unprivileged user

then the way a program received privileged capabilities in run time - file capabilities or polkit - doesn't matter that much.

If, due to a hypothetical bug, 'krb5_child' mistakenly authenticates an user that didn't actually provide valid credential, this will end up in a system compromise doesn't matter how 'krb5_child' is run (even if fully unprivileged, simply because it plugs into PAM stack).

What I read about 'sudo vs run0' boils down to:

  • 'sudo' is unsafe because it is started in uncontrolled environment.

But that's just not a case with SSSD because unprivileged users can't start its privileged binaries.

That's why I think there is a little benefit in switching SSSD from file capabilities to polkit.
It is not at all obvious by how much attack surface will decrease thanks to this move. It might well be other way around due to more complicated code.

@alexey-tikhonov
Copy link
Member

JFTR, the current state (as of sssd-2.10.1) is:

krb5_child cap_dac_read_search,cap_setgid,cap_setuid=p
ldap_child cap_dac_read_search=p
selinux_child cap_setgid,cap_setuid=p
sssd_pam cap_dac_read_search=p
  • cap_dac_read_search is there:
    (1) mostly to allow access to a host keytab
    (2) to check for existing FILE:/DIR: ccache of a user being authenticated
    So if host keys are stored in a file readable by 'sssd' user or group (via regular DAC) and support of old FILE:/DIR: ccache types isn't required, then one can strip cap_dac_read_search from all SSSD binaries right away.

  • if one doesn't store users selinux policy in IPA, then selinux_child isn't required at all (or its caps can be stripped)

  • this leaves 'cap_setgid,cap_setuid=p' for 'krb5_child'. This is used to impersonate an user while storing TGT acquired on their behalf during authentication. There is no trivial solution to avoid this. One option was mentioned here. I also consider addition of an option to skip storing TGT altogether, then if this option is used then cap_set*id can be stripped as well.

(I understand this doesn't help "from pid 1 on down with NoNewPrivileges=yes" goal, so more JFTR)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bugzilla Closed: Fixed Issue was closed as fixed.
Projects
None yet
Development

No branches or pull requests

6 participants