
docs: new RFC on always-available users #44134

Open · knz wants to merge 1 commit into master from 20200119-users-rfc
Conversation

@knz (Contributor) commented Jan 20, 2020

This work enables the creation of multiple accounts besides `root` that have the same special properties: the ability to log in, monitor, and troubleshoot a cluster when the system ranges are unavailable.

Link to RFC

Release note: None

@knz requested a review from a team as a code owner on January 20, 2020 10:23

@knz force-pushed the 20200119-users-rfc branch 6 times, most recently from a931e7e to 374ba55 on January 20, 2020 10:45
@knz force-pushed the 20200119-users-rfc branch from 374ba55 to eb1b55d on January 20, 2020 10:48
@aaron-crl commented
I like this approach because it extracts functionality from the root account and prepares for more granular and accountable access for other customer defined accounts.

@knz (Contributor, Author) commented Jan 24, 2020

Remark from @tbg: de-emphasize the use of gossip. It's possible to achieve an eventually consistent RAM-only copy of a table by other means than gossip.

@tbg removed the request for review from a team on June 25, 2020 09:50
@bdarnell (Contributor) left a comment

Overall I like the approach here. There are a lot of reasons to get the KV ops out of the critical path, and I think gossip is the right alternative. I'm not concerned about the potential for propagation delays, although this is something we should validate with our more security-sensitive customers.

> de-emphasize the use of gossip. It's possible to achieve an eventually consistent RAM-only copy of a table by other means than gossip.

It's possible, but why? Gossip is a pretty good way to maintain this kind of cache, and it seems to be reasonably robust in practice. What advantages would the hypothetical alternative have?
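
To make the comparison concrete, here is a minimal Go sketch of the RAM-only cache under discussion. It does not use CockroachDB's actual gossip API; the point is that from the cache's perspective the transport only has to deliver update callbacks, whether that transport is gossip or something else:

```go
package usercache

import "sync"

// UserInfo is the denormalized record the RFC proposes to keep in RAM:
// enough to authenticate and authorize without touching the KV layer.
type UserInfo struct {
	Username       string
	HashedPassword []byte
	Roles          []string
}

// Cache is an eventually consistent, RAM-only copy of the always-available
// user records. The propagation layer only needs to deliver "this user
// changed" notifications; whether that is gossip or some other mechanism
// is invisible to the cache itself.
type Cache struct {
	mu    sync.RWMutex
	users map[string]UserInfo
}

func NewCache() *Cache {
	return &Cache{users: make(map[string]UserInfo)}
}

// Apply is the callback invoked by the propagation layer (e.g. a gossip
// callback) whenever a cached user record is added or updated.
func (c *Cache) Apply(u UserInfo) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.users[u.Username] = u
}

// Lookup serves authentication without any KV read; the result may be
// slightly stale, which is the trade-off discussed in this thread.
func (c *Cache) Lookup(username string) (UserInfo, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	u, ok := c.users[username]
	return u, ok
}
```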



docs/RFCS/20200119_always_available_users.md, line 45 at r1 (raw file):

> authorize a regular user includes:
>
> - for authentication: `system.namespace`, `system.users`,

The system.namespace table is not addressed in this document (it used to be gossiped, but now it has been moved out of the system config span).

Do we need to add the role_options table here to bring it up to date with current CRDB?


docs/RFCS/20200119_always_available_users.md, line 83 at r1 (raw file):

> Previously, only `root` could access and monitor an
> unavailable cluster. With this change enabled, all users
> with the `admin` role can access and monitor when the cluster is

With the admin role being broken out into multiple permissions, I don't think the always-available capability should be tied to the admin role specifically. A read-only user with enough permissions to view the console would be a good candidate for this treatment. Maybe this should be a permission that could be granted to any user (and would be included in the bundle of permissions for the admin role).


docs/RFCS/20200119_always_available_users.md, line 95 at r1 (raw file):

> The RFC also works under the following assumptions:
>
> - regular user accounts can be numerous (tens of thousands).

There could be huge numbers of users, but this is rare. For users with more modest needs, I think it could be desirable to mark all their users as cached. Maybe we should facilitate this by marking the first N users created as cached (although that sets up a bit of a delayed surprise).

Note that to support truly unbounded numbers of users we'd need to move the system.users table out of the system config span.


docs/RFCS/20200119_always_available_users.md, line 98 at r1 (raw file):

> - only few (hundreds or less) of these accounts are meant
>   to access and monitor/operate an unavailable cluster.
> - small propagation delays in updates to user accounts and

One of the reasons we do a KV read to the system.users table on every login is because of an assumption when this was implemented that small delays would not be tolerated - dropping a user or revoking a grant needed to be effectively synchronous.

I've always been skeptical of this assumption, but we should validate which of these assumptions is more valid. And if propagation delays are tolerated, we should be able to do more caching of user lookups even for users that are not covered by this always-available mechanism to reduce load on the system ranges.

Another possible answer is that some propagation delay can be tolerated, but there should be an upper bound. (I've seen users get nervous about the lack of formal guarantees around gossip, even though it's usually quite fast). This would argue for some sort of time-limited lease, although the lease duration would need to be quite large to be useful for debugging struggling clusters.
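
A rough sketch of the time-limited-lease variant (the type names and the explicit `now` parameter are illustrative assumptions, not a concrete proposal): each cached credential carries an expiry, so a dropped user or revoked grant stops working no later than the lease duration, even while the node cannot reach the system ranges.

```go
package usercache

import (
	"sync"
	"time"
)

// leasedEntry is a cached credential that is trusted only until its lease
// expires; after that the node must re-read system.users (or refuse login).
type leasedEntry struct {
	hashedPassword []byte
	leaseExpiry    time.Time
}

// LeasedCache bounds how stale a cached credential can get: dropping a user
// or revoking a grant takes effect no later than the lease duration.
type LeasedCache struct {
	mu            sync.RWMutex
	entries       map[string]leasedEntry
	leaseDuration time.Duration // would need to be large enough to debug a struggling cluster
}

func NewLeasedCache(leaseDuration time.Duration) *LeasedCache {
	return &LeasedCache{
		entries:       make(map[string]leasedEntry),
		leaseDuration: leaseDuration,
	}
}

// Refresh extends the lease; it is called whenever the node successfully
// reads the user from the system ranges.
func (c *LeasedCache) Refresh(username string, hashedPassword []byte) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.entries[username] = leasedEntry{
		hashedPassword: hashedPassword,
		leaseExpiry:    time.Now().Add(c.leaseDuration),
	}
}

// Lookup returns the cached credential only while the lease is still valid.
func (c *LeasedCache) Lookup(username string, now time.Time) ([]byte, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	e, ok := c.entries[username]
	if !ok || now.After(e.leaseExpiry) {
		return nil, false
	}
	return e.hashedPassword, true
}
```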


docs/RFCS/20200119_always_available_users.md, line 113 at r1 (raw file):

>   username           STRING NOT NULL,
>   hashedPassword     STRING,
>   roles              []STRING NOT NULL,

Note that the system.users table is already gossiped via the system config range. So really what we're doing here is denormalizing the role membership into a users table for a subset of users.

Should we just do that into the existing users table for everyone? This would be more expensive for clusters with many users, but the fact that the users table is already gossiped implies an effective limit on the number of users already.


docs/RFCS/20200119_always_available_users.md, line 165 at r1 (raw file):

> would remain blocked.
>
> The RFC is considering either of the following three solutions.

Another option would be to encode the necessary contents of the web_sessions record into the cookie itself (signed and possibly encrypted) so it would be self-contained.

Again, the original reason for the KV read on every access was to ensure that revocation of a session token would be immediate. As long as we're OK with a possible delay in the effect of revocation, a self-contained session cookie would let us handle this by gossiping only revoked tokens, or just failing open if the KV read times out.
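
For illustration, a self-contained signed cookie could look roughly like the sketch below. The payload fields and the HMAC scheme are assumptions for the sketch, not what CockroachDB implements today; validation needs no KV read, and revocation before expiry would have to come from elsewhere, e.g. a gossiped set of revoked session IDs.

```go
package session

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"encoding/json"
	"errors"
	"strings"
	"time"
)

// Payload is the session state that would otherwise live in system.web_sessions.
type Payload struct {
	SessionID int64     `json:"sid"`
	Username  string    `json:"user"`
	ExpiresAt time.Time `json:"exp"`
}

func sign(secret, msg []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write(msg)
	return base64.RawURLEncoding.EncodeToString(mac.Sum(nil))
}

// Encode produces a self-contained cookie value: base64(payload) + "." + HMAC.
func Encode(secret []byte, p Payload) (string, error) {
	body, err := json.Marshal(p)
	if err != nil {
		return "", err
	}
	encoded := base64.RawURLEncoding.EncodeToString(body)
	return encoded + "." + sign(secret, []byte(encoded)), nil
}

// Decode verifies the signature and expiry without any KV read. Revoking a
// session before it expires would need an extra check against, e.g., a
// gossiped set of revoked session IDs.
func Decode(secret []byte, cookie string, now time.Time) (Payload, error) {
	var p Payload
	parts := strings.SplitN(cookie, ".", 2)
	if len(parts) != 2 {
		return p, errors.New("malformed cookie")
	}
	if !hmac.Equal([]byte(sign(secret, []byte(parts[0]))), []byte(parts[1])) {
		return p, errors.New("bad signature")
	}
	body, err := base64.RawURLEncoding.DecodeString(parts[0])
	if err != nil {
		return p, err
	}
	if err := json.Unmarshal(body, &p); err != nil {
		return p, err
	}
	if now.After(p.ExpiresAt) {
		return p, errors.New("session expired")
	}
	return p, nil
}
```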


docs/RFCS/20200119_always_available_users.md, line 187 at r1 (raw file):

> before the stored table.
>
> The difference between 1.1 and 1.2 is that 1.1 would make the Admin UI

1.2 also doesn't have any cache invalidation - without updates via gossip, a revoked session token would still be usable until the cache times out.


docs/RFCS/20200119_always_available_users.md, line 204 at r1 (raw file):

> on the admin UI (or active within the same one-hour period).
>
> Observing that web tokens are only valid for one hour each, we can

Where does this one hour value come from? server.web_session_timeout defaults to one week. I would not assume that we can afford to cache the entire web sessions table.


docs/RFCS/20200119_always_available_users.md, line 245 at r1 (raw file):

> table nor RAM caches.
>
> However, any deployment of TLS certificates needs a *revocation

We now support revocation of TLS certificates via OCSP. So this is an option for sophisticated users who run their own CA and OCSP server.

We don't provide such tools out of the box, but I'm also not convinced that they're necessary - if "any deployment of TLS certs" needed it, we shouldn't be using client certs at all without providing it. I think we need to solve the authorization problem for TLS certs to be a possible solution here, but I'd be OK with the status quo for revocation.

From a UX perspective, though, I don't think it's a good idea to have TLS certs as the emergency-access path for clusters that primarily operate by passwords. So I agree with the overall decision to focus on a more highly-available users table instead of moving things out of band via certs.

@knz (Contributor, Author) left a comment


docs/RFCS/20200119_always_available_users.md, line 95 at r1 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

> There could be huge numbers of users, but this is rare. For users with more modest needs, I think it could be desirable to mark all their users as cached. Maybe we should facilitate this by marking the first N users created as cached (although that sets up a bit of a delayed surprise).
>
> Note that to support truly unbounded numbers of users we'd need to move the system.users table out of the system config span.

I have thought a little more about this, and I'd like to propose enabling this feature per account or role via a new role option, CACHED. A background process on each node would then periodically refresh the cache from system.users by picking up the entries that have this attribute set.
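
A minimal sketch of such a per-node refresh loop (the `CACHED` role option does not exist today, and the query against `system.role_options` is illustrative only):

```go
package usercache

import (
	"context"
	"database/sql"
	"time"
)

// refreshCachedUsers periodically re-reads the accounts that carry the
// (hypothetical) CACHED role option and hands the fresh snapshot to apply(),
// which would swap it into the node's in-RAM cache.
func refreshCachedUsers(ctx context.Context, db *sql.DB, interval time.Duration,
	apply func(map[string][]byte)) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			// Hypothetical query: the CACHED role option and the join against
			// system.role_options are illustrative, not existing syntax.
			rows, err := db.QueryContext(ctx, `
				SELECT u.username, u."hashedPassword"
				FROM system.users AS u
				JOIN system.role_options AS o ON o.username = u.username
				WHERE o.option = 'CACHED'`)
			if err != nil {
				// The system ranges may be unavailable; keep serving the
				// previous snapshot rather than failing.
				continue
			}
			fresh := make(map[string][]byte)
			for rows.Next() {
				var username string
				var hashed []byte
				if err := rows.Scan(&username, &hashed); err == nil {
					fresh[username] = hashed
				}
			}
			rows.Close()
			apply(fresh)
		}
	}
}
```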


docs/RFCS/20200119_always_available_users.md, line 98 at r1 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

> One of the reasons we do a KV read to the system.users table on every login is because of an assumption when this was implemented that small delays would not be tolerated - dropping a user or revoking a grant needed to be effectively synchronous.
>
> I've always been skeptical of this assumption, but we should validate which of these assumptions is more valid. And if propagation delays are tolerated, we should be able to do more caching of user lookups even for users that are not covered by this always-available mechanism to reduce load on the system ranges.
>
> Another possible answer is that some propagation delay can be tolerated, but there should be an upper bound. (I've seen users get nervous about the lack of formal guarantees around gossip, even though it's usually quite fast). This would argue for some sort of time-limited lease, although the lease duration would need to be quite large to be useful for debugging struggling clusters.

I generally agree with the idea of a cache + expiration lease.

However, even with that, we still need to mark some privileged accounts as non-expirable for the purpose of troubleshooting. This is the primary goal of this RFC. So I suggest we keep the distinction between "general users, who may have cached credentials with an expiry" and "always-available users, whose cached credentials never expire".
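
Concretely, the split could be as small as an optional expiry on the cached entry (purely illustrative; names are made up for the sketch):

```go
package usercache

import "time"

// cachedCredential distinguishes the two tiers discussed here: regular users
// whose cached entry expires, and always-available users whose entry does not.
type cachedCredential struct {
	hashedPassword  []byte
	leaseExpiry     time.Time // ignored when alwaysAvailable is set
	alwaysAvailable bool
}

// usable reports whether the cached entry may still be trusted at time now.
func (c cachedCredential) usable(now time.Time) bool {
	return c.alwaysAvailable || now.Before(c.leaseExpiry)
}
```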


docs/RFCS/20200119_always_available_users.md, line 113 at r1 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

> Note that the system.users table is already gossiped via the system config range. So really what we're doing here is denormalizing the role membership into a users table for a subset of users.
>
> Should we just do that into the existing users table for everyone? This would be more expensive for clusters with many users, but the fact that the users table is already gossiped implies an effective limit on the number of users already.

1. I was assuming that system.users is not gossiped. If it is, that is a bug and it needs to be corrected.

2. This RFC proposes to have a cache for certain users, and since we are bound to support multiple data sources for user credentials, I can see this proposal becoming a cache for any user across all sources.

@rafiss (Collaborator) commented Jul 26, 2022

I'm linking this issue here as it seems related: #82916. In this case, the loss of an internal (timeseries) range rendered the entire cluster inoperable, so even the root user could not log in.

Labels: X-noremind