Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cert: Extend tls code to authenticate tenant scoped client cert #79065

Merged
merged 1 commit into from
Jun 15, 2022

Conversation

rimadeodhar
Copy link
Collaborator

@rimadeodhar rimadeodhar commented Mar 30, 2022

This PR extends the TLS code to use a tenant scoped
client cert to authenticate a client for specific tenant.

Release note (security update): We introduce a new
tenant scoped client certificate to authenticate a client
on a specific tenant. A tenant scoped client certificate contains the client
name within the CN and the tenant ID, to which the
certificate is being scoped to, as the SAN. The tenant ID
is embedded within the URI section with the format
"crdb://tenant/<tenant_id>". For example, a root
client certificate scoped to a tenant with ID 123
will contain "root" in the CN field and the URI
"crdb://tenant/123" in the URI section of the
certificate. This certificate will authorize the root
client on the tenant with the ID 123. It will result
in an authorization error if used to authenticate the
root client on any other tenant.

Informs #77958

Please ignore commit 1 in this PR, that will be reviewed in #79064.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@rimadeodhar rimadeodhar changed the title Auth ten crt cert: Extend tls code to authenticate tenant scoped client cert Mar 30, 2022
@rimadeodhar rimadeodhar marked this pull request as ready for review March 30, 2022 17:54
@rimadeodhar rimadeodhar requested review from a team as code owners March 30, 2022 17:54
@rimadeodhar rimadeodhar requested a review from a team March 30, 2022 17:54
@rimadeodhar rimadeodhar requested a review from a team as a code owner March 30, 2022 17:54
@rimadeodhar rimadeodhar requested review from knz, bdarnell, catj-cockroach and a team March 30, 2022 17:54
Copy link
Collaborator Author

@rimadeodhar rimadeodhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/rpc/pg.go, line 84 at r2 (raw file):

		// using a password.
		certPath := ctx.ClientCertPath(username)
		keyPath := ctx.ClientKeyPath(username)

Should we enforce using only tenant scoped client certs if the current server is a tenant SQL only server as suggested by @bdarnell in the GH issue #77958? Currently, the code will allow using a non tenant scoped client cert if one is available, but maybe we don't want to allow that? This decision will also impact the authentication within UserCertAuthHook below.

Code quote:

		certPath := ctx.ClientCertPath(username)
		keyPath := ctx.ClientKeyPath(username)

Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 14 of 14 files at r2, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @bdarnell, @catj-cockroach, and @rimadeodhar)


-- commits, line 23 at r2:
This needs a more exhaustive release note in the category "security update". Please provde an example of a SAN string.


pkg/rpc/pg.go, line 86 at r2 (raw file):

		keyPath := ctx.ClientKeyPath(username)
		certsAvailable := checkCertAndKeyAvailable(certPath, keyPath)
		if !certsAvailable && ctx.tenID != roachpb.SystemTenantID {

I would remove the 2nd part of this condition. It is fine to use a tenant-scoped client cert for the system tenant too.


pkg/security/auth.go, line 150 at r2 (raw file):

		// If the current server is a non system tenant SQL server, we should use a tenant scoped
		// client certificate.

see my comment above: if the client cert is tenant-scoped and its tenant ID is that of the system tenant, tht should be fine.


pkg/security/auth.go, line 219 at r2 (raw file):

}

func GetTenantIfTenantScopedClientCert(tlsState *tls.ConnectionState) (roachpb.TenantID, error) {

Please explain what this function does in a comment.


pkg/security/auth.go, line 236 at r2 (raw file):

		if uri.Host == "tenant" {
			tenantInfo := strings.TrimPrefix(uri.Path, "/")
			tenantID, err = strconv.ParseUint(tenantInfo, 10, 64)

Could you perhaps factor this Parse call (and also the one in the CLI side in #79064) to be a single parse function in the roachpb package that returns a TenantID and an error?


pkg/security/certificate_loader.go, line 231 at r2 (raw file):

	case `client`:
		// This could be a client certificate or a tenant scoped client certificate.
		if strings.Contains(filename, "tenant") {

You need to change this as per this review comment


pkg/security/certificate_loader.go, line 242 at r2 (raw file):

				return nil, errors.Errorf("client certificate filename should match client.<user>%s", certExtension)
			} else {
				return nil, errors.Errorf("tenant scoped client certificate filename should match client.<user>.tenant-<tenant_id>%s", certExtension)

ditto


pkg/security/certificate_loader.go, line 524 at r2 (raw file):

func extractTenantAndUserFromCertName(filename string) (SQLUsername, uint64, error) {
	parts := strings.Split(filename, ".")

ditto


pkg/sql/pgwire/auth_methods.go, line 435 at r2 (raw file):

			tlsState.PeerCertificates[0].Subject.CommonName,
		).Normalize()
		hook, err := security.UserAuthCertHook(false /*insecure*/, &tlsState, tenantID)

use execCfg.RPCContext.TenantID here as argument. (you can name the execCfg in the argument list to authCert above.

then please remove the tenantID argument from all the functions in pgwire where you've added it, and remove the roachpb dependency for non-test code you've just introduced in this PR. It's undesirable.

@rimadeodhar rimadeodhar requested a review from a team as a code owner April 11, 2022 21:43
Copy link
Collaborator Author

@rimadeodhar rimadeodhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the PR based on the review comments and also added unit tests. This is RFAL.

@catj-cockroach: Do you think we need any additional safeguards to ensure this certificate cannot be misused over HTTP etc.?

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @bdarnell, @catj-cockroach, @knz, and @rimadeodhar)

// maybeGetTenantScopeFromClientCert returns a tenantID if the client certificate is scoped
// to a tenant. It returns a bool value which is set to true if the certificate is a tenant
// scoped client certificate.
func maybeGetTenantScopeFromClientCert(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we change this function (inline with my comments in the other PR) to get the usernames and tenantIDs at the same time? We can replace the existing GetUsernames function at the same time:

type CertificateUserScope struct {
    Username string
    TenantID roachpb.TenantID
}

func GetCertificateUsersScopes(certificate *x509.Certificate) ([]CertificateUserScope, err error) {
    // if we get a `crdb://node` or `crdb://tenant/<id>/node`, or if the common name is "node", return an error
    
    // If using the `crdb://` SAN, get both the username and tenant ID from it.
    
    // if not, return the legacy user ID and the system tenant ID.
    return []CertificateUserScope{
        CertificateUserScope{
            TenantID: roachpb.SystemTenantID,
            Username: usernameFromCommonName,
    }, nil
}

func GetCertificateUserScopesFromConnection(tlsState *tls.ConnectionState) ([]CertificateUserScope, error) {
    if tlsState == nil {
		return roachpb.TenantID{}, false, errors.Errorf("request is not using TLS")
	}
	if len(tlsState.PeerCertificates) == 0 {
		return roachpb.TenantID{}, false, errors.Errorf("no client certificates in request")
  }
  
  return GetCertificateUsersScopes(tlsState.PeerCertificates[0])
}

And then have a separate function to determine if we have a CertificateUserScope that matches the provided username and tenant ID.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been addressed.

pkg/security/certificate_loader.go Outdated Show resolved Hide resolved
pkg/security/certificate_manager.go Outdated Show resolved Hide resolved
pkg/security/certificate_loader.go Outdated Show resolved Hide resolved
// If the certificate is a tenant scoped client certificate, we should enforce that the tenant ID
// and client name matches with the certificate. Otherwise, it is sufficient to just check that the
// client name matches the certificate.
// TODO(rima): Should we enforce always using tenant scoped client cert for non-system tenants?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd argue we should not allow the use of client certificates without the crdb://tenant/<id>/user/<username> SAN in a multi-tenant system.

The system tenants, such as node users, should use crdb://node/<node id> certificates to differentiate them, and we could add logic to prevent them from having username SANs in them as well.

@rimadeodhar rimadeodhar force-pushed the auth_ten_crt branch 2 times, most recently from 42667f4 to 9b4a1c3 Compare April 18, 2022 19:14
Copy link
Contributor

@catj-cockroach catj-cockroach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 10 files at r1, 2 of 14 files at r2, 8 of 55 files at r3, 50 of 50 files at r4, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @bdarnell, @catj-cockroach, @knz, and @rimadeodhar)


pkg/rpc/auth.go, line 165 at r4 (raw file):

	// read and write all data, merely good hygiene. For example, there is
	// no reason to permit the root user to send raw Raft RPCs.
	nodeUserScope, isNodeUser := certUserScope[security.NodeUser]

nit: this is a lot of extra logic for this function, would it be possible for us to simplify this to a series of separate functions?

For example, the check that verifies if we have someone who is not a root user or node user could become function that checks for the node user (returning true if the node user is found), and then checks for the root user

if !security.IsNodeOrRootUser(certUserScope, aTenant.tenantID) {
    return roachpb.TenantID{}, authErrorf("client certificate %s is not allowed to perform this RPC", clientCert.Subject)
}

pkg/rpc/auth.go, line 172 at r4 (raw file):

	var userScope security.CertificateUserScope
	if isNodeUser {
		userScope = nodeUserScope

Rather than having the possibility of a certificate returning the node user scope and a root user scope, if we get a node client certificate we should only return the node user scope:

func GetActiveScope(certUserScope CertificateUserScope, username string, serverTenantID TenantID) (CertificateUserScope, error) {
    var userScope CertificateUserScope
    for _, scope := range certUserScope {
        // if we get a scope that matches the Node user, immediately return.
        if scope.Username == NodeUser {
            return scope, nil
        }
        if scope.Username == username {
             if scope.TenantID == serverTenantID || scope.TenantID == SystemTenantID {
                 userScope = scope
             }
        }
    }
   // double check we're not returning an empty scope
   if userScope.Username == "" {
       return userScope, errors.New("could not find active user scope for client certificate")
   }

   // only return userScope if we haven't found a NodeUser.
   return userScope
}

This function would then be called with:

userScope, err := security.GetActiveScope(certUserScope, security.RootUser, aTenant.tenantID)
if err != nil {
    return roachpb.TenantID{}, authErrorf("client certificate %s cannot be used to perform RPC on tenant %d", clientCert.Subject, a.tenant.tenantID)
}

We then know that the userScope will be for the node user and if not, it will be for the expected user for the system tenant or the active tenant ID.


pkg/security/auth.go, line 39 at r4 (raw file):

type CertificateUserScope struct {
	Username  string
	TenantIDs map[roachpb.TenantID]struct{}

What's TenantIDs used for here? Given that a Tenant would be a different cluster, would a username for one tenant ID not refer to a different user than the same username with a different tenant ID?

Copy link
Collaborator Author

@rimadeodhar rimadeodhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @bdarnell, @catj-cockroach, @knz, and @rimadeodhar)


-- commits, line 23 at r2:

Previously, knz (kena) wrote…

This needs a more exhaustive release note in the category "security update". Please provde an example of a SAN string.

Done.


pkg/rpc/auth.go, line 165 at r4 (raw file):

Previously, catj-cockroach (Cat J) wrote…

nit: this is a lot of extra logic for this function, would it be possible for us to simplify this to a series of separate functions?

For example, the check that verifies if we have someone who is not a root user or node user could become function that checks for the node user (returning true if the node user is found), and then checks for the root user

if !security.IsNodeOrRootUser(certUserScope, aTenant.tenantID) {
    return roachpb.TenantID{}, authErrorf("client certificate %s is not allowed to perform this RPC", clientCert.Subject)
}

Done, as per the suggestion below.


pkg/rpc/auth.go, line 172 at r4 (raw file):

Previously, catj-cockroach (Cat J) wrote…

Rather than having the possibility of a certificate returning the node user scope and a root user scope, if we get a node client certificate we should only return the node user scope:

func GetActiveScope(certUserScope CertificateUserScope, username string, serverTenantID TenantID) (CertificateUserScope, error) {
    var userScope CertificateUserScope
    for _, scope := range certUserScope {
        // if we get a scope that matches the Node user, immediately return.
        if scope.Username == NodeUser {
            return scope, nil
        }
        if scope.Username == username {
             if scope.TenantID == serverTenantID || scope.TenantID == SystemTenantID {
                 userScope = scope
             }
        }
    }
   // double check we're not returning an empty scope
   if userScope.Username == "" {
       return userScope, errors.New("could not find active user scope for client certificate")
   }

   // only return userScope if we haven't found a NodeUser.
   return userScope
}

This function would then be called with:

userScope, err := security.GetActiveScope(certUserScope, security.RootUser, aTenant.tenantID)
if err != nil {
    return roachpb.TenantID{}, authErrorf("client certificate %s cannot be used to perform RPC on tenant %d", clientCert.Subject, a.tenant.tenantID)
}

We then know that the userScope will be for the node user and if not, it will be for the expected user for the system tenant or the active tenant ID.

Done.


pkg/rpc/pg.go, line 86 at r2 (raw file):

Previously, knz (kena) wrote…

I would remove the 2nd part of this condition. It is fine to use a tenant-scoped client cert for the system tenant too.

Done.


pkg/security/auth.go, line 150 at r2 (raw file):

Previously, knz (kena) wrote…

see my comment above: if the client cert is tenant-scoped and its tenant ID is that of the system tenant, tht should be fine.

Done.


pkg/security/auth.go, line 219 at r2 (raw file):

Previously, knz (kena) wrote…

Please explain what this function does in a comment.

Done.


pkg/security/auth.go, line 236 at r2 (raw file):

Previously, knz (kena) wrote…

Could you perhaps factor this Parse call (and also the one in the CLI side in #79064) to be a single parse function in the roachpb package that returns a TenantID and an error?

Done.


pkg/security/auth.go, line 39 at r4 (raw file):

Previously, catj-cockroach (Cat J) wrote…

What's TenantIDs used for here? Given that a Tenant would be a different cluster, would a username for one tenant ID not refer to a different user than the same username with a different tenant ID?

TenantIDs are all the tenants that are user is authorized to access. From what I understand of the multitenant unified architecture, it is possible for a single user to access multiple tenants. So while loading tenant scope from a client certificate, it is possible for the tenant SAN section to contain multiple tenant IDs.


pkg/sql/pgwire/auth_methods.go, line 435 at r2 (raw file):

Previously, knz (kena) wrote…

use execCfg.RPCContext.TenantID here as argument. (you can name the execCfg in the argument list to authCert above.

then please remove the tenantID argument from all the functions in pgwire where you've added it, and remove the roachpb dependency for non-test code you've just introduced in this PR. It's undesirable.

Done.


pkg/security/certificate_loader.go, line 231 at r2 (raw file):

Previously, knz (kena) wrote…

You need to change this as per this review comment

Done.


pkg/security/certificate_loader.go, line 242 at r2 (raw file):

Previously, knz (kena) wrote…

ditto

Done.


pkg/security/certificate_loader.go, line 524 at r2 (raw file):

Previously, knz (kena) wrote…

ditto

Done.


pkg/security/certificate_loader.go, line 99 at r3 (raw file):

Previously, catj-cockroach (Cat J) wrote…

Why do we need a separate value for this? ClientPem should be sufficient for both imo.

Done.

Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is excellent work! The overall direction looks good and convincing.

I think we need to be a bit careful about reviewing the changes to the authentication code, we likely want more than just one pair of extra eyes (mine); it would be interesting to get more, at least Cat's and perhaps Ben's.

Reviewed 2 of 55 files at r3, 2 of 50 files at r4, 10 of 11 files at r5, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @bdarnell, @catj-cockroach, @knz, and @rimadeodhar)


pkg/security/auth.go, line 35 at r5 (raw file):

// system tenant is treated as a global certificate which can authenticate
// on any tenant.
// TODO(rima): Should we remove this global authorization privilege for users

I'd say yes, we should remove it.

However the more urgent question is: what made you think it was useful to add this exception? Do we already have proof that we need this universal login capability?


pkg/security/auth.go, line 98 at r5 (raw file):

// getCertificateTenantScopes retrieves the tenant scopes from a certificate.
// TODO(#80312): Validate username within SAN and remove username from CN.

nit: TODO(rima): Validate ...; see issue https://github.com/...80312


pkg/security/auth.go, line 103 at r5 (raw file):

	if len(cert.URIs) == 0 {
		// No URI SAN, likely an old certificate. Populate tenant scope with default system tenant ID
		// to generate a global client certificate

nit: missing period at end of the sentence.

Copy link
Collaborator Author

@rimadeodhar rimadeodhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @bdarnell, @catj-cockroach, @knz, and @rimadeodhar)


pkg/security/auth.go, line 35 at r5 (raw file):

Previously, knz (kena) wrote…

I'd say yes, we should remove it.

However the more urgent question is: what made you think it was useful to add this exception? Do we already have proof that we need this universal login capability?

Just based on an offline discussion with @catj-cockroach. Since old certificates don't have SANs the idea was to treat them as global certificates with the system tenant scope. I don't want to break any existing functionality for old client certificates which don't have the tenant SAN in them. So the idea is:

  • If we have a tenant SAN, use that as scope.
  • For old client certificates with no SAN, use system tenant ID and treat them as global certificates which is how they behave today where we don't enforce any tenant checks for them.

My inclination is that we should get rid of this global functionality eventually but not sure if I should be doing it in the first iteration of adding tenant scoping to certs.

Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @bdarnell, @catj-cockroach, @knz, and @rimadeodhar)


pkg/security/auth.go, line 35 at r5 (raw file):

Previously, rimadeodhar (Rima Deodhar) wrote…

Just based on an offline discussion with @catj-cockroach. Since old certificates don't have SANs the idea was to treat them as global certificates with the system tenant scope. I don't want to break any existing functionality for old client certificates which don't have the tenant SAN in them. So the idea is:

  • If we have a tenant SAN, use that as scope.
  • For old client certificates with no SAN, use system tenant ID and treat them as global certificates which is how they behave today where we don't enforce any tenant checks for them.

My inclination is that we should get rid of this global functionality eventually but not sure if I should be doing it in the first iteration of adding tenant scoping to certs.

I think you're taking a shortcut in your reasoning and i'd like to avoid that.
We have 2 different types of certificates:
A) certs without a tenant SAN
B) certs with a tenant SAN that specifies the system tenant ID

I am ok with making certs of type (A) global. I don't think we want to make certs of type (B) global too.
Can you explain why you think certs of type (B) should be make global too?

@catj-cockroach
Copy link
Contributor

Done. Just to confirm my understanding on why this is important though - the reason we don't want to override a node SAN is to ensure that any certificate with a node goes through the above validation of only working for system tenant?

It's more to ensure our code operates consistently. A certificate with a Node username and the system tenant will always take precedence over a regular user SAN, regardless of where it is in the certificate. This makes troubleshooting much easier.

Copy link
Contributor

@catj-cockroach catj-cockroach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r9, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @bdarnell, @catj-cockroach, and @knz)

@rimadeodhar rimadeodhar force-pushed the auth_ten_crt branch 6 times, most recently from 716a9bd to 8fd9a9c Compare May 23, 2022 22:38
Copy link
Contributor

@catj-cockroach catj-cockroach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 9 of 9 files at r10, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @bdarnell, @knz, and @rimadeodhar)


pkg/rpc/auth.go line 188 at r10 (raw file):

		if scope.Username == username.NodeUser {
			if scope.Global {
				return scope, nil

Good catch!

Copy link
Collaborator Author

@rimadeodhar rimadeodhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have updated this code to fix the acceptance test failures. Also, please ignore the Example-ORM test failures for now. Those need to be fixed by updating the cockroach-go and example-orms repo for which I'll publish PRs once the base PR updating the CLI is merged.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @bdarnell, @catj-cockroach, @knz, and @rimadeodhar)


pkg/cli/interactive_tests/test_cert_advisory_validation.tcl line 82 at r11 (raw file):

end_test

start_test "Check that 'cert create-client' can utilize cert principal map."

Since we no longer want to support cert principal map workarounds, we need to remove these tests as they will not work for the new cert auth code. @catj-cockroach, @knz let me know if you have any concerns here. We can discuss again if we feel this concept is still important to support.

@knz
Copy link
Contributor

knz commented May 26, 2022

Since we no longer want to support cert principal map workarounds

This does not sound right. Why do you think this is true?

@catj-cockroach
Copy link
Contributor

Since we no longer want to support cert principal map workarounds

This does not sound right. Why do you think this is true?

This is based on a conversation @rimadeodhar and I had, and I think I misspoke. Certificates with tenant scoping do not need the certificate principal mapping as the crdb:// URI is specifically for CRDB, where as certificates that do not have the crdb:// URI would still need to rely on it.

In security.GetCertificateUserScopes we still have the following code to handle principal maps if no crdb:// URIs are located:

	if len(userScopes) == 0 {
		users := getCertificatePrincipals(peerCert)
		for _, user := range users {
			scope := CertificateUserScope{
				Username: user,
				Global:   true,
			}
			userScopes = append(userScopes, scope)
		}
	}

Copy link
Collaborator Author

@rimadeodhar rimadeodhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the clarification @catj-cockroach (I should have been clearer)! We will still support principal mapping for older certs. However, for newer certs, we will only rely on the username extracted from the crdb URI and not the CN. For the tcl tests, we use the cockroach cert command to create client certs which adds a system tenant SAN by default with this new code so the tcl tests for principal mapping are not relevant anymore.
@knz do let me know if you believe we still need to support this in some way even with the SANs.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @bdarnell, @catj-cockroach, @knz, and @rimadeodhar)

@knz
Copy link
Contributor

knz commented May 26, 2022

I would not try to describe this as "old" vs "new".

The use case that the principal map is aimed to solve is when a customer uses a 3rd party tool to create certs, and that tool is limited on how it populates the fields. For example such a tool is the Amazon Identity Manager (IAM).

When there's a 3rd party tool involved, it's possible we can't ask the tool to issue/sign a cert with a custom SAN field (I'm not sure; we'd need to check!)
It's also possible (the reason why cert principal map exist) that they can't put simple usernames in fields either, and are required to add a domain suffix (e.g. [email protected])

So we have to make our product such that the customer can mint the cert in such a way that, when combined with our cert principal map feature, we result in something crdb can use and identifies the SQL user that the customer cares about.

Arguably, the story above is orthogonal to our CC story (where we mint the certs ourselves), but it does mean we can't get rid of cert principal map, and it also means we should not talk about it as "old" vs "new" (and perhaps we need new words instead. Suggestions?)

@catj-cockroach
Copy link
Contributor

Maybe CRDB-specific client certificates (certificates with crdb://) vs multi-application client certificates?

Copy link
Collaborator Author

@rimadeodhar rimadeodhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the context, @knz. I think Catherine's terminology sounds appropriate. Do you feel we need to make any changes to the code then? For multi-application certificates, certificate principals will continue to be supported as they won't contain crdb-specific SANs.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @bdarnell, @catj-cockroach, @knz, and @rimadeodhar)

Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM but I'm not understanding why you're removing the tests in test_cert_advisory_validation

Reviewed 14 of 14 files at r11, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @bdarnell, @catj-cockroach, @knz, and @rimadeodhar)


pkg/cli/interactive_tests/test_cert_advisory_validation.tcl line 82 at r11 (raw file):

Previously, rimadeodhar (Rima Deodhar) wrote…

Since we no longer want to support cert principal map workarounds, we need to remove these tests as they will not work for the new cert auth code. @catj-cockroach, @knz let me know if you have any concerns here. We can discuss again if we feel this concept is still important to support.

As we discussed, is it right to remove these tests?


pkg/cmd/roachtest/tests/multitenant_utils.go line 69 at r11 (raw file):

	version, err := fetchCockroachVersion(ctx, t.L(), c, n[0])
	// Tenant scoped certificates were introduced in version 22.1.
	if strings.HasPrefix(version, "v22.1") {

You want a check "v22.1 or later". This will fail on v22.2 or v23.1

Copy link
Collaborator Author

@rimadeodhar rimadeodhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @bdarnell, @catj-cockroach, @knz, and @rimadeodhar)


pkg/cli/interactive_tests/test_cert_advisory_validation.tcl line 82 at r11 (raw file):

Previously, knz (kena) wrote…

As we discussed, is it right to remove these tests?

So the reason for removing these tests is because the test uses the updated cockroach cert cli command to generate the client certs. As per the update, cockroach cli generates crdb specific certs by default using the system tenant as the tenant scope. So these tests fail because the principal map is no longer applicable for crdb specific certs. Ideally, there should be a way to generate multi-application client certificates (i.e. certs without a tenant scope) which could be used by these tests for enforcing the principal map. But as it stands today, I'm not sure what would be the best way to do so. So I have removed these tests for now but maybe this is a problem we should solve for in an upcoming PR to enable adding these tests back. Since we plan to retain the use of prinicpal maps for multi-application client certificates, seems like these tests are good to have.

Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM again :)

the CI linter test is telling you something

Reviewed 34 of 34 files at r12, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @bdarnell, @catj-cockroach, and @rimadeodhar)


pkg/cli/interactive_tests/test_cert_advisory_validation.tcl line 82 at r11 (raw file):

Previously, rimadeodhar (Rima Deodhar) wrote…

So the reason for removing these tests is because the test uses the updated cockroach cert cli command to generate the client certs. As per the update, cockroach cli generates crdb specific certs by default using the system tenant as the tenant scope. So these tests fail because the principal map is no longer applicable for crdb specific certs. Ideally, there should be a way to generate multi-application client certificates (i.e. certs without a tenant scope) which could be used by these tests for enforcing the principal map. But as it stands today, I'm not sure what would be the best way to do so. So I have removed these tests for now but maybe this is a problem we should solve for in an upcoming PR to enable adding these tests back. Since we plan to retain the use of prinicpal maps for multi-application client certificates, seems like these tests are good to have.

Yes please file a followup issue to explain this is a feature we intend to continue supporting and it needs proper testing. The team who will likely own this is CDX-ID (jon tsiros).

@rimadeodhar rimadeodhar force-pushed the auth_ten_crt branch 2 times, most recently from b2c5ee7 to 299c6fe Compare June 14, 2022 21:10
This PR extends the TLS code to use a tenant scoped
client cert to authenticate a client for specific tenant.

Release note (security update): We introduce a new
tenant scoped client certificate to authenticate a client
on a specific tenant. A tenant scoped client certificate
contains the client name within the CN and the tenant ID,
to which the certificate is being scoped to, as the SAN.
The tenant ID is embedded within the URI section with the
format "crdb://tenant/<tenant_id>/user/<username>".
For example, a root client certificate scoped to a tenant
with ID 123 will contain "root" in the CN field and the URI
"crdb://tenant/123/user/root" in the URI section of the
certificate. This certificate will authorize the root
client on the tenant with the ID 123. It will result
in an authorization error if used to authenticate the
root client on any other tenant.
@rimadeodhar
Copy link
Collaborator Author

Merging this now that all teamcity builds are passing.

TFTR!

bors r+

@craig
Copy link
Contributor

craig bot commented Jun 15, 2022

Build succeeded:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants