-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
auth: add server-only ACL #18715
auth: add server-only ACL #18715
Conversation
d04d8ae
to
1299565
Compare
The RPC handlers expect to see `nil` ACL objects whenever ACLs are disabled. By using `nil` as a sentinel value, we have the risk of nil pointer exceptions and improper handling of `nil` when returned from our various auth methods that can lead to privilege escalation bugs. This is the second in a series to eliminate the use of `nil` ACLs as a sentinel value for when ACLs are disabled. This patch involves creating a new "virtual" ACL object for checking permissions on server operations and a matching `AuthenticateServerOnly` method for server-only RPCs that can produce that ACL. Ref: hashicorp/nomad-enterprise#1218 Ref: #18703
1299565
to
b2d799f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Just a couple of random thoughts or questions that can easily be ignored.
// The attribute below is avirtual policy that we never expose directly to | ||
// the end user | ||
server string | ||
isLeader bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just checking my logic that currently this value will always be false, and this is expected?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah there's a LeaderACL
object in nomad/structs
that'll be used in one of the following PRs. My intent was not to have anything in these PRs that wasn't yet in use to avoid confusion, but I missed this one.
Co-authored-by: James Rasell <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some minor questions, but nice refactoring!
// validateCertificateForName returns true if the certificate is valid | ||
// for the given domain name. | ||
func validateCertificateForName(cert *x509.Certificate, expectedName string) (bool, error) { | ||
if cert == nil { | ||
return false, nil | ||
} | ||
|
||
validNames := []string{cert.Subject.CommonName} | ||
validNames = append(validNames, cert.DNSNames...) | ||
for _, valid := range validNames { | ||
if expectedName == valid { | ||
return true, nil | ||
} | ||
} | ||
|
||
return false, fmt.Errorf("invalid certificate, %s not in %s", | ||
expectedName, strings.Join(validNames, ",")) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the idea for this function to replace ctx.ValidateCertificateForName()
? Since we have a ctx
above we could use it directly?
Lines 127 to 148 in 635afee
// ValidateCertificateForName returns true if the RPC context certificate is valid | |
// for the given domain name. | |
func (ctx *RPCContext) ValidateCertificateForName(name string) error { | |
if ctx == nil || !ctx.TLS { | |
return nil | |
} | |
cert := ctx.Certificate() | |
if cert == nil { | |
return errors.New("missing certificate information") | |
} | |
validNames := []string{cert.Subject.CommonName} | |
validNames = append(validNames, cert.DNSNames...) | |
for _, valid := range validNames { | |
if name == valid { | |
return nil | |
} | |
} | |
return fmt.Errorf("invalid certificate, %s not in %s", name, strings.Join(validNames, ",")) | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're seeing a bit of an artifact of breaking this work into multiple PRs. We only need the (*RPCContext).ValidateCertificateForName
method to exist because we haven't done the "auth: add client-only ACL" PR.
When we've completed that, the only callers for this logic will exist encapsulated in the nomad/auth
package so there's no reason to keep it in nomad/rpc.go
anymore. That allows us to have all the code that tests it in the nomad/auth
package without having to copy the logic into a ValidateCertificateForName
method on the testContext
struct (at which point we'd be testing the mock context logic and not the production code).
// we need to allow both humans with management tokens and | ||
// non-leader servers to list keys, in order to support | ||
// replication | ||
err := validateTLSCertificateLevel(k.srv, k.ctx, tlsCertificateLevelServer) | ||
if err != nil { | ||
if aclObj, err := k.srv.ResolveACL(args); err != nil { | ||
return err | ||
} else if aclObj != nil && !aclObj.IsManagement() { | ||
return structs.ErrPermissionDenied | ||
} | ||
if aclObj, err := k.srv.ResolveACL(args); err != nil { | ||
return err | ||
} else if aclObj != nil && !aclObj.IsManagement() { | ||
return structs.ErrPermissionDenied | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When servers call this endpoint do they pass an aclObj
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As it turns out, servers don't call this Keyring.List
endpoint at all! I think we originally had this to support keyring replication between leader and followers, but the followers just read the metadata from the state store and then Keyring.Get
the actually key material.
The RPC handlers expect to see `nil` ACL objects whenever ACLs are disabled. By using `nil` as a sentinel value, we have the risk of nil pointer exceptions and improper handling of `nil` when returned from our various auth methods that can lead to privilege escalation bugs. This is the third in a series to eliminate the use of `nil` ACLs as a sentinel value for when ACLs are disabled. This patch involves creating a new "virtual" ACL object for checking permissions on client operations and a matching `AuthenticateClientOnly` method for client-only RPCs that can produce that ACL. Unlike the server ACLs PR, this also includes a special case for "legacy" client RPCs where the client was not previously sending the secret as it should (leaning on mTLS only). Those client RPCs were fixed in Nomad 1.6.0, but it'll take a while before we can guarantee they'll be present during upgrades. Ref: hashicorp/nomad-enterprise#1218 Ref: #18703 Ref: #18715 Ref: #16799
The RPC handlers expect to see `nil` ACL objects whenever ACLs are disabled. By using `nil` as a sentinel value, we have the risk of nil pointer exceptions and improper handling of `nil` when returned from our various auth methods that can lead to privilege escalation bugs. This is the third in a series to eliminate the use of `nil` ACLs as a sentinel value for when ACLs are disabled. This patch involves creating a new "virtual" ACL object for checking permissions on client operations and a matching `AuthenticateClientOnly` method for client-only RPCs that can produce that ACL. Unlike the server ACLs PR, this also includes a special case for "legacy" client RPCs where the client was not previously sending the secret as it should (leaning on mTLS only). Those client RPCs were fixed in Nomad 1.6.0, but it'll take a while before we can guarantee they'll be present during upgrades. Ref: hashicorp/nomad-enterprise#1218 Ref: #18703 Ref: #18715 Ref: #16799
The RPC handlers expect to see `nil` ACL objects whenever ACLs are disabled. By using `nil` as a sentinel value, we have the risk of nil pointer exceptions and improper handling of `nil` when returned from our various auth methods that can lead to privilege escalation bugs. This is the third in a series to eliminate the use of `nil` ACLs as a sentinel value for when ACLs are disabled. This patch involves leveraging the refactored `auth` package to remove the weird "mixed auth" helper functions that only support the Variables read/list RPC handlers. Instead, pass the ACL object and claim together into the `AllowVariableOperations` method in the usual `acl` package. Ref: hashicorp/nomad-enterprise#1218 Ref: #18703 Ref: #18715 Ref: #16799 Ref: #18730 Fixes: #15875
The RPC handlers expect to see `nil` ACL objects whenever ACLs are disabled. By using `nil` as a sentinel value, we have the risk of nil pointer exceptions and improper handling of `nil` when returned from our various auth methods that can lead to privilege escalation bugs. This is the third in a series to eliminate the use of `nil` ACLs as a sentinel value for when ACLs are disabled. This patch involves leveraging the refactored `auth` package to remove the weird "mixed auth" helper functions that only support the Variables read/list RPC handlers. Instead, pass the ACL object and claim together into the `AllowVariableOperations` method in the usual `acl` package. Ref: hashicorp/nomad-enterprise#1218 Ref: #18703 Ref: #18715 Ref: #16799 Ref: #18730 Fixes: #15875
The RPC handlers expect to see `nil` ACL objects whenever ACLs are disabled. By using `nil` as a sentinel value, we have the risk of nil pointer exceptions and improper handling of `nil` when returned from our various auth methods that can lead to privilege escalation bugs. This is the final patch in a series to eliminate the use of `nil` ACLs as a sentinel value for when ACLs are disabled. This patch adds a new virtual ACL policy field for when ACLs are disabled and updates our authentication logic to use it. Included: * Extends auth package tests to demonstrate that nil ACLs are treated as failed auth and disabled ACLs succeed auth. * Adds a new `AllowDebug` ACL check for the weird special casing we have for pprof debugging when ACLs are disabled. * Removes the remaining unexported methods (and repeated tests) from the `nomad/acl.go` file. * Update the semgrep rules to detect improper nil ACL checking and remove the old invalid ACL checks. * Update the contributing guide for RPC authentication. Ref: hashicorp/nomad-enterprise#1218 Ref: #18703 Ref: #18715 Ref: #16799 Ref: #18730 Ref: #18744
The RPC handlers expect to see `nil` ACL objects whenever ACLs are disabled. By using `nil` as a sentinel value, we have the risk of nil pointer exceptions and improper handling of `nil` when returned from our various auth methods that can lead to privilege escalation bugs. This is the final patch in a series to eliminate the use of `nil` ACLs as a sentinel value for when ACLs are disabled. This patch adds a new virtual ACL policy field for when ACLs are disabled and updates our authentication logic to use it. Included: * Extends auth package tests to demonstrate that nil ACLs are treated as failed auth and disabled ACLs succeed auth. * Adds a new `AllowDebug` ACL check for the weird special casing we have for pprof debugging when ACLs are disabled. * Removes the remaining unexported methods (and repeated tests) from the `nomad/acl.go` file. * Update the semgrep rules to detect improper nil ACL checking and remove the old invalid ACL checks. * Update the contributing guide for RPC authentication. Ref: hashicorp/nomad-enterprise#1218 Ref: #18703 Ref: #18715 Ref: #16799 Ref: #18730 Ref: #18744
The RPC handlers expect to see
nil
ACL objects whenever ACLs are disabled. By usingnil
as a sentinel value, we have the risk of nil pointer exceptions and improper handling ofnil
when returned from our various auth methods that can lead to privilege escalation bugs. This is the second in a series to eliminate the use ofnil
ACLs as a sentinel value for when ACLs are disabled.This patch involves creating a new "virtual" ACL object for checking permissions on server operations and a matching
AuthenticateServerOnly
method for server-only RPCs that can produce that ACL.Ref: https://github.com/hashicorp/nomad-enterprise/pull/1218
Ref: #18703