Scope expression template registration #136

Merged
merged 4 commits into from
Jan 31, 2019
135 changes: 135 additions & 0 deletions rfcs/0136-scope-expression-registration.md
@@ -0,0 +1,135 @@
# RFC 136 - Scope Expression Registration
* Comments: [#136](https://api.github.com/repos/taskcluster/taskcluster-rfcs/issues/136)
* Proposed by: @imbstack

# Summary

We should build a way for a cluster admin to know what a set of scopes allows within
the Taskcluster platform. We should publish this information so that others can build tools and tests
on top of it.

## Motivation

It is currently possible to figure out some subset of this with background knowledge of the
platform and some reading of docs and code, but in general it is too difficult. In addition, there are
workers and services that never publish a machine-readable version of the scopes they use
for access control, which makes the task harder still.

This issue could be partially solved just by building easier ways for services/workers to publish docs,
but that would provide no guarantee that docs published this way are valid or current. We
should build a system that allows these to be positively asserted.

Protecting the registration system by scopes and making it the only way to do authorization allows
a cluster admin to pick and choose which services will interact with their cluster, and also allows
them to know at any given time which services exist in their cluster.

In addition, this moves authorization into the auth service and
keeps that move performant, since it drastically reduces the number of bits we need to
shovel back and forth over the wire compared with what would otherwise be needed.

# Details

We will define the following:

* **scope expression:** A way of building complex assertions from basic scopes. This is currently defined
in taskcluster-lib-scopes. It will need to be standardized more generally for this to be an external api.
* **scope expression template:** A method of building a scope expression given certain parameters. Until
the parameters are substituted, this is not a valid scope expression that we can use for authorization.
This includes parameters like `<this>` and also `for` and `if`.
* **term:** One of the parameters of a scope expression template. These are called terms when they have
additional metadata provided to allow for useful documentation and inspection.
* **operation:** Some functionality that is protected by scopes and requires authorization to perform. In a
service, these map exactly to the list of endpoints. In a worker, there is only one operation and it is
executing a task.
* **relying party:** An entity that attaches meaning to scopes. This can be a service, worker, or other entity
that might want to authorize an operation or assign scopes. (e.g. taskcluster-github
is a relying party both by authorizing some operations based on scopes that begin with "github" and by giving
"repo" scopes to tasks.)
* **scope namespace:** A prefix of scope strings that is owned by a relying party
* **operation namespace:** A prefix of operations performed by a single relying party
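
To make the first definition concrete, here is a minimal Python sketch of evaluating a scope expression against a set of held scopes. It is not the taskcluster-lib-scopes implementation: the function names are hypothetical, and it assumes the usual rule that a held scope ending in `*` satisfies any required scope sharing that prefix, plus `AllOf`/`AnyOf` as the only combinators.

```python
from typing import Iterable, Union

# A scope expression is either a plain scope string or a dict with a single
# "AllOf" / "AnyOf" key whose value is a list of sub-expressions.
ScopeExpression = Union[str, dict]


def scope_satisfies(held: str, required: str) -> bool:
    """Return True if one held scope satisfies one required scope.

    Assumption: a held scope ending in "*" matches any required scope that
    starts with the text before the star.
    """
    if held.endswith("*"):
        return required.startswith(held[:-1])
    return held == required


def satisfies_expression(scopes: Iterable[str], expr: ScopeExpression) -> bool:
    """Evaluate a scope expression against a set of held scopes."""
    held = list(scopes)
    if isinstance(expr, str):
        return any(scope_satisfies(s, expr) for s in held)
    if "AllOf" in expr:
        return all(satisfies_expression(held, e) for e in expr["AllOf"])
    if "AnyOf" in expr:
        return any(satisfies_expression(held, e) for e in expr["AnyOf"])
    raise ValueError(f"unknown scope expression: {expr!r}")
```

A scope expression template cannot be passed to `satisfies_expression` directly; its parameters have to be substituted first.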

This scope expression template registration will be implemented roughly as follows (skip to end for example):

1. We will add a new field to our service api definitions at all of the cluster, service, and endpoint levels
that allows us to define terms at their most appropriate level. No overriding will be permitted and each
term must be defined with both human language documentation and a regex defining valid values for parameters
submitted for these terms.
Contributor

How will we guarantee that the human readable language matches the regex? Defining security things twice, especially in different forms feels less than ideal. What if we changed the format to make it easy to generate a human readable string based on the non-regex declaration?

Contributor

I think the regex and the description give different information: acceptable values, vs. "meaning".

For example, the term "hookId" might be defined as "A user-supplied identifier for a single hook, unique within the scope of an associated hookGroupId". The regex would be /^([a-zA-Z0-9-_]*)$/.

Contributor

> I think the regex and the description give different information: acceptable values, vs. "meaning".

Yes, I think it's important to document the intention behind the regexp because that will usually make it easier to do a proper review and validation.

1. Auth service will grow a `register()` endpoint that will allow a relying party to upload structured documentation
of its own operations to the auth service. Each operation will be within the operation namespace granted to the
relying party, and the relying party must have a corresponding scope to complete the registration.
For instance, only taskcluster-github will be allowed to register the `github.createStatus` operation.
This operation will have associated metadata including the scope expression template that guards it.
This registration must be versioned and have an expiration date. Registration will return an unforgeable token.
1. Auth service will grow an `authorize()` endpoint that will take as input an operation, a clientId, and a set of
parameters that will be substituted into the operation's scope expression template in order to turn it into a
scope expression. The endpoint will return a simple yes/no answer as to whether or not the clientId in question
has sufficient scopes to satisfy the scope expression with those parameters. Either this endpoint or another one
must still support sending a set of scopes, rather than just a clientId in order to support workers checking
the scopes field of tasks.
1. Update all of our worker implementations (generic worker and script worker) to register their operation somehow.
Each worker will be represented by a single operation whose scope expression has many `if` statements for each
feature the worker supports. I expect that worker-manager will actually register these with auth to avoid
over-calling auth service. I think we can do this by making worker implementation part of the definition of a
workerType and having worker implementations generate the data that must be entered into the workerType definition.
1. Build a library that supports all of this querying in multiple languages along with a specification.yml so that
others can either use our services to reason about these or run locally.
1. Build into the auth service a way to query all registered scope expression templates. The input will be a set of
scopes. The output will be a set of operations that may be authorized by these scopes and for each of these operations,
a list of possible values for parameters. This is a bit difficult to understand but the example should make this
more clear.
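
The `register()` and `authorize()` steps above can be sketched in memory as follows. This is only an illustration under stated assumptions: the class `AuthSketch`, its method signatures, and the `<name>` substitution rule are hypothetical, and the sketch leaves out versioning, expiration, the returned token, clientId lookup, and `for`/`if` constructs. It takes a set of scopes directly, which the RFC says must remain possible.

```python
import re


def scope_satisfies(held, required):
    # Assumption: a trailing "*" on a held scope is a prefix wildcard.
    if held.endswith("*"):
        return required.startswith(held[:-1])
    return held == required


def satisfies(scopes, expr):
    """Evaluate an AllOf/AnyOf scope expression against held scopes."""
    if isinstance(expr, str):
        return any(scope_satisfies(s, expr) for s in scopes)
    if "AllOf" in expr:
        return all(satisfies(scopes, e) for e in expr["AllOf"])
    return any(satisfies(scopes, e) for e in expr["AnyOf"])


def render(template, params):
    """Substitute <name> placeholders throughout a scope expression template."""
    if isinstance(template, str):
        return re.sub(r"<(\w+)>", lambda m: params[m.group(1)], template)
    return {op: [render(t, params) for t in subs] for op, subs in template.items()}


class AuthSketch:
    """Hypothetical in-memory stand-in for the auth service."""

    def __init__(self):
        self.operations = {}  # operation name -> (template, compiled term regexes)

    def register(self, operation, template, terms):
        compiled = {name: re.compile(rx) for name, rx in terms.items()}
        self.operations[operation] = (template, compiled)

    def authorize(self, operation, scopes, params):
        template, terms = self.operations[operation]
        # Every registered term must be supplied, and nothing else.
        if set(params) != set(terms):
            return False
        # Each value must match its term's regex before substitution.
        if not all(terms[n].fullmatch(v) for n, v in params.items()):
            return False
        return satisfies(list(scopes), render(template, params))
```

Validating parameters against the term regex before substitution matters: with a regex such as `[a-z0-9-]+`, a caller cannot pass `*` as a parameter value and have it treated as a wildcard in the rendered expression.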

# Example

We create a new service called `taskcluster-dishwasher`. It washes dishes and has a single operation called
`dishwasher.wash`. This operation is protected by a scope expression template:

```json
{
  "AllOf": [
    "dishwasher:wash:<detergent>"
  ]
}
```

The service will also define the term `detergent` with a bit of text about different values that might
be present and also a regex that defines valid values for `detergent`.

Before the service is started, when the cluster admin creates a client for this service, they will
ensure that it has the scope `auth:register:dishwasher.*`. When the service is started, it will register
this operation and the associated terms with the auth service.
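
As a rough sketch of the registration check implied here: the RFC names the scope `auth:register:dishwasher.*`, and the code below assumes (this is not stated in the RFC) that registering an operation `X` requires the scope `auth:register:X`. The function name `can_register` is hypothetical.

```python
def can_register(client_scopes, operation):
    """Check whether a client may register `operation`.

    Assumption: registering operation X requires the scope "auth:register:X",
    and a held scope ending in "*" is a prefix wildcard.
    """
    required = f"auth:register:{operation}"
    for held in client_scopes:
        if held.endswith("*"):
            if required.startswith(held[:-1]):
                return True
        elif held == required:
            return True
    return False
```

Under that assumption, a client holding `auth:register:dishwasher.*` could register `dishwasher.wash` but not `github.createStatus`.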

Now the cluster admin would like to allow some users to use this service. They use either the tools site
or ci-admin to add some scopes to roles. Before they do this however, they can use the built-in support
in both of these tools for querying the auth service to see what the scopes they are giving out authorize.

Because the auth service now has global knowledge of the scopes that control which operations, it can reliably
tell you that for instance:

* Giving a client the scope `dishwasher:*` allows them to call `dishwasher.wash` with _any_ detergent.
* Giving a client the scope `dishwasher:wash:ajax-*` allows them to call `dishwasher.wash` with any
detergent that begins with `ajax-`.
* Giving a client the scope `dishwasher:wash:comet` allows them to call `dishwasher.wash` but only with the
detergent `comet`.
* If a client does not have any `dishwasher` scopes, we know that they cannot do anything with the
dishwashing service.
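
The four cases above can be reproduced with a small sketch. It only handles the simplest template shape, a literal prefix followed by a single trailing `<param>`, which is an assumption made for brevity; the function name `allowed_values` and its return format are hypothetical, and a real query would also have to handle nested expressions and multiple parameters.

```python
from typing import Optional, Tuple


def allowed_values(held: str, template_scope: str) -> Optional[Tuple[str, str]]:
    """Describe which parameter values one held scope permits.

    Returns ("any", ""), ("prefix", p), ("exact", v), or None when the held
    scope does not touch this template at all.
    """
    if template_scope.count("<") != 1 or not template_scope.endswith(">"):
        raise ValueError("sketch only handles a single trailing <param>")
    literal = template_scope[: template_scope.index("<")]

    if held.endswith("*"):
        stem = held[:-1]
        if literal.startswith(stem):
            # The wildcard starts at or before the parameter: any value works.
            return ("any", "")
        if stem.startswith(literal):
            # The wildcard starts inside the parameter: values share a prefix.
            return ("prefix", stem[len(literal):])
        return None
    if held.startswith(literal):
        # No wildcard: exactly one parameter value is permitted.
        return ("exact", held[len(literal):])
    return None
```

A tool such as the tools site or ci-admin could combine this kind of result with the term's regex and documentation to show an admin what a scope grants.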

This would work similarly for workers. It can even be helpful if we just register relying parties as owning
a specific scope namespace as we can tell roughly what a given set of scopes permits -- although with
significantly less granularity.

# Open Questions

Many of the finer points of this are quite hand-wavy. We can try to button things down a bit before
moving forward if we like.

This does require auth to be running for any other service to start. Maybe we can be smart by retrying
registration until auth is running or just always checking to see if auth has the most recent version
of the definition every time we call `authorize()`?
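
One way the retry idea could look, as a sketch only: the `register` callable and the assumption that it raises `ConnectionError` while auth is unreachable are hypothetical, as are the attempt count and delays.

```python
import time


def register_with_retry(register, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call `register()` until it succeeds, backing off exponentially.

    Assumption: `register` raises ConnectionError while auth is not yet up.
    The last failure is re-raised so the process can exit and be restarted.
    """
    for attempt in range(attempts):
        try:
            return register()
        except ConnectionError:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

Re-raising after the final attempt keeps this compatible with a strategy of exiting on startup failure and letting the platform restart the service.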
Contributor

I think it's OK for a service to not really be "up" until auth is started -- after all, it's unlikely to work correctly without auth anyway. Running in Kubernetes means that the existing strategy of exiting the process when startup fails is a good fix, since k8s will automatically restart the process -- so when auth finally comes up, the dependent service will too (Heroku isn't so nice..)


# Implementation

<once the RFC is decided, these links will provide readers a way to track the
implementation through to completion>

* <link to tracker bug, issue, etc.>
* <...>