Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce tap APIService, update linkerd tap #3167

Merged
merged 7 commits into from
Aug 1, 2019
Merged

Conversation

siggy
Copy link
Member

@siggy siggy commented Jul 31, 2019

The Tap Service enabled tapping of any meshed pod, regardless of user
privilege.

This change introduces a new Tap APIService. Kubernetes provides
authentication and authorization of Tap requests, and then forwards
requests to a new Tap APIServer, which implements a Kubernetes
aggregated API server. The Tap APIServer authenticates the client TLS
from Kubernetes, and authorizes the user via a SubjectAccessReview.

The linkerd tap command now makes requests against the new APIService.

The Tap APIService implements three Kubernetes-style endpoints:
GET /apis/tap.linkerd.io/v1alpha1
POST /apis/tap.linkerd.io/v1alpha1/watch/namespaces/:ns/tap
POST /apis/tap.linkerd.io/v1alpha1/watch/namespaces/:ns/:res/:name/tap

Users authorize to the new tap.linkerd.io/v1alpha1 via RBAC. Only the
watch verb is supported. Access is also available via subresources
such as deployments/tap and pods/tap.

This change introduces the following resources into the default Linkerd
install:

  • Global
    • APIService/v1alpha1.tap.linkerd.io
    • ClusterRoleBinding/linkerd-linkerd-tap-auth-delegator
  • linkerd namespace:
    • Secret/linkerd-tap-tls
  • kube-system namespace:
    • RoleBinding/linkerd-linkerd-tap-auth-reader

Tasks not covered by this PR:

  • linkerd top
  • linkerd dashboard
  • linkerd profile --tap
  • removal of the unauthenticated tap controller

Fixes #2725, #3162

Signed-off-by: Andrew Seigner [email protected]

This PR is based on https://github.com/linkerd/linkerd2/compare/dad/tap-pod-2

@siggy siggy self-assigned this Jul 31, 2019
@siggy siggy force-pushed the siggy/tap-apiservice branch from c2846f9 to 831e803 Compare July 31, 2019 11:21
@l5d-bot
Copy link
Collaborator

l5d-bot commented Jul 31, 2019

Integration test results for 831e803: fail 😕
Log output: https://gist.github.com/6fa204de2776cde0acf9735262272501

@siggy siggy force-pushed the siggy/tap-apiservice branch from 831e803 to 3d58eb1 Compare July 31, 2019 11:40
@l5d-bot
Copy link
Collaborator

l5d-bot commented Jul 31, 2019

Integration test results for 3d58eb1: fail 😕
Log output: https://gist.github.com/86f94b50be1f3a594ca00e9b4543e53e

@siggy siggy force-pushed the siggy/tap-apiservice branch from 3d58eb1 to ad43c40 Compare July 31, 2019 12:04
@l5d-bot
Copy link
Collaborator

l5d-bot commented Jul 31, 2019

Integration test results for ad43c40: fail 😕
Log output: https://gist.github.com/d88911a8252e25422a23881bc4639363

@siggy siggy force-pushed the siggy/tap-apiservice branch from ad43c40 to c64937d Compare July 31, 2019 16:24
@l5d-bot
Copy link
Collaborator

l5d-bot commented Jul 31, 2019

Integration test results for c64937d: fail 😕
Log output: https://gist.github.com/d6afe87ffad9a635cf596c75bedd83de

@l5d-bot
Copy link
Collaborator

l5d-bot commented Jul 31, 2019

Integration test results for e365353: fail 😕
Log output: https://gist.github.com/b279decfe53ad37efa4252e5cbc3cf82

@siggy siggy requested a review from kleimkuhler July 31, 2019 17:40
@siggy siggy force-pushed the siggy/tap-apiservice branch from e365353 to 4895534 Compare July 31, 2019 17:44
@l5d-bot
Copy link
Collaborator

l5d-bot commented Jul 31, 2019

Integration test results for 4895534: success 🎉
Log output: https://gist.github.com/d33d4cd49c284330fa230816b23f9eb3

@l5d-bot
Copy link
Collaborator

l5d-bot commented Jul 31, 2019

Integration test results for 2048887: success 🎉
Log output: https://gist.github.com/b1047b9bcabb52dca3ea91d577d89222

Copy link
Member

@adleong adleong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just gave this a quick once over and it looks great: no major surprises. Next I'll test and review in more detail.

namespace, resource, name, req.Header.Get(a.usernameHeader), req.Header[a.groupHeader],
)

err := pkgK8s.ResourceAuthzForUser(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this subject access review necessary? are there requests which could get past the aggregator but fail here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't been able to craft a request that gets by the aggregator but fails this SAR, but I'm not convinced it's not possible. This is the recommended pattern from Kubernetes:
https://kubernetes.io/docs/tasks/access-kubernetes-api/configure-aggregation-layer/#extension-apiserver-authorizes-the-request

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty sure this is the exact same SAR check that the aggregator does. So doing it here as well feels redundant. Removing it also allows us to remove the auth-delegator role binding.


// TapReqToURL converts a TapByResourceRequest protobuf object to a URL for use
// with the Kubernetes tap.linkerd.io APIService.
func TapReqToURL(req *pb.TapByResourceRequest) string {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels pretty specific to the tap apiservice and a bit out of place in this file.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, but ended up putting it here for a few reasons:

  • both /cli/cmd and /controller/tap depend on it, so it should be somewhere in /pkg
  • protohttp already depends on /controller/gen/public, so putting TapReqToURL here does not add any new edges to our dependency tree
  • it relates to the business of protobuf <-> http conversion

Given all this, the only other option would be to put that one function into a new package under /pkg, and that new package would still depend on /controller/gen/public (which is a more general issue to be fixed in #2751). Once #2751 completes, I'd be more open to a reorg of this code.

Copy link
Contributor

@dadjeibaah dadjeibaah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took a quick glance as well and it all looks good. I'd like to kick the tires in a little bit.

I had one clarifying question. What does the current Tap Public API endpoint now serve? Does it give a deprecation warning or do we still need it for the dashboard?

@siggy siggy merged commit a59c1dd into master Aug 1, 2019
@siggy siggy deleted the siggy/tap-apiservice branch August 1, 2019 21:02
siggy added a commit that referenced this pull request Aug 2, 2019
PR #3167 introduced a Tap APIService, and migrated `linkerd tap` to it.

This change migrates `linkerd top` to the new Tap APIService. It also
addresses a `panic: close of closed channel`, where two go routines
could both call `close(done)` on exit.

Fixes #3168

Signed-off-by: Andrew Seigner <[email protected]>
siggy added a commit that referenced this pull request Aug 2, 2019
PR #3167 introduced a Tap APIService, and migrated `linkerd tap` to it.

This change migrates `linkerd top` to the new Tap APIService. It also
addresses a `panic: close of closed channel` issue, where two go
routines could both call `close(done)` on exit.

Fixes #3168

Signed-off-by: Andrew Seigner <[email protected]>

fix

Signed-off-by: Andrew Seigner <[email protected]>
siggy added a commit that referenced this pull request Aug 2, 2019
PR #3167 introduced a Tap APIService, and migrated `linkerd tap` to it.

This change migrates `linkerd top` to the new Tap APIService. It also
addresses a `panic: close of closed channel` issue, where two go
routines could both call `close(done)` on exit.

Fixes #3168

Signed-off-by: Andrew Seigner <[email protected]>
siggy added a commit that referenced this pull request Aug 2, 2019
PR #3167 introduced a Tap APIService, and migrated linkerd tap to it.

This change migrates `linkerd profile --tap` to the new Tap APIService.

Depends on #3186
Fixes #3169

Signed-off-by: Andrew Seigner <[email protected]>
siggy added a commit that referenced this pull request Aug 2, 2019
PR #3167 introduced a Tap APIService, and migrated linkerd tap to it.

This change migrates `linkerd profile --tap` to the new Tap APIService.

Depends on #3186
Fixes #3169

Signed-off-by: Andrew Seigner <[email protected]>
siggy added a commit that referenced this pull request Aug 2, 2019
PR #3167 introduced a Tap APIService, and migrated `linkerd tap` to it.

This change migrates `linkerd top` to the new Tap APIService. It also
addresses a `panic: close of closed channel` issue, where two go
routines could both call `close(done)` on exit.

Fixes #3168

Signed-off-by: Andrew Seigner <[email protected]>
siggy added a commit that referenced this pull request Aug 2, 2019
PR #3167 introduced a Tap APIService, and migrated linkerd tap to it.

This change migrates `linkerd profile --tap` to the new Tap APIService.

Depends on #3186
Fixes #3169

Signed-off-by: Andrew Seigner <[email protected]>
siggy added a commit that referenced this pull request Aug 2, 2019
PR #3167 introduced a Tap APIService, and migrated linkerd tap to it.

This change migrates `linkerd profile --tap` to the new Tap APIService.

Depends on #3186
Fixes #3169

Signed-off-by: Andrew Seigner <[email protected]>
ihcsim pushed a commit that referenced this pull request Aug 2, 2019
ihcsim added a commit that referenced this pull request Aug 6, 2019
* Updated controller template with proxy partials
* Declare dependency in requirements.yaml
* Add partial template for proxy's metadata
* Add proxy-init partial template
* Script to lint Helm charts and update their dependencies
* Update partials chart Chart.yaml
* Add proxy-init and resource partial templates
* Replace hard coded namespace variable in proxy env var
* Ignore chart dependencies .tgz files
* Add missing fields and re-order YAML elements to match CLI output
* Reuse control plane's resource partial template in 'partials' chart
* Set the proxy's destination service address env var
* Add Grafana's template
* Update api version of controller RBAC
* Add Heartbeat template
* Remove duplicated resources partial template
* Add remainder control plane components templates
* Add template for the 'linkerd-config' config map
* Add debug container template
* Update proxy partial with 'disable-identity' and 'disable-tap' variables

Note that these are inject-only variables.
Also added the LINKERD2_PROXY_TAP_SVC_NAME env var.

* Add validation conditions to ensure identity and tap aren't disabled for
control plane components
* Add partials for service account token mount path and security context capabilities
* Change proxy and proxy-init templates to use global scope

Some of the nested variables are removed from values.yaml to ensure changes
made to root-level variables are propagated directly into the partial
templates. The previous approach of using YAML anchors in the
values.yaml to share common values can get out-of-sync when values are
changed via the Helm's `--set` option.

* Update templates and values file to match #3161
* Perform a dry run installation if there is a local Tiller
* Reorder JSON elements in linkerd-config
* Re-adjust nested partials indentation to work with inject 'patch' chart

Previously, the partials will render their content as an element in the list.
While it works for installation, the toJson function in the 'inject' patch code
ends up converting it into a JSON list, instead of the expected JSON
object.

* Trap the last fail command in the Helm shell script
* Add the identity trust anchor
* Address Thomas' feedback on handling HA

All the HA-related variables are moved to values-ha.yaml

* Convert ignore ports string to JSON list in linkerd-config

Also fixed some indentation issues.

* Add values-ha.yaml
* Include the service account token mount path only if identity is enabled
* Fixed malformed JSON in linkerd-config config map
* Rename chart to 'linkerd2'
* Add NOTES.txt
* Fix incorrect variable path in proxy template
* Remove fake TLS assets
* Add 'required' constraint to identity trust anchors variable
* Update tap templates per #3167
* Bump default version to edge-19.8.1 due to dependency on RSA support

Signed-off-by: Ivan Sim <[email protected]>
cpretzer pushed a commit that referenced this pull request Aug 6, 2019
PR #3167 introduced a Tap APIService, and migrated `linkerd tap` to it.

This change migrates `linkerd top` to the new Tap APIService. It also
addresses a `panic: close of closed channel` issue, where two go
routines could both call `close(done)` on exit.

Fixes #3168

Signed-off-by: Andrew Seigner <[email protected]>
cpretzer pushed a commit that referenced this pull request Aug 6, 2019
PR #3167 introduced a Tap APIService, and migrated linkerd tap to it.

This change migrates `linkerd profile --tap` to the new Tap APIService.

Depends on #3186
Fixes #3169

Signed-off-by: Andrew Seigner <[email protected]>
cpretzer pushed a commit that referenced this pull request Aug 6, 2019
* Updated controller template with proxy partials
* Declare dependency in requirements.yaml
* Add partial template for proxy's metadata
* Add proxy-init partial template
* Script to lint Helm charts and update their dependencies
* Update partials chart Chart.yaml
* Add proxy-init and resource partial templates
* Replace hard coded namespace variable in proxy env var
* Ignore chart dependencies .tgz files
* Add missing fields and re-order YAML elements to match CLI output
* Reuse control plane's resource partial template in 'partials' chart
* Set the proxy's destination service address env var
* Add Grafana's template
* Update api version of controller RBAC
* Add Heartbeat template
* Remove duplicated resources partial template
* Add remainder control plane components templates
* Add template for the 'linkerd-config' config map
* Add debug container template
* Update proxy partial with 'disable-identity' and 'disable-tap' variables

Note that these are inject-only variables.
Also added the LINKERD2_PROXY_TAP_SVC_NAME env var.

* Add validation conditions to ensure identity and tap aren't disabled for
control plane components
* Add partials for service account token mount path and security context capabilities
* Change proxy and proxy-init templates to use global scope

Some of the nested variables are removed from values.yaml to ensure changes
made to root-level variables are propagated directly into the partial
templates. The previous approach of using YAML anchors in the
values.yaml to share common values can get out-of-sync when values are
changed via the Helm's `--set` option.

* Update templates and values file to match #3161
* Perform a dry run installation if there is a local Tiller
* Reorder JSON elements in linkerd-config
* Re-adjust nested partials indentation to work with inject 'patch' chart

Previously, the partials will render their content as an element in the list.
While it works for installation, the toJson function in the 'inject' patch code
ends up converting it into a JSON list, instead of the expected JSON
object.

* Trap the last fail command in the Helm shell script
* Add the identity trust anchor
* Address Thomas' feedback on handling HA

All the HA-related variables are moved to values-ha.yaml

* Convert ignore ports string to JSON list in linkerd-config

Also fixed some indentation issues.

* Add values-ha.yaml
* Include the service account token mount path only if identity is enabled
* Fixed malformed JSON in linkerd-config config map
* Rename chart to 'linkerd2'
* Add NOTES.txt
* Fix incorrect variable path in proxy template
* Remove fake TLS assets
* Add 'required' constraint to identity trust anchors variable
* Update tap templates per #3167
* Bump default version to edge-19.8.1 due to dependency on RSA support

Signed-off-by: Ivan Sim <[email protected]>
kleimkuhler added a commit that referenced this pull request Aug 8, 2019
### Motivation

PR #3167 introduced the tap APIService and migrated `linkerd tap` to use it.
Subsequent PRs (#3186 and #3187) updated `linkerd top` and `linkerd profile
--tap` to use the tap APIService. This PR moves the web's Go server to now also
use the tap APIService instead of the public API. It also ensures an error
banner is shown to the user when unauthorized taps fail via `linkerd top`
command in *Overview* and *Top*, and `linkerd tap` command in *Tap*.

### Details

The majority of these changes are focused around piping through the HTTP error
that occurs and making sure the error banner generated displays the error
message explaining to view the tap RBAC docs.

`httpError` is now public (`HTTPError`) and the error message generated is short
enough to fit in a control frame (explained [here](https://github.com/linkerd/linkerd2/blob/kleimkuhler%2Fweb-tap-apiserver/web/srv/api_handlers.go#L173-L175)).

### Testing

The error we are testing for only occurs when the linkerd-web service account is
not authorzied to tap resources. Unforutnately that is not the case on Docker
For Mac (assuming that is what you use locally), so you'll need to test on a
different cluster. I chose a GKE cluster made through the GKE console--not made
through cluster-utils because it adds cluster-admin.

Checkout the branch locally and `bin/docker-build` or `ares-build` if you have
it setup. It should produce a linkerd with the version `git-04e61786`. I have
already pushed the dependent components, so you won't need to `bin/docker-push
git-04e61786`.

Install linkerd on this GKE cluster and try to run `tap` or `top` commands via
the web. You should see the following errors:

### Tap

![web-tap-unauthorized](https://user-images.githubusercontent.com/4572153/62661243-51464900-b925-11e9-907b-29d7ca3f815d.png)

### Top

![web-top-unauthorized](https://user-images.githubusercontent.com/4572153/62661308-894d8c00-b925-11e9-9498-6c9d38b371f6.png)

Signed-off-by: Kevin Leimkuhler <[email protected]>
siggy added a commit that referenced this pull request Oct 14, 2019
The `linkerd upgrade --from-manifests` command supports reading the
manifest output via `linkerd install`. PR #3167 introduced a tap
APIService object into `linkerd install`, but the manifest-reading code
in fake.go was never updated to support this new object kind.

Update the fake clientset code to support APIService objects.

Signed-off-by: Andrew Seigner <[email protected]>
siggy added a commit that referenced this pull request Oct 14, 2019
The `linkerd upgrade --from-manifests` command supports reading the
manifest output via `linkerd install`. PR #3167 introduced a tap
APIService object into `linkerd install`, but the manifest-reading code
in fake.go was never updated to support this new object kind.

Update the fake clientset code to support APIService objects.

Fixes #3559

Signed-off-by: Andrew Seigner <[email protected]>
siggy added a commit that referenced this pull request Oct 16, 2019
The `linkerd upgrade --from-manifests` command supports reading the
manifest output via `linkerd install`. PR #3167 introduced a tap
APIService object into `linkerd install`, but the manifest-reading code
in fake.go was never updated to support this new object kind.

Update the fake clientset code to support APIService objects.

Fixes #3559

Signed-off-by: Andrew Seigner <[email protected]>
siggy added a commit that referenced this pull request Oct 21, 2019
The `linkerd upgrade --from-manifests` command supports reading the
manifest output via `linkerd install`. PR #3167 introduced a tap
APIService object into `linkerd install`, but the manifest-reading code
in fake.go was never updated to support this new object kind.

Update the fake clientset code to support APIService objects.

Fixes #3559

Signed-off-by: Andrew Seigner <[email protected]>
ihcsim pushed a commit that referenced this pull request Oct 21, 2019
The `linkerd upgrade --from-manifests` command supports reading the
manifest output via `linkerd install`. PR #3167 introduced a tap
APIService object into `linkerd install`, but the manifest-reading code
in fake.go was never updated to support this new object kind.

Update the fake clientset code to support APIService objects.

Fixes #3559

Signed-off-by: Andrew Seigner <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make the tap service an APIExtension
6 participants