Add debugging support for Skaffold: skaffold debug #1702

Merged
merged 66 commits into from
Mar 27, 2019


@briandealwis briandealwis commented Feb 27, 2019

This PR adds a new command to Skaffold called debug. debug provides a zero-configuration solution for setting up containers for debugging. It is a mix of run and dev in that it builds and deploys artifacts like run, but monitors and cleans up like dev. debug transforms the Kubernetes manifests to set up each container for debugging as required by the container's runtime technology.

Comments and suggestions are welcome. I know my Go is terrible and can be improved.

About skaffold debug

skaffold debug is like skaffold dev with the following differences:

  • Kubernetes pod-bearing objects that reference built images are transformed to set up runtime-specific debugging technologies:
    • JVM applications have a JAVA_TOOL_OPTIONS environment variable added/adjusted to configure the JDWP agent to listen on a specific port
    • NodeJS applications have a --inspect=port added to configure a DevTools connection
  • The associated debug ports are exposed and labelled
  • An annotation named debug.cloud.google.com/config is added to the PodSpec metadata to describe the JDWP/DevTools ports. Its value is a string containing a JSON object mapping container-name → a debug runtime configuration object. Each debug configuration object has at least a runtime field holding a simple ID that describes the underlying language runtime technology (e.g., jvm, nodejs); other fields specify any configuration specific to that language runtime technology. For example, a pod with two containers named microservice and adapter (line breaks added for legibility):
debug.cloud.google.com/config: '{
   "microservice":{"devtools":9229,"runtime":"nodejs"},
   "adapter":{"jdwp":5005,"runtime":"jvm"}
}'
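
As a rough illustration of the annotation format, the sketch below encodes and decodes a container-name → configuration map with Go's standard encoding/json package. The type and function names (containerDebugConfig, encodeDebugConfig, decodeDebugConfig) are hypothetical and are not taken from this PR; only the annotation key and the field names come from the description above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// annotationKey is the annotation name given in the PR description.
const annotationKey = "debug.cloud.google.com/config"

// containerDebugConfig is a hypothetical stand-in for one container's debug
// configuration: a "runtime" ID plus runtime-specific fields such as
// "jdwp" or "devtools".
type containerDebugConfig map[string]interface{}

// encodeDebugConfig serializes the container-name -> configuration map into
// the JSON string that would be stored as the annotation value.
func encodeDebugConfig(configs map[string]containerDebugConfig) (string, error) {
	b, err := json.Marshal(configs)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

// decodeDebugConfig shows how a consumer (for example an IDE plugin) might
// read the annotation value back. JSON numbers decode as float64.
func decodeDebugConfig(value string) (map[string]containerDebugConfig, error) {
	configs := map[string]containerDebugConfig{}
	err := json.Unmarshal([]byte(value), &configs)
	return configs, err
}

func main() {
	value, err := encodeDebugConfig(map[string]containerDebugConfig{
		"microservice": {"runtime": "nodejs", "devtools": 9229},
		"adapter":      {"runtime": "jvm", "jdwp": 5005},
	})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s: '%s'\n", annotationKey, value)
}
```

encoding/json writes map keys in sorted order, so the encoded string is stable across runs.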

Example transforms

For NodeJS, involving a PodSpec from examples/hot-reload. The changes in this example:

  1. An annotation debug.cloud.google.com/config has a JSON object that describes the connection configuration details. As Kubernetes container objects don't carry metadata, we have to put this metadata on the container's parent; as a pod/podspec can have multiple containers, we use a key-value map, keyed by the container name.
  2. A new argument --inspect=9229 is added to the node command line
  3. The devtools port 9229 is exposed

Note that we identify this artifact as Node based on the container attempting to run nodemon. (npm run xxx is not handled as it doesn't support passing through --inspect.)

apiVersion: v1
kind: Pod
metadata:
  annotations:
    debug.cloud.google.com/config: '{"node":{"devtools":9229,"runtime":"nodejs"}}'
  creationTimestamp: null
  labels:
    cleanup: "true"
    docker-api-version: "1.38"
    skaffold-builder: local
    skaffold-deployer: kubectl
    skaffold-tag-policy: git-commit
    tail: "true"
  name: node
  namespace: default
spec:
  containers:
  - args:
    - nodemon
    - --inspect=9229
    - --legacy-watch
    - server.js
    image: gcr.io/k8s-skaffold/node-example:c42c314176d27cd0cfc1d7fbe983ab25f97c85dfcd7faf3000d86ba08bc5c3cd
    name: node
    ports:
    - containerPort: 3000
    - containerPort: 9229
      name: devtools

For Java (a deployment from examples/jib). The changes in this example:

  1. An annotation debug.cloud.google.com/config has a JSON object that describes the connection configuration details. As Kubernetes container objects don't carry metadata, we have to put this metadata on the container's parent; as a pod/podspec can have multiple containers, we use a key-value map, keyed by the container name.
  2. A new environment variable JAVA_TOOL_OPTIONS is added; this is picked up by the JVM at launch.
  3. The JDWP port 5005 is exposed.
apiVersion: apps/v1
kind: Deployment
metadata:
  creationTimestamp: null
  labels:
    cleanup: "true"
    docker-api-version: "1.38"
    skaffold-builder: local
    skaffold-deployer: kubectl
    skaffold-tag-policy: git-commit
    tail: "true"
  name: web
  namespace: default
spec:
  selector:
    matchLabels:
      app: web
  strategy: {}
  template:
    metadata:
      annotations:
        debug.cloud.google.com/config: '{"web":{"jdwp":5005,"runtime":"jvm"}}'
      creationTimestamp: null
      labels:
        app: web
        cleanup: "true"
        docker-api-version: "1.38"
        skaffold-builder: local
        skaffold-deployer: kubectl
        skaffold-tag-policy: git-commit
        tail: "true"
    spec:
      containers:
      - env:
        - name: JAVA_TOOL_OPTIONS
          value: -agentlib:jdwp=transport=dt_socket,server=y,address=5005,suspend=n,quiet=y
        image: gcr.io/k8s-skaffold/skaffold-jib:d6b74559a3f72c7ed03add0aaeed5bbddfd35be583b68db57896c24d61c29ce4
        name: web
        ports:
        - containerPort: 8080
        - containerPort: 5005
          name: jdwp
        resources: {}
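
The "added/adjusted" handling of JAVA_TOOL_OPTIONS might look roughly like the sketch below. envVar is a simplified, hypothetical stand-in for the Kubernetes environment variable type, and addJDWP is an illustrative name, not a function from this PR; the agent string is the one shown in the manifest above.

```go
package main

import (
	"fmt"
	"strings"
)

// envVar is a simplified, hypothetical stand-in for a container
// environment variable entry.
type envVar struct {
	Name  string
	Value string
}

// jdwpAgentOption builds the JDWP agent flag used in the example manifest.
func jdwpAgentOption(port int) string {
	return fmt.Sprintf("-agentlib:jdwp=transport=dt_socket,server=y,address=%d,suspend=n,quiet=y", port)
}

// addJDWP returns a copy of env in which JAVA_TOOL_OPTIONS carries a JDWP
// agent option. An existing value is extended rather than replaced, and a
// value that already configures JDWP is left alone.
func addJDWP(env []envVar, port int) []envVar {
	agent := jdwpAgentOption(port)
	out := make([]envVar, len(env))
	copy(out, env)
	for i := range out {
		if out[i].Name != "JAVA_TOOL_OPTIONS" {
			continue
		}
		if strings.Contains(out[i].Value, "-agentlib:jdwp=") {
			return out // assumed to be configured already
		}
		out[i].Value = strings.TrimSpace(out[i].Value + " " + agent)
		return out
	}
	return append(out, envVar{Name: "JAVA_TOOL_OPTIONS", Value: agent})
}

func main() {
	for _, e := range addJDWP([]envVar{{Name: "JAVA_TOOL_OPTIONS", Value: "-Xmx256m"}}, 5005) {
		fmt.Printf("%s=%s\n", e.Name, e.Value)
	}
}
```

Appending to an existing value keeps any options the image or manifest already sets, which is one plausible reading of "adjusted"; the actual merge rules in the PR may differ.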

Change Overview

This implementation happens entirely at deploy time. When I initially started writing this code, my plan was to implement it as a mutating admission controller, where the code only had access to the images. So this code walks the manifests to be deployed and examines and transforms any containers referenced in a pod/podspec that use a supported language runtime.

We guess the runtime technology for an image container by examining the referenced image container configuration, specifically looking at the environment variables and command-line. It turns out that most of the language runtime base images define a XXXX_VERSION environment variable or use a well-known command-name (java or node or nodemon).
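
A minimal sketch of this heuristic follows. imageConfig is a hypothetical, trimmed-down view of an image configuration, and the specific variable names JAVA_VERSION and NODE_VERSION are assumptions standing in for the XXXX_VERSION pattern mentioned above; the PR's real detection logic may check more than this.

```go
package main

import (
	"fmt"
	"path"
)

// imageConfig is a hypothetical, trimmed-down view of a container image
// configuration: just the environment and the command line.
type imageConfig struct {
	Env        map[string]string
	Entrypoint []string
	Cmd        []string
}

// guessRuntime returns "jvm", "nodejs", or "" when the runtime is unknown.
// It first looks for a well-known *_VERSION variable (names assumed here),
// then falls back to the name of the executable being run.
func guessRuntime(cfg imageConfig) string {
	if _, ok := cfg.Env["JAVA_VERSION"]; ok {
		return "jvm"
	}
	if _, ok := cfg.Env["NODE_VERSION"]; ok {
		return "nodejs"
	}
	commandLine := append(append([]string{}, cfg.Entrypoint...), cfg.Cmd...)
	if len(commandLine) == 0 {
		return ""
	}
	switch path.Base(commandLine[0]) {
	case "java":
		return "jvm"
	case "node", "nodemon":
		return "nodejs"
	}
	return ""
}

func main() {
	cfg := imageConfig{Cmd: []string{"nodemon", "--legacy-watch", "server.js"}}
	fmt.Println(guessRuntime(cfg))
}
```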

  • most debugging transform code is in pkg/skaffold/debugging
  • adds a hook to deploy/kubectl.go to configure a manifest transformer (ManifestList)
  • debugging's manifest transformer (ApplyDebuggingTransforms in pkg/skaffold/debugging/debug.go) walks the Kubernetes objects and applies changes to podspecs based on the guessed runtime technology
  • Update: {{Skaffold Change}} A skaffold/build/Artifact alone isn't enough to introspect on the corresponding container image as it may have been loaded into the local docker daemon or pushed to a registry. I changed skaffold/build/Artifact to include a Location enum, and changed the Builder interface to return an Artifact with the location, rather than just return the built tag.
  • guessing the runtime technology is a bit involved as it requires retrieving the container config and looking at the environment and command-line
    • ~~the container image's config (go-containerregistry/pkg/v1/config/Config) is contained by re-defining the skaffold/build/Artifact to also provide a configuration retriever function.~~ Update: now entirely hidden inside the debugging package. There are two retriever implementations, one that fetches from a registry and the other from the Docker daemon. This must be determined at build-time as it depends on the local-push.
    • passing this configuration retriever back is a bit … ugly
  • the transforms examine the pod's ports to allocate a non-conflicting port that is close to the standard port (e.g., 5005 for JDWP, 9229 for NodeJS DevTools)
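
The port allocation described in the last bullet could be approximated as below. This is a sketch under the assumption that "close to the standard port" means the first free port at or above it; allocatePort is a hypothetical name, unrelated to the portAlloc function that appears in the review diff.

```go
package main

import "fmt"

// allocatePort returns the first port at or above desired that is not
// already in used, and records it in used so later calls skip it.
// The boolean result is false when no port is available.
func allocatePort(desired int, used map[int]bool) (int, bool) {
	for port := desired; port <= 65535; port++ {
		if !used[port] {
			used[port] = true
			return port, true
		}
	}
	return 0, false
}

func main() {
	// Ports already declared by the pod's containers.
	used := map[int]bool{8080: true, 5005: true}
	port, ok := allocatePort(5005, used)
	fmt.Println(port, ok) // prints "5006 true"
}
```

Recording each allocation in the shared map keeps two containers in the same pod from being handed the same debug port.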

Follow-ups

I'd prefer to leave these follow-ups to separate PRs:

pkg/skaffold/deploy/kubectl/debug.go
port = portAlloc(5005)

javaToolOptions := v1.EnvVar{
Name: "JAVA_TOOL_OPTIONS",

Should we make these constants?

tejal29 commented Feb 27, 2019

Great PR!! Looking forward to seeing this in.

@balopat balopat left a comment

This is looking great!
I tried it and now it works nicely too!

I have mostly nits and some refactoring + docs comments.

Some other observations:

  • It is a little bit annoying how the multiple containers can race for ports so sometimes web1 gets the 8081 sometimes web2.
  • sync and rebuild would be actually nice - now the iteration requires restarting skaffold debug - which I usually do by changing the yaml file :)

@etanshaul

Thanks Brian. The logic lgtm. I’ll defer to the Skaffold folks for the golang specifics.

@briandealwis

Thanks @balopat for the thorough review.

> It is a little bit annoying how the multiple containers can race for ports so sometimes web1 gets the 8081 sometimes web2.

IIUC, this is related to the port-forwarding of declared container ports to the local host, right?
It would be even worse if we didn't rewrite the replicas to 1! I see this as a tooling issue: the GWT tooling for Eclipse, for example, provides a view with an annotated list of URLs that a developer might want to examine. We should have enough information from the events to do something similar.

> sync and rebuild would be actually nice - now the iteration requires restarting skaffold debug - which I usually do by changing the yaml file :)

+1

@briandealwis

PTAL @balopat

8 participants