Everything that produces an output can be used inside Vessel to gather information and do scans.
Integration of a new tool is a very simple three stage process:
- Stage 1
- Add the tool inside the Vessel container, by adding it into the
Dockerfile
.
- Add the tool inside the Vessel container, by adding it into the
- Stage 2
- Create a new python file inside
vessel/tools
directory named after the tool you want to add and populate it with as many functions as the Kubernetes you want to handle, declaring for each one a@vessel_hook
annotation. - Inside the module you will get passed a resource
dict
and avessel.models.Context
and will need to return an array ofvessel/models.Issue
with justissue
andissue_metadata
fields. - The array of
vessel/models.Issue
will need to be wrapped with thevessel_result
function.
- Create a new python file inside
- Stage 3
- Add your tool (or not) to the defaults.
Here's an example of the workflow used to integrate kubesec tool.
It's easy to add a tool to the container, by adding a section like this to the Dockerfile:
# install kubesec
RUN cd /tmp && \
curl -sfL -o kubesec_linux_amd64.tar.gz https://github.com/controlplaneio/kubesec/releases/download/v2.11.4/kubesec_linux_amd64.tar.gz && \
cd /usr/local/bin && \
tar -xzvf /tmp/kubesec_linux_amd64.tar.gz kubesec && \
rm /tmp/kubesec_linux_amd64.tar.gz
Have a look at the file named kubesec.py
under the vessel/tools
path. The import section is structured to include all the needed modules. In this case we will have:
import json
import yaml
import logging
from ..models import Issue, Context
from ..manager import vessel_hook, vessel_result
from io import StringIO
from sh import kubesec, ErrorReturnCode_2
Pay attention to the from
lines, you'll need to import Issue
and Context
functions from ..models
(the vessel/models.py
file) and vessel_hook
and vessel_result
from ..manager
(the vessel/manager.py
file) to be able to properly uniform the output of the command.
The last statement from sh import kubesec, ErrorReturnCode_2
uses the sh
Python module to make kubesec
usable as a function and declares its return code ErrorReturnCode_2
, since kubesec
returns 2 if it finds critical problems with the passed resource.
Next we activate the logger, like this:
logger = logging.getLogger(__name__)
And then we have the crucial part of the module, the execution of the kubesec
tool.
The kubesec
command executes as we would launch this from a terminal:
$ cat myresource.yaml | kubesec scan -
The typical output of a problematic yaml will be something like this:
[
{
"object": "Pod/calico-node-qqdmj.kube-system",
"valid": true,
"fileName": "STDIN",
"message": "Failed with a score of -35 points",
"score": -35,
"scoring": {
"critical": [
{
"id": "Privileged",
"selector": "containers[] .securityContext .privileged == true",
"reason": "Privileged containers can allow almost completely unrestricted host access",
"points": -30
},
{
"id": "HostNetwork",
"selector": ".spec .hostNetwork == true",
"reason": "Sharing the host's network namespace permits processes in the pod to communicate with processes bound to the host's loopback adapter",
"points": -9
}
],
"passed": [
...
],
"advise": [
...
]
}
}
]
This means we need to cycle over the [ { "scoring": { "critical": [] } } ]
section of the json, extracting for each problem the id
as a unique identifier and the entire critical record for metadata.
This translates in this code snippet:
def resource_kubesec(k8s_object):
result = StringIO()
issues = []
specyaml = yaml.dump(k8s_object)
try:
kubesec("scan", "-", _in=specyaml, _out=result)
except ErrorReturnCode_2:
report = json.loads(result.getvalue())[0]
for v in report['scoring'].get('critical', []):
issues.append(Issue(name=f'{v["id"]}', metadata=v))
except BaseException:
logger.error("Unable to execute kubescan")
return issues
We are declaring three main variables:
-
result
to get the entire buffer of the output coming from thekubesec
command; -
issues
to contain all the issues in an array; -
specyaml
in which we dump the yaml content of the resource we're handling;
We execute kubesec
and if it fails with return code 2 (so the yaml is considered problematic) we load the [0]
record of the json into the report
variable and we cycle over all the critical
records under scoring
.
After this, we'll have the result of the scan in issues
, ready to be returned to Vessel.
Last section will include, for each resource you'll want to handle, a function named after the resource introduced by a @vessel_hook
annotation:
@vessel_hook
def deployment(resource, ctx):
return vessel_result(resource_kubesec(resource))
@vessel_hook
def deploymentconfig(resource, ctx):
return vessel_result(resource_kubesec(resource))
@vessel_hook
def job(resource, ctx):
return vessel_result(resource_kubesec(resource))
@vessel_hook
def statefulset(resource, ctx):
return vessel_result(resource_kubesec(resource))
@vessel_hook
def daemonset(resource, ctx):
return vessel_result(resource_kubesec(resource))
The new tool could become a default or not, depending on how the vessel/cli.py
is configured.
If you want it to become a default there are two sections you'll need to change, related to the one shot mode:
@cli.command()
@click.argument('resource', type=click.File())
@click.option('-t', '--tools', envvar='TASKS', default="linter,trivy,kubesec,kubescore", type=click.STRING, help="Tools to run separated by commas [default]")
And the server mode:
@cli.command()
@click.option('-n', '--namespaces', envvar='NAMESPACES', default=None, type=click.STRING, help="Namespaces to watch separated by commas [default]")
@click.option('-t', '--tools', envvar='TOOLS', default="linter,trivy,kubesec,kubescore", type=click.STRING, help="Tools to run separated by commas [linter,trivy,kubesec]")
Once everything described in this page is in place, you can make a single pull-request for every tool you want to add.