Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: update air-gapped docs #7160

Merged
merged 10 commits into from
Aug 9, 2024
Merged
Show file tree
Hide file tree
Changes from 5 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
197 changes: 93 additions & 104 deletions docs/docs/advanced/air-gap.md
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have an idea about re-structuring this page, but we can do that in another PR.

Original file line number Diff line number Diff line change
@@ -1,142 +1,131 @@
# Air-Gapped Environment
# Advanced Network Scenarios

Trivy can be used in air-gapped environments. Note that an allowlist is [here][allowlist].
Trivy needs to connect to the internet occasionally, in order to download relevant content. This document explains the network connectivity requirements of Trivy and setting up Trivy in particular scenarios.

## Air-Gapped Environment for vulnerabilities
## Network requirements

### Download the vulnerability database
At first, you need to download the vulnerability database for use in air-gapped environments.
Trivy's databases are distributed as OCI images via GitHub Container registry (GHCR):

=== "Trivy"
- <https://ghcr.io/aquasecurity/trivy-db>
- <https://ghcr.io/aquasecurity/trivy-java-db>
- <https://ghcr.io/aquasecurity/trivy-checks>

```
TRIVY_TEMP_DIR=$(mktemp -d)
trivy --cache-dir $TRIVY_TEMP_DIR image --download-db-only
tar -cf ./db.tar.gz -C $TRIVY_TEMP_DIR/db metadata.json trivy.db
rm -rf $TRIVY_TEMP_DIR
```
If Trivy is running behind a firewall, you'll need to add the following urls to your allowlist:

=== "oras >= v0.13.0"
Please follow [oras installation instruction][oras].
- `ghcr.io`
- `pkg-containers.githubusercontent.com`

Download `db.tar.gz`:
The databases are pulled by Trivy using the [OCI Distribution](https://github.com/opencontainers/distribution-spec) specification, which is based on simple HTTPS protocol.

```
$ oras pull ghcr.io/aquasecurity/trivy-db:2
```
## Running Trivy in air-gapped environment

=== "oras < v0.13.0"
Please follow [oras installation instruction][oras].
An air-gapped environment refers to situations where the network connectivity from the machine Trivy runs on is blocked or restricted.

Download `db.tar.gz`:
In an air-gapped environment it is your responsibility to update the Trivy databases on a regular basis.

```
$ oras pull -a ghcr.io/aquasecurity/trivy-db:2
```
## Offline Mode

### Download the Java index database[^1]
Java users also need to download the Java index database for use in air-gapped environments.
By default, Trivy will attempt to download latest databases. If it fails, the scan might fail. To avoid this behavior, you can tell Trivy to not attempt to download database files:

!!! note
You container image may contain JAR files even though you don't use Java directly.
In that case, you also need to download the Java index database.
- `--skip-db-update` to skip updating the main vulnerability database.
- `--skip-java-db-update` to skip updating the Java vulnerability database.
- `--skip-check-update` to skip updating the misconfiguration database.

=== "Trivy"
```shell
trivy image --skip-db-update --skip-java-db-update --offline-scan --skip-check-update myimage
```

```
TRIVY_TEMP_DIR=$(mktemp -d)
trivy --cache-dir $TRIVY_TEMP_DIR image --download-java-db-only
tar -cf ./javadb.tar.gz -C $TRIVY_TEMP_DIR/java-db metadata.json trivy-java.db
rm -rf $TRIVY_TEMP_DIR
```
=== "oras >= v0.13.0"
Please follow [oras installation instruction][oras].
## Self-Hosting

Download `javadb.tar.gz`:
You can host the databases on your own local OCI registry, in order to prevent Trivy reaching out of your network.

```
$ oras pull ghcr.io/aquasecurity/trivy-java-db:1
```
First, make a copy of the databases in a container registry that is accessible to Trivy. The databases are in:

=== "oras < v0.13.0"
Please follow [oras installation instruction][oras].
- `ghcr.io/aquasecurity/trivy-db:2`
- `ghcr.io/aquasecurity/trivy-java-db:1`
- `ghcr.io/aquasecurity/trivy-checks:0`

Download `javadb.tar.gz`:
Then, tell Trivy to use the local registry:

```
$ oras pull -a ghcr.io/aquasecurity/trivy-java-db:1
```
```shell
trivy image \
--db-repository myregistry.local/trivy-db \
--java-db-repository myregistry.local/trivy-java-db \
--checks-bundle-repository myregistry.local/trivy-checks \
myimage
```

### Authentication
itaysk marked this conversation as resolved.
Show resolved Hide resolved

### Transfer the DB files into the air-gapped environment
The way of transfer depends on the environment.
If the registry requires authentication, you can configure it in as described in the [private registry authentication document](../advanced/private-registries/index.md).

=== "Vulnerability db"
```
$ rsync -av -e ssh /path/to/db.tar.gz [user]@[host]:dst
```
## Manual cache population

=== "Java index db[^1]"
```
$ rsync -av -e ssh /path/to/javadb.tar.gz [user]@[host]:dst
```
You can also download the databases files manually and surgically populate the Trivy cache directory with them.

### Put the DB files in Trivy's cache directory
You have to know where to put the DB files. The following command shows the default cache directory.
### Downloading the DB files

On a machine with internet access, pull the database container archive from the public registry into your local workspace:

Note that these examples operate in the current working directory.

=== "Using ORAS"
This example uses [ORAS](https://oras.land), but you can use any other container registry manipulation tool.

```shell
oras pull ghcr.io/aquasecurity/trivy-db:2
```
$ ssh user@host
$ trivy -h | grep cache
--cache-dir value cache directory (default: "/home/myuser/.cache/trivy") [$TRIVY_CACHE_DIR]
```
=== "Vulnerability db"
Put the DB file in the cache directory + `/db`.

```
$ mkdir -p /home/myuser/.cache/trivy/db
$ cd /home/myuser/.cache/trivy/db
$ tar xvf /path/to/db.tar.gz -C /home/myuser/.cache/trivy/db
x trivy.db
x metadata.json
$ rm /path/to/db.tar.gz
```

=== "Java index db[^1]"
Put the DB file in the cache directory + `/java-db`.

```
$ mkdir -p /home/myuser/.cache/trivy/java-db
$ cd /home/myuser/.cache/trivy/java-db
$ tar xvf /path/to/javadb.tar.gz -C /home/myuser/.cache/trivy/java-db
x trivy-java.db
x metadata.json
$ rm /path/to/javadb.tar.gz
```



In an air-gapped environment it is your responsibility to update the Trivy databases on a regular basis, so that the scanner can detect recently-identified vulnerabilities.

### Run Trivy with the specific flags.
In an air-gapped environment, you have to specify `--skip-db-update` and `--skip-java-db-update`[^1] so that Trivy doesn't attempt to download the latest database files.
In addition, if you want to scan `pom.xml` dependencies, you need to specify `--offline-scan` since Trivy tries to issue API requests for scanning Java applications by default.

You should now have a file called `db.tar.gz`. Next, extract it to reveal the db files:

```shell
tar -xzf db.tar.gz
```
$ trivy image --skip-db-update --skip-java-db-update --offline-scan alpine:3.12

You should now have 2 new files, `metadata.json` and `trivy.db`. These are the Trivy DB files.

=== "Using Trivy"
This example uses Trivy to pull the database container archive. The `--cache-dir` flag makes Trivy download the database files into our current working directory. The `--download-db-only` flag tells Trivy to only download the database files, not to scan any images.

```shell
trivy image --cache-dir . --download-db-only
```

## Air-Gapped Environment for misconfigurations
You should now have 2 new files, `metadata.json` and `trivy.db`. These are the Trivy DB files, copy them over to the air-gapped environment.

No special measures are required to detect misconfigurations in an air-gapped environment.
### Populating the Trivy Cache

### Run Trivy with `--skip-check-update` option
In an air-gapped environment, specify `--skip-check-update` so that Trivy doesn't attempt to download the latest misconfiguration checks.
In order to populate the cache, you need to identify the location of the cache directory. If it is under the default location, you can run the following command to find it:

```shell
trivy -h | grep cache
```
$ trivy conf --skip-policy-update /path/to/conf

For the example, we will assume the `TRIVY_CACHE_DIR` variable holds the cache location:

```shell
TRIVY_CACHE_DIR=/home/user/.cache/trivy
```

Put the Trivy DB files in the Trivy cache directory under a `db` subdirectory:

```shell
# ensure cache db directory exists
mkdir -p ${TRIVY_CACHE_DIR}/db
# copy the db files
cp /path/to/trivy.db /path/to/metadata.json ${TRIVY_CACHE_DIR}/db/
```

[allowlist]: ../references/troubleshooting.md
[oras]: https://oras.land/docs/installation
### Java DB

For Java DB the process is the same, except for the following:

1. Image location is `ghcr.io/aquasecurity/trivy-java-db:1`
2. Archive file name is `javadb.tar.gz`
3. DB file name is `trivy-java.db`

## Misconfigurations scanning

Note that the misconfigurations database is also embedded in the Trivy binary (at build time), and will be used as a fallback if the external database is not available. This means that you can still scan for misconfigurations in an air-gapped environment using the Checks from the time of the Trivy release you are using.
itaysk marked this conversation as resolved.
Show resolved Hide resolved

[^1]: This is only required to scan `jar` files. More information about `Java index db` [here](../coverage/language/java.md)
The misconfiguration scanner can be configured to load checks from a local directory, using the `--config-check` flag. In an air-gapped scenario you can copy the checks library from [Trivy checks repository](https://github.com/aquasecurity/trivy-checks) into a local directory, and load it with this flag. See more in the [Misconfiguration scanner documentation](../scanner/misconfiguration/index.md).
2 changes: 1 addition & 1 deletion docs/docs/compliance/contrib-compliance.md
Original file line number Diff line number Diff line change
Expand Up @@ -35,7 +35,7 @@ Additional information is provided below.

#### 1. Referencing a check that is already part of Trivy

Trivy has a comprehensive list of checks as part of its misconfiguration scanning. These can be found in the `trivy-policies/checks` directory ([Link](https://github.com/aquasecurity/trivy-checks/tree/main/checks)). If the check is present, the `AVD_ID` and other information from the check has to be used.
Trivy has a comprehensive list of checks as part of its misconfiguration scanning. These can be found in the `trivy-checks/checks` directory ([Link](https://github.com/aquasecurity/trivy-checks/tree/main/checks)). If the check is present, the `AVD_ID` and other information from the check has to be used.

Note: Take a look at the more generic compliance specs that are already available in Trivy. If you are adding new compliance spec to Kubernetes e.g. AWS EKS CIS Benchmarks, chances are high that the check you would like to add to the new spec has already been defined in the general `k8s-ci-v.000.yaml` compliance spec. The same applies for creating specific Cloud Provider Compliance Specs and the [generic compliance specs](https://github.com/aquasecurity/trivy-checks/tree/main/specs/compliance) available.

Expand Down
6 changes: 2 additions & 4 deletions docs/docs/references/troubleshooting.md
Original file line number Diff line number Diff line change
Expand Up @@ -203,10 +203,7 @@ Trivy v0.23.0 or later requires Trivy DB v2. Please update your local database o
!!! error
FATAL failed to download vulnerability DB

If trivy is running behind corporate firewall, you have to add the following urls to your allowlist.

- ghcr.io
- pkg-containers.githubusercontent.com
If Trivy is running behind corporate firewall, refer to the necessary connectivity requirements as described [here][network].

### Denied

Expand Down Expand Up @@ -271,4 +268,5 @@ $ trivy clean --all
```

[air-gapped]: ../advanced/air-gap.md
[network]: ../advanced/air-gap.md#network-requirements
[redis-cache]: ../../vulnerability/examples/cache/#cache-backend
25 changes: 13 additions & 12 deletions docs/docs/scanner/misconfiguration/check/builtin.md
Original file line number Diff line number Diff line change
@@ -1,21 +1,22 @@
# Built-in Checks

## Check Sources
Built-in checks are mainly written in [Rego][rego] and Go.
Those checks are managed under [trivy-checks repository][trivy-checks].
## Checks Sources
Trivy has an extensive library of misconfiguration checks that is maintained at <https://github.com/aquasecurity/trivy-checks>.
Trivy checks are mainly written in [Rego][rego], while some checks are written in Go.
See [here](../../../coverage/iac/index.md) for the list of supported config types.

For suggestions or issues regarding policy content, please open an issue under the [trivy-checks][trivy-checks] repository.
## Checks Bundle
When performing a misconfiguration scan, Trivy will automatically download the relevant Checks bundle. The bundle is cached locally and Trivy will reuse it for subsequent scans on the same machine. Trivy takes care of updating the cache automatically, so normally users can be oblivious to it.

## Check Distribution
Trivy checks are distributed as an OPA bundle on [GitHub Container Registry][ghcr] (GHCR).
When misconfiguration detection is enabled, Trivy pulls the OPA bundle from GHCR as an OCI artifact and stores it in the cache.
Those checks are then loaded into Trivy OPA engine and used for detecting misconfigurations.
If Trivy is unable to pull down newer checks, it will use the embedded set of checks as a fallback. This is also the case in air-gap environments where `--skip-policy-update` might be passed.
For CLI flags related to the database, please refer to [this page](../configuration/db.md).
itaysk marked this conversation as resolved.
Show resolved Hide resolved

## Update Interval
## Checks Distribution
Trivy checks are distributed as an [OPA bundle](opa-bundle) hosted in the following GitHub Container Registry: <https://ghcr.io/aquasecurity/trivy-checks>.
Trivy checks for updates to OPA bundle on GHCR every 24 hours and pulls it if there are any updates.

### External connectivity
Trivy needs to connect to the internet to download the bundle. If you are running Trivy in an air-gapped environment, or an tightly controlled network, please refer to the [Advanced Network Scenarios document](../advanced/air-gap.md).
The Checks bundle is also embedded in the Trivy binary (at build time), and will be used as a fallback if Trivy is unable to download the bundle. This means that you can still scan for misconfigurations in an air-gapped environment using the Checks from the time of the Trivy release you are using.

[rego]: https://www.openpolicyagent.org/docs/latest/policy-language/
[trivy-checks]: https://github.com/aquasecurity/trivy-checks
[ghcr]: https://github.com/aquasecurity/trivy-checks/pkgs/container/trivy-checks
[opa-bundle]: https://www.openpolicyagent.org/docs/latest/management-bundles/
2 changes: 1 addition & 1 deletion docs/docs/scanner/misconfiguration/custom/data.md
Original file line number Diff line number Diff line change
Expand Up @@ -31,5 +31,5 @@ Then, you need to pass data paths through `--data` option.
Trivy recursively searches the specified paths for JSON (`*.json`) and YAML (`*.yaml`) files.

```bash
$ trivy conf --policy ./policy --data data --namespaces user ./configs
$ trivy conf --config-check ./checks --data ./data --namespaces user ./configs
itaysk marked this conversation as resolved.
Show resolved Hide resolved
```
8 changes: 4 additions & 4 deletions docs/docs/scanner/misconfiguration/custom/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,10 +2,10 @@

## Overview
You can write custom checks in [Rego][rego].
Once you finish writing custom checks, you can pass the policy files or the directory where those policies are stored with `--policy` option.
Once you finish writing custom checks, you can pass the policy files or the directory where those checks are stored with --config-check` option.
itaysk marked this conversation as resolved.
Show resolved Hide resolved

``` bash
trivy conf --policy /path/to/policy.rego --policy /path/to/custom_policies --namespaces user /path/to/config_dir
trivy conf --config-check /path/to/policy.rego --config-check /path/to/custom_checks --namespaces user /path/to/config_dir
```

As for `--namespaces` option, the detail is described as below.
Expand Down Expand Up @@ -93,7 +93,7 @@ By default, only `builtin.*` packages will be evaluated.
If you define custom packages, you have to specify the package prefix via `--namespaces` option. By default, Trivy only runs in its own namespace, unless specified by the user. Note that the custom namespace does not have to be `user` as in this example. It could be anything user-defined.

``` bash
trivy conf --policy /path/to/custom_policies --namespaces user /path/to/config_dir
trivy conf --config-check /path/to/custom_checks --namespaces user /path/to/config_dir
```

In this case, `user.*` will be evaluated.
Expand Down Expand Up @@ -135,7 +135,7 @@ correct and do not reference incorrect properties/values.

#### custom.avd_id and custom.id

The AVD_ID can be used to link the check to the Aqua Vulnerability Database (AVD) entry. For example, the `avd_id` `AVD-AWS-0176` is the ID of the check in the [AWS Vulnerability Database](https://avd.aquasec.com/). If you are [contributing your check to trivy-policies](../../../../community/contribute/checks/overview.md), you need to generate an ID using `make id` in the [trivy-checks](https://github.com/aquasecurity/trivy-checks) repository. The output of the command will provide you the next free IDs for the different providers in Trivy.
The AVD_ID can be used to link the check to the Aqua Vulnerability Database (AVD) entry. For example, the `avd_id` `AVD-AWS-0176` is the ID of the check in the [AWS Vulnerability Database](https://avd.aquasec.com/). If you are [contributing your check to trivy-checks](../../../../community/contribute/checks/overview.md), you need to generate an ID using `make id` in the [trivy-checks](https://github.com/aquasecurity/trivy-checks) repository. The output of the command will provide you the next free IDs for the different providers in Trivy.

The ID is based on the AVD_ID. For instance if the `avd_id` is `AVD-AWS-0176`, the ID is `ID0176`.

Expand Down
2 changes: 1 addition & 1 deletion docs/docs/scanner/misconfiguration/custom/schema.md
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
# Input Schema

## Overview
Policies can be defined with custom schemas that allow inputs to be verified against them. Adding a policy schema
Checks can be defined with custom schemas that allow inputs to be verified against them. Adding a policy schema
enables Trivy to show more detailed error messages when an invalid input is encountered.

In Trivy we have been able to define a schema for a [Dockerfile](https://github.com/aquasecurity/trivy/tree/main/pkg/iac/rego/schemas)
Expand Down
Loading