aquasecurity · knqyf263 · Aug 9, 2024 · Jul 14, 2024 · Jul 15, 2024 · Jul 16, 2024
@@ -1,142 +1,128 @@
-# Air-Gapped Environment
+# Advanced Network Scenarios
 
-Trivy can be used in air-gapped environments. Note that an allowlist is [here][allowlist].
+Trivy occasionally needs to connect to the internet to download its databases and checks. This document explains Trivy's network connectivity requirements and how to set up Trivy in particular scenarios.
 
-## Air-Gapped Environment for vulnerabilities
+## Network requirements
 
-### Download the vulnerability database
-At first, you need to download the vulnerability database for use in air-gapped environments.
+Trivy's databases are distributed as OCI images via GitHub Container registry (GHCR):
 
-=== "Trivy"
+- <https://ghcr.io/aquasecurity/trivy-db>
+- <https://ghcr.io/aquasecurity/trivy-java-db>
+- <https://ghcr.io/aquasecurity/trivy-checks>
 
-    ```
-    TRIVY_TEMP_DIR=$(mktemp -d)
-    trivy --cache-dir $TRIVY_TEMP_DIR image --download-db-only
-    tar -cf ./db.tar.gz -C $TRIVY_TEMP_DIR/db metadata.json trivy.db
-    rm -rf $TRIVY_TEMP_DIR
-    ```
+If Trivy is running behind a firewall, you'll need to add the following hosts to your allowlist:
 
-=== "oras >= v0.13.0"
-    Please follow [oras installation instruction][oras].
+- `ghcr.io`
+- `pkg-containers.githubusercontent.com`
 
-    Download `db.tar.gz`:
+Trivy pulls the databases using the [OCI Distribution](https://github.com/opencontainers/distribution-spec) specification, which runs over standard HTTPS.
 
-    ```
-    $ oras pull ghcr.io/aquasecurity/trivy-db:2
-    ```
+## Running Trivy in an air-gapped environment
 
-=== "oras < v0.13.0"
-    Please follow [oras installation instruction][oras].
+An air-gapped environment refers to situations where the network connectivity from the machine Trivy runs on is blocked or restricted.
 
-    Download `db.tar.gz`:
+In an air-gapped environment it is your responsibility to update the Trivy databases on a regular basis. 
 
-    ```
-    $ oras pull -a ghcr.io/aquasecurity/trivy-db:2
-    ```
+## Offline Mode
 
-### Download the Java index database[^1]
-Java users also need to download the Java index database for use in air-gapped environments.
+By default, Trivy attempts to download the latest databases, and the scan might fail if a download fails. To avoid this, you can tell Trivy not to attempt to download the database files:
 
-!!! note
-    You container image may contain JAR files even though you don't use Java directly.
-    In that case, you also need to download the Java index database.
+- `--skip-db-update` to skip updating the main vulnerability database.
+- `--skip-java-db-update` to skip updating the Java index database.
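+- `--skip-check-update` to skip updating the misconfiguration database.
+- `--offline-scan` does not affect the databases; it stops Trivy from issuing API requests when scanning Java applications (for example, `pom.xml` dependencies).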
 
-=== "Trivy"
+```shell
+trivy image --skip-db-update --skip-java-db-update --offline-scan --skip-check-update myimage
+```
 
-    ```
-    TRIVY_TEMP_DIR=$(mktemp -d)
-    trivy --cache-dir $TRIVY_TEMP_DIR image --download-java-db-only
-    tar -cf ./javadb.tar.gz -C $TRIVY_TEMP_DIR/java-db metadata.json trivy-java.db
-    rm -rf $TRIVY_TEMP_DIR
-    ```
-=== "oras >= v0.13.0"
-    Please follow [oras installation instruction][oras].
+## Self-Hosting
 
-    Download `javadb.tar.gz`:
+You can host the databases on your own local OCI registry, so that Trivy does not need to reach outside your network.
 
-    ```
-    $ oras pull ghcr.io/aquasecurity/trivy-java-db:1
-    ```
+First, make a copy of the databases in a container registry that is accessible to Trivy. The databases are in:
 
-=== "oras < v0.13.0"
-    Please follow [oras installation instruction][oras].
+- `ghcr.io/aquasecurity/trivy-db:2`
+- `ghcr.io/aquasecurity/trivy-java-db:1`
+- `ghcr.io/aquasecurity/trivy-checks:0`
 
-    Download `javadb.tar.gz`:
+Then, tell Trivy to use the local registry:
 
-    ```
-    $ oras pull -a ghcr.io/aquasecurity/trivy-java-db:1
-    ```
+```shell
+trivy image \
+    --db-repository myregistry.local/trivy-db \
+    --java-db-repository myregistry.local/trivy-java-db \
+    --checks-bundle-repository myregistry.local/trivy-checks \
+    myimage
+```
 
+### Authentication
 
-### Transfer the DB files into the air-gapped environment
-The way of transfer depends on the environment.
+If the registry requires authentication, you can configure it as described in the [private registry authentication document](../advanced/private-registries/index.md).
 
-=== "Vulnerability db"
-    ```
-    $ rsync -av -e ssh /path/to/db.tar.gz [user]@[host]:dst
-    ```
+## Manual cache population
 
-=== "Java index db[^1]"
-    ```
-    $ rsync -av -e ssh /path/to/javadb.tar.gz [user]@[host]:dst
-    ```
+You can also download the database files manually and place them directly in the Trivy cache directory.
 
-### Put the DB files in Trivy's cache directory
-You have to know where to put the DB files. The following command shows the default cache directory.
+### Downloading the DB files
 
+On a machine with internet access, pull the database container archive from the public registry into your local workspace.
+
+Note that these examples operate in the current working directory.
+
+=== "Using ORAS"
+This example uses [ORAS](https://oras.land), but you can use any other container registry manipulation tool.
+
+```shell
+oras pull ghcr.io/aquasecurity/trivy-db:2
 ```
-$ ssh user@host
-$ trivy -h | grep cache
-   --cache-dir value  cache directory (default: "/home/myuser/.cache/trivy") [$TRIVY_CACHE_DIR]
-```
-=== "Vulnerability db"
-    Put the DB file in the cache directory + `/db`.
-
-    ```
-    $ mkdir -p /home/myuser/.cache/trivy/db
-    $ cd /home/myuser/.cache/trivy/db
-    $ tar xvf /path/to/db.tar.gz -C /home/myuser/.cache/trivy/db
-    x trivy.db
-    x metadata.json
-    $ rm /path/to/db.tar.gz
-    ```
-
-=== "Java index db[^1]"
-    Put the DB file in the cache directory + `/java-db`.
-
-    ```
-    $ mkdir -p /home/myuser/.cache/trivy/java-db
-    $ cd /home/myuser/.cache/trivy/java-db
-    $ tar xvf /path/to/javadb.tar.gz -C /home/myuser/.cache/trivy/java-db
-    x trivy-java.db
-    x metadata.json
-    $ rm /path/to/javadb.tar.gz
-    ```
-
-
-
-In an air-gapped environment it is your responsibility to update the Trivy databases on a regular basis, so that the scanner can detect recently-identified vulnerabilities. 
-
-### Run Trivy with the specific flags.
-In an air-gapped environment, you have to specify `--skip-db-update` and `--skip-java-db-update`[^1] so that Trivy doesn't attempt to download the latest database files.
-In addition, if you want to scan `pom.xml` dependencies, you need to specify `--offline-scan` since Trivy tries to issue API requests for scanning Java applications by default.
 
+You should now have a file called `db.tar.gz`. Next, extract it to reveal the db files:
+
+```shell
+tar -xzf db.tar.gz
 ```
-$ trivy image --skip-db-update --skip-java-db-update --offline-scan alpine:3.12
+
+You should now have two new files, `metadata.json` and `trivy.db`. These are the Trivy DB files.
+
+=== "Using Trivy"
+This example uses Trivy to pull the database container archive. The `--cache-dir` flag makes Trivy download the database files into our current working directory. The `--download-db-only` flag tells Trivy to only download the database files, not to scan any images.
+
+```shell
+trivy image --cache-dir . --download-db-only
 ```
 
-## Air-Gapped Environment for misconfigurations
+You should now have two new files, `metadata.json` and `trivy.db`, under a `db` subdirectory of the cache directory (here, the current working directory). These are the Trivy DB files; copy them over to the air-gapped environment.
+
+### Populating the Trivy Cache
 
-No special measures are required to detect misconfigurations in an air-gapped environment.
+In order to populate the cache, you need to identify the location of the cache directory. If it is under the default location, you can run the following command to find it:
+
+```shell
+trivy -h | grep cache
+```
 
-### Run Trivy with `--skip-check-update` option
-In an air-gapped environment, specify `--skip-check-update` so that Trivy doesn't attempt to download the latest misconfiguration checks.
+For this example, we assume the `TRIVY_CACHE_DIR` variable holds the cache location:
 
+```shell
+TRIVY_CACHE_DIR=/home/user/.cache/trivy
 ```
-$ trivy conf --skip-policy-update /path/to/conf
+
+Put the Trivy DB files in the Trivy cache directory under a `db` subdirectory:
+
+```shell
+# ensure the cache db directory exists (quoted in case the path contains spaces)
+mkdir -p "${TRIVY_CACHE_DIR}/db"
+# copy the db files into it
+cp /path/to/trivy.db /path/to/metadata.json "${TRIVY_CACHE_DIR}/db/"
 ```
 
-[allowlist]: ../references/troubleshooting.md
-[oras]: https://oras.land/docs/installation
+### Java DB
+
+For the Java DB, the process is the same, except for the following:
+
+1. Image location is `ghcr.io/aquasecurity/trivy-java-db:1`
+2. Archive file name is `javadb.tar.gz`
+3. DB file name is `trivy-java.db`
+
+## Misconfiguration scanning
 
-[^1]: This is only required to scan `jar` files. More information about `Java index db` [here](../coverage/language/java.md)
+Note that the misconfiguration database is also embedded in the Trivy binary (at build time) and is used as a fallback if the external database is not available. This means that you can still scan for misconfigurations in an air-gapped environment, using the checks that shipped with the Trivy release you are running.
@@ -203,10 +203,7 @@ Trivy v0.23.0 or later requires Trivy DB v2. Please update your local database o
 !!! error
     FATAL failed to download vulnerability DB
 
-If trivy is running behind corporate firewall, you have to add the following urls to your allowlist.
-
-- ghcr.io
-- pkg-containers.githubusercontent.com
+If Trivy is running behind a corporate firewall, refer to the connectivity requirements described [here][network].
 
 ### Denied
 
@@ -271,4 +268,5 @@ $ trivy clean --all
 ```
 
 [air-gapped]: ../advanced/air-gap.md
+[network]: ../advanced/air-gap.md#network-requirements
 [redis-cache]: ../../vulnerability/examples/cache/#cache-backend
@@ -1,21 +1,22 @@
 # Built-in Checks 
 
-## Check Sources
-Built-in checks are mainly written in [Rego][rego] and Go.
-Those checks are managed under [trivy-checks repository][trivy-checks].
+## Checks Sources
+Trivy has an extensive library of misconfiguration checks that is maintained at <https://github.com/aquasecurity/trivy-checks>.  
+Trivy checks are mainly written in [Rego][rego], while some checks are written in Go.  
 See [here](../../../coverage/iac/index.md) for the list of supported config types.
 
-For suggestions or issues regarding policy content, please open an issue under the [trivy-checks][trivy-checks] repository.
+## Checks Bundle
+When performing a misconfiguration scan, Trivy automatically downloads the relevant Checks bundle. The bundle is cached locally, and Trivy reuses it for subsequent scans on the same machine. Trivy updates the cache automatically, so you normally don't need to think about it.
 
-## Check Distribution
-Trivy checks are distributed as an OPA bundle on [GitHub Container Registry][ghcr] (GHCR).
-When misconfiguration detection is enabled, Trivy pulls the OPA bundle from GHCR as an OCI artifact and stores it in the cache.
-Those checks are then loaded into Trivy OPA engine and used for detecting misconfigurations.
-If Trivy is unable to pull down newer checks, it will use the embedded set of checks as a fallback. This is also the case in air-gap environments where `--skip-policy-update` might be passed.
+For CLI flags related to the database, please refer to [this page](../configuration/db.md).
 
-## Update Interval
+## Checks Distribution
+Trivy checks are distributed as an [OPA bundle][opa-bundle] hosted on GitHub Container Registry: <https://ghcr.io/aquasecurity/trivy-checks>.  
 Trivy checks for updates to OPA bundle on GHCR every 24 hours and pulls it if there are any updates.
 
+### External connectivity
+Trivy needs to connect to the internet to download the bundle. If you are running Trivy in an air-gapped environment, or a tightly controlled network, please refer to the [Advanced Network Scenarios document](../advanced/air-gap.md).  
+The Checks bundle is also embedded in the Trivy binary (at build time) and is used as a fallback if Trivy is unable to download the bundle. This means that you can still scan for misconfigurations in an air-gapped environment, using the checks that shipped with the Trivy release you are running.
+
 [rego]: https://www.openpolicyagent.org/docs/latest/policy-language/
-[trivy-checks]: https://github.com/aquasecurity/trivy-checks
-[ghcr]: https://github.com/aquasecurity/trivy-checks/pkgs/container/trivy-checks
+[opa-bundle]: https://www.openpolicyagent.org/docs/latest/management-bundles/
@@ -158,45 +158,22 @@ Trivy can detect vulnerabilities in Kubernetes clusters and components by scanni
 
 [^1]: Some manual triage and correction has been made.
 
-## Database
-Trivy downloads [the vulnerability database](https://github.com/aquasecurity/trivy-db) every 6 hours.
-Trivy uses two types of databases for vulnerability detection:
-
-- Vulnerability Database
-- Java Index Database
-
-This page provides detailed information about these databases.
-
-### Vulnerability Database
-Trivy utilizes a database containing vulnerability information.
-This database is built every six hours on [GitHub](https://github.com/aquasecurity/trivy-db) and is distributed via [GitHub Container registry (GHCR)](https://ghcr.io/aquasecurity/trivy-db).
-The database is cached and updated as needed.
-As Trivy updates the database automatically during execution, users don't need to be concerned about it.
+## Databases
+Trivy utilizes several databases containing information relevant for vulnerability scanning.  
+When performing a vulnerability scan, Trivy automatically downloads the relevant databases. The databases are cached locally, and Trivy reuses them for subsequent scans on the same machine. Trivy updates the database cache automatically, so you normally don't need to think about it.
 
 For CLI flags related to the database, please refer to [this page](../configuration/db.md).
 
-#### Private Hosting
-If you host the database on your own OCI registry, you can specify a different repository with the `--db-repository` flag.
-The default is `ghcr.io/aquasecurity/trivy-db`.
-
-```shell
-$ trivy image --db-repository YOUR_REPO YOUR_IMAGE
-```
-
-If authentication is required, it can be configured in the same way as for private images.
-Please refer to [the documentation](../advanced/private-registries/index.md) for more details.
+### Vulnerability Database
+This is Trivy's main database, which contains vulnerability information collected from the data sources mentioned above.  
+It is built every six hours on [GitHub](https://github.com/aquasecurity/trivy-db).  
 
 ### Java Index Database
-This database is only downloaded when scanning JAR files so that Trivy can identify the groupId, artifactId, and version of JAR files.
-It is built once a day on [GitHub](https://github.com/aquasecurity/trivy-java-db) and distributed via [GitHub Container registry (GHCR)](https://ghcr.io/aquasecurity/trivy-java-db).
-Like the vulnerability database, it is automatically downloaded and updated when needed, so users don't need to worry about it.
-
-#### Private Hosting
-If you host the database on your own OCI registry, you can specify a different repository with the `--java-db-repository` flag.
-The default is `ghcr.io/aquasecurity/trivy-java-db`.
+When scanning JAR files, Trivy relies on a dedicated database to identify the groupId, artifactId, and version of the scanned JAR files. This database is only used when scanning JAR files; however, your scanned artifacts might contain JAR files that you're not aware of.  
+This database is built once a day on [GitHub](https://github.com/aquasecurity/trivy-java-db).  
 
-If authentication is required, you need to run `docker login YOUR_REGISTRY`.
-Currently, specifying a username and password is not supported.
+### External connectivity
+Trivy needs to connect to the internet to download the databases. If you are running Trivy in an air-gapped environment, or a tightly controlled network, please refer to the [Advanced Network Scenarios document](../advanced/air-gap.md).
 
 ## Configuration
 This section describes vulnerability-specific configuration.

@@ -137,7 +137,7 @@ nav:
           - Developer guide: docs/plugin/developer-guide.md
       - Advanced:
           - Modules: docs/advanced/modules.md
-          - Air-Gapped Environment: docs/advanced/air-gap.md
+          - Advanced Network Scenarios: docs/advanced/air-gap.md
           - Container Image:
               - Embed in Dockerfile: docs/advanced/container/embed-in-dockerfile.md
               - Unpacked container image filesystem: docs/advanced/container/unpacked-filesystem.md