[7.4.0] Implement disk cache garbage collection. #23833

Merged: 17 commits, Oct 2, 2024
3855b82
[7.4.0] Implement core garbage collection logic for the disk cache.
tjgq Sep 23, 2024
57a4b61
[7.4.0] Deflake DiskCacheGarbageCollectorTest.
tjgq Sep 24, 2024
2af4a4b
[7.4.0] Implement a helper class to manage a shared or exclusive lock…
tjgq Sep 25, 2024
011199e
[7.4.0] Wire DiskCacheGarbageCollector into RemoteModule as an IdleTask.
tjgq Sep 26, 2024
b6e89ba
[7.4.0] Collect and log statistics for every disk cache garbage colle…
tjgq Sep 26, 2024
72c7020
[7.4.0] Tie-break by path, so that AC entries are garbage collected b…
tjgq Sep 26, 2024
165c480
[7.4.0] Acquire an exclusive lock on the disk cache while running a g…
tjgq Sep 26, 2024
a337a5a
[7.4.0] Add a standalone disk cache garbage collection utility.
tjgq Sep 26, 2024
9baaea3
[7.4.0] Use a dedicated thread pool for disk cache garbage collection.
tjgq Sep 27, 2024
2b002ea
[7.4.0] Add an end-to-end test for disk cache garbage collection.
tjgq Sep 27, 2024
50185c3
[7.4.0] Amend the description of DiskCacheClient to reflect the addit…
tjgq Sep 27, 2024
b296cac
[7.4.0] Correctly handle non-ASCII paths in DiskCacheLock.
tjgq Sep 30, 2024
883a903
[7.4.0] Document disk cache garbage collection.
tjgq Sep 30, 2024
09ae592
[7.4.0] Fix export of //src/tools/diskcache and include it in pre/pos…
tjgq Sep 30, 2024
cc773a3
[7.4.0] Use a LongAdder instead of an AtomicLong.
tjgq Oct 1, 2024
86a2408
[7.4.0] Correctly handle path encoding in DiskCacheLock.
tjgq Oct 1, 2024
53839d3
[7.4.0] Check the mtime again immediately before garbage collecting a…
tjgq Oct 1, 2024
1 change: 1 addition & 0 deletions .bazelci/postsubmit.yml

@@ -18,6 +18,7 @@ tasks:
 - "//src:bazel_jdk_minimal"
 - "//src:test_repos"
 - "//src/main/java/..."
+- "//src/tools/diskcache/..."
 - "//src/tools/execlog/..."
 test_flags:
 - "--config=ci-linux"
1 change: 1 addition & 0 deletions .bazelci/presubmit.yml

@@ -19,6 +19,7 @@ tasks:
 - "//src:bazel_jdk_minimal"
 - "//src:test_repos"
 - "//src/main/java/..."
+- "//src/tools/diskcache/..."
 - "//src/tools/execlog/..."
 test_flags:
 - "--config=ci-linux"
17 changes: 14 additions & 3 deletions site/en/remote/caching.md
@@ -292,9 +292,8 @@ This feature is unsupported on Windows.

 Bazel can use a directory on the file system as a remote cache. This is
 useful for sharing build artifacts when switching branches and/or working
-on multiple workspaces of the same project, such as multiple checkouts. Since
-Bazel does not garbage-collect the directory, you might want to automate a
-periodic cleanup of this directory. Enable the disk cache as follows:
+on multiple workspaces of the same project, such as multiple checkouts.
+Enable the disk cache as follows:

 ```posix-terminal
 build --disk_cache={{ '<var>' }}path/to/build/cache{{ '</var>' }}
 ```
@@ -305,6 +304,18 @@ You can pass a user-specific path to the `--disk_cache` flag using the `~` alias
 when enabling the disk cache for all developers of a project via the project's
 checked in `.bazelrc` file.

+### Garbage collection {:#disk-cache-gc}
+
+Starting with Bazel 7.4, you can use `--experimental_disk_cache_gc_max_size` and
+`--experimental_disk_cache_gc_max_age` to set a maximum size for the disk cache
+or for the age of individual cache entries. Bazel will automatically garbage
+collect the disk cache while idling between builds; the idle timer can be set
+with `--experimental_disk_cache_gc_idle_delay` (defaulting to 5 minutes).
+
+As an alternative to automatic garbage collection, we also provide a [tool](
+https://github.com/bazelbuild/bazel/tree/master/src/tools/diskcache) to run a
+garbage collection on demand.
+
 ## Known issues {:#known-issues}

 **Input file modification during a build**
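The flags documented in the `caching.md` change above bound the cache by total size and by entry age, and the `DiskCacheClient` Javadoc in this PR notes that an entry's mtime records when it was last stored or retrieved. The following is a minimal Java sketch of how such a policy could pick entries to delete. It is not the PR's `DiskCacheGarbageCollector`: the class `DiskCacheGcPolicySketch`, the `Entry` type, and the convention that a negative limit means "no limit" are all assumptions made for illustration. Ties on mtime are broken by path, following the title of commit 72c7020.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Illustrative sketch only; not Bazel's DiskCacheGarbageCollector. */
final class DiskCacheGcPolicySketch {

  /** Hypothetical view of a cache entry: relative path, size in bytes, mtime in milliseconds. */
  static final class Entry {
    private final String path;
    private final long sizeBytes;
    private final long mtimeMillis;

    Entry(String path, long sizeBytes, long mtimeMillis) {
      this.path = path;
      this.sizeBytes = sizeBytes;
      this.mtimeMillis = mtimeMillis;
    }

    String path() {
      return path;
    }

    long sizeBytes() {
      return sizeBytes;
    }

    long mtimeMillis() {
      return mtimeMillis;
    }
  }

  /**
   * Returns the entries to delete, least recently used first.
   *
   * @param maxSizeBytes total size to keep, or a negative value for no size limit (assumption)
   * @param maxAgeMillis maximum entry age, or a negative value for no age limit (assumption)
   */
  static List<Entry> selectForDeletion(
      List<Entry> entries, long maxSizeBytes, long maxAgeMillis, long nowMillis) {
    // Oldest mtime first; ties are broken by path so the order is deterministic.
    List<Entry> sorted = new ArrayList<>(entries);
    sorted.sort(Comparator.comparingLong(Entry::mtimeMillis).thenComparing(Entry::path));

    long totalSize = 0;
    for (Entry e : sorted) {
      totalSize += e.sizeBytes();
    }

    List<Entry> toDelete = new ArrayList<>();
    for (Entry e : sorted) {
      boolean tooOld = maxAgeMillis >= 0 && nowMillis - e.mtimeMillis() > maxAgeMillis;
      boolean overSize = maxSizeBytes >= 0 && totalSize > maxSizeBytes;
      if (!tooOld && !overSize) {
        // Every remaining entry is at least as recent, and the cache fits the size limit.
        break;
      }
      toDelete.add(e);
      totalSize -= e.sizeBytes();
    }
    return toDelete;
  }
}
```

For example, with entries of 10, 20, and 30 bytes last used at 1000, 5000, and 9000 ms, a 40-byte size limit and a 6000 ms age limit evaluated at 10000 ms select the two oldest entries: the first exceeds the age limit, and the second must go to bring the total under the size limit.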
1 change: 1 addition & 0 deletions src/BUILD
@@ -379,6 +379,7 @@ filegroup(
 "//src/tools/android:srcs",
 "//src/tools/android/java/com/google/devtools/build/android:srcs",
 "//src/tools/bzlmod:srcs",
+"//src/tools/diskcache:srcs",
 "//src/tools/execlog:srcs",
 "//src/tools/launcher:srcs",
 "//src/tools/remote:srcs",
@@ -64,6 +64,8 @@
 import com.google.devtools.build.lib.remote.circuitbreaker.CircuitBreakerFactory;
 import com.google.devtools.build.lib.remote.common.RemoteCacheClient;
 import com.google.devtools.build.lib.remote.common.RemoteExecutionClient;
+import com.google.devtools.build.lib.remote.disk.DiskCacheClient;
+import com.google.devtools.build.lib.remote.disk.DiskCacheGarbageCollectorIdleTask;
 import com.google.devtools.build.lib.remote.downloader.GrpcRemoteDownloader;
 import com.google.devtools.build.lib.remote.http.DownloadTimeoutException;
 import com.google.devtools.build.lib.remote.http.HttpException;
@@ -335,6 +337,14 @@ public void beforeCommand(CommandEnvironment env) throws AbruptExitException {
       }
     }

+    if (enableDiskCache) {
+      var gcIdleTask =
+          DiskCacheGarbageCollectorIdleTask.create(remoteOptions, env.getWorkingDirectory());
+      if (gcIdleTask != null) {
+        env.addIdleTask(gcIdleTask);
+      }
+    }
+
     if (!enableDiskCache && !enableHttpCache && !enableGrpcCache && !enableRemoteExecution) {
       // Quit if no remote caching or execution was enabled.
       actionContextProvider =
@@ -954,9 +964,9 @@ public void afterCommand() {
   }

   private static void afterCommandTask(
-      RemoteActionContextProvider actionContextProvider,
-      TempPathGenerator tempPathGenerator,
-      AsynchronousMessageOutputStream<LogEntry> rpcLogFile)
+      @Nullable RemoteActionContextProvider actionContextProvider,
+      @Nullable TempPathGenerator tempPathGenerator,
+      @Nullable AsynchronousMessageOutputStream<LogEntry> rpcLogFile)
       throws AbruptExitException {
     if (actionContextProvider != null) {
       actionContextProvider.afterCommand();
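The `beforeCommand` change above registers the collector through `env.addIdleTask`, and the documentation change describes an idle delay before collection starts. How Bazel's server schedules idle tasks is not part of this diff, so the following is only a rough sketch of the general pattern: start a timer when the process becomes idle, and cancel it if new work arrives first. The class `IdleGcSchedulerSketch` and its `onIdle`/`onBusy` methods are hypothetical names, built on the standard `ScheduledExecutorService`.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Illustrative sketch only; Bazel's idle task machinery is not shown in this diff. */
final class IdleGcSchedulerSketch implements AutoCloseable {
  private final ScheduledExecutorService executor =
      Executors.newSingleThreadScheduledExecutor();
  private final Runnable gcTask;
  private final long idleDelayMillis;
  private ScheduledFuture<?> pending; // guarded by "this"

  IdleGcSchedulerSketch(Runnable gcTask, long idleDelayMillis) {
    this.gcTask = gcTask;
    this.idleDelayMillis = idleDelayMillis;
  }

  /** Call when a command finishes: (re)start the idle timer. */
  synchronized void onIdle() {
    cancelPending();
    pending = executor.schedule(gcTask, idleDelayMillis, TimeUnit.MILLISECONDS);
  }

  /** Call when a new command starts: cancel a collection that has not started yet. */
  synchronized void onBusy() {
    cancelPending();
  }

  private void cancelPending() {
    if (pending != null) {
      // Do not interrupt a collection that is already running; only drop a queued one.
      pending.cancel(false);
      pending = null;
    }
  }

  @Override
  public void close() {
    executor.shutdownNow();
  }
}
```

Running the task on its own executor keeps a long collection off the thread that handles commands, which is in the spirit of commit 9baaea3 ("Use a dedicated thread pool for disk cache garbage collection"), although the sketch makes no claim about how that commit does it.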
@@ -15,12 +15,17 @@ java_library(
name = "disk",
srcs = glob(["*.java"]),
deps = [
"//src/main/java/com/google/devtools/build/lib/concurrent",
"//src/main/java/com/google/devtools/build/lib/exec:spawn_runner",
"//src/main/java/com/google/devtools/build/lib/remote:store",
"//src/main/java/com/google/devtools/build/lib/remote/common",
"//src/main/java/com/google/devtools/build/lib/remote/common:cache_not_found_exception",
"//src/main/java/com/google/devtools/build/lib/remote/options",
"//src/main/java/com/google/devtools/build/lib/remote/util",
"//src/main/java/com/google/devtools/build/lib/server:idle_task",
"//src/main/java/com/google/devtools/build/lib/util:string",
"//src/main/java/com/google/devtools/build/lib/vfs",
"//third_party:flogger",
"//third_party:guava",
"//third_party:jsr305",
"//third_party/protobuf:protobuf_java",
@@ -60,8 +60,13 @@
  * when they collide.
  *
  * <p>The mtime of an entry reflects the most recent time the entry was stored *or* retrieved. This
- * property may be used to trim the disk cache to the most recently used entries. However, it's not
- * safe to trim the cache at the same time a Bazel process is accessing it.
+ * property may be used to garbage collect the disk cache by deleting the least recently accessed
+ * entries. This may be done by Bazel itself (see {@link DiskCacheGarbageCollectorIdleTask}), by
+ * another Bazel process sharing the disk cache, or by an external process. Although we could have
+ * arranged for an ongoing garbage collection to block a concurrent build, we judge it to not be
+ * worth the extra complexity; assuming that the collection policy is not overly aggressive, the
+ * likelihood of a race condition is fairly small, and an affected build is able to automatically
+ * recover by retrying.
  */
 public class DiskCacheClient implements RemoteCacheClient {
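The Javadoc above relies on two properties: reads and writes refresh an entry's mtime, and a collector races with concurrent builds. Commit 53839d3 ("Check the mtime again immediately before garbage collecting a…") narrows that race. A minimal sketch of both ideas with `java.nio.file` follows. The class `MtimeLruSketch` and its method names are hypothetical, and `DiskCacheClient` works through Bazel's own `vfs` types rather than `java.nio`, so this shows the idea rather than the PR's code.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;

/** Illustrative sketch only; not the code in DiskCacheClient or the garbage collector. */
final class MtimeLruSketch {

  /** Marks an entry as recently used by setting its mtime, as a store or a cache hit would. */
  static void touch(Path entry, long nowMillis) throws IOException {
    Files.setLastModifiedTime(entry, FileTime.fromMillis(nowMillis));
  }

  /**
   * Deletes the entry only if its mtime still equals the value observed during the scan.
   *
   * @return true if the entry was deleted, false if it was used in the meantime or is gone
   */
  static boolean deleteIfUnchanged(Path entry, long observedMtimeMillis) throws IOException {
    long current;
    try {
      current = Files.getLastModifiedTime(entry).toMillis();
    } catch (NoSuchFileException e) {
      return false; // Another collector already removed it.
    }
    if (current != observedMtimeMillis) {
      return false; // The entry was used after the scan, so keep it.
    }
    // A small window remains between the check above and this delete.
    return Files.deleteIfExists(entry);
  }
}
```

The second check shrinks the race window but does not close it, which matches the Javadoc's position that the residual risk is small and that an affected build recovers by retrying.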