Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

incremental compilation fixes #836

Merged
merged 4 commits into from
Feb 9, 2024
Merged

incremental compilation fixes #836

merged 4 commits into from
Feb 9, 2024

Conversation

RBusarow
Copy link
Collaborator

@RBusarow RBusarow commented Jan 3, 2024

We start off with new feature toggle: trackSourceFiles.

It's present in the Gradle DSL and as a Gradle property.

// build.gradle.kts
anvil {
  trackSourceFiles = true // default is false
}

The majority of changes are behind that toggle, but not all.

Changes that ignore trackSourceFiles

GeneratedFileWithSources replaces GeneratedFile

In order to support incremental compilation, we need to track what source files resulted in what generated files, then aggregate that data in CodeGenerationExtension. The existing GeneratedFile type is perfectly situated for this, but I wanted to make it explicit whether the code being generated is going to be incremental or not, so I extracted an interface from it and introduced a new type.

If trackSourceFiles is enabled but the consuming project is using a custom generator that returns the deprecated GeneratedFile, the build will fail.

It is possible to return a GeneratedFileWithSources with an empty list of source files. Anvil will then treat the generated file as if it was generated from all source files in the source set, and it will be invalidated by any change.

trackSourceFiles == true trackSourceFiles == false
GeneratedFile ❌ not supported ✅ supported
GeneratedFileWithSources ✅ supported ✅ supported
GeneratedFileWithSources
with an empty sources list
✅ supported ✅ supported

build/anvil/src-gen-<variant> has moved

We now write a cache file as well, and need to put it somewhere. The new structure is pretty comparable to what KSP and KGP do.

Prior to these changes, the generated output's folder structure resembled this:

<project-directory>
  ├ build
  │ └ anvil
  │   ├ src-gen-debug
  │   │ └ com/example/DebugClass_Factory.kt
  │   └ src-gen-main
  │     └ com/example/InjectClass_Factory.kt
  ├ src
  │ ├ debug
  │ │ └ kotlin
  │ │   └ com/example/DebugClass.kt
  │ └ main
  │   └ kotlin
  │     └ com/example/InjectClass.kt
  └ build.gradle.kts

Now, the src-gen-main directory is gone and we have this:

<project-directory>
  ├ build
  │ └ anvil
  │   ├ debug
  │   │ └ caches
  │   │ │ └ generated-files-cache.bin
  │   │ └ generated
  │   │   └ com/example/DebugClass_Factory.kt
  │   └ main
  │     └ caches
  │     │ └ generated-files-cache.bin
  │     └ generated
  │       └ com/example/InjectClass_Factory.kt
  ├ src
  │ ├ debug
  │ │ └ kotlin
  │ │   └ com/example/DebugClass.kt
  │ └ main
  │   └ kotlin
  │     └ com/example/InjectClass.kt
  └ build.gradle.kts

New Incremental and Caching Behavior

Tracking Anvil output in Gradle

AnvilPlugin will add the generated source code and cache binary file as TaskOutputs of their corresponding compileKotlin task:

kotlinCompilation.compileTaskProvider.configure { task ->
  if (variant.variantFilter.trackSourceFiles) {
    task.outputs.dir(srcGenDir)
    task.outputs.dir(anvilCacheDir)
  }
}

This means that Gradle will handle cache/restore logic on its own when it's a fresh build, when it's from a (Gradle) build cache, or when the task is already up to date.

This is not enough to handle incremental builds, like when a single source file is changed or if a source file is deleted.

File hashing

The files that the Kotlin compiler gives us in CodeGenerationExtension.onAnalysisCompleted aren't only the changed files. There is no built-in way for us to compare those inputs to the inputs of the last invocation of onAnalysisCompleted.

We now create an MD5 hash of every file that goes through CodeGenerationExtension, whether it's generated or human-written. These hashes are stored in build/anvil/<variant name>/caches/generated-file-cache.bin, which is written to disk after every invocation of onAnalysisCompleted.

Tracking source <--> generated relationships

Thanks to GeneratedFileWithSources, we know exactly which source files resulted in which generated files. These relationships are also stored in build/anvil/<variant name>/caches/generated-file-cache.bin.

flowchart TD

  gen1.kt --> gen4.kt

  gen2.kt --> gen6.kt
  gen1.kt --> gen5.kt
  gen2.kt --> gen5.kt
  gen3.kt --> gen6.kt

  source1.kt --> gen1.kt
  source2.kt --> gen2.kt
  source2.kt --> gen3.kt
  source3.kt --> gen3.kt 
Loading

Deleting out-of-date outputs

A file is considered to be changed if it is in the input files and its current MD5 hash is different from its cached hash, or if the cache doesn't contain a hash for that file.

A file is considered to be deleted if it's present in the cache but not on disk, AND it is not tracked as a generated file in the cache.

For every changed or deleted file, we delete all downstream generated files recursively. For example, in the above diagram, if Source_2.kt is changed, we delete gen2.kt, gen3.kt, gen5.kt, and gen6.kt.

flowchart TD
  
  classDef changed fill:#A00;
  
  source1[source1.kt]
  source2[source2.kt]:::changed
  source3[source3.kt]
  gen1[gen1.kt]
  gen2[gen2.kt]:::changed
  gen3[gen3.kt]:::changed
  gen4[gen4.kt]
  gen5[gen5.kt]:::changed
  gen6[gen6.kt]:::changed

  gen1 --> gen4

  gen2 --> gen6
  gen1 --> gen5
  gen2 --> gen5
  gen3 --> gen6

  source1 --> gen1
  source2 --> gen2
  source2 --> gen3
  source3 --> gen3 
Loading

Restoring missing outputs

Because we include the generated files as TaskOutputs of the compileKotlin task, Gradle will delete those files any time it re-runs the task. This would break incremental builds, where we may only receive a subset of the source files, and therefore only re-generate a subset of the output.

Before we generate any new code, we restore any missing generated files from the generated-file-cache.bin cache.

TODO

  • finalize the default value of trackSourceFiles

@RBusarow RBusarow force-pushed the rick/incremental-fixes branch 3 times, most recently from 215c6ef to e91d89b Compare January 3, 2024 22:01
@RBusarow RBusarow force-pushed the rick/incremental-fixes branch 4 times, most recently from 0e31737 to a8662a7 Compare January 23, 2024 19:05
@RBusarow RBusarow force-pushed the rick/incremental-fixes branch 4 times, most recently from dce7f22 to f0a7e6d Compare January 30, 2024 23:17
@RBusarow RBusarow marked this pull request as ready for review January 30, 2024 23:25
@RBusarow RBusarow changed the title [WIP] incremental compilation fixes incremental compilation fixes Jan 30, 2024
@RBusarow RBusarow force-pushed the rick/incremental-fixes branch from f0a7e6d to 243b078 Compare January 30, 2024 23:53
Copy link
Collaborator

@handstandsam handstandsam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed on a call with @RBusarow

RBusarow added a commit that referenced this pull request Feb 6, 2024
RBusarow added a commit that referenced this pull request Feb 6, 2024
@RBusarow RBusarow force-pushed the rick/incremental-fixes branch from 243b078 to 756ab5a Compare February 6, 2024 00:24
Copy link
Member

@JoelWilcox JoelWilcox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Partial review, still working my way through most of the tests

* - Generated code is cached in a way that Gradle understands,
* and will be restored from cache along with other build artifacts.
*
* By default this feature is enabled.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking maybe we should default this to disabled first and have it be opt-in while we get feedback on it, verify no major issues pop up, and also do benchmarking for performance impact. Then we can switch it to enabled by default in a follow-up release

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's my thought as well, except that while testing against a snapshot or otherwise-special build, it's more convenient to just have it enabled. So I was thinking leave it enabled by default up until release, then disable it by default, then enable it in a follow-up.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's my thought as well, except that while testing against a snapshot or otherwise-special build, it's more convenient to just have it enabled. So I was thinking leave it enabled by default up until release, then disable it by default, then enable it in a follow-up.

It would be easy to get released in the wrong state that way though, and just creates additional work along the way. Shouldn't enabling for the snapshot be really easy now by just setting it in your gradle.properties file?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding on to the previous comment, I think changing to false now is also preferred so that we're testing a snapshot as close to release as possible which lets us verify the expected default behavior etc

@RBusarow RBusarow force-pushed the rick/incremental-fixes branch from 89df450 to 22e4f97 Compare February 7, 2024 22:17
RBusarow added a commit that referenced this pull request Feb 7, 2024
RBusarow added a commit that referenced this pull request Feb 8, 2024
@RBusarow RBusarow force-pushed the rick/incremental-fixes branch from e0c01fb to 29da0d0 Compare February 8, 2024 16:23
is RelativeFile -> {
if (sourceFile == RelativeFile.ANY) return true
val currentMd5 = sourceFile.md5()
val previousMd5 = filesToMd5.put(sourceFile, currentMd5)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be using get() right? Right now it has a side effect of modifying the tracked MD5

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is intentional, but it's tricky.

This function must always calculate the file's current md5 in order to do the comparison. As such, we might as well cache it. The filesToMd5.put(...) returns the previously-stored value or null, so we still get to do the same comparison -- and now we get the cache update for free.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 that makes sense for optimizing. I don't remember if there's already a test for this, but if not it's probably worth adding one to make it explicit that this behavior is intentional. I would have assumed it was a bug and changed it to get if I was reading this sometime in the future since the method name hasChanged suggests it's only checking things

* - Generated code is cached in a way that Gradle understands,
* and will be restored from cache along with other build artifacts.
*
* By default this feature is enabled.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding on to the previous comment, I think changing to false now is also preferred so that we're testing a snapshot as close to release as possible which lets us verify the expected default behavior etc

@RBusarow RBusarow force-pushed the rick/incremental-fixes branch 2 times, most recently from 2ba5838 to af5cf9c Compare February 9, 2024 23:39
@RBusarow RBusarow force-pushed the rick/incremental-fixes branch from af5cf9c to d7ca03d Compare February 9, 2024 23:40
@RBusarow RBusarow enabled auto-merge February 9, 2024 23:40
@RBusarow RBusarow merged commit 53672c0 into main Feb 9, 2024
17 checks passed
@RBusarow RBusarow deleted the rick/incremental-fixes branch February 9, 2024 23:55
@Laimiux
Copy link

Laimiux commented Feb 23, 2024

I'm not 100% sure if it's related to this pull request, but it's very likely. I've used the latest beta with trackSourceFiles and started to see a new error pretty frequently (same with other members of our team).

java.io.FileNotFoundException: /nvme/buildkite/builds/65536mb-10240cpu-android_8f88a77ca73a_b172b2e0/instacart/instacart-android/instacart-collections/impl/../../../../../../../../Users/laimonasturauskas/instacart-android/instacart-collections/impl/src/main/java/com/instacart/client/collections/aisle/ICAisleAndDisplayAdRowsIntegration.kt (No such file or directory)
        at java.base/java.io.FileInputStream.open0(Native Method)
        at java.base/java.io.FileInputStream.open(FileInputStream.java:216)
        at java.base/java.io.FileInputStream.<init>(FileInputStream.java:157)
        at kotlin.io.FilesKt__FileReadWriteKt.readBytes(FileReadWrite.kt:63)
        at com.squareup.anvil.compiler.codegen.incremental.GeneratedFileCache.md5(GeneratedFileCache.kt:167)
        at com.squareup.anvil.compiler.codegen.incremental.GeneratedFileCache.md5(GeneratedFileCache.kt:170)
        at com.squareup.anvil.compiler.codegen.incremental.GeneratedFileCache.hasChanged(GeneratedFileCache.kt:153)
        at com.squareup.anvil.compiler.codegen.incremental.GeneratedFileCache.hasChanged(GeneratedFileCache.kt:150)
        at com.squareup.anvil.compiler.codegen.incremental.FileCacheOperations.restoreFromCache(FileCacheOperations.kt:39)
        at com.squareup.anvil.compiler.codegen.CodeGenerationExtension.analysisCompleted(CodeGenerationExtension.kt:97)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Compiler doesn't pick incremental changes to multibindings in some circumstances
4 participants