Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Overhaul transitive module handling in dependency resolution #4145

Merged

Conversation

alexarchambault
Copy link
Collaborator

@alexarchambault alexarchambault commented Dec 16, 2024

This PR changes the way transitive dependencies, but also transitive BOM dependencies and dependency management, are passed to coursier.

Before this commit, Mill was walking its own module graph, and dependencies / BOM dependencies / dep management were manually aggregated beforehand. During dependency resolution, coursier also does such aggregations for external dependencies.

Having both Mill and coursier handle that concern could lead to discrepancies or duplicate work (in #4068 in particular). So this commit passes non-processed dependencies / BOM dependencies / dep management as-is to coursier, and lets coursier handle / compose those.

With this PR, dependencies / BOM dependencies / dep management are passed around solely via JavaModule#coursierProject / JavaModule#transitiveCoursierProjects. This PR deprecates JavaModule#{transitiveIvyDeps,transitiveCompileIvyDeps,transitiveRunIvyDeps,processDependency,processedIvyDeps} (some only recently introduced in #3924).

This changes the way transitive dependencies, but also transitive BOM
dependencies and dependency management, are passed to coursier.

Before this commit, Mill was walking its own module graph, and
dependencies / BOM dependencies / dep management were manually aggregated
beforehand. During dependency resolution, coursier also does such
aggregations.

Having both Mill and coursier handle that concern could lead to discrepancies.
So this commit passes non-processed dependencies / BOM dependencies /
dep management as-is to coursier, and lets coursier handle / compose
those.
@alexarchambault

This comment was marked as outdated.

@alexarchambault

This comment was marked as outdated.

@alexarchambault

This comment was marked as resolved.

These ought to be replaced by having those plugins use compile-time
dependencies for most of their own dependencies
By marking all worker dependencies as provided (aka compile-time only)
@alexarchambault

This comment was marked as resolved.

@alexarchambault alexarchambault marked this pull request as ready for review January 9, 2025 10:42
alexarchambault added a commit that referenced this pull request Jan 13, 2025
This PR makes Mill stop relying on the `resolveFilter` parameter of
`CoursierSupport#resolveDependencies`.

This parameter manually filters the artifacts provided by coursier, but
not the ones originating from "local dependencies" (Mill's hack to use
freshly built workers and plugins during tests), where it seems such a
filtering was done in the build (by customizing `testDepPaths`). "Local
dependencies" cannot be kept as-is in
#4145 (where such dependencies
aren't necessarily root dependencies anymore, making it harder to handle
them in a different fashion). Hence the need to stop using
`resolveFilter`.

Rather than filtering artifacts after dependency resolution, this PR
marks unwanted dependencies of workers as compile-time only. That way,
they're not pulled by dependency resolution, and we don't need to
manually filter artifacts.
@lihaoyi
Copy link
Member

lihaoyi commented Jan 14, 2025

I think this looks good to me. There are some classpath re-orderings, but we don't generally promise stable classpath orderings anyway.

The transitiveIvyDeps change is potentially breaking if people override transitiveIvyDeps and expect it to take effect, but I think that's probably a rare enough case that we can overlook it

@@ -84,6 +84,8 @@ trait CoursierModule extends mill.Module {
*/
def mapDependencies: Task[Dependency => Dependency] = Task.Anon { (d: Dependency) => d }

def internalRepositories: Task[Seq[Repository]] = Task.Anon(Nil)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docs is missing. What is this? How it differentiates to repositoriesTask?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding some. The internal in-memory repo with the coursier projects of Mill's modules is meant to go there.

Copy link
Member

@lefou lefou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain the role of the internal repositories? It looks like some essential implementation detail, but is exposed as task. Maybe, it's visibility can be reduced or it can be moved into some worker.

You introduced a new def coursierDependency which seems to represent the module itself. Is this in the spirit of PublishModule.publishSelfDependency? It seems to ignore any user settings regarding the artifact name and organization. Is it only used internally? What impact has it, when it's different from what publishSelfDependency returns?

Comment on lines +426 to +428
* The `coursier.Dependency` to use to refer to this module
*/
def processDependency(
overrideVersions: Boolean = false
): Task[coursier.core.Dependency => coursier.core.Dependency] = Task.Anon {
val bomDeps0 = allBomDeps().toSeq.map(_.withConfig(Configuration.compile))
val depMgmt = processedDependencyManagement(
depManagement().toSeq.map(bindDependency()).map(_.dep)
def coursierDependency: cs.Dependency =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is ignoring all user customizations to artifactName and artifactId.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's on purpose for now, yes

@alexarchambault
Copy link
Collaborator Author

Can you explain the role of the internal repositories? It looks like some essential implementation detail, but is exposed as task. Maybe, it's visibility can be reduced or it can be moved into some worker.

It has some scaladoc in JavaModule (going to copy it in CoursierModule). It's meant to have the internal repo that knows about this module and it's module dependencies. It's specific to each module, so I'm not sure it can go in a worker. Maybe it could be made protected, yes 🤔

@alexarchambault
Copy link
Collaborator Author

You introduced a new def coursierDependency which seems to represent the module itself. Is this in the spirit of PublishModule.publishSelfDependency? It seems to ignore any user settings regarding the artifact name and organization. Is it only used internally? What impact has it, when it's different from what publishSelfDependency returns?

It's only used internally, yes. I thought about making it use the same coordinates as for publishing, but I fear it might introduce ambiguity, if users mistakenly give the same coordinates to two modules. Also, all JavaModules don't extend PublishModule, so we need to have an internal naming scheme for those.

There shouldn't be any impact… These internal coordinates leak in the dependency trees, but some post-processing (added in this PR) hides those (it actually only keeps the internal name, built out of the module segments, which should look familiar enough to users).

Copy link
Member

@lefou lefou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexarchambault Thanks for the insights. I think, we should mark all these internal methods with @internal and reduce their visibility as much as possible so that we can move/remove them later. Ideally, we can group and hide them in some inner coursier object in the future.

Besides, this PR looks good to me.

Copy link
Collaborator Author

@alexarchambault alexarchambault left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a few comments about tricky points or odd changes, just in case

@@ -243,8 +243,7 @@ out/mill-server

> cat out/mill-build/methodCodeHashSignatures.dest/current/spanningInvalidationTree.json
{
"call scala.runtime.BoxesRunTime.boxToInteger(int)java.lang.Integer": {},
"call scala.Predef$#println(java.lang.Object)void": {
"call scala.runtime.BoxesRunTime.boxToInteger(int)java.lang.Integer": {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wanted to mention that I'm not familiar enough with the hash signature stuff to really understand why this change is needed. Oddly enough, that needs to be reverted in #4155.

},
{
"name": "org.hamcrest:hamcrest-core",
"version": "1.3"
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes in this file are a good thing: rather than including only dependencies directly added in Mill modules (via transitiveIvyDeps and all), this adds the whole dependency graph (including transitive dependencies pulled externally). IIUC, this is what is expected from that BSP request.

@@ -11,7 +11,7 @@
<orderEntry type="library" name="scala-js-SDK-3.3.1" level="project"/>
<orderEntry type="library" name="portable-scala-reflect_sjs1_2.13-1.1.3.jar" level="project"/>
<orderEntry type="library" name="scala-library-2.13.13.jar" level="project"/>
<orderEntry type="library" name="scala3-library_sjs1_3-3.3.1.jar" level="project"/>
<orderEntry type="library" name="scala3-library_sjs1_3-3.3.3.jar" level="project"/>
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't investigate why the version is bumped here. That doesn't look like a major issue though…

|│ └─ com.lihaoyi:upickle-core_2.13:1.4.0
|│ └─ com.lihaoyi:geny_2.13:0.6.10
|└─ org.scala-lang:scala-library:2.13.10
"""├─ StandaloneScala213
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We know see the moduleDep in this tree (and some of the next ones too), which is a good thing IMO

*/
def webJarDeps = Task {
transitiveIvyDeps().filter(_.dep.module.organization.value == "org.webjars")
ivyDeps()
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This slightly changes the semantics of the previous code. Before, ivyDeps of the moduleDeps (and their own moduleDeps, etc.) were taken into account. Now, only the ivyDeps of the current module are.

Alternatively, we could have a look at the whole dependency graph (including external transitive dependencies). But only looking at direct dependencies of the Mill modules looks odd to me (what if the Mill project is split or is a merge of two Mill projects - that changes the value of this task, while it shouldn't IMO).

@lihaoyi
Copy link
Member

lihaoyi commented Jan 16, 2025

@alexarchambault feel free to merge this whenever you're ready

@alexarchambault alexarchambault merged commit 1d12ccb into com-lihaoyi:main Jan 16, 2025
28 checks passed
@alexarchambault alexarchambault deleted the java-module-cs-project branch January 16, 2025 11:26
@lefou lefou added this to the 0.12.6 milestone Jan 16, 2025
Comment on lines +595 to +597
ProblemFilter.exclude[ReversedMissingMethodProblem]("mill.scalalib.JavaModule.mill$scalalib$JavaModule$$super$internalRepositories"),
ProblemFilter.exclude[ReversedMissingMethodProblem]("mill.scalanativelib.ScalaNativeModule.mill$scalanativelib$ScalaNativeModule$$super$coursierProject"),

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexarchambault these turn out not to be safe, we probably need to fix the code to not add calls to super since they are not binary compatible

CC @sake92

Copy link
Member

@lefou lefou Jan 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I said it before, but I suggest to generate the access to super.<overridden> from our macros as soon as we can break bin-compat, so that we better prepare the binary code for ad-hoc overriding of tasks. The super-calls can be left unused, so they should not affect functionality, but we at least we trigger the super-accessors to be present in the binary code.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying to address that in #4380, although the workarounds aren't fully satisfying IMO (they assume users don't add overrides between the trait defining the method and the one overriding it)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other such changes adding calls to super have been introduced in the past, apparently:

mill/build.mill

Lines 523 to 580 in 9815c35

ProblemFilter.exclude[ReversedMissingMethodProblem](
"mill.scalajslib.ScalaJSModule.mill$scalajslib$ScalaJSModule$$super$scalaLibraryIvyDeps"
),
ProblemFilter.exclude[ReversedMissingMethodProblem](
"mill.scalalib.ScalaModule.mill$scalalib$ScalaModule$$super$zincAuxiliaryClassFileExtensions"
),
ProblemFilter.exclude[ReversedMissingMethodProblem](
"mill.scalajslib.ScalaJSModule.mill$scalajslib$ScalaJSModule$$super$zincAuxiliaryClassFileExtensions"
),
ProblemFilter.exclude[ReversedMissingMethodProblem](
"mill.scalanativelib.ScalaNativeModule.mill$scalanativelib$ScalaNativeModule$$super$zincAuxiliaryClassFileExtensions"
),
// (6x) See https://github.com/com-lihaoyi/mill/pull/3064
// Moved targets up in trait hierarchy, but also call them via super, which I think is safe
ProblemFilter.exclude[ReversedMissingMethodProblem](
"mill.scalalib.JavaModule.mill$scalalib$JavaModule$$super$zincWorker"
),
ProblemFilter.exclude[ReversedMissingMethodProblem](
"mill.scalalib.JavaModule.mill$scalalib$JavaModule$$super$runClasspath"
),
ProblemFilter.exclude[ReversedMissingMethodProblem](
"mill.scalalib.JavaModule.mill$scalalib$JavaModule$$super$runUseArgsFile"
),
ProblemFilter.exclude[ReversedMissingMethodProblem](
"mill.scalalib.JavaModule.mill$scalalib$JavaModule$$super$forkArgs"
),
ProblemFilter.exclude[ReversedMissingMethodProblem](
"mill.scalalib.JavaModule.mill$scalalib$JavaModule$$super$forkEnv"
),
ProblemFilter.exclude[ReversedMissingMethodProblem](
"mill.scalalib.JavaModule.mill$scalalib$JavaModule$$super$forkWorkingDir"
),
// (8x)
// Moved targets up in trait hierarchy, but also call them via super, which I think is safe
ProblemFilter.exclude[ReversedMissingMethodProblem](
"mill.scalalib.JavaModule.mill$scalalib$JavaModule$$super$localRunClasspath"
),
ProblemFilter.exclude[ReversedMissingMethodProblem](
"mill.scalalib.JavaModule.mill$scalalib$JavaModule$$super$runLocal"
),
ProblemFilter.exclude[ReversedMissingMethodProblem](
"mill.scalalib.JavaModule.mill$scalalib$JavaModule$$super$run"
),
ProblemFilter.exclude[ReversedMissingMethodProblem](
"mill.scalalib.JavaModule.mill$scalalib$JavaModule$$super$doRunBackground"
),
ProblemFilter.exclude[ReversedMissingMethodProblem](
"mill.scalalib.JavaModule.mill$scalalib$JavaModule$$super$runBackgroundLogToConsole"
),
ProblemFilter.exclude[ReversedMissingMethodProblem](
"mill.scalalib.JavaModule.mill$scalalib$JavaModule$$super$runMainBackground"
),
ProblemFilter.exclude[ReversedMissingMethodProblem](
"mill.scalalib.JavaModule.mill$scalalib$JavaModule$$super$runMainLocal"
),
ProblemFilter.exclude[ReversedMissingMethodProblem](
"mill.scalalib.JavaModule.mill$scalalib$JavaModule$$super$runMain"
),

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants