-
-
Notifications
You must be signed in to change notification settings - Fork 360
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ability to relocate/shade in assembly #947
Conversation
09270d7
to
e8edff1
Compare
@joan38 Did you look at jarjar-abrams, instead of jar-relocator? It allows to shade / relocate classes, but it also adjusts the scala annotations accordingly, so that the shaded classes can be used from scala code too (if library A is shaded in library B, and scala users depend on B, they'll be able to reference the shaded classes of A thanks to that. This doesn't work with other shading / relocating libraries, because of the discrepancy after shading between the - shaded - class names and the - non-shaded - scala annotations). |
Hi @alexarchambault, I saw it while trying to implement the same as sbt-assembly and it's usage here and here and got stuck on how to use the lib (missing documentation). I'm a little busy with other stuff at work right now but I can give it a try later. If I need some help can I ask you? |
Sound like https://github.com/pantsbuild/jarjar is archived and has no activity since 2018. Is it safe to use that? |
Someone will probably fork it at some point… It's originally a fork itself too. |
Sure! |
As another example, https://github.com/coursier/sbt-shading uses pantsbuild/jarjar. I need to migrate it to jarjar-abrams. |
529060b
to
efd7894
Compare
@alexarchambault I have done the switch to jarjar-abrams but I didn't want to unzip, write jars, modify/shade jars, recopy jars. It's too many operations on disk. So I copy and modified the Also if it's easier for you, I'd be happy to schedule a 30min call with you guys, let me know. |
@joan38 Feel free to send a PR to jarjar-abrams. I've split it out of sbt-assembly so everyone can use it. |
Following from the Gitter chat with @lihaoyi, this "looks reasonable at a glance" but would like someone else to also review. |
I never really reviewed mill code, but I can have a look. It would have been nice to minimize the diff somehow… It seems there's quite a few added / removed empty lines, renamed variables, re-formatting, that hinder reviewing. |
@alexarchambault I just pushed the change in 2 separate commits for easier review. |
main/src/modules/Assembly.scala
Outdated
.entries() | ||
.asScala | ||
.filterNot(_.isDirectory) | ||
.map(entry => entry.getName -> jarFile.getInputStream(entry)) | ||
.map(entry => jarFile.getInputStream(entry) -> entry.getName) | ||
Shader.shadeInputStreams(shadeRules, mappings.toSeq, verbose = false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shading jar entries
main/src/modules/Assembly.scala
Outdated
.walk(path) | ||
.filter(os.isFile) | ||
.map(subPath => subPath.relativeTo(path).toString -> os.read.inputStream(subPath)) | ||
.map(subPath => os.read.inputStream(subPath) -> subPath.relativeTo(path).toString) | ||
Shader.shadeInputStreams(shadeRules, pathsWithMappings, verbose = false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shading class files (that are not jars)
private def classpathIterator(inputPaths: Agg[os.Path]): Agg[(String, InputStream)] = | ||
private def classpathIterator(inputPaths: Agg[os.Path], assemblyRules: Seq[Assembly.Rule]): Agg[(String, InputStream)] = { | ||
val shadeRules = assemblyRules.collect { | ||
case Rule.Relocate(from, to) => ShadePattern.Rename(List(from -> to)).inAll |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer exposing our own API (Rule.Relocate
) and mapping it to jarjar-abram's ShadePattern.Rename
in case we are swapping the shading lib.
@@ -29,6 +30,8 @@ object Assembly { | |||
|
|||
case class Exclude(path: String) extends Rule | |||
|
|||
case class Relocate(from: String, to: String) extends Rule |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Later we can add more config on this to apply the shading only to specific libs.
55514ce
to
4e60faf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
main/src/modules/Assembly.scala
Outdated
@@ -77,52 +70,39 @@ object Assembly { | |||
} | |||
} | |||
|
|||
private def classpathIterator(inputPaths: Agg[os.Path]): Generator[AssemblyEntry] = { | |||
Generator.from(inputPaths) | |||
private def classpathIterator(inputPaths: Agg[os.Path]): Agg[(String, InputStream)] = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why change that Generator
to an Agg
? Generator
seemed safer, in case the os.walk
returns many files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed the first level inputPaths
back to Generator.from(inputPaths)
but the os.walk
is not possible unless jarjar-abram supports Generator
. It looks for a Seq
maybe we should be more open and require a Iterable
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think with the signature in my comment below, we shouldn't need to have jarjar-abrams be aware of the collection here.
main/src/modules/Assembly.scala
Outdated
.entries() | ||
.asScala | ||
.filterNot(_.isDirectory) | ||
.map(entry => entry.getName -> jarFile.getInputStream(entry)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The handling of the InputStream
, via a () => InputStream
seemed safer before this. I think they're then opened one-by-one down the line, rather than all at once with these changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
jarjar-abram is looking for a Seq[(String, InputStream)]
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see. Maybe then, Shader.shadeInputStreams
in jarjar-abrams could be changed to something like
def processByteCode(
rules: Seq[ShadeRule],
verbose: Boolean
): (String, Array[Byte]) => Option[(String, Array[Byte])]
This method could be used to process the InputStream
s one by one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've done (String, InputStream) => Option[(String, InputStream)]
in eed3si9n/jarjar-abrams#6.
I monkey patched in this PR for the time being to see how it plays. What's your thought on this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed again with Array[Byte]
let me know what you think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After thinking about this, I feel like I prefer the InputStream
over Array[Byte]
because at least with an InputStream
we are not loading all the jars in memory if there is no rules. And also in the future we can think of jarjar-abram or another one that could process the shading on the fly inside an InputStream
. Maybe?
I added this commit to see how it looks with InputStream
. @alexarchambault let me know what you think.
a285fd7
to
8ae6772
Compare
@lefou this is good to go. |
@joan38 We have failing test cases which look related to the removed append logic. E.g.
|
Ah I didn't see it. Now it's fixed. |
@joan38 I find it really hard reviewing your PRs, as you constantly rebase exising commit(s). Adding new commits would better allow tracking what changed and would speed up review quite a bit. In most cases we squash when merging anyways, so adding commits is not an issue. |
Thanks for the feedback @lefou, I'll push again separating and avoid amend commits in the futur. |
2004f7b
to
fda8193
Compare
@@ -267,19 +262,12 @@ object Jvm { | |||
def createAssembly(inputPaths: Agg[os.Path], | |||
manifest: JarManifest = JarManifest.Default, | |||
prependShellScript: String = "", | |||
base: Option[os.Path] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had to remove this because we need to process both the project's classes and dependencies with the same shader instance.
@@ -293,30 +281,18 @@ object Jvm { | |||
manifest.build.write(manifestOut) | |||
manifestOut.close() | |||
|
|||
Assembly.groupAssemblyEntries(inputPaths, assemblyRules).view |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need for .view
because we execute only a foreach right after
val separated = | ||
if (entries.isEmpty) Nil | ||
else | ||
entries.head +: entries.tail.flatMap { e => | ||
List(JarFileEntry(e.mapping, () => new ByteArrayInputStream(separator.getBytes)), e) | ||
} | ||
val concatenated = new SequenceInputStream( | ||
Collections.enumeration(separated.map(_.inputStream).asJava)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The concatenation is handled in groupAssemblyEntries
now.
@@ -335,16 +311,14 @@ object Jvm { | |||
PathRef(output) | |||
} | |||
|
|||
private def writeEntry(p: java.nio.file.Path, is: InputStream, append: Boolean): Unit = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need append
now since we are doing the InputStreams
concat and we are not merging with the upstream jar.
I pushed again and added comments. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. Thanks for your extra comments.
We now no longer use the upstreamAssembly
target, but keep it. Give me some time to think about it. Maybe we can deprecate and remove that target in the future. If I understood it correctly, it's purpose was optimization, which we've lost now. Otherwise, I'm going to merge.
Alright after writing a reply to your comment @lefou I realized the importance of I actually managed to revert back this benefit keeping the shading. Have a look. |
62ff286
to
81b0023
Compare
Build is green now @lefou if you want to take a look. |
Not sure how to read that statement. Just to clarify: In its current state this PR makes building assemblies significantly slower, right? For the sake of class relocation support. |
It used to be slower in the case of running assembly then modifying your code and rerunning assembly. Because we would reprocess all the dependencies again. Now it's back to what it was before, and we have shading. So no compromise done. |
Ok, great! Thank you very much! |
Thanks @lefou, @alexarchambault and @eed3si9n ! |
Resolves #355
This gives the ability to relocate/shade in assembly on a global level.
This is a good first step but we might want later to provide the ability to shade only for a given lib. This is currently not possible because at the time we get the class path we lost the information of the dependency tree.
Example of usage: