
patch a perf bottleneck #79

Closed
wants to merge 1 commit into from

Conversation

@fommil commented Oct 13, 2017

I did some analysis that looked the same as sbt/zinc#371, then I came up with this patch.

I've spent the last few hours trying to apply this as a monkey patch, but I can't get it to work, so I have absolutely no idea whether it improves anything or not.

UPDATE: this still needs to actually calculate the hash. My point is that using NIO should be a lot faster than the FilterInputStream, which is using 99% of my CPU in the profile (not the calculation of the hash).

```diff
@@ -52,7 +53,7 @@ object Hash {
   def apply(as: Array[Byte]): Array[Byte] = apply(new ByteArrayInputStream(as))

   /** Calculates the SHA-1 hash of the given file. */
-  def apply(file: File): Array[Byte] = Using.fileInputStream(file)(apply)
+  def apply(file: File): Array[Byte] = Files.readAllBytes(file.toPath)
```
Member

Wat. This reads all the bytes in the file; it doesn't calculate the SHA-1 hash for it.

Author

well then apply the hash to it afterwards... the slow bit is the FilterInputStream.

like I said, I have absolutely no way of validating anything about this PR.

Member

Ok.

Author

```scala
def apply(file: File): Array[Byte] = apply(Files.readAllBytes(file.toPath))
```
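A self-contained sketch of that one-liner: read the whole file in a single NIO call, then hash the in-memory bytes. The object name and the direct use of `java.security.MessageDigest` are illustrative assumptions; this is not sbt's actual `Hash` object.

```scala
import java.io.File
import java.nio.file.Files
import java.security.MessageDigest

// Illustrative sketch (not sbt's actual Hash object): read the whole file
// with one NIO bulk read, then compute SHA-1 over the in-memory bytes,
// instead of streaming the file through a FilterInputStream in small chunks.
object NioHash {
  def sha1(file: File): Array[Byte] = {
    val bytes = Files.readAllBytes(file.toPath) // one bulk read via NIO
    MessageDigest.getInstance("SHA-1").digest(bytes)
  }

  // Hex rendering, handy for comparing against known test vectors.
  def hex(digest: Array[Byte]): String = digest.map("%02x".format(_)).mkString
}
```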

@fommil commented Oct 14, 2017

using sbt/launcher#43 (and including the sbt.boot.properties from my system's sbt-launch.jar) I am able to start up sbt with a command like

```
/usr/lib/jvm/java-8-openjdk/bin/java -Dsbt.launcher.monkey=/home/fommil/Projects/io/io/target/scala-2.12/io_2.12-1.1.0-4cdfe23e77f75495dc363fb99748474bedca8edb.jar -XX:MaxMetaspaceSize=1g -Xss2m -Xms1g -Xmx2g -XX:ReservedCodeCacheSize=128m -XX:+CMSClassUnloadingEnabled -jar /home/fommil/Projects/launcher/target/sbt-launch-1.0.2-SNAPSHOT.jar
```

and get a CPU profile in yourkit.

Unfortunately, it looks like the bottleneck is still there in FilterInputStream, now coming when the Array[Byte] is wrapped up. In fact, the performance is worse with this patch (and I'm not sure why that should be the case).

So the actual bottleneck is the digest calculation.

I don't understand why we calculate this again for every file, even if the size and timestamp have not changed.
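The caching that comment asks for might look like the following sketch: re-hash a file only when its (length, lastModified) stamp changes. Everything here — the object name, the `TrieMap` cache, the stamp choice — is a hypothetical illustration, not sbt's implementation.

```scala
import java.io.File
import java.nio.file.Files
import java.security.MessageDigest
import scala.collection.concurrent.TrieMap

// Hypothetical sketch of the cache the comment asks about: skip the digest
// when a file's (length, lastModified) pair is unchanged. Not sbt's code.
object CachedHash {
  private val cache = TrieMap.empty[File, ((Long, Long), Array[Byte])]

  def sha1(file: File): Array[Byte] = {
    val stamp = (file.length, file.lastModified)
    cache.get(file) match {
      case Some((`stamp`, digest)) => digest // size and timestamp unchanged
      case _ =>
        val digest =
          MessageDigest.getInstance("SHA-1").digest(Files.readAllBytes(file.toPath))
        cache.put(file, (stamp, digest))
        digest
    }
  }
}
```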

@fommil fommil closed this Oct 14, 2017
@fommil commented Oct 14, 2017
