
patch a perf bottleneck #79

Closed
wants to merge 1 commit into from

Conversation

@fommil commented Oct 13, 2017

I did some analysis that looked the same as sbt/zinc#371, then I came up with this patch.

I've spent the last few hours trying to apply this as a monkey patch, but I can't get it to work, so I have absolutely no idea whether it improves anything or not.

UPDATE: this still needs to actually calculate the hash. My point is that using NIO should be a lot faster than the FilterInputStream, which is using 99% of my CPU in the profile (not the calculation of the hash).

```diff
@@ -52,7 +53,7 @@ object Hash {
   def apply(as: Array[Byte]): Array[Byte] = apply(new ByteArrayInputStream(as))

   /** Calculates the SHA-1 hash of the given file. */
-  def apply(file: File): Array[Byte] = Using.fileInputStream(file)(apply)
+  def apply(file: File): Array[Byte] = Files.readAllBytes(file.toPath)
```
Member

Wat. This reads all the bytes in the file; it doesn't calculate the SHA-1 hash for it.

Author

well then apply the hash to it afterwards... the slow bit is the FilterInputStream.

like I said, I have absolutely no way of validating anything about this PR.

Member

Ok.

Author

```scala
def apply(file: File): Array[Byte] = apply(Files.readAllBytes(file.toPath))
```
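A self-contained sketch of that one-liner: read the whole file in a single NIO call, then hash the in-memory bytes. The object name and the direct use of `java.security.MessageDigest` are illustrative assumptions; this is not sbt's actual `Hash` object.

```scala
import java.io.File
import java.nio.file.Files
import java.security.MessageDigest

// Illustrative sketch (not sbt's actual Hash object): read the whole file
// with one NIO bulk read, then compute SHA-1 over the in-memory bytes,
// instead of streaming the file through a FilterInputStream in small chunks.
object NioHash {
  def sha1(file: File): Array[Byte] = {
    val bytes = Files.readAllBytes(file.toPath) // one bulk read via NIO
    MessageDigest.getInstance("SHA-1").digest(bytes)
  }

  // Hex rendering, handy for comparing against known test vectors.
  def hex(digest: Array[Byte]): String = digest.map("%02x".format(_)).mkString
}
```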

@fommil commented Oct 14, 2017

using sbt/launcher#43 (and including the sbt.boot.properties from my system's sbt-launch.jar) I am able to start up sbt with a command like

```
/usr/lib/jvm/java-8-openjdk/bin/java -Dsbt.launcher.monkey=/home/fommil/Projects/io/io/target/scala-2.12/io_2.12-1.1.0-4cdfe23e77f75495dc363fb99748474bedca8edb.jar -XX:MaxMetaspaceSize=1g -Xss2m -Xms1g -Xmx2g -XX:ReservedCodeCacheSize=128m -XX:+CMSClassUnloadingEnabled -jar /home/fommil/Projects/launcher/target/sbt-launch-1.0.2-SNAPSHOT.jar
```

and get a CPU profile in yourkit.

Unfortunately, it looks like the bottleneck is still there in FilterInputStream, now coming when the Array[Byte] is wrapped up. In fact, the performance is worse with this patch (and I'm not sure why that should be the case).

So the actual bottleneck is the digest calculation.

I don't understand why we calculate this again for every file, even if the size and timestamp have not changed.
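The caching that comment asks for might look like the following sketch: re-hash a file only when its (length, lastModified) stamp changes. Everything here — the object name, the `TrieMap` cache, the stamp choice — is a hypothetical illustration, not sbt's implementation.

```scala
import java.io.File
import java.nio.file.Files
import java.security.MessageDigest
import scala.collection.concurrent.TrieMap

// Hypothetical sketch of the cache the comment asks about: skip the digest
// when a file's (length, lastModified) pair is unchanged. Not sbt's code.
object CachedHash {
  private val cache = TrieMap.empty[File, ((Long, Long), Array[Byte])]

  def sha1(file: File): Array[Byte] = {
    val stamp = (file.length, file.lastModified)
    cache.get(file) match {
      case Some((`stamp`, digest)) => digest // size and timestamp unchanged
      case _ =>
        val digest =
          MessageDigest.getInstance("SHA-1").digest(Files.readAllBytes(file.toPath))
        cache.put(file, (stamp, digest))
        digest
    }
  }
}
```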

@fommil fommil closed this Oct 14, 2017
@fommil commented Oct 14, 2017
