Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OutOfMemoryError when running the scanner #3091

Closed
uszeiss opened this issue Sep 18, 2020 · 28 comments
Closed

OutOfMemoryError when running the scanner #3091

uszeiss opened this issue Sep 18, 2020 · 28 comments
Assignees
Labels
bug Issues that are considered to be bugs scanner About the scanner tool

Comments

@uszeiss
Copy link
Contributor

uszeiss commented Sep 18, 2020

Hi,

I have a fairly large Maven project (about 500 entries in the dependency tree) that I want to scan with ORT. The analyzer works as expected and produces an analyzer-result.json file of about 1 M.

When I run the scanner, though, It runs for a while and then terminates with the following error message:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space

The exact point of failure varies with every run, but generally it happens at a rather advanced stage of the scan (say at 392/486 packages scanned).

A typical stack trace looks like this:

15:26:15.236 [FileBasedStorage with XZCompressedLocalFileStorage backend-4] INFO  kotlinx.coroutines.CoroutineScope - Looking for stored scan results for Maven:org.eclipse.persistence:org.eclipse.persistence.jpa
.modelgen.processor:2.7.3 and ScannerDetails(name=ScanCode, version=3.0.2, configuration=--copyright --license --ignore *.ort.yml --info --strip-root --timeout 300 --ignore HERE_NOTICE --ignore META-INF/DEPENDEN
CIES --json-pp --license-diag) (384/486).
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at java.base/java.util.concurrent.CompletableFuture.uniHandleStage(CompletableFuture.java:944)
        at java.base/java.util.concurrent.CompletableFuture.handle(CompletableFuture.java:2266)
        at java.base/java.lang.ProcessImpl.initStreams(ProcessImpl.java:389)
        at java.base/java.lang.ProcessImpl.lambda$new$0(ProcessImpl.java:352)
        at java.base/java.lang.ProcessImpl$$Lambda$144/0x0000000800551040.run(Unknown Source)
        at java.base/java.security.AccessController.doPrivileged(Native Method)
        at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:351)
        at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:271)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1107)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1071)
        at org.ossreviewtoolkit.utils.ProcessCapture.<init>(ProcessCapture.kt:86)
        at org.ossreviewtoolkit.utils.ProcessCapture.<init>(ProcessCapture.kt:29)
        at org.ossreviewtoolkit.utils.ProcessCapture.<init>(ProcessCapture.kt:33)
        at org.ossreviewtoolkit.utils.CommandLineTool$DefaultImpls.run(CommandLineTool.kt:100)
        at org.ossreviewtoolkit.scanner.LocalScanner.run(LocalScanner.kt:78)
        at org.ossreviewtoolkit.utils.CommandLineTool$DefaultImpls.getVersion(CommandLineTool.kt:106)
        at org.ossreviewtoolkit.scanner.LocalScanner.getVersion(LocalScanner.kt:134)
        at org.ossreviewtoolkit.scanner.LocalScanner.getDetails(LocalScanner.kt:151)
        at org.ossreviewtoolkit.scanner.LocalScanner.scanPackage(LocalScanner.kt:194)
        at org.ossreviewtoolkit.scanner.LocalScanner$scanPackages$2$invokeSuspend$$inlined$map$lambda$1.invokeSuspend(LocalScanner.kt:166)
        at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
        at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:56)
        at kotlinx.coroutines.EventLoopImplBase.processNextEvent(EventLoop.common.kt:274)
        at kotlinx.coroutines.BlockingCoroutine.joinBlocking(Builders.kt:84)
        at kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking(Builders.kt:59)
        at kotlinx.coroutines.BuildersKt.runBlocking(Unknown Source)
        at kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking$default(Builders.kt:38)
        at kotlinx.coroutines.BuildersKt.runBlocking$default(Unknown Source)
        at org.ossreviewtoolkit.scanner.Scanner.scanOrtResult(Scanner.kt:105)
        at org.ossreviewtoolkit.commands.ScannerCommand.run(ScannerCommand.kt:165)
        at com.github.ajalt.clikt.parsers.Parser.parse(Parser.kt:171)
        at com.github.ajalt.clikt.parsers.Parser.parse(Parser.kt:180)

But I have seem these stack traces, too:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at java.base/java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:61)
        at java.base/java.nio.ByteBuffer.allocate(ByteBuffer.java:348)
        at java.base/sun.nio.cs.StreamDecoder.<init>(StreamDecoder.java:251)
        at java.base/sun.nio.cs.StreamDecoder.<init>(StreamDecoder.java:231)
        at java.base/sun.nio.cs.StreamDecoder.forInputStreamReader(StreamDecoder.java:78)
        at java.base/java.io.InputStreamReader.<init>(InputStreamReader.java:116)
        at kotlin.io.FilesKt__FileReadWriteKt.readText(FileReadWrite.kt:125)
        at kotlin.io.FilesKt__FileReadWriteKt.readText$default(FileReadWrite.kt:125)
        at org.ossreviewtoolkit.utils.ProcessCapture.getStdout(ProcessCapture.kt:67)
        at org.ossreviewtoolkit.utils.CommandLineTool$DefaultImpls.getVersion(CommandLineTool.kt:109)
        at org.ossreviewtoolkit.scanner.LocalScanner.getVersion(LocalScanner.kt:134)
        at org.ossreviewtoolkit.scanner.LocalScanner.getDetails(LocalScanner.kt:151)
        at org.ossreviewtoolkit.scanner.LocalScanner.scanPackage(LocalScanner.kt:194)
        at org.ossreviewtoolkit.scanner.LocalScanner$scanPackages$2$invokeSuspend$$inlined$map$lambda$1.invokeSuspend(LocalScanner.kt:166)
        at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
        at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:56)
        at kotlinx.coroutines.EventLoopImplBase.processNextEvent(EventLoop.common.kt:274)
        at kotlinx.coroutines.BlockingCoroutine.joinBlocking(Builders.kt:84)
        at kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking(Builders.kt:59)
        at kotlinx.coroutines.BuildersKt.runBlocking(Unknown Source)
        at kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking$default(Builders.kt:38)
        at kotlinx.coroutines.BuildersKt.runBlocking$default(Unknown Source)
        at org.ossreviewtoolkit.scanner.Scanner.scanOrtResult(Scanner.kt:105)
        at org.ossreviewtoolkit.commands.ScannerCommand.run(ScannerCommand.kt:164)
        at com.github.ajalt.clikt.parsers.Parser.parse(Parser.kt:171)
        at com.github.ajalt.clikt.parsers.Parser.parse(Parser.kt:180)
        at com.github.ajalt.clikt.parsers.Parser.parse(Parser.kt:16)
        at com.github.ajalt.clikt.core.CliktCommand.parse(CliktCommand.kt:395)
        at com.github.ajalt.clikt.core.CliktCommand.parse$default(CliktCommand.kt:392)
        at com.github.ajalt.clikt.core.CliktCommand.main(CliktCommand.kt:410)
        at com.github.ajalt.clikt.core.CliktCommand.main(CliktCommand.kt:435)
        at org.ossreviewtoolkit.OrtMainKt.main(OrtMain.kt:174)
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space                
        at org.ossreviewtoolkit.utils.CommandLineTool$DefaultImpls.getVersion(CommandLineTool.kt:137)
        at org.ossreviewtoolkit.scanner.LocalScanner.getVersion(LocalScanner.kt:134)
        at org.ossreviewtoolkit.scanner.LocalScanner.getDetails(LocalScanner.kt:151)
        at org.ossreviewtoolkit.scanner.LocalScanner.scanPackage(LocalScanner.kt:194)
        at org.ossreviewtoolkit.scanner.LocalScanner$scanPackages$2$invokeSuspend$$inlined$map$lambda$1.invokeSuspend(LocalScanner.kt:166)
        at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
        at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:56)
        at kotlinx.coroutines.EventLoopImplBase.processNextEvent(EventLoop.common.kt:274)
        at kotlinx.coroutines.BlockingCoroutine.joinBlocking(Builders.kt:84)
        at kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking(Builders.kt:59)
        at kotlinx.coroutines.BuildersKt.runBlocking(Unknown Source)
        at kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking$default(Builders.kt:38)
        at kotlinx.coroutines.BuildersKt.runBlocking$default(Unknown Source)
        at org.ossreviewtoolkit.scanner.Scanner.scanOrtResult(Scanner.kt:105)
        at org.ossreviewtoolkit.commands.ScannerCommand.run(ScannerCommand.kt:164)
        at com.github.ajalt.clikt.parsers.Parser.parse(Parser.kt:171)
        at com.github.ajalt.clikt.parsers.Parser.parse(Parser.kt:180)
        at com.github.ajalt.clikt.parsers.Parser.parse(Parser.kt:16)
        at com.github.ajalt.clikt.core.CliktCommand.parse(CliktCommand.kt:395)
        at com.github.ajalt.clikt.core.CliktCommand.parse$default(CliktCommand.kt:392)
        at com.github.ajalt.clikt.core.CliktCommand.main(CliktCommand.kt:410)
        at com.github.ajalt.clikt.core.CliktCommand.main(CliktCommand.kt:435)
        at org.ossreviewtoolkit.OrtMainKt.main(OrtMain.kt:174)
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space                        
        at kotlin.io.TextStreamsKt.copyTo(ReadWrite.kt:122)                 
        at kotlin.io.TextStreamsKt.copyTo$default(ReadWrite.kt:120)                                  
        at kotlin.io.TextStreamsKt.readText(ReadWrite.kt:107)                       
        at kotlin.io.FilesKt__FileReadWriteKt.readText(FileReadWrite.kt:125)        
        at kotlin.io.FilesKt__FileReadWriteKt.readText$default(FileReadWrite.kt:125) 
        at org.ossreviewtoolkit.utils.ProcessCapture.getStdout(ProcessCapture.kt:67)                                                                                                                         
        at org.ossreviewtoolkit.utils.ProcessCapture.<init>(ProcessCapture.kt:133)               
        at org.ossreviewtoolkit.utils.ProcessCapture.<init>(ProcessCapture.kt:29)
        at org.ossreviewtoolkit.utils.ProcessCapture.<init>(ProcessCapture.kt:33)        
        at org.ossreviewtoolkit.utils.CommandLineTool$DefaultImpls.run(CommandLineTool.kt:100)
        at org.ossreviewtoolkit.scanner.LocalScanner.run(LocalScanner.kt:78)    
        at org.ossreviewtoolkit.utils.CommandLineTool$DefaultImpls.getVersion(CommandLineTool.kt:106)
        at org.ossreviewtoolkit.scanner.LocalScanner.getVersion(LocalScanner.kt:134)    
        at org.ossreviewtoolkit.scanner.LocalScanner.getDetails(LocalScanner.kt:151)
        at org.ossreviewtoolkit.scanner.LocalScanner.scanPackage(LocalScanner.kt:194)
        at org.ossreviewtoolkit.scanner.LocalScanner$scanPackages$2$invokeSuspend$$inlined$map$lambda$1.invokeSuspend(LocalScanner.kt:166)
        at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
        at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:56)
        at kotlinx.coroutines.EventLoopImplBase.processNextEvent(EventLoop.common.kt:274)
        at kotlinx.coroutines.BlockingCoroutine.joinBlocking(Builders.kt:84)
        at kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking(Builders.kt:59)
        at kotlinx.coroutines.BuildersKt.runBlocking(Unknown Source)
        at kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking$default(Builders.kt:38)
        at kotlinx.coroutines.BuildersKt.runBlocking$default(Unknown Source)
        at org.ossreviewtoolkit.scanner.Scanner.scanOrtResult(Scanner.kt:105)
        at org.ossreviewtoolkit.commands.ScannerCommand.run(ScannerCommand.kt:164)
        at com.github.ajalt.clikt.parsers.Parser.parse(Parser.kt:171)
        at com.github.ajalt.clikt.parsers.Parser.parse(Parser.kt:180)
        at com.github.ajalt.clikt.parsers.Parser.parse(Parser.kt:16)
        at com.github.ajalt.clikt.core.CliktCommand.parse(CliktCommand.kt:395)
        at com.github.ajalt.clikt.core.CliktCommand.parse$default(CliktCommand.kt:392)
        at com.github.ajalt.clikt.core.CliktCommand.main(CliktCommand.kt:410)

(Sorry for pasting so excessively.)

Based on this and this issue, I put a gradle.properties next to the pom.xml file of the project to be scanned and, to be on the safe side, I tried setting JAVA_OPTS="-Xms1024M -Xmx6g -XX:MaxMetaspaceSize=2g -Dkotlin.daemon.jvm.options='-Xmx2g'" before running the scanner, but without success.

Is there an error somewhere in my memory options? Do I need to allow for even more memory?

I work with the latest ORT code on Ubuntu 20.04 which runs in a Virtual Box environment under Windows. The VM has 10 G available as main memory. A colleague of mine tried to scan the project under Windows and ran into the same problem.

I did not experience this issue when I was scanning the same project with ORT a few months ago. I don't know what has changed since then.

I'd appreciate any help.

Thanks and keep up the great work.

@fb33
Copy link
Contributor

fb33 commented Sep 28, 2020

Hello, just for your information, I'm using for some projects up to 10g !
-Xms512M -Xmx10g -XX:MaxMetaspaceSize=6g -Dkotlin.daemon.jvm.options='-Xmx8g'

@uszeiss
Copy link
Contributor Author

uszeiss commented Sep 28, 2020

Thanks for your feedback, @fb33.
Do you work on Linux as well? And into which file exactly do you put these parameters?

@fb33
Copy link
Contributor

fb33 commented Sep 28, 2020

it works with docker. I pass it with an env var.
Here's an example from my jenkinsfile:

                        def result = sh(
                            returnStatus:true,
                            script : "docker run --rm $CPUS -v $WORKSPACE/project:/project -v $WORKSPACE/ort_home:/root/.ort -e \"JAVA_OPTS=$ORT_JAVA_OPTS\" $ORT_IMAGE:$ORT_IMAGE_VERSION $LOG_LEVEL -c /project/ort-scanner.conf scan --skip-excluded -f JSON,YAML -i /project/reports/analyzer/analyzer-result.yml -o /project/reports/scanner"
                        )

where $ORT_JAVA_OPTS contains the options

@uszeiss
Copy link
Contributor Author

uszeiss commented Sep 30, 2020

I'm running some tests with the newest version (supporting ScanCode 3.2.1rc2) to check if the problem persists.

@sschuberth sschuberth added bug Issues that are considered to be bugs scanner About the scanner tool labels Oct 1, 2020
@uszeiss
Copy link
Contributor Author

uszeiss commented Oct 1, 2020

When I run the scanner on my project for the first time, it will take about 20 hours and will finish successfully. If I run the scanner a second time on the same project, leaving any project files, ORT caches, and downloaded files untouched, it will crash due to insufficient memory as described above. This seems weird to me as the second run should use the cached results and finish quickly.

Any ideas?

@sschuberth
Copy link
Member

sschuberth commented Oct 1, 2020

Interesting. Which scan storage have you configured? "FileBasedStorage with XZCompressedLocalFileStorage" as mentioned above?

@uszeiss
Copy link
Contributor Author

uszeiss commented Oct 1, 2020

I assume I use the default local file storage as I have no fileBasedStorage section in my configuration file. I do have a scanner archive storage configured via

storage {
  localFileStorage {
    directory = ~/.ort/scanner/results
    compression = false
  }
}

but this seems to be something else...

@sschuberth
Copy link
Member

Ok, I was asking because our log above shows

15:26:15.236 [FileBasedStorage with XZCompressedLocalFileStorage backend-4] INFO  kotlinx.coroutines.CoroutineScope - Looking for stored scan results for Maven:org.eclipse.persistence:org.eclipse.persistence.jpa
.modelgen.processor:2.7.3 and ScannerDetails(name=ScanCode, version=3.0.2, configuration=--copyright --license --ignore *.ort.yml --info --strip-root --timeout 300 --ignore HERE_NOTICE --ignore META-INF/DEPENDEN
CIES --json-pp --license-diag) (384/486).

But now you say you have compression = false, which means no XZ compression should be used. Did you change the configuration meanwhile from compressed to uncompressed, but the OOM error occurs with both configurations?

@uszeiss
Copy link
Contributor Author

uszeiss commented Oct 1, 2020

Sorry, I think I mixed up the scanner archive storage and the scanner storage. Based on this reference config file, they are not the same. I'll try to add a scanner storage with no compression and run the scanner again.

@uszeiss
Copy link
Contributor Author

uszeiss commented Oct 2, 2020

Same result with the uncompressed local file storage: the second run terminates with an OutOfMemoryError. Do you think it might make sense to try the Postgres storage to avoid such an error?

@sschuberth
Copy link
Member

Using Postgres anyway is a good idea, and it might help us to narrow down the root cause, so yes.

@uszeiss
Copy link
Contributor Author

uszeiss commented Oct 3, 2020

Alright, give me about 24 hours and I'll report back.

@uszeiss
Copy link
Contributor Author

uszeiss commented Oct 5, 2020

Same result with Postgres, unfortunately. So it does not seem to be related to the file storage.

Naively, I'd expect the initial scan to use more memory than the subsequent one (which uses cached results), yet the initial scan runs fine whereas the second runs crashes.

Is there anything else I could test?

@sschuberth
Copy link
Member

Again, interesting. To me that means that the potential memory leak could be in the general code that deserializes already stored scan results. Is there a way you could profile memory use / create a heap dump and inspect which objects take up that much memory?

@uszeiss
Copy link
Contributor Author

uszeiss commented Oct 5, 2020

I'll see what I can do. Meanwhile, here's the log output for the second scan: scan.log

I removed the irrelevant parts in which the scanner looks for stored results. What's interesting to me is that the ScanCode version is requested multiple times before the OutOfMemoryError occurs.

@sschuberth
Copy link
Member

I see that you rely on ScanCode being bootstrapped, which is something I do not recommend to do, mostly for performance reasons. Could you install / configure ScanCode 3.2.1rc2 beforehand, put it in PATH, and then try to run ORT to see if the error persists?

@uszeiss
Copy link
Contributor Author

uszeiss commented Oct 6, 2020

Thanks for all of your recommendations. I finally made it work by setting my JAVA_OPTS to -Xmx8g -Dkotlin.daemon.jvm.options='-Xmx8g'.

I used VisualVM to monitor the memory consumption of the ORT scanner. The scan runs for about 10 minutes. The consumed memory is quite low (< 300M) for the first 5 minutes, then it quickly goes up to about 1G, and a minute later it jumps to about 7.5G - all while it is looking for stored scan results.

I could provide a heap dump if you're interested but the file is about 1G. The top 3 classes by size of instances are

  1. LinkedHashMap$Entry
  2. byte[]
  3. HashMap$Node

I don't quite understand why the memory consumption jumps significantly or why this second scanner run takes up much more memory than the first run, but for now I'm happy it works at all.

Thanks again for your support. It helped me a lot.

@sschuberth
Copy link
Member

Maybe it would be good to archive that heap dump for a little while, if possible. We have some (probably unrelated, but still) changes to the scan storage coming soon; maybe you could also test whether the issue can still be observed afterwards.

@uszeiss
Copy link
Contributor Author

uszeiss commented Oct 6, 2020

Sure, no problem. Let me know when I can run a re-test.

@sschuberth
Copy link
Member

sschuberth commented Dec 1, 2020

@mnonnenmacher realized that in many stack traces LocalScanner.getVersion() is contained mostly at the top. I recently made a change to call that function far less often.

So, can any of the affected reporters please check if the OutOfMemoryError persists with a recent version of ORT?

@uszeiss
Copy link
Contributor Author

uszeiss commented Dec 2, 2020

Thanks for the feedback. I'm running a scan of a large project using the ORT code, and I will report back when it's done.

@uszeiss
Copy link
Contributor Author

uszeiss commented Dec 2, 2020

My scan has finished successfully, I have seen no OOMEs. Thanks!

@sschuberth
Copy link
Member

Quite remarkable, thanks for testing! I'll still keep this open for a bit more time, until we're certain the issue does not occur anymore.

@fb33
Copy link
Contributor

fb33 commented Dec 2, 2020

Hello,
For information, I also test with the last version of master, and with my context I didn't see any amelioration.
I still have to configure 14Go for the pod that execute ORT. If I down to 10Go, the pod is killed by k8s due to excessive memory consuption.

@sschuberth
Copy link
Member

sschuberth commented Dec 2, 2020

FYI, additionally @oheger-bosch has filed a PR at #3412 that aims to reduce the scanner's memory usage in general.

@oheger-bosch
Copy link
Member

Regarding the getVersion() invocation mentioned by @mnonnenmacher and @sschuberth, I did some testing. I ran a simple test that executes this method in a loop. What I found out is that there is indeed a small memory leak caused by ProcessCapture calling deleteOnExit() on the temporary files it creates to capture the process output. deleteOnExit() causes the name of the file affected to be recorded in a static Set managed by the internal DeleteOnExitHook system class, so that it can be deleted later.

So each usage of the ProcessCapture class adds three file names to this set (for stdout, stderr, and the temporary directory). This could be cured by ensuring manually that the files are cleaned up, e.g. using a try-finally construct. However, my gut feeling is that the amount of memory wasted here is not really significant.

@sschuberth
Copy link
Member

What I found out is that there is indeed a small memory leak caused by ProcessCapture calling deleteOnExit() on the temporary files it creates to capture the process output.

Sorry for getting a bit off-topic, but FYI, the reason why files are used in the first place to redirect output is to avoid the need for "pumper" threads to pull data out of the stdout and stderr pipes. Another approach I was considering to solve that was to use https://github.com/brettwooldridge/NuProcess instead, but I never found the time to give that library a try.

@sschuberth
Copy link
Member

I'm closing this for now; please reopen if the issue reoccurs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Issues that are considered to be bugs scanner About the scanner tool
Projects
None yet
Development

No branches or pull requests

4 participants