Recent versions of gemma are slower #136

Closed
pjotrp opened this issue Feb 21, 2018 · 4 comments
@pjotrp
Member

pjotrp commented Feb 21, 2018

As discussed in #130, I am creating a new issue here. @pcarbo, is 0.96 also faster?

@pjotrp pjotrp self-assigned this Feb 21, 2018
@pjotrp pjotrp added this to the 0.98 release milestone Feb 21, 2018
@pcarbo
Collaborator

pcarbo commented Feb 21, 2018

@pjotrp GEMMA v0.96 is also faster (total runtime was 73 seconds). So it must be a change that was introduced after v0.96.

@pjotrp
Member Author

pjotrp commented Feb 21, 2018

Thanks for confirming. I am not on macOS, so I had to rule that out. We did change some calls to BLAS etc. The culprit should be pretty obvious.

@pjotrp
Member Author

pjotrp commented Jul 14, 2018

Hi @pcarbo, it is interesting to compare 0.96, 0.97 and 0.98-prerelease. I have some metrics here https://github.com/genenetwork/GEMMA/blob/gemma-0.98-preview/test/performance/releases.org

Somehow LAPACK bled into the 0.97 release, so OpenBLAS was not used to its full potential. A bit embarrassing for someone who claims to control the dependency graph(!?)

0.96 is a single-core Eigenlib version and 0.97 went multi-core with OpenBLAS. Unfortunately, I linked in LAPACK and an older BLAS, which slowed things down. In 0.98, OpenBLAS is mostly used and it is faster.

I can't comment on the macOS versions, but it is good to check with ldd what was linked in. Another reason not to do static releases.
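
A minimal sketch of the ldd check mentioned above, for dynamically linked binaries on Linux. The helper name `linalg_libs`, the binary path and the list of library name patterns are assumptions for illustration, not taken from the GEMMA build. On macOS, `otool -L` plays the role of ldd.

```shell
# Keep only the lines of `ldd` output that mention a linear-algebra library.
# The pattern list is a guess at common library names; adjust as needed.
# Reads from stdin, so it also works on saved ldd output.
linalg_libs() {
    grep -i -E 'blas|lapack|atlas|mkl' || true   # no match is not an error
}

# Usage on Linux (path/to/bin/gemma is a placeholder):
#   ldd path/to/bin/gemma | linalg_libs
```

If both an OpenBLAS line and a separate LAPACK or reference BLAS line show up, that would be worth a closer look, given what happened with 0.97.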

@pjotrp
Member Author

pjotrp commented Jul 14, 2018

@pcarbo, running the data from #130 gives the following timings for the different versions.

version 0.97 (with -no-check switch)

real    0m46.849s
user    2m29.088s
sys     1m42.528s

Running 0.98-prerelease with the -no-check switch is even faster

real    0m37.588s
user    2m54.096s
sys     0m17.100s

But the Eigenlib versions 0.95 and 0.96 are actually much slower on Linux, and use only one core.

version 0.96:

real    1m53.732s
user    1m52.568s
sys     0m1.144s

version 0.95:

real    1m53.643s
user    1m52.524s
sys     0m1.104s

Conclusion: 0.97 is pretty fast, but you need to switch checking off with the -no-check switch. With checking off, later versions behave like 0.96 and earlier, which ran without validation. Only switch the checks off at your own risk, because they are useful; see #72 for example. The overall speed gains are due to the use of multi-core OpenBLAS since 0.97.
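
One way to read the timings above: (user + sys) / real gives a rough estimate of how many cores were busy on average. The sketch below does that arithmetic on the numbers from this comment. The helper names `to_seconds` and `cores_used` are made up for illustration.

```shell
# Convert a `time`-style value such as 2m29.088s to seconds.
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Rough average number of busy cores: (user + sys) / real.
# Arguments: real user sys, in the format printed by `time`.
cores_used() {
    awk -v r="$(to_seconds "$1")" \
        -v u="$(to_seconds "$2")" \
        -v s="$(to_seconds "$3")" \
        'BEGIN { printf "%.1f\n", (u + s) / r }'
}

cores_used 0m46.849s 2m29.088s 1m42.528s   # 0.97            -> 5.4
cores_used 0m37.588s 2m54.096s 0m17.100s   # 0.98-prerelease -> 5.1
cores_used 1m53.732s 1m52.568s 0m1.144s    # 0.96            -> 1.0
```

This matches the observation that 0.95 and 0.96 use one core, while 0.97 and 0.98-prerelease keep about five busy. It is only an average over the whole run and says nothing about which phase is parallel.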

The command I used with the releases on GitHub looks like this:

time ../gemma-release-0.97/gemma-gn2-0.97-c760aa0-xqhsidq7h5/bin/gemma -lmm 4 -g gemma.130.g.small.bimbam -p gemma.130.p.txt -k gemma.130.k.cXX.txt -notsnp -n 1 -r2 1 -no-check

I think we can close this issue.
