Update to use python 3.10 #11

Merged: 7 commits, Jun 6, 2024


TimSC
Collaborator

@TimSC TimSC commented Aug 23, 2023

Update to use python 3.10, recent torch libraries, plus minor fixes

@TimSC TimSC changed the title from "Update to use python 10" to "Update to use python 3.10" on Aug 23, 2023
@ZhenglinZhou
Owner

Hi @TimSC, many thanks for your PR!

But I am worried about the performance difference when changing the torch version. Do you mind evaluating this new version on WFLW?

If you have any questions, feel free to leave a comment or email me ([email protected]).

@TimSC
Collaborator Author

TimSC commented Sep 10, 2023

Good point, I didn't evaluate the speed on either pytorch version. I might also check that my fixes work on python 3.8/pytorch 1.6 using anaconda. Do you have a way to properly evaluate the speed? Are we talking about training speed, testing speed, or both?

My immediate suspicion is that my change to _covars.symeig(eigenvectors=True) was not right. There are several related decomposition functions, and which one applies depends on the matrix properties. I just picked the one that was recommended, rather than the fastest one.
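
For reference, a minimal sketch of the kind of replacement discussed in this comment. It assumes the change moved from the deprecated Tensor.symeig to torch.linalg.eigh; the thread does not show the exact call that was used. The helper name symmetric_eig is hypothetical and not part of STAR.

```python
import torch


def symmetric_eig(covars):
    """Eigendecomposition of symmetric matrices across torch versions.

    Returns (eigenvalues, eigenvectors), with eigenvalues in ascending order.
    """
    linalg = getattr(torch, "linalg", None)
    if linalg is not None and hasattr(linalg, "eigh"):
        # Newer torch: torch.linalg.eigh reads the lower triangle by default.
        eigenvalues, eigenvectors = torch.linalg.eigh(covars)
    else:
        # Older torch: symeig reads the upper triangle by default.
        eigenvalues, eigenvectors = covars.symeig(eigenvectors=True)
    return eigenvalues, eigenvectors


# Example: a small symmetric (covariance-like) matrix.
covars = torch.tensor([[2.0, 1.0], [1.0, 2.0]])
eigenvalues, eigenvectors = symmetric_eig(covars)
```

For an exactly symmetric input both triangles agree, so either branch should give the same eigenvalues. The eigenvectors can differ in sign between the two calls, which matters only if downstream code depends on their orientation.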

@TimSC
Collaborator Author

TimSC commented Sep 22, 2023

I ran my branch with torch 1.7 and torch 2.0.1. (I could not run torch 1.6 because there was no prebuilt version for CUDA 11, which my GPU requires.) It looks like there is a performance difference.

One epoch takes 10 minutes with torch 1.7 and 33 minutes with torch 2.0.1. I was using the WFLW dataset with batch_size=16. Any idea why that might be? (Possibly _covars.symeig or possibly not.)

log-py3.7-torch1.7.txt
log-py3.10-torch2.0.1.txt
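
One way to make a per-epoch comparison like this repeatable is a small timing wrapper. This is a sketch only: time_epochs, slowdown and run_epoch are hypothetical names, and run_epoch stands in for whatever function runs one training epoch in the project.

```python
import time


def time_epochs(run_epoch, num_epochs=1):
    """Call run_epoch() num_epochs times and return wall-clock seconds for each."""
    durations = []
    for _ in range(num_epochs):
        start = time.perf_counter()
        run_epoch()
        # On a GPU, synchronize here (e.g. torch.cuda.synchronize()) so queued
        # kernels finish before the clock is read.
        durations.append(time.perf_counter() - start)
    return durations


def slowdown(baseline_seconds, candidate_seconds):
    """How many times slower the candidate is than the baseline."""
    return candidate_seconds / baseline_seconds


# Figures reported above: 10 minutes (torch 1.7) vs 33 minutes (torch 2.0.1).
ratio = slowdown(10 * 60, 33 * 60)  # roughly 3.3 times slower per epoch
```

Timing several epochs rather than one helps separate one-off start-up costs from steady-state speed.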

@TimSC
Collaborator Author

TimSC commented Sep 23, 2023

I tried various versions of torch and found that performance is consistent between 1.7 and 1.13. However, there is a performance drop when moving to torch 2.0. This may be because torch 2 models require torch.compile to be fast, but attempting model compilation hits a different error. As far as this PR is concerned, we may as well stick with torch 1.13.

I'm now thinking _covars.symeig is not the cause of the performance problem.

log-py3.7-torch1.7.0-batchsize8-oldeig.txt
log-py3.7-torch1.9.0-batchsize8-neweig.txt
log-py3.7-torch1.9.0-batchsize8-oldeig.txt
log-py3.7-torch1.13.0-batchsize8.txt
log-py3.10-torch1.13.1-batchsize8.txt
log-py3.10-torch2.0.1-batch8.txt

I switched to a batch size of 8 because some versions of torch were running out of GPU memory.

…rformance, compile in torch 2.0 not working
@ZhenglinZhou ZhenglinZhou merged commit 9b12574 into ZhenglinZhou:master Jun 6, 2024
@ZhenglinZhou
Owner

Hi @TimSC, thank you very much! I believe this update may make STAR more convenient to use.

@ZhenglinZhou
Owner

Hi, @TimSC. Thanks again! However, I noticed that you are not listed as a contributor for STAR. It's strange. I hope you can be a collaborator of this project. A collaborator invitation has been sent, please kindly accept it. :)
