Change .ravel() back to .flatten() for gates #79

Merged 1 commit on Mar 30, 2022

stavros11
Member

The latest version of qibojit fails when running multi-GPU circuits on a machine with multiple physical GPUs. For example, the following minimal script:

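from qibo import models, gates

# Distribute a 3-qubit circuit across two physical GPUs
circuit = models.Circuit(3, accelerators={"/GPU:0": 1, "/GPU:1": 1})
circuit.add(gates.H(0))

# Execute the circuit; this call raises the error below on qibojit main
final_state = circuit()
print(final_state)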

fails with

cupy_backends.cuda.api.runtime.CUDARuntimeError: cudaErrorIllegalAddress: an illegal memory access was encountered

After checking earlier qibojit versions, I traced the error to #74, specifically to the change from .flatten() to .ravel() in some gate castings. This PR reverts that change, after which the above example works and all tests pass.
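
For context, here is a minimal sketch of how the two calls differ. It uses NumPy as a stand-in (CuPy follows the same copy/view semantics for these methods), and the helper names `cast_gate_copy` and `cast_gate_view` are hypothetical, not qibojit functions. The PR does not spell out why the view-returning call triggers the illegal memory access on multiple GPUs, so this only illustrates the behavioral difference between the two methods.

```python
import numpy as np

def cast_gate_copy(matrix):
    """Hypothetical helper: flatten() always returns a new 1D copy."""
    return np.asarray(matrix).flatten()

def cast_gate_view(matrix):
    """Hypothetical helper: ravel() returns a view whenever no copy is needed."""
    return np.asarray(matrix).ravel()

# Hadamard matrix as an example gate (C-contiguous, so ravel() can alias it)
h = np.array([[1, 1], [1, -1]], dtype=np.complex128) / np.sqrt(2)

flat_copy = cast_gate_copy(h)
flat_view = cast_gate_view(h)

# The copy owns its own buffer; the view shares the buffer of the original
copy_shares = np.shares_memory(h, flat_copy)
view_shares = np.shares_memory(h, flat_view)
print(copy_shares, view_shares)
```

With `.flatten()` the cast gate never aliases the buffer of the array it was built from, whereas `.ravel()` may return an array that still points at the original memory.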

@scarrazza @andrea-pasquale, if you have access to a machine with multiple physical GPUs, it would be good if you could check whether the latest qibojit main gives you the above error and confirm that this PR fixes it. I found this while running the qibo tests on DGX. Note that the machine needs at least two GPUs: if there is only one (or the second GPU is hidden with CUDA_VISIBLE_DEVICES), the error will not appear.
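
One way to compare the single-GPU and two-GPU cases on the same machine is to run the reproduction in a subprocess with `CUDA_VISIBLE_DEVICES` set explicitly. The sketch below only builds the environment; the helper name `env_with_visible_gpus` and the script name `repro.py` are assumptions for illustration, not part of qibo or qibojit.

```python
import os

def env_with_visible_gpus(device_ids, base_env=None):
    """Hypothetical helper: copy an environment and restrict the visible CUDA devices."""
    env = dict(os.environ if base_env is None else base_env)
    # CUDA reads this variable at initialization, so it must be set before
    # the process that imports the GPU library starts
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in device_ids)
    return env

# Empty base environments are used here only to keep the printed output short
single_gpu_env = env_with_visible_gpus([0], base_env={})
dual_gpu_env = env_with_visible_gpus([0, 1], base_env={})
print(single_gpu_env, dual_gpu_env)
```

In practice the result of `env_with_visible_gpus([0, 1])`, built from the real environment, could be passed as `env=` to `subprocess.run([sys.executable, "repro.py"], ...)`, where `repro.py` is assumed to contain the minimal example above.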

stavros11 added the bug ("Something isn't working") label on Mar 28, 2022

codecov bot commented Mar 28, 2022

Codecov Report

Merging #79 (f7afbec) into main (d3ff74e) will not change coverage.
The diff coverage is n/a.

@@            Coverage Diff            @@
##              main       #79   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files            9         9           
  Lines         1088      1088           
=========================================
  Hits          1088      1088           
| Flag | Coverage Δ |
|---|---|
| unittests | 100.00% <ø> (ø) |

| Impacted Files | Coverage Δ |
|---|---|
| src/qibojit/custom_operators/platforms.py | 100.00% <ø> (ø) |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update d3ff74e...f7afbec.

Contributor

@andrea-pasquale andrea-pasquale left a comment

Thanks for spotting this @stavros11.
I can confirm that main fails on dom when running with 2 GPUs.
This PR fixes the issue.

scarrazza merged commit e4b87ef into main on Mar 30, 2022
scarrazza deleted the fixmultigpu branch on August 17, 2022 07:24