CUDA 11/A100 Support #866

Closed
afiaka87 opened this issue Mar 17, 2021 · 4 comments

afiaka87 commented Mar 17, 2021

I saw in another thread that there are plans to target the A100 next. This sounds very useful to me, as I'm trying to use sparse attention in another project and I've had luck getting access to A100s recently.

CUDA 11 is quickly becoming the de facto standard on a lot of cloud servers. I'm aware that I could roll back to 10.2, but I'd love to get the improvements of both.

Anyway, keep me posted on this feature if you can.

Thanks

@afiaka87 afiaka87 changed the title from "A100 Support" to "CUDA 11/A100 Support" Mar 27, 2021
@awan-10 awan-10 self-assigned this Apr 14, 2021
Contributor

awan-10 commented Apr 14, 2021

Hi @afiaka87, can you please elaborate on the problem a bit more?

Did you try to build from source on a CUDA 11 machine and it failed?

We do support CUDA 11 and the A100.
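
A quick way to collect the details this question asks for is to compare the CUDA version PyTorch was built against with the GPU it sees. The sketch below is only an illustration and not DeepSpeed's own check: the helper names `parse_major_minor` and `cuda_supports_a100` are hypothetical, and the check rests on the assumption that the A100 (compute capability 8.0) needs a CUDA 11.0 or newer toolkit.

```python
def parse_major_minor(version):
    """Turn a version string such as '11.1' into a (major, minor) tuple."""
    parts = version.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return major, minor


def cuda_supports_a100(cuda_version):
    """Hypothetical helper: assume the A100 (sm_80) needs CUDA 11.0 or newer."""
    return parse_major_minor(cuda_version) >= (11, 0)


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed")
    else:
        built_with = torch.version.cuda  # None on CPU-only builds
        print("torch built with CUDA:", built_with)
        if built_with and torch.cuda.is_available():
            print("device:", torch.cuda.get_device_name(0))
            print("compute capability:", torch.cuda.get_device_capability(0))
            print("toolkit new enough for A100:", cuda_supports_a100(built_with))
```

DeepSpeed also ships a `ds_report` command that prints a similar environment summary, which is useful to paste alongside any build error.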

Author

afiaka87 commented Apr 22, 2021

We're having a lot of trouble getting sparse attention specifically to work. The issue remains even after the ZeRO-Infinity update. I'll follow up tomorrow with an error message.

@awan-10


helena-balabin commented May 26, 2021

Do you have any advice on how to configure DeepSpeed for CUDA 11 and an A100? I'm currently trying to set it up, but I keep running into incompatibility issues.

Collaborator

loadams commented Aug 18, 2023

Hi @afiaka87 and @helena-balabin - I'm closing this issue as stale given its age. CUDA/Torch on A100s should be better supported now, both there and in DeepSpeed. If you are still having problems, please open a new issue and link this one, and I'd be happy to take a look.

Thanks!

@loadams loadams closed this as completed Aug 18, 2023