Create Dockerfiles for Accelerate #377
Conversation
The documentation is not available anymore as the PR was closed or merged.
I know nothing about Docker, so asking Lysandre for a review :-)
Great to have Docker support! Helpful for production scenarios. A few comments:
These will also be used for GPU and multi-GPU tests; they mimic the transformers Docker images for this exact purpose. (This is actually the true purpose behind this PR.)
Yes, they are; it's a partial limitation of the GPU image. I could go through the effort of having it use conda instead, but Sylvain and I discussed that Python 3.6 support will be dropped in the next release. Since that's so soon, it's easier to just leave it this way.
Nope, these assume the git repo is the top-level directory and then copy it. So building the Dockerfiles looks something like the following (assuming a fresh accelerate clone, with the clone as the working directory):
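For illustration, a rough sketch of what that build flow might look like from a fresh clone (the image tags here are assumptions, not fixed names):

```shell
git clone https://github.com/huggingface/accelerate.git
cd accelerate
# The build context is the repo root; the Dockerfile copies the checkout into the image.
docker build . -t accelerate-cpu -f docker/accelerate-cpu/Dockerfile
docker build . -t accelerate-gpu -f docker/accelerate-gpu/Dockerfile
```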
Unsure about this one. Do you have Docker's CUDA support properly set up? IIRC you may also need to find the right keys to use. I was able to build it just fine yesterday.
Thinking on it more, having the Python version be configurable in the CUDA image would be nice; will change this. It would also probably be better to do something similar to this Dockerfile and clone the repo instead: https://github.com/huggingface/transformers/blob/main/docker/transformers-pytorch-gpu/Dockerfile#L11
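As a minimal sketch of the configurable-version idea (the argument name and default here are assumptions, not the actual Dockerfile contents):

```dockerfile
# Hypothetical: accept the Python version as a build argument,
# so the same Dockerfile can build environments for 3.7, 3.8, etc.
ARG PYTHON_VERSION=3.8
RUN conda create -y -n accelerate python=${PYTHON_VERSION}
```

It could then be overridden at build time with `docker build --build-arg PYTHON_VERSION=3.9 ...`.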
Looks good to me!
This PR adds two Dockerfiles, one for building on the CPU and one for the GPU. I chose not to write one for DeepSpeed for now.
Eventually these will be integrated into test runners and built nightly, similar to how transformers is set up.
This uses a multi-stage build to reduce the size of the images a bit.
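A minimal sketch of the multi-stage pattern being described (the base image, stage names, and paths are illustrative, not the actual Dockerfiles):

```dockerfile
# Stage 1: build the environment while compilers and pip caches are present.
FROM python:3.8-slim AS builder
RUN python -m venv /opt/venv && /opt/venv/bin/pip install --no-cache-dir accelerate

# Stage 2: copy only the finished environment, leaving build debris behind.
FROM python:3.8-slim
COPY --from=builder /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
```

Only the final stage ships, so intermediate layers from the builder stage do not count toward the image size.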
Uncompressed sizes of each Docker image:
CPU: ~871 MB
GPU: ~13 GB
For perspective, the uncompressed size of the transformers Docker image for torch is 11.2 GB.
The biggest difference is the inclusion of conda, which makes it easier for us to switch between Python versions when needed.
It's also recommended to use BuildKit to build the images, as it reduces build time by a healthy chunk. E.g.:
sudo DOCKER_BUILDKIT=1 docker build . -t accelerate-cpu -f docker/accelerate-cpu/Dockerfile
(Note: times were measured without the build cache)
GPU image:
5m56s -> 4m16s
CPU Image:
3m20s -> 2m44s