docker_container: DeviceRequests.Capabilities are improperly validated #42
Comments
Thanks for reporting! I'll take a look at this later today or tomorrow. I would be glad if you could test a PR. (I don't have the means to properly test the option.)
Sure, I can test the PR. Let me know when to do so.
@skokec please test it as soon as you can :)
It works OK now. I've tested your branch fix-docker_container-device_requests with the following:

- hosts: <hostname>
  become: yes
  tasks:
    - name: Start container with GPUs
      community.docker.docker_container:
        name: test
        image: nvidia/cuda:10.1-runtime-ubuntu18.04
        state: started
        command: 'nvidia-smi -L'
        detach: false
        device_requests:
          - # Add nVidia GPUs to this container
            driver: nvidia
            device_ids:
              - '0'
              - '1'
            capabilities:
              - ['gpu','nvidia']
      register: docker_container_output
    - name: Show test output
      debug:
        msg: "{{ docker_container_output.container.Output }}"

which correctly allocates the first two GPUs.
@skokec thanks! :)
I merged the PR and plan to release a new version of this collection later this week.
FYI, community.docker 1.0.1 has been released.
SUMMARY
When using device_requests with the capabilities option for an NVIDIA GPU in docker_container, Ansible returns an error indicating that capabilities are not formatted properly. This happens for the example provided in the documentation.

The issue seems to be around line 1441 in docker_container.py, where the capabilities list is validated. The validation expects capabilities to be a 'list of list of list of string' instead of a 'list of list of string'. If I provide capabilities: [[[gpu]]], validation passes, but then I get a Docker marshal error indicating that this is not what the Docker API expects.

A simple patch fixes this by removing line 1441 in docker_container.py and updating the variable names. With this patch I do not get any issues, and the container is correctly created with the requested devices.
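To illustrate the shape that the summary says should be accepted, here is a minimal Python sketch of a 'list of list of string' check. The function name validate_capabilities and its error messages are hypothetical and are not taken from docker_container.py; the sketch only shows the nesting depth the validation would need to accept, under the assumption that each inner list is one group of capability strings.

```python
def validate_capabilities(capabilities):
    """Check that capabilities is a list of lists of strings.

    Hypothetical helper for illustration only; this is not the code
    from docker_container.py.
    """
    if not isinstance(capabilities, list):
        raise ValueError("capabilities must be a list")
    for index, group in enumerate(capabilities):
        # Each entry is one group of capability strings, e.g. ['gpu', 'nvidia'].
        if not isinstance(group, list):
            raise ValueError("capabilities[%d] must be a list of strings" % index)
        for item in group:
            # Stop at this depth: a nested list here means one level too many.
            if not isinstance(item, str):
                raise ValueError(
                    "capabilities[%d] must contain only strings, got %r" % (index, item)
                )
    return capabilities
```

With a check like this, [['gpu', 'nvidia']] is accepted, while the triple-nested [[['gpu']]] is rejected before it could reach the Docker API.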
ISSUE TYPE
COMPONENT NAME
docker_container
ANSIBLE VERSION
CONFIGURATION
OS / ENVIRONMENT
Linux ubuntu 4.15.0-118-generic #119-Ubuntu SMP Tue Sep 8 12:30:01 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
STEPS TO REPRODUCE
EXPECTED RESULTS
Deploying container only on specific devices without errors
ACTUAL RESULTS
The following error: