gpu-monitor/rest_api_server

What is it?

Every second, it tries to parse the following for each GPU:
- memory consumption
- utilization
- the name and memory usage of each GPU-related process
  - if the GPU process runs inside a Docker container, it also parses that container's name
It serves the results through a simple REST API server
- GET http://:3032/gpu_stat
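
As a rough illustration of the per-GPU parsing described above, the sketch below reads memory and utilization from nvidia-smi's CSV query mode. It is not taken from this repo's source; the function names (`parse_gpu_csv`, `query_gpus`) and the dictionary keys are hypothetical, and the choice of `--query-gpu` fields is an assumption about what the server collects.

```python
import subprocess

# Assumed selection of nvidia-smi query fields (memory values are in MiB, utilization in %).
GPU_QUERY = "index,memory.used,memory.total,utilization.gpu"


def _to_int(field):
    """Convert a CSV field to int; return None for values such as '[N/A]'."""
    field = field.strip()
    return int(field) if field.isdigit() else None


def parse_gpu_csv(text):
    """Turn nvidia-smi CSV rows (noheader, nounits) into a list of dicts."""
    gpus = []
    for line in text.strip().splitlines():
        index, mem_used, mem_total, util = line.split(",")
        gpus.append({
            "index": _to_int(index),
            "memory_used_mib": _to_int(mem_used),
            "memory_total_mib": _to_int(mem_total),
            "utilization_pct": _to_int(util),
        })
    return gpus


def query_gpus():
    """Run nvidia-smi once and parse its output (requires an Nvidia driver)."""
    out = subprocess.check_output(
        ["nvidia-smi", f"--query-gpu={GPU_QUERY}", "--format=csv,noheader,nounits"],
        text=True,
    )
    return parse_gpu_csv(out)
```

Keeping the parser separate from the subprocess call makes it possible to check the parsing logic on a machine without a GPU.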

Dependencies

Flask
Works only on Ubuntu with Nvidia graphics cards
- It uses the nvidia-smi command to parse GPU information
Docker, Docker-compose, dind (Docker in Docker)
Uses Python's subprocess to call the commands below
- nvidia-smi
- docker inspect

Important Note

Run nvidia-smi in a terminal and check that it returns quickly enough.
If it does not, turn persistence mode on.

$ sudo nvidia-smi --persistence-mode=1
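
One way to check whether nvidia-smi is fast enough is to time a single call. The helper below is a minimal sketch; `time_command` is a hypothetical name, and comparing the result against the 1-second polling interval is an assumption about what "fast enough" means here.

```python
import subprocess
import time


def time_command(cmd):
    """Return the wall-clock seconds a command takes to finish."""
    start = time.perf_counter()
    # Output is discarded; only the elapsed time matters here.
    subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL, check=False)
    return time.perf_counter() - start
```

For example, `time_command(["nvidia-smi"])` on the host gives the duration of one call; a value close to or above one second suggests that persistence mode may help.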

This repo uses the following commands; use at your own risk!!

Inside a container, it cannot see processes that were not created inside that same container
- which leads to using
- docker run --pid=host ... ... ...
To run docker commands (docker inspect) inside a Docker container, docker.io must be installed in the container
- which leads to using
- docker run -v /var/run/docker.sock:/var/run/docker.sock ... ... ...
- Ref
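
The docker inspect step mentioned above could look roughly like the sketch below. The helper names are hypothetical and not taken from this repo; the sketch assumes the docker CLI is installed in the container and that the host's Docker socket is mounted as shown.

```python
import subprocess


def inspect_command(container_id):
    """Build the docker CLI call that prints only the container's name."""
    return ["docker", "inspect", "--format", "{{.Name}}", container_id]


def clean_container_name(raw):
    """docker inspect reports the name with a leading slash, e.g. '/web'."""
    return raw.strip().lstrip("/")


def container_name(container_id):
    """Look up a container's name (needs the docker CLI and the Docker socket)."""
    out = subprocess.check_output(inspect_command(container_id), text=True)
    return clean_container_name(out)
```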
To get the Docker container ID from a process ID (PID)
- the host's /proc must be mapped into the container and parsed with
- cat /proc/<pid>/cgroup
- which leads to using
- docker run -v /proc:/host_proc ... ... ...
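
The cgroup lookup above can be sketched as follows. This is an illustration under stated assumptions, not the repo's actual code: the function names are hypothetical, the host's /proc is assumed to be mounted at /host_proc, and the container ID is assumed to appear as a 64-character hex string on a cgroup line that mentions "docker" (the exact line layout differs between cgroup v1 and v2 hosts).

```python
import re

# Full Docker container IDs are 64 lowercase hexadecimal characters.
_CONTAINER_ID = re.compile(r"[0-9a-f]{64}")


def container_id_from_cgroup(cgroup_text):
    """Return the first Docker container ID found in cgroup text, or None."""
    for line in cgroup_text.splitlines():
        if "docker" not in line:
            continue
        match = _CONTAINER_ID.search(line)
        if match:
            return match.group(0)
    return None


def container_id_for_pid(pid, proc_root="/host_proc"):
    """Read <proc_root>/<pid>/cgroup and extract the container ID, if any."""
    with open(f"{proc_root}/{pid}/cgroup") as fh:
        return container_id_from_cgroup(fh.read())
```

A process that does not belong to a Docker container yields `None`, so the caller can report it as a plain host process.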

How to use

# get the source
:~$ git clone https://github.com/moono/gpu-monitor.git
:~$ cd gpu-monitor/rest_api_server

# run docker-compose
:~/gpu-monitor/rest_api_server$ docker-compose up -d

Test

:~$ curl -X GET http://127.0.0.1:3032/gpu_stat
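
The same request can be made from Python using only the standard library. This is a minimal sketch: `fetch_gpu_stat` is a hypothetical helper, and it assumes the endpoint returns a JSON body, which this README does not state.

```python
import json
import urllib.request


def fetch_gpu_stat(url="http://127.0.0.1:3032/gpu_stat", opener=urllib.request.urlopen):
    """GET the endpoint and decode the body as JSON (assumed response format)."""
    # The opener parameter exists only so the call can be replaced in tests.
    with opener(url, timeout=5) as resp:
        return json.loads(resp.read().decode("utf-8"))
```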
