
Enabling GPU access with Compose


Compose services can define GPU device reservations if the Docker host contains such devices and the Docker daemon is configured accordingly. Make sure to install the prerequisites if you have not already done so.

The examples in the following sections focus specifically on providing service containers with access to GPU devices using Docker Compose. You can use either the docker-compose or docker compose command.

Use of service runtime property from Compose v2.3 format (legacy)

Docker Compose v1.27.0+ switched to using the Compose Specification schema, which is a combination of all properties from the 2.x and 3.x formats. This re-enabled the use of service-level properties such as runtime to provide GPU access to service containers. However, this approach does not give you control over specific properties of the GPU devices.

services:
  test:
    image: nvidia/cuda:10.2-base
    command: nvidia-smi
    runtime: nvidia
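For context, the runtime property originates in the legacy 2.3 file format. A minimal sketch of the same service written as a full legacy file (the version key is an assumption based on that format):

```yaml
# Hypothetical legacy Compose file using the 2.3 format,
# where the service-level runtime property was introduced.
version: '2.3'
services:
  test:
    image: nvidia/cuda:10.2-base
    command: nvidia-smi
    runtime: nvidia   # delegates GPU wiring to the nvidia runtime
```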

Enabling GPU access to service containers

Docker Compose v1.28.0+ lets you define GPU reservations using the device structure defined in the Compose Specification. This provides more granular control over a GPU reservation, as custom values can be set for device properties such as driver, count, device_ids, and capabilities:

Note

You must set the capabilities field. Otherwise, service deployment returns an error.

count and device_ids are mutually exclusive. You must only define one field at a time.

For more information on these properties, see the deploy section in the Compose Specification.
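For instance, a reservation that sets both mutually exclusive fields is invalid; a sketch of what Compose rejects (the exact error wording depends on your Compose version):

```yaml
# Invalid sketch: count and device_ids are mutually exclusive,
# so a reservation that sets both is rejected at deployment.
devices:
  - driver: nvidia
    count: 2
    device_ids: ['0', '1']   # conflicts with count above
    capabilities: [gpu]
```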

Example of a Compose file for running a service with access to 1 GPU device:

services:
  test:
    image: nvidia/cuda:10.2-base
    command: nvidia-smi
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

Run with Docker Compose:

$ docker-compose up
Creating network "gpu_default" with the default driver
Creating gpu_test_1 ... done
Attaching to gpu_test_1
test_1  | +-----------------------------------------------------------------------------+
test_1  | | NVIDIA-SMI 450.80.02    Driver Version: 450.80.02    CUDA Version: 11.1     |
test_1  | |-------------------------------+----------------------+----------------------+
test_1  | | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
test_1  | | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
test_1  | |                               |                      |               MIG M. |
test_1  | |===============================+======================+======================|
test_1  | |   0  Tesla T4            On   | 00000000:00:1E.0 Off |                    0 |
test_1  | | N/A   23C    P8     9W /  70W |      0MiB / 15109MiB |      0%      Default |
test_1  | |                               |                      |                  N/A |
test_1  | +-------------------------------+----------------------+----------------------+
test_1  |
test_1  | +-----------------------------------------------------------------------------+
test_1  | | Processes:                                                                  |
test_1  | |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
test_1  | |        ID   ID                                                   Usage      |
test_1  | |=============================================================================|
test_1  | |  No running processes found                                                 |
test_1  | +-----------------------------------------------------------------------------+
gpu_test_1 exited with code 0

If neither count nor device_ids is set, all GPUs available on the host are used by default.

services:
  test:
    image: tensorflow/tensorflow:latest-gpu
    command: python -c "import tensorflow as tf;tf.test.gpu_device_name()"
    deploy:
      resources:
        reservations:
          devices:
            - capabilities: [gpu]
$ docker-compose up
Creating network "gpu_default" with the default driver
Creating gpu_test_1 ... done
Attaching to gpu_test_1
test_1  | I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
.....
test_1  | I tensorflow/core/common_runtime/gpu/gpu_device.cc:1402]
Created TensorFlow device (/device:GPU:0 with 13970 MB memory) -> physical GPU (device: 0, name: Tesla T4, pci bus id: 0000:00:1e.0, compute capability: 7.5)
test_1  | /device:GPU:0
gpu_test_1 exited with code 0

On machines hosting multiple GPUs, the device_ids field can be set to target specific GPU devices, and count can be used to limit the number of GPU devices assigned to a service container. If count exceeds the number of GPUs available on the host, the deployment errors out.

$ nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.80.02    Driver Version: 450.80.02    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            On   | 00000000:00:1B.0 Off |                    0 |
| N/A   72C    P8    12W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  Tesla T4            On   | 00000000:00:1C.0 Off |                    0 |
| N/A   67C    P8    11W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   2  Tesla T4            On   | 00000000:00:1D.0 Off |                    0 |
| N/A   74C    P8    12W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   3  Tesla T4            On   | 00000000:00:1E.0 Off |                    0 |
| N/A   62C    P8    11W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

To enable access only to GPU-0 and GPU-3 devices:

services:
  test:
    image: tensorflow/tensorflow:latest-gpu
    command: python -c "import tensorflow as tf;tf.test.gpu_device_name()"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ['0', '3']
              capabilities: [gpu]
$ docker-compose up
...
Created TensorFlow device (/device:GPU:0 with 13970 MB memory) -> physical GPU (device: 0, name: Tesla T4, pci bus id: 0000:00:1b.0, compute capability: 7.5)
...
Created TensorFlow device (/device:GPU:1 with 13970 MB memory) -> physical GPU (device: 1, name: Tesla T4, pci bus id: 0000:00:1e.0, compute capability: 7.5)
...
gpu_test_1 exited with code 0
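Alternatively, count can limit the service to a subset of the GPUs shown above without naming specific devices; a minimal sketch based on the preceding examples:

```yaml
# Sketch: reserve any two of the host's four GPUs for the service.
services:
  test:
    image: tensorflow/tensorflow:latest-gpu
    command: python -c "import tensorflow as tf;tf.test.gpu_device_name()"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 2              # any two devices; use device_ids to pick specific ones
              capabilities: [gpu]
```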



© 2019 Docker, Inc.
Licensed under the Apache License, Version 2.0.
Docker and the Docker logo are trademarks or registered trademarks of Docker, Inc. in the United States and/or other countries. Docker, Inc. and other parties may also have trademark rights in other terms used herein.
https://docs.docker.com/compose/gpu-support/
