The AMD Container Toolkit offers tools to streamline the use of AMD GPUs with containers. The toolkit includes the following packages:

- `amd-container-runtime` - The AMD Container Runtime
- `amd-ctk` - The AMD Container Toolkit CLI
- Ubuntu 22.04 or 24.04, or RHEL/CentOS 9
- Docker version 25 or later
- All `amd-ctk runtime configure` commands must be run as root or with `sudo`
Note: Docker Desktop on Linux is not supported for GPU workloads; see the troubleshooting documentation for details.
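Before installing, you can verify the prerequisites above with a quick preflight check (a minimal sketch; the exact output varies by system):

```bash
# Docker must report version 25 or later
docker --version

# Confirm the host distribution (Ubuntu 22.04/24.04 or RHEL/CentOS 9)
grep PRETTY_NAME /etc/os-release
```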
Install the Container Toolkit.
To install the AMD Container Toolkit on Ubuntu systems, follow these steps:
1. Ensure prerequisites are installed:

   ```bash
   apt update && apt install -y wget gnupg2
   ```

2. Add the GPG key for the repository:

   ```bash
   wget https://repo.radeon.com/rocm/rocm.gpg.key -O - | gpg --dearmor | tee /etc/apt/keyrings/rocm.gpg > /dev/null
   ```

3. Add the repository to the system. Replace `noble` with `jammy` when using Ubuntu 22.04:

   ```bash
   echo "deb [signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/amd-container-toolkit/apt/ noble main" > /etc/apt/sources.list.d/amd-container-toolkit.list
   ```

4. Update the package list and install the toolkit:

   ```bash
   apt update && apt install amd-container-toolkit
   ```
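After installation, a quick sanity check (a sketch; the `amd-ctk` binary and its `runtime`/`cdi` subcommands are documented below, but the exact help text is an assumption):

```bash
# The CLI should now be on the PATH
which amd-ctk

# Print top-level usage (assumes the conventional --help flag)
amd-ctk --help
```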
To install the AMD Container Toolkit on RHEL/CentOS 9 systems, follow these steps:
1. Add the repository configuration:

   ```bash
   tee --append /etc/yum.repos.d/amd-container-toolkit.repo <<EOF
   [amd-container-toolkit]
   name=amd-container-toolkit
   baseurl=https://repo.radeon.com/amd-container-toolkit/el9/main/
   enabled=1
   priority=50
   gpgcheck=1
   gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
   EOF
   ```

2. Clean the package cache and install the toolkit:

   ```bash
   dnf clean all
   dnf install -y amd-container-toolkit
   ```
1. Configure the AMD container runtime for Docker. The following command modifies the Docker configuration file, `/etc/docker/daemon.json`, so that Docker can use the AMD container runtime:

   ```bash
   > sudo amd-ctk runtime configure
   ```

2. Restart the Docker daemon:

   ```bash
   > sudo systemctl restart docker
   ```
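After this step, `/etc/docker/daemon.json` contains an `amd` runtime entry along the lines of the following (a sketch based on the Swarm example later in this document; your file may contain additional settings):

```json
{
  "runtimes": {
    "amd": {
      "path": "amd-container-runtime",
      "runtimeArgs": []
    }
  }
}
```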
- Configure Docker to use the AMD container runtime:

  ```bash
  > amd-ctk runtime configure --runtime=docker
  ```
- Specify the required GPUs. There are three ways to do this (a quick way to verify the resulting device mapping is shown after this list):

  1. Using the `AMD_VISIBLE_DEVICES` environment variable:

     - To use all available GPUs:

       ```bash
       > docker run --rm --runtime=amd -e AMD_VISIBLE_DEVICES=all rocm/rocm-terminal rocm-smi
       ```

     - To use a subset of available GPUs:

       ```bash
       > docker run --rm --runtime=amd -e AMD_VISIBLE_DEVICES=0,1,2 rocm/rocm-terminal rocm-smi
       ```

     - To use ranges of contiguously numbered GPUs (the range operator):

       ```bash
       > docker run --rm --runtime=amd -e AMD_VISIBLE_DEVICES=0-3,5,8 rocm/rocm-terminal rocm-smi
       ```

  2. Using CDI style:

     - First, generate the CDI spec:

       ```bash
       > amd-ctk cdi generate --output=/etc/cdi/amd.json
       ```

     - Validate the generated CDI spec:

       ```bash
       > amd-ctk cdi validate --path=/etc/cdi/amd.json
       ```

     - To use all available GPUs:

       ```bash
       > docker run --rm --device amd.com/gpu=all rocm/rocm-terminal rocm-smi
       ```

     - To use a subset of available GPUs:

       ```bash
       > docker run --rm --device amd.com/gpu=0 --device amd.com/gpu=1 rocm/rocm-terminal rocm-smi
       ```

     - Note that once the CDI spec, `/etc/cdi/amd.json`, is available, `--runtime=amd` is not required in the `docker run` command.

  3. Using explicit device paths. Note that `--runtime=amd` is not required here:

     ```bash
     > docker run --device /dev/kfd --device /dev/dri/renderD128 --device /dev/dri/renderD129 rocm/rocm-terminal rocm-smi
     ```
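To confirm which device nodes a selection maps into the container, you can list `/dev/dri` inside it (a sketch; the `card*`/`renderD*` numbering varies by system):

```bash
# Each selected GPU appears as a card*/renderD* pair under /dev/dri
> docker run --rm --runtime=amd -e AMD_VISIBLE_DEVICES=0 rocm/rocm-terminal ls -l /dev/dri
```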
- List available GPUs. If this command is run as root, the container-toolkit logs go to `/var/log/amd-container-runtime.log`; otherwise, they go to the user's home directory.

  ```bash
  > amd-ctk cdi list
  Found 1 AMD GPU device
  amd.com/gpu=all
  amd.com/gpu=0
          /dev/dri/card1
          /dev/dri/renderD128
  ```
- Make the AMD container runtime the default runtime. Avoid specifying the `--runtime=amd` option with the `docker run` command by setting the AMD container runtime as the default for Docker:

  ```bash
  > amd-ctk runtime configure --runtime=docker --set-as-default
  ```
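With the default set, the earlier examples work without the `--runtime=amd` flag (a sketch reusing the image and command shown above):

```bash
# No --runtime=amd needed once amd is the default runtime
> docker run --rm -e AMD_VISIBLE_DEVICES=all rocm/rocm-terminal rocm-smi
```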
- Remove the AMD container runtime as the default runtime:

  ```bash
  > amd-ctk runtime configure --runtime=docker --unset-as-default
  ```
- Remove the AMD container runtime configuration in Docker (undo the earlier configuration):

  ```bash
  > amd-ctk runtime configure --runtime=docker --remove
  ```
The following command lists the GPUs available on the system and their enumeration. The GPUs are listed in the CDI format, but the same enumeration applies when using the OCI environment variable, `AMD_VISIBLE_DEVICES`.

```bash
> amd-ctk cdi list
Found 1 AMD GPU device
amd.com/gpu=all
amd.com/gpu=0
        /dev/dri/card1
        /dev/dri/renderD128
```
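For example, the device listed as `amd.com/gpu=0` above is addressed as index `0` in the environment-variable style (reusing the run command shown earlier):

```bash
# CDI entry amd.com/gpu=0 corresponds to AMD_VISIBLE_DEVICES=0
> docker run --rm --runtime=amd -e AMD_VISIBLE_DEVICES=0 rocm/rocm-terminal rocm-smi
```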
The AMD Container Toolkit supports GPU selection using unique identifiers (UUIDs) in addition to device indices. This enables more precise and reliable GPU targeting, especially in multi-GPU systems and orchestrated environments.
GPU UUIDs can be obtained using different tools:
```bash
rocm-smi --showuniqueid
```

This will display output similar to:

```
GPU[0] : Unique ID: 0xef2c1799a1f3e2ed
GPU[1] : Unique ID: 0x1234567890abcdef
```
The amd-smi tool can also be used to get the ASIC_SERIAL, which serves as the GPU UUID:
```bash
amd-smi static -aB
```

This will display output similar to:

```
GPU: 0
    ASIC:
        MARKET_NAME: AMD Instinct MI210
        VENDOR_ID: 0x1002
        VENDOR_NAME: Advanced Micro Devices Inc. [AMD/ATI]
        SUBVENDOR_ID: 0x1002
        DEVICE_ID: 0x740f
        SUBSYSTEM_ID: 0x0c34
        REV_ID: 0x02
        ASIC_SERIAL: 0xD1CC3F11CFDD5112
        OAM_ID: N/A
        NUM_COMPUTE_UNITS: 104
        TARGET_GRAPHICS_VERSION: gfx90a
    BOARD:
        MODEL_NUMBER: 102-D67302-00
        PRODUCT_SERIAL: 692231000131
        FRU_ID: 113-HPED67302000B.009
        PRODUCT_NAME: Instinct MI210
        MANUFACTURER_NAME: AMD
```
Use the ASIC_SERIAL value (e.g., 0xD1CC3F11CFDD5112) as the GPU UUID in container configurations.
Both AMD_VISIBLE_DEVICES and DOCKER_RESOURCE_* environment variables support UUID specification:
```bash
# Use specific GPUs by UUID
docker run --rm --runtime=amd \
  -e AMD_VISIBLE_DEVICES=0xef2c1799a1f3e2ed,0x1234567890abcdef \
  rocm/dev-ubuntu-24.04 rocm-smi

# Mix device indices and UUIDs
docker run --rm --runtime=amd \
  -e AMD_VISIBLE_DEVICES=0,0xef2c1799a1f3e2ed \
  rocm/dev-ubuntu-24.04 rocm-smi

# Docker Swarm generic resource format
docker run --rm --runtime=amd \
  -e DOCKER_RESOURCE_GPU=0xef2c1799a1f3e2ed \
  rocm/dev-ubuntu-24.04 rocm-smi
```

GPU UUID support significantly improves Docker Swarm deployments by enabling precise GPU allocation across cluster nodes.
Configure each swarm node's Docker daemon with GPU resources in `/etc/docker/daemon.json`:

```json
{
  "default-runtime": "amd",
  "runtimes": {
    "amd": {
      "path": "amd-container-runtime",
      "runtimeArgs": []
    }
  },
  "node-generic-resources": [
    "AMD_GPU=0x378041e1ada6015",
    "AMD_GPU=0xef39dad16afb86ad",
    "GPU_COMPUTE=0x583de6f2d99dc333"
  ]
}
```

After updating the configuration, restart the Docker daemon:

```bash
sudo systemctl restart docker
```

Deploy services with specific GPU requirements using docker-compose:
Using generic resources:

```yaml
# docker-compose.yml for Swarm deployment
version: '3.8'
services:
  rocm-service:
    image: rocm/dev-ubuntu-24.04
    command: rocm-smi
    deploy:
      replicas: 1
      resources:
        reservations:
          generic_resources:
            - discrete_resource_spec:
                kind: 'AMD_GPU'   # Matches daemon.json key
                value: 1
```

Using environment variables:
```yaml
# docker-compose.yml for Swarm deployment with environment variable
version: '3.8'
services:
  rocm-service:
    image: rocm/dev-ubuntu-24.04
    command: rocm-smi
    environment:
      - AMD_VISIBLE_DEVICES=all
    deploy:
      replicas: 1
```

Deploy the service:

```bash
docker stack deploy -c docker-compose.yml rocm-stack
```

GPU Tracker is an optional, lightweight feature that tracks which containers use which GPUs and lets users set GPUs to shared or exclusive access. It is disabled by default; users can run `amd-ctk gpu-tracker enable` to turn it on, and `status` / `reset` to query or clear state. It applies only when containers are started with `docker run` and `AMD_VISIBLE_DEVICES`.
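A minimal command walkthrough (a sketch; the `enable`, `status`, and `reset` subcommands are named above, but their exact output is not shown here):

```bash
# Turn the tracker on (disabled by default)
> amd-ctk gpu-tracker enable

# Query which containers currently hold which GPUs
> amd-ctk gpu-tracker status

# Clear all tracked state
> amd-ctk gpu-tracker reset
```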
For full usage, examples, and limitations, see the GPU Tracker documentation in this repo, or the official documentation.
| Release | Features | Known Issues |
|---|---|---|
| v1.2.0 | 1. GPU Tracker feature support. 2. Docker Swarm support. | None |
| v1.1.0 | 1. GPU partitioning support. 2. Full RPM package support. 3. Support for the range operator in the input string to the `AMD_VISIBLE_DEVICES` ENV variable. | None |
| v1.0.0 | Initial release | 1. Partitioned GPUs are not supported. 2. RPM builds are experimental. |
To build the Debian package, use the following commands:

```bash
make
make pkg-deb
```

To build the RPM package, use the following commands:

```bash
make build-dev-container-rpm
make pkg-rpm
```

The packages will be generated in the `bin` folder.
For detailed documentation including installation guides and configuration options, see the documentation.
This project is licensed under the Apache 2.0 License - see the LICENSE file for details.