Installing and testing Docker and Nvidia-docker on Ubuntu 22.04, and running GUI applications

Quickly set up the required development environment

Docker documentation: https://docs.docker.com/. Docker installation guide: Install Docker Engine on Ubuntu

Docker installation

Uninstall old versions
 
~$ sudo apt-get remove docker docker-engine docker.io containerd runc

~$ sudo apt-get install curl
 
Install using the convenience script

~$ curl https://get.docker.com | sh \
  && sudo systemctl --now enable docker
 
Verify that Docker Engine is installed correctly by running the hello-world image.
~$ sudo docker run hello-world

Testing Docker

# Start the Docker service
$ sudo service docker start

# Docker: hello-world
$ sudo docker run hello-world
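
Before running the test it can help to confirm that the docker client is actually on PATH. The following is only a minimal sketch; the helper names have_cmd and docker_status are hypothetical and not part of Docker.

```shell
# Hypothetical helper: succeeds if the named command can be found on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Hypothetical helper: reports where the docker client lives, or fails.
docker_status() {
    if have_cmd docker; then
        echo "docker found: $(command -v docker)"
    else
        echo "docker not found on PATH; repeat the installation step" >&2
        return 1
    fi
}
```

Calling docker_status after the installation step prints the client path, or returns a non-zero status if the client is missing.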

Other Docker commands:

Usage: service docker {start|stop|restart|status}

List images
$ sudo docker images

List containers
$ sudo docker container ls -a

Tip: Ctrl+C usually exits a running Docker container. See also: Stopping and removing all Docker containers and images
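
Stopping and removing all containers and images, as mentioned in the tip, can be sketched with standard docker subcommands. The function names and the DOCKER variable below are hypothetical; set DOCKER="sudo docker" if your user is not in the docker group. Images that are still referenced by several tags may need extra handling, which this sketch does not attempt.

```shell
# DOCKER is a hypothetical override point, e.g. DOCKER="sudo docker".
DOCKER="${DOCKER:-docker}"

# Stop, then remove, every container (running or exited).
remove_all_containers() {
    ids=$($DOCKER ps -aq)
    if [ -z "$ids" ]; then
        return 0
    fi
    # $ids is left unquoted on purpose so each ID becomes its own argument.
    $DOCKER stop $ids
    $DOCKER rm $ids
}

# Remove every local image; run this after the containers are gone.
remove_all_images() {
    ids=$($DOCKER images -q | sort -u)
    if [ -z "$ids" ]; then
        return 0
    fi
    $DOCKER rmi $ids
}
```

Both functions are destructive, so call them only on a machine whose containers and images you can afford to lose.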

Nvidia-docker installation

Check the NVIDIA driver version

$ nvidia-smi
Thu Nov 26 10:34:37 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.80.02    Driver Version: 450.80.02    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 2060    Off  | 00000000:01:00.0  On |                  N/A |
|  0%   37C    P8     9W / 190W |    301MiB /  5931MiB |      2%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A       942      G   /usr/lib/xorg/Xorg                 35MiB |
|    0   N/A  N/A      2278      G   /usr/lib/xorg/Xorg                 96MiB |
|    0   N/A  N/A      2404      G   /usr/bin/gnome-shell              150MiB |
|    0   N/A  N/A      4051      G   /usr/lib/firefox/firefox            3MiB |
+-----------------------------------------------------------------------------+
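
For scripting, the driver and CUDA versions can be pulled out of the header line of this table. The sketch below assumes the default nvidia-smi table layout shown above; the helper names are hypothetical, and whether a matching nvidia/cudagl tag exists for a given CUDA version is an assumption that has to be checked against the registry.

```shell
# Hypothetical helpers: read nvidia-smi's default table output on stdin.
driver_version() {
    sed -n 's/.*Driver Version: *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

cuda_version() {
    sed -n 's/.*CUDA Version: *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Build an image reference; assumes a "<cuda>-base" tag is published.
cudagl_tag() {
    printf 'nvidia/cudagl:%s-base\n' "$1"
}
```

For example, cudagl_tag "$(nvidia-smi | cuda_version)" would print the image reference to try for the installed driver.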

Reference: official installation guide
GitHub: NVIDIA/nvidia-docker

# Add the NVIDIA Container Toolkit repository and its GPG key
$ distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
      && curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
      && curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
            sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
            sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
 
$ sudo apt-get update

$ sudo apt-get install -y nvidia-container-toolkit

$ sudo nvidia-ctk runtime configure --runtime=docker
 
$ sudo systemctl restart docker
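
The nvidia-ctk runtime configure step writes an nvidia runtime entry into Docker's daemon configuration, normally /etc/docker/daemon.json. A quick way to confirm this is a plain text check, sketched below. The helper name is hypothetical and the check is a simple string match, not a real JSON parse.

```shell
# Hypothetical helper: succeeds if the given daemon.json mentions an
# "nvidia" entry. This is a text match only; it does not validate JSON.
has_nvidia_runtime() {
    [ -f "$1" ] && grep -q '"nvidia"' "$1"
}
```

Running has_nvidia_runtime /etc/docker/daemon.json && echo configured after the restart should print "configured" if the runtime was registered.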

$ docker pull nvidia/cudagl:11.0-base
 
# Test
$ docker run --rm --gpus all nvidia/cudagl:11.0-base nvidia-smi
 
Thu Nov 26 02:30:34 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.80.02    Driver Version: 450.80.02    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 2060    Off  | 00000000:01:00.0  On |                  N/A |
|  0%   37C    P8    10W / 190W |    307MiB /  5931MiB |     13%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+

Test (specifying the nvidia runtime explicitly)

$ sudo docker run --runtime=nvidia --rm nvidia/cudagl:11.0-base nvidia-smi

GUI in Docker containers

$ sudo apt-get install x11-xserver-utils

# Disable access control so that other X clients can draw
$ xhost +
access control disabled, clients can connect from any host

$ docker run -e DISPLAY=$DISPLAY -e GDK_SCALE -e GDK_DPI_SCALE -v /tmp/.X11-unix:/tmp/.X11-unix --rm -it image-name-or-id

If you hit an X Error, add the option --ipc=host or --env="QT_X11_NO_MITSHM=1". References:
How to fix X Error: BadAccess, BadDrawable, BadShmSeg while running graphical application using Docker?
Docker: gazebo: cannot connect to X server
If you hit "libGL error: No matching fbConfigs or visuals found libGL error...", see:
libGL error: No matching fbConfigs or visuals found libGL error: failed to load driver... when using Docker
The "pull image" approach from the link above has been tested successfully.
Use nvidia-smi to check the NVIDIA driver and CUDA versions, then choose a suitable tag from nvidia/cudagl.
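
The X11 options used in the docker run command above, together with the QT_X11_NO_MITSHM workaround, can be collected in one place. This is only a sketch; the function name x11_run_opts and its "mitshm" switch are hypothetical.

```shell
# Hypothetical helper: prints the docker run options for X11 forwarding.
# Pass "mitshm" as the first argument to add the QT_X11_NO_MITSHM workaround.
x11_run_opts() {
    opts="-e DISPLAY=${DISPLAY:-} -e GDK_SCALE -e GDK_DPI_SCALE"
    opts="$opts -v /tmp/.X11-unix:/tmp/.X11-unix"
    if [ "${1:-}" = "mitshm" ]; then
        opts="$opts --env=QT_X11_NO_MITSHM=1"
    fi
    printf '%s\n' "$opts"
}
```

It would be used as docker run $(x11_run_opts mitshm) --rm -it followed by the image name, with the command substitution left unquoted so the options are split into separate arguments.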

$ sudo apt-get install x11-xserver-utils

$ nvidia-smi

$ docker pull nvidia/cudagl:11.0-base

# Disable access control so that other X clients can draw
$ xhost +
access control disabled, clients can connect from any host

$ sudo docker run --rm --runtime=nvidia -it -e DISPLAY=$DISPLAY -e GDK_SCALE -e GDK_DPI_SCALE -v /tmp/.X11-unix:/tmp/.X11-unix nvidia/cudagl:11.0-base

# The following commands run inside the container
$ apt-get update

$ apt-get install mesa-utils

$ glxgears

Create a new long-lived container:

$ sudo apt-get install x11-xserver-utils

$ nvidia-smi

$ docker pull nvidia/cudagl:10.2-base

# Disable access control so that other X clients can draw
$ xhost +
access control disabled, clients can connect from any host

$ sudo docker run -it --name isaac --runtime=nvidia -e DISPLAY=$DISPLAY -e GDK_SCALE -e GDK_DPI_SCALE -v /tmp/.X11-unix:/tmp/.X11-unix -v /data/isaac:/data/isaac nvidia/cudagl:10.2-base

# The following commands run inside the container
$ apt-get update

$ apt-get install mesa-utils

$ glxgears

$ exit

$ sudo docker ps -a

$ sudo docker start isaac

$ sudo docker attach isaac
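
The start-then-attach step for a named container can be wrapped so that it fails cleanly when the container does not exist yet. A minimal sketch, with hypothetical function names; prefix the docker calls with sudo if your setup needs it.

```shell
# Hypothetical helper: reads container names on stdin (one per line) and
# succeeds if one of them equals the given name exactly.
name_in_list() {
    grep -Fxq -- "$1"
}

# Start and attach to the named container if it exists; otherwise report
# that it still has to be created with docker run.
resume_container() {
    if docker ps -a --format '{{.Names}}' | name_in_list "$1"; then
        docker start "$1" && docker attach "$1"
    else
        echo "container $1 not found; create it with docker run first" >&2
        return 1
    fi
}
```

For the container created above, resume_container isaac would replace the separate start and attach commands.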

If you encounter the following error:

Nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1: cannot open shared object file: no such file or directory

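No good fix has been found for this error so far. Most likely some package unexpectedly modified the system configuration and caused the problem; reinstalling the operating system resolves it.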
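
Before reinstalling, one way to narrow things down is to check whether the host's dynamic linker cache lists libnvidia-ml.so.1 at all. This is a diagnostic sketch only, not a fix, and the helper name is hypothetical; it reads the output of ldconfig -p on stdin.

```shell
# Hypothetical helper: reads "ldconfig -p" output on stdin and succeeds
# if the given library name appears in it.
lib_in_cache() {
    grep -Fq -- "$1"
}
```

Running ldconfig -p | lib_in_cache libnvidia-ml.so.1 && echo present shows whether the library is visible on the host; if it is missing, the problem likely lies with the host driver installation rather than with Docker itself.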
