Building Docker and NVIDIA Docker from Source

This article explains how to build Docker and NVIDIA Docker packages from the source projects on GitHub. Since the process involves modifying several projects, start by getting familiar with the GitHub repos that will be modified:

  • Moby: the open-source project that builds the Docker server side; formerly the Docker Engine project.
  • Docker CE: the project that builds the full Docker CE toolset. Its top level carries no license and is an official Docker product, so it is covered by the EULA; the Engine and CLI components underneath are Apache v2, however, so only the CLI is used here.
  • NVIDIA Docker: used to build the NVIDIA Docker tooling.
  • NVIDIA Container Runtime: provides NVIDIA's container runtime.
  • Libnvidia Container: the low-level library used by the container runtime.

Prerequisites

Make sure the following requirements are met before starting the build:

  • The host node needs the Docker container engine installed:
$ curl -fsSL "https://get.docker.com/" | sh
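Every build step below runs inside containers, so it is worth confirming the engine is actually usable before starting. A minimal pre-flight sketch (the messages are illustrative, not from any tool):

```shell
# Pre-flight check: the Moby / NVIDIA builds below run inside containers,
# so the host needs a working Docker engine before anything else.
if command -v docker >/dev/null 2>&1; then
  echo "docker CLI found"
else
  echo "docker CLI not found; run the get.docker.com script above first"
fi
```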

Ubuntu

Building the Docker Server and Client

This section builds the server and client binaries from Moby and Docker CE respectively, then packages them into deb packages with a simple dpkg invocation.

Moby

First, fetch the latest Moby source with Git and compile it with make:

$ cd ~/
$ git clone https://github.com/moby/moby
$ cd moby
$ VERSION=1.0.0-terran make binary
...
Created binary: bundles/binary-daemon/dockerd-1.0.0-terran
Copying nested executables into bundles/binary-daemon

Once finished, the built files are in bundles/binary-daemon; remove the checksum files and rename the versioned dockerd binary:

$ rm -rf bundles/binary-daemon/*.sha256 bundles/binary-daemon/*.md5 bundles/binary-daemon/dockerd
$ mv bundles/binary-daemon/dockerd-1.0.0-terran bundles/binary-daemon/dockerd
$ ls bundles/binary-daemon
docker-containerd docker-containerd-ctr docker-containerd-shim docker-init docker-proxy docker-runc dockerd

Docker CE

First, fetch the latest Docker CE source with Git:

$ cd ~/
$ git clone https://github.com/docker/docker-ce.git
$ cd docker-ce
$ echo "1.0.0-terran" > VERSION
$ make static
...
make[2]: Leaving directory '/root/docker-ce/components/packaging/static'
make[1]: Leaving directory '/root/docker-ce/components/packaging'

Once finished, check the files under components/packaging/static/build/linux/docker:

$ ls components/packaging/static/build/linux/docker
docker docker-containerd docker-containerd-ctr docker-containerd-shim docker-init docker-proxy docker-runc dockerd

The dockerd and related binaries here are identical to Moby's, but to steer clear of the EULA, only the docker client binary is taken from this build.

Building the deb Package

First create a directory named terran-docker with the following directory structure:

$ cd ~/
$ mkdir terran-docker
$ mkdir -p terran-docker/DEBIAN \
terran-docker/lib/systemd/system \
terran-docker/usr/local/bin

結果如下所示:

$ tree terran-docker/
terran-docker/
|-- DEBIAN
|-- lib
| `-- systemd
| `-- system
`-- usr
`-- local
`-- bin

In terran-docker/DEBIAN, add a file named control that describes the deb package:

Package: docker
Version: 1.0.0-terran
Architecture: amd64
Description: This is the terran project docker packages.
Maintainer: Kyle Bai <[email protected]>
Depends: iptables, init-system-helpers (>= 1.18~), lsb-base (>= 4.1+Debian11ubuntu7), libc6 (>= 2.17), libdevmapper1.02.1 (>= 2:1.02.97), libltdl7 (>= 2.4.6), libseccomp2 (>= 2.3.0), libsystemd0
Recommends: aufs-tools, ca-certificates, cgroupfs-mount | cgroup-lite, git, pigz, xz-utils, apparmor

Next, in terran-docker/DEBIAN, add a file named postinst with the following content:

#!/bin/sh
#
# Post-installation script
# Release May 15, 2018. Kyle Bai <[email protected]>

set -e

case "$1" in
configure)
if [ -z "$2" ]; then
if ! getent group docker > /dev/null; then
groupadd --system docker
fi
fi
systemctl enable docker.socket
systemctl enable docker.service
;;
abort-*)
# How'd we get here??
exit 1
;;
*)
;;
esac

#DEBHELPER#

Change the permissions of the terran-docker/DEBIAN/postinst script:

$ chmod 755 terran-docker/DEBIAN/postinst
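dpkg -b (used later) requires DEBIAN/control to exist and maintainer scripts such as postinst to be executable (mode 0755), otherwise the build fails. A self-contained sketch of the same skeleton, using a throwaway directory and a dummy package name (demo-docker is illustrative only):

```shell
# Build the same packaging skeleton in a temporary directory and verify
# the bits dpkg cares about. "demo-docker" is a placeholder name.
set -e
pkg="$(mktemp -d)/demo-docker"
mkdir -p "$pkg/DEBIAN" "$pkg/lib/systemd/system" "$pkg/usr/local/bin"

# Minimal control file: Package, Version, Architecture, Maintainer, and
# Description are the required fields.
printf 'Package: demo-docker\nVersion: 0.0.1\nArchitecture: amd64\nMaintainer: demo <demo@example.com>\nDescription: demo package\n' \
  > "$pkg/DEBIAN/control"

# Maintainer scripts must be executable or dpkg -b refuses to build.
printf '#!/bin/sh\nexit 0\n' > "$pkg/DEBIAN/postinst"
chmod 755 "$pkg/DEBIAN/postinst"

stat -c '%a' "$pkg/DEBIAN/postinst"   # prints: 755
```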

In terran-docker/lib/systemd/system, add a file named docker.socket with the following content:

[Unit]
Description=Terran Docker Socket for the API
PartOf=docker.service

[Socket]
ListenStream=/var/run/docker.sock
SocketMode=0660
SocketUser=root
SocketGroup=docker

[Install]
WantedBy=sockets.target

In terran-docker/lib/systemd/system, add a file named docker.service with the following content:

[Unit]
Description=Terran Docker Container Engine
After=network-online.target docker.socket firewalld.service
Wants=network-online.target
Requires=docker.socket

[Service]
Type=notify
ExecStart=/usr/local/bin/dockerd -H fd://
ExecReload=/bin/kill -s HUP $MAINPID
LimitNOFILE=1048576
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
TimeoutStartSec=0
Delegate=yes
KillMode=process
Restart=on-failure
StartLimitBurst=3
StartLimitInterval=60s

[Install]
WantedBy=multi-user.target

Copy the binaries into usr/local/bin with the following commands:

$ cp -rp moby/bundles/binary-daemon/* terran-docker/usr/local/bin
$ cp -rp docker-ce/components/packaging/static/build/linux/docker/docker terran-docker/usr/local/bin

Confirm the files match the following structure:

$ tree terran-docker/
terran-docker/
|-- DEBIAN
| |-- control
| `-- postinst
|-- lib
| `-- systemd
| `-- system
| |-- docker.service
| `-- docker.socket
`-- usr
`-- local
`-- bin
|-- docker
|-- docker-containerd
|-- docker-containerd-ctr
|-- docker-containerd-shim
|-- dockerd
|-- docker-init
|-- docker-proxy
`-- docker-runc

Once done, build the Debian package with dpkg:

$ dpkg -b terran-docker
dpkg-deb: building package 'terran-docker' in 'terran-docker.deb'.

$ ls *.deb
terran-docker.deb

$ dpkg -f terran-docker.deb

Building the NVIDIA Docker Tools

This step builds three separate projects, all of which are needed for a fully working NVIDIA Docker.

Libnvidia Container

First, fetch the latest Libnvidia Container source with Git and build it:

$ git clone https://github.com/NVIDIA/libnvidia-container.git
$ cd libnvidia-container
$ make docker-ubuntu:16.04 TAG=rc.1

Once finished, check the files under dist/ubuntu16.04/:

$ ls dist/ubuntu16.04/
libnvidia-container-dev_1.0.0~rc.1-1_amd64.deb libnvidia-container1_1.0.0~rc.1-1_amd64.deb
libnvidia-container-tools_1.0.0~rc.1-1_amd64.deb libnvidia-container1-dbg_1.0.0~rc.1-1_amd64.deb

NVIDIA Container Runtime

First, fetch the latest Container Runtime source with Git:

$ git clone https://github.com/NVIDIA/nvidia-container-runtime.git
$ cd nvidia-container-runtime

Edit runtime/Makefile and modify the following content:

# line 17
ubuntu16.04: $(addsuffix -ubuntu16.04, 1.0.0-terran)

# line 31 (pin the runc commit for v1.0-rc5)
1.0.0-terran-%-runc:
echo "4fc53a81fb7c994640722ac585fa9ca548971871"

Then run the following command to build the deb packages:

$ make ubuntu16.04
...
dpkg-buildpackage: binary-only upload (no source included)
make[1]: Leaving directory '/root/nvidia-container-runtime/hook'

Once finished, check the dist/ubuntu16.04/ directory:

$ ls dist/ubuntu16.04/
nvidia-container-runtime-hook_1.3.0-1_amd64.deb nvidia-container-runtime_2.0.0+docker1.0.0-terran-1_amd64.deb

NVIDIA Docker

First, fetch the latest NVIDIA Docker source with Git:

$ git clone https://github.com/NVIDIA/nvidia-docker.git
$ cd nvidia-docker

Edit the Makefile and modify the following content:

...

# line 18
ubuntu16.04: $(addsuffix -ubuntu16.04, 1.0.0-terran)

# line 50
%-ubuntu16.04:
$(DOCKER) build --build-arg VERSION_ID="16.04" \
--build-arg RUNTIME_VERSION="$(RUNTIME_VERSION)+docker$*-1" \
--build-arg DOCKER_VERSION="docker (= $*)" \
--build-arg PKG_VERS="$(VERSION)+docker$*" \
--build-arg PKG_REV="$(PKG_REV)" \
-t "nvidia/nvidia-docker2/ubuntu:16.04-docker$*" -f Dockerfile.ubuntu .
$(DOCKER) run --rm -v $(DIST_DIR)/ubuntu16.04:/dist:Z "nvidia/nvidia-docker2/ubuntu:16.04-docker$*"

Once done, build with make:

$ make ubuntu16.04
...
dpkg-source -i --after-build nvidia-docker2-2.0.3+docker1.0.0-terran
dpkg-buildpackage: binary-only upload (no source included)

Check the dist/ubuntu16.04/ directory:

$ ls dist/ubuntu16.04/
nvidia-docker2_2.0.3+docker1.0.0-terran-1_all.deb

Collecting the Debian Packages

Finally, copy all the built Debian packages into a single directory:

$ cd 
$ mkdir nvidia-debs
$ cp -rp libnvidia-container/dist/ubuntu16.04/*.deb nvidia-debs/
$ cp -rp nvidia-container-runtime/dist/ubuntu16.04/*.deb nvidia-debs/
$ cp -rp nvidia-docker/dist/ubuntu16.04/*.deb nvidia-debs/
$ ls nvidia-debs
libnvidia-container-dev_1.0.0~rc.1-1_amd64.deb libnvidia-container1_1.0.0~rc.1-1_amd64.deb nvidia-docker2_2.0.3+docker1.0.0-terran-1_all.deb
libnvidia-container-tools_1.0.0~rc.1-1_amd64.deb nvidia-container-runtime-hook_1.3.0-1_amd64.deb
libnvidia-container1-dbg_1.0.0~rc.1-1_amd64.deb nvidia-container-runtime_2.0.0+docker1.0.0-terran-1_amd64.deb

Testing the Built Packages

Find a clean Ubuntu 16.04 machine and copy the files to that node. Then install Terran Docker first:

$ sudo dpkg -i terran-docker.deb

If missing dependencies are reported, run the following:

$ sudo apt-get install -y -f

Then install NVIDIA Docker and its related packages:

$ sudo dpkg -i nvidia-debs/*.deb

Docker test result:

Kubernetes test result:

CentOS

Building the Docker Server and Client

This section builds the server and client binaries from Docker CE.

Build from Docker CE

First, fetch the latest Docker CE source with Git:

$ git clone https://github.com/docker/docker-ce.git
$ cd docker-ce/components/packaging/rpm
$ VERSION=18.05.0-terran make centos
...
+ /usr/bin/rm -rf /root/rpmbuild/BUILDROOT/docker-ce-18.05.0.terran-3.el7.x86_64
+ exit 0

Once finished, check the files under rpmbuild/RPMS/x86_64:

$ ls rpmbuild/RPMS/x86_64
docker-ce-18.05.0.terran-3.el7.x86_64.rpm docker-ce-debuginfo-18.05.0.terran-3.el7.x86_64.rpm

Copy them to the test node:

$ ssh [email protected]$NODE_IP "mkdir ~/docker"
$ scp rpmbuild/RPMS/x86_64/*.rpm [email protected]$NODE_IP:~/docker

Building the NVIDIA Docker Tools

This step builds three separate projects, all of which are needed for a fully working NVIDIA Docker.

Libnvidia Container

First, fetch the latest Libnvidia Container source with Git and build it:

$ git clone https://github.com/NVIDIA/libnvidia-container.git
$ cd libnvidia-container
$ make docker-centos:7 TAG=rc.2

Once finished, check the files under dist/centos7/:

$ ls dist/centos7/
libnvidia-container_1.0.0-rc.2_x86_64.tar.xz libnvidia-container1-debuginfo-1.0.0-0.1.rc.2.x86_64.rpm libnvidia-container-static-1.0.0-0.1.rc.2.x86_64.rpm
libnvidia-container1-1.0.0-0.1.rc.2.x86_64.rpm libnvidia-container-devel-1.0.0-0.1.rc.2.x86_64.rpm libnvidia-container-tools-1.0.0-0.1.rc.2.x86_64.rpm

Copy them to the test node:

$ ssh [email protected]$NODE_IP "mkdir ~/libnvidia-container"
$ scp dist/centos7/*.rpm [email protected]$NODE_IP:~/libnvidia-container

NVIDIA Container Runtime

First, fetch the latest Container Runtime source with Git:

$ git clone https://github.com/NVIDIA/nvidia-container-runtime.git
$ cd nvidia-container-runtime

Edit runtime/Makefile and modify the following content:

# line 25
centos7: $(addsuffix -centos7, 18.05.0-terran)

# Add new lines in line 31(runc v1.0-rc5)
18.05.0-terran-%-runc:
echo "4fc53a81fb7c994640722ac585fa9ca548971871"

# Modify line 109
%-terran-centos7:

Then run the following command to build the rpm packages:

$ make centos7
...
+ exit 0
make[1]: Leaving directory '/home/k2r2bai/Desktop/build-docker/nvidia-container-runtime/hook'

Once finished, check the dist/centos7/ directory:

$ ls dist/centos7/
nvidia-container-runtime-2.0.0-1.docker18.05.0.x86_64.rpm nvidia-container-runtime-hook-1.3.0-1.x86_64.rpm

Copy them to the test node:

$ ssh [email protected]$NODE_IP "mkdir ~/nvidia-container-runtime"
$ scp dist/centos7/*.rpm [email protected]$NODE_IP:~/nvidia-container-runtime

NVIDIA Docker

First, fetch the latest NVIDIA Docker source with Git:

$ git clone https://github.com/NVIDIA/nvidia-docker.git
$ cd nvidia-docker

Edit the Makefile and modify the following content:

# line 26
centos7: $(addsuffix -centos7, 18.05.0.terran)

# Modify line 122
%.terran-centos7:
$(DOCKER) build --build-arg VERSION_ID="7" \
--build-arg RUNTIME_VERSION="$(RUNTIME_VERSION)-1.docker$*" \
--build-arg DOCKER_VERSION="docker-ce = $*.terran" \
--build-arg PKG_VERS="$(VERSION)" \
--build-arg PKG_REV="$(PKG_REV).docker$*.terran" \
-t "nvidia/nvidia-docker2/centos:7-docker$*.terran" -f Dockerfile.centos .
$(DOCKER) run --rm -v $(DIST_DIR)/centos7:/dist:Z "nvidia/nvidia-docker2/centos:7-docker$*.terran"

Once done, build with make:

$ make centos7
...
+ cd /tmp/nvidia-container-runtime-2.0.3/BUILD
+ exit 0

Check the dist/centos7/ directory:

$ ls dist/centos7
nvidia-docker2-2.0.3-1.docker18.05.0.terran.noarch.rpm

Copy it to the test node:

$ ssh [email protected]$NODE_IP "mkdir ~/nvidia-docker"
$ scp dist/centos7/*.rpm [email protected]$NODE_IP:~/nvidia-docker

Testing the Built Packages

First, install the NVIDIA driver and a few packages on the CentOS 7 test machine:

$ echo -e "blacklist nouveau\noptions nouveau modeset=0" | sudo tee /etc/modprobe.d/blacklist-nouveau.conf
$ sudo yum -y install gcc kernel-devel "kernel-devel-uname-r == $(uname -r)" dkms
$ sudo rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
$ sudo rpm -Uvh http://www.elrepo.org/elrepo-release-7.0-3.el7.elrepo.noarch.rpm
$ sudo yum -y install kmod-nvidia libtool-ltdl container-selinux
$ sudo reboot

After rebooting, the node's home directory must contain the following files (the docker, libnvidia-container, nvidia-container-runtime, and nvidia-docker directories copied over earlier):

$ tree ~/

Once confirmed, install everything with RPM:

$ sudo rpm -ivh docker/*.rpm
$ sudo rpm -ivh libnvidia-container/*.rpm
$ sudo rpm -ivh nvidia-container-runtime/*.rpm
$ sudo rpm -ivh nvidia-docker/*.rpm

Start dockerd and enable it at boot via systemd:

$ sudo systemctl enable docker && sudo systemctl start docker
$ sudo docker version
Client:
 Version:           18.05.0-terran
 API version:       1.38
 Go version:        go1.10.3
 Git commit:        ccbd518
 Built:             Wed Jun 13 03:19:43 2018
 OS/Arch:           linux/amd64
 Experimental:      false
 Orchestrator:      swarm

Server:
 Engine:
  Version:          18.05.0-terran
  API version:      1.38 (minimum version 1.12)
  Go version:       go1.10.3
  Git commit:       ccbd518
  Built:            Wed Jun 13 03:20:54 2018
  OS/Arch:          linux/amd64
  Experimental:     false

Test NVIDIA Docker:

$ sudo docker run --runtime=nvidia --rm nvidia/cuda nvidia-smi
Wed Jun 13 06:28:25 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.67                 Driver Version: 390.67                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 106...  Off  | 00000000:01:00.0 Off |                  N/A |
|  0%   34C    P5    26W / 120W |      0MiB /  3019MiB |      2%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Kubernetes v1.11 test result:
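No screenshot survives here, but a typical smoke test at this point is to schedule a GPU pod. A minimal sketch, assuming the NVIDIA device plugin is already deployed on the cluster (the pod name and file name are illustrative):

```yaml
# gpu-smoke-test.yaml -- illustrative manifest, not from the original post.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test
spec:
  restartPolicy: Never
  containers:
  - name: cuda
    image: nvidia/cuda
    command: ["nvidia-smi"]
    resources:
      limits:
        nvidia.com/gpu: 1   # scheduled via the NVIDIA device plugin
```

Apply it with kubectl apply -f gpu-smoke-test.yaml, then check kubectl logs gpu-smoke-test for the same nvidia-smi table as above.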
