Diffstat (limited to 'devdocs/docker/engine%2Fsecurity%2Frootless%2Findex.html')
| -rw-r--r-- | devdocs/docker/engine%2Fsecurity%2Frootless%2Findex.html | 123 |
1 files changed, 123 insertions, 0 deletions
diff --git a/devdocs/docker/engine%2Fsecurity%2Frootless%2Findex.html b/devdocs/docker/engine%2Fsecurity%2Frootless%2Findex.html new file mode 100644 index 00000000..9f8a4f22 --- /dev/null +++ b/devdocs/docker/engine%2Fsecurity%2Frootless%2Findex.html @@ -0,0 +1,123 @@ +<h1>Run the Docker daemon as a non-root user (Rootless mode)</h1> + +<p>Rootless mode allows running the Docker daemon and containers as a non-root user to mitigate potential vulnerabilities in the daemon and the container runtime.</p> <p>Rootless mode does not require root privileges even during the installation of the Docker daemon, as long as the <a href="#prerequisites">prerequisites</a> are met.</p> <p>Rootless mode was introduced in Docker Engine v19.03 as an experimental feature. Rootless mode graduated from experimental in Docker Engine v20.10.</p> <h2 id="how-it-works">How it works</h2> <p>Rootless mode executes the Docker daemon and containers inside a user namespace. This is very similar to <a href="../userns-remap/index"><code class="language-plaintext highlighter-rouge">userns-remap</code> mode</a>, except that with <code class="language-plaintext highlighter-rouge">userns-remap</code> mode, the daemon itself is running with root privileges, whereas in rootless mode, both the daemon and the container are running without root privileges.</p> <p>Rootless mode does not use binaries with <code class="language-plaintext highlighter-rouge">SETUID</code> bits or file capabilities, except <code class="language-plaintext highlighter-rouge">newuidmap</code> and <code class="language-plaintext highlighter-rouge">newgidmap</code>, which are needed to allow multiple UIDs/GIDs to be used in the user namespace.</p> <h2 id="prerequisites">Prerequisites</h2> <ul> <li> <p>You must install <code class="language-plaintext highlighter-rouge">newuidmap</code> and <code class="language-plaintext highlighter-rouge">newgidmap</code> on the host. 
These commands are provided by the <code class="language-plaintext highlighter-rouge">uidmap</code> package on most distros.</p> </li> <li> <p><code class="language-plaintext highlighter-rouge">/etc/subuid</code> and <code class="language-plaintext highlighter-rouge">/etc/subgid</code> should contain at least 65,536 subordinate UIDs/GIDs for the user. In the following example, the user <code class="language-plaintext highlighter-rouge">testuser</code> has 65,536 subordinate UIDs/GIDs (231072-296607).</p> </li> </ul> <div class="highlight"><pre class="highlight" data-language="">$ id -u +1001 +$ whoami +testuser +$ grep ^$(whoami): /etc/subuid +testuser:231072:65536 +$ grep ^$(whoami): /etc/subgid +testuser:231072:65536 +</pre></div> <h3 id="distribution-specific-hint">Distribution-specific hint</h3> <blockquote> <p>Note: We recommend that you use the Ubuntu kernel.</p> </blockquote> <ul class="nav nav-tabs"> <li class="active"><a data-toggle="tab" data-target="#hint-ubuntu">Ubuntu</a></li> <li><a data-toggle="tab" data-target="#hint-debian">Debian GNU/Linux</a></li> <li><a data-toggle="tab" data-target="#hint-arch">Arch Linux</a></li> <li><a data-toggle="tab" data-target="#hint-opensuse-sles">openSUSE and SLES</a></li> <li><a data-toggle="tab" data-target="#hint-centos8-rhel8-fedora">CentOS 8, RHEL 8 and Fedora</a></li> <li><a data-toggle="tab" data-target="#hint-centos7-rhel7">CentOS 7 and RHEL 7</a></li> </ul> <div class="tab-content"> <div id="hint-ubuntu" class="tab-pane fade in active"> <ul> <li> <p>Install <code class="language-plaintext highlighter-rouge">dbus-user-session</code> package if not installed. 
Run <code class="language-plaintext highlighter-rouge">sudo apt-get install -y dbus-user-session</code> and relogin.</p> </li> <li> <p><code class="language-plaintext highlighter-rouge">overlay2</code> storage driver is enabled by default (<a href="https://kernel.ubuntu.com/git/ubuntu/ubuntu-bionic.git/commit/fs/overlayfs?id=3b7da90f28fe1ed4b79ef2d994c81efbc58f1144">Ubuntu-specific kernel patch</a>).</p> </li> <li> <p>Known to work on Ubuntu 18.04, 20.04, and 21.04.</p> </li> </ul> </div> <div id="hint-debian" class="tab-pane fade in"> <ul> <li> <p>Install <code class="language-plaintext highlighter-rouge">dbus-user-session</code> package if not installed. Run <code class="language-plaintext highlighter-rouge">sudo apt-get install -y dbus-user-session</code> and relogin.</p> </li> <li> <p>For Debian 10, add <code class="language-plaintext highlighter-rouge">kernel.unprivileged_userns_clone=1</code> to <code class="language-plaintext highlighter-rouge">/etc/sysctl.conf</code> (or <code class="language-plaintext highlighter-rouge">/etc/sysctl.d</code>) and run <code class="language-plaintext highlighter-rouge">sudo sysctl --system</code>. This step is not required on Debian 11.</p> </li> <li> <p>Installing <code class="language-plaintext highlighter-rouge">fuse-overlayfs</code> is recommended. Run <code class="language-plaintext highlighter-rouge">sudo apt-get install -y fuse-overlayfs</code>. 
Using the <code class="language-plaintext highlighter-rouge">overlay2</code> storage driver with the Debian-specific modprobe option <code class="language-plaintext highlighter-rouge">sudo modprobe overlay permit_mounts_in_userns=1</code> is also possible but highly discouraged due to <a href="https://github.com/moby/moby/issues/42302">instability</a>.</p> </li> <li> <p>Rootless Docker requires a version of <code class="language-plaintext highlighter-rouge">slirp4netns</code> greater than <code class="language-plaintext highlighter-rouge">v0.4.0</code> (when <code class="language-plaintext highlighter-rouge">vpnkit</code> is not installed). Check the installed version with:</p> <div class="highlight"><pre class="highlight" data-language="">$ slirp4netns --version +</pre></div> <p>If it is missing or too old, install it with <code class="language-plaintext highlighter-rouge">sudo apt-get install -y slirp4netns</code> or download the latest <a href="https://github.com/rootless-containers/slirp4netns/releases">release</a>.</p> </li> </ul> </div> <div id="hint-arch" class="tab-pane fade in"> <ul> <li> <p>Installing <code class="language-plaintext highlighter-rouge">fuse-overlayfs</code> is recommended. Run <code class="language-plaintext highlighter-rouge">sudo pacman -S fuse-overlayfs</code>.</p> </li> <li> <p>Add <code class="language-plaintext highlighter-rouge">kernel.unprivileged_userns_clone=1</code> to <code class="language-plaintext highlighter-rouge">/etc/sysctl.conf</code> (or <code class="language-plaintext highlighter-rouge">/etc/sysctl.d</code>) and run <code class="language-plaintext highlighter-rouge">sudo sysctl --system</code>.</p> </li> </ul> </div> <div id="hint-opensuse-sles" class="tab-pane fade in"> <ul> <li> <p>Installing <code class="language-plaintext highlighter-rouge">fuse-overlayfs</code> is recommended. 
Run <code class="language-plaintext highlighter-rouge">sudo zypper install -y fuse-overlayfs</code>.</p> </li> <li> <p><code class="language-plaintext highlighter-rouge">sudo modprobe ip_tables iptable_mangle iptable_nat iptable_filter</code> is required. This might be required on other distros as well depending on the configuration.</p> </li> <li> <p>Known to work on openSUSE 15 and SLES 15.</p> </li> </ul> </div> <div id="hint-centos8-rhel8-fedora" class="tab-pane fade in"> <ul> <li> <p>Installing <code class="language-plaintext highlighter-rouge">fuse-overlayfs</code> is recommended. Run <code class="language-plaintext highlighter-rouge">sudo dnf install -y fuse-overlayfs</code>.</p> </li> <li> <p>You might need <code class="language-plaintext highlighter-rouge">sudo dnf install -y iptables</code>.</p> </li> <li> <p>Known to work on CentOS 8, RHEL 8, and Fedora 34.</p> </li> </ul> </div> <div id="hint-centos7-rhel7" class="tab-pane fade in"> <ul> <li> <p>Add <code class="language-plaintext highlighter-rouge">user.max_user_namespaces=28633</code> to <code class="language-plaintext highlighter-rouge">/etc/sysctl.conf</code> (or <code class="language-plaintext highlighter-rouge">/etc/sysctl.d</code>) and run <code class="language-plaintext highlighter-rouge">sudo sysctl --system</code>.</p> </li> <li> <p><code class="language-plaintext highlighter-rouge">systemctl --user</code> does not work by default. 
Run <code class="language-plaintext highlighter-rouge">dockerd-rootless.sh</code> directly without systemd.</p> </li> </ul> </div> </div> <h2 id="known-limitations">Known limitations</h2> <ul> <li>Only the following storage drivers are supported: <ul> <li> +<code class="language-plaintext highlighter-rouge">overlay2</code> (only if running with kernel 5.11 or later, or an Ubuntu-flavored kernel)</li> <li> +<code class="language-plaintext highlighter-rouge">fuse-overlayfs</code> (only if running with kernel 4.18 or later, and <code class="language-plaintext highlighter-rouge">fuse-overlayfs</code> is installed)</li> <li> +<code class="language-plaintext highlighter-rouge">btrfs</code> (only if running with kernel 4.18 or later, or <code class="language-plaintext highlighter-rouge">~/.local/share/docker</code> is mounted with the <code class="language-plaintext highlighter-rouge">user_subvol_rm_allowed</code> mount option)</li> <li><code class="language-plaintext highlighter-rouge">vfs</code></li> </ul> </li> <li>Cgroup is supported only when running with cgroup v2 and systemd. See <a href="#limiting-resources">Limiting resources</a>.</li> <li>The following features are not supported: <ul> <li>AppArmor</li> <li>Checkpoint</li> <li>Overlay network</li> <li>Exposing SCTP ports</li> </ul> </li> <li>To use the <code class="language-plaintext highlighter-rouge">ping</code> command, see <a href="#routing-ping-packets">Routing ping packets</a>.</li> <li>To expose privileged TCP/UDP ports (< 1024), see <a href="#exposing-privileged-ports">Exposing privileged ports</a>.</li> <li> +<code class="language-plaintext highlighter-rouge">IPAddress</code> shown in <code class="language-plaintext highlighter-rouge">docker inspect</code> is namespaced inside RootlessKit’s network namespace. 
This means the IP address is not reachable from the host without <code class="language-plaintext highlighter-rouge">nsenter</code>-ing into the network namespace.</li> <li>Host network (<code class="language-plaintext highlighter-rouge">docker run --net=host</code>) is also namespaced inside RootlessKit.</li> <li>Using an NFS mount as the Docker “data-root” is not supported. This limitation is not specific to rootless mode.</li> </ul> <h2 id="install">Install</h2> <blockquote> <p><strong>Note</strong></p> <p>If the system-wide Docker daemon is already running, consider disabling it: <code class="language-plaintext highlighter-rouge">$ sudo systemctl disable --now docker.service docker.socket</code></p> </blockquote> <ul class="nav nav-tabs"> <li class="active"><a data-toggle="tab" data-target="#install-with-packages">With packages (RPM/DEB)</a></li> <li><a data-toggle="tab" data-target="#install-without-packages">Without packages</a></li> </ul> <div class="tab-content"> <div id="install-with-packages" class="tab-pane fade in active"> <p>If you installed Docker 20.10 or later with <a href="../../install/index">RPM/DEB packages</a>, you should have <code class="language-plaintext highlighter-rouge">dockerd-rootless-setuptool.sh</code> in <code class="language-plaintext highlighter-rouge">/usr/bin</code>.</p> <p>Run <code class="language-plaintext highlighter-rouge">dockerd-rootless-setuptool.sh install</code> as a non-root user to set up the daemon:</p> <div class="highlight"><pre class="highlight" data-language="">$ dockerd-rootless-setuptool.sh install +[INFO] Creating /home/testuser/.config/systemd/user/docker.service +... +[INFO] Installed docker.service successfully. 
+[INFO] To control docker.service, run: `systemctl --user (start|stop|restart) docker.service` +[INFO] To run docker.service on system startup, run: `sudo loginctl enable-linger testuser` + +[INFO] Make sure the following environment variables are set (or add them to ~/.bashrc): + +export PATH=/usr/bin:$PATH +export DOCKER_HOST=unix:///run/user/1000/docker.sock +</pre></div> <p>If <code class="language-plaintext highlighter-rouge">dockerd-rootless-setuptool.sh</code> is not present, you may need to install the <code class="language-plaintext highlighter-rouge">docker-ce-rootless-extras</code> package manually, for example:</p> <div class="highlight"><pre class="highlight" data-language="">$ sudo apt-get install -y docker-ce-rootless-extras +</pre></div> </div> <div id="install-without-packages" class="tab-pane fade in"> <p>If you do not have permission to run package managers like <code class="language-plaintext highlighter-rouge">apt-get</code> and <code class="language-plaintext highlighter-rouge">dnf</code>, consider using the installation script available at <a href="https://get.docker.com/rootless" target="_blank" rel="noopener" class="_">https://get.docker.com/rootless</a>. Static packages are not available for <code class="language-plaintext highlighter-rouge">s390x</code>, so this script is not supported on <code class="language-plaintext highlighter-rouge">s390x</code>.</p> <div class="highlight"><pre class="highlight" data-language="">$ curl -fsSL https://get.docker.com/rootless | sh +... +[INFO] Creating /home/testuser/.config/systemd/user/docker.service +... +[INFO] Installed docker.service successfully. 
+[INFO] To control docker.service, run: `systemctl --user (start|stop|restart) docker.service` +[INFO] To run docker.service on system startup, run: `sudo loginctl enable-linger testuser` + +[INFO] Make sure the following environment variables are set (or add them to ~/.bashrc): + +export PATH=/home/testuser/bin:$PATH +export DOCKER_HOST=unix:///run/user/1000/docker.sock +</pre></div> <p>The binaries will be installed at <code class="language-plaintext highlighter-rouge">~/bin</code>.</p> </div> </div> <p>See <a href="#troubleshooting">Troubleshooting</a> if you encounter an error.</p> <h2 id="uninstall">Uninstall</h2> <p>To remove the systemd service of the Docker daemon, run <code class="language-plaintext highlighter-rouge">dockerd-rootless-setuptool.sh uninstall</code>:</p> <div class="highlight"><pre class="highlight" data-language="">$ dockerd-rootless-setuptool.sh uninstall ++ systemctl --user stop docker.service ++ systemctl --user disable docker.service +Removed /home/testuser/.config/systemd/user/default.target.wants/docker.service. +[INFO] Uninstalled docker.service +[INFO] This uninstallation tool does NOT remove Docker binaries and data. +[INFO] To remove data, run: `/usr/bin/rootlesskit rm -rf /home/testuser/.local/share/docker` +</pre></div> <p>Remove the PATH and DOCKER_HOST settings if you added them to <code class="language-plaintext highlighter-rouge">~/.bashrc</code>.</p> <p>To remove the data directory, run <code class="language-plaintext highlighter-rouge">rootlesskit rm -rf ~/.local/share/docker</code>.</p> <p>To remove the binaries, remove the <code class="language-plaintext highlighter-rouge">docker-ce-rootless-extras</code> package if you installed Docker with a package manager. 
If you installed Docker with https://get.docker.com/rootless (<a href="#install">Install without packages</a>), remove the binary files under <code class="language-plaintext highlighter-rouge">~/bin</code>:</p> <div class="highlight"><pre class="highlight" data-language="">$ cd ~/bin +$ rm -f containerd containerd-shim containerd-shim-runc-v2 ctr docker docker-init docker-proxy dockerd dockerd-rootless-setuptool.sh dockerd-rootless.sh rootlesskit rootlesskit-docker-proxy runc vpnkit +</pre></div> <h2 id="usage">Usage</h2> <h3 id="daemon">Daemon</h3> <ul class="nav nav-tabs"> <li class="active"><a data-toggle="tab" data-target="#usage-with-systemd">With systemd (Highly recommended)</a></li> <li><a data-toggle="tab" data-target="#usage-without-systemd">Without systemd</a></li> </ul> <div class="tab-content"> <div id="usage-with-systemd" class="tab-pane fade in active"> <p>The systemd unit file is installed as <code class="language-plaintext highlighter-rouge">~/.config/systemd/user/docker.service</code>.</p> <p>Use <code class="language-plaintext highlighter-rouge">systemctl --user</code> to manage the lifecycle of the daemon:</p> <div class="highlight"><pre class="highlight" data-language="">$ systemctl --user start docker +</pre></div> <p>To launch the daemon on system startup, enable the systemd service and lingering:</p> <div class="highlight"><pre class="highlight" data-language="">$ systemctl --user enable docker +$ sudo loginctl enable-linger $(whoami) +</pre></div> <p>Starting Rootless Docker as a systemd-wide service (<code class="language-plaintext highlighter-rouge">/etc/systemd/system/docker.service</code>) is not supported, even with the <code class="language-plaintext highlighter-rouge">User=</code> directive.</p> </div> <div id="usage-without-systemd" class="tab-pane fade in"> <p>To run the daemon directly without systemd, you need to run <code class="language-plaintext highlighter-rouge">dockerd-rootless.sh</code> instead of <code 
class="language-plaintext highlighter-rouge">dockerd</code>.</p> <p>The following environment variables must be set:</p> <ul> <li> +<code class="language-plaintext highlighter-rouge">$HOME</code>: the home directory</li> <li> +<code class="language-plaintext highlighter-rouge">$XDG_RUNTIME_DIR</code>: an ephemeral directory that is only accessible by the expected user, e.g., <code class="language-plaintext highlighter-rouge">~/.docker/run</code>. The directory should be removed on every host shutdown. The directory can be on tmpfs, but it should not be under <code class="language-plaintext highlighter-rouge">/tmp</code>. Locating this directory under <code class="language-plaintext highlighter-rouge">/tmp</code> might make it vulnerable to a TOCTOU attack.</li> </ul> </div> </div> <p>Remarks about directory paths:</p> <ul> <li>The socket path is set to <code class="language-plaintext highlighter-rouge">$XDG_RUNTIME_DIR/docker.sock</code> by default. <code class="language-plaintext highlighter-rouge">$XDG_RUNTIME_DIR</code> is typically set to <code class="language-plaintext highlighter-rouge">/run/user/$UID</code>.</li> <li>The data dir is set to <code class="language-plaintext highlighter-rouge">~/.local/share/docker</code> by default. The data dir should not be on NFS.</li> <li>The daemon config dir is set to <code class="language-plaintext highlighter-rouge">~/.config/docker</code> by default. 
This directory is different from <code class="language-plaintext highlighter-rouge">~/.docker</code> that is used by the client.</li> </ul> <h3 id="client">Client</h3> <p>You need to specify either the socket path or the CLI context explicitly.</p> <p>To specify the socket path using <code class="language-plaintext highlighter-rouge">$DOCKER_HOST</code>:</p> <div class="highlight"><pre class="highlight" data-language="">$ export DOCKER_HOST=unix://$XDG_RUNTIME_DIR/docker.sock +$ docker run -d -p 8080:80 nginx +</pre></div> <p>To specify the CLI context using <code class="language-plaintext highlighter-rouge">docker context</code>:</p> <div class="highlight"><pre class="highlight" data-language="">$ docker context use rootless +rootless +Current context is now "rootless" +$ docker run -d -p 8080:80 nginx +</pre></div> <h2 id="best-practices">Best practices</h2> <h3 id="rootless-docker-in-docker">Rootless Docker in Docker</h3> <p>To run Rootless Docker inside “rootful” Docker, use the <code class="language-plaintext highlighter-rouge">docker:<version>-dind-rootless</code> image instead of <code class="language-plaintext highlighter-rouge">docker:<version>-dind</code>.</p> <div class="highlight"><pre class="highlight" data-language="">$ docker run -d --name dind-rootless --privileged docker:20.10-dind-rootless +</pre></div> <p>The <code class="language-plaintext highlighter-rouge">docker:<version>-dind-rootless</code> image runs as a non-root user (UID 1000). 
However, <code class="language-plaintext highlighter-rouge">--privileged</code> is required for disabling seccomp, AppArmor, and mount masks.</p> <h3 id="expose-docker-api-socket-through-tcp">Expose Docker API socket through TCP</h3> <p>To expose the Docker API socket through TCP, you need to launch <code class="language-plaintext highlighter-rouge">dockerd-rootless.sh</code> with <code class="language-plaintext highlighter-rouge">DOCKERD_ROOTLESS_ROOTLESSKIT_FLAGS="-p 0.0.0.0:2376:2376/tcp"</code>.</p> <div class="highlight"><pre class="highlight" data-language="">$ DOCKERD_ROOTLESS_ROOTLESSKIT_FLAGS="-p 0.0.0.0:2376:2376/tcp" \ + dockerd-rootless.sh \ + -H tcp://0.0.0.0:2376 \ + --tlsverify --tlscacert=ca.pem --tlscert=cert.pem --tlskey=key.pem +</pre></div> <h3 id="expose-docker-api-socket-through-ssh">Expose Docker API socket through SSH</h3> <p>To expose the Docker API socket through SSH, you need to make sure <code class="language-plaintext highlighter-rouge">$DOCKER_HOST</code> is set on the remote host.</p> <div class="highlight"><pre class="highlight" data-language="">$ ssh -l <REMOTEUSER> <REMOTEHOST> 'echo $DOCKER_HOST' +unix:///run/user/1001/docker.sock +$ docker -H ssh://<REMOTEUSER>@<REMOTEHOST> run ... 
+</pre></div> <h3 id="routing-ping-packets">Routing ping packets</h3> <p>On some distributions, <code class="language-plaintext highlighter-rouge">ping</code> does not work by default.</p> <p>Add <code class="language-plaintext highlighter-rouge">net.ipv4.ping_group_range = 0 2147483647</code> to <code class="language-plaintext highlighter-rouge">/etc/sysctl.conf</code> (or <code class="language-plaintext highlighter-rouge">/etc/sysctl.d</code>) and run <code class="language-plaintext highlighter-rouge">sudo sysctl --system</code> to allow using <code class="language-plaintext highlighter-rouge">ping</code>.</p> <h3 id="exposing-privileged-ports">Exposing privileged ports</h3> <p>To expose privileged ports (< 1024), set <code class="language-plaintext highlighter-rouge">CAP_NET_BIND_SERVICE</code> on <code class="language-plaintext highlighter-rouge">rootlesskit</code> binary and restart the daemon.</p> <div class="highlight"><pre class="highlight" data-language="">$ sudo setcap cap_net_bind_service=ep $(which rootlesskit) +$ systemctl --user restart docker +</pre></div> <p>Or add <code class="language-plaintext highlighter-rouge">net.ipv4.ip_unprivileged_port_start=0</code> to <code class="language-plaintext highlighter-rouge">/etc/sysctl.conf</code> (or <code class="language-plaintext highlighter-rouge">/etc/sysctl.d</code>) and run <code class="language-plaintext highlighter-rouge">sudo sysctl --system</code>.</p> <h3 id="limiting-resources">Limiting resources</h3> <p>Limiting resources with cgroup-related <code class="language-plaintext highlighter-rouge">docker run</code> flags such as <code class="language-plaintext highlighter-rouge">--cpus</code>, <code class="language-plaintext highlighter-rouge">--memory</code>, <code class="language-plaintext highlighter-rouge">--pids-limit</code> is supported only when running with cgroup v2 and systemd. 
See <a href="https://docs.docker.com/config/containers/runmetrics/">Changing cgroup version</a> to enable cgroup v2.</p> <p>If <code class="language-plaintext highlighter-rouge">docker info</code> shows <code class="language-plaintext highlighter-rouge">none</code> as <code class="language-plaintext highlighter-rouge">Cgroup Driver</code>, the conditions are not satisfied. When these conditions are not satisfied, rootless mode ignores the cgroup-related <code class="language-plaintext highlighter-rouge">docker run</code> flags. See <a href="#limiting-resources-without-cgroup">Limiting resources without cgroup</a> for workarounds.</p> <p>If <code class="language-plaintext highlighter-rouge">docker info</code> shows <code class="language-plaintext highlighter-rouge">systemd</code> as <code class="language-plaintext highlighter-rouge">Cgroup Driver</code>, the conditions are satisfied. However, typically, only <code class="language-plaintext highlighter-rouge">memory</code> and <code class="language-plaintext highlighter-rouge">pids</code> controllers are delegated to non-root users by default.</p> <div class="highlight"><pre class="highlight" data-language="">$ cat /sys/fs/cgroup/user.slice/user-$(id -u).slice/user@$(id -u).service/cgroup.controllers +memory pids +</pre></div> <p>To allow delegation of all controllers, you need to change the systemd configuration as follows:</p> <div class="highlight"><pre class="highlight" data-language=""># mkdir -p /etc/systemd/system/user@.service.d +# cat > /etc/systemd/system/user@.service.d/delegate.conf << EOF +[Service] +Delegate=cpu cpuset io memory pids +EOF +# systemctl daemon-reload +</pre></div> <blockquote> <p><strong>Note</strong></p> <p>Delegating <code class="language-plaintext highlighter-rouge">cpuset</code> requires systemd 244 or later.</p> </blockquote> <h4 id="limiting-resources-without-cgroup">Limiting resources without cgroup</h4> <p>Even when cgroup is not available, you can still use the traditional <code 
class="language-plaintext highlighter-rouge">ulimit</code> and <a href="https://github.com/opsengine/cpulimit"><code class="language-plaintext highlighter-rouge">cpulimit</code></a>, though they work at process granularity rather than container granularity, and can be arbitrarily disabled by the container process.</p> <p>For example:</p> <ul> <li>To limit CPU usage to 0.5 cores (similar to <code class="language-plaintext highlighter-rouge">docker run --cpus 0.5</code>): <code class="language-plaintext highlighter-rouge">docker run <IMAGE> cpulimit --limit=50 --include-children <COMMAND></code> +</li> <li> <p>To limit max VSZ to 64MiB (similar to <code class="language-plaintext highlighter-rouge">docker run --memory 64m</code>): <code class="language-plaintext highlighter-rouge">docker run <IMAGE> sh -c "ulimit -v 65536; <COMMAND>"</code></p> </li> <li>To limit max number of processes to 100 per namespaced UID 2000 (similar to <code class="language-plaintext highlighter-rouge">docker run --pids-limit=100</code>): <code class="language-plaintext highlighter-rouge">docker run --user 2000 --ulimit nproc=100 <IMAGE> <COMMAND></code> +</li> </ul> <h2 id="troubleshooting">Troubleshooting</h2> <h3 id="errors-when-starting-the-docker-daemon">Errors when starting the Docker daemon</h3> <p><strong>[rootlesskit:parent] error: failed to start the child: fork/exec /proc/self/exe: operation not permitted</strong></p> <p>This error occurs mostly when the value of <code class="language-plaintext highlighter-rouge">/proc/sys/kernel/unprivileged_userns_clone</code> is set to 0:</p> <div class="highlight"><pre class="highlight" data-language="">$ cat /proc/sys/kernel/unprivileged_userns_clone +0 +</pre></div> <p>To fix this issue, add <code class="language-plaintext highlighter-rouge">kernel.unprivileged_userns_clone=1</code> to <code class="language-plaintext highlighter-rouge">/etc/sysctl.conf</code> (or <code class="language-plaintext highlighter-rouge">/etc/sysctl.d</code>) and 
run <code class="language-plaintext highlighter-rouge">sudo sysctl --system</code>.</p> <p><strong>[rootlesskit:parent] error: failed to start the child: fork/exec /proc/self/exe: no space left on device</strong></p> <p>This error occurs mostly when the value of <code class="language-plaintext highlighter-rouge">/proc/sys/user/max_user_namespaces</code> is too small:</p> <div class="highlight"><pre class="highlight" data-language="">$ cat /proc/sys/user/max_user_namespaces +0 +</pre></div> <p>To fix this issue, add <code class="language-plaintext highlighter-rouge">user.max_user_namespaces=28633</code> to <code class="language-plaintext highlighter-rouge">/etc/sysctl.conf</code> (or <code class="language-plaintext highlighter-rouge">/etc/sysctl.d</code>) and run <code class="language-plaintext highlighter-rouge">sudo sysctl --system</code>.</p> <p><strong>[rootlesskit:parent] error: failed to setup UID/GID map: failed to compute uid/gid map: No subuid ranges found for user 1001 (“testuser”)</strong></p> <p>This error occurs when <code class="language-plaintext highlighter-rouge">/etc/subuid</code> and <code class="language-plaintext highlighter-rouge">/etc/subgid</code> are not configured. See <a href="#prerequisites">Prerequisites</a>.</p> <p><strong>could not get XDG_RUNTIME_DIR</strong></p> <p>This error occurs when <code class="language-plaintext highlighter-rouge">$XDG_RUNTIME_DIR</code> is not set.</p> <p>On a non-systemd host, you need to create a directory and then set the path:</p> <div class="highlight"><pre class="highlight" data-language="">$ export XDG_RUNTIME_DIR=$HOME/.docker/xrd +$ rm -rf $XDG_RUNTIME_DIR +$ mkdir -p $XDG_RUNTIME_DIR +$ dockerd-rootless.sh +</pre></div> <blockquote> <p><strong>Note</strong>: You must remove the directory every time you log out.</p> </blockquote> <p>On a systemd host, log into the host using <code class="language-plaintext highlighter-rouge">pam_systemd</code> (see below). 
The value is automatically set to <code class="language-plaintext highlighter-rouge">/run/user/$UID</code> and cleaned up on every logout.</p> <p><strong><code class="language-plaintext highlighter-rouge">systemctl --user</code> fails with “Failed to connect to bus: No such file or directory”</strong></p> <p>This error occurs mostly when you switch from the root user to a non-root user with <code class="language-plaintext highlighter-rouge">sudo</code>:</p> <div class="highlight"><pre class="highlight" data-language=""># sudo -iu testuser +$ systemctl --user start docker +Failed to connect to bus: No such file or directory +</pre></div> <p>Instead of <code class="language-plaintext highlighter-rouge">sudo -iu <USERNAME></code>, you need to log in using <code class="language-plaintext highlighter-rouge">pam_systemd</code>. For example:</p> <ul> <li>Log in through the graphical console</li> <li><code class="language-plaintext highlighter-rouge">ssh <USERNAME>@localhost</code></li> <li><code class="language-plaintext highlighter-rouge">machinectl shell <USERNAME>@</code></li> </ul> <p><strong>The daemon does not start up automatically</strong></p> <p>You need to run <code class="language-plaintext highlighter-rouge">sudo loginctl enable-linger $(whoami)</code> to enable the daemon to start up automatically. See <a href="#usage">Usage</a>.</p> <p><strong>iptables failed: iptables -t nat -N DOCKER: Fatal: can’t open lock file /run/xtables.lock: Permission denied</strong></p> <p>This error may happen with an older version of Docker when SELinux is enabled on the host.</p> <p>The issue has been fixed in Docker 20.10.8. 
A known workaround for older versions of Docker is to run the following commands to disable SELinux for <code class="language-plaintext highlighter-rouge">iptables</code>:</p> <div class="highlight"><pre class="highlight" data-language="">$ sudo dnf install -y policycoreutils-python-utils && sudo semanage permissive -a iptables_t +</pre></div> <h3 id="docker-pull-errors"> +<code class="language-plaintext highlighter-rouge">docker pull</code> errors</h3> <p><strong>docker: failed to register layer: Error processing tar file(exit status 1): lchown <FILE>: invalid argument</strong></p> <p>This error occurs when the number of available entries in <code class="language-plaintext highlighter-rouge">/etc/subuid</code> or <code class="language-plaintext highlighter-rouge">/etc/subgid</code> is not sufficient. The number of entries required varies across images. However, 65,536 entries are sufficient for most images. See <a href="#prerequisites">Prerequisites</a>.</p> <p><strong>docker: failed to register layer: ApplyLayer exit status 1 stdout: stderr: lchown <FILE>: operation not permitted</strong></p> <p>This error occurs mostly when <code class="language-plaintext highlighter-rouge">~/.local/share/docker</code> is located on NFS.</p> <p>A workaround is to specify a non-NFS <code class="language-plaintext highlighter-rouge">data-root</code> directory in <code class="language-plaintext highlighter-rouge">~/.config/docker/daemon.json</code> as follows:</p> <div class="highlight"><pre class="highlight" data-language="">{"data-root":"/somewhere-out-of-nfs"} +</pre></div> <h3 id="docker-run-errors"> +<code class="language-plaintext highlighter-rouge">docker run</code> errors</h3> <p><strong>docker: Error response from daemon: OCI runtime create failed: ...: read unix @->/run/systemd/private: read: connection reset by peer: unknown.</strong></p> <p>This error occurs on cgroup v2 hosts mostly when the dbus daemon is not running for the user.</p> <div class="highlight"><pre 
class="highlight" data-language="">$ systemctl --user is-active dbus +inactive + +$ docker run hello-world +docker: Error response from daemon: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:385: applying cgroup configuration for process caused: error while starting unit "docker +-931c15729b5a968ce803784d04c7421f791d87e5ca1891f34387bb9f694c488e.scope" with properties [{Name:Description Value:"libcontainer container 931c15729b5a968ce803784d04c7421f791d87e5ca1891f34387bb9f694c488e"} {Name:Slice Value:"use +r.slice"} {Name:PIDs Value:@au [4529]} {Name:Delegate Value:true} {Name:MemoryAccounting Value:true} {Name:CPUAccounting Value:true} {Name:IOAccounting Value:true} {Name:TasksAccounting Value:true} {Name:DefaultDependencies Val +ue:false}]: read unix @->/run/systemd/private: read: connection reset by peer: unknown. +</pre></div> <p>To fix the issue, run <code class="language-plaintext highlighter-rouge">sudo apt-get install -y dbus-user-session</code> or <code class="language-plaintext highlighter-rouge">sudo dnf install -y dbus-daemon</code>, and then log in again.</p> <p>If the error still occurs, try running <code class="language-plaintext highlighter-rouge">systemctl --user enable --now dbus</code> (without sudo).</p> <p><strong><code class="language-plaintext highlighter-rouge">--cpus</code>, <code class="language-plaintext highlighter-rouge">--memory</code>, and <code class="language-plaintext highlighter-rouge">--pids-limit</code> are ignored</strong></p> <p>This is expected behavior in cgroup v1 mode. To use these flags, the host must be configured to use cgroup v2. 
For more information, see <a href="#limiting-resources">Limiting resources</a>.</p> <h3 id="networking-errors">Networking errors</h3> <p><strong><code class="language-plaintext highlighter-rouge">docker run -p</code> fails with <code class="language-plaintext highlighter-rouge">cannot expose privileged port</code></strong></p> <p><code class="language-plaintext highlighter-rouge">docker run -p</code> fails with this error when a privileged port (< 1024) is specified as the host port.</p> <div class="highlight"><pre class="highlight" data-language="">$ docker run -p 80:80 nginx:alpine +docker: Error response from daemon: driver failed programming external connectivity on endpoint focused_swanson (9e2e139a9d8fc92b37c36edfa6214a6e986fa2028c0cc359812f685173fa6df7): Error starting userland proxy: error while calling PortManager.AddPort(): cannot expose privileged port 80, you might need to add "net.ipv4.ip_unprivileged_port_start=0" (currently 1024) to /etc/sysctl.conf, or set CAP_NET_BIND_SERVICE on rootlesskit binary, or choose a larger port number (>= 1024): listen tcp 0.0.0.0:80: bind: permission denied. +</pre></div> <p>When you experience this error, consider using an unprivileged port instead. 
For example, 8080 instead of 80.</p> <div class="highlight"><pre class="highlight" data-language="">$ docker run -p 8080:80 nginx:alpine +</pre></div> <p>To allow exposing privileged ports, see <a href="#exposing-privileged-ports">Exposing privileged ports</a>.</p> <p><strong>ping doesn’t work</strong></p> <p>Ping does not work when <code class="language-plaintext highlighter-rouge">/proc/sys/net/ipv4/ping_group_range</code> is set to <code class="language-plaintext highlighter-rouge">1 0</code>:</p> <div class="highlight"><pre class="highlight" data-language="">$ cat /proc/sys/net/ipv4/ping_group_range +1 0 +</pre></div> <p>For details, see <a href="#routing-ping-packets">Routing ping packets</a>.</p> <p><strong><code class="language-plaintext highlighter-rouge">IPAddress</code> shown in <code class="language-plaintext highlighter-rouge">docker inspect</code> is unreachable</strong></p> <p>This is expected behavior, as the daemon is namespaced inside RootlessKit’s network namespace. Use <code class="language-plaintext highlighter-rouge">docker run -p</code> instead.</p> <p><strong><code class="language-plaintext highlighter-rouge">--net=host</code> doesn’t listen on ports in the host network namespace</strong></p> <p>This is expected behavior, as the daemon is namespaced inside RootlessKit’s network namespace. Use <code class="language-plaintext highlighter-rouge">docker run -p</code> instead.</p> <p><strong>Network is slow</strong></p> <p>Docker with rootless mode uses <a href="https://github.com/rootless-containers/slirp4netns">slirp4netns</a> as the default network stack if slirp4netns v0.4.0 or later is installed. If slirp4netns is not installed, Docker falls back to <a href="https://github.com/moby/vpnkit">VPNKit</a>.</p> <p>Installing slirp4netns may improve network throughput. 
See <a href="https://github.com/rootless-containers/rootlesskit/tree/v0.13.0#network-drivers">RootlessKit documentation</a> for the benchmark result.</p> <p>Changing the MTU value may also improve the throughput. The MTU value can be specified by creating <code class="language-plaintext highlighter-rouge">~/.config/systemd/user/docker.service.d/override.conf</code> with the following content:</p> <div class="highlight"><pre class="highlight" data-language="">[Service] +Environment="DOCKERD_ROOTLESS_ROOTLESSKIT_MTU=<INTEGER>" +</pre></div> <p>And then restart the daemon:</p> <div class="highlight"><pre class="highlight" data-language="">$ systemctl --user daemon-reload +$ systemctl --user restart docker +</pre></div> <p><strong><code class="language-plaintext highlighter-rouge">docker run -p</code> does not propagate source IP addresses</strong></p> <p>This is because Docker with rootless mode uses RootlessKit’s builtin port driver by default.</p> <p>The source IP addresses can be propagated by creating <code class="language-plaintext highlighter-rouge">~/.config/systemd/user/docker.service.d/override.conf</code> with the following content:</p> <div class="highlight"><pre class="highlight" data-language="">[Service] +Environment="DOCKERD_ROOTLESS_ROOTLESSKIT_PORT_DRIVER=slirp4netns" +</pre></div> <p>And then restart the daemon:</p> <div class="highlight"><pre class="highlight" data-language="">$ systemctl --user daemon-reload +$ systemctl --user restart docker +</pre></div> <p>Note that this configuration decreases throughput. 
See <a href="https://github.com/rootless-containers/rootlesskit/tree/v0.13.0#port-drivers">RootlessKit documentation</a> for the benchmark result.</p> <h3 id="tips-for-debugging">Tips for debugging</h3> <p><strong>Entering into <code class="language-plaintext highlighter-rouge">dockerd</code> namespaces</strong></p> <p>The <code class="language-plaintext highlighter-rouge">dockerd-rootless.sh</code> script executes <code class="language-plaintext highlighter-rouge">dockerd</code> in its own user, mount, and network namespaces.</p> <p>For debugging, you can enter the namespaces by running <code class="language-plaintext highlighter-rouge">nsenter -U --preserve-credentials -n -m -t $(cat $XDG_RUNTIME_DIR/docker.pid)</code>.</p> +<p><a href="https://docs.docker.com/search/?q=security">security</a>, <a href="https://docs.docker.com/search/?q=namespaces">namespaces</a>, <a href="https://docs.docker.com/search/?q=rootless">rootless</a></p> +<div class="_attribution"> + <p class="_attribution-p"> + © 2019 Docker, Inc.<br>Licensed under the Apache License, Version 2.0.<br>Docker and the Docker logo are trademarks or registered trademarks of Docker, Inc. in the United States and/or other countries.<br>Docker, Inc. and other parties may also have trademark rights in other terms used herein.<br> + <a href="https://docs.docker.com/engine/security/rootless/" class="_attribution-link">https://docs.docker.com/engine/security/rootless/</a> + </p> +</div> |
