From fd1239c4e93719fdba3d96bc1e2ce3bd37716014 Mon Sep 17 00:00:00 2001 From: aevesdocker Date: Wed, 20 Nov 2024 13:44:57 +0000 Subject: [PATCH 1/3] ENGDOCS-2324 --- _vale/Docker/Acronyms.yml | 3 + _vale/config/vocabularies/Docker/accept.txt | 11 +++ .../enhanced-container-isolation/_index.md | 73 ++++++------------- .../enhanced-container-isolation/config.md | 70 ++++++++---------- .../enhanced-container-isolation/faq.md | 18 ++--- .../features-benefits.md | 73 +++++++++---------- .../how-eci-works.md | 24 +++--- .../limitations.md | 43 ++++------- 8 files changed, 134 insertions(+), 181 deletions(-) diff --git a/_vale/Docker/Acronyms.yml b/_vale/Docker/Acronyms.yml index 9aa3cbb47f3..a2a89350e10 100644 --- a/_vale/Docker/Acronyms.yml +++ b/_vale/Docker/Acronyms.yml @@ -15,6 +15,7 @@ exceptions: - AUFS - AWS - BIOS + - BPF - CI - CISA - CLI @@ -52,6 +53,7 @@ exceptions: - HTTP - HTTPS - IAM + - ID - IDE - IP - JAR @@ -75,6 +77,7 @@ exceptions: - PATH - PDF - PEM + - PID - PHP - POSIX - POST diff --git a/_vale/config/vocabularies/Docker/accept.txt b/_vale/config/vocabularies/Docker/accept.txt index 690a9302196..a29a868073f 100644 --- a/_vale/config/vocabularies/Docker/accept.txt +++ b/_vale/config/vocabularies/Docker/accept.txt @@ -5,6 +5,7 @@ Apple Artifactory Autotest Azure +Berkely Btrfs BuildKit BusyBox @@ -21,6 +22,7 @@ Datadog Ddosify Debootstrap Dev Environments? +Dev Django Docker Build Cloud Docker Business @@ -71,23 +73,27 @@ Nuxeo OAuth OTel Okta +Paketo PKG Postgres PowerShell Python S3 +Seccomp SQLite Slack Snyk Solr SonarQube Syft +Sysbox Sysdig Testcontainers Traefik Ubuntu Unix VMware +VM's Wasm Windows WireMock @@ -95,6 +101,7 @@ Zscaler Zsh [Aa]utobuild [Bb]uildx +[Bb]uildpack(s)? [Cc]odenames? [Cc]ompose [Dd]istroless @@ -109,6 +116,7 @@ Zsh [Nn]amespace [Oo]nboarding [Pp]aravirtualization +[Pp]rocfs [Pp]roxied [Pp]roxying [Rr]eal-time @@ -116,6 +124,7 @@ Zsh [Ss]andbox(ed)? [Ss]wappable [Ss]warm +[Ss]ysfs [Tt]oolchains? 
[Vv]irtiofs [Vv]irtualize @@ -145,10 +154,12 @@ monorepos? musl nameserver namespace +namespacing npm osquery osxfs pgAdmin +rootful runc snapshotters? stdin diff --git a/content/manuals/security/for-admins/hardened-desktop/enhanced-container-isolation/_index.md b/content/manuals/security/for-admins/hardened-desktop/enhanced-container-isolation/_index.md index a8228d4a6a3..1bfa23d69c6 100644 --- a/content/manuals/security/for-admins/hardened-desktop/enhanced-container-isolation/_index.md +++ b/content/manuals/security/for-admins/hardened-desktop/enhanced-container-isolation/_index.md @@ -13,49 +13,41 @@ weight: 20 > > Enhanced Container Isolation is available to Docker Business customers only. -Enhanced Container Isolation provides an additional layer of security to prevent malicious workloads running in containers from compromising Docker Desktop or the host. +Enhanced Container Isolation (ECI) provides an additional layer of security to prevent malicious workloads running in containers from compromising Docker Desktop or the host. -It uses a variety of advanced techniques to harden container isolation, but without impacting developer productivity. It is available with [Docker Desktop 4.13.0 and later](/manuals/desktop/release-notes.md). +It uses a variety of advanced techniques to harden container isolation, but without impacting developer productivity. -These techniques include: -- Running all containers unprivileged through the Linux user-namespace, even those launched with the `--privileged` flag. This makes it harder for malicious container workloads to escape the container and infect the Docker Desktop VM and host. -- Ensuring Docker Desktop VM immutability (e.g., its internal settings can't be modified by containers or users). -- Vetting some critical system calls to prevent container escapes, and partially virtualizing portions of `/proc` and `/sys` inside the container for further isolation. -- Preventing user console access to the Docker Desktop VM. 
- -When Enhanced Container Isolation is enabled, these mechanisms are applied automatically and with minimal functional or performance impact to developers. Developers continue to use Docker Desktop as usual, but the containers they launch are more strongly isolated. - -Enhanced Container Isolation ensures stronger container isolation and also locks in any security configurations that have been created by IT admins, for instance through [Registry Access Management policies](/manuals/security/for-admins/hardened-desktop/registry-access-management.md) or with [Settings Management](../settings-management/_index.md). +Enhanced Container Isolation ensures stronger container isolation and also locks in any security configurations that have been created by administrators, for instance through [Registry Access Management policies](/manuals/security/for-admins/hardened-desktop/registry-access-management.md) or with [Settings Management](../settings-management/_index.md). > [!NOTE] > -> Enhanced Container Isolation is in addition to other container security techniques used by Docker. For example, reduced Linux Capabilities, Seccomp, AppArmor. +> ECI is in addition to other container security techniques used by Docker. For example, reduced Linux Capabilities, Seccomp, AppArmor. -### Who is it for? +## Who is it for? - For organizations and developers that want to prevent container attacks and reduce vulnerabilities in developer environments. - For organizations that want to ensure stronger container isolation that is easy and intuitive to implement on developers' machines. -### What happens when Enhanced Container Isolation is turned on? +## What happens when Enhanced Container Isolation is turned on? 
-When Enhanced Container Isolation is turned on, the following features are enabled: +When Enhanced Container Isolation is turned on, the following features and security techniques are enabled: -- All user containers are automatically run in Linux User Namespaces which ensures stronger isolation. Each container runs in a dedicated Linux user-namespace. +- All user containers are automatically run in Linux user namespaces which ensures stronger isolation. Each container runs in a dedicated Linux user-namespace. - The root user in the container maps to an unprivileged user inside the Docker Desktop Linux VM. -- Containers become harder to breach. For example, sensitive system calls are vetted and portions of `/proc` and `/sys` are emulated. +- Containers become harder to breach. For example, sensitive system calls are vetted and portions of `/proc` and `/sys` are emulated inside the container. - Users can continue using containers as usual, including bind mounting host directories, volumes, etc. - No change in the way developers run containers, and no special container images are required. -- Privileged containers (e.g., `--privileged` flag) work, but they are only privileged within the container's Linux User Namespace, not in the Docker Desktop VM. Therefore they can't be used to breach the Docker Desktop VM. +- Privileged containers (e.g., `--privileged` flag) work, but they are only privileged within the container's Linux user namespace, not in the Docker Desktop VM. Therefore they can't be used to breach the Docker Desktop VM. - Docker-in-Docker and even Kubernetes-in-Docker works, but run unprivileged inside the Docker Desktop Linux VM. In addition, the following restrictions are imposed: - Containers can no longer share namespaces with the Docker Desktop VM (e.g., `--network=host`, `--pid=host` are disallowed). - Containers can no longer modify configuration files inside the Docker Desktop VM (e.g., mounting any VM directory into the container is disallowed). 
-- Containers can no longer access the Docker engine (e.g., mounting the Docker engine's socket into the container is restricted); this prevents malicious containers from gaining control of the Docker engine. Admins can relax this for [trusted container images](config.md). +- Containers can no longer access the Docker Engine. For example, mounting the Docker Engine's socket into the container is restricted which prevents malicious containers from gaining control of the Docker Engine. Administrators can relax this for [trusted container images](config.md). - Console access to the Docker Desktop VM is forbidden for all users. -These features and restrictions ensure that containers are better secured at runtime, with minimal impact to developer experience and productivity. +These features and restrictions ensure that containers are better secured at runtime, with minimal impact to developer experience and productivity. Developers can continue to use Docker Desktop as usual, but the containers they launch are more strongly isolated. For more information on how Enhanced Container Isolation work, see [How does it work](how-eci-works.md). @@ -65,20 +57,9 @@ For more information on how Enhanced Container Isolation work, see [How does it > Kubernetes pods and Extension containers. For more information on known > limitations and workarounds, see [FAQs](faq.md). -### What host OSes / platforms is Enhanced Container Isolation supported on? - -Enhanced Container Isolation (ECI) was introduced in Docker Desktop 4.13, for all platforms (Windows, Mac, and Linux). - -For Windows hosts, ECI works with both the Docker Desktop Hyper-V and WSL 2 backends, as follows: +## How do I enable Enhanced Container Isolation? -- Docker Desktop 4.19 or prior: ECI only works with Hyper-V. -- Docker Desktop 4.20 or later: ECI Works with both Hyper-V and WSL 2 (with WSL version 1.1.3.0 and above). 
- -See [ECI Support for WSL](limitations.md#eci-support-for-wsl) for further info as well as security caveats when using Enhanced Container Isolation on WSL 2. - -### How do I enable Enhanced Container Isolation? - -#### As a developer +### As a developer To enable Enhanced Container Isolation as a developer: 1. Ensure your organization has a Docker Business subscription. @@ -92,19 +73,13 @@ To enable Enhanced Container Isolation as a developer: > > Enhanced Container Isolation does not protect containers created prior to enabling ECI. For more information on known limitations and workarounds, see [FAQs](faq.md). -#### As an admin - -##### Prerequisite +### As an administrator -To enable Enhanced Container Isolation as an admin, you first need to [enforce -sign-in](/manuals/security/for-admins/enforce-sign-in/_index.md). This is -because the Enhanced Container Isolation feature requires a Docker Business -subscription and therefore your Docker Desktop users must authenticate to your -organization for this configuration to take effect. +#### Prerequisite -Enforcing sign-in ensures that your Docker Desktop developers always authenticate to your organization. +You first need to [enforce sign-in](/manuals/security/for-admins/enforce-sign-in/_index.md) to ensure that all Docker Desktop developers authenticate with your organization. Since Settings Management requires a Docker Business subscription, enforced sign-in guarantees that only authenticated users have access and that the feature consistently takes effect across all users, even though it may still work without enforced sign-in. -##### Setup +#### Setup [Create and configure the `admin-settings.json` file](../settings-management/configure.md) and specify: @@ -118,13 +93,13 @@ Enforcing sign-in ensures that your Docker Desktop developers always authenticat } ``` -By setting `"value": true`, the admin ensures ECI is enabled by default. 
By -setting `"locked": true`, the admin ensures ECI can't be disabled by -developers. If you wish to give developers the ability to disable the feature, +Setting `"value": true` ensures ECI is enabled by default. By +setting `"locked": true`, ECI can't be disabled by +developers. If you want to give developers the ability to disable the feature, set `"locked": false`. -In addition, starting with Docker Desktop 4.27, admins can also configure Docker -socket mount permissions for containers, as described [here](config.md). +In addition, you can also [configure Docker +socket mount permissions for containers](config.md). For this to take effect: @@ -135,7 +110,7 @@ For this to take effect: > > Selecting **Restart** from the Docker menu isn't enough as it only restarts some components of Docker Desktop. -### What do users see when this setting is enforced by an admin? +## What do users see when this setting is enforced by an administrator? When Enhanced Container Isolation is enabled, users see: - **Use Enhanced Container Isolation** toggled on in **Settings** > **General**. diff --git a/content/manuals/security/for-admins/hardened-desktop/enhanced-container-isolation/config.md b/content/manuals/security/for-admins/hardened-desktop/enhanced-container-isolation/config.md index dee207690e1..87b1075d660 100644 --- a/content/manuals/security/for-admins/hardened-desktop/enhanced-container-isolation/config.md +++ b/content/manuals/security/for-admins/hardened-desktop/enhanced-container-isolation/config.md @@ -8,16 +8,9 @@ aliases: weight: 30 --- -> [!NOTE] -> -> This feature is available with Docker Desktop version 4.27 (and later) on Mac, Linux, and Windows (Hyper-V). -> For Windows with WSL 2, this feature requires Docker Desktop 4.28 and later. - -This page describes optional, advanced configurations for ECI, once ECI is enabled. 
- ## Docker socket mount permissions -By default, when ECI is enabled, Docker Desktop does not allow bind-mounting the +By default, when Enhanced Container Isolation (ECI) is enabled, Docker Desktop does not allow bind-mounting the Docker Engine socket into containers: ```console @@ -25,8 +18,8 @@ $ docker run -it --rm -v /var/run/docker.sock:/var/run/docker.sock docker:cli docker: Error response from daemon: enhanced container isolation: docker socket mount denied for container with image "docker.io/library/docker"; image is not in the allowed list; if you wish to allow it, configure the docker socket image list in the Docker Desktop settings. ``` This prevents malicious containers from gaining access to the Docker Engine, as -such access could allow them to perform supply chain attacks (e.g., build and -push malicious images into the organization's repositories) or similar. +such access could allow them to perform supply chain attacks. For example, build and +push malicious images into the organization's repositories or similar. However, some legitimate use cases require containers to have access to the Docker Engine socket. For example, the popular [Testcontainers](https://testcontainers.com/) @@ -35,11 +28,11 @@ manage them or perform post-test cleanup. Similarly, some Buildpack frameworks, for example [Paketo](https://paketo.io/), require Docker socket bind-mounts into containers. -Starting with Docker Desktop 4.27, admins can optionally configure ECI to allow +Administrators can optionally configure ECI to allow bind mounting the Docker Engine socket into containers, but in a controlled way. This can be done via the Docker Socket mount permissions section in the -[admin-settings.json](../settings-management/configure.md) file. For example: +[`admin-settings.json`](../settings-management/configure.md) file. 
For example: ```json { @@ -64,15 +57,14 @@ This can be done via the Docker Socket mount permissions section in the } ``` -As shown above, there are two configurations for bind-mounting the Docker -socket into containers: the `imageList` and the `commandList`. These are -described below. +There are two configurations for bind-mounting the Docker +socket into containers: the `imageList` and the `commandList`. ### Image list The `imageList` is a list of container images that are allowed to bind-mount the -Docker socket. By default the list is empty (i.e., no containers are allowed to -bind-mount the Docker socket when ECI is enabled). However, an admin can add +Docker socket. By default the list is empty, so no containers are allowed to +bind-mount the Docker socket when ECI is enabled. However, an administrator can add images to the list, using either of these formats: | Image Reference Format | Description | @@ -83,7 +75,7 @@ images to the list, using either of these formats: The image name follows the standard convention, so it can point to any registry and repository. -In the example above, the image list was configured with three images: +In the previous example, the image list was configured with three images: ```json "imageList": { @@ -107,11 +99,10 @@ $ docker run -it -v /var/run/docker.sock:/var/run/docker.sock docker:cli sh > [!TIP] > -> Be restrictive on the images you allow, as described in [Recommendations](#recommendations) below. +> Be restrictive with the images you allow, as described in [Recommendations](#recommendations). -In general, it's easier to specify the image using the tag wildcard format -(e.g., `:*`) because then `imageList` doesn't need to be updated whenever a new version of the -image is used. 
Alternatively, you can use an immutable tag (e.g., `:latest`), +In general, it's easier to specify the image using the tag wildcard format, for example `:*`, because then `imageList` doesn't need to be updated whenever a new version of the +image is used. Alternatively, you can use a specific tag, for example `:latest`, but it does not always work as well as the wildcard because, for example, Testcontainers uses specific versions of the image, not necessarily the latest one. @@ -122,8 +113,7 @@ memory. Then, when a container is started with a Docker socket bind-mount, Docker Desktop checks if the container's image digest matches one of the allowed digests. If so, the container is allowed to start, otherwise it's blocked. -Note that due to the digest comparison mentioned in the prior paragraph, it's -not possible to bypass the Docker socket mount permissions by retagging a +Due to the digest comparison, it's not possible to bypass the Docker socket mount permissions by re-tagging a disallowed image to the name of an allowed one. In other words, if a user does: @@ -139,11 +129,9 @@ ones in the repository. ### Docker Socket Mount Permissions for derived images -> [!NOTE] -> -> This feature is available with Docker Desktop version 4.34 and later. +{{ introduced desktop 4.34.0 "../../../../desktop/release-notes.md#4340" }} -As described in the prior section, admins can configure the list of container +As described in the prior section, administrators can configure the list of container images that are allowed to mount the Docker socket via the `imageList`. 
This works for most scenarios, but not always, because it requires knowing upfront @@ -160,7 +148,7 @@ also apply to any local images derived (i.e., built from) an image in the That is, if a local image called "myLocalImage" is built from "myBaseImage" (i.e., has a Dockerfile with a `FROM myBaseImage`), then if "myBaseImage" is in the `imageList`, both "myBaseImage" and "myLocalImage" are allowed to mount the -Docker socket (i.e., ECI won't block the mount). +Docker socket. For example, to enable Paketo buildpacks to work with Docker Desktop and ECI, simply add the following image to the `imageList`: @@ -188,7 +176,7 @@ A couple of caveats: * The `allowDerivedImages` setting only applies to local-only images built from an allowed image. That is, the derived image must not be present in a remote - repository (because if it were, you would just list it's name in the `imageList`). + repository because if it were, you would just list its name in the `imageList`. * For derived image checking to work, the parent image (i.e., the image in the `imageList`) must be present locally (i.e., must have been explicitly pulled @@ -335,17 +323,17 @@ Whether to configure the list as an allow or deny list depends on the use case. 
| Unsupported command | Description | | :------------------- | :---------- | -| compose | Docker compose | -| dev | Docker dev environments | -| extension | Manages Docker extensions | -| feedback | Send feedback to Docker | -| init | Creates Docker-related starter files | -| manifest | Manages Docker image manifests | -| plugins | Manages plugins | -| sbom | View Software Bill of Materials (SBOM) | -| scan | Docker Scan | -| scout | Docker Scout | -| trust | Manage trust on Docker images | +| `compose` | Docker Compose | +| `dev` | Dev environments | +| `extension` | Manages Docker Extensions | +| `feedback` | Send feedback to Docker | +| `init` | Creates Docker-related starter files | +| `manifest` | Manages Docker image manifests | +| `plugins` | Manages plugins | +| `sbom` | View Software Bill of Materials (SBOM) | +| `scan` | Docker Scan | +| `scout` | Docker Scout | +| `trust` | Manage trust on Docker images | > [!NOTE] > diff --git a/content/manuals/security/for-admins/hardened-desktop/enhanced-container-isolation/faq.md b/content/manuals/security/for-admins/hardened-desktop/enhanced-container-isolation/faq.md index dde563e71b8..a2f7048eee3 100644 --- a/content/manuals/security/for-admins/hardened-desktop/enhanced-container-isolation/faq.md +++ b/content/manuals/security/for-admins/hardened-desktop/enhanced-container-isolation/faq.md @@ -1,5 +1,5 @@ --- -title: Enhanced Container Isolation (ECI) FAQs +title: Enhanced Container Isolation FAQs linkTitle: FAQs description: Frequently asked questions for Enhanced Container Isolation keywords: enhanced container isolation, security, faq, sysbox, Docker Desktop @@ -26,12 +26,12 @@ minimum. Yes, you can use the `--privileged` flag in containers but unlike privileged containers without ECI, the container can only use it's elevated privileges to access resources assigned to the container. It can't access global kernel -resources in the Docker Desktop Linux VM. 
This allows you to run privileged +resources in the Docker Desktop Linux VM. This lets you run privileged containers securely (including Docker-in-Docker). For more information, see [Key features and benefits](features-benefits.md#privileged-containers-are-also-secured). ### Will all privileged container workloads run with ECI? -No. Privileged container workloads that wish to access global kernel resources +No. Privileged container workloads that want to access global kernel resources inside the Docker Desktop Linux VM won't work. For example, you can't use a privileged container to load a kernel module. @@ -61,7 +61,7 @@ when using [Testcontainers](https://testcontainers.com/) for local testing. To enable such use cases, it's possible to configure ECI to allow Docker socket mounts into containers, but only for your chosen (i.e,. trusted) container images, and -even restrict what commands the container can send to the Docker engine via the socket. +even restrict what commands the container can send to the Docker Engine via the socket. See [ECI Docker socket mount permissions](config.md#docker-socket-mount-permissions). ### Does ECI protect all containers launched with Docker Desktop? @@ -84,12 +84,12 @@ and [Dev Environments containers](/manuals/desktop/features/dev-environments/_in ### Does ECI protect containers launched prior to enabling ECI? -No. Containers created prior to switching on ECI are not protected. Therefore, we -recommend removing all containers prior to switching on ECI. +No. Containers created prior to switching on ECI are not protected. Therefore, it is +recommended you remove all containers prior to switching on ECI. ### Does ECI affect the performance of containers? -ECI has very little impact on the performance of +ECI has little impact on the performance of containers. 
The exception is for containers that perform lots of `mount` and `umount` system calls, as these are trapped and vetted by the Sysbox container runtime to ensure they are not being used to breach the container's filesystem. @@ -101,10 +101,10 @@ containers deployed by Docker Desktop users. If a user attempts to override the runtime (e.g., `docker run --runtime=runc`), this request is ignored and the container is created through the Sysbox runtime. -The reason `runc` is disallowed with ECI because it allows users to run as "true +The reason `runc` is disallowed is it lets users run as "true root" on the Docker Desktop Linux VM, thereby providing them with implicit control of the VM and the ability to modify the administrative configurations -for Docker Desktop, for example. +for Docker Desktop. ### How is ECI different from Docker Engine's userns-remap mode? diff --git a/content/manuals/security/for-admins/hardened-desktop/enhanced-container-isolation/features-benefits.md b/content/manuals/security/for-admins/hardened-desktop/enhanced-container-isolation/features-benefits.md index 494e2d4a745..42865c68c27 100644 --- a/content/manuals/security/for-admins/hardened-desktop/enhanced-container-isolation/features-benefits.md +++ b/content/manuals/security/for-admins/hardened-desktop/enhanced-container-isolation/features-benefits.md @@ -7,9 +7,9 @@ aliases: weight: 20 --- -### Linux User Namespace on all containers +## Linux user namespace on all containers -With Enhanced Container Isolation, all user containers leverage the [Linux user-namespace](https://man7.org/linux/man-pages/man7/user_namespaces.7.html) +With Enhanced Container Isolation, all user containers leverage the [Linux user namespace](https://man7.org/linux/man-pages/man7/user_namespaces.7.html) for extra isolation. This means that the root user in the container maps to an unprivileged user in the Docker Desktop Linux VM. 
@@ -21,7 +21,7 @@ $ docker run -it --rm --name=first alpine 0 100000 65536 ``` -The output `0 100000 65536` is the signature of the Linux user-namespace. It +The output `0 100000 65536` is the signature of the Linux user namespace. It means that the root user (0) in the container is mapped to unprivileged user 100000 in the Docker Desktop Linux VM, and the mapping extends for a continuous range of 64K user IDs. The same applies to group IDs. @@ -44,28 +44,25 @@ $ docker run -it --rm alpine 0 0 4294967295 ``` -By virtue of using the Linux user-namespace, Enhanced Container Isolation +By virtue of using the Linux user namespace, Enhanced Container Isolation ensures the container processes never run as user ID 0 (true root) in the Linux VM. In fact they never run with any valid user-ID in the Linux VM. Thus, their Linux capabilities are constrained to resources within the container only, increasing isolation significantly compared to regular containers, both container-to-host and cross-container isolation. -### Privileged containers are also secured +## Privileged containers are also secured Privileged containers `docker run --privileged ...` are insecure because they give the container full access to the Linux kernel. That is, the container runs -as true root with all capabilities enabled, seccomp and AppArmor restrictions +as true root with all capabilities enabled, Seccomp and AppArmor restrictions are disabled, all hardware devices are exposed, for example. -For organizations that wish to secure Docker Desktop on their developer's -machines, privileged containers are problematic as they allow container -workloads whether benign or malicious to gain control of the Linux kernel -inside the Docker Desktop VM and thus modify security related settings, for example registry +Organizations aiming to secure Docker Desktop on developers' machines face challenges with privileged containers. 
These containers, whether running benign or malicious workloads, can gain control of the Linux kernel within the Docker Desktop VM, potentially altering security-related settings, for example registry access management, and network proxies. With Enhanced Container Isolation, privileged containers can no longer do -this. The combination of the Linux user-namespace and other security techniques +this. The combination of the Linux user namespace and other security techniques used by Sysbox ensures that processes inside a privileged container can only access resources assigned to the container. @@ -74,13 +71,12 @@ access resources assigned to the container. > Enhanced Container Isolation does not prevent users from launching privileged > containers, but rather runs them securely by ensuring that they can only > modify resources associated with the container. Privileged workloads that -> modify global kernel settings, for example loading a kernel module or changing BPF +> modify global kernel settings, for example loading a kernel module or changing Berkeley Packet Filters (BPF) > settings will not work properly as they will receive "permission > denied" error when attempting such operations. For example, Enhanced Container Isolation ensures privileged containers can't -access Docker Desktop network settings in the Linux VM configured via Berkeley -Packet Filters (BPF): +access Docker Desktop network settings in the Linux VM configured via BPF: ```console $ docker run --privileged djs55/bpftool map show @@ -105,13 +101,13 @@ example Docker-in-Docker, Kubernetes-in-Docker, etc. With Enhanced Container Isolation you can still run such workloads but do so much more securely than before. -### Containers can't share namespaces with the Linux VM +## Containers can't share namespaces with the Linux VM When Enhanced Container Isolation is enabled, containers can't share Linux -namespaces with the host (e.g., pid, network, uts, etc.) 
as that essentially +namespaces with the host (e.g., PID, network, uts, etc.) as that essentially breaks isolation. -For example, sharing the pid namespace fails: +For example, sharing the PID namespace fails: ```console $ docker run -it --rm --pid=host alpine @@ -125,7 +121,7 @@ $ docker run -it --rm --network=host alpine docker: Error response from daemon: failed to create shim task: OCI runtime create failed: error in the container spec: invalid or unsupported container spec: sysbox containers can't share a network namespace with the host (because they use the linux user-namespace for isolation): unknown. ``` -In addition, the `--userns=host` flag, used to disable the user-namespace on the +In addition, the `--userns=host` flag, used to disable the user namespace on the container, is ignored: ```console @@ -138,7 +134,7 @@ Finally, Docker build `--network=host` and Docker buildx entitlements (`network.host`, `security.insecure`) are not allowed. Builds that require these won't work properly. -### Bind mount restrictions +## Bind mount restrictions When Enhanced Container Isolation is enabled, Docker Desktop users can continue to bind mount host directories into containers as configured via **Settings** > @@ -147,7 +143,7 @@ arbitrary Linux VM directories into containers. This prevents containers from modifying sensitive files inside the Docker Desktop Linux VM, files that can hold configurations for registry access -management, proxies, docker engine configurations, and more. +management, proxies, Docker Engine configurations, and more. For example, the following bind mount of the Docker Engine's configuration file (`/etc/docker/daemon.json` inside the Linux VM) into a container is restricted @@ -162,7 +158,7 @@ In contrast, without Enhanced Container Isolation this mount works and gives the container full read and write access to the Docker Engine's configuration. Of course, bind mounts of host files continue to work as usual. 
For example, -assuming a user configures Docker Desktop to file share her $HOME directory, +assuming a user configures Docker Desktop to file share her `$HOME` directory, she can bind mount it into the container: ```console @@ -173,16 +169,16 @@ $ docker run -it --rm -v $HOME:/mnt alpine > [!NOTE] > > By default, Enhanced Container Isolation won't allow bind mounting the Docker Engine socket -> (/var/run/docker.sock) into a container, as doing so essentially grants the +> (`/var/run/docker.sock`) into a container, as doing so essentially grants the > container control of Docker Engine, thus breaking container isolation. However, > as some legitimate use cases require this, it's possible to relax > this restriction for trusted container images. See [Docker socket mount permissions](config.md#docker-socket-mount-permissions). -### Vetting sensitive system calls +## Vetting sensitive system calls Another feature of Enhanced Container Isolation is that it intercepts and vets a few highly sensitive system calls inside containers, such as `mount` and -`umount`. This ensures that processes that have capabilities to execute these +`umount`. This ensures that processes that have capabilities to execute these system calls can't use them to breach the container. For example, a container that has `CAP_SYS_ADMIN` (required to execute the @@ -200,7 +196,7 @@ read-only, it can't be changed from within the container to read-write, even if ensures container processes can't use `mount`, or `umount`, to breach the container's root filesystem. -Note however that in the example above the container can still create mounts +Note however that in the previous example the container can still create mounts within the container, and mount them read-only or read-write as needed. 
Those mounts are allowed since they occur within the container, and therefore don't breach it's root filesystem: @@ -226,10 +222,10 @@ that it does not affect the performance of containers in the great majority of cases. It intercepts control-path system calls that are rarely used in most container workloads but data-path system calls are not intercepted. -### Filesystem user-ID mappings +## Filesystem user-ID mappings -As mentioned above, Enhanced Container Isolation enables the Linux -user-namespace on all containers and this ensures that the container's user-ID +As mentioned, ECI enables the Linux +user namespace on all containers. This ensures that the container's user-ID range (0->64K) maps to an unprivileged range of "real" user-IDs in the Docker Desktop Linux VM (e.g., 100000->165535). @@ -240,29 +236,28 @@ group-IDs. In addition, if a container is stopped and restarted, there is no guarantee it will receive the same mapping as before. This is by design and further improves security. -However the above presents a problem when mounting Docker volumes into -containers, as the files written to such volumes will have the real -user/group-IDs and will therefore won't be accessible across a container's +However, this presents a problem when mounting Docker volumes into +containers. Files written to such volumes have the real +user/group-IDs and therefore won't be accessible across a container's start/stop/restart, or between containers due to the different real user-ID/group-ID of each container. To solve this problem, Sysbox uses "filesystem user-ID remapping" via the Linux -Kernel's ID-mapped mounts feature (added in 2021) or an alternative module -called shiftfs. These technologies map filesystem accesses from the container's +Kernel's ID-mapped mounts feature (added in 2021) or an alternative `shiftfs` module.
These technologies map filesystem accesses from the container's real user-ID (e.g., range 100000->165535) to the range (0->65535) inside Docker Desktop's Linux VM. This way, volumes can now be mounted or shared across containers, even if each container uses an exclusive range of user-IDs. Users need not worry about the container's real user-IDs. -Note that although filesystem user-ID remapping may cause containers to access -Linux VM files mounted into the container with real user-ID 0 (i.e., root), the -[restricted mounts feature](#bind-mount-restrictions) described above ensures -that no Linux VM sensitive files can be mounted into the container. +Although filesystem user-ID remapping may cause containers to access +Linux VM files mounted into the container with real user-ID 0, the +[restricted mounts feature](#bind-mount-restrictions) ensures +that sensitive Linux VM files can't be mounted into the container. -### Procfs & Sysfs Emulation +## Procfs & sysfs emulation Another feature of Enhanced Container Isolation is that inside each container, -the procfs ("/proc") and sysfs ("/sys") filesystems are partially emulated. This +the `/proc` and `/sys` filesystems are partially emulated. This serves several purposes, such as hiding sensitive host information inside the container and namespacing host kernel resources that are not yet namespaced by the Linux kernel itself. diff --git a/content/manuals/security/for-admins/hardened-desktop/enhanced-container-isolation/how-eci-works.md b/content/manuals/security/for-admins/hardened-desktop/enhanced-container-isolation/how-eci-works.md index 8311006ffe2..2c1c4a7be99 100644 --- a/content/manuals/security/for-admins/hardened-desktop/enhanced-container-isolation/how-eci-works.md +++ b/content/manuals/security/for-admins/hardened-desktop/enhanced-container-isolation/how-eci-works.md @@ -12,9 +12,6 @@ container runtime](https://github.com/nestybox/sysbox). 
Sysbox is a fork of the standard OCI runc runtime that was modified to enhance standard container isolation and workloads. For more details see [Under the hood](#under-the-hood). -Starting with version 4.13, Docker Desktop includes a customized version of -Sysbox. - When [Enhanced Container Isolation is enabled](index.md#how-do-i-enable-enhanced-container-isolation), containers created by users through `docker run` or `docker create` are automatically launched using Sysbox instead of the standard OCI runc runtime. Users need not @@ -28,11 +25,10 @@ to breach the Docker Desktop Virtual Machine (VM) or other containers. > [!NOTE] > > When Enhanced Container Isolation is enabled in Docker Desktop, the Docker CLI -> "--runtime" flag is ignored. Docker's default runtime continues to be "runc", +> `--runtime` flag is ignored. Docker's default runtime continues to be `runc`, > but all user containers are implicitly launched with Sysbox. -Enhanced Container Isolation is not the same as Docker Engine's userns-remap -mode or Rootless Docker. This is explained further below. +Enhanced Container Isolation is not the same as [Docker Engine's userns-remap mode or Rootless Docker](#enhanced-container-isolation-versus-docker-userns-remap-mode). ### Under the hood @@ -42,20 +38,20 @@ Sysbox enhances container isolation by using techniques such as: * Restricting the container from mounting sensitive VM directories. * Vetting sensitive system-calls between the container and the Linux kernel. * Mapping filesystem user/group IDs between the container's user-namespace and the Linux VM. -* Emulating portions of the procfs and sysfs filesystems inside the container. +* Emulating portions of the `/proc` and `/sys` filesystems inside the container. Some of these are made possible by recent advances in the Linux kernel which Docker Desktop now incorporates. Sysbox applies these techniques with minimal functional or performance impact to containers. 
These techniques complement Docker's traditional container security mechanisms -such as using other Linux namespaces, cgroups, restricted Linux capabilities, -seccomp, and AppArmor. They add a strong layer of isolation between the +such as using other Linux namespaces, cgroups, restricted Linux Capabilities, +Seccomp, and AppArmor. They add a strong layer of isolation between the container and the Linux kernel inside the Docker Desktop VM. For more information, see [Key features and benefits](features-benefits.md). -### Enhanced Container Isolation vs Docker Userns-Remap Mode +### Enhanced Container Isolation versus Docker Userns-Remap Mode The Docker Engine includes a feature called [userns-remap mode](/engine/security/userns-remap/) that enables the user-namespace in all containers. However it suffers from a few @@ -70,16 +66,16 @@ exclusive user-namespace mappings per container automatically and adds several other [container isolation features](#under-the-hood) meant to secure Docker Desktop in organizations with stringent security requirements. -### Enhanced Container Isolation vs Rootless Docker +### Enhanced Container Isolation versus Rootless Docker -[Rootless Docker](/engine/security/rootless/) allows the Docker Engine, and by +[Rootless Docker](/engine/security/rootless/) enables Docker Engine, and by extension the containers, to run without root privileges natively on a Linux host. This -allows non-root users to install and run Docker natively on Linux. +lets non-root users install and run Docker natively on Linux. Rootless Docker is not supported within Docker Desktop. While it's a valuable feature when running Docker natively on Linux, its value within Docker Desktop is reduced since Docker Desktop runs the Docker Engine within a Linux VM. That -is, Docker Desktop already allows non-root host users to run Docker and +is, Docker Desktop already lets non-root host users run Docker and isolates the Docker Engine from the host using a virtual machine.
Unlike Rootless Docker, Enhanced Container Isolation does not run Docker Engine diff --git a/content/manuals/security/for-admins/hardened-desktop/enhanced-container-isolation/limitations.md b/content/manuals/security/for-admins/hardened-desktop/enhanced-container-isolation/limitations.md index c167bf03bbe..8021f1a7a99 100644 --- a/content/manuals/security/for-admins/hardened-desktop/enhanced-container-isolation/limitations.md +++ b/content/manuals/security/for-admins/hardened-desktop/enhanced-container-isolation/limitations.md @@ -8,14 +8,6 @@ weight: 50 ### ECI support for WSL -Prior to Docker Desktop 4.20, Enhanced Container Isolation (ECI) on -Windows hosts was only supported when Docker Desktop was configured to use -Hyper-V to create the Docker Desktop Linux VM. ECI was not supported when Docker -Desktop was configured to use Windows Subsystem for Linux (aka WSL). - -Starting with Docker Desktop 4.20, ECI is supported when Docker Desktop is -configured to use either Hyper-V or WSL 2. - > [!NOTE] > > Docker Desktop requires WSL 2 version 1.1.3.0 or later. To get the current @@ -23,29 +15,28 @@ configured to use either Hyper-V or WSL 2. > it returns a version number prior to 1.1.3.0, update WSL to the latest version > by typing `wsl --update` in a Windows command or PowerShell terminal. -Note however that ECI on WSL is not as secure as on Hyper-V because: +ECI on WSL is not as secure as on Hyper-V because: -* While ECI on WSL still hardens containers so that malicious workloads can't +- While ECI on WSL still hardens containers so that malicious workloads can't easily breach Docker Desktop's Linux VM, ECI on WSL can't prevent Docker Desktop users from breaching the Docker Desktop Linux VM. Such users can trivially access that VM (as root) with the `wsl -d docker-desktop` command, and use that access to modify Docker Engine settings inside the VM. 
This gives - Docker Desktop users control of the Docker Desktop VM and allows them to - bypass Docker Desktop configs set by admins via the + Docker Desktop users control of the Docker Desktop VM and lets them bypass Docker Desktop configs set by administrators via the [settings-management](../settings-management/_index.md) feature. In contrast, - ECI on Hyper-V does not allow Docker Desktop users to breach the Docker + ECI on Hyper-V does not let Docker Desktop users breach the Docker Desktop Linux VM. -* With WSL 2, all WSL 2 distributions on the same Windows host share the same instance +- With WSL 2, all WSL 2 distributions on the same Windows host share the same instance of the Linux kernel. As a result, Docker Desktop can't ensure the integrity of the kernel in the Docker Desktop Linux VM since another WSL 2 distribution could modify shared kernel settings. In contrast, when using Hyper-V, the Docker Desktop Linux VM has a dedicated kernel that is solely under the control of Docker Desktop. -The table below summarizes this. +The following table summarizes this. -| Security Feature | ECI on WSL | ECI on Hyper-V | Comment | +| Security feature | ECI on WSL | ECI on Hyper-V | Comment | | -------------------------------------------------- | ------------ | ---------------- | --------------------- | | Strongly secure containers | Yes | Yes | Makes it harder for malicious container workloads to breach the Docker Desktop Linux VM and host. | | Docker Desktop Linux VM protected from user access | No | Yes | On WSL, users can access Docker Engine directly or bypass Docker Desktop security settings. | @@ -54,10 +45,9 @@ The table below summarizes this. In general, using ECI with Hyper-V is more secure than with WSL 2.
But WSL 2 offers advantages for performance and resource utilization on the host machine, and it's an excellent way for users to run their favorite Linux distribution on -Windows hosts and access Docker from within (see Docker Desktop's WSL distribution -integration feature, enabled via the Dashboard's **Settings** > **Resources** > **WSL Integration**). +Windows hosts and access Docker from within. -### ECI protection for Docker Builds with the "Docker" driver +### ECI protection for Docker builds with the "docker" driver Prior to Docker Desktop 4.30, `docker build` commands that use the buildx `docker` driver (the default) are not protected by ECI (i.e., the build runs @@ -66,14 +56,11 @@ rootful inside the Docker Desktop VM). Starting with Docker Desktop 4.30, `docker build` commands that use the buildx `docker` driver are protected by ECI (i.e., the build runs rootless inside the Docker Desktop VM), except when Docker Desktop is configured to use WSL 2 -(on Windows hosts). We expect to improve on this in future versions of Docker -Desktop. +(on Windows hosts). Note that `docker build` commands that use the `docker-container` driver are always protected by ECI (i.e., the build runs inside a rootless Docker -container). This is true since Docker Desktop 4.19 (when ECI was introduced) and -on all platforms where Docker Desktop is supported (Windows with WSL or Hyper-V, -Mac, and Linux). +container). ### Docker Build and Buildx have some restrictions @@ -98,17 +85,15 @@ arrangements are needed, just enable ECI and run the KinD tool as usual. Extension containers are also not yet protected by ECI. Ensure you extension containers come from trusted entities to avoid issues. -### Docker Desktop dev environments are not yet protected +### Docker Desktop Dev Environments are not yet protected Containers launched by the Docker Desktop Dev Environments feature are not yet -protected either. We expect to improve on this in future versions of Docker -Desktop. +protected. 
### Docker Debug containers are not yet protected [Docker Debug](https://docs.docker.com/reference/cli/docker/debug/) containers -are not yet protected by ECI. We expect to improve on this in future versions of -Docker Desktop. +are not yet protected by ECI. ### Native Windows containers are not supported From 9e3841ed394c1445dad67e20128e26f77b70c278 Mon Sep 17 00:00:00 2001 From: aevesdocker Date: Tue, 26 Nov 2024 10:11:43 +0000 Subject: [PATCH 2/3] vale edit --- _vale/config/vocabularies/Docker/accept.txt | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/_vale/config/vocabularies/Docker/accept.txt b/_vale/config/vocabularies/Docker/accept.txt index a29a868073f..4ec5bae4a70 100644 --- a/_vale/config/vocabularies/Docker/accept.txt +++ b/_vale/config/vocabularies/Docker/accept.txt @@ -1,4 +1,4 @@ -(?-i)[A-Z]{2,}s +(?-i)[A-Z]{2,}'?s Amazon Anchore Apple @@ -93,7 +93,7 @@ Traefik Ubuntu Unix VMware -VM's +VM Wasm Windows WireMock From 07e9bd2d2bad06ca4b77a0740c7bdaa31d1aba48 Mon Sep 17 00:00:00 2001 From: Allie Sadler <102604716+aevesdocker@users.noreply.github.com> Date: Tue, 26 Nov 2024 10:19:17 +0000 Subject: [PATCH 3/3] Update content/manuals/security/for-admins/hardened-desktop/enhanced-container-isolation/config.md --- .../hardened-desktop/enhanced-container-isolation/config.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/manuals/security/for-admins/hardened-desktop/enhanced-container-isolation/config.md b/content/manuals/security/for-admins/hardened-desktop/enhanced-container-isolation/config.md index e7027da5868..f40d820fdf6 100644 --- a/content/manuals/security/for-admins/hardened-desktop/enhanced-container-isolation/config.md +++ b/content/manuals/security/for-admins/hardened-desktop/enhanced-container-isolation/config.md @@ -32,7 +32,7 @@ Administrators can optionally configure ECI to allow bind mounting the Docker Engine socket into containers, but in a controlled way. 
This can be done via the Docker Socket mount permissions section in the -[`admin-settings.json`](../settings-management/configure.md) file. For example: +[`admin-settings.json`](../settings-management/configure-json-file.md) file. For example: ```json