Docker Best Practices: A Comprehensive Technical Guide

date: Nov 28, 2025
slug: docker-best-practices
status: Published
tags: Docker Compose, Docker, Linux, Ubuntu, DevOps, DevSecOps, EC2, GCP, Kubernetes
summary: This guide provides an in-depth, technically rigorous exploration of Docker best practices for developers with solid technical backgrounds. It covers everything from installation on Ubuntu 24.04 LTS through production deployment strategies on cloud platforms, Kubernetes, and bare metal servers.
type: Post
Welcome back to the roadmap again.
In our last post, we mapped out the grand journey of becoming a Cloud Engineer. We talked about Linux, Networking, and dipped our toes into the container waters.
Today, we are diving into the deep end.
Docker is deceptively simple. You type docker run, magic happens, and you feel like a wizard. But there is a massive difference between "I got the container running" and "This container is production-ready, secure, and won't eat my cloud bill for breakfast."

The "Docker Run" Rabbit Hole

We have all been there. You start with a simple need: run a database.
You: docker run postgres
Docker: Runs.
You: "Wait, I need a password."
You: docker run -e POSTGRES_PASSWORD=secret postgres
You: "I need it to persist data."
You: docker run -e POSTGRES_PASSWORD=secret -v ./data:/var/lib/postgresql/data postgres
You: "I need it on a specific network, exposed on port 5432, and named 'my-db'..."
Suddenly, your terminal looks like this:
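Something like this (a representative sketch; the names and paths are illustrative):

```bash
docker run -d \
  --name my-db \
  --network my-app-net \
  -p 5432:5432 \
  -e POSTGRES_PASSWORD=secret \
  -e POSTGRES_USER=appuser \
  -v ./data:/var/lib/postgresql/data \
  --restart unless-stopped \
  postgres
```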
If you are pasting this into a sticky note on your desktop, stop. You have entered the "Docker Run Rabbit Hole."
The solution? Infrastructure as Code. In Docker land, that starts with mastering the Dockerfile.
We are going to move from being Container Cowboys (yee-haw, sudo docker run --privileged!) to Docker Craftsmen.
Grab your coffee. Let’s optimize some layers.
From Installation to Production-Ready Deployments
This guide provides an in-depth, technically rigorous exploration of Docker best practices for developers with solid technical backgrounds. It covers everything from installation on Ubuntu 24.04 LTS through production deployment strategies on cloud platforms, Kubernetes, and bare metal servers. No prior Docker knowledge is assumed, but the goal is production-level mastery.

1. Introduction

Docker is a containerization platform that packages applications with their dependencies into isolated, portable containers. Unlike virtual machines that virtualize hardware, containers virtualize the operating system, sharing the host kernel while maintaining process isolation. This makes containers lightweight, fast to start, and resource-efficient.
Why Docker matters for production deployments:
  • Consistency: The same container runs identically across development, staging, and production environments, eliminating "works on my machine" issues.
  • Isolation: Applications run in isolated environments with defined resource limits and security boundaries.
  • Efficiency: Containers start in milliseconds and use fewer resources than VMs because they share the host OS kernel.
  • Scalability: Container orchestration platforms can scale applications horizontally with ease.
  • DevOps Integration: Docker integrates seamlessly with CI/CD pipelines for automated testing and deployment.
Container vs Virtual Machine: A virtual machine includes a full operating system with its own kernel, while a container shares the host's kernel and only packages the application and its dependencies. This architectural difference means containers are 10-100x smaller and start 10-100x faster than VMs.

2. Installation

2.1 Installing Docker Engine on Ubuntu 24.04 LTS

Docker Engine is the core runtime that builds and runs containers. On Ubuntu 24.04, you'll install Docker from the official Docker repository to ensure you get the latest stable version.
Prerequisites verification:
Before installing Docker, verify your system meets the requirements:
Understanding technical terms:
  • Kernel: The core of the operating system that manages hardware and system resources
  • 64-bit system: Computer architecture that can address more than 4GB of RAM and process data in 64-bit chunks
  • Repository: A storage location from which software packages are retrieved and installed
Step 1: Update system packages
Update your package index to ensure you're installing the latest versions:
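```bash
sudo apt update
sudo apt upgrade -y
```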
What this does: apt update refreshes the list of available packages from configured repositories, while apt upgrade installs newer versions of installed packages.
Step 2: Install required dependencies
Install prerequisite packages that Docker needs:
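```bash
sudo apt install -y apt-transport-https ca-certificates curl \
  software-properties-common lsb-release
```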
What each package does:
  • apt-transport-https: Allows apt to retrieve packages over HTTPS (secure HTTP)
  • ca-certificates: Contains trusted Certificate Authority certificates for verifying SSL connections
  • curl: Command-line tool for transferring data using URLs
  • software-properties-common: Provides scripts for managing software repositories
  • lsb-release: Provides Linux Standard Base information about your distribution
Step 3: Add Docker's GPG key
Add Docker's official GPG key to verify package authenticity:
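```bash
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | \
  sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg
```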
Technical explanation:
  • GPG (GNU Privacy Guard): Encryption software used to verify that packages haven't been tampered with
  • Dearmor: Converts ASCII-armored GPG key to binary format that apt can use
  • chmod a+r: Makes the key file readable by all users (required for apt to access it)
Step 4: Add Docker repository
For Ubuntu 24.04 (codename "noble"), Docker packages are available. Add the repository:
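```bash
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] \
  https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt update
```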
What this command does:
  • dpkg --print-architecture: Detects your CPU architecture (amd64, arm64, etc.)
  • lsb_release -cs: Gets your Ubuntu codename (noble for 24.04)
  • tee: Writes the repository configuration to a file while showing no output (> /dev/null)
Note: If Docker doesn't yet support the "noble" codename, substitute "jammy" (Ubuntu 22.04) for $(lsb_release -cs) in the repository line above as a workaround.
Step 5: Install Docker Engine
Install Docker Engine, CLI, and required plugins:
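```bash
sudo apt install -y docker-ce docker-ce-cli containerd.io \
  docker-buildx-plugin docker-compose-plugin
```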
Package breakdown:
  • docker-ce: Docker Community Edition engine (the core daemon that runs containers)
  • docker-ce-cli: Command-line interface for interacting with Docker
  • containerd.io: Container runtime that manages the complete container lifecycle
  • docker-buildx-plugin: Extended build capabilities with BuildKit for advanced features
  • docker-compose-plugin: Tool for defining and running multi-container applications
Technical terms:
  • Daemon: A background process that runs continuously and handles requests
  • Runtime: Software that executes and manages running containers
  • BuildKit: Docker's next-generation build system with improved performance and caching
Step 6: Enable and start Docker service
Configure Docker to start automatically on boot:
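```bash
sudo systemctl enable docker
sudo systemctl start docker
sudo systemctl status docker
```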
What these commands do:
  • enable: Creates symbolic links so Docker starts automatically at boot
  • start: Starts the Docker daemon immediately
  • status: Shows whether Docker is running and displays recent log entries
Press Ctrl+C to exit the status view.

2.2 Installing Docker Compose v2

Docker Compose is a tool for defining and running multi-container applications using YAML configuration files. Compose v2 was rewritten in Go and is now a Docker CLI plugin (accessed via docker compose rather than the old docker-compose command).
Installation via apt (recommended)
Docker Compose v2 is included when you install docker-compose-plugin:
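Verify it with:

```bash
docker compose version
```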
Manual installation (if needed)
To install a specific version or the latest release manually:
System-wide installation (for all users):
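A sketch of a system-wide manual install (substitute the release you actually need for v2.24.6):

```bash
sudo mkdir -p /usr/local/lib/docker/cli-plugins
sudo curl -SL \
  https://github.com/docker/compose/releases/download/v2.24.6/docker-compose-linux-x86_64 \
  -o /usr/local/lib/docker/cli-plugins/docker-compose
sudo chmod +x /usr/local/lib/docker/cli-plugins/docker-compose
docker compose version
```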
Technical note: Compose v2 uses a space (docker compose) instead of a hyphen (docker-compose). The old syntax is deprecated but still works if you install the legacy version.

2.3 Verifying Installation

Verify Docker is working correctly:
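```bash
sudo docker run hello-world
```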
What the test container does:
  1. Docker searches for the hello-world image locally
  2. Since it's not found, Docker pulls it from Docker Hub (the default registry)
  3. Docker creates a container from the image
  4. The container runs, prints a success message, and exits
Expected output:
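Abridged:

```
Hello from Docker!
This message shows that your installation appears to be working correctly.
...
```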
Verify system information:
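```bash
sudo docker info
```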
This displays comprehensive information including:
  • Server version
  • Storage driver (typically overlay2 on Ubuntu)
  • Number of containers and images
  • Docker root directory (/var/lib/docker)
  • Logging driver (default is json-file)
  • Operating system details
  • CPU and memory information

2.4 Post-Installation Setup (Non-Root User)

By default, Docker requires sudo for all commands because the Docker daemon runs as root and owns the Unix socket /var/run/docker.sock. For convenience and security, add your user to the docker group.
Add user to docker group:
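```bash
sudo usermod -aG docker $USER
newgrp docker   # or log out and back in for the change to take effect
```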
Technical explanation:
  • usermod: Command to modify user account properties
  • -aG: Append to group (without removing from other groups)
  • $USER: Environment variable containing your username
  • Unix socket: Inter-process communication mechanism (file-based)
Verify non-root access:
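```bash
docker run hello-world   # note: no sudo
```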
Security consideration: Users in the docker group have effective root access to the host system because they can mount host directories and run privileged containers. Only add trusted users to this group.

3. Low-Level Docker Understanding

Understanding Docker CLI flags deeply is crucial for production deployments. This section explains important flags with practical examples and the reasoning behind their use.

3.1 Docker Run Flags

The docker run command creates and starts containers. Its general syntax is:
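```
docker run [OPTIONS] IMAGE [COMMAND] [ARG...]
```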
Basic execution flags:
  • -d, --detach: Run container in background (detached mode)
When to use: Production services, long-running processes, anything not requiring interactive terminal.
  • -i, -t (combined as -it): Interactive terminal
When to use: Debugging, running shell commands, development workflows.
Technical explanation:
  • STDIN: Standard input stream (keyboard input)
  • TTY: TeleTypewriter, a terminal interface that allows line editing and signal handling
  • Pseudo-TTY: Software emulation of a terminal
  • --rm: Automatically remove container when it exits
When to use: One-off tasks, CI/CD jobs, testing. Prevents accumulation of stopped containers.
  • --name: Assign custom name to container
When to use: Production deployments, containers you reference frequently. Without --name, Docker assigns random names like zealous_darwin.
Environment and configuration flags:
  • -e, --env: Set environment variables
  • --env-file: Load variables from file
Security warning: Environment variables are visible in docker inspect and process listings. Never store secrets in environment variables in production. Use Docker secrets or external secret managers instead.
  • -w, --workdir: Set working directory inside container
  • --hostname: Set container hostname
  • -u, --user: Run as specific user
When to use: Security best practice. Never run production containers as root unless absolutely necessary.
Resource flags (covered in detail in section 3.5):
  • --memory, -m: Memory limit
  • --cpus: CPU limit
Exit behavior flags:
  • --restart: Restart policy
Policy explanations:
  • no: Never restart automatically
  • always: Always restart, even after daemon restart
  • unless-stopped: Restart unless manually stopped (recommended for production)
  • on-failure:N: Restart only if container exits with non-zero code, max N times
When to use: Production deployments require restart policies to recover from crashes. Use unless-stopped for most services.
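Putting these flags together, a production-style invocation might look like this (image name, variables, and port are placeholders):

```bash
docker run -d \
  --name my-api \
  --restart unless-stopped \
  -e NODE_ENV=production \
  -p 8080:8080 \
  myapp:1.4.2
```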

3.2 Docker Build Flags

The docker build command creates images from Dockerfiles.
Basic build flags:
  • -t, --tag: Name and tag image
Tag naming convention: [registry/][namespace/]name[:tag][@digest]
  • -f, --file: Specify Dockerfile location
. (build context): Directory containing files needed for build
Technical explanation: The build context is sent to the Docker daemon. All COPY and ADD commands in the Dockerfile are relative to this context. Large contexts slow builds.
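For example, with a Dockerfile kept under docker/ and the current directory as context:

```bash
docker build -t myapp:1.0 -f docker/Dockerfile .
```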
Build argument flags:
  • --build-arg: Pass build-time variables
  • --target: Build specific stage in multi-stage Dockerfile
Cache management flags:
  • --no-cache: Build without using cache
When to use: When cache is causing issues, testing clean builds, or deploying security updates that must propagate through all layers.
  • --cache-from: Use external cache source
BuildKit cache flags (requires DOCKER_BUILDKIT=1):
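A sketch using buildx with a registry-backed cache (the registry URL is a placeholder, and the builder must support registry caches):

```bash
docker buildx build \
  --cache-from type=registry,ref=registry.example.com/myapp:buildcache \
  --cache-to type=registry,ref=registry.example.com/myapp:buildcache,mode=max \
  -t myapp:1.0 .
```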
Cache modes:
  • mode=min: Export only final stage layers (default)
  • mode=max: Export all layers including intermediate stages
When to use mode=max: Multi-stage builds with expensive intermediate stages (installing dependencies, compiling code).
  • --no-cache-filter: Ignore cache for specific stages
Platform flags:
  • --platform: Build for specific platform
Technical explanation: Docker uses QEMU to emulate different architectures when building cross-platform images.

3.3 Networking Flags

Docker networking enables container communication.
Port publishing flags:
  • -p, --publish: Publish container port to host
  • -P, --publish-all: Publish all exposed ports to random host ports
Network mode flags:
  • --network: Connect to specific network
Network modes explained:
  1. bridge (default): Containers get private IP addresses and can communicate through Docker's virtual network. Port mapping required for external access.
  2. host: Container uses host's network stack directly. No port mapping needed. Container's port 80 is accessible at host's port 80.
When to use host mode: High-performance networking applications, applications requiring specific network interfaces, when port mapping overhead is unacceptable. Security warning: Host mode removes network isolation.
  3. none: No networking. Container is completely isolated.
  4. Custom networks: Bridge networks you create. Provides automatic DNS resolution between containers.
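A minimal sketch of a custom network (the app image and port are placeholders):

```bash
docker network create app-net
docker run -d --name db --network app-net postgres:16
docker run -d --name api --network app-net -p 8080:8080 myapp:1.0
# "api" can now reach the database simply at the hostname "db"
```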
DNS and hostname flags:
  • --dns: Set custom DNS server
  • --add-host: Add entry to /etc/hosts
  • --link (deprecated): Link to another container

3.4 Volume and Storage Flags

Volumes persist data beyond container lifecycle.
Volume types:
  1. Named volumes: Managed by Docker, stored in /var/lib/docker/volumes/
  2. Bind mounts: Mount host directory into container
  3. tmpfs mounts: Store in host memory (Linux only)
  • -v, --volume: Mount volume (older syntax)
  • --mount: Mount volume (preferred syntax)
Why --mount is preferred: More explicit, supports all options, better error messages.
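The same mount expressed both ways (names and paths are illustrative):

```bash
# Older -v syntax
docker run -d -v mydata:/app/data nginx:1.27

# Preferred --mount syntax
docker run -d --mount type=volume,source=mydata,target=/app/data nginx:1.27

# Bind mount with --mount
docker run -d --mount type=bind,source="$(pwd)"/src,target=/app/src,readonly nginx:1.27
```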
tmpfs mount flags:
  • --tmpfs: Create tmpfs mount
tmpfs options:
  • size: Maximum size (default: 50% of host RAM)
  • mode: File permissions in octal (default: 1777)
When to use tmpfs: Storing sensitive temporary data (passwords, session tokens), temporary caches, high-performance temporary storage. Data is lost when container stops.
Volume management commands:
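```bash
docker volume create mydata
docker volume ls
docker volume inspect mydata
docker volume rm mydata
docker volume prune   # remove all unused volumes
```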
Best practices:
  1. Use named volumes for data persistence: Easier to manage than bind mounts
  2. Use bind mounts for development: Live code updates without rebuilding
  3. Use tmpfs for secrets and temporary data: Never persisted to disk
  4. Never store data in container's writable layer: Lost when container is removed

3.5 Resource Constraint Flags

Resource limits prevent containers from consuming excessive host resources.
Memory limits:
  • --memory, -m: Maximum memory
Memory units: b (bytes), k (kilobytes), m (megabytes), g (gigabytes).
  • --memory-reservation: Soft limit (memory reservation)
How it works: The container should stay below the reservation; Docker reclaims memory toward the reservation when host memory runs low. The hard limit (-m) is always enforced.
  • --memory-swap: Total memory + swap
When to use memory limits: Always set memory limits in production to prevent OOM (Out of Memory) crashes affecting host.
CPU limits:
  • --cpus: Maximum CPU usage
How it works: If container has 100% CPU load and --cpus 0.5, Docker throttles it to use only 50% of one CPU over time.
  • --cpu-shares: Relative CPU priority
How it works: Only matters when CPU is contested. High-priority (2048) gets 4x the CPU time of low-priority (512) when both are CPU-bound.
When to use: Multi-tenant environments, background tasks vs. user-facing services.
  • --cpuset-cpus: Pin to specific CPU cores
When to use: NUMA optimization, CPU-intensive applications that benefit from cache locality, isolating workloads.
Technical explanation:
  • NUMA (Non-Uniform Memory Access): Multi-processor systems where memory access speed depends on CPU location
  • Cache locality: Keeping process on same CPU improves performance due to CPU cache
  • --cpu-period and --cpu-quota: Precise CPU throttling
How it works: Within each cpu-period microseconds, container can use cpu-quota microseconds of CPU time.
Monitoring resource usage:
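```bash
docker stats                      # live view of all running containers
docker stats --no-stream my-api   # one-shot snapshot of a single container
```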
Output columns:
  • CPU %: CPU usage percentage
  • MEM USAGE / LIMIT: Current memory / Maximum memory
  • MEM %: Memory usage percentage
  • NET I/O: Network bytes in/out
  • BLOCK I/O: Disk bytes read/written
  • PIDS: Number of processes
Production resource limits strategy:
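A sketch for a typical service (the values are placeholders to be tuned against real usage):

```bash
docker run -d \
  --memory 512m \
  --memory-reservation 256m \
  --cpus 1.5 \
  myapp:1.0
```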
Guidelines:
  1. Set memory limits 20-30% above normal usage to handle spikes
  2. Set CPU limits to prevent noisy neighbor problems
  3. Monitor with docker stats and adjust based on real usage
  4. Use resource limits in docker-compose for consistency

3.6 Security Flags

Security flags reduce attack surface and enforce least privilege.
Read-only filesystem:
  • --read-only: Mount root filesystem as read-only
Why this matters: Attackers can't modify system files, install malware, or persist changes. Significantly reduces damage from compromised containers.
When to use: Production containers that don't need to write to disk (most stateless applications). Identify write locations during development and mount as tmpfs.
Capability flags:
Linux capabilities split root privileges into granular permissions. Docker drops many capabilities by default.
  • --cap-drop: Drop capabilities
  • --cap-add: Add capabilities
Common capabilities:
  • NET_BIND_SERVICE: Bind ports below 1024
  • NET_RAW: Use raw sockets (ping, packet capture)
  • SYS_ADMIN: Mount filesystems, various admin tasks
  • CHOWN: Change file ownership
  • DAC_OVERRIDE: Bypass file permission checks
Secure container example:
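(The image name and port are placeholders.)

```bash
docker run -d \
  --read-only \
  --tmpfs /tmp:rw,noexec,nosuid,size=64m \
  --cap-drop ALL \
  --cap-add NET_BIND_SERVICE \
  --security-opt no-new-privileges \
  --user 1000:1000 \
  -p 80:80 \
  myapp:1.0
```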
Breakdown:
  1. --read-only: Immutable filesystem
  2. --tmpfs /tmp: Writable temp with security flags
  3. --cap-drop ALL: Drop all capabilities
  4. --cap-add NET_BIND_SERVICE: Allow port 80 binding
  5. --security-opt no-new-privileges: Prevent privilege escalation
  6. --user 1000:1000: Run as non-root user
Security profiles:
  • --security-opt: Apply security profiles
Security profiles explained:
  1. Seccomp: Filters system calls container can make. Docker's default profile blocks ~44 dangerous syscalls.
  2. AppArmor: Mandatory Access Control (MAC) system. Restricts file access, network access, capabilities.
  3. SELinux: Another MAC system common on Red Hat/CentOS. Labels resources and enforces policies.
Never disable security profiles in production.
Privileged mode (dangerous):
  • --privileged: Give container all host capabilities
What it does: Disables all security features, gives access to all devices, allows mounting filesystems. Container has nearly root access to host.
When to use: Docker-in-Docker, hardware access (GPU, USB devices), very specific admin tasks. Never use in production unless absolutely necessary.

4. Dockerfile Best Practices

A Dockerfile is a text document containing instructions to build Docker images. Well-written Dockerfiles create small, secure, cacheable images.

4.1 Multi-Stage Builds

Multi-stage builds use multiple FROM instructions in a single Dockerfile, allowing you to separate build environment from runtime environment.
Problem multi-stage builds solve:
Traditional Dockerfile includes build tools, source code, dependencies, and compiled artifacts—all in final image. This creates large images with unnecessary attack surface.
Single-stage Dockerfile (problematic):
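A sketch of the problem for a Node.js app:

```dockerfile
FROM node:20
WORKDIR /app
COPY . .
RUN npm install
CMD ["node", "server.js"]
# Final image carries the full node:20 toolchain (roughly 1GB)
```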
Multi-stage Dockerfile (optimized):
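And a multi-stage version of the same app (assumes a build script that outputs to dist/):

```dockerfile
# Build stage: full toolchain, dev dependencies
FROM node:20 AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

# Runtime stage: only what's needed to run
FROM node:20-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY --from=builder /app/dist ./dist
CMD ["node", "dist/server.js"]
```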
Key concepts:
  1. FROM ... AS name: Names a build stage for later reference
  2. COPY --from=stage: Copies files from another stage
  3. Final stage determines image size: Only last stage content is in final image
Benefits:
  • Smaller images: 5-10x size reduction by excluding build tools
  • Better security: Fewer packages = smaller attack surface
  • Single Dockerfile: One file for all environments
  • Better caching: Build stages cached independently
Go application multi-stage example:
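A sketch (the package path ./cmd/server is a placeholder for your module layout):

```dockerfile
FROM golang:1.21-alpine AS builder
WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 go build -o /app ./cmd/server

FROM alpine:3.19
COPY --from=builder /app /app
ENTRYPOINT ["/app"]
```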
Why Alpine base image: Alpine Linux is minimal (~5MB) but includes package manager. Perfect for production.
Python application multi-stage example:
Advanced: Multiple build stages for testing:
Build specific stages:
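Assuming stages named test and production:

```bash
docker build --target test -t myapp:test .
docker build --target production -t myapp:1.0 .
```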

4.2 Layer Ordering and Caching

Docker caches each layer. When a layer changes, all subsequent layers rebuild.
How layer caching works:
Each Dockerfile instruction creates a layer. Docker reuses cached layers if instruction and context haven't changed.
Bad layer ordering (rebuilds frequently):
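```dockerfile
FROM node:20-alpine
WORKDIR /app
COPY . .            # any source change invalidates everything below
RUN npm install
CMD ["node", "server.js"]
```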
Problem: Changing any source file invalidates COPY . . layer, forcing npm install to re-run even though dependencies didn't change.
Optimized layer ordering:
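```dockerfile
FROM node:20-alpine
WORKDIR /app
COPY package*.json ./   # changes rarely
RUN npm ci              # cached until package files change
COPY . .                # changes often, but invalidates nothing above
CMD ["node", "server.js"]
```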
Caching strategy: Order instructions from least-frequently-changed to most-frequently-changed:
  1. Base image (rarely changes)
  2. System dependencies (changes occasionally)
  3. Application dependencies (changes sometimes)
  4. Application code (changes frequently)
Python example:
Combining commands to reduce layers:
Bad (many layers):
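```dockerfile
RUN apt-get update
RUN apt-get install -y curl
RUN rm -rf /var/lib/apt/lists/*   # doesn't shrink the earlier layers
```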
Good (single layer):
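```dockerfile
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl && \
    rm -rf /var/lib/apt/lists/*
```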
Why this matters: Each layer adds to image size. Combining commands and cleaning up in same layer prevents intermediate files from bloating image.
BuildKit cache mounts (advanced):
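A sketch for a Go build:

```dockerfile
# syntax=docker/dockerfile:1
FROM golang:1.21-alpine AS builder
WORKDIR /src
COPY . .
RUN --mount=type=cache,target=/go/pkg/mod \
    go build -o /app ./...
```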
What --mount=type=cache does: Mounts persistent cache directory that survives across builds. Go modules stay cached even when other layers change.

4.3 Base Image Selection

Base image choice affects security, size, and build times.
Base image options:
  1. Official language images (e.g., node:20, python:3.11, golang:1.21)
      • Pros: Easy to use, well-maintained, include language toolchain
      • Cons: Large size (300MB-1GB), includes unnecessary tools
      • Use case: Build stages in multi-stage builds
  2. Slim variants (e.g., node:20-slim, python:3.11-slim)
      • Pros: Smaller (~150-300MB), fewer vulnerabilities
      • Cons: Missing some tools, may need manual package installation
      • Use case: Production stages when you need glibc
  3. Alpine variants (e.g., node:20-alpine, python:3.11-alpine)
      • Pros: Very small (~50MB), excellent security record
      • Cons: Uses musl libc (not glibc), some packages missing, occasional compatibility issues
      • Use case: Production images, microservices, when size matters
  4. Distroless images (Google's distroless)
      • Pros: Minimal attack surface (no shell, no package manager)
      • Cons: Hard to debug, requires multi-stage builds
      • Use case: Maximum security production deployments
  5. Scratch (empty image)
      • Pros: Smallest possible (just your binary)
      • Cons: No shell, no debugging tools, no CA certificates
      • Use case: Static binaries (Go, Rust), ultra-minimal images
Choosing base images:
Security consideration: Use specific versions, not latest:
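```dockerfile
# Bad: mutable, changes without warning
FROM node:latest

# Better: pinned version tag
FROM node:20.11.1-alpine

# Best: immutable digest (use the real digest from your registry)
FROM node:20.11.1-alpine@sha256:<digest>
```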
Why pin versions: latest tag is mutable and can break builds or introduce vulnerabilities. Digests are immutable and guarantee exact image.

4.4 Minimizing Image Size

Smaller images = faster deployments, less storage, smaller attack surface.
Techniques for size reduction:
1. Use multi-stage builds (covered in 4.1)
2. Use Alpine base images (50-150MB vs 300-1000MB)
3. Clean up in same layer:
Technical explanation: Each RUN creates a layer with filesystem changes. Deleting files in later layer doesn't remove them from earlier layer.
4. Use --no-install-recommends:
5. Remove build dependencies after use:
6. Don't install unnecessary packages:
7. Copy only necessary files:
8. Use .dockerignore (covered in 4.5)
Image size comparison example:
Checking image sizes:

4.5 Using .dockerignore

.dockerignore excludes files from build context, reducing build time and image size.
Why .dockerignore matters:
The build context (all files sent to Docker daemon) affects:
  1. Build speed: Larger context = slower transfer to daemon
  2. Cache invalidation: Irrelevant file changes invalidate cache
  3. Accidental file inclusion: Secrets, logs, temp files shouldn't be in images
How .dockerignore works:
Create .dockerignore in same directory as Dockerfile. It uses gitignore-like patterns:
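A starting point for a Node.js project (trim to your needs):

```
.git
.gitignore
node_modules
npm-debug.log
dist
coverage
.env
.vscode
```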
Pattern syntax:
Common patterns for different languages:
Node.js:
Python:
Go:
Java:
Dockerfile-specific .dockerignore:
You can create Dockerfile-specific ignore files:
Verifying .dockerignore works:
Best practices:
  1. Exclude .git: Git history doesn't belong in images
  2. Exclude dependencies: node_modules, vendor will be rebuilt
  3. Exclude build artifacts: Rebuilt during image build
  4. Exclude secrets: Never include .env, keys, certificates
  5. Include README: Often useful for documentation
  6. Keep it simple: Start with common patterns, add as needed
Security note: .dockerignore is your last defense against accidentally including secrets. Always use it.

4.6 Non-Root Users

Running containers as root is dangerous. If container is compromised, attacker has root access.
Why non-root users matter:
  1. Security: Limits damage from compromised containers
  2. Compliance: Many security policies require non-root
  3. Kubernetes: Some clusters block root containers by default
Creating non-root users:
Debian/Ubuntu syntax:
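```dockerfile
RUN groupadd --gid 1000 appuser && \
    useradd --uid 1000 --gid appuser --create-home --shell /usr/sbin/nologin appuser
USER appuser
```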
Alpine syntax (simpler):
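```dockerfile
RUN addgroup -g 1000 appuser && \
    adduser -u 1000 -G appuser -D appuser   # -D: no password
USER appuser
```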
Node.js example (built-in node user):
Python example:
Go example (nobody user):
Handling file permissions:
Common pitfalls:
Problem: User can't write to volumes
Solution: Set ownership in volume mount or use numeric UID
Problem: Port below 1024 requires root
Solution: Use port above 1024 or add NET_BIND_SERVICE capability
Best practices:
  1. Always create explicit user: Don't rely on default
  2. Use numeric UID/GID: More portable across systems
  3. Common UID: Use 1000 or app-specific UID (e.g., 3000)
  4. Set ownership: Use chown or COPY --chown
  5. Switch early: Run USER before copying sensitive files
  6. Test as non-root: Ensure app works without root
Verification:
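```bash
docker exec my-api whoami   # should print the non-root user
docker exec my-api id       # uid/gid should be non-zero
```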

4.7 Example Dockerfiles

Production-ready Dockerfile examples for common languages.
Node.js/Express Application:
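A sketch with the features listed below (the port, dist/ build output, and /health endpoint are assumptions about your app):

```dockerfile
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

FROM node:20-alpine
RUN apk add --no-cache dumb-init
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY --from=builder /app/dist ./dist
RUN addgroup -g 1001 nodejs && adduser -u 1001 -G nodejs -D nodejs
USER nodejs
EXPOSE 3000
HEALTHCHECK --interval=30s --timeout=5s --retries=3 \
  CMD wget -qO- http://localhost:3000/health || exit 1
CMD ["dumb-init", "node", "dist/server.js"]
```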
Key features:
  • Multi-stage build separates build and runtime
  • Alpine base for small size (~150MB vs ~1GB)
  • Non-root user (nodejs)
  • dumb-init for proper signal handling (graceful shutdown)
  • Health check for orchestration
  • Production-only dependencies
Python/Flask Application:
Key features:
  • Multi-stage build for smaller image
  • Only runtime dependencies in final stage
  • Non-root user
  • Gunicorn production server (not Flask dev server)
  • Environment variables for Python optimization
  • Health check using Python
Go Application (Minimal):
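A sketch (the module path and port are placeholders):

```dockerfile
FROM golang:1.21-alpine AS builder
RUN apk add --no-cache ca-certificates
WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 go build -ldflags="-s -w" -o /server .

FROM scratch
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
COPY --from=builder /server /server
USER 65534:65534
EXPOSE 8080
ENTRYPOINT ["/server"]
```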
Key features:
  • Scratch base (minimal possible image ~10MB)
  • Static binary (no dependencies)
  • CA certificates for HTTPS requests
  • Non-root user (numeric UID)
  • Extremely small and secure
Java/Spring Boot Application:
Key features:
  • Multi-stage build (Maven in builder only)
  • JRE instead of JDK in production (smaller)
  • Container-aware JVM settings
  • Non-root user
  • Health check via Spring Actuator
  • Environment variable for JVM tuning
React/Nginx Application:
nginx.conf for non-root:
Key features:
  • Multi-stage build (Node for build, Nginx for serving)
  • Non-root nginx configuration
  • Port 8080 (non-privileged)
  • Security headers
  • Static asset optimization
Common patterns across all examples:
  1. Multi-stage builds: Separate build and runtime
  2. Non-root users: Security best practice
  3. Health checks: Enable orchestration monitoring
  4. Small base images: Alpine or slim variants
  5. Layer caching: Dependencies before source code
  6. Security: Drop privileges, minimal packages
  7. Production servers: Gunicorn, not Flask dev server; no nodemon

5. Image Versioning & Registry Strategy

Proper image versioning is critical for reliable deployments, rollbacks, and reproducibility.

5.1 Tags vs Digests

Image references have three forms:
  1. Tag: Human-readable label (e.g., myapp:v1.2.3)
  2. Digest: Immutable SHA256 hash (e.g., myapp@sha256:abc123...)
  3. Tag + Digest: Both for clarity and immutability
Tags are mutable - same tag can point to different images:
Problem with mutable tags:
  • Builds become non-reproducible
  • Security patches may reintroduce vulnerabilities
  • Hard to know exactly what's running in production
Digests are immutable - always reference exact image:
Getting image digest:
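```bash
docker images --digests
docker inspect --format='{{index .RepoDigests 0}}' myapp:v1.2.3
```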
Using digests in Dockerfile:
Using digests in docker-compose:
When to use digests:
  • Production deployments: Always use digests for reproducibility
  • Security scanning: Scan specific digest, not floating tag
  • Compliance: Prove exactly what's running
  • Rollbacks: Reference exact previous version
When tags are acceptable:
  • Development: Convenient to pull latest changes
  • CI builds: Build tagged, then extract digest for deployment

5.2 Why 'latest' is Bad

The latest tag is Docker's default but dangerous for production.
Problems with latest:
1. Not actually "latest"
latest is just a default tag name. It's only updated when explicitly pushed:
2. Mutable and unpredictable
Result: Production has mixed versions, causing inconsistent behavior.
3. Impossible to rollback
4. Breaks caching and reproducibility
Rebuild tomorrow = different image, possibly breaking changes.
5. Security vulnerabilities
Real-world incident: Node.js official images broke yarn support when latest was updated, breaking thousands of builds.
What to use instead:
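```yaml
# Bad
image: myapp:latest
# Good: pinned tag
image: myapp:v1.2.3
# Best: immutable digest
image: myapp@sha256:<digest>
```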
Exceptions where latest is acceptable:
  • Production: never ❌
  • Local development experiments: Convenient for testing new versions ✅
  • Automated daily builds: If you explicitly want latest ✅
Kubernetes example showing the problem:
Problem: Three pods might run three different versions as they're created at different times.
Solution:

5.3 Production Pinning Strategy

Production-ready tagging strategy ensures reliability and traceability.
Semantic Versioning (SemVer)
Use SemVer format: MAJOR.MINOR.PATCH
  • MAJOR: Breaking changes (1.x.x → 2.0.0)
  • MINOR: New features, backward compatible (1.1.x → 1.2.0)
  • PATCH: Bug fixes, backward compatible (1.1.1 → 1.1.2)
Tagging strategy:
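For example, after a CI build (the local image name is a placeholder):

```bash
docker tag myapp:build-123 myapp:2.3.5
docker tag myapp:build-123 myapp:2.3
docker tag myapp:build-123 myapp:2
docker tag myapp:build-123 myapp:latest
docker push --all-tags myapp
```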
Tag hierarchy:
  • myapp:2.3.5 - Immutable, specific version
  • myapp:2.3 - Tracks latest patch in 2.3.x
  • myapp:2 - Tracks latest minor in 2.x.x
  • myapp:latest - Tracks latest release
Production deployment uses specific version:
Additional metadata tags:
Example complete tagging:
Benefits:
  1. Specific version: Exactly know what's deployed
  2. Git SHA: Trace back to source code
  3. Build number: Track CI/CD build
  4. Rolling tags: Convenience for development
Image labels (metadata in image):
Inspect labels:
Registry security:
Enable tag immutability in registry (Harbor, ECR, ACR):
Benefit: Prevents accidentally overwriting tags.

5.4 CI/CD Build and Push Workflow

Automated image building and publishing in CI/CD pipelines.
Typical workflow:
  1. Developer pushes code to Git
  2. CI/CD triggers on push/merge
  3. Build Docker image with cache
  4. Run tests in container
  5. Tag image with version
  6. Push to registry
  7. Deploy to staging
  8. Run integration tests
  9. Manual approval (optional)
  10. Deploy to production
GitHub Actions example:
What this does:
  1. Triggers on push to main or version tags
  2. Logs into GitHub Container Registry
  3. Extracts version from Git tag
  4. Builds with BuildKit cache from registry
  5. Tags with multiple strategies (version, branch, SHA)
  6. Pushes image and cache
GitLab CI example:
Advanced: Multi-platform builds
Automated tagging patterns:
Build caching strategies:
Security scanning in CI/CD:
Best practices:
  1. Cache layers: Use BuildKit registry cache for faster builds
  2. Multi-stage builds: Keep CI build times low
  3. Scan images: Integrate security scanning
  4. Semantic versioning: Auto-tag from Git tags
  5. Immutable tags: Never overwrite version tags
  6. Prune old images: Cleanup unused images in registry

6. Docker Compose (Latest Spec)

Docker Compose defines multi-container applications in YAML files. Compose v2 uses the Docker CLI plugin architecture.

6.1 Core Compose Fields

Compose file structure (version 3.8+):
Service definition anatomy:
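A minimal sketch (service names, image, and ports are placeholders; under the modern Compose Specification the top-level version field is optional):

```yaml
services:
  web:
    build: .
    image: myapp:1.0
    ports:
      - "8080:8080"
    environment:
      NODE_ENV: production
    depends_on:
      - db
  db:
    image: postgres:16.2
    volumes:
      - db-data:/var/lib/postgresql/data

volumes:
  db-data:
```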

6.2 Services Configuration

Build configuration:
Image and container naming:
Port mapping:
Environment variables:
.env file example:
Variable substitution in compose file:
Command and entrypoint:
User specification:
Working directory:

6.3 Networks and Volumes

Networks:
Network isolation example:
Volumes:
Volume permissions:

6.4 Secrets and Configs

Secrets (sensitive data like passwords, keys):
Secrets are mounted at /run/secrets/<secret_name>:
Configs (non-sensitive configuration files):
Environment-based secrets (for non-Swarm):
Better secrets management (production):

6.5 Healthchecks and Dependencies

Healthchecks:
Healthcheck alternatives:
Dependencies:
Dependency conditions:
  • service_started: Wait for container to start (default)
  • service_healthy: Wait for healthcheck to pass
  • service_completed_successfully: Wait for container to exit with code 0
Restart policies:

6.6 Production-Ready Compose Example

Complete production example with all best practices:
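A condensed sketch illustrating those features (image names, registry, ports, and paths are placeholders; deploy.resources limits are honored by recent Compose v2 releases):

```yaml
x-logging: &default-logging
  driver: "local"
  options:
    max-size: "10m"
    max-file: "3"

services:
  api:
    image: registry.example.com/myapp:2.3.5
    restart: unless-stopped
    ports:
      - "8080:8080"
    environment:
      DB_HOST: db
      DB_PASSWORD_FILE: /run/secrets/db_password
    secrets:
      - db_password
    networks:
      - frontend
      - backend
    depends_on:
      db:
        condition: service_healthy
    deploy:
      resources:
        limits:
          cpus: "1.0"
          memory: 512M
    healthcheck:
      test: ["CMD", "wget", "-qO-", "http://localhost:8080/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
    logging: *default-logging

  db:
    image: postgres:16.2-alpine
    restart: unless-stopped
    environment:
      POSTGRES_PASSWORD_FILE: /run/secrets/db_password
    secrets:
      - db_password
    volumes:
      - db-data:/var/lib/postgresql/data
    networks:
      - backend
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5
    logging: *default-logging

networks:
  frontend:
  backend:
    internal: true

volumes:
  db-data:

secrets:
  db_password:
    file: ./secrets/db_password.txt
```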
Key production features:
  1. Resource limits: CPU and memory constraints
  2. Health checks: All services monitored
  3. Dependency order: Services wait for dependencies to be healthy
  4. Restart policies: Automatic recovery from crashes
  5. Log rotation: Prevents disk exhaustion
  6. Network isolation: Backend services not exposed
  7. Secrets management: Passwords not in environment
  8. Version pinning: Specific image versions, no latest
  9. YAML anchors: Reusable configuration blocks
Running production compose:
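```bash
docker compose -f docker-compose.prod.yml up -d
docker compose -f docker-compose.prod.yml ps
docker compose -f docker-compose.prod.yml logs -f api
```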

7. Security & Hardening

Securing Docker containers prevents attacks and limits damage from compromised containers.

7.1 Capabilities (cap-drop, cap-add)

Linux capabilities split root privileges into granular permissions.
Default Docker capabilities:
Docker containers start with these capabilities:
  • CHOWN: Change file ownership
  • DAC_OVERRIDE: Bypass file permission checks
  • FOWNER: Bypass permission checks on operations
  • FSETID: Don't clear setuid/setgid bits when a file is modified
  • KILL: Send signals to processes
  • SETGID: Set GID
  • SETUID: Set UID
  • SETPCAP: Modify capabilities
  • NET_BIND_SERVICE: Bind ports below 1024
  • NET_RAW: Use raw sockets
  • SYS_CHROOT: Use chroot
  • MKNOD: Create device nodes
  • AUDIT_WRITE: Write to audit log
  • SETFCAP: Set file capabilities
Dangerous capabilities Docker blocks:
  • SYS_ADMIN: Mount filesystems, load kernel modules
  • SYS_MODULE: Load/unload kernel modules
  • SYS_RAWIO: Raw I/O operations
  • SYS_PTRACE: Trace processes
  • SYS_BOOT: Reboot system
Drop all capabilities (most secure):
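```bash
docker run -d \
  --cap-drop ALL \
  --cap-add NET_BIND_SERVICE \
  myapp:1.0
```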
When to drop ALL capabilities:
Most applications don't need any special privileges. Start with dropping all, then add only what's needed.
Common capability needs:
Testing capabilities:

7.2 Seccomp and AppArmor

Seccomp filters system calls containers can make.
What Seccomp does:
Docker's default seccomp profile blocks ~44 dangerous syscalls including:
  • reboot
  • mount
  • swapon
  • kexec_load
  • init_module (load kernel modules)
Using default seccomp profile:
Custom seccomp profile:
Example custom profile (block execve):
Never disable seccomp in production:
AppArmor (Mandatory Access Control):
AppArmor restricts:
  • File access
  • Network access
  • Capabilities
  • IPC
Docker's default AppArmor profile (docker-default):
Custom AppArmor profile:
Example custom AppArmor profile:
Security options in production:
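A compose-level sketch (the image name is a placeholder):

```yaml
services:
  api:
    image: myapp:2.3.5
    security_opt:
      - no-new-privileges:true
      - apparmor:docker-default
    read_only: true
    tmpfs:
      - /tmp
```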
What each security option does:
  • no-new-privileges: Prevents privilege escalation via setuid binaries
  • seccomp:default: Blocks dangerous system calls
  • apparmor:docker-default: Restricts file/network access
  • read_only: Immutable filesystem
  • tmpfs with noexec,nosuid: Temp dir can't execute binaries or use setuid

7.3 Read-Only Filesystem

Why read-only matters:
If attacker compromises container, they can't:
  • Modify system files
  • Install malware
  • Persist backdoors
  • Replace binaries
  • Change configurations
Basic read-only:
Problem: Most apps need some writable locations:
Solution: Mount writable tmpfs (in-memory):
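```bash
docker run -d \
  --read-only \
  --tmpfs /tmp:rw,noexec,nosuid,size=64m \
  --tmpfs /var/run:rw,noexec,nosuid,size=16m \
  myapp:1.0
```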
tmpfs options explained:
  • rw: Read-write
  • noexec: Can't execute binaries (prevents code injection)
  • nosuid: Ignores setuid bits (prevents privilege escalation)
  • size: Maximum size
Identifying writable locations:
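One practical approach: run the container writable once, exercise it, and use docker diff to list every path it touched:

```bash
docker run -d --name probe myapp:1.0
# ...generate some traffic, then:
docker diff probe    # A = added, C = changed, D = deleted paths
```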
Production example:
Benefits:
  • Prevents malware: Can't write executables
  • Prevents persistence: Changes lost on restart
  • Reduces attack surface: No writable system files
  • Compliance: Meets immutable infrastructure requirements
Performance note: tmpfs is RAM-based, so very fast. But uses memory from container's limit.

7.4 Resource Limits

Why resource limits are critical:
Without limits, a container can:
  • Consume all host memory (crash host)
  • Use 100% CPU (starve other containers)
  • Fill disk with logs
  • Cause OOM (Out of Memory) kills
Memory limits:
Memory limit behavior:
  1. Container tries to stay below reservation (256M)
  2. Can use up to limit (512M) if needed
  3. Exceeding limit → OOM kill
Setting appropriate memory limits:
CPU limits:
CPU limit behavior:
  • cpus: 1.5 → Maximum 150% CPU (1.5 cores)
  • Container throttled if it tries to exceed
  • Doesn't block other containers
CPU shares (priority):
How CPU shares work:
Only matter when CPU is contested:
  • Both containers running → high-priority gets 4x CPU
  • Only one running → uses full CPU regardless
Pinning to specific CPUs:
Production resource limits example:
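```yaml
services:
  api:
    image: myapp:2.3.5
    deploy:
      resources:
        limits:
          cpus: "1.5"
          memory: 512M
        reservations:
          cpus: "0.5"
          memory: 256M
```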
Monitoring resources:
PID limits (prevent fork bombs):
Storage limits (rootfs size):

7.5 Secrets Management

Never store secrets in:
  1. ❌ Environment variables (visible in docker inspect, process lists)
  2. ❌ Dockerfiles (baked into image layers)
  3. ❌ Git repositories (version history)
  4. ❌ Plain text files in images
Best practices:
1. Docker Secrets (Swarm only):
Mounted at /run/secrets/db_password (in-memory, read-only).
2. Docker Compose secrets (file-based):
3. BuildKit secret mounts (build-time secrets):
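A sketch (the secret id and source file are placeholders):

```dockerfile
# syntax=docker/dockerfile:1
RUN --mount=type=secret,id=npm_token \
    NPM_TOKEN=$(cat /run/secrets/npm_token) npm ci
```

Built with:

```bash
docker build --secret id=npm_token,src=./npm_token.txt .
```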
4. External secret managers:
AWS Secrets Manager:
HashiCorp Vault:
5. Kubernetes Secrets:
Handling environment variables securely:
Secret rotation:
Security checklist:
  1. ✅ Store secrets in secret manager
  2. ✅ Mount as read-only files
  3. ✅ Use minimal permissions
  4. ✅ Rotate secrets regularly
  5. ✅ Audit secret access
  6. ✅ Never log secrets
  7. ✅ Use separate secrets per environment

7.6 Security Scanning

Scan images for vulnerabilities before deploying.
Tools:
  1. Trivy (recommended - free, comprehensive)
  2. Snyk
  3. Anchore
  4. Clair
  5. Docker Scout
Trivy scanning:
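```bash
trivy image myapp:2.3.5
trivy image --severity HIGH,CRITICAL myapp:2.3.5
```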
CI/CD integration:
Fail build on vulnerabilities:
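```bash
trivy image --exit-code 1 --severity CRITICAL myapp:2.3.5
```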
Docker Scout:
Remediation workflow:
  1. Scan image
  2. Identify vulnerabilities
  3. Update base image / dependencies
  4. Rebuild image
  5. Re-scan
  6. Deploy if clean
Example vulnerabilities and fixes:
Automated scanning schedule:

8. Performance & Observability

8.1 Log Drivers and Rotation

Docker's default logging driver (json-file) doesn't rotate logs, causing disk exhaustion.
Problem:
Solution: Configure log rotation:
Global daemon configuration (/etc/docker/daemon.json):
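```json
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3",
    "compress": "true"
  }
}
```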
What this does:
  • max-size: Rotate when log reaches 10MB
  • max-file: Keep 3 rotated files (30MB total)
  • compress: Compress rotated logs
Restart Docker daemon:
Per-container logging (docker-compose):
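```yaml
services:
  api:
    image: myapp:2.3.5
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"
```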
Recommended logging drivers:
1. local (recommended for production):
Benefits:
  • Automatic rotation (unlike json-file)
  • More efficient storage format
  • Better performance
2. journald (systemd integration):
View logs:
3. syslog (remote logging):
4. fluentd (centralized logging):
5. awslogs (CloudWatch):
Production logging stack example:
Viewing logs:
Best practices:
  1. Always configure rotation: Prevent disk exhaustion
  2. Use local driver: Better than default json-file
  3. Centralize logs: Send to log aggregation system
  4. Add metadata: Use labels and tags
  5. Monitor disk usage: Alert on log partition filling

8.2 Healthchecks

Healthchecks tell orchestration systems if container is functioning.
Dockerfile healthcheck:
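A sketch (assumes curl in the image and a /health endpoint on port 8080):

```dockerfile
HEALTHCHECK --interval=30s --timeout=10s --retries=3 --start-period=40s \
  CMD curl -f http://localhost:8080/health || exit 1
```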
Docker Compose healthcheck:
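```yaml
services:
  api:
    image: myapp:2.3.5
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
```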
Parameters explained:
  • interval: Time between checks (30s = check every 30 seconds)
  • timeout: Maximum time for check to complete (10s)
  • retries: Consecutive failures before marking unhealthy (3)
  • start-period: Grace period during startup (40s - don't check)
Health check endpoint (/health):
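A minimal sketch in Flask (the dependency checks are left as comments):

```python
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/health")
def health():
    # Verify critical dependencies here (database ping, cache ping).
    # Keep it fast and side-effect free; return 500 if anything critical fails.
    return jsonify(status="ok"), 200
```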
What to check:
Check:
  • Database connectivity
  • Cache connectivity
  • Essential external services
  • Application-specific critical resources
Don't check:
  • Disk space (infrastructure concern)
  • CPU/memory (handled by resource limits)
  • Optional services (should degrade gracefully)
Complex healthcheck:
Healthcheck without curl:
Viewing health status:
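```bash
docker inspect --format='{{.State.Health.Status}}' my-api
docker ps --filter health=unhealthy
```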
Restart unhealthy containers:
Docker doesn't restart unhealthy containers by default. Two solutions:
1. Custom healthcheck that kills container:
2. External monitoring (autoheal):
Kubernetes healthchecks (liveness/readiness):
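A container-spec fragment (paths and port are placeholders):

```yaml
livenessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  periodSeconds: 5
```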
Difference:
  • Liveness: Is container alive? (restart if fails)
  • Readiness: Is container ready for traffic? (remove from load balancer if fails)

8.3 Metrics and Monitoring

Container metrics:
Metrics to monitor:
  1. CPU usage (docker stats)
  2. Memory usage (docker stats)
  3. Network I/O (docker stats)
  4. Disk I/O (docker stats)
  5. Container health (docker inspect)
  6. Image vulnerabilities (Trivy scans)
  7. Log volume (disk usage)
Prometheus + cAdvisor:
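A sketch of a cAdvisor service that Prometheus can scrape (pin the release you actually test):

```yaml
services:
  cadvisor:
    image: gcr.io/cadvisor/cadvisor:v0.49.1
    ports:
      - "8081:8080"
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:ro
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
```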

9. Local vs Production Deployment Workflows

9.1 Local Development with Compose

For local development, optimize for speed and convenience (hot reloading, debugging).
docker-compose.dev.yml:
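A sketch (the dev script and paths are assumptions about your project):

```yaml
services:
  app:
    build: .
    command: npm run dev        # hot-reload dev server
    volumes:
      - ./src:/app/src          # live code updates without rebuilding
    environment:
      NODE_ENV: development
    ports:
      - "3000:3000"
```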
Run local dev:
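```bash
docker compose -f docker-compose.dev.yml up --build
```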

9.2 Production Deployment using Pinned Images

For production, optimize for stability and reproducibility (immutable images).
docker-compose.prod.yml:
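```yaml
services:
  app:
    image: registry.example.com/myapp:${APP_VERSION}   # pinned, built in CI
    restart: unless-stopped
    ports:
      - "8080:8080"
```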
Deploy script:
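A sketch (assumes the compose file reads APP_VERSION as above):

```bash
#!/usr/bin/env bash
set -euo pipefail
export APP_VERSION="${1:?usage: deploy.sh <version>}"
docker compose -f docker-compose.prod.yml pull
docker compose -f docker-compose.prod.yml up -d
docker compose -f docker-compose.prod.yml ps
```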

9.3 Rollback Strategy

Manual Rollback:
  1. Identify previous version tag (e.g., v1.2.2)
  2. Update compose file or environment variable
  3. Redeploy
Automated Rollback:
Keep backup compose files:

10. Production Checklist

Do:
Pin image versions (use digests or specific tags)
Run as non-root user
Set resource limits (CPU/Memory)
Use read-only filesystem where possible
Configure log rotation
Use multi-stage builds
Implement healthchecks
Scan images for vulnerabilities
Use secrets management (not ENV vars)
Keep images small (Alpine/Slim)
Don't:
Don't use latest tag
Don't run as root
Don't expose unnecessary ports
Don't include build tools in production image
Don't store secrets in image
Don't use --privileged flag
Don't mount Docker socket (/var/run/docker.sock) unless absolutely necessary
Don't ignore .dockerignore

11. Deploying to Cloud and Kubernetes

11.1 Deploying to AWS (ECS/EC2)

ECS (Elastic Container Service):
  • Push image to ECR (Elastic Container Registry)
  • Define Task Definition (equivalent to Docker Compose)
  • Create Service to run tasks
  • Use Fargate for serverless containers (no EC2 management)
EC2 (Bare Metal Docker):
  • Install Docker on EC2 instance (User Data script)
  • Use Docker Compose for simple stacks
  • Use AWS Systems Manager for secrets

11.2 Deploying to GCP (Cloud Run/GKE)

Cloud Run:
  • Fully managed, serverless
  • Scale to zero capability
  • Deploy directly from image:
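A sketch (project ID, region, and image tag are placeholders):

```bash
gcloud run deploy myapp \
  --image gcr.io/PROJECT_ID/myapp:2.3.5 \
  --region us-central1 \
  --allow-unauthenticated
```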
GKE (Google Kubernetes Engine):
  • Managed Kubernetes
  • Best for complex orchestrations

11.3 Deploying to Kubernetes

Migration from Compose:
  • Use tools like Kompose to convert docker-compose.yml to K8s manifests
  • Or write Helm charts for better management
Key differences:
  • docker-compose.yml -> Deployment/Service/Ingress manifests
  • depends_on -> InitContainers or readiness probes
  • volumes -> PersistentVolumeClaims (PVC)
  • secrets -> Kubernetes Secrets
CI/CD for K8s:
  1. Build & Push Image
  2. Update Manifest (GitOps) or kubectl apply
  3. Rollout status check

© Satyendra Bongi 2025