Every Backend Job Requires Docker. Few Engineers Use It Well.
Docker appears in virtually every backend and DevOps job posting. It's the standard for packaging, shipping, and running applications. But there's a massive gap between "I can write a Dockerfile" and "I build production-grade container images."
The gap manifests as 2GB images that take 10 minutes to pull, containers running as root with full filesystem write access, applications that don't respond to shutdown signals, and builds that break caching on every commit. These aren't theoretical problems — they're the source of real production incidents, slow deployments, and security vulnerabilities.
This guide covers the Docker practices that separate hobby projects from production-grade infrastructure. Every pattern here addresses a real failure mode we've seen in production.
Multi-Stage Builds: Smaller, Safer Images
A typical Dockerfile installs build tools, dependencies, compiles code, and serves the application — all in one image. The result is an image packed with compilers, package managers, and development headers that have no business being in production.
Multi-stage builds solve this:
# Stage 1: Build
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build && npm prune --omit=dev
# Stage 2: Production
FROM node:20-alpine AS production
WORKDIR /app
RUN addgroup -g 1001 appgroup && \
adduser -u 1001 -G appgroup -s /bin/sh -D appuser
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/package.json ./
USER appuser
EXPOSE 3000
CMD ["node", "dist/server.js"]
What this achieves: The final image contains only the compiled application and production dependencies. Build tools, source code, and dev dependencies are left behind in the builder stage. Image size typically drops 60-80%.
Python Multi-Stage Example
# Stage 1: Build dependencies
FROM python:3.12-slim AS builder
WORKDIR /app
RUN pip install --no-cache-dir poetry
COPY pyproject.toml poetry.lock ./
RUN poetry export -f requirements.txt --output requirements.txt --without-hashes
RUN pip install --no-cache-dir --prefix=/install -r requirements.txt
# Stage 2: Production
FROM python:3.12-slim AS production
WORKDIR /app
COPY --from=builder /install /usr/local
COPY src/ ./src/
RUN useradd --create-home --shell /bin/bash appuser
USER appuser
EXPOSE 8000
CMD ["python", "-m", "uvicorn", "src.main:app", "--host", "0.0.0.0", "--port", "8000"]
Choose the Right Base Image
Use -alpine or -slim variants of base images. The difference is dramatic: python:3.12 is roughly 1GB, python:3.12-slim about 150MB, and python:3.12-alpine around 50MB. One caveat: Alpine uses musl libc, so Python packages with compiled extensions may lack prebuilt wheels and have to compile from source. For even smaller images, consider distroless images from Google (gcr.io/distroless/python3) which contain only the runtime and no shell at all — the ultimate security hardening.
Layer Caching: Fast Builds in CI
Docker builds layers from top to bottom. When a layer changes, every subsequent layer is rebuilt. The order of your Dockerfile instructions directly impacts build speed.
Bad: Breaks Cache on Every Code Change
COPY . .
RUN pip install -r requirements.txt
Every code change copies all files, which invalidates the cache for the dependency install layer — even if dependencies didn't change.
Good: Dependencies Cached Separately
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
Dependencies are only reinstalled when requirements.txt changes. Code changes only rebuild the final COPY layer.
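With BuildKit you can go further and persist the package manager's download cache across builds, so even when requirements.txt changes, previously downloaded wheels are reused. A sketch (requires BuildKit; the cache path is pip's default, and --no-cache-dir must be dropped for the cache to be useful):

```
# syntax=docker/dockerfile:1
COPY requirements.txt .
# Cache mount persists pip's download cache between builds (BuildKit only)
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt
COPY . .
```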
.dockerignore
Always include a .dockerignore file. Without it, COPY . . sends your entire directory — including .git, node_modules, .env files, and test data — to the Docker daemon.
.git
.env
.env.*
node_modules
__pycache__
*.pyc
.pytest_cache
.coverage
dist
build
*.md
docker-compose*.yml
Security: Don't Run as Root
By default, Docker containers run as root. This means a vulnerability in your application gives the attacker root access inside the container, and potentially access to the host via container escape vulnerabilities.
Create a Non-Root User
RUN addgroup -g 1001 appgroup && \
adduser -u 1001 -G appgroup -s /bin/sh -D appuser
USER appuser
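Note that files copied before the USER switch are owned by root. That is usually fine for read-only application code, but anything the app must write to should be copied with --chown — an illustrative fragment, with the paths as assumptions:

```
# Give the non-root user ownership of writable app files
COPY --chown=appuser:appgroup ./config /app/config
```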
Read-Only Root Filesystem
Run containers with a read-only filesystem and only write to designated volumes:
# docker-compose.yml
services:
api:
image: myapp:latest
read_only: true
tmpfs:
- /tmp
volumes:
- app-data:/app/data
Drop All Capabilities
Linux capabilities give fine-grained control over what a container process can do. Drop all capabilities and add back only what you need:
services:
api:
image: myapp:latest
cap_drop:
- ALL
cap_add:
- NET_BIND_SERVICE # Only if binding to ports < 1024
Never Use --privileged
Running a container with --privileged gives it full access to the host system, including all devices, all capabilities, and the ability to modify the host kernel. There is almost never a legitimate reason to use --privileged in production. If you think you need it, you almost certainly need a specific capability or device mount instead.
Health Checks: Let the Orchestrator Help
Without health checks, Docker (and Kubernetes) can only tell if your container process is running — not if it's actually healthy and serving requests. A container can be "running" while the application inside is deadlocked, out of memory, leaking connections, or stuck in an infinite loop.
Dockerfile HEALTHCHECK
HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
CMD wget --no-verbose --tries=1 --spider http://localhost:8000/health || exit 1
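One caveat: Debian -slim images ship without wget or curl (Alpine includes a minimal BusyBox wget). If your runtime is Python anyway, a tiny stdlib probe avoids installing extra tools — a sketch, with the URL and port as assumptions:

```python
# healthcheck.py — minimal HTTP probe using only the standard library
import urllib.request


def check(url: str, timeout: float = 3.0) -> bool:
    """Return True if the endpoint answers 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except Exception:
        return False
```

Wire it up with something like HEALTHCHECK CMD python -c "import healthcheck, sys; sys.exit(0 if healthcheck.check('http://localhost:8000/health') else 1)".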
Application-Level Health Endpoints
Build a dedicated health endpoint that checks real dependencies:
@app.get("/health")
async def health():
checks = {
"database": await check_db_connection(),
"redis": await check_redis_connection(),
}
healthy = all(checks.values())
return JSONResponse(
status_code=200 if healthy else 503,
content={"status": "healthy" if healthy else "degraded", "checks": checks},
)
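The check_db_connection and check_redis_connection helpers above are assumptions; one way to implement them is a generic wrapper that bounds every dependency probe with a timeout, so a hung database can never hang the health endpoint itself. A minimal sketch:

```python
import asyncio
from typing import Awaitable, Callable


async def check_dependency(probe: Callable[[], Awaitable[None]],
                           timeout: float = 2.0) -> bool:
    """Run a dependency probe; timeouts and errors both count as unhealthy."""
    try:
        await asyncio.wait_for(probe(), timeout)
        return True
    except Exception:
        return False
```

Each concrete check then wraps its own probe, e.g. a SELECT 1 through your database driver.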
Graceful Shutdown: Handle SIGTERM
When Docker stops a container, it sends SIGTERM and waits 10 seconds (the default grace period, configurable with --stop-timeout) before sending SIGKILL. Your application must handle SIGTERM to drain connections and finish in-flight requests.
The Problem with Shell Form CMD
# Bad: runs via /bin/sh, which doesn't forward signals
CMD python server.py
# Good: exec form, process receives signals directly
CMD ["python", "server.py"]
Always use the exec form (CMD ["executable", "arg"]) so your application process is PID 1 and receives signals directly.
Handle SIGTERM in Application Code
import signal
import sys
def graceful_shutdown(signum, frame):
print("Received SIGTERM, shutting down gracefully...")
# Close database connections
# Finish in-flight requests
# Flush logs and metrics
sys.exit(0)
signal.signal(signal.SIGTERM, graceful_shutdown)
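For asyncio services, a common pattern (sketched here, not tied to any particular framework) is to turn SIGTERM into an event, await it, and drain outstanding work within a bounded window before exiting:

```python
import asyncio
import signal


async def serve() -> str:
    stop = asyncio.Event()
    loop = asyncio.get_running_loop()
    # Turn SIGTERM/SIGINT into a normal asyncio event (Unix only)
    for sig in (signal.SIGTERM, signal.SIGINT):
        loop.add_signal_handler(sig, stop.set)

    await stop.wait()  # serve requests here until asked to stop

    # Drain: give in-flight tasks a bounded window to finish
    pending = [t for t in asyncio.all_tasks()
               if t is not asyncio.current_task()]
    if pending:
        await asyncio.wait(pending, timeout=5)
    return "shutdown complete"
```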
Logging: stdout, Not Files
Docker expects applications to write logs to stdout and stderr. The Docker logging driver then captures these logs and routes them to whatever backend you've configured (json-file, fluentd, CloudWatch, etc.).
Don't write logs to files inside the container. This fills up the container's writable layer, makes logs inaccessible to docker logs, and breaks log aggregation in orchestration platforms.
import logging
logging.basicConfig(
level=logging.INFO,
format='{"time":"%(asctime)s","level":"%(levelname)s","message":"%(message)s"}',
handlers=[logging.StreamHandler()], # stdout
)
Use JSON-formatted structured logging. This makes logs parseable by automated systems without regex gymnastics.
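The format-string approach above produces invalid JSON as soon as a message contains a quote or newline; a custom Formatter that serializes through json.dumps is safer because escaping is handled for you. A minimal sketch:

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Emit one JSON object per log record; json.dumps handles escaping."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "time": self.formatTime(record),
            "level": record.levelname,
            "message": record.getMessage(),
        })


handler = logging.StreamHandler(sys.stdout)  # stdout, never a file
handler.setFormatter(JsonFormatter())
logging.basicConfig(level=logging.INFO, handlers=[handler])
```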
Secrets Management: Never Bake Secrets Into Images
# NEVER do this
ENV DATABASE_PASSWORD=mysecretpassword
COPY .env /app/.env
Secrets baked into images persist in every layer and are visible to anyone with access to the image.
Instead, inject secrets at runtime:
- Environment variables: Pass via docker run -e or Docker Compose environment:. Simple but visible in docker inspect.
- Docker secrets: Native secrets management for Docker Swarm. Mounted as files in /run/secrets/.
- External secret managers: HashiCorp Vault, AWS Secrets Manager, Azure Key Vault. Pull secrets at application startup. The most secure approach for production.
Build-Time Secrets
If you need secrets during the build (e.g., pulling from a private package registry), use Docker BuildKit's --mount=type=secret:
RUN --mount=type=secret,id=npm_token \
NPM_TOKEN=$(cat /run/secrets/npm_token) npm ci
The secret is available during the build step but is never committed to any image layer.
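The secret itself is supplied at build time from a file outside the build context — an illustrative invocation, assuming the token lives in a local .npm_token file:

```shell
# Pass the token to the build without writing it into any layer
DOCKER_BUILDKIT=1 docker build \
  --secret id=npm_token,src=.npm_token \
  -t myapp:latest .
```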
Image Scanning and Updates
Production images should be scanned for known vulnerabilities before deployment:
# Scan with Docker Scout (built into Docker Desktop)
docker scout cves myapp:latest
# Scan with Trivy (open source)
trivy image myapp:latest
Integrate image scanning into your CI pipeline. Block deployments if critical or high-severity CVEs are detected. For a complete CI/CD pipeline setup, see our guide on CI/CD with GitHub Actions and Docker.
Rebuild images regularly (at least monthly) to pick up base image security patches, even if your application code hasn't changed.
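If you use GitHub Actions, a scheduled trigger is one way to enforce regular rebuilds without anyone remembering to do them — a sketch (the workflow file path and cadence are assumptions):

```yaml
# .github/workflows/rebuild.yml (hypothetical) — rebuild weekly for base image patches
on:
  schedule:
    - cron: "0 4 * * 1"   # every Monday, 04:00 UTC
  push:
    branches: [main]
```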
Production Dockerfile Checklist
Before deploying any Docker image to production, verify:
- Multi-stage build — production image contains only runtime dependencies
- Non-root user — container runs as an unprivileged user
- Minimal base image — using -slim, -alpine, or distroless
- .dockerignore — excludes .git, .env, node_modules, test files
- Layer caching — dependency install before code copy
- HEALTHCHECK — meaningful health endpoint, not just process alive
- Exec form CMD — signals forwarded to application process
- No secrets in image — environment variables or external secret managers
- Image scanning — no critical CVEs in base image or dependencies
- Structured logging — JSON to stdout/stderr
- Read-only filesystem — where possible, with tmpfs for write needs
- Pinned versions — specific base image tags, not latest
Conclusion
Docker is a solved problem at the surface level — anyone can containerize an application. The difference between a containerized application and a production-grade container is security, reliability, and operational efficiency.
The practices in this guide — multi-stage builds, non-root execution, health checks, graceful shutdown, proper logging, and secrets management — aren't optional extras. They're the baseline for any container that handles real traffic. Skip any one of them and you're accumulating technical debt that will surface as production incidents.
Start with your most important service. Apply these patterns one at a time. Each improvement reduces your attack surface, speeds up your deployments, and makes your containers more reliable. If you're moving from single containers to orchestration, our guide on Docker and Kubernetes orchestration covers the next step in the journey.
Want to practice this hands-on?
CloudaQube generates complete labs from a simple description. Try it free.