"It works on my machine" is the most expensive sentence in software engineering. Docker eliminates it entirely. Your Python app runs in the same environment everywhere — your laptop, CI server, staging, and production. No more version mismatches, missing system libraries, or broken virtual environments.
Docker is especially valuable for Python, where conflicting dependencies, C extensions that need system libraries, and per-machine virtual environments are the usual sources of drift.
Let's start with a basic Flask application. Here's the project structure:
my-app/
├── app.py
├── requirements.txt
├── Dockerfile
└── .dockerignore
The application:
# app.py
from flask import Flask, jsonify
import os

app = Flask(__name__)

@app.route("/")
def hello():
    return jsonify({
        "message": "Hello from Docker!",
        "environment": os.getenv("FLASK_ENV", "production"),
        "version": os.getenv("APP_VERSION", "0.1.0")
    })

@app.route("/health")
def health():
    return jsonify({"status": "healthy"}), 200

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
Requirements:
# requirements.txt
flask==3.1.0
gunicorn==23.0.0
And the Dockerfile:
# Dockerfile
FROM python:3.12-slim
# Set working directory
WORKDIR /app
# Install dependencies first (Docker layer caching)
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy application code
COPY . .
# Create non-root user
RUN useradd --create-home appuser
USER appuser
# Expose port
EXPOSE 5000
# Run with gunicorn (production WSGI server)
CMD ["gunicorn", "--bind", "0.0.0.0:5000", "--workers", "4", "app:app"]
Build and run:
# Build the image
docker build -t my-python-app .
# Run the container
docker run -p 5000:5000 my-python-app
# Run in background
docker run -d --name my-app -p 5000:5000 my-python-app
# Check it's running
curl http://localhost:5000/health
Why COPY requirements.txt before COPY . .? Docker caches layers: as long as requirements.txt is unchanged, the pip install layer is reused, so code-only changes rebuild in seconds instead of minutes. The other half of a fast build is .dockerignore. Without it, Docker copies everything into the build context — including virtual environments, .git, test data, and IDE configs. This bloats your image and slows builds.
# .dockerignore
__pycache__
*.pyc
*.pyo
.git
.gitignore
.env
.env.*
.venv
venv
env
*.egg-info
dist
build
.pytest_cache
.mypy_cache
.coverage
htmlcov
node_modules
*.md
!README.md
docker-compose*.yml
Dockerfile*
.dockerignore
tests/
docs/
*.log
Never copy .env files into your Docker image. They contain secrets. Use environment variables at runtime or Docker secrets instead.
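If the app needs those values at run time, you can keep the .env file on the host and inject it when the container starts instead of baking it into a layer. A minimal sketch, reusing the image built above:

# Values are passed to the container at startup and never stored in the image
docker run --env-file .env -p 5000:5000 my-python-app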
A standard python:3.12 image is ~1GB. With multi-stage builds, you can get your final image down to 50-100MB. The trick: use a full image to build dependencies, then copy only what you need into a slim final image.
# Multi-stage Dockerfile
# Stage 1: Build dependencies
FROM python:3.12-slim AS builder
WORKDIR /build
# Install build tools (needed for packages with C extensions)
RUN apt-get update && \
apt-get install -y --no-install-recommends gcc libpq-dev && \
rm -rf /var/lib/apt/lists/*
COPY requirements.txt .
RUN pip install --no-cache-dir --prefix=/install -r requirements.txt
# Stage 2: Production image
FROM python:3.12-slim AS production
WORKDIR /app
# Install runtime-only dependencies (no compilers)
RUN apt-get update && \
apt-get install -y --no-install-recommends libpq5 curl && \
rm -rf /var/lib/apt/lists/*
# Copy installed packages from builder
COPY --from=builder /install /usr/local
# Copy application
COPY . .
# Security: non-root user
RUN useradd --create-home --shell /bin/bash appuser && \
chown -R appuser:appuser /app
USER appuser
EXPOSE 5000
HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
CMD curl -f http://localhost:5000/health || exit 1
CMD ["gunicorn", "--bind", "0.0.0.0:5000", "--workers", "4", "app:app"]
Compare the sizes:
# Check image sizes
docker images | grep my-app
# Typical results:
# my-app-full latest 1.2GB (single stage, python:3.12)
# my-app-slim latest 180MB (single stage, python:3.12-slim)
# my-app-multi latest 95MB (multi-stage build)
Alpine-based images are tiny (~20MB base), but they use musl instead of glibc. Many Python packages with C extensions won't compile or will behave differently. Use Alpine only if you understand the trade-offs:
# Alpine-based image (use with caution)
FROM python:3.12-alpine AS builder
RUN apk add --no-cache gcc musl-dev libffi-dev postgresql-dev
COPY requirements.txt .
RUN pip install --no-cache-dir --prefix=/install -r requirements.txt
FROM python:3.12-alpine
RUN apk add --no-cache libpq curl
COPY --from=builder /install /usr/local
COPY . /app
WORKDIR /app
CMD ["gunicorn", "--bind", "0.0.0.0:5000", "app:app"]
Stick with -slim for data science and ML projects; most scientific packages ship prebuilt wheels built against glibc, and on Alpine you can end up compiling them from source or hitting subtle musl differences.
Real Python applications need databases, caches, task queues, and background workers. Docker Compose orchestrates all of them with a single YAML file.
# docker-compose.yml
services:
  web:
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - "5000:5000"
    environment:
      - DATABASE_URL=postgresql://app:secret@db:5432/myapp
      - REDIS_URL=redis://redis:6379/0
      - FLASK_ENV=development
    volumes:
      - .:/app  # Live reload in development
    depends_on:
      db:
        condition: service_healthy
      redis:
        condition: service_healthy
    restart: unless-stopped

  worker:
    build: .
    command: celery -A tasks worker --loglevel=info --concurrency=2
    environment:
      - DATABASE_URL=postgresql://app:secret@db:5432/myapp
      - REDIS_URL=redis://redis:6379/0
    depends_on:
      db:
        condition: service_healthy
      redis:
        condition: service_healthy
    restart: unless-stopped

  beat:
    build: .
    command: celery -A tasks beat --loglevel=info
    environment:
      - REDIS_URL=redis://redis:6379/0
    depends_on:
      - redis
    restart: unless-stopped

  db:
    image: postgres:16-alpine
    environment:
      POSTGRES_USER: app
      POSTGRES_PASSWORD: secret
      POSTGRES_DB: myapp
    volumes:
      - pgdata:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U app -d myapp"]
      interval: 5s
      timeout: 3s
      retries: 5

  redis:
    image: redis:7-alpine
    command: redis-server --maxmemory 128mb --maxmemory-policy allkeys-lru
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 5s
      timeout: 3s
      retries: 5

volumes:
  pgdata:
Common Docker Compose commands:
# Start everything
docker compose up -d
# View logs (follow mode)
docker compose logs -f web worker
# Restart a single service
docker compose restart web
# Rebuild after code changes
docker compose up -d --build web
# Scale workers
docker compose up -d --scale worker=3
# Stop everything (keep volumes)
docker compose down
# Stop and remove volumes (⚠️ deletes database data)
docker compose down -v
Use overrides for different environments:
# docker-compose.override.yml (auto-loaded in development)
services:
  web:
    command: flask run --host=0.0.0.0 --port=5000 --reload
    volumes:
      - .:/app
    environment:
      - FLASK_DEBUG=1
      - FLASK_ENV=development
# docker-compose.prod.yml
services:
  web:
    volumes: []  # No bind mounts in production
    environment:
      - FLASK_ENV=production
    deploy:
      replicas: 2
      resources:
        limits:
          memory: 512M
          cpus: "0.5"
# Run production config
docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d
Never hardcode secrets in Dockerfiles or images. Here are three approaches, from simplest to most secure:
# .env (never commit this file!)
DATABASE_URL=postgresql://app:secret@db:5432/myapp
REDIS_URL=redis://redis:6379/0
SECRET_KEY=dev-secret-key-change-in-production
API_KEY=sk-test-1234567890
# docker-compose.yml
services:
  web:
    env_file:
      - .env
# Create secrets
echo "super-secure-password" | docker secret create db_password -
echo "sk-prod-abc123" | docker secret create api_key -
# docker-compose.yml (Swarm mode)
services:
  web:
    secrets:
      - db_password
      - api_key
    environment:
      - DB_PASSWORD_FILE=/run/secrets/db_password

secrets:
  db_password:
    external: true
  api_key:
    external: true
Read secrets in Python:
# config.py
import os
from pathlib import Path

def get_secret(name: str, default: str = "") -> str:
    """Read from Docker secret file or env variable."""
    # Check for Docker secret file first
    secret_file = Path(f"/run/secrets/{name}")
    if secret_file.exists():
        return secret_file.read_text().strip()
    # Fall back to environment variable
    return os.getenv(name.upper(), default)

# Usage
DATABASE_PASSWORD = get_secret("db_password")
API_KEY = get_secret("api_key")
# Dockerfile — build-time args (NOT for secrets)
ARG PYTHON_VERSION=3.12
FROM python:${PYTHON_VERSION}-slim
# Build-time arg (visible in image layers — don't use for secrets!)
ARG APP_VERSION=0.1.0
ENV APP_VERSION=${APP_VERSION}
# Runtime env (set when container starts)
ENV FLASK_ENV=production
ENV WORKERS=4
# Shell form is used here so ${WORKERS} expands at runtime; prefer exec form when you don't need variable expansion
CMD gunicorn --bind 0.0.0.0:5000 --workers ${WORKERS} app:app
# Build with args
docker build --build-arg APP_VERSION=1.2.3 -t my-app:1.2.3 .
# Run with env overrides
docker run -e WORKERS=8 -e SECRET_KEY=prod-key my-app:1.2.3
Containers crash. Networks hiccup. Databases restart. Health checks let Docker (and orchestrators like Kubernetes) detect and recover from failures automatically.
# Dockerfile health check
HEALTHCHECK --interval=30s --timeout=5s --start-period=15s --retries=3 \
CMD curl -f http://localhost:5000/health || exit 1
A proper health check endpoint in Python:
# health.py
from flask import jsonify
import redis
import psycopg2

def register_health_routes(app, config):
    @app.route("/health")
    def health():
        """Basic liveness check — is the process running?"""
        return jsonify({"status": "ok"}), 200

    @app.route("/ready")
    def readiness():
        """Readiness check — can we serve traffic?"""
        checks = {}
        healthy = True

        # Database check
        try:
            conn = psycopg2.connect(config["DATABASE_URL"])
            with conn.cursor() as cur:
                cur.execute("SELECT 1")
            conn.close()
            checks["database"] = "ok"
        except Exception as e:
            checks["database"] = str(e)
            healthy = False

        # Redis check
        try:
            r = redis.from_url(config["REDIS_URL"])
            r.ping()
            checks["redis"] = "ok"
        except Exception as e:
            checks["redis"] = str(e)
            healthy = False

        status_code = 200 if healthy else 503
        return jsonify({
            "status": "ready" if healthy else "not_ready",
            "checks": checks
        }), status_code
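If you want Compose to gate traffic on dependencies as well, you can point its health check at the readiness endpoint instead of /health. A minimal sketch, assuming curl is installed in the web image (the multi-stage Dockerfile above installs it):

services:
  web:
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:5000/ready"]
      interval: 15s
      timeout: 5s
      retries: 3
      start_period: 15s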
Restart policies for Docker Compose:
services:
  web:
    restart: unless-stopped  # Restart on crash, not on manual stop

  worker:
    restart: on-failure      # Only restart if exit code != 0
    deploy:
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 3      # Give up after 3 crashes
        window: 120s
Set --start-period to cover your app's startup time (loading models, running migrations, warming caches). Health checks that fail during this window don't count toward the retry limit.
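For example, an app that loads a large model or runs migrations at boot might need a much longer grace window; a sketch with a two-minute start period:

# Give a slow-starting app two minutes before failed checks count
HEALTHCHECK --interval=30s --timeout=5s --start-period=120s --retries=3 \
    CMD curl -f http://localhost:5000/health || exit 1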
Automate building, testing, and deploying your Docker image with GitHub Actions:
# .github/workflows/docker.yml
name: Build and Deploy

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - name: Install dependencies
        run: pip install -r requirements.txt -r requirements-dev.txt
      - name: Run tests
        run: pytest --cov=app tests/

  build-and-push:
    needs: test
    runs-on: ubuntu-latest
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
    permissions:
      contents: read
      packages: write
    steps:
      - uses: actions/checkout@v4
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
      - name: Login to GitHub Container Registry
        uses: docker/login-action@v3
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Generate image tags
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
          tags: |
            type=sha
            type=raw,value=latest
      - name: Build and push
        uses: docker/build-push-action@v6
        with:
          context: .
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          cache-from: type=gha
          cache-to: type=gha,mode=max
          build-args: |
            APP_VERSION=${{ github.sha }}

  deploy:
    needs: build-and-push
    runs-on: ubuntu-latest
    steps:
      - name: Deploy to server
        uses: appleboy/ssh-action@v1
        with:
          host: ${{ secrets.SERVER_HOST }}
          username: deploy
          key: ${{ secrets.SSH_PRIVATE_KEY }}
          script: |
            cd /opt/my-app
            docker compose pull
            docker compose up -d --remove-orphans
            docker image prune -f
# Bad — running as root (default)
CMD ["python", "app.py"]
# Good — create and switch to non-root user
RUN groupadd -r appuser && useradd -r -g appuser appuser
USER appuser
CMD ["gunicorn", "--bind", "0.0.0.0:5000", "app:app"]
# Bad — tag can shift to a new version anytime
FROM python:3
# Better — major.minor pinned
FROM python:3.12-slim
# Best — SHA digest for reproducibility
FROM python:3.12-slim@sha256:abc123...
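To find the digest for a tag you already have locally, you can ask Docker for the repo digest (this assumes the image has been pulled):

# Prints python@sha256:…; pin that digest in your FROM line
docker inspect --format='{{index .RepoDigests 0}}' python:3.12-slim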
# Bad — each RUN creates a layer
RUN apt-get update
RUN apt-get install -y curl
RUN apt-get install -y libpq-dev
RUN rm -rf /var/lib/apt/lists/*
# Good — combine commands, clean up in same layer
RUN apt-get update && \
apt-get install -y --no-install-recommends curl libpq-dev && \
rm -rf /var/lib/apt/lists/*
# gunicorn.conf.py
import multiprocessing
# Workers: 2-4 per CPU core
workers = multiprocessing.cpu_count() * 2 + 1
worker_class = "gthread"
threads = 2
# Timeouts
timeout = 30
graceful_timeout = 30
keepalive = 5
# Binding
bind = "0.0.0.0:5000"
# Logging
accesslog = "-"
errorlog = "-"
loglevel = "info"
# Security
limit_request_line = 4094
limit_request_fields = 100
CMD ["gunicorn", "--config", "gunicorn.conf.py", "app:app"]
# app.py — graceful shutdown
import signal
import sys

def graceful_shutdown(signum, frame):
    """Handle SIGTERM for graceful shutdown in Docker."""
    print(f"Received signal {signum}, shutting down gracefully...")
    # Close database connections
    # Finish current requests
    # Flush logs
    sys.exit(0)

signal.signal(signal.SIGTERM, graceful_shutdown)
signal.signal(signal.SIGINT, graceful_shutdown)
Use the exec form for CMD. CMD ["gunicorn", "app:app"] (exec form) runs gunicorn as PID 1, so it receives SIGTERM directly. CMD gunicorn app:app (shell form) wraps it in /bin/sh -c, and the shell may not forward signals.
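A quick way to verify you got this right is to check what PID 1 actually is inside the running container (assuming the container from earlier, named my-app):

# Prints "gunicorn" with exec-form CMD; with shell form you'd see "sh"
docker exec my-app cat /proc/1/comm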
# Generate locked requirements
pip freeze > requirements.txt
# Or use pip-compile for better management
pip install pip-tools
pip-compile requirements.in --output-file requirements.txt
# requirements.in (what you specify)
flask>=3.0,<4.0
sqlalchemy>=2.0
# requirements.txt (what pip-compile generates — pinned)
flask==3.1.0
sqlalchemy==2.0.36
# ... all transitive deps pinned
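When you later want newer versions, pip-tools can re-resolve the pins and sync your environment to match the lock file exactly; a typical flow looks like this:

# Re-resolve to the newest allowed versions and update the lock file
pip-compile --upgrade requirements.in --output-file requirements.txt
# Install/remove packages so the environment matches requirements.txt exactly
pip-sync requirements.txt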
Containers are ephemeral — when they die, their filesystem goes with them. Write logs to stdout/stderr (Docker captures them), not to files inside the container.
# logging_config.py
import logging
import sys
import json
from datetime import datetime, timezone

class JSONFormatter(logging.Formatter):
    """Structured JSON logging for containers."""

    def format(self, record):
        log_data = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info and record.exc_info[0]:
            log_data["exception"] = self.formatException(record.exc_info)
        # Add extra fields
        for key in ("request_id", "user_id", "endpoint", "duration_ms"):
            if hasattr(record, key):
                log_data[key] = getattr(record, key)
        return json.dumps(log_data)

def setup_logging():
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(JSONFormatter())
    root = logging.getLogger()
    root.setLevel(logging.INFO)
    root.addHandler(handler)
    # Reduce noise from libraries
    logging.getLogger("urllib3").setLevel(logging.WARNING)
    logging.getLogger("werkzeug").setLevel(logging.WARNING)
View and manage container logs:
# View recent logs
docker logs my-app --tail 100
# Follow logs in real-time
docker logs -f my-app
# Logs since a specific time
docker logs --since 2026-03-26T00:00:00 my-app
# Filter JSON logs with jq
docker logs my-app 2>&1 | jq 'select(.level == "ERROR")'
# Log rotation (prevent disk exhaustion)
# docker-compose.yml
services:
  web:
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"
# Real-time container stats
docker stats
# One-shot stats
docker stats --no-stream --format \
"table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}\t{{.NetIO}}"
# Export metrics for Prometheus
# Add to your Flask app:
# (uses the prometheus-flask-exporter package)
from prometheus_flask_exporter import PrometheusMetrics
metrics = PrometheusMetrics(app)  # exposes /metrics by default
When things go wrong inside a container, you need to get inside and look around:
# Shell into a running container
docker exec -it my-app /bin/bash
# Shell into a stopped/crashed container (start a new one from image)
docker run -it --entrypoint /bin/bash my-app:latest
# Check what's running inside
docker exec my-app ps aux
# Inspect container details
docker inspect my-app | jq '.[0].State'
# View container's environment variables
docker exec my-app env | sort
# Check file system
docker exec my-app ls -la /app
# Copy files out for inspection
docker cp my-app:/app/logs/error.log ./debug-error.log
# Watch container events
docker events --filter container=my-app
# Build with verbose output
docker build --progress=plain --no-cache -t my-app .
# Build up to a specific stage
docker build --target builder -t my-app-debug .
# Inspect image layers
docker history my-app --no-trunc
# Check what's in the image
docker run --rm my-app:latest find /app -type f | head -50
# Check container networking
docker network ls
docker network inspect my-app_default
# Test connectivity between containers
docker exec web ping db
docker exec web curl http://redis:6379
# Check port bindings
docker port my-app
By default, Docker containers run as root. If an attacker exploits your app, they have root access to the container — and potentially to the host via volume mounts. Always use a non-root user.
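You can confirm which user a running container uses without rereading the Dockerfile (assuming the container name and user from the examples above):

docker exec my-app whoami   # should print appuser, not root
docker exec my-app id -u    # should be a non-zero UID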
Skipping .dockerignore: without it, your .git directory (often hundreds of MB), virtual environments, and test fixtures all get copied into the build context. Builds become slow and images bloat.
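One way to spot the problem: BuildKit reports how much context it ships to the daemon at the start of every build, so you can watch that number shrink once .dockerignore is in place.

docker build --progress=plain -t my-app . 2>&1 | grep "transferring context"
# e.g. "transferring context: 48.21MB" before .dockerignore, a few hundred KB after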
Relying on the latest tag in production: latest is mutable — it points to whatever was pushed last. Use immutable tags (a git SHA, version number, or digest) so you can roll back reliably.
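A simple pattern is to tag every build with the git commit SHA and treat latest as a convenience alias only (a sketch, not a full release process):

GIT_SHA=$(git rev-parse --short HEAD)
docker build -t my-app:"$GIT_SHA" .
docker tag my-app:"$GIT_SHA" my-app:latest   # convenience pointer; deploy by SHA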
The first process in a container (PID 1) is special — it must handle signals and reap zombie processes. Use exec form CMD, or add --init flag to docker run for a lightweight init system.
# Add init process (handles signals and zombie reaping)
docker run --init my-app
# Or in docker-compose.yml
services:
  web:
    init: true
Container filesystems are ephemeral. When the container stops, data is gone. Always use volumes for persistent data:
services:
  db:
    volumes:
      - pgdata:/var/lib/postgresql/data                  # Named volume (persistent)
      - ./init.sql:/docker-entrypoint-initdb.d/init.sql  # Bind mount (seed data)
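Named volumes live outside the container, so they survive docker compose down (without -v) and can be backed up from a throwaway container. A sketch; the actual volume name depends on your Compose project name:

docker volume ls
docker run --rm -v my-app_pgdata:/data -v "$PWD":/backup alpine \
    tar czf /backup/pgdata-backup.tar.gz -C /data .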
A Python app with a memory leak will eventually eat all host memory and crash everything. Set limits:
services:
  web:
    deploy:
      resources:
        limits:
          memory: 512M
        reservations:
          memory: 256M
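The same guard with plain docker run, if you're not using Compose:

docker run -d --name my-app --memory=512m --memory-reservation=256m -p 5000:5000 my-python-app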
Here's a complete, production-ready Docker setup for a Python web application with a database, cache, background workers, and monitoring:
# Dockerfile (production)
FROM python:3.12-slim AS builder
WORKDIR /build
RUN apt-get update && \
apt-get install -y --no-install-recommends gcc libpq-dev && \
rm -rf /var/lib/apt/lists/*
COPY requirements.txt .
RUN pip install --no-cache-dir --prefix=/install -r requirements.txt
FROM python:3.12-slim
# Security: non-root user
RUN groupadd -r app && useradd -r -g app -d /app -s /bin/bash app
# Runtime dependencies only
RUN apt-get update && \
apt-get install -y --no-install-recommends libpq5 curl tini && \
rm -rf /var/lib/apt/lists/*
# Copy Python packages
COPY --from=builder /install /usr/local
WORKDIR /app
COPY --chown=app:app . .
USER app
EXPOSE 5000
# tini handles PID 1 responsibilities
ENTRYPOINT ["tini", "--"]
HEALTHCHECK --interval=30s --timeout=5s --start-period=15s --retries=3 \
CMD curl -f http://localhost:5000/health || exit 1
CMD ["gunicorn", "--config", "gunicorn.conf.py", "app:app"]
# docker-compose.yml (production)
services:
  web:
    build: .
    ports:
      - "5000:5000"
    env_file:
      - .env
    depends_on:
      db:
        condition: service_healthy
      redis:
        condition: service_healthy
    restart: unless-stopped
    init: true
    deploy:
      resources:
        limits:
          memory: 512M
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"

  worker:
    build: .
    command: celery -A tasks worker -l info -c 2
    env_file:
      - .env
    depends_on:
      db:
        condition: service_healthy
      redis:
        condition: service_healthy
    restart: unless-stopped
    deploy:
      resources:
        limits:
          memory: 512M

  beat:
    build: .
    command: celery -A tasks beat -l info
    env_file:
      - .env
    depends_on:
      - redis
    restart: unless-stopped

  db:
    image: postgres:16-alpine
    environment:
      POSTGRES_USER: ${DB_USER:-app}
      POSTGRES_PASSWORD: ${DB_PASSWORD:?Set DB_PASSWORD}
      POSTGRES_DB: ${DB_NAME:-myapp}
    volumes:
      - pgdata:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${DB_USER:-app} -d ${DB_NAME:-myapp}"]
      interval: 5s
      timeout: 3s
      retries: 5
    deploy:
      resources:
        limits:
          memory: 256M

  redis:
    image: redis:7-alpine
    command: >
      redis-server
      --maxmemory 128mb
      --maxmemory-policy allkeys-lru
      --appendonly yes
    volumes:
      - redisdata:/data
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 5s
      timeout: 3s
      retries: 5

volumes:
  pgdata:
  redisdata:
Deploy script:
#!/bin/bash
# deploy.sh — zero-downtime deployment
set -euo pipefail
IMAGE_TAG="${1:?Usage: deploy.sh <version>}"
COMPOSE_FILE="docker-compose.yml"
echo "🚀 Deploying version: $IMAGE_TAG"
# Pull new images
docker compose -f "$COMPOSE_FILE" pull
# Rolling update — restart one at a time
docker compose -f "$COMPOSE_FILE" up -d --no-deps --build web
sleep 5
# Check health
for i in $(seq 1 10); do
    if curl -sf http://localhost:5000/health > /dev/null; then
        echo "✅ Web service healthy"
        break
    fi
    echo "⏳ Waiting for health check... ($i/10)"
    sleep 3
done
# Update workers
docker compose -f "$COMPOSE_FILE" up -d --no-deps --build worker beat
# Cleanup old images
docker image prune -f
echo "✅ Deployment complete: $IMAGE_TAG"
The AI Toolkit includes deployment scripts, web scrapers, API integrations, data pipelines, email automation, and more — all production-ready with Docker support and error handling.
Get the AI Toolkit — $19