
From 4 Minutes to 2 Seconds: Docker Build Optimization
How we achieved 100x faster Docker rebuilds and 38% smaller images while hardening containers for production
So our Docker build was taking almost 4 minutes every time we pushed. Even with cache. It was killing our CI/CD speed and honestly just embarrassing at this point. The image was also 562MB, which meant slow deployments and sluggish Kubernetes pod starts.
I fixed it. Here's what I did.
What Was Wrong
The old setup was your typical single-stage Dockerfile. Install everything, build everything, ship everything. It worked but it was bloated and slow.
The real issue? Cache wasn't working properly. Every rebuild was basically starting from scratch. And worse, the container ran as root with /bin/sh available. Not great for security.
The Fix: Multi-Stage Build with BuildKit Cache
I rewrote the Dockerfile into three stages:
# Stage 1: Dependencies - cached layer
FROM node:22-alpine AS deps
WORKDIR /app
# Assumption: pnpm comes from corepack and the repo has a pnpm-lock.yaml
RUN corepack enable
COPY package.json pnpm-lock.yaml ./
RUN --mount=type=cache,target=/root/.pnpm-store \
    pnpm install --frozen-lockfile --store-dir /root/.pnpm-store

# Stage 2: Builder - needs ALL deps
FROM node:22-alpine AS builder
WORKDIR /app
RUN corepack enable
COPY --from=deps /app/node_modules ./node_modules
COPY . .
RUN --mount=type=cache,target=/app/.next/cache pnpm build

# Stage 3: Runtime - distroless (no shell!)
FROM gcr.io/distroless/nodejs22-debian12 AS runner
WORKDIR /app
COPY --from=builder /app/.next/standalone ./
COPY --from=builder /app/.next/static .next/static
COPY --from=builder /app/public ./public
# distroless ships a "nonroot" user; it has no "node" user
USER nonroot
CMD ["server.js"]

The magic here is the BuildKit cache mounts. We mount two persistent caches:
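- ~/.pnpm-store keeps npm packages between builds
- /app/.next/cache is the Next.js build cache for incremental builds
Now when only source code changes, the dependency stage comes straight from the layer cache and package installation is skipped entirely. When dependencies do change, the pnpm store mount means only the new packages get downloaded.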
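To make the caching behaviour concrete, here is a rough TypeScript model of how a chain of build steps gets invalidated. It is a simplification and not Docker's actual implementation: the step names and the `inputs` strings are made up for illustration, and a plain chained string stands in for the checksums Docker would use.

```typescript
// Simplified model of layer caching (illustrative only, not Docker's real logic).
// Each step's cache key depends on its own inputs plus the key of the step before it.
type Step = { name: string; inputs: string };

// Chain the keys: changing one step's inputs changes its key and every key after it.
function cacheKeys(steps: Step[]): string[] {
  const keys: string[] = [];
  let parent = "";
  for (const step of steps) {
    parent = parent + "\n" + step.name + "|" + step.inputs;
    keys.push(parent);
  }
  return keys;
}

// Names of the steps that have to re-run compared with the previous build.
function stepsToRebuild(previous: Step[], current: Step[]): string[] {
  const before = cacheKeys(previous);
  const after = cacheKeys(current);
  return current.filter((_, i) => before[i] !== after[i]).map((s) => s.name);
}

// Hypothetical build: manifests are copied before the source on purpose.
const lastBuild: Step[] = [
  { name: "copy manifests", inputs: "package.json@v1 pnpm-lock.yaml@v1" },
  { name: "pnpm install", inputs: "" },
  { name: "copy source", inputs: "src@v1" },
  { name: "pnpm build", inputs: "" },
];

// Only the application source changed; the manifests are identical.
const sourceOnlyChange: Step[] = lastBuild.map((s) =>
  s.name === "copy source" ? { name: s.name, inputs: "src@v2" } : s
);

console.log(stepsToRebuild(lastBuild, sourceOnlyChange));
// [ 'copy source', 'pnpm build' ]
```

The install step keeps its old key because nothing before it changed, which is the whole reason for copying the manifests ahead of the source.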
The Numbers
- Cold build: 3m 39s → 2m 04s
- Cached rebuild: ~3-4 min → 1.9s
- Image size: 562MB → 348MB
That 1.9 second cached rebuild was the game changer. Our CI pipeline no longer dies on trivial code changes.
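If you want to sanity-check numbers like these for your own pipeline, the arithmetic is short. This sketch just plugs in the figures from the list above; the function names are mine.

```typescript
// How many times faster the new duration is.
function speedup(beforeSeconds: number, afterSeconds: number): number {
  return beforeSeconds / afterSeconds;
}

// Percentage reduction relative to the original value.
function percentReduction(before: number, after: number): number {
  return ((before - after) / before) * 100;
}

const coldBefore = 3 * 60 + 39; // 3m 39s = 219 s
const coldAfter = 2 * 60 + 4; // 2m 04s = 124 s

console.log(speedup(coldBefore, coldAfter).toFixed(2)); // "1.77" (cold build)
console.log(speedup(3 * 60, 1.9).toFixed(0)); // "95"  (cached rebuild, 3 min baseline)
console.log(speedup(4 * 60, 1.9).toFixed(0)); // "126" (cached rebuild, 4 min baseline)
console.log(percentReduction(562, 348).toFixed(1)); // "38.1" (image size, %)
```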
Security: Actually Doing It Right
The biggest win wasn't even the speed. It was actually securing the container properly.
We switched to distroless (gcr.io/distroless/nodejs22-debian12). This image has no shell, no package manager, nothing extra. Just Node.js.
security_opt:
  - no-new-privileges:true
read_only: true
cap_drop:
  - ALL
tmpfs:
  - /tmp:size=64M

The filesystem is read-only, we run as non-root, and all Linux capabilities are dropped.
If someone finds an RCE vulnerability in the app, they can't do much with it. There's no shell to spawn, no writable filesystem outside a small /tmp, no package manager to install tools with. The attack is contained to the Node.js process.
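It's easy for one of these settings to quietly disappear in a later refactor, so a small check in CI can help. The following is a minimal sketch, assuming the Compose service has already been parsed into a plain object. `ServiceConfig` and `auditService` are hypothetical names of mine, not part of any Docker tooling.

```typescript
// Hypothetical shape mirroring the Compose keys shown above; not an official type.
interface ServiceConfig {
  user?: string;
  read_only?: boolean;
  cap_drop?: string[];
  security_opt?: string[];
}

// Returns human-readable findings; an empty list means every check passed.
function auditService(service: ServiceConfig): string[] {
  const findings: string[] = [];
  if (service.read_only !== true) {
    findings.push("root filesystem is writable");
  }
  if ((service.cap_drop || []).indexOf("ALL") === -1) {
    findings.push("capabilities are not all dropped");
  }
  if ((service.security_opt || []).indexOf("no-new-privileges:true") === -1) {
    findings.push("no-new-privileges is not set");
  }
  // An unset user is left alone here: the image's own USER applies in that case.
  const user = service.user || "";
  if (user === "root" || user === "0" || user.indexOf("0:") === 0) {
    findings.push("service explicitly runs as root");
  }
  return findings;
}

const hardened: ServiceConfig = {
  read_only: true,
  cap_drop: ["ALL"],
  security_opt: ["no-new-privileges:true"],
};

console.log(auditService(hardened)); // []
console.log(auditService({ user: "root" }).length); // 4
```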
Nginx Layer
I also added nginx in front of Next.js to handle static assets properly. This means:
- Static files cached for 1 year
- Gzip compression (level 6, ~70% bandwidth savings)
- Rate limiting to protect against traffic spikes
- 65k worker connections for high concurrency
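location /_next/static/ {
    proxy_pass http://frontend;
    expires 1y;
    add_header Cache-Control "public, max-age=31536000, immutable";
    access_log off;
}

After the first request, static assets rarely hit the Node.js app: browsers keep them for a year, and Next.js fingerprints the filenames so immutable is safe. One caveat: with proxy_pass alone, nginx still forwards every request it does receive to Next.js. To have nginx answer them itself you also need proxy_cache, or nginx needs direct access to the files.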
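On the rate limiting bullet above: nginx's limit_req module is documented as using the leaky bucket method. Here is a simplified TypeScript model of that idea, just to show the mechanics. It is not nginx's implementation, and the rate and capacity values are made-up examples.

```typescript
// Simplified leaky bucket: each request adds one unit, the bucket drains at a fixed rate.
// Time is passed in explicitly so the behaviour is deterministic and easy to test.
function createLeakyBucket(ratePerSecond: number, capacity: number, startMs: number) {
  let level = 0;
  let lastMs = startMs;
  return function allow(nowMs: number): boolean {
    // Drain whatever leaked out since the last request.
    const drained = ((nowMs - lastMs) * ratePerSecond) / 1000;
    level = Math.max(0, level - drained);
    lastMs = nowMs;
    if (level + 1 > capacity) {
      return false; // bucket would overflow: reject the request
    }
    level += 1;
    return true;
  };
}

// Example values (assumptions): 10 requests/second sustained, bursts of up to 5.
const allow = createLeakyBucket(10, 5, 0);

const burst = [0, 0, 0, 0, 0, 0].map((t) => allow(t));
console.log(burst); // [ true, true, true, true, true, false ]

console.log(allow(100)); // true  (one unit drained after 100 ms)
console.log(allow(100)); // false (bucket is full again)
```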
What's Next
Still have some things on the list. Here's how I'd actually implement them.
Add a CDN for Edge Caching
Cloudflare is the easiest option. Point your DNS to Cloudflare, enable caching, and you're done.
For Next.js specifically, set up aggressive cache headers in next.config.ts:
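headers: async () => [
  {
    source: '/(.*)',
    headers: [
      { key: 'Cache-Control', value: 'public, max-age=86400, stale-while-revalidate=31536000' },
    ],
  },
]

Cloudflare will cache static assets at the edge. Your server barely gets touched for repeat visits. Be careful with the '/(.*)' source though: it marks every matching response as publicly cacheable for a day, so narrow the pattern if any page or API route returns user-specific data.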
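If a blanket rule is too broad for your app, the same option can carry one rule per path group. This is a sketch under assumptions: the `/images/` and `/api/` prefixes are hypothetical, the cache lifetimes are placeholders, and the types are declared locally so it runs without importing anything from Next.js.

```typescript
// Local types so the sketch is self-contained (not imported from "next").
type Header = { key: string; value: string };
type HeaderRule = { source: string; headers: Header[] };

const ONE_DAY_SECONDS = 24 * 60 * 60; // 86400

const headerRules: HeaderRule[] = [
  {
    // Hypothetical public asset folder: fine to share across users and edge caches.
    source: "/images/:path*",
    headers: [
      { key: "Cache-Control", value: `public, max-age=${ONE_DAY_SECONDS}, stale-while-revalidate=${ONE_DAY_SECONDS}` },
    ],
  },
  {
    // Hypothetical API prefix: keep responses out of shared caches entirely.
    source: "/api/:path*",
    headers: [{ key: "Cache-Control", value: "private, no-store" }],
  },
];

// Same shape as the `headers` option in the snippet above.
const cachingConfig = {
  headers: async (): Promise<HeaderRule[]> => headerRules,
};

console.log(headerRules.map((rule) => rule.source)); // [ '/images/:path*', '/api/:path*' ]
```

Whether a given header actually survives in production depends on the Next.js version and route type, so check the response headers after deploying.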
Try Brotli Compression
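Brotli typically compresses text assets 15-20% smaller than gzip. Open-source nginx doesn't ship with it, so you need the ngx_brotli module built in or loaded first. Once that's there, enable it: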
brotli on;
brotli_types text/plain text/css application/json application/javascript text/xml application/xml;
brotli_comp_level 6;

That's it. Nginx will automatically use brotli if the browser supports it.
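"If the browser supports it" comes down to the Accept-Encoding request header. Here's a rough TypeScript sketch of that negotiation. It is my own simplified model, not what nginx does internally: it honours q=0 as a refusal but otherwise ignores q-value ordering and simply prefers the server's own order.

```typescript
// Pick the first server-supported encoding that the client accepts.
// Simplified: q=0 means "refused"; any other q-value is treated as acceptable.
function pickEncoding(acceptEncoding: string, supported: string[] = ["br", "gzip"]): string {
  const accepted = new Map<string, number>();
  for (const part of acceptEncoding.split(",")) {
    const pieces = part.trim().toLowerCase().split(";");
    const name = (pieces[0] || "").trim();
    if (name === "") continue;
    let q = 1;
    for (const param of pieces.slice(1)) {
      const match = param.trim().match(/^q=([0-9.]+)$/);
      if (match) q = parseFloat(match[1] || "1");
    }
    accepted.set(name, q);
  }
  for (const encoding of supported) {
    if ((accepted.get(encoding) || 0) > 0) return encoding;
  }
  return "identity"; // no compression
}

console.log(pickEncoding("gzip, deflate, br")); // "br"
console.log(pickEncoding("gzip, deflate")); // "gzip"
console.log(pickEncoding("br;q=0, gzip")); // "gzip"
console.log(pickEncoding("")); // "identity"
```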
Set Up Prometheus + Grafana
This is the monitoring stack I want. Export metrics from your Node.js app using prom-client:
import { register, Counter, Histogram } from 'prom-client'

const httpRequestDuration = new Histogram({
  name: 'http_request_duration_seconds',
  help: 'Duration of HTTP requests in seconds',
  labelNames: ['method', 'route', 'status_code'],
  buckets: [0.1, 0.3, 0.5, 1, 3, 5]
})

// In your request handler (duration is in seconds, not milliseconds)
httpRequestDuration.observe({ method: 'GET', route: '/api', status_code: 200 }, duration)

Then Prometheus scrapes the /metrics endpoint and Grafana visualizes it. You'll see request rates, error rates, latency percentiles, and memory usage (the last one if you also turn on prom-client's collectDefaultMetrics()).
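The bucket list is the part that's easy to get wrong, so it helps to see what observe does with it. This is a simplified model of a Prometheus-style cumulative histogram in plain TypeScript. It is not prom-client's implementation, and the sample durations are invented.

```typescript
// Simplified model of a cumulative histogram (illustrative, not prom-client's code).
const bucketBounds = [0.1, 0.3, 0.5, 1, 3, 5]; // seconds, same bounds as the snippet above

function createHistogram(bounds: number[]) {
  // One counter per bound, plus a final slot for the implicit +Inf bucket.
  const counts: number[] = bounds.map(() => 0).concat([0]);
  let sum = 0;
  return {
    observe(seconds: number): void {
      sum += seconds;
      // Buckets are cumulative: a value counts toward every bound it fits under.
      bounds.forEach((upperBound, i) => {
        if (seconds <= upperBound) counts[i] += 1;
      });
      counts[bounds.length] += 1; // +Inf always matches
    },
    snapshot(): { counts: number[]; sum: number } {
      return { counts: counts.slice(), sum };
    },
  };
}

const histogram = createHistogram(bucketBounds);
[0.05, 0.2, 0.2, 0.8, 4].forEach((duration) => histogram.observe(duration));

console.log(histogram.snapshot().counts); // [ 1, 3, 3, 4, 4, 5, 5 ]
```

Percentiles in Grafana are estimated from these cumulative counts, so if most of your requests land in one bucket, add finer bounds around that range.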
Those are still on the list, but the core infrastructure is solid now. Fast builds, small images, actually secure containers.
If you're still running root in your containers with a shell available, fix that first. It's 2026, no excuse.