TL;DR - Quick Answer

Zero-downtime deployment means releasing updates without service interruption
Requires: proper readiness probes, rolling update strategy, graceful shutdown handling
Common strategies: Rolling Updates (default), Blue-Green, Canary Releases
Most failures are caused by missing readiness probes or improper connection draining

What is Zero-Downtime Deployment?

Zero-downtime deployment is a deployment strategy where application updates are released without any service interruption. Users continue to access the application normally throughout the entire deployment process - they don't experience errors, timeouts, or any indication that an update is happening.

In Kubernetes, this is achieved through a combination of proper deployment configuration, health checks, and graceful shutdown handling. When done correctly, you can deploy multiple times per day with complete confidence that users won't be affected.

Why Zero-Downtime Matters

Every minute of downtime has a cost:

Revenue loss - E-commerce sites lose sales, SaaS platforms lose usage-based revenue
User trust - Frequent outages erode confidence in your platform
Developer velocity - Fear of deployments leads to infrequent releases with larger, riskier changes
On-call burden - Risky deployments mean more incidents and stressed engineers

Teams that achieve reliable zero-downtime deployments deploy more frequently, ship smaller changes, and have higher confidence in their releases.

Deployment Strategies Compared

Strategy	How It Works	Best For
Rolling Update	Gradually replaces old pods with new ones	Most applications (default strategy)
Blue-Green	Two identical environments, switch traffic at once	Database migrations, major changes
Canary	Route small % of traffic to new version first	High-risk changes, gradual rollout
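
To make the canary idea concrete, here is a minimal sketch of percentage-based routing. In a real cluster the split is normally done by an ingress controller or service mesh rather than application code; the function name `choose_version` and the hashing scheme are illustrative assumptions, not a Kubernetes API.

```python
import hashlib

def choose_version(user_id, canary_percent):
    """Deterministically route a user to 'canary' or 'stable'."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # bucket in 0..99
    return "canary" if bucket < canary_percent else "stable"
```

Hashing the user ID keeps each user on the same version for the whole rollout, which avoids flip-flopping between old and new behavior as the percentage is raised.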

Step 1: Configure Readiness Probes

The readiness probe tells Kubernetes when your pod is ready to receive traffic. Without one, Kubernetes treats a pod as ready as soon as its containers start and may route traffic to it before the application can serve requests, causing errors during deployment.

spec:
  containers:
  - name: app
    readinessProbe:
      httpGet:
        path: /health
        port: 8080
      initialDelaySeconds: 5
      periodSeconds: 5
      failureThreshold: 3

Common Mistake

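Don't rely on a liveness probe alone! Liveness probes restart unhealthy containers, while readiness probes control traffic routing. Zero-downtime rollouts depend on the readiness probe; the liveness probe complements it by recovering containers that have hung.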

Your health endpoint should verify that the application is truly ready:

Database connections are established
Cache is warmed (if needed)
Dependencies are reachable
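
A minimal sketch of such an endpoint, using only the Python standard library. The check functions `database_connected` and `cache_warmed` are hypothetical placeholders to be replaced with real tests; the path and port are assumed to match the probe configuration above.

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical readiness checks; replace with real connectivity tests.
def database_connected():
    return True

def cache_warmed():
    return True

CHECKS = {"database": database_connected, "cache": cache_warmed}

def readiness_status(checks):
    """Return (HTTP status, body): 200 only if every check passes."""
    failed = [name for name, check in checks.items() if not check()]
    if failed:
        return 503, "not ready: " + ", ".join(failed)
    return 200, "ready"

class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/health":
            self.send_error(404)
            return
        status, body = readiness_status(CHECKS)
        payload = body.encode()
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

def serve(port=8080):
    """Start the health server; blocks until the process is stopped."""
    ThreadingHTTPServer(("0.0.0.0", port), HealthHandler).serve_forever()
```

Kubernetes treats any HTTP status from 200 to 399 as a passing probe, so returning 503 while a check fails keeps the pod out of the Service endpoints until it recovers.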

Step 2: Configure Rolling Update Strategy

The rolling update strategy controls how Kubernetes replaces old pods with new ones. Two key parameters:

spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0
      maxSurge: 1

maxUnavailable: 0 - Never drop below the desired number of ready pods during the rollout (zero-downtime)
maxSurge: 1 - Allow at most one pod above the desired count during the rollout

With this configuration, Kubernetes will:

Create a new pod with the updated version
Wait for it to pass readiness checks
Start routing traffic to the new pod
Terminate an old pod
Repeat until all pods are updated
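
The sequence above can be sketched as a small simulation. This is a simplified model, not the Deployment controller's actual algorithm: it assumes every new pod passes its readiness probe immediately and that old pods disappear as soon as they are terminated. The function name `simulate_rollout` is made up for illustration.

```python
def simulate_rollout(replicas, max_surge=1, max_unavailable=0):
    """Yield (old_ready, new_ready) after each step of a simplified rollout."""
    if max_surge == 0 and max_unavailable == 0:
        raise ValueError("maxSurge and maxUnavailable cannot both be 0")
    old, new = replicas, 0
    yield old, new
    while old > 0 or new < replicas:
        # Scale up: total pods may not exceed replicas + max_surge.
        add = min(replicas - new, replicas + max_surge - (old + new))
        if add > 0:
            new += add  # assumption: new pods become ready right away
            yield old, new
        # Scale down: ready pods may not drop below replicas - max_unavailable.
        remove = min(old, (old + new) - (replicas - max_unavailable))
        if remove > 0:
            old -= remove
            yield old, new
```

With `replicas=3`, `max_surge=1`, and `max_unavailable=0`, the total number of ready pods in this model alternates between 3 and 4 and never drops below 3, which is the capacity guarantee the configuration is meant to provide.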

Step 3: Implement Graceful Shutdown

When Kubernetes terminates a pod, it sends a SIGTERM signal. Your application must handle this signal gracefully:

Stop accepting new requests
Finish processing in-flight requests
Close database connections cleanly
Exit the process

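# Python example (standard library only)
import signal
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"ok\n")

server = ThreadingHTTPServer(("0.0.0.0", 8080), Handler)
server.daemon_threads = False  # make server_close() wait for in-flight requests

def graceful_shutdown(signum, frame):
    print("Received SIGTERM, shutting down gracefully...")
    # shutdown() blocks until serve_forever() returns, so run it in another thread
    threading.Thread(target=server.shutdown).start()

signal.signal(signal.SIGTERM, graceful_shutdown)

server.serve_forever()  # returns once shutdown() stops the accept loop
server.server_close()   # closes the listening socket and waits for in-flight requests
# Close database connections here, then let the process exit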

Step 4: Add PreStop Hook

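There's a race condition: when a pod is deleted, Kubernetes removes it from the Service endpoints and begins terminating it in parallel, not in sequence. Until every proxy, load balancer, and ingress controller has picked up the endpoint change, some traffic may still arrive at the terminating pod.

The preStop hook runs before SIGTERM is sent to the container, so a short sleep there delays shutdown and gives endpoint updates time to propagate: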

spec:
  containers:
  - name: app
    lifecycle:
      preStop:
        exec:
          command: ["/bin/sh", "-c", "sleep 10"]

This 10-second sleep gives load balancers and ingress controllers time to stop sending traffic to this pod before it begins shutdown. The container image must include a shell and the sleep binary for this command to work.

Pro Tip

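Set terminationGracePeriodSeconds to at least the preStop sleep plus your application's shutdown time. The preStop hook runs inside the grace period, and the container is killed with SIGKILL when the period expires. The default is 30 seconds.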
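
The arithmetic is simple enough to encode as a sanity check. This is a sketch under stated assumptions: the helper names are hypothetical, and the 5-second `margin` is an arbitrary safety buffer, not a Kubernetes recommendation.

```python
def min_termination_grace_period(pre_stop_sleep, app_shutdown_timeout, margin=5):
    """Smallest terminationGracePeriodSeconds that fits the whole shutdown sequence."""
    return pre_stop_sleep + app_shutdown_timeout + margin

def fits_grace_period(grace_period, pre_stop_sleep, app_shutdown_timeout, margin=5):
    """True if the configured grace period covers preStop plus application shutdown."""
    return grace_period >= min_termination_grace_period(
        pre_stop_sleep, app_shutdown_timeout, margin
    )
```

For example, with a 10-second preStop sleep and an application assumed to need up to 30 seconds to drain, the minimum comes to 45 seconds. The default of 30 would be too short, while a setting of 60 leaves headroom.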

Complete Example

Here's a complete deployment configuration for zero-downtime:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
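  replicas: 3
  selector:                # required by apps/v1; must match the pod template labels
    matchLabels:
      app: my-app          # illustrative label
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0
      maxSurge: 1
  template:
    metadata:
      labels:
        app: my-app
    spec: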
      terminationGracePeriodSeconds: 60
      containers:
      - name: app
        image: my-app:v2
        ports:
        - containerPort: 8080
        readinessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 5
          failureThreshold: 3
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 15
          periodSeconds: 10
        lifecycle:
          preStop:
            exec:
              command: ["/bin/sh", "-c", "sleep 10"]

Troubleshooting Common Issues

1. Errors During Deployment

Symptom: Users see 502/503 errors during deployments.
Cause: Usually a missing or misconfigured readiness probe.
Fix: Ensure the readiness probe reports healthy only when the app is truly ready.

2. Connection Resets

Symptom: In-flight requests fail with connection reset.
Cause: App not handling SIGTERM gracefully.
Fix: Implement graceful shutdown and add preStop hook.

3. Slow Rollouts

Symptom: Deployments take too long.
Cause: Usually slow readiness probes or conservative settings.
Fix: Tune initialDelaySeconds and periodSeconds, increase maxSurge.
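
To see how these settings interact, here is a rough back-of-the-envelope estimate. It assumes maxUnavailable: 0, that pods are replaced in batches of maxSurge, and that every batch takes the same time to become ready; it ignores scheduling and image pulls, so treat the result as a lower bound. The function name and the `seconds_until_ready` parameter are illustrative.

```python
import math

def estimate_rollout_seconds(replicas, max_surge, seconds_until_ready):
    """Rough lower bound on rollout time when pods are replaced in batches of max_surge."""
    batches = math.ceil(replicas / max_surge)
    return batches * seconds_until_ready
```

Under these assumptions, 10 replicas that each need 10 seconds to become ready take at least 100 seconds with maxSurge: 1, but only about 40 seconds with maxSurge: 3.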

Key Takeaways

Zero-downtime deployment is achievable with proper configuration
Readiness probes are critical - they control traffic routing
Use maxUnavailable: 0 to ensure capacity throughout deployment
Implement graceful shutdown in your application
Add preStop hook to handle the endpoint update race condition
Test your deployment strategy before relying on it in production
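
One way to test a rollout is to poll the service continuously while the deployment is in progress and count failed requests. The sketch below uses only the standard library; the function names, polling interval, and the choice to count 5xx responses and connection errors as failures are all assumptions to adapt to your setup.

```python
import time
import urllib.error
import urllib.request

def check_once(url, timeout=2.0):
    """Return the HTTP status code, or None if the request failed outright."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered with a 4xx/5xx status
    except (urllib.error.URLError, OSError):
        return None      # connection refused, reset, or timed out

def monitor(url, duration_seconds, interval_seconds=0.2):
    """Poll url for duration_seconds and return (total_requests, failures)."""
    total = failures = 0
    deadline = time.monotonic() + duration_seconds
    while time.monotonic() < deadline:
        status = check_once(url)
        total += 1
        if status is None or status >= 500:
            failures += 1
        time.sleep(interval_seconds)
    return total, failures
```

Run `monitor(...)` against your service URL in a staging environment, trigger a rollout (for example with `kubectl rollout restart deployment/my-app`), and check that the failure count stays at zero.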


Kubernetes Zero-Downtime Deployments: A Complete Guide