When Docker Compose Hit the Ceiling
I remember that day vividly. Our API was handling ~5000 RPS at peak times. Docker Compose ran 8 app replicas across three servers. Deployment looked like this: SSH into each server, git pull, docker-compose up -d --build, pray everything doesn't crash. It usually did.
Problems snowballed:
- Rolling update? Manual, one server at a time. 30-60 seconds downtime on each.
- One server died at night. Load balancer kept sending traffic there — we learned about it in the morning from angry customers.
- Autoscaling? Forget it. Load spikes → I manually spin up another server → an hour later everything calms down → the server sits idle and I'm paying for nothing.
- Health checks existed but were primitive: container alive ≠ app working.
At some point it became clear: further scaling with Compose turns into maintenance hell. We needed an orchestrator. The question was — which one?
Docker Compose works great for local development and small production setups (1-3 servers, up to ~500 RPS). But when load grows and downtime costs money, you need more serious tools.
Real Reasons to Migrate to Kubernetes (Not Hype)
Let's cut through the marketing BS. Here are specific problems Kubernetes solves that Compose doesn't solve or solves with duct tape.
Reason 1: Zero-Downtime Deployments Out of the Box
With Compose:
You stop the container → build new image → start it. Even with docker-compose up -d --no-deps --build service, there's a gap of several seconds. In production, users see this.
With Kubernetes:
Rolling update is native. New pods start → pass readiness probe → begin receiving traffic → old ones gracefully terminate. Downtime = 0. Rollback — one command: kubectl rollout undo.
# Deploy new version
kubectl set image deployment/myapp myapp=myapp:v2
# Rollback if something went wrong
kubectl rollout undo deployment/myapp
Reason 2: Automatic Recovery (Self-Healing)
With Compose:
Container died → restart: unless-stopped restarts it. But if the whole server crashed? Manually SSH to a new one, redeploy the stack.
With Kubernetes: Pod died → scheduler automatically starts a new one. Node (server) died → pods migrate to healthy nodes. Everything automatic, without your involvement at 3 AM.
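You can watch self-healing happen locally in a couple of commands (a quick illustration; it assumes a Deployment labeled app=myapp is already running):
# Kill a pod by hand and watch the Deployment replace it
kubectl get pods -l app=myapp
kubectl delete pod <pod-name>
kubectl get pods -l app=myapp -w   # a new pod appears within seconds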
Reason 3: Declarative Configuration and GitOps
With Compose:
docker-compose.yml exists, but state is managed with imperative commands (docker-compose up, docker-compose scale). What's running in prod right now? Need to SSH and check.
With Kubernetes:
Everything described in YAML manifests. Want 5 app replicas? Write replicas: 5, apply kubectl apply -f deployment.yaml. Kubernetes brings cluster state to desired state. Git is the single source of truth.
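A minimal illustration of that declarative loop (file and deployment names are assumptions):
# Desired state lives in the manifest
kubectl apply -f deployment.yaml
# Preview what would change after you edit the manifest
kubectl diff -f deployment.yaml
# Inspect the live state the cluster holds right now
kubectl get deployment myapp -o yaml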
GitOps example:
# Commit changes to manifests
git commit -m "Scale app to 10 replicas"
git push
# ArgoCD or Flux automatically applies changes
# Rollback? Just git revert and push
Reason 4: Built-in Health Checks and Liveness Probes
With Compose:
You can set healthcheck in docker-compose.yml, but it's primitive: container responds to /health → considered alive. If DB is down but container is alive — you get 500 errors.
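For reference, this is roughly what that primitive check looks like in Compose (a sketch; the endpoint, image, and timings are assumptions):
# docker-compose.yml (fragment)
services:
  app:
    image: myapp:latest
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]  # assumes curl exists in the image
      interval: 30s
      timeout: 5s
      retries: 3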
With Kubernetes: Two types of checks:
- Liveness probe: Is app alive? If not → kill pod and start new one.
- Readiness probe: Is app ready to receive traffic? If not → don't send requests to this pod.
livenessProbe:
httpGet:
path: /health/live
port: 8080
initialDelaySeconds: 30
periodSeconds: 10
readinessProbe:
httpGet:
path: /health/ready
port: 8080
initialDelaySeconds: 5
  periodSeconds: 5
Result: pods that are still starting or already struggling don't receive traffic. Fewer 502s, fewer customer complaints.
Reason 5: Horizontal Pod Autoscaling (HPA)
With Compose:
No autoscaling. Load goes up → SSH in and manually run docker-compose scale app=10. Load drops → manually scale back down. Or write a hacky cron script.
With Kubernetes: Horizontal Pod Autoscaler (HPA) out of the box. Configure metric (CPU, memory, custom metric from Prometheus), and k8s automatically scales pods.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: myapp-hpa
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: myapp
minReplicas: 3
maxReplicas: 20
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
        averageUtilization: 70
CPU above 70% → pods automatically added. CPU below → removed. You sleep peacefully.
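One caveat: HPA needs a metrics source. Managed clusters usually ship metrics-server out of the box; on a bare cluster you may need to install it yourself (a sketch, using the standard metrics-server manifest):
# Install metrics-server (skip if your cluster already has it)
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
# Verify the HPA actually sees CPU metrics
kubectl get hpa myapp-hpa
kubectl top pods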
When Compose Is Enough (Honestly)
Don't rush to rewrite everything in Kubernetes. Here are scenarios when Compose is the right choice:
- Local development — Compose is simpler, faster, and uses fewer resources.
- Monolith on 1-2 servers — If you have 100-200 RPS and don't care about a minute of downtime once a month during deployment.
- Prototypes and MVPs — Why complicate if the project might not take off?
- Team without DevOps — Kubernetes requires knowledge. If you don't have it, Compose or managed solutions (Render, Fly.io) are better.
Kubernetes isn't a silver bullet. It's a complex tool that solves complex problems. If your problems aren't complex, don't complicate your life.
Kubernetes Under the Hood: What It Really Is
Kubernetes (k8s) is a container orchestrator. Simplified: you tell k8s "I need 5 app replicas, each with 2 CPU and 4GB RAM", and it decides which servers to run them on, monitors their health, and restarts them on failure.
Key Concepts (Minimum to Get Started)
1. Pod — minimal deployment unit. One or more containers that always run together on one node.
apiVersion: v1
kind: Pod
metadata:
name: myapp-pod
spec:
containers:
- name: myapp
image: myapp:latest
ports:
    - containerPort: 8080
2. Deployment — describes desired state: how many pod replicas, which image, update strategy.
apiVersion: apps/v1
kind: Deployment
metadata:
name: myapp
spec:
replicas: 3
selector:
matchLabels:
app: myapp
template:
metadata:
labels:
app: myapp
spec:
containers:
- name: myapp
image: myapp:v1.0.0
ports:
        - containerPort: 8080
3. Service — network abstraction. Provides stable IP/DNS for a set of pods. Types: ClusterIP (inside cluster), NodePort (external access), LoadBalancer (cloud balancer).
apiVersion: v1
kind: Service
metadata:
name: myapp-service
spec:
selector:
app: myapp
ports:
- protocol: TCP
port: 80
targetPort: 8080
  type: LoadBalancer
4. Ingress — HTTP/HTTPS routing from outside into services. Like Nginx reverse proxy but declarative.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: myapp-ingress
spec:
rules:
- host: myapp.example.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: myapp-service
port:
              number: 80
5. ConfigMap and Secret — configuration and secrets. Instead of variables in docker-compose.yml, they live separately.
apiVersion: v1
kind: ConfigMap
metadata:
name: myapp-config
data:
DATABASE_URL: "postgresql://user@db:5432/mydb"
LOG_LEVEL: "info"
---
apiVersion: v1
kind: Secret
metadata:
name: myapp-secrets
type: Opaque
data:
  DB_PASSWORD: cGFzc3dvcmQxMjM= # base64 encoded
What k8s Solves
- ✅ Orchestration: where to run pods, how to balance them
- ✅ Scaling: horizontal and vertical
- ✅ Self-healing: automatic restart and rescheduling
- ✅ Service discovery: pods find each other via DNS
- ✅ Rolling updates and rollbacks
- ✅ Secrets management: storage and injection of secrets
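Service discovery deserves a quick illustration: every Service gets a stable DNS name, so pods reach each other without hardcoded IPs (the service name and namespace below are assumptions):
# From inside any pod in the cluster ("default" namespace assumed)
curl http://myapp-service                              # short name works within the same namespace
curl http://myapp-service.default.svc.cluster.local    # fully qualified name works from anywhere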
What k8s Doesn't Solve
- ❌ App state management — that's your job
- ❌ Monitoring and logging — need Prometheus, Loki, Grafana
- ❌ CI/CD — need GitLab CI, ArgoCD, Flux
- ❌ Backup and disaster recovery — need Velero and manual planning
- ❌ Security — entire separate domain (RBAC, Network Policies, Pod Security Standards)
Kubernetes is a platform, not a turnkey solution. It's Lego from which you build your infrastructure. You need additional tools and knowledge.
Helm Charts in Practice
Helm is a "package manager" for Kubernetes. Like apt for Ubuntu or npm for Node.js, only for k8s manifests.
Why You Need Helm
Imagine: you have an app with Deployment, Service, Ingress, ConfigMap, Secret. That's ~200 lines of YAML. Now you want to deploy this to 3 environments: dev, staging, prod. With different parameters: replica count, resource limits, domains.
Without Helm: Copy-paste YAML files, Find & Replace values. Maintenance nightmare.
With Helm: One template (chart), parameters in values.yaml. Deploy to any environment — one command.
Helm Chart Structure
myapp/
Chart.yaml # Metadata: name, version
values.yaml # Default variable values
templates/
deployment.yaml # Deployment template
service.yaml # Service template
ingress.yaml # Ingress template
configmap.yaml # ConfigMap template
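Before installing anything, it's worth checking that the chart renders the way you expect (a sketch; the chart path and values file names are assumptions):
# Static checks on the chart
helm lint ./myapp
# Render templates locally without touching the cluster
helm template myapp ./myapp -f values-dev.yaml
# Or do a dry run against the cluster
helm upgrade --install myapp ./myapp --dry-run --debug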
Migration from docker-compose to Helm (Practical Example)
Before (docker-compose.yml):
version: "3.8"
services:
app:
image: myapp:latest
ports:
- "8080:8080"
environment:
DATABASE_URL: postgresql://user:pass@db:5432/mydb
REDIS_URL: redis://redis:6379
depends_on:
- db
- redis
db:
image: postgres:15
environment:
POSTGRES_PASSWORD: secretpass
volumes:
- db_data:/var/lib/postgresql/data
redis:
image: redis:7-alpine
volumes:
  db_data:
After (Helm Chart):
values.yaml:
replicaCount: 3
image:
repository: myapp
tag: "v1.0.0"
pullPolicy: IfNotPresent
service:
type: LoadBalancer
port: 80
targetPort: 8080
ingress:
enabled: true
host: myapp.example.com
env:
DATABASE_URL: postgresql://user@postgres:5432/mydb
REDIS_URL: redis://redis:6379
resources:
limits:
cpu: 500m
memory: 512Mi
requests:
cpu: 250m
memory: 256Mi
autoscaling:
enabled: true
minReplicas: 3
maxReplicas: 10
  targetCPUUtilizationPercentage: 70
templates/deployment.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "myapp.fullname" . }}
spec:
  replicas: {{ .Values.replicaCount }}
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: myapp
        image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
        ports:
        - containerPort: {{ .Values.service.targetPort }}
        env:
        - name: DATABASE_URL
          value: {{ .Values.env.DATABASE_URL }}
        - name: REDIS_URL
          value: {{ .Values.env.REDIS_URL }}
        resources: {{- toYaml .Values.resources | nindent 10 }}
        livenessProbe:
          httpGet:
            path: /health
            port: {{ .Values.service.targetPort }}
          initialDelaySeconds: 30
        readinessProbe:
          httpGet:
            path: /health/ready
            port: {{ .Values.service.targetPort }}
          initialDelaySeconds: 5
Deployment:
# Install chart
helm install myapp ./myapp
# Upgrade with new parameters
helm upgrade myapp ./myapp --set image.tag=v1.1.0
# Rollback to previous version
helm rollback myapp
# Uninstall
helm uninstall myapp
Different environments:
# Dev (1 replica, small resources)
helm install myapp-dev ./myapp -f values-dev.yaml
# Staging (3 replicas)
helm install myapp-staging ./myapp -f values-staging.yaml
# Production (10 replicas, HPA)
helm install myapp-prod ./myapp -f values-prod.yaml
Ready-Made Helm Charts
No need to write everything from scratch. There's a huge repository of ready charts: Artifact Hub.
Popular charts:
# PostgreSQL
helm repo add bitnami https://charts.bitnami.com/bitnami
helm install postgres bitnami/postgresql
# Redis
helm install redis bitnami/redis
# Nginx Ingress Controller
helm install nginx-ingress ingress-nginx/ingress-nginx
# Prometheus + Grafana (monitoring)
helm install monitoring prometheus-community/kube-prometheus-stack
Helm saves dozens of hours. Instead of writing hundreds of lines of YAML, take a ready chart, customize values.yaml, deploy. Profit.
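A typical workflow with a ready-made chart looks something like this (chart and file names are just examples):
# Dump the chart's default values to see what can be overridden
helm show values bitnami/postgresql > pg-values.yaml
# Edit pg-values.yaml, then install with your overrides
helm install postgres bitnami/postgresql -f pg-values.yaml
# Later, upgrade in place with the same values file
helm upgrade postgres bitnami/postgresql -f pg-values.yaml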
Local Development with Kubernetes
Before running k8s in production, you need to learn to work with it locally. Problem: full k8s requires multiple servers and gigabytes of RAM. Solution: local k8s distributions.
kind vs minikube vs k3d — Comparison
| Criterion | kind | minikube | k3d |
|---|---|---|---|
| Foundation | Docker containers | VM (VirtualBox/Docker) | k3s in Docker |
| Startup speed | ✅ Very fast (10-20 sec) | ⚠️ Slow (1-2 min) | ✅ Fast (20-30 sec) |
| RAM | ✅ Low (~2GB) | ❌ High (~4-8GB) | ✅ Low (~1-2GB) |
| Multi-node cluster | ✅ Yes | ✅ Yes (harder) | ✅ Yes |
| LoadBalancer support | ⚠️ Via MetalLB | ✅ Out of box | ✅ Out of box |
| Closeness to prod | ✅ Very close | ✅ Close | ⚠️ k3s != full k8s |
| Best for | CI/CD, testing | Learning, dev | Fast dev |
My choice:
- kind — for CI/CD and when you want maximum closeness to real k8s.
- minikube — for learning and experiments, if RAM isn't an issue.
- k3d — for daily development when you need speed.
Practical Setup: kind
Installation (macOS):
brew install kind kubectl
# Or via binary
curl -Lo ./kind https://kind.sigs.k8s.io/dl/v0.20.0/kind-darwin-arm64
chmod +x ./kind
mv ./kind /usr/local/bin/kind
Creating cluster:
# Basic single-node cluster
kind create cluster --name dev
# Multi-node cluster (1 control-plane + 2 workers)
cat <<EOF | kind create cluster --config=-
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
- role: worker
EOF
Verification:
kubectl cluster-info
kubectl get nodes
Deploying app:
# Load local image into kind
kind load docker-image myapp:latest --name dev
# Deploy
kubectl apply -f deployment.yaml
kubectl get pods
kubectl logs -f <pod-name>
# Port forward for access
kubectl port-forward deployment/myapp 8080:8080
# Now available at http://localhost:8080
Deleting cluster:
kind delete cluster --name dev
Practical Setup: k3d
Installation:
brew install k3d
# Or curl
curl -s https://raw.githubusercontent.com/k3d-io/k3d/main/install.sh | bash
Creating cluster with LoadBalancer:
# Cluster with 3 worker nodes and LoadBalancer on port 8080
k3d cluster create dev \
--agents 3 \
  --port "8080:80@loadbalancer"
Deployment:
kubectl apply -f deployment.yaml
kubectl apply -f service.yaml
# If Service type: LoadBalancer, available at localhost:8080
curl http://localhost:8080
Stop/start cluster:
k3d cluster stop dev
k3d cluster start dev
How Not to Kill Your Laptop
Resources:
- Minimum: 8GB RAM, 4 CPU cores
- Recommended: 16GB RAM, 6+ CPU cores
Optimization:
- Limit Docker Desktop resources: Settings → Resources → 6GB RAM, 4 CPUs
- Use resource limits in manifests (e.g. limits: cpu: 200m, memory: 256Mi).
- Don't run everything at once. Testing the API? Run only the API + DB, without the frontend and other microservices.
- Use kubectl delete after tests. Stopped pods still consume RAM.
Local k8s isn't a toy. It genuinely consumes resources. If your laptop struggles, use a cloud dev cluster (GKE Autopilot, EKS, DigitalOcean Kubernetes). 1-2 worker nodes cost $10-20/month.
Monitoring and Debugging in Kubernetes
Kubernetes is a black box until you set up observability. Here's the minimum survival kit.
kubectl — Your Main Tool
Basic commands:
# Get pod list
kubectl get pods
# Detailed pod info
kubectl describe pod <pod-name>
# Pod logs
kubectl logs <pod-name>
kubectl logs <pod-name> -f # follow (real-time)
kubectl logs <pod-name> --previous # logs of previous crashed container
# Logs from all deployment pods
kubectl logs -l app=myapp --all-containers=true
# Exec into pod (like docker exec)
kubectl exec -it <pod-name> -- /bin/bash
# Port forward
kubectl port-forward pod/<pod-name> 8080:8080
# Cluster events
kubectl get events --sort-by='.lastTimestamp'
# Top (CPU/Memory usage)
kubectl top nodes
kubectl top pods
Troubleshooting common issues:
1. Pod in Pending status
kubectl describe pod <pod-name>
# Check Events: usually "Insufficient CPU/Memory" or "No nodes available"
Solution: Increase cluster resources or decrease pod requests.
2. Pod in CrashLoopBackOff status
kubectl logs <pod-name> --previous
# See why container crashed
Solution: Usually code error, wrong config, or unavailable dependency (DB, Redis).
3. Pod Running but not responding
kubectl describe pod <pod-name>
# Check Readiness probe
Solution: Pod didn't pass readiness probe → not receiving traffic. Check /health/ready endpoint.
4. Service unavailable
kubectl get svc
kubectl get endpoints <service-name>
# Endpoints empty? Pods didn't pass readiness probe
Monitoring: Prometheus + Grafana
Quick setup via Helm:
# Add repository
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
# Install kube-prometheus-stack (Prometheus + Grafana + Alertmanager)
helm install monitoring prometheus-community/kube-prometheus-stack
# Get Grafana password
kubectl get secret monitoring-grafana -o jsonpath="{.data.admin-password}" | base64 --decode
# Port forward for access
kubectl port-forward svc/monitoring-grafana 3000:80
# Grafana available at http://localhost:3000 (admin / <password>)
What you get:
- Prometheus with cluster, node, pod metrics
- Grafana with ready dashboards (Kubernetes Overview, Node Exporter, etc.)
- Alertmanager for alerts
Useful dashboards (already built-in):
- Kubernetes / Compute Resources / Cluster
- Kubernetes / Compute Resources / Namespace (Pods)
- Node Exporter / Nodes
Logging: Loki + Promtail
Setup via Helm:
helm repo add grafana https://grafana.github.io/helm-charts
helm install loki grafana/loki-stack \
--set grafana.enabled=false \
--set promtail.enabled=true
# Add Loki as data source in Grafana
# URL: http://loki:3100
Viewing logs in Grafana:
- Explore → Loki data source
- Query: {app="myapp"}
- Filtering: {app="myapp"} |= "ERROR"
Kubernetes Alternatives
Kubernetes isn't the only orchestrator. Sometimes alternatives fit better. Honest comparison.
Nomad — Simplicity and Practicality
What it is: Orchestrator from HashiCorp. Can manage containers (Docker), VMs, standalone apps.
Pros:
- ✅ Simplicity: Config is several times simpler than k8s: roughly 100 lines of Nomad vs 300 lines of k8s.
- ✅ Lightweight: A single binary, ~50MB RAM. k8s is dozens of components.
- ✅ Multi-runtime: Docker, Podman, Java, binaries, VMs.
- ✅ Integration with Consul (service mesh) and Vault (secrets).
Cons:
- ❌ Smaller ecosystem: Fewer ready solutions and community charts.
- ❌ No built-in Ingress: Need separate Nginx/Traefik.
- ❌ Fewer managed options: AWS/GCP/Azure don't offer managed Nomad.
When to choose:
- Small teams (up to 10 people) who need simplicity.
- If already using HashiCorp stack (Consul, Vault, Terraform).
- Want to manage not just containers but legacy apps too.
Nomad config example:
job "myapp" {
  datacenters = ["dc1"]
  group "app" {
    count = 3
    # Port labels are declared at the group level (Nomad 0.12+)
    network {
      port "http" {
        to = 8080   # container port the app listens on (assumption)
      }
    }
    task "web" {
      driver = "docker"
      config {
        image = "myapp:v1.0.0"
        ports = ["http"]
      }
      resources {
        cpu    = 500
        memory = 512
      }
      service {
        name = "myapp"
        port = "http"
        check {
          type     = "http"
          path     = "/health"
          interval = "10s"
          timeout  = "2s"
        }
      }
    }
  }
}
AWS ECS — If Already on AWS
What it is: Managed orchestrator from Amazon. Deep integration with AWS (ALB, RDS, IAM).
Pros:
- ✅ Managed: AWS manages control plane, you only deploy tasks.
- ✅ AWS-native: Integration with IAM roles, secrets, logs, metrics.
- ✅ Fargate: Serverless mode — don't manage servers at all.
Cons:
- ❌ Vendor lock-in: Tied to AWS, migration is hard.
- ❌ More expensive than self-hosted k8s (especially Fargate).
- ❌ Less flexibility than k8s.
When to choose:
- You're already all-in on AWS.
- Don't want to manage infrastructure (Fargate).
- Simplicity matters more than flexibility.
Fly.io — Modern Approach
What it is: Platform for deploying containers closer to users (edge computing). Under the hood — their own orchestrator based on Nomad.
Pros:
- ✅ Simplicity: fly deploy — done. Like Heroku but with Docker.
- ✅ Free tier: Up to 3 VMs, enough for experiments.
- ✅ Managed PostgreSQL, Redis out of the box.
Cons:
- ❌ Vendor lock-in.
- ❌ Less control than k8s.
- ❌ Cost grows with scaling.
When to choose:
- Startups and small projects.
- Want deployment speed without DevOps team.
- Global latency matters (users worldwide).
Deployment example:
# Install CLI
brew install flyctl
# Login
fly auth login
# Initialize app
fly launch
# Deploy
fly deploy
# Scaling
fly scale count 5
fly scale vm shared-cpu-2x
Docker Swarm — Forgotten but Alive
What it is: Orchestrator built into Docker. Simplified version of k8s.
Pros:
- ✅ Simplicity: Easier than k8s. If you know Docker, learn Swarm in a day.
- ✅ Built into Docker: No additional installation needed.
- ✅ docker-compose.yml compatibility: Can deploy existing compose files.
Cons:
- ❌ Dying project: Docker Inc barely develops it.
- ❌ No ecosystem: Few ready solutions.
- ❌ Fewer features than k8s.
When to choose:
- If Docker Compose is no longer enough but k8s is overkill.
- For small production (3-5 servers).
Migration from Compose:
# Initialize Swarm
docker swarm init
# Deploy compose file
docker stack deploy -c docker-compose.yml myapp
# Scaling
docker service scale myapp_web=5
Final Selection Table
| Criterion | Kubernetes | Nomad | AWS ECS | Fly.io | Docker Swarm |
|---|---|---|---|---|---|
| Complexity | ❌ High | ⚠️ Medium | ✅ Low | ✅ Very low | ✅ Low |
| Ecosystem | ✅ Huge | ⚠️ Medium | ⚠️ AWS-only | ⚠️ Small | ❌ Dead |
| Flexibility | ✅ Maximum | ✅ High | ⚠️ Medium | ❌ Low | ⚠️ Medium |
| Vendor lock-in | ✅ No | ✅ No | ❌ AWS | ❌ Fly.io | ✅ No |
| Cost | ⚠️ DIY cheap, managed expensive | ✅ DIY cheap | ❌ Expensive | ⚠️ Grows | ✅ Cheap |
| Best for | Medium/large teams | Small teams, HashiCorp stack | AWS-native projects | Startups | Small projects |
My advice: start simple. If Docker Compose is enough — stay on it. Outgrow it — first try Fly.io or Nomad. Kubernetes — when you really need its power and flexibility.
Real Cost of Migrating to Kubernetes
Let's be honest: Kubernetes is an investment. Not just in infrastructure but in time and knowledge.
Time to Learn
Minimum level (to deploy): 20-40 hours
- Concepts: Pods, Deployments, Services, Ingress
- Basic kubectl commands
- Helm: installing ready-made charts
- Troubleshooting: logs, describe, events
Medium level (to manage in production): 100-200 hours
- k8s networking model, NetworkPolicies
- RBAC and security
- Stateful apps (StatefulSets, PersistentVolumes)
- Monitoring and logging
- CI/CD integration
- Disaster recovery
Expert level (k8s architect): 500+ hours
- Multi-cluster setup
- Service mesh (Istio, Linkerd)
- Custom operators
- Performance tuning
- Custom schedulers and admission controllers
Operational Complexity
Managed Kubernetes (EKS, GKE, AKS):
- ✅ Control plane managed by provider
- ✅ Automatic updates
- ⚠️ You manage worker nodes, addons, monitoring
- 💰 Cost: $70-150/month for control plane + $50-200/month for worker nodes
Self-hosted Kubernetes:
- ❌ You manage everything: control plane, etcd, networking, updates
- ❌ Need DevOps/SRE on team
- ❌ High risk without expertise
- 💰 Cost: $0 for software, but need team time
Real costs (from my experience):
Project with 5 microservices, 20-30 pods, ~2000 RPS.
- Managed GKE: $250/month (control plane + 3 worker nodes e2-standard-4)
- DevOps time: 10-15 hours/month for maintenance
- Monitoring: Prometheus + Grafana self-hosted, $0 extra costs
- Team training: 2 weeks for developer onboarding
Total: ~$250/month + 10-15 hours time. Pays off if you save on manual scaling and downtimes.
When NOT to Migrate to Kubernetes
- Team < 3 developers — k8s overhead eats all time.
- Monolith on 1 server — Docker Compose is enough.
- No DevOps expertise — better Fly.io, Render, Railway.
- Startup at MVP stage — focus on product, not infrastructure.
- Budget < $500/month — managed k8s expensive, self-hosted risky.
Kubernetes isn't a status symbol or a fashion statement. It's a tool for solving specific problems. If you don't have those problems, don't create new ones by adopting k8s.
Step-by-Step Migration from Compose to Kubernetes
If you decided k8s is your path, here's a pain-free migration plan.
Stage 1: Preparation (1-2 weeks)
1. Learn k8s basics
- Complete official tutorial: kubernetes.io/docs/tutorials
- Spin up local cluster (kind or k3d)
- Deploy test app
2. Audit current infrastructure
- Which services are running?
- What dependencies between them?
- Where is state stored (databases, files, caches)?
- What environment variables and secrets are used?
3. Choose managed k8s or self-hosted
- Managed (GKE, EKS, AKS) — if budget allows
- Self-hosted (kubeadm, k3s) — if you have DevOps expertise
4. Set up CI/CD for k8s
- Integrate kubectl/helm into the pipeline (a minimal sketch follows below)
- Configure image registry (Docker Hub, GCR, ECR)
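For the pipeline step above, here's a minimal deploy-job sketch (GitLab CI as an example; it assumes the kubeconfig is injected via a CI/CD variable, the chart lives at ./myapp, and the helm image tag is just an example):
# .gitlab-ci.yml (fragment)
deploy:
  stage: deploy
  image:
    name: alpine/helm:3.14.0   # example image; any image with helm and cluster access works
    entrypoint: [""]           # override the helm entrypoint so the script runs in a shell
  script:
    - helm upgrade --install myapp ./myapp -f values-prod.yaml --set image.tag=$CI_COMMIT_SHORT_SHA
  environment:
    name: production
  rules:
    - if: $CI_COMMIT_BRANCH == "main"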
Stage 2: Migrate Stateless Services (2-4 weeks)
1. Start with simplest service
- Choose stateless service without dependencies
- Create Deployment and Service manifests
- Deploy to dev namespace
API service migration example:
Before (docker-compose.yml):
api:
image: myapi:latest
ports:
- "8080:8080"
environment:
    DATABASE_URL: postgresql://db:5432/mydb
After (k8s/api-deployment.yaml):
apiVersion: apps/v1
kind: Deployment
metadata:
name: api
spec:
replicas: 3
selector:
matchLabels:
app: api
template:
metadata:
labels:
app: api
spec:
containers:
- name: api
image: myapi:v1.0.0
ports:
- containerPort: 8080
env:
- name: DATABASE_URL
valueFrom:
secretKeyRef:
name: api-secrets
key: database-url
resources:
requests:
cpu: 100m
memory: 128Mi
limits:
cpu: 500m
memory: 512Mi
livenessProbe:
httpGet:
path: /health
port: 8080
readinessProbe:
httpGet:
path: /health/ready
port: 8080
---
apiVersion: v1
kind: Service
metadata:
name: api
spec:
selector:
app: api
ports:
- port: 80
    targetPort: 8080
2. Gradually migrate remaining stateless services
3. Configure Ingress for HTTP traffic
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: main-ingress
annotations:
cert-manager.io/cluster-issuer: "letsencrypt-prod"
spec:
ingressClassName: nginx
tls:
- hosts:
- api.example.com
secretName: api-tls
rules:
- host: api.example.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: api
port:
              number: 80
Stage 3: Migrate Stateful Services (4-8 weeks)
1. Databases — DON'T rush to migrate
- Managed DBs (RDS, Cloud SQL) easier than PostgreSQL in k8s
- If migrating — use StatefulSets and PersistentVolumes
- Must configure backup and disaster recovery
Managed DB (recommended):
env:
- name: DATABASE_URL
    value: postgresql://user@rds-endpoint:5432/mydb
PostgreSQL in k8s (hard but possible):
# Use ready Helm chart
helm install postgres bitnami/postgresql \
--set persistence.size=100Gi \
  --set primary.persistence.storageClass=ssd
2. Redis/Memcached — can be in k8s
helm install redis bitnami/redis
3. File storage
- Use S3-compatible storage (AWS S3, MinIO)
- Or PersistentVolumeClaims with ReadWriteMany (NFS, Ceph)
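If you do go the PVC route, here's a minimal sketch of a shared claim (the storage class is an assumption and must exist in your cluster):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-files
spec:
  accessModes:
    - ReadWriteMany            # requires an RWX-capable backend (NFS, CephFS, etc.)
  storageClassName: nfs-client # assumption: provided by an NFS provisioner
  resources:
    requests:
      storage: 50Gi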
Stage 4: Set Up Monitoring and Logging (1-2 weeks)
# Prometheus + Grafana
helm install monitoring prometheus-community/kube-prometheus-stack
# Loki for logs
helm install loki grafana/loki-stack
Stage 5: Switch Traffic (1 day, but test for a week)
1. Parallel run
- Run new k8s stack parallel to old Compose
- Switch small percentage of traffic (10%) to k8s
- Monitor metrics and logs
2. Gradual switching
- 10% traffic → check → 50% → check → 100%
3. Rollback always possible
- Keep old Compose stack alive another 1-2 weeks
- If something went wrong — rollback traffic
Post-migration Checklist
Conclusions
Kubernetes is a powerful tool but not a panacea. It solves real problems of scaling, fault tolerance, and automation. But the price of entry is learning time and operational complexity.
What to remember:
- Docker Compose is enough for small projects (1-3 servers, up to 500 RPS). Don't complicate unnecessarily.
- Migrate to k8s when:
  - You need zero-downtime deployments
  - Horizontal autoscaling is critical
  - The team is ready to invest time in learning
  - Managed k8s is available and the budget allows
- Helm charts save dozens of hours. Don't write YAML from scratch, use ready charts.
- Local development: kind for CI/CD, k3d for daily dev, minikube for learning.
- Alternatives exist:
  - Nomad — simplicity for the HashiCorp stack
  - Fly.io — speed for startups
  - AWS ECS — if on AWS and want managed
  - Docker Swarm — if k8s is overkill but Compose isn't enough
- Monitoring is not optional. Prometheus + Grafana + Loki is the minimum stack for production k8s.
- Migration is gradual. Start with stateless services, test, monitor, and only then migrate critical stateful components.
- Don't migrate databases to k8s without good reasons. Managed DBs (RDS, Cloud SQL) are simpler and more reliable.
My personal conclusion:
I went from Compose to k8s in 2021. First 2 months — pain, learning, pitfalls. But when everything worked, I got infrastructure that scales without my involvement, self-heals, and allows deploying 10 times a day without downtime.
Kubernetes is worth its money and time if your task is complex enough. If not — use simpler tools and sleep peacefully.
P.S. If you have questions about k8s migration or need help choosing an orchestrator, reach out — I'll help you figure it out.
See also:
- Load Balancers — next step after Docker Compose
- Monitoring Stack 2025 — how to set up Prometheus + Grafana
