Advanced Control
Master specialized K8s workloads: DaemonSets, Jobs, CronJobs, HPA, resource management, affinity, taints, and InitContainers.
1. Beyond Deployments – Specialized Workload Controllers
Deployments are perfect for stateless web servers, but Kubernetes includes several specialized controllers for specific operational patterns:
| Controller | Purpose | Common Use Case |
|---|---|---|
| DaemonSet | One pod per node | Log collectors, monitoring agents, CNI plugins |
| Job | Run pods to completion | DB migrations, batch processing, one-time data imports |
| CronJob | Scheduled Jobs | Nightly backups, periodic cleanup, report generation |
| StatefulSet | Stateful apps (see Module 3) | Databases, Kafka, Zookeeper, Elasticsearch |
2. DaemonSets
A DaemonSet ensures that all (or some) nodes run a copy of a Pod. As nodes are added to the cluster, pods are automatically scheduled on them. When nodes are removed, those pods are garbage collected. DaemonSets have no replicas field; the pod count is determined entirely by the number of eligible nodes.
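To cover only "some" nodes rather than all of them, add a nodeSelector to the pod template; the DaemonSet then schedules pods only onto matching nodes. A minimal sketch, assuming a hypothetical monitoring=enabled node label:

```yaml
# Sketch: restrict a DaemonSet to labeled nodes.
# The monitoring=enabled label is an assumption for illustration.
spec:
  template:
    spec:
      nodeSelector:
        monitoring: enabled   # Pods land only on nodes carrying this label
```

Nodes opt in with `kubectl label nodes <node-name> monitoring=enabled`.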
```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd-logger
  namespace: kube-system
spec:
  selector:
    matchLabels:
      name: fluentd
  template:
    metadata:
      labels:
        name: fluentd
    spec:
      tolerations:
      - key: node-role.kubernetes.io/control-plane  # Also run on control plane nodes
        effect: NoSchedule
      containers:
      - name: fluentd
        image: fluent/fluentd:v1.16
        volumeMounts:
        - name: varlog
          mountPath: /var/log   # Access host node logs
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
```

3. Jobs – Run to Completion
A Job creates one or more Pods and ensures the specified number of them complete successfully. If a Pod fails, the Job creates a replacement. Once complete, the Job stops creating new Pods but the finished Pods are kept for log inspection until the Job's ttlSecondsAfterFinished expires.
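The migration example below runs a single pod, but completions and parallelism also support fan-out batch work: the Job keeps launching pods until the completion count is reached, running up to the parallelism cap at once. A sketch of a five-task Job processed three pods at a time (name, image, and command are illustrative):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: batch-process          # Hypothetical name for illustration
spec:
  completions: 5               # Job is done after 5 pods finish successfully
  parallelism: 3               # At most 3 pods run at the same time
  template:
    spec:
      restartPolicy: OnFailure
      containers:
      - name: worker
        image: my-batch:latest       # Assumed image
        command: ["process-item"]    # Assumed entrypoint
```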
```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: db-migration
spec:
  completions: 1                # How many pods must complete successfully
  parallelism: 1                # How many pods to run in parallel
  backoffLimit: 3               # Retry up to 3 times on failure
  ttlSecondsAfterFinished: 300  # Auto-delete job and pods after 5 minutes
  template:
    spec:
      restartPolicy: Never      # REQUIRED: Never or OnFailure (not Always)
      containers:
      - name: migrator
        image: my-app:latest
        command: ["python", "manage.py", "migrate"]
        env:
        - name: DB_URL
          valueFrom:
            secretKeyRef:
              name: db-creds
              key: url
```

4. CronJobs – Scheduled Work
A CronJob creates Job objects on a repeating schedule using standard UNIX cron syntax.
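A CronJob can also be paused without deleting it: setting spec.suspend to true stops new Jobs from being created until it is flipped back to false. A minimal sketch:

```yaml
spec:
  suspend: true   # No new Jobs are created while true; already-running Jobs are unaffected
```

In practice this is often toggled in place, e.g. `kubectl patch cronjob db-backup -p '{"spec":{"suspend":true}}'`.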
```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: db-backup
spec:
  schedule: "0 2 * * *"           # Run at 2:00 AM every day
  # Cron format: MIN HOUR DOM MON DOW
  #   "*/15 * * * *" = every 15 minutes
  #   "0 0 * * 0"    = every Sunday at midnight
  concurrencyPolicy: Forbid       # Don't run a new job if previous is still running
  startingDeadlineSeconds: 300    # Skip if not started within 5 min of schedule
  successfulJobsHistoryLimit: 3   # Keep last 3 successful jobs
  failedJobsHistoryLimit: 1
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: backup
            image: postgres:15
            command: ["pg_dump", "-h", "postgres", "-U", "admin", "mydb"]
```

5. Resource Management – Requests and Limits
Kubernetes uses resource requests and limits to manage CPU and memory across the cluster. This is one of the most important concepts for production stability.
- Requests: The minimum resources guaranteed to the container. The scheduler uses this value to decide which node can fit a pod. A pod with a 500m CPU request will only be scheduled on a node with at least 500m available CPU.
- Limits: The maximum resources a container can use. If a container exceeds its memory limit, it is OOMKilled (Out of Memory Killed). If it exceeds its CPU limit, it is throttled (slowed down, not killed).
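Per container, the two bullets above map to a resources stanza in the pod spec. A sketch using values that match the LimitRange defaults in this section (container name and image are illustrative):

```yaml
spec:
  containers:
  - name: api                 # Hypothetical container
    image: my-app:latest      # Illustrative image
    resources:
      requests:
        cpu: "100m"           # Scheduler guarantees at least 0.1 CPU core
        memory: "128Mi"       # Guaranteed memory; used for node placement
      limits:
        cpu: "500m"           # Throttled above 0.5 core (not killed)
        memory: "256Mi"       # OOMKilled if exceeded
```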
Set namespace-wide quotas (the admin sets this):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-quota
  namespace: dev-team
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
    pods: "20"
---
# Set default limits per container (so devs don't need to add them manually)
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: dev-team
spec:
  limits:
  - default:            # Default limits
      cpu: "500m"
      memory: "256Mi"
    defaultRequest:     # Default requests
      cpu: "100m"
      memory: "128Mi"
    type: Container
```

6. Horizontal Pod Autoscaler (HPA)
The HPA automatically scales the number of replicas in a Deployment, ReplicaSet, or StatefulSet based on observed CPU/memory utilization or custom metrics (via the Metrics API).
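Beyond the CPU target shown in this section, autoscaling/v2 also accepts memory (and multiple metrics at once), and a behavior block can slow scale-down to avoid flapping. A sketch of those fields (the values are illustrative, not recommendations):

```yaml
spec:
  metrics:
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80          # Scale when avg memory > 80% of requests
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300   # Require 5 min of low load before scaling down
```

With multiple metrics, the HPA computes a desired replica count per metric and uses the highest.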
Quick HPA creation:

```bash
kubectl autoscale deployment web --min=2 --max=10 --cpu-percent=70
```

Or declaratively:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # Scale up when avg CPU > 70%
```

7. Node Affinity & Taints/Tolerations
Node Affinity lets you constrain which nodes a Pod can be scheduled on based on node labels. Taints and Tolerations work the opposite way: they allow a node to repel pods unless the pod explicitly tolerates the taint.
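Node affinity also has a soft form: preferredDuringSchedulingIgnoredDuringExecution ranks candidate nodes instead of excluding them, so the pod still schedules if nothing matches. A sketch (the weight and zone value are illustrative):

```yaml
spec:
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 80                      # Higher weight = stronger preference (1-100)
        preference:
          matchExpressions:
          - key: topology.kubernetes.io/zone
            operator: In
            values: ["us-east-1a"]      # Prefer this zone, but don't require it
```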
Taint a node so only specific pods can use it:

```bash
kubectl taint nodes gpu-node-1 dedicated=gpu:NoSchedule
```

Pod spec that tolerates the taint (can run on gpu-node-1):

```yaml
spec:
  tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "gpu"
    effect: "NoSchedule"
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: accelerator            # Node must have this label
            operator: In
            values: ["nvidia-tesla-v100"]
```

8. InitContainers
InitContainers run to completion sequentially before the main app containers start. They share the same volumes and the same network namespace as the app containers, but they do not support readiness, liveness, or startup probes. Common uses:
- Wait for a database to be ready before starting the app
- Run database migrations before the API starts
- Download and validate configuration from a secrets vault
- Set up file permissions on a shared volume
```yaml
spec:
  initContainers:
  - name: wait-for-db       # Runs FIRST, must succeed before app starts
    image: busybox
    command: ['sh', '-c',
      'until nc -z postgres 5432; do echo waiting for postgres; sleep 2; done']
  - name: run-migrations    # Runs SECOND
    image: my-app:latest
    command: ["python", "manage.py", "migrate"]
  containers:
  - name: app               # Runs LAST, after all initContainers succeed
    image: my-app:latest
    command: ["python", "manage.py", "runserver"]
```