💾 Module 3 • 3 Labs • Est. 35 min • Intermediate

Storage

Master Kubernetes storage: Volumes, PersistentVolumes, PersistentVolumeClaims, StorageClasses, and StatefulSets.

1. Why Storage is Different in Kubernetes

A container's filesystem is ephemeral by design: any data written inside it is lost when the container crashes or is restarted. This is fine for stateless web servers, but databases, file stores, and other stateful applications need storage that survives container lifecycle events.

Kubernetes solves this with a layered storage system: Volumes → PersistentVolumes → PersistentVolumeClaims → StorageClasses.

2. Volumes: Ephemeral Container Storage

A Volume is a directory accessible to containers in a Pod. Unlike a container's filesystem, a Volume survives container restarts within the same Pod. For ephemeral types such as emptyDir, the lifetime is tied to the Pod: the data is destroyed when the Pod is deleted.

Common Volume Types

  • emptyDir: Created fresh when a Pod is assigned to a node. Empty at start. Shared between all containers in the Pod. Deleted when the Pod is removed. Only use for scratch space or inter-container communication.
  • hostPath: Mounts a file or directory from the host node's filesystem into the Pod. Useful for accessing node-level data (Docker socket, /proc). NOT portable: the data lives on one specific node.
  • configMap / secret: Projects configuration data or sensitive values into the Pod as files. Mounted read-only. (The same objects can also be exposed as environment variables, but that path does not use a volume.)
  • nfs: Mounts an NFS share from a remote NFS server. Survives pod restarts and can be mounted ReadWriteMany by multiple pods simultaneously.
pod-with-volumes.yaml
apiVersion: v1
kind: Pod
metadata:
  name: volume-demo
spec:
  volumes:
  - name: shared-data           # Define volumes at Pod level
    emptyDir: {}                # Empty object selects the default emptyDir settings
  - name: config-vol
    configMap:
      name: app-config
  containers:
  - name: app
    image: nginx
    volumeMounts:
    - name: shared-data
      mountPath: /usr/share/nginx/html  # Mount inside container
    - name: config-vol
      mountPath: /etc/config
  - name: sidecar              # Second container in same Pod
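    image: busybox
    # Keep the sidecar alive and write a page into the shared volume for nginx to serve
    command: ["sh", "-c", "while true; do date > /data/index.html; sleep 5; done"]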
    volumeMounts:
    - name: shared-data
      mountPath: /data          # Both containers share same emptyDir!
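
The Pod above mounts a ConfigMap named app-config, which the manifest does not define. A minimal sketch of what it could look like follows; the keys and values are hypothetical and only illustrate that each key under data appears as a file in /etc/config.

```yaml
# Hypothetical ConfigMap for the volume-demo Pod; key names and contents are illustrative only
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config              # Must match configMap.name in the Pod spec
data:
  app.properties: |             # Appears as /etc/config/app.properties
    log.level=info
  feature-flags.json: |         # Appears as /etc/config/feature-flags.json
    {"beta": false}
```

The ConfigMap has to exist in the same namespace before the Pod starts; otherwise the Pod stays in ContainerCreating until it is created.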

3. The PV/PVC System: Cluster Persistent Storage

For truly persistent storage that outlives Pods, Kubernetes uses a two-tier system that separates storage administration from storage consumption:

  • PersistentVolume (PV): The actual storage resource in the cluster. Can be a cloud disk (AWS EBS, GCP PD), NFS share, or even a local disk. Created by a cluster administrator. Has a capacity, access mode, and reclaim policy.
  • PersistentVolumeClaim (PVC): A request for storage by a developer. Specifies desired size, access mode, and optionally a StorageClass. Kubernetes automatically binds it to a suitable PV.
sequenceDiagram
    participant Admin
    participant Cluster as Kubernetes API
    participant Dev as Developer / App
    participant Storage as Underlying Storage
    Admin->>Cluster: Creates PV (e.g., 100GB AWS EBS disk)
    Note over Cluster: PV Status: Available
    Dev->>Cluster: Creates PVC (requests 10Gi, RWO)
    Cluster-->>Cluster: Matches PVC to compatible PV
    Cluster->>Cluster: Binds PVC <-> PV (1-to-1)
    Note over Cluster: Both Status: Bound
    Dev->>Cluster: Deploys Pod referencing PVC
    Cluster->>Storage: Attaches & Mounts disk to Node
    Note over Storage: Pod can now read/write persistently
pv-and-pvc.yaml
## ADMIN creates PersistentVolume
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
  - ReadWriteOnce               # RWO: only one node can mount r/w
  persistentVolumeReclaimPolicy: Retain   # Don't delete data when PVC is removed
  hostPath:
    path: /mnt/data             # Example: using node local path
---
## DEVELOPER creates PersistentVolumeClaim
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  storageClassName: ""          # Empty string: bind to a pre-created PV, not a default StorageClass
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi              # Request 5Gi; can bind to the 10Gi PV above
---
## POD uses PVC
apiVersion: v1
kind: Pod
metadata:
  name: db-pod
spec:
  volumes:
  - name: db-storage
    persistentVolumeClaim:
      claimName: my-pvc         # Reference the PVC, never the PV directly!
  containers:
  - name: postgres
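    image: postgres:15
    env:
    - name: POSTGRES_PASSWORD   # The postgres image will not initialize without a superuser password
      value: change-me          # Demo placeholder only; use a Secret in a real cluster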
    volumeMounts:
    - name: db-storage
      mountPath: /var/lib/postgresql/data

Access Modes Explained

Access Mode   | Short | Description
------------- | ----- | -----------
ReadWriteOnce | RWO   | One node mounts r/w. Most cloud disks (EBS, PD).
ReadOnlyMany  | ROX   | Many nodes mount read-only. Good for config/static assets.
ReadWriteMany | RWX   | Many nodes mount r/w simultaneously. Requires NFS, CephFS, or Azure Files.
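
To make the RWX row concrete, here is a sketch of a statically provisioned NFS-backed PV and a claim that requests ReadWriteMany. The server address, export path, and object names are assumptions for illustration, not values from this module.

```yaml
# Sketch only: nfs.example.internal and /exports/shared are hypothetical
apiVersion: v1
kind: PersistentVolume
metadata:
  name: shared-nfs-pv
spec:
  capacity:
    storage: 50Gi
  accessModes:
  - ReadWriteMany               # RWX: many nodes may mount read-write
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""          # No class: this PV is only for static binding
  nfs:
    server: nfs.example.internal
    path: /exports/shared
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-nfs-pvc
spec:
  storageClassName: ""          # Skip dynamic provisioning
  volumeName: shared-nfs-pv     # Pin the claim to the PV above
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 50Gi
```

Several Pods on different nodes can then reference shared-nfs-pvc at the same time, which a ReadWriteOnce cloud disk would not allow.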

4. StorageClasses: Dynamic Provisioning

Manually creating PVs for every application is tedious. A StorageClass automates PV creation: when a PVC references a StorageClass, the class's provisioner creates a matching volume on the storage backend (for example, a cloud disk) and Kubernetes registers it as a PV bound to that claim.

storageclass.yaml
## Define a StorageClass (usually done by cluster admin or cloud provider)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: ebs.csi.aws.com      # AWS EBS CSI driver
parameters:
  type: gp3
  encrypted: "true"
reclaimPolicy: Delete             # Delete EBS volume when PVC is deleted
volumeBindingMode: WaitForFirstConsumer  # Only create volume when pod is scheduled
---
## PVC referencing the StorageClass
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: fast-db-pvc
spec:
  storageClassName: fast-ssd      # Reference the StorageClass
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi               # EBS disk is created automatically once a Pod using this PVC is scheduled
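
Because the StorageClass above uses WaitForFirstConsumer, the claim stays Pending until a Pod that uses it is scheduled. A minimal consumer Pod might look like the sketch below; the Pod name, image, and command are assumptions chosen only to trigger provisioning.

```yaml
# Hypothetical consumer Pod; scheduling it is what triggers volume creation
apiVersion: v1
kind: Pod
metadata:
  name: fast-db-consumer
spec:
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: fast-db-pvc    # The PVC defined above
  containers:
  - name: app
    image: busybox
    # Write one file to the volume, then stay running so the mount can be inspected
    command: ["sh", "-c", "echo hello > /data/hello.txt && sleep 3600"]
    volumeMounts:
    - name: data
      mountPath: /data
```

Delaying provisioning this way lets the scheduler pick a node first, so the disk is created in the same availability zone as the Pod.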

5. StatefulSets: Stateful Applications

For stateful applications like databases, Kubernetes provides StatefulSets. Unlike Deployments (where pods are interchangeable), StatefulSet pods have:

  • Stable, unique network identities: Pods are named with an ordinal index (db-0, db-1, db-2) and keep those names on restart.
  • Stable persistent storage per pod: Each pod gets its own PVC via volumeClaimTemplates. When db-0 is rescheduled to another node, it reattaches to the same PVC, so its data follows it.
  • Ordered, graceful deployment: Pods start in order (0, 1, 2) and terminate in reverse (2, 1, 0). Critical for leader-follower replication setups like MySQL or Kafka.
statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres-headless   # Required for stable DNS
  replicas: 3
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
      - name: postgres
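        image: postgres:15
        env:
        - name: POSTGRES_PASSWORD    # Required by the postgres image on first start
          value: change-me           # Demo placeholder only; use a Secret in a real cluster
        - name: PGDATA               # Use a subdirectory so a lost+found dir on a fresh disk does not block initdb
          value: /var/lib/postgresql/data/pgdata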
        volumeMounts:
        - name: data
          mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:            # Each pod gets its OWN PVC!
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: fast-ssd
      resources:
        requests:
          storage: 10Gi
# Result: postgres-0 → data-postgres-0 (PVC)
#         postgres-1 → data-postgres-1 (PVC)
#         postgres-2 → data-postgres-2 (PVC)
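
The StatefulSet refers to a Service named postgres-headless through serviceName, but the manifest does not define it. A minimal sketch of such a headless Service is shown here; the port name and number are assumptions based on the PostgreSQL default.

```yaml
# Sketch of the headless Service the StatefulSet expects
apiVersion: v1
kind: Service
metadata:
  name: postgres-headless       # Must match spec.serviceName in the StatefulSet
spec:
  clusterIP: None               # "None" makes the Service headless: DNS resolves to pod IPs directly
  selector:
    app: postgres               # Same label as the StatefulSet pod template
  ports:
  - name: postgres
    port: 5432                  # Assumed: PostgreSQL's default port
```

With this in place, each pod gets a stable DNS name of the form postgres-0.postgres-headless.<namespace>.svc.cluster.local, assuming the default cluster domain.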

Deployment vs StatefulSet: When to Use Each

Use Deployment for:

  • ✅ Stateless web servers
  • ✅ API services
  • ✅ Batch workers
  • ✅ When pods are interchangeable

Use StatefulSet for:

  • ✅ Databases (MySQL, PostgreSQL)
  • ✅ Distributed stores (Kafka, Zookeeper)
  • ✅ Search engines (Elasticsearch)
  • ✅ When each pod needs its own identity/storage
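
For contrast with the StatefulSet manifest earlier, a stateless workload could be sketched as a plain Deployment. The name, label, and image here are illustrative assumptions; note the absence of serviceName and volumeClaimTemplates.

```yaml
# Hypothetical stateless web Deployment; pods are interchangeable and carry no per-pod storage
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
```

Pods created this way get random name suffixes rather than ordinals, and any replica can be replaced without affecting the others.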