Kubernetes PersistentVolumes (PVs) and PersistentVolumeClaims (PVCs) are how pods get durable storage, but their real job is to separate what storage a workload needs from how that storage is provided.
Let’s see it in action. Imagine we have an application that needs to store user uploads. We don’t want that data to disappear when the pod restarts.
First, we need to make sure our Kubernetes cluster knows how to provision storage. This is where StorageClass comes in. A StorageClass defines a "class" of storage, like "fast SSDs" or "cheap spinning disks," and specifies which provisioner (for example AWS EBS or Google Persistent Disk) creates volumes of that class on demand.
Here’s a sample StorageClass for AWS EBS:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp3
  fsType: ext4
reclaimPolicy: Retain
The provisioner field tells Kubernetes which storage plugin to use; kubernetes.io/aws-ebs is the legacy in-tree name for AWS EBS. The parameters are specific to that provisioner: here, type: gp3 selects a General Purpose SSD volume on AWS, and fsType: ext4 sets the filesystem. reclaimPolicy: Retain means that when the PVC is deleted, the underlying EBS volume is kept rather than deleted automatically, which is often desirable for production data. If reclaimPolicy is omitted, dynamically provisioned volumes default to Delete.
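Clusters that run the AWS EBS CSI driver use a different provisioner name and a different key for the filesystem type. The following is a minimal sketch of an equivalent class, assuming that driver is installed; the class name fast-ssd-csi is hypothetical and is not used elsewhere in this walkthrough.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd-csi                 # hypothetical name, distinct from fast-ssd
provisioner: ebs.csi.aws.com         # assumes the AWS EBS CSI driver is installed
parameters:
  type: gp3                          # same volume type as the class above
  csi.storage.k8s.io/fstype: ext4    # CSI-style key for the filesystem type
reclaimPolicy: Retain                # keep the EBS volume when the claim is deleted
```

A PVC would select this class by setting storageClassName: fast-ssd-csi; nothing else in the claim needs to change.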
Now, our application’s pod will request storage using a PersistentVolumeClaim (PVC). The PVC is a request for storage. It specifies the size and the access modes (like ReadWriteOnce, ReadOnlyMany, ReadWriteMany) it needs, and crucially, it can reference a StorageClass.
Here’s a PVC requesting 10Gi of storage from our fast-ssd class:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: user-uploads-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: fast-ssd
  resources:
    requests:
      storage: 10Gi
When Kubernetes sees this PVC, it looks up the StorageClass named fast-ssd and invokes that class’s provisioner (kubernetes.io/aws-ebs in this case). If no such class exists, the claim stays in the Pending state. The provisioner calls the cloud provider’s API (AWS, in this example) to create an EBS volume that satisfies the claim’s size and access mode. Once the volume exists, a PersistentVolume (PV) object representing it is created and bound to the PVC.
The PersistentVolume object is the cluster’s record of an actual piece of storage. It holds the capacity, the access modes, the reclaim policy, and the backend-specific details needed to attach the volume. With dynamic provisioning you never write this object yourself; it is created and bound to the PVC for you.
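To make that concrete, here is a trimmed sketch of roughly what such a dynamically provisioned PV could look like for the claim above. The object name, the namespace, and the EBS volume ID are hypothetical placeholders, and a real object carries additional fields (annotations, finalizers, node affinity) that are omitted here.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pvc-0a1b2c3d-0000-0000-0000-000000000000   # hypothetical; generated by the provisioner
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: fast-ssd
  persistentVolumeReclaimPolicy: Retain   # copied from the StorageClass when the PV is created
  claimRef:                               # points back at the claim this PV is bound to
    namespace: default                    # assumed namespace
    name: user-uploads-pvc
  awsElasticBlockStore:                   # backend details for the in-tree EBS plugin
    volumeID: vol-0123456789abcdef0       # hypothetical EBS volume ID
    fsType: ext4
status:
  phase: Bound
```

The claimRef field is what ties this PV to user-uploads-pvc; no other claim can bind to the volume while it is set.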
Finally, our application pod can mount this storage. The pod’s definition includes a volumes entry that references the PVC by name and a volumeMounts entry that places that volume inside the container.
apiVersion: v1
kind: Pod
metadata:
  name: user-app
spec:
  containers:
    - name: app-container
      image: my-user-app:latest
      ports:
        - containerPort: 8080
      volumeMounts:
        - name: uploads-storage
          mountPath: /app/uploads
  volumes:
    - name: uploads-storage
      persistentVolumeClaim:
        claimName: user-uploads-pvc
When the pod starts, Kubernetes ensures that the volume bound to user-uploads-pvc is attached to the node where the pod is scheduled and then mounted at /app/uploads inside app-container. If the pod is rescheduled to another node, Kubernetes detaches the volume from the old node and attaches it to the new one, so the data follows the pod. Because an EBS volume lives in a single availability zone, the new node has to be in the same zone as the volume.
The magic here is the abstraction. The pod developer doesn’t need to know how the storage is provisioned (AWS EBS, NFS, Ceph, etc.) or where it physically resides. They just declare their storage needs via a PVC, and Kubernetes, guided by the StorageClass, handles the rest.
A common point of confusion is the difference between reclaimPolicy on the StorageClass and persistentVolumeReclaimPolicy on the PersistentVolume itself. The StorageClass value is only a default: it is copied into each PV at the moment that PV is dynamically provisioned, and changing the class afterwards does not affect existing PVs. The PV’s own policy is what dictates what happens when the PVC is deleted. If a PV is statically provisioned (meaning you created the PV object manually before creating the PVC), its persistentVolumeReclaimPolicy is the only one that matters for its lifecycle.
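For the static case, the policy is written directly on the PV. The sketch below assumes an NFS export as the backend; the PV name, server address, and export path are hypothetical and only illustrate where the field sits.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: uploads-archive-pv                 # hypothetical name
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  storageClassName: ""                     # no class, so no StorageClass default applies
  persistentVolumeReclaimPolicy: Retain    # the only reclaim setting that governs this PV
  nfs:
    server: nfs.example.internal           # hypothetical NFS server
    path: /exports/uploads                 # hypothetical export path
```

A claim that should bind to this volume would also set storageClassName: "" so that it asks for a PV with no class instead of triggering dynamic provisioning.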
The next hurdle you’ll likely encounter is understanding how to manage storage for stateful applications that require stable network identities and ordered deployments, which is where StatefulSets come into play.