Kubernetes doesn’t actually manage storage itself; it delegates that responsibility to specialized plugins called CSI drivers.

Let’s see this in action. Imagine a pod that needs persistent storage.

apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
  - name: app-container
    image: nginx
    volumeMounts:
    - name: my-persistent-storage
      mountPath: "/usr/share/nginx/html"
  volumes:
  - name: my-persistent-storage
    persistentVolumeClaim:
      claimName: my-pvc

This pod references a PersistentVolumeClaim (PVC) named my-pvc. When Kubernetes sees this, it doesn’t know how to create storage on your cloud provider or bare-metal hardware. Instead, it looks for a CSI driver that’s been configured to handle storage requests like this one.

The CSI driver, which itself runs as pods within Kubernetes (typically a controller Deployment plus a DaemonSet on the worker nodes), picks up this request. It communicates with the underlying storage system (e.g., AWS EBS, Google Persistent Disk, Ceph, NFS) to provision a new volume. Once the volume is created, a PersistentVolume (PV) object is created in Kubernetes, representing that actual storage. The PVC then gets bound to this PV, fulfilling the pod's storage requirement. The pod can then mount the volume and use it for its data.
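As a purely illustrative sketch, a dynamically provisioned PV created by the AWS EBS CSI driver might look roughly like this; the generated name and the volumeHandle are placeholders, not real IDs:

```yaml
# Illustrative sketch only: names and IDs below are placeholders.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pvc-generated-name            # Kubernetes generates a name like pvc-<uid>
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Delete
  claimRef:                           # binds this PV to the claim that requested it
    namespace: default
    name: my-pvc
  csi:
    driver: ebs.csi.aws.com           # the CSI driver that provisioned the volume
    volumeHandle: vol-0123456789abcdef0   # placeholder backend volume ID
    fsType: ext4
```

The csi block is what distinguishes a CSI-provisioned PV: it records which driver owns the volume and the backend identifier the driver uses to find it again.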

This separation of concerns is key. The core Kubernetes scheduler and controller-manager remain storage-agnostic. They simply orchestrate the creation of PVCs and the binding to PVs. The CSI driver, a separate component, handles the complex, provider-specific details of creating, attaching, and detaching storage. This allows Kubernetes to be cloud-agnostic and hardware-agnostic, as long as a CSI driver exists for the target storage backend.

You control the behavior of your persistent storage primarily through the StorageClass. A StorageClass is a Kubernetes object that describes the "class" of storage you want. It specifies which CSI driver to use and provides parameters that the CSI driver understands.

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: ebs.csi.aws.com # This tells Kubernetes which CSI driver to use
parameters:
  type: gp3 # These are parameters specific to the AWS EBS CSI driver
  fsType: ext4
  encrypted: "true"

When you create a PVC, you can reference a StorageClass.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: fast-ssd # References the StorageClass above
  resources:
    requests:
      storage: 10Gi

The provisioner field in the StorageClass is critical. It's the name of the CSI driver that Kubernetes will invoke to provision the storage. The parameters field is a map of key-value pairs passed directly to the specified CSI driver, and its valid keys are entirely driver-dependent. For AWS EBS, you might specify type (e.g., gp3, io2), fsType (e.g., ext4, xfs), or encrypted. For the Ceph RBD CSI driver, you might specify clusterID, pool, or imageFeatures.
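To make the driver-specific nature of parameters concrete, here is a sketch of a StorageClass for the Ceph RBD CSI driver; the clusterID and pool values are placeholders for illustration:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ceph-rbd
provisioner: rbd.csi.ceph.com       # the Ceph RBD CSI driver's registered name
parameters:
  clusterID: b9127830-placeholder   # placeholder: ID of the target Ceph cluster
  pool: kubernetes                  # placeholder: RADOS pool to carve RBD images from
  imageFeatures: layering           # RBD image features to enable
```

None of these parameter names mean anything to the AWS EBS driver, and vice versa; Kubernetes itself never interprets them.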

The accessModes on the PVC (e.g., ReadWriteOnce, ReadOnlyMany, ReadWriteMany) are also translated by the CSI driver into capabilities supported by the underlying storage. ReadWriteOnce often means a block device that can be attached to a single node, while ReadWriteMany might require a network file system like NFS or CephFS.
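For example, a claim that must be mounted read-write by pods on multiple nodes would request ReadWriteMany, and only a StorageClass backed by a shared-filesystem driver can satisfy it (the class name below is a placeholder):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-data
spec:
  accessModes:
    - ReadWriteMany               # requires a backend such as NFS or CephFS
  storageClassName: cephfs-shared # placeholder: a class backed by a shared-filesystem driver
  resources:
    requests:
      storage: 50Gi
```

If the driver behind the StorageClass only supports single-node block devices, a ReadWriteMany claim like this will simply never bind.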

The CSI specification defines three gRPC services that every CSI driver implements:

  1. Controller Service: Handles cluster-wide operations like volume provisioning, attachment, and detachment that don't need to happen on every node. The component implementing it typically runs as a single Deployment or StatefulSet.
  2. Node Service: Runs as a DaemonSet on every Kubernetes node. It's responsible for mounting and unmounting volumes on the node where a pod is scheduled.
  3. Identity Service: Implemented by both the controller and node components, it reports the driver's name, version, and capabilities so Kubernetes can discover and register the driver.

The interaction between Kubernetes and the CSI driver happens over gRPC, mediated by sidecar containers deployed alongside the driver. When a PVC is created, the external-provisioner sidecar makes a CreateVolume RPC call to the driver's controller service. If successful, the driver provisions the storage and returns a volume ID. Later, when a pod needs to use this volume, the external-attacher sidecar triggers a ControllerPublishVolume call to attach the volume to the node, and then the kubelet invokes NodeStageVolume and NodePublishVolume on the node service to make the volume available to the pod's filesystem.
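The RPCs involved are defined in the CSI protobuf specification; abridged heavily, the two services look roughly like this (each defines many more RPCs than shown):

```protobuf
// Abridged sketch of the CSI spec's service definitions.
service Controller {
  rpc CreateVolume(CreateVolumeRequest) returns (CreateVolumeResponse) {}
  rpc ControllerPublishVolume(ControllerPublishVolumeRequest)
      returns (ControllerPublishVolumeResponse) {}
}

service Node {
  rpc NodeStageVolume(NodeStageVolumeRequest) returns (NodeStageVolumeResponse) {}
  rpc NodePublishVolume(NodePublishVolumeRequest) returns (NodePublishVolumeResponse) {}
}
```

NodeStageVolume prepares the volume once per node (e.g., formatting and mounting to a staging path), while NodePublishVolume bind-mounts it into each consuming pod.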

Most people assume that when a StorageClass is specified, the provisioner field directly points to an executable binary. In reality, the provisioner is a unique string identifier: the driver name that the CSI driver reports through its Identity service. The external-provisioner sidecar running next to the driver watches for PVCs whose StorageClass names that provisioner and forwards each request to the driver over a local Unix domain socket. This indirection allows for more flexible driver deployment and management.
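Drivers typically also register a CSIDriver object in the cluster, which tells Kubernetes how to interact with them. The fields below are real API fields; the values shown are a plausible configuration for the AWS EBS driver, not authoritative:

```yaml
apiVersion: storage.k8s.io/v1
kind: CSIDriver
metadata:
  name: ebs.csi.aws.com       # must match the name reported by the Identity service
spec:
  attachRequired: true        # call ControllerPublishVolume before mounting
  podInfoOnMount: false       # don't pass pod metadata into NodePublishVolume calls
  volumeLifecycleModes:
    - Persistent
```

The metadata.name here is the same string a StorageClass references in its provisioner field, which is how the pieces find each other.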

The next step is understanding how to manage the lifecycle of these provisioned volumes, including snapshots and resizing.
