A block storage volume isn’t just a raw disk; it’s a sophisticated abstraction that hides a complex, distributed system from you.
Imagine your application writes a file to a filesystem that lives on a block storage volume. The application, and the operating system beneath it, sees a single, contiguous disk. The reality is far more intricate. The filesystem maps the file onto small, fixed-size chunks called "blocks," typically 4 KB or 8 KB; the volume itself knows nothing about files, only numbered blocks. Behind the volume, the storage system writes those blocks independently to physical devices (SSDs or HDDs), often spread across multiple servers in a datacenter, and keeps track of where each block is stored, often using a distributed metadata service. On a read, the system fetches the requested blocks from wherever they live and returns them, and the filesystem assembles them into the coherent file your application sees.
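To make the block arithmetic concrete, here is a minimal sketch of how a byte range maps onto fixed-size blocks. The helper name `blocks_for_range` and the 4096-byte default are assumptions for illustration, not part of any storage product.

```shell
#!/usr/bin/env bash
# Hypothetical helper: which fixed-size blocks does a byte range touch?
# Args: byte offset, length in bytes, block size in bytes (default 4096).
blocks_for_range() {
  local offset=$1 length=$2 block_size=${3:-4096}
  local first=$(( offset / block_size ))
  local last=$(( (offset + length - 1) / block_size ))
  echo "first=$first last=$last count=$(( last - first + 1 ))"
}

# A 10,000-byte write starting at byte 0 touches blocks 0 through 2.
blocks_for_range 0 10000     # first=0 last=2 count=3
# A 100-byte write that straddles the 4096-byte boundary touches two blocks.
blocks_for_range 4090 100    # first=0 last=1 count=2
```

Note that even a tiny write can touch two blocks if it crosses a block boundary, which is one reason alignment matters for performance.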
Here’s a simplified look at a typical block storage system in action. Let’s say we’re using a fictional system that exposes volumes via iSCSI.
First, a server (the "initiator") needs to connect to the storage array (the "target"). This involves network configuration.
# On the initiator server (Debian/Ubuntu)
sudo apt-get update
sudo apt-get install -y open-iscsi
# Optional: review the initiator defaults, e.g. set node.startup = automatic
# so that sessions are restored at boot. The target portal address is not set
# in this file; it is passed to iscsiadm with -p during discovery.
sudo nano /etc/iscsi/iscsid.conf
# Discover available targets
sudo iscsiadm -m discovery -t sendtargets -p 192.168.1.100
# Log in to every target found by discovery
# (add -T <target-iqn> -p 192.168.1.100 to log in to just one)
sudo iscsiadm -m node -l
# Check if the new device is visible
lsblk
You’ll see a new device, something like /dev/sdX, appear. This is your block volume, presented over the network.
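Because /dev/sdX names are assigned in detection order, it helps to confirm which device is the iSCSI volume before touching it. The sketch below assumes the udev naming used by common Linux distributions, where iSCSI disks get links under /dev/disk/by-path named `ip-<portal>-iscsi-<target-iqn>-lun-<n>`. The function name `parse_iscsi_by_path` and the target IQN in the last line are hypothetical.

```shell
#!/usr/bin/env bash
# Split a /dev/disk/by-path name of the form
#   ip-<portal>-iscsi-<target-iqn>-lun-<n>
# into its parts. Partition links (ending in -partN) do not match and are skipped.
parse_iscsi_by_path() {
  local name=$1
  local pattern='^ip-(.+)-iscsi-(.+)-lun-([0-9]+)$'
  if [[ $name =~ $pattern ]]; then
    echo "portal=${BASH_REMATCH[1]} target=${BASH_REMATCH[2]} lun=${BASH_REMATCH[3]}"
  else
    return 1
  fi
}

# List whole-disk iSCSI devices on this machine (prints nothing if there are none).
for link in /dev/disk/by-path/ip-*-iscsi-*; do
  [ -e "$link" ] || continue
  if info=$(parse_iscsi_by_path "$(basename "$link")"); then
    echo "$(readlink -f "$link") $info"
  fi
done

# Example with a hypothetical target name:
parse_iscsi_by_path "ip-192.168.1.100:3260-iscsi-iqn.2024-01.com.example:storage.vol1-lun-0"
```

You can get the same information from `sudo iscsiadm -m session -P 3`, which lists the attached SCSI disk for each session.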
Now, you’d partition and format it like any other disk:
# Partition the device (using fdisk or gdisk)
sudo fdisk /dev/sdX
# At the prompts: n (new partition), p (primary), 1, Enter (default start), Enter (default end), w (write changes)
# Format the partition with a filesystem (e.g., ext4)
sudo mkfs.ext4 /dev/sdX1
# Mount the volume (-p avoids an error if the directory already exists)
sudo mkdir -p /mnt/my_volume
sudo mount /dev/sdX1 /mnt/my_volume
# Write some data
echo "This is my test data." | sudo tee /mnt/my_volume/test.txt
Internally, when that tee command writes "This is my test data." to /mnt/my_volume/test.txt, the ext4 filesystem maps the data onto its blocks and issues block writes to /dev/sdX1. The iSCSI initiator wraps those writes in SCSI commands and sends them over TCP/IP to the iSCSI target. The target, following its own internal logic (which might involve RAID, erasure coding, or replication), writes the blocks to its physical storage media and updates its metadata to record where they went. When you later cat /mnt/my_volume/test.txt, and the data is no longer in the local page cache, the process reverses: the filesystem asks for the blocks, the initiator requests them from the target, the target reads them from its physical media and sends them back, and the filesystem reconstructs the file.
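You can see the gap between a file's logical size and the blocks allocated for it without any iSCSI setup. This sketch assumes GNU coreutils `stat` on Linux and uses a temporary file as a stand-in for /mnt/my_volume/test.txt. The allocated size depends on the filesystem, so no particular value is claimed for it.

```shell
#!/usr/bin/env bash
# A temporary file stands in for /mnt/my_volume/test.txt.
demo_file=$(mktemp)
echo "This is my test data." > "$demo_file"
sync   # flush dirty pages so the filesystem has allocated blocks

logical_size=$(stat -c '%s' "$demo_file")      # bytes the application wrote
allocated_units=$(stat -c '%b' "$demo_file")   # units actually allocated
unit_size=$(stat -c '%B' "$demo_file")         # size in bytes of each %b unit
fs_block_size=$(stat -f -c '%S' "$demo_file")  # filesystem's fundamental block size

echo "logical size:      $logical_size bytes"
echo "allocated on disk: $(( allocated_units * unit_size )) bytes"
echo "filesystem block:  $fs_block_size bytes"

rm -f "$demo_file"
```

The logical size is 22 bytes (21 characters plus a newline). On a filesystem such as ext4, the allocation is typically a whole block, and that whole block is what travels to the target.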
The primary problem block storage solves is decoupling compute from storage. A volume is not physically attached to any single server, so you can detach it from one server and attach it to another, and you can scale storage independently of compute. This also helps availability: if a server fails, its volume survives and can be attached elsewhere. Several servers can use the same volume at once only if they coordinate their access, for example through a cluster filesystem such as GFS2 or OCFS2; mounting an ordinary ext4 filesystem read-write from two servers at the same time will corrupt it. The storage system handles the complexity of data placement, redundancy, and access control, presenting a simple, unified block device to the operating system.
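Moving a volume between servers comes down to a clean logout on one side and a login on the other. Here is a dry-run sketch of that sequence. The `run` wrapper, the function names, and the target IQN are hypothetical; with DRY_RUN left at its default of 1, the script only prints the commands it would execute.

```shell
#!/usr/bin/env bash
# Dry-run sketch: DRY_RUN=1 (the default here) only prints each command.
# TARGET_IQN is a hypothetical example value.
PORTAL="192.168.1.100"
TARGET_IQN="iqn.2024-01.com.example:storage.vol1"
MOUNT_POINT="/mnt/my_volume"

run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Run on the server giving up the volume: unmount first, then log out.
detach_volume() {
  run sudo umount "$MOUNT_POINT"
  run sudo iscsiadm -m node -T "$TARGET_IQN" -p "$PORTAL" --logout
}

# Run on the server taking over the volume.
# $1 is the partition device as it appears on that server (e.g. /dev/sdX1).
attach_volume() {
  local device=$1
  run sudo iscsiadm -m discovery -t sendtargets -p "$PORTAL"
  run sudo iscsiadm -m node -T "$TARGET_IQN" -p "$PORTAL" --login
  run sudo mkdir -p "$MOUNT_POINT"
  run sudo mount "$device" "$MOUNT_POINT"
}

detach_volume
attach_volume /dev/sdX1
```

Unmounting before logging out matters: it flushes pending writes so the next server sees a consistent filesystem.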
The "blocks" your OS sees are identified by logical block addresses (LBAs). The storage system’s internal controller translates each LBA into a physical location on the actual storage media. This translation layer is where features like deduplication, compression, and thin provisioning happen. An LBA need not correspond one-to-one to a physical block: compressed data occupies less physical space than its logical size, deduplicated blocks share one physical copy, a thin-provisioned LBA that has never been written has no physical block at all, and a snapshot stores only the blocks that changed.
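A toy model makes the translation layer easier to picture. The sketch below is a simplification, not how any particular array works: it keeps an LBA-to-physical map in bash associative arrays (bash 4 or later), allocates a physical slot only on first write (thin provisioning), and shares one slot between identical blocks (deduplication). A real system would index a fingerprint of the content instead of the content itself. All names are hypothetical.

```shell
#!/usr/bin/env bash
# Toy translation layer (requires bash 4+ for associative arrays).
declare -A lba_to_phys=()      # logical block address -> physical slot
declare -A content_to_phys=()  # block content -> physical slot (dedup index)
declare -a phys_blocks=()      # the "media": one entry per physical slot

# Write non-empty data to a logical block address.
write_block() {
  local lba=$1 data=$2
  # Allocate a new physical slot only if this content is not stored yet.
  if [[ -z ${content_to_phys[$data]+set} ]]; then
    phys_blocks+=("$data")
    content_to_phys[$data]=$(( ${#phys_blocks[@]} - 1 ))
  fi
  # Point the LBA at the slot. An overwritten slot is simply left behind;
  # this sketch does no garbage collection.
  lba_to_phys[$lba]=${content_to_phys[$data]}
}

read_block() {
  local lba=$1
  if [[ -n ${lba_to_phys[$lba]+set} ]]; then
    echo "${phys_blocks[${lba_to_phys[$lba]}]}"
  else
    echo "<zeros>"  # a never-written LBA has no physical block behind it
  fi
}

write_block 0 "AAAA"
write_block 1 "BBBB"
write_block 1000000 "AAAA"   # far-away LBA, same content as LBA 0

read_block 1000000           # AAAA
read_block 5                 # <zeros>
echo "mapped LBAs: ${#lba_to_phys[@]}, physical blocks used: ${#phys_blocks[@]}"
# mapped LBAs: 3, physical blocks used: 2
```

Three logical blocks consume two physical slots, and the million unwritten LBAs in between consume none.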
The next concept to explore is how block storage systems achieve high availability and performance through techniques like RAID and replication, and how these translate to the logical volumes you interact with.