Vault’s GCP KMS auto-unseal hands protection of Vault’s master key to a Cloud KMS key: Vault stores the master key encrypted by KMS and asks KMS to decrypt it on every startup, so no operator has to enter unseal keys.

Let’s see it in action. Imagine you have a Vault server starting up, and it needs to unseal. Normally, this would be a manual process, requiring someone to enter the unseal keys. With auto-unseal, Vault tries to use a GCP KMS key to decrypt its master key.

Here’s a simplified view of the process:

  1. Vault Starts: Vault initializes and needs to unseal.
  2. KMS Key Check: Vault checks if it can access the GCP KMS key specified in its configuration.
  3. Encryption/Decryption: If Vault can access the key, it uses it to decrypt its master key stored on disk. If successful, Vault is unsealed.
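  4. GCP KMS Rotation: If rotation is enabled, GCP KMS periodically creates a new primary key version. Earlier versions stay enabled and can still decrypt, so rotation alone does not interrupt unsealing.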

Rotation is not a race that Vault has to win. A Cloud KMS ciphertext records which key version produced it, and a decrypt request made against the key name is served by that version automatically. Vault therefore keeps unsealing after a rotation without a configuration change or manual intervention. Unsealing only breaks if the key version that encrypted the master key is disabled or destroyed.
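Whether rotation affects decryption can be checked directly against a scratch key. The sketch below is an illustration under assumptions, not part of Vault: the function name kms_rotation_roundtrip is hypothetical, gcloud is assumed to be installed and authenticated with a default project set, and running it creates a real new key version, so it should only be pointed at a throwaway key.

```shell
# Hypothetical check: encrypt with the current primary version, rotate,
# then decrypt the old ciphertext through the key name.
# WARNING: creates a new key version. Use a scratch key only.
kms_rotation_roundtrip() {
  # $1 = key name, $2 = key ring, $3 = location
  rt_tmp=$(mktemp -d) || return 1
  printf 'seal-test' > "$rt_tmp/plain"

  # Encrypt with whatever version is primary right now.
  gcloud kms encrypt --key="$1" --keyring="$2" --location="$3" \
    --plaintext-file="$rt_tmp/plain" --ciphertext-file="$rt_tmp/cipher" || return 1

  # Manual rotation: create a new version and make it primary.
  gcloud kms keys versions create --key="$1" --keyring="$2" \
    --location="$3" --primary || return 1

  # Decrypt the old ciphertext by key name only (no version given).
  gcloud kms decrypt --key="$1" --keyring="$2" --location="$3" \
    --ciphertext-file="$rt_tmp/cipher" --plaintext-file="$rt_tmp/out" || return 1

  cmp -s "$rt_tmp/plain" "$rt_tmp/out" && echo "old ciphertext still decrypts"
  rt_status=$?
  rm -rf "$rt_tmp"
  return $rt_status
}
```

Calling kms_rotation_roundtrip with a scratch key name, key ring, and location should print the confirmation line if the pre-rotation ciphertext decrypts after the new version becomes primary.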

"Removing manual unlock" is an accurate description: whenever Vault starts or restarts, it runs the auto-unseal flow by itself and no operator has to supply unseal keys. The recovery keys that Vault issues at initialization are only needed for privileged operations such as generating a new root token; they cannot unseal Vault.

Here’s the configuration you’d typically see in Vault’s server configuration file (usually /etc/vault.d/vault.hcl):

storage "raft" {
  path    = "/vault/data"
  node_id = "vault1"
  retry_join {
    leader_api_addr = "https://vault1.example.com:8200"
  }
}

seal "gcpckms" {
  project    = "your-gcp-project-id"
  region     = "your-gcp-region"
  key_ring   = "your-key-ring-name"
  crypto_key = "your-kms-key-name" # Name of the KMS key, not a key version
}

The crypto_key in the gcpckms stanza is the name of your KMS key (secret_name is not a valid parameter for this seal). Vault uses this key to encrypt and decrypt its master key. Because the stanza names the key rather than a key version, it stays valid across rotations.

The retry_join stanza is for Raft clustering and plays no part in the auto-unseal mechanism. If you use cloud auto-join instead of a fixed leader_api_addr, the provider for Compute Engine is gce (for example auto_join = "provider=gce tag_value=vault"); there is no gcp_kms provider, and auto_join cannot be combined with leader_api_addr in the same retry_join block.
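The four values in the seal stanza together identify a single Cloud KMS resource. A small helper, sketched here with the hypothetical name kms_key_resource and the placeholder values from the configuration above, assembles the full resource name so it can be compared with what GCP reports:

```shell
# Hypothetical helper: build the Cloud KMS resource name from the four
# values used in the seal "gcpckms" stanza.
kms_key_resource() {
  # $1 = project, $2 = region (KMS location), $3 = key ring, $4 = key name
  printf 'projects/%s/locations/%s/keyRings/%s/cryptoKeys/%s\n' "$1" "$2" "$3" "$4"
}

# Placeholder values; substitute the ones from your own vault.hcl.
kms_key_resource "your-gcp-project-id" "your-gcp-region" \
  "your-key-ring-name" "your-kms-key-name"
```

The printed name should match the name field that gcloud kms keys describe returns for the same key. A mismatch in any segment means Vault is being pointed at a different resource than the one you created.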

The Mental Model:

Think of Vault’s master key as the combination to a safe. Instead of people remembering the combination (manual unseal), Vault keeps it written down inside a locked box (the KMS-encrypted master key in Vault’s storage). Only GCP KMS can open that box, and it does so for a caller that has the right credentials and names the right KMS key. The gcpckms seal stanza tells Vault which key to ask for.

Rotation does not break this picture. When GCP KMS rotates the key, it adds a new key version for future encryptions and keeps the old versions, so a box locked by version 1 still opens after version 2 becomes primary. The box only becomes unopenable if someone disables or destroys the version that locked it.

The "Remove Manual Unlock" Part:

What you’re trying to achieve is to have Vault automatically unseal itself when it starts up, without you having to type in the unseal keys. Once unsealed, Vault holds the decrypted master key in memory, so it stays unsealed for as long as the process runs.

When Vault restarts, it needs to re-decrypt its master key. It does this by calling out to GCP KMS. If the key version that encrypted the master key is still enabled, and Vault has the correct GCP credentials, it will successfully decrypt the master key and unseal itself.

The "manual unlock" you’re removing is the interactive vault operator unseal command. Nothing replaces it: any start of the Vault process, whether from systemctl restart vault or a machine reboot, runs the auto-unseal flow.
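After a start or restart, the outcome of the auto-unseal flow can be read from the exit code of vault status, which is 0 when Vault is unsealed, 2 when it is sealed, and 1 on error. A minimal sketch, assuming the vault CLI is on the PATH and VAULT_ADDR points at the server (the function name vault_seal_state is hypothetical):

```shell
# Hypothetical helper: translate the exit code of `vault status`
# into a single word.
vault_seal_state() {
  vault status >/dev/null 2>&1
  case $? in
    0) echo "unsealed" ;;
    2) echo "sealed" ;;
    *) echo "error" ;;   # e.g. server unreachable or CLI missing
  esac
}
```

Running vault_seal_state a few seconds after systemctl restart vault gives a quick signal for monitoring or a post-deploy check, without parsing any output.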

The Crucial Counter-Intuitive Detail:

The seal stanza names a KMS key, not a key version, and that is sufficient. New encryptions use the key’s primary version, while each ciphertext identifies the version that produced it, so decryption is routed to the right version even after the primary has changed. Automatic rotation is therefore compatible with auto-unseal. The real requirement is that no key version that has encrypted Vault’s master key is disabled or destroyed while that ciphertext is still in use. There is no race between Vault’s startup and the rotation schedule.

Troubleshooting and Common Pitfalls:

  1. Incorrect GCP Permissions: Vault’s service account needs the roles/cloudkms.cryptoKeyEncrypterDecrypter role on the specific KMS key. Vault’s documentation additionally lists the cloudkms.cryptoKeys.get permission.

    • Diagnosis: gcloud kms keys get-iam-policy YOUR_KMS_KEY_NAME --keyring=YOUR_KEY_RING_NAME --location=YOUR_GCP_REGION --project=YOUR_GCP_PROJECT_ID and confirm that the Vault service account appears as a member of the role.
    • Fix: gcloud kms keys add-iam-policy-binding YOUR_KMS_KEY_NAME --keyring=YOUR_KEY_RING_NAME --location=YOUR_GCP_REGION --project=YOUR_GCP_PROJECT_ID --member="serviceAccount:YOUR_VAULT_SERVICE_ACCOUNT@YOUR_GCP_PROJECT_ID.iam.gserviceaccount.com" --role="roles/cloudkms.cryptoKeyEncrypterDecrypter"
    • Why it works: This grants the Vault service account the necessary permissions to encrypt and decrypt data using the specified KMS key.
  2. KMS Key Rotation: Rotation by itself does not break auto-unseal. Unsealing fails if the key version that encrypted the master key has been disabled, scheduled for destruction, or destroyed.

    • Diagnosis: gcloud kms keys versions list --key=YOUR_KMS_KEY_NAME --keyring=YOUR_KEY_RING_NAME --location=YOUR_GCP_REGION --project=YOUR_GCP_PROJECT_ID and check the state of every version. Examine Vault logs for errors related to decryption or KMS API calls.
    • Fix: Re-enable a disabled version with gcloud kms keys versions enable, or cancel a scheduled destruction with gcloud kms keys versions restore and then enable the version. A destroyed version cannot be recovered, so never destroy a version that may still protect Vault’s master key. Auto-rotation itself can stay on.
    • Why it works: The version that produced the stored ciphertext is usable again, so KMS can decrypt the master key.
  3. Incorrect Configuration: Typos in the project ID, region, key ring, or KMS key name, or a wrong parameter name in the seal stanza.

    • Diagnosis: Double-check the seal "gcpckms" stanza in your Vault configuration file (/etc/vault.d/vault.hcl or similar) against your GCP resource names.
    • Fix: Correct any discrepancies in the project, region, key_ring, or crypto_key values in your vault.hcl. The KMS key name belongs in crypto_key; secret_name is not a valid gcpckms parameter.
    • Why it works: Ensures Vault is looking for the correct KMS key in the correct GCP location.
  4. Network/Firewall Issues: Vault server cannot reach the GCP KMS API endpoint.

    • Diagnosis: From the Vault server, attempt to reach cloudkms.googleapis.com on port 443. Use curl -v https://cloudkms.googleapis.com/v1/projects/YOUR_GCP_PROJECT_ID/locations/YOUR_GCP_REGION/keyRings/YOUR_KEY_RING_NAME/cryptoKeys/YOUR_KMS_KEY_NAME. The request is unauthenticated, so an HTTP error such as 401 or 403 still proves that the network path works; a timeout or a failed TLS handshake does not.
    • Fix: Configure firewall rules, Cloud NAT, or Private Google Access to allow outbound HTTPS traffic from the Vault server to cloudkms.googleapis.com.
    • Why it works: Enables Vault to communicate with the GCP KMS service.
  5. Vault Service Account Not Attached or Incorrect: The Compute Engine instance running Vault doesn’t have the correct service account attached, or the service account has been modified.

    • Diagnosis: Check the service account assigned to the Compute Engine instance running Vault in the GCP Console. Verify that this service account is the one you granted KMS permissions to.
    • Fix: Re-attach the correct service account to the Compute Engine instance or create a new one with the necessary KMS permissions and attach it.
    • Why it works: Ensures Vault has the identity required to authenticate with GCP and use KMS.
  6. Vault Not Started Correctly: Vault might have started in a state where it couldn’t access its data directory, leading to a perceived unseal failure.

    • Diagnosis: Check Vault’s systemd journal logs (journalctl -u vault -f) for messages indicating it cannot access its storage (/vault/data in the example) or seal status.
    • Fix: Ensure the Vault process has read/write permissions to its data directory and that the underlying storage is healthy. Restart Vault with sudo systemctl restart vault.
    • Why it works: Guarantees Vault can access its encrypted state file and attempt the KMS unseal.
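Rotation-related failures come down to whether every key version that may have encrypted the master key is still usable. The check can be scripted by listing key versions and keeping only those whose state is not ENABLED. This is a sketch, assuming gcloud is installed and authenticated with a default project set; the function name kms_unusable_versions is hypothetical.

```shell
# Hypothetical helper: print key versions that KMS cannot currently use
# for decryption (DISABLED, DESTROY_SCHEDULED, DESTROYED, ...).
kms_unusable_versions() {
  # $1 = key name, $2 = key ring, $3 = location
  gcloud kms keys versions list --key="$1" --keyring="$2" --location="$3" \
    --format='value(name,state)' |
    awk '$2 != "ENABLED" { print $1 " " $2 }'
}
```

Empty output from kms_unusable_versions YOUR_KMS_KEY_NAME YOUR_KEY_RING_NAME YOUR_GCP_REGION means every version is enabled. Any line it prints names a version to investigate before the next Vault restart.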

If auto-unseal fails for any of these reasons, Vault stays in a persistent "Vault is sealed" state after startup. Recovery keys cannot unseal it, so the way out is to fix the underlying KMS, credential, or network problem and then restart the Vault service.
