Most people think TLS certificates are just about "encrypting traffic," but their real power lies in establishing trust and controlling access in a distributed system.

Let’s see this in action. Imagine a microservice, frontend-api, that needs to talk to a user-service.

# frontend-api's config
apiVersion: v1
kind: Pod
metadata:
  name: frontend-api
spec:
  containers:
  - name: app
    image: my-repo/frontend-api:latest
    ports:
    - containerPort: 8080
    volumeMounts:
    - name: tls-certs
      mountPath: "/etc/ssl/certs/user-service"
      readOnly: true
  volumes:
  - name: tls-certs
    secret:
      secretName: user-service-tls-cert

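Here, frontend-api expects to find the certificate it should trust for user-service at /etc/ssl/certs/user-service/tls.crt. The secret mounted into the client should contain only public material: the service’s certificate or, more commonly, the certificate of the CA that signed it. If user-service-tls-cert is a kubernetes.io/tls Secret, it also contains tls.key, and mounting it here would hand user-service’s private key to frontend-api. When frontend-api makes a request to user-service, it will:

  1. Load tls.crt from the mounted secret as its trust anchor.
  2. During the TLS handshake, check that the certificate user-service presents is that certificate or chains to it.
  3. Check that the presented certificate is within its validity period and matches the expected hostname (e.g., user-service.internal.mycompany.com). If every check passes, the connection proceeds. Otherwise, it fails.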
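
The steps above can be sketched on the client side with Python’s standard ssl module. This is a minimal illustration rather than frontend-api’s actual code: it assumes tls.crt holds a self-signed certificate or a CA certificate (a trust anchor), and the function names are hypothetical.

```python
import socket
import ssl

# Path taken from the Pod spec above. Assumption: the file holds a
# self-signed certificate or a CA certificate that can act as a trust anchor.
CERT_PATH = "/etc/ssl/certs/user-service/tls.crt"


def build_client_context(trust_file: str = CERT_PATH) -> ssl.SSLContext:
    """Return a client context that trusts only the certificates in trust_file."""
    # PROTOCOL_TLS_CLIENT turns on certificate verification and hostname
    # checking by default and starts with an empty trust store.
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    context.load_verify_locations(cafile=trust_file)  # step 1
    return context


def open_verified_connection(host: str, port: int,
                             trust_file: str = CERT_PATH) -> ssl.SSLSocket:
    """Connect to host:port; raises if the chain or hostname check fails."""
    context = build_client_context(trust_file)
    raw = socket.create_connection((host, port), timeout=5)
    try:
        # The handshake runs here (step 2). server_hostname is used for SNI
        # and for the hostname match against the certificate (step 3).
        return context.wrap_socket(raw, server_hostname=host)
    except Exception:
        raw.close()
        raise
```

A caller would use something like open_verified_connection("user-service.internal.mycompany.com", 443), where the port is an assumption. A certificate that does not chain to the mounted file, or whose names do not include the host, surfaces as ssl.SSLCertVerificationError.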

This mechanism ensures that frontend-api isn’t just talking to any service claiming to be user-service, but specifically to the one it trusts, identified by its certificate. This is crucial for preventing man-in-the-middle attacks. Note that this one-way check authenticates only the server. Granular authorization policies based on service identity additionally require mutual TLS, where frontend-api presents its own certificate and user-service verifies it.

The system you’re dealing with is often a Kubernetes cluster, where managing these certificates for inter-service communication, external ingress, and even internal APIs becomes a significant operational challenge. The core problem is that certificates have an expiration date, and manual renewal is a recipe for disaster. Automated tracking, renewal, and auditing are therefore essential.

The primary tool for this in Kubernetes is cert-manager. It is not a certificate authority itself. It is a controller that requests certificates from the issuers you configure (such as Let’s Encrypt, HashiCorp Vault, or a self-signed or private CA), stores them in Secrets, and automatically handles their lifecycle.

Here’s a typical setup for automatically obtaining and renewing a TLS certificate for an Ingress resource:

apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: letsencrypt-prod
  namespace: default
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@mycompany.com
    privateKeySecretRef:
      name: letsencrypt-prod-private-key
    solvers:
    - http01:
        ingress:
          class: nginx

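This Issuer resource tells cert-manager how to obtain certificates from Let’s Encrypt using the HTTP-01 challenge. An Issuer is namespaced, so this one can only serve Certificate resources in the default namespace. A ClusterIssuer does the same job across all namespaces.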

apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: my-app-tls
  namespace: default
spec:
  secretName: my-app-tls-secret
  dnsNames:
  - myapp.mycompany.com
  issuerRef:
    name: letsencrypt-prod
    kind: Issuer

This Certificate resource requests a certificate for myapp.mycompany.com from the letsencrypt-prod Issuer. cert-manager will then:

  1. Generate a private key and complete the HTTP-01 challenge to prove control of myapp.mycompany.com.
  2. Store the issued certificate and its private key in a Kubernetes Secret named my-app-tls-secret.
  3. Monitor the certificate’s expiration.
  4. Automatically renew it before it expires (by default, about two-thirds of the way through its lifetime), updating my-app-tls-secret in place.

cert-manager does not configure the Ingress for you. You define the Ingress yourself and reference my-app-tls-secret in its TLS section. The ingress controller watches that Secret and normally picks up the renewed certificate without interrupting traffic.
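
The Ingress that references my-app-tls-secret is not shown above. As a rough sketch, the snippet below builds such a manifest as a Python dictionary and prints it as JSON, which kubectl accepts alongside YAML. The Ingress name, the backend Service name my-app, and port 80 are assumptions made for illustration.

```python
import json


def ingress_manifest(host, tls_secret, service_name, service_port,
                     name="my-app", namespace="default", ingress_class="nginx"):
    """Build a networking.k8s.io/v1 Ingress that terminates TLS with tls_secret."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "ingressClassName": ingress_class,
            # The TLS section is what ties the Ingress to the Secret that
            # cert-manager creates and keeps renewed.
            "tls": [{"hosts": [host], "secretName": tls_secret}],
            "rules": [{
                "host": host,
                "http": {"paths": [{
                    "path": "/",
                    "pathType": "Prefix",
                    "backend": {"service": {
                        "name": service_name,            # hypothetical Service
                        "port": {"number": service_port},
                    }},
                }]},
            }],
        },
    }


manifest = ingress_manifest("myapp.mycompany.com", "my-app-tls-secret", "my-app", 80)
print(json.dumps(manifest, indent=2))
```

The printed JSON can be piped into kubectl apply -f - once the names match your cluster. The Secret has to live in the same namespace as the Ingress.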

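The beauty of this is that a consumer of the certificate doesn’t need to take part in the renewal process. It simply reads the certificate from its mounted secret, the same way frontend-api reads its trust material. When cert-manager updates the Secret, the kubelet refreshes the files in the mounted volume after a short delay. Two caveats apply: volumes mounted with subPath do not receive updates, and the application still has to re-read the file to start using the new certificate.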
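
An application that loads its certificate once at startup will not notice an updated file by itself. One way to detect the change, sketched below, is to fingerprint the file contents and compare on a timer. The class name and the polling approach are illustrative choices, not something cert-manager or Kubernetes provides.

```python
import hashlib
from pathlib import Path


class CertificateWatcher:
    """Detects when the certificate file at cert_path has new contents."""

    def __init__(self, cert_path):
        self._path = Path(cert_path)
        self._fingerprint = self._read_fingerprint()

    def _read_fingerprint(self) -> str:
        # Hash the contents rather than relying on mtime: a mounted Secret
        # is exposed through symlinks that are swapped on update.
        return hashlib.sha256(self._path.read_bytes()).hexdigest()

    def changed(self) -> bool:
        """Return True once for each new version of the file."""
        current = self._read_fingerprint()
        if current == self._fingerprint:
            return False
        self._fingerprint = current
        return True
```

A service could call changed() every minute or so and rebuild its TLS context when it returns True. Polling is cheap for a single small file and avoids depending on filesystem notification libraries.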

Auditing involves checking the status of your certificates. cert-manager provides custom resources like Certificate and CertificateRequest that you can query. For instance, to see the status of all certificates managed by cert-manager:

kubectl get certificates --all-namespaces

This command will show you certificates, their issuers, and their readiness status. If a certificate is not ready, it might indicate a configuration issue with the Issuer or a problem with the challenge solver (e.g., the Ingress controller isn’t properly configured for HTTP01 challenges).
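
For a recurring audit, the same data can be processed as JSON (kubectl get certificates --all-namespaces -o json). The sketch below flags certificates that are not Ready or that expire within a warning window. It assumes the status.conditions and status.notAfter fields that cert-manager sets on Certificate resources. The function names and the 14-day window are arbitrary choices for illustration.

```python
from datetime import datetime, timedelta, timezone


def parse_k8s_time(value):
    """Parse a Kubernetes timestamp such as 2024-06-05T00:00:00Z."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def find_problem_certificates(cert_list, now, warn_within=timedelta(days=14)):
    """Return (namespace/name, reason) pairs for certificates needing attention."""
    problems = []
    for item in cert_list.get("items", []):
        meta = item.get("metadata", {})
        status = item.get("status", {})
        name = f"{meta.get('namespace')}/{meta.get('name')}"
        ready = any(
            cond.get("type") == "Ready" and cond.get("status") == "True"
            for cond in status.get("conditions", [])
        )
        if not ready:
            problems.append((name, "not ready"))
            continue
        not_after = status.get("notAfter")
        # A Ready certificate close to expiry suggests renewal is failing.
        if not_after and parse_k8s_time(not_after) - now <= warn_within:
            problems.append((name, f"expires {not_after}"))
    return problems
```

In a script, cert_list would come from json.load(sys.stdin) with the kubectl output piped in, and now from datetime.now(timezone.utc).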

cert-manager also logs extensively, which is invaluable for debugging. You can check the logs of the cert-manager pod itself (replace the placeholder pod name with the one shown by kubectl get pods -n cert-manager):

kubectl logs -n cert-manager cert-manager-xxxxxxxxx-yyyyy

When cert-manager renews a certificate, it updates the Secret and nothing more. It does not restart pods or trigger a rollout of the Deployments, StatefulSets, or DaemonSets that use that Secret. Ingress controllers typically watch Secrets and reload on their own, and pods that mount the Secret as a volume eventually see the new files, but a process that loaded the certificate at startup keeps serving the old one until it re-reads it. If you have a service that doesn’t automatically pick up secret changes, you need a reload mechanism: a file watcher inside the application, a rolling restart (kubectl rollout restart), or a separate controller that restarts workloads when a Secret changes (Stakater Reloader is one example).

The most surprising thing about automated certificate renewal is how often it breaks due to subtle DNS or network misconfigurations that prevent the ACME challenge from succeeding. You might have a perfectly valid Issuer and Certificate resource, but if the DNS record for the domain doesn’t point at your ingress controller, or if a firewall blocks the inbound port 80 traffic that the HTTP-01 challenge relies on, Let’s Encrypt cannot validate ownership, and your certificate will never be issued or renewed.
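
Because a failed renewal can go unnoticed until the old certificate expires, it can help to check the certificate an endpoint actually serves, independently of cert-manager. The following is a minimal sketch using Python’s standard library. The function names are hypothetical, and it assumes the endpoint is reachable on port 443 with a publicly trusted certificate.

```python
import socket
import ssl
import time


def days_remaining(not_after, now=None):
    """Days until not_after, given in getpeercert() format,
    e.g. 'Jan  5 09:34:43 2018 GMT'. Negative if already past."""
    expires_at = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return (expires_at - current) / 86400


def served_certificate_days_remaining(host, port=443):
    """Connect to host and report how long its served certificate stays valid."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=5) as raw:
        # An expired or untrusted certificate makes the handshake raise
        # ssl.SSLCertVerificationError, which is itself an alert condition.
        with context.wrap_socket(raw, server_hostname=host) as tls:
            return days_remaining(tls.getpeercert()["notAfter"])
```

Run on a schedule against myapp.mycompany.com, a result that keeps dropping below the expected renewal point is a strong hint that the challenge is failing.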

The next concept you’ll grapple with is managing trust for internal services where you don’t control public DNS or want to use external CAs. This often leads to using cert-manager with a self-signed CA or integrating it with internal PKI solutions like HashiCorp Vault.
