Most people think TLS certificates are just about "encrypting traffic," but their real power lies in establishing trust and controlling access in a distributed system.
Let’s see this in action. Imagine a microservice, frontend-api, that needs to talk to a user-service.
# frontend-api's config
apiVersion: v1
kind: Pod
metadata:
  name: frontend-api
spec:
  containers:
  - name: app
    image: my-repo/frontend-api:latest
    ports:
    - containerPort: 8080
    volumeMounts:
    - name: tls-certs
      mountPath: "/etc/ssl/certs/user-service"
      readOnly: true
  volumes:
  - name: tls-certs
    secret:
      secretName: user-service-tls-cert
Here, frontend-api expects to find the user-service’s TLS certificate at /etc/ssl/certs/user-service/tls.crt. When frontend-api makes a request to user-service, it will:
- Load the tls.crt from the mounted secret.
- Use this certificate to verify the identity of user-service during the TLS handshake.
- If the certificate is valid and matches the expected hostname (e.g., user-service.internal.mycompany.com), the connection proceeds. Otherwise, it fails.
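The steps above can be sketched in Python with the standard ssl module. This is a minimal illustration, not the actual frontend-api code: the function names are hypothetical, and it assumes the mounted tls.crt can act as a trust anchor (in many setups the trusted file is the issuing CA’s certificate rather than the service’s own leaf certificate).

```python
import socket
import ssl
from typing import Optional


def build_client_context(ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Return a client-side TLS context that verifies the server's identity.

    ca_file is the certificate to trust, e.g. the mounted
    /etc/ssl/certs/user-service/tls.crt. With None, the system trust
    store is used instead, which is handy for trying the sketch locally.
    """
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    # create_default_context already enables both checks; they are set
    # explicitly here so the intent is visible.
    ctx.check_hostname = True
    ctx.verify_mode = ssl.CERT_REQUIRED
    return ctx


def open_verified_connection(host: str, port: int, ctx: ssl.SSLContext) -> ssl.SSLSocket:
    """Connect and run the TLS handshake.

    Raises ssl.SSLCertVerificationError if the certificate is not trusted
    or does not match the hostname.
    """
    sock = socket.create_connection((host, port), timeout=5)
    try:
        return ctx.wrap_socket(sock, server_hostname=host)
    except Exception:
        sock.close()
        raise
```

Calling build_client_context("/etc/ssl/certs/user-service/tls.crt") and then open_verified_connection("user-service.internal.mycompany.com", 443, ctx) would fail the handshake for any peer whose certificate does not chain to the trusted file or does not carry that hostname. The port is an assumption.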
This mechanism ensures that frontend-api isn’t just talking to any service claiming to be user-service, but specifically to the one it trusts, identified by its certificate. This is crucial for preventing man-in-the-middle attacks and for implementing granular authorization policies based on service identity.
The system you’re dealing with is often a Kubernetes cluster, where managing these certificates for inter-service communication, external ingress, and even internal APIs becomes a significant operational challenge. The core problem is that certificates have an expiration date, and manual renewal is a recipe for disaster. Automated tracking, renewal, and auditing are therefore essential.
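To make the expiration problem concrete, here is a small sketch of the check any tracking tool has to perform: compare a certificate’s notAfter timestamp with the current time. The function names and the 30-day threshold are illustrative assumptions, not cert-manager behaviour.

```python
import ssl

SECONDS_PER_DAY = 86_400


def days_until_expiry(not_after: str, now: float) -> float:
    """Days left before a certificate expires.

    not_after uses the format found in certificates as returned by
    ssl.SSLSocket.getpeercert(), e.g. "Jan  5 09:34:43 2018 GMT".
    now is a Unix timestamp, e.g. time.time().
    """
    expires_at = ssl.cert_time_to_seconds(not_after)
    return (expires_at - now) / SECONDS_PER_DAY


def needs_renewal(not_after: str, now: float, threshold_days: float = 30.0) -> bool:
    """True once the remaining lifetime drops to the (assumed) threshold."""
    return days_until_expiry(not_after, now) <= threshold_days
```

Running a check like this by hand for every certificate in a cluster is exactly the chore that automation is meant to remove.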
The primary tool for this in Kubernetes is cert-manager. It is not itself a certificate authority (CA); it is a controller that runs in your cluster, requests certificates from the issuers you configure (such as Let’s Encrypt, HashiCorp Vault, or a self-signed or private CA), and automatically handles their lifecycle.
Here’s a typical setup for automatically obtaining and renewing a TLS certificate for an Ingress resource:
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: letsencrypt-prod
  namespace: default
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@mycompany.com
    privateKeySecretRef:
      name: letsencrypt-prod-private-key
    solvers:
    - http01:
        ingress:
          class: nginx
This Issuer resource tells cert-manager how to obtain certificates from Let’s Encrypt using the HTTP01 challenge.
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: my-app-tls
  namespace: default
spec:
  secretName: my-app-tls-secret
  dnsNames:
  - myapp.mycompany.com
  issuerRef:
    name: letsencrypt-prod
    kind: Issuer
This Certificate resource requests a certificate for myapp.mycompany.com from the letsencrypt-prod Issuer. cert-manager will then:
- Create a Kubernetes Secret named my-app-tls-secret to store the issued certificate and its private key.
- Monitor the certificate’s expiration.
- Automatically renew it before it expires, updating my-app-tls-secret in place.
The Ingress resource is something you define yourself, referencing my-app-tls-secret in its tls section; cert-manager does not configure it for you in this setup. An ingress controller such as ingress-nginx watches the Secret and typically starts serving the renewed certificate without interrupting traffic.
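The beauty of this is that the frontend-api (or any other service) doesn’t need to know about the renewal process. It simply consumes the certificate from its mounted secret. When cert-manager updates the secret, the kubelet eventually refreshes the files in the mounted volume (this does not apply to subPath mounts), so the new certificate appears on disk inside the frontend-api pod. The application still has to re-read the file before it actually uses the new certificate.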
Auditing involves checking the status of your certificates. cert-manager provides custom resources like Certificate and CertificateRequest that you can query. For instance, to see the status of all certificates managed by cert-manager:
kubectl get certificates --all-namespaces
This command lists each certificate with its readiness status and target Secret; add -o wide to also see the issuer. If a certificate is not ready, it might indicate a configuration issue with the Issuer or a problem with the challenge solver (e.g., the Ingress controller isn’t properly configured for HTTP01 challenges).
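For a scripted audit, the same data is available as JSON via kubectl get certificates --all-namespaces -o json. The sketch below filters that output for certificates whose Ready condition is not True. The function name is hypothetical, and the code only assumes the standard metadata and status.conditions fields of the Certificate resource.

```python
from typing import Any, Dict, List


def not_ready_certificates(cert_list: Dict[str, Any]) -> List[str]:
    """Return "namespace/name" for every Certificate that is not Ready.

    cert_list is the parsed JSON of a Certificate list, i.e. a dict
    with an "items" array.
    """
    problems = []
    for item in cert_list.get("items", []):
        meta = item.get("metadata", {})
        conditions = item.get("status", {}).get("conditions", [])
        ready = any(
            c.get("type") == "Ready" and c.get("status") == "True"
            for c in conditions
        )
        # A certificate with no status at all is also reported:
        # it has not been issued yet.
        if not ready:
            problems.append(f"{meta.get('namespace')}/{meta.get('name')}")
    return problems
```

Feeding it json.load(sys.stdin) from the piped kubectl output gives a list that can drive an alert or a CI check.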
cert-manager also logs extensively, which is invaluable for debugging. You can check the logs of the cert-manager pod itself:
kubectl logs -n cert-manager cert-manager-xxxxxxxxx-yyyyy
When cert-manager renews a certificate, it only updates the Secret. It does not restart pods or trigger a rollout of the Deployments, StatefulSets, or DaemonSets that use that secret. Pods that mount the Secret as a volume see the new files once the kubelet syncs them, but an application that reads its certificate only at startup keeps using the old one until it reloads or restarts. If you have such a service (e.g., a custom application that needs to be reloaded), you need either in-process reloading that watches the certificate file, or an external mechanism that restarts the workload when the Secret changes.
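For an application that has to reload its certificate itself, one simple approach is to poll the mounted file and react when its content changes. The sketch below is an assumption-laden illustration: the class name is hypothetical, and it compares content hashes rather than modification times, since Kubernetes swaps the files in a secret volume via symlinks.

```python
import hashlib
from pathlib import Path
from typing import Optional


class CertFileWatcher:
    """Detects content changes in a certificate file by polling."""

    def __init__(self, path: str) -> None:
        self._path = Path(path)
        self._digest = self._read_digest()

    def _read_digest(self) -> Optional[str]:
        try:
            return hashlib.sha256(self._path.read_bytes()).hexdigest()
        except FileNotFoundError:
            # The file can be briefly missing while the volume is updated.
            return None

    def changed(self) -> bool:
        """True exactly once after the file content has changed."""
        current = self._read_digest()
        if current is not None and current != self._digest:
            self._digest = current
            return True
        return False
```

A background loop would call changed() every few seconds and, when it returns True, rebuild the ssl.SSLContext (or call load_cert_chain on it) so that new connections use the renewed certificate.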
The most surprising thing about automated certificate renewal is how often it breaks due to subtle DNS or network misconfigurations that prevent the ACME challenge from succeeding. You might have a perfectly valid Issuer and Certificate resource, but if your DNS records don’t propagate fast enough, or if a firewall blocks the HTTP01 challenge traffic, Let’s Encrypt will fail to validate ownership, and your certificate will never be issued or renewed.
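A quick preflight check can catch some of these failures before an order is even placed. The sketch below builds the URL that an HTTP01 validation request targets (the /.well-known/acme-challenge/ path on port 80, per the ACME specification) and resolves the domain so you can compare the addresses with your ingress. The helper names and the token value are hypothetical.

```python
import socket
from typing import List


def http01_challenge_url(domain: str, token: str) -> str:
    """URL the ACME server fetches during an HTTP01 challenge."""
    return f"http://{domain}/.well-known/acme-challenge/{token}"


def resolve_addresses(domain: str) -> List[str]:
    """IP addresses the domain currently resolves to, or [] on failure.

    Note that this uses the local resolver; the ACME server may see
    different answers if DNS changes have not propagated yet.
    """
    try:
        infos = socket.getaddrinfo(domain, 80, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})
```

If resolve_addresses("myapp.mycompany.com") is empty, or does not include your ingress controller’s external address, the challenge has little chance of succeeding regardless of how correct the Issuer and Certificate resources are.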
The next concept you’ll grapple with is managing trust for internal services where you don’t control public DNS or want to use external CAs. This often leads to using cert-manager with a self-signed CA or integrating it with internal PKI solutions like HashiCorp Vault.