You’ve got a TLS certificate that’s about to expire, and you need a way to get an alert before it becomes a problem. This isn’t just about seeing the expiry date; it’s about actively monitoring it and acting on it proactively.

Let’s see how this actually works in practice. Imagine you have a web server, say Nginx, serving a site example.com on https://example.com:443.

server {
    listen 443 ssl;
    server_name example.com;

    ssl_certificate /etc/nginx/ssl/example.com.crt;
    ssl_certificate_key /etc/nginx/ssl/example.com.key;

    # ... other configurations
}

The certificate file, /etc/nginx/ssl/example.com.crt, is what we need to check.

The most surprising truth about monitoring TLS certificate expiry is that the certificate itself doesn’t actively do anything to alert you. It’s a static file. The "monitoring" is entirely an external process that needs to be built or configured. You’re not subscribing to an alert from the certificate; you’re asking a separate system to periodically inspect the certificate file and compare its Not After date against the current date.

Here’s a concrete example of how you might check this from the command line. If you have openssl installed, you can extract the expiry date directly:

openssl x509 -in /etc/nginx/ssl/example.com.crt -noout -dates

This command will output something like:

notBefore=Jan 1 00:00:00 2023 GMT
notAfter=Dec 31 23:59:59 2023 GMT

The notAfter field is your target. You need a script or a monitoring tool that runs this command (or a similar check) regularly, parses the output, and compares the notAfter date to the current date. If the difference is below a certain threshold (e.g., 30 days), it triggers an alert.
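The parse-and-compare step can be sketched in Python as follows. This is a minimal illustration, not a finished tool: the function names (`read_cert_dates`, `parse_not_after`, `days_remaining`, `expiring_soon`) are hypothetical, and the sketch assumes `openssl` is on the PATH and prints month names in English.

```python
import subprocess
from datetime import datetime, timezone


def read_cert_dates(cert_path: str) -> str:
    """Run the openssl command shown above and return its raw text output."""
    result = subprocess.run(
        ["openssl", "x509", "-in", cert_path, "-noout", "-dates"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def parse_not_after(openssl_output: str) -> datetime:
    """Extract the notAfter timestamp as a timezone-aware UTC datetime."""
    for line in openssl_output.splitlines():
        if line.startswith("notAfter="):
            value = line.split("=", 1)[1].strip()
            # openssl prints e.g. "Dec 31 23:59:59 2023 GMT"; the value is in GMT/UTC.
            # Note: %b depends on the process locale (assumed English here).
            parsed = datetime.strptime(value, "%b %d %H:%M:%S %Y GMT")
            return parsed.replace(tzinfo=timezone.utc)
    raise ValueError("no notAfter line found in openssl output")


def days_remaining(not_after: datetime, now: datetime) -> int:
    """Whole days left before expiry (negative once the certificate has expired)."""
    return (not_after - now).days


def expiring_soon(not_after: datetime, now: datetime, threshold_days: int = 30) -> bool:
    """True when fewer than threshold_days whole days remain."""
    return days_remaining(not_after, now) < threshold_days
```

Passing `now` explicitly (for example `datetime.now(timezone.utc)`) keeps both sides of the comparison in UTC, and makes the logic easy to test against fixed dates.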

This proactive monitoring is crucial because an expired certificate makes your website or service effectively inaccessible to clients that expect secure connections. Browsers show stark, often frightening, security warnings, and API clients typically refuse the connection outright, leading to lost traffic and trust. The server itself doesn’t crash or log an obvious error; it keeps presenting the expired certificate, and the failure only shows up on the client side.

The mental model to build is that of a detached observer. Your TLS certificate is like a driver’s license: it has an expiry date printed on it. The license itself doesn’t remind you when it’s about to expire. You need a separate system (like a calendar reminder, or a recurring task) to look at the license, read the date, and tell you to renew it. In our case, the "observer" is a monitoring agent, and the "license" is the certificate file.

To implement this, you’d typically use a combination of:

  1. A check script: This script, often written in Python, Bash, or Go, uses libraries or command-line tools (like openssl or Go’s crypto/x509) to read the certificate file and extract the notAfter date. It then compares this date to the current date.

    • Diagnosis: Run the script manually. Does it correctly identify the expiry date? Does it calculate the difference correctly?
    • Fix: If the script is buggy, fix the date parsing or comparison logic. For example, a common mistake is not handling timezones correctly or misinterpreting the date format. Ensure your script uses robust date/time parsing.
    • Why it works: The script acts as the "observer," programmatically validating the certificate’s validity period.
  2. A scheduler: Tools like cron (on Linux/macOS) or Task Scheduler (on Windows) are used to run the check script at regular intervals (e.g., daily).

    • Diagnosis: Check the cron log (commonly /var/log/syslog or /var/log/cron, depending on the distribution, or the systemd journal) for the scheduled task’s execution. Did it run? Did it error out?
    • Fix: Ensure the cron entry is correct and the user running it has read permissions on the certificate file. For example, a typical cron entry might look like: 0 3 * * * /usr/local/bin/check_cert.sh >> /var/log/cert_check.log 2>&1.
    • Why it works: Scheduling ensures the check happens consistently, so you’re not relying on manual inspection.
  3. An alerting mechanism: If the check script finds an impending expiry, it needs to notify you. This could be via email, Slack, PagerDuty, or any other notification system.

    • Diagnosis: Configure a test alert. Does it arrive? Is the message clear and actionable?
    • Fix: Ensure your alerting integration (e.g., sendmail for email, webhook for Slack) is correctly configured and authenticated.
    • Why it works: Alerts bridge the gap between detection and action, ensuring you’re informed promptly.
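As an illustration of the alerting step, here is a minimal sketch that turns a "days remaining" value into a notification and posts it to a Slack-style incoming webhook, which accepts a JSON body with a `text` field. The function names and the webhook URL are placeholders invented for this example, not part of any particular tool.

```python
import json
import urllib.request
from typing import Optional


def build_alert(host: str, days_left: int, threshold_days: int = 30) -> Optional[dict]:
    """Return a webhook payload if the certificate needs attention, else None."""
    if days_left >= threshold_days:
        return None  # Plenty of time left: stay quiet.
    if days_left < 0:
        text = f"TLS certificate for {host} expired {-days_left} day(s) ago. Renew it now."
    else:
        text = (
            f"TLS certificate for {host} expires in {days_left} day(s) "
            f"(threshold: {threshold_days}). Renew it soon."
        )
    return {"text": text}


def post_to_webhook(webhook_url: str, payload: dict, timeout: float = 10.0) -> int:
    """POST the payload as JSON and return the HTTP status code."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Example wiring (the URL below is a placeholder, not a real endpoint):
# payload = build_alert("example.com", days_left=12)
# if payload is not None:
#     post_to_webhook("https://hooks.slack.com/services/REPLACE/ME", payload)
```

Keeping the decision (`build_alert`) separate from the delivery (`post_to_webhook`) makes it possible to test the threshold logic without sending real notifications.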

Many modern monitoring systems (Nagios, Zabbix, Prometheus with blackbox_exporter, Datadog, etc.) have built-in checks for TLS certificate expiry. If you’re using one, you’d configure it to monitor the specific host and port, and it would handle the external check and alerting for you. The key is to configure the warning threshold appropriately. For example, in Prometheus with blackbox_exporter, you might configure a probe like this:

modules:
  tls_probe:
    prober: tcp # blackbox_exporter has no dedicated "tls" prober; use tcp with TLS enabled
    timeout: 5s
    tcp:
      tls: true
      tls_config:
        insecure_skip_verify: false # Crucial: Ensure it's not skipping verification
        # Optional: Specify a hostname to verify if it differs from the target address
        # server_name: example.com
# The target host and port (e.g., example.com:443) are not set in the module.
# They are passed per probe as the "target" URL parameter, which the
# relabel_configs in the Prometheus scrape job fill in.

# In your Prometheus configuration, you'd scrape the exporter and pass each target to it:
# scrape_configs:
#   - job_name: 'blackbox'
#     metrics_path: /probe
#     params:
#       module: [tls_probe]
#     static_configs:
#       - targets:
#         - 'example.com:443' # The target service you want to check
#     relabel_configs:
#       - source_labels: [__address__]
#         target_label: __param_target # Pass the target as ?target=example.com:443
#       - source_labels: [__param_target]
#         target_label: instance # Keep the checked service as the instance label
#       - target_label: __address__
#         replacement: 'your-blackbox-exporter-host:9115' # The blackbox exporter itself

# Then, in a Prometheus alerting rules file (Alertmanager only routes the resulting notifications), you'd set up rules like:
# - alert: TLSCertificateExpiringSoon
#   expr: probe_ssl_earliest_cert_expiry - time() < 86400 * 30 # 30 days in seconds
#   for: 5m
#   labels:
#     severity: warning
#   annotations:
#     summary: "TLS certificate for {{ $labels.instance }} is expiring soon."
#     description: "The certificate for {{ $labels.instance }} expires in less than 30 days."

The probe_ssl_earliest_cert_expiry metric (provided by blackbox_exporter) is a Unix timestamp, in seconds, of the earliest expiry among the certificates the server presents. Subtracting the current time() gives you the remaining duration in seconds. Setting the alert to trigger when this is less than 86400 * 30 (30 days) is a common and effective practice.
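To make the arithmetic concrete, the following sketch reproduces the rule's comparison in plain Python, reusing the sample notAfter value from earlier (Dec 31 23:59:59 2023 GMT). The function name `alert_fires` is made up for illustration; Prometheus evaluates the real expression itself.

```python
from datetime import datetime, timezone

THRESHOLD_SECONDS = 86400 * 30  # 30 days = 2,592,000 seconds


def alert_fires(expiry_ts: float, now_ts: float) -> bool:
    """Mirror of: probe_ssl_earliest_cert_expiry - time() < 86400 * 30."""
    return expiry_ts - now_ts < THRESHOLD_SECONDS


# Expiry timestamp, as the metric would report it.
expiry = datetime(2023, 12, 31, 23, 59, 59, tzinfo=timezone.utc).timestamp()

# On Dec 1 at midnight UTC, 2,678,399 seconds remain: above the threshold.
dec_1 = datetime(2023, 12, 1, tzinfo=timezone.utc).timestamp()

# On Dec 2 at midnight UTC, 2,591,999 seconds remain: just below the threshold.
dec_2 = datetime(2023, 12, 2, tzinfo=timezone.utc).timestamp()

print(alert_fires(expiry, dec_1))  # False
print(alert_fires(expiry, dec_2))  # True
```

In the actual rule, the condition must also hold for the `for: 5m` duration before the alert moves from pending to firing.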

One aspect that often trips people up is that the leaf certificate’s notAfter date is not the only thing that determines whether clients accept it. While notAfter is the hard cutoff for that certificate, intermediate certificates in the chain can expire earlier, and certificate authorities may revoke a certificate before its expiry date due to a security incident. Robust monitoring therefore checks the entire chain and often performs an actual TLS handshake against the live service, which also catches the case where a renewed file is on disk but the server has not yet been reloaded.
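A handshake-based check can be sketched with Python’s standard `ssl` module. This is an illustration under stated assumptions rather than a complete monitor: the function names are hypothetical, the default context verifies the chain and hostname against the system trust store but does not check revocation, and `getpeercert()` exposes only the leaf certificate’s fields.

```python
import socket
import ssl


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Perform a verified TLS handshake and return the leaf certificate's notAfter string.

    Raises ssl.SSLCertVerificationError if the chain or hostname does not
    validate (for example, because a certificate in the chain has expired),
    which is itself a signal worth alerting on.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            cert = tls.getpeercert()
    return cert["notAfter"]  # e.g. "Dec 31 23:59:59 2023 GMT"


def seconds_until_expiry(not_after: str, now: float) -> float:
    """Convert the notAfter string to epoch seconds and subtract the current time."""
    return ssl.cert_time_to_seconds(not_after) - now


# Example usage (requires network access to the target):
# import time
# remaining = seconds_until_expiry(fetch_not_after("example.com"), time.time())
# print(f"{remaining / 86400:.1f} days remaining")
```

Because this connects to the running service rather than reading a file, it reports on the certificate clients actually receive.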

The next problem you’ll likely encounter after setting up expiry alerts is managing certificate renewals themselves, ensuring the new certificate is correctly deployed and the old one is revoked or properly retired.
