Spring Boot applications on Kubernetes can achieve zero-downtime rolling deployments by carefully orchestrating the lifecycle of new and old pods.

Let’s see this in action. Imagine we have a simple Spring Boot application exposing a /health endpoint and a /api/greeting endpoint.

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@SpringBootApplication
@RestController
public class GreetingApplication {

    public static void main(String[] args) {
        SpringApplication.run(GreetingApplication.class, args);
    }

    @GetMapping("/health")
    public String health() {
        return "OK";
    }

    @GetMapping("/api/greeting")
    public String greeting() {
        // Simulate some work
        try {
            Thread.sleep(100);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "Hello from Spring Boot!";
    }
}

Here’s a Kubernetes Deployment manifest designed for rolling updates:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: greeting-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: greeting-app
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1 # Allows one pod to be unavailable during the update
      maxSurge: 1       # Allows one extra pod to be created beyond the desired replica count
  template:
    metadata:
      labels:
        app: greeting-app
    spec:
      containers:
      - name: greeting-app-container
        image: your-docker-repo/greeting-app:v1.0.0
        ports:
        - containerPort: 8080
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 15
          periodSeconds: 20
        readinessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 10

And a Service to direct traffic:

apiVersion: v1
kind: Service
metadata:
  name: greeting-app-service
spec:
  selector:
    app: greeting-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
  type: ClusterIP

The magic happens in the strategy section of the Deployment. type: RollingUpdate tells Kubernetes to replace pods incrementally rather than all at once. maxUnavailable: 1 caps how far capacity may drop: at least replicas - maxUnavailable pods must be available at all times, so with 3 replicas at least 2 must be running and ready. maxSurge: 1 caps how far capacity may grow: Kubernetes may run up to replicas + maxSurge pods, which lets it bring up a new pod before tearing down an old one, so capacity never falls below that minimum.

Kubernetes orchestrates this by:

  1. Creating a new pod with the updated image.
  2. Waiting for the new pod to become ready (passing its readiness probe).
  3. Terminating an old pod.
  4. Repeating until all old pods are replaced.

The livenessProbe and readinessProbe are crucial. The livenessProbe tells Kubernetes if the application is still running; if it fails, Kubernetes will restart the container. The readinessProbe tells Kubernetes if the application is ready to serve traffic. Only pods that pass the readiness probe will receive traffic from the Service. This ensures that traffic is only sent to healthy, ready instances.

During a deployment, if you update the image tag in your Deployment manifest to your-docker-repo/greeting-app:v1.0.1 and apply it, Kubernetes will start the rolling update. It will bring up a new pod running v1.0.1. Once that pod passes its readiness probe, Kubernetes will take down one of the old pods running v1.0.0. This continues until all old pods are replaced by new ones.
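In practice you can drive and observe the rollout with kubectl (these commands assume the manifests above have been applied to a cluster you can reach):

```shell
# Point the Deployment at the new image tag (or edit the manifest and re-apply)
kubectl set image deployment/greeting-app \
  greeting-app-container=your-docker-repo/greeting-app:v1.0.1

# Watch the rollout replace pods one at a time; blocks until it succeeds or fails
kubectl rollout status deployment/greeting-app

# If the new version misbehaves, roll back to the previous ReplicaSet
kubectl rollout undo deployment/greeting-app
```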

The key to zero downtime isn’t just Kubernetes’ rolling update strategy, but also how your application handles graceful shutdown. When Kubernetes sends a SIGTERM signal to a pod it’s about to terminate, your Spring Boot application has a grace period (defined by terminationGracePeriodSeconds in the Pod spec, defaulting to 30 seconds) to shut down cleanly. This means finishing in-flight requests, closing database connections, and releasing resources before the pod is forcefully killed.
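In the pod spec, the relevant fields might look like the fragment below. The 45-second grace period and the 5-second preStop sleep are illustrative choices, not requirements; the sleep gives the Service's endpoints time to stop routing traffic to the pod before the JVM receives SIGTERM.

```yaml
spec:
  template:
    spec:
      terminationGracePeriodSeconds: 45  # default is 30; raise it if requests run long
      containers:
      - name: greeting-app-container
        image: your-docker-repo/greeting-app:v1.0.0
        lifecycle:
          preStop:
            exec:
              # Brief pause so endpoint removal propagates before SIGTERM arrives
              command: ["sh", "-c", "sleep 5"]
```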

Spring Boot 2.3 and later ship graceful shutdown of the embedded server out of the box. For cleanup the server doesn't cover, such as stopping a custom thread pool or flushing buffers, you can also register a ContextClosedEvent listener:

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.context.ApplicationContext;
import org.springframework.context.ApplicationListener;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.event.ContextClosedEvent;

@Configuration
public class GracefulShutdown {

    @Bean
    public GracefulShutdownLifecycle gracefulShutdownLifecycle(ApplicationContext context) {
        return new GracefulShutdownLifecycle(context);
    }

    public static class GracefulShutdownLifecycle implements ApplicationListener<ContextClosedEvent> {
        private static final Logger logger = LoggerFactory.getLogger(GracefulShutdownLifecycle.class);

        // Kept so custom cleanup logic can look up other beans if it needs to
        private final ApplicationContext context;

        public GracefulShutdownLifecycle(ApplicationContext context) {
            this.context = context;
        }

        @Override
        public void onApplicationEvent(ContextClosedEvent event) {
            logger.info("Received ContextClosedEvent. Initiating graceful shutdown...");
            // Add custom shutdown logic here, e.g., stopping a custom thread pool.
            // Spring Boot's embedded Tomcat/Jetty handles HTTP request shutdown itself.
            logger.info("Graceful shutdown complete.");
        }
    }
}

When Kubernetes sends SIGTERM, the Spring Boot application context receives a ContextClosedEvent, allowing you to execute custom cleanup logic before the JVM exits. This prevents abrupt termination of requests.
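Since Spring Boot 2.3, the embedded server (Tomcat, Jetty, Undertow, or Netty) can also drain in-flight HTTP requests on shutdown without any custom code; two properties in application.properties control it (the 20s timeout here is an illustrative value):

```properties
# Stop accepting new requests but let active ones complete on SIGTERM
server.shutdown=graceful
# Upper bound on how long each shutdown phase may take
spring.lifecycle.timeout-per-shutdown-phase=20s
```

Keep this timeout comfortably below terminationGracePeriodSeconds so the JVM exits cleanly before Kubernetes follows up with SIGKILL.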

The interplay between Kubernetes’ RollingUpdate strategy, robust liveness and readiness probes, and your application’s ability to gracefully shut down is what enables zero-downtime deployments. Without well-defined probes, Kubernetes might send traffic to a pod that isn’t actually ready, or it might terminate a pod mid-request. Without graceful shutdown, requests in flight will be abruptly dropped.
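The same drain-then-force pattern applies to any thread pool your application owns. Here is a plain-Java sketch of what that cleanup might look like; the GracefulPoolShutdown class and its timeout are illustrative, not part of Spring or Kubernetes:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class GracefulPoolShutdown {

    /**
     * Stop accepting new work, then wait up to timeoutMillis for in-flight
     * tasks to finish. Returns true if the pool drained cleanly, false if
     * stragglers had to be interrupted.
     */
    public static boolean drain(ExecutorService pool, long timeoutMillis)
            throws InterruptedException {
        pool.shutdown(); // reject new submissions, let queued tasks run
        if (!pool.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS)) {
            pool.shutdownNow(); // grace period elapsed: interrupt what's left
            return false;
        }
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        pool.submit(() -> {
            try {
                Thread.sleep(100); // simulate an in-flight request
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        System.out.println(drain(pool, 5_000) ? "clean shutdown" : "forced shutdown");
    }
}
```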

The next challenge is managing stateful applications or ensuring that database schema migrations don’t cause downtime during deployments.
