The Saga pattern is often presented as a simpler, more resilient alternative to two-phase commit (2PC). The real difference is in the guarantee: a Saga gives up the atomicity of a single ACID transaction and offers eventual consistency instead, backed by explicit compensating actions.

Let’s see a simple Saga in action. Imagine an e-commerce order process that involves three services: OrderService, PaymentService, and InventoryService.

OrderService starts with this order record:

{
  "orderId": "ORD123",
  "customerId": "CUST456",
  "items": [
    {"productId": "PROD789", "quantity": 2}
  ],
  "status": "PENDING",
  "paymentId": null,
  "inventoryReservationId": null
}

1. Initiate Order (OrderService): A new order is created with status: PENDING. This is the first local transaction.

2. Process Payment (Triggered by OrderService): OrderService sends a command to PaymentService to create a payment.

// PaymentService request
{
  "orderId": "ORD123",
  "customerId": "CUST456",
  "amount": 100.00
}

PaymentService processes the payment and, if successful, updates its internal state and returns a paymentId.

// PaymentService response
{
  "paymentId": "PAY789",
  "orderId": "ORD123",
  "status": "COMPLETED"
}

OrderService receives this, updates its order:

{
  "orderId": "ORD123",
  "customerId": "CUST456",
  "items": [
    {"productId": "PROD789", "quantity": 2}
  ],
  "status": "PAYMENT_SUCCESSFUL",
  "paymentId": "PAY789",
  "inventoryReservationId": null
}

3. Reserve Inventory (Triggered by OrderService): OrderService then sends a command to InventoryService to reserve the items.

// InventoryService request
{
  "orderId": "ORD123",
  "items": [
    {"productId": "PROD789", "quantity": 2}
  ]
}

InventoryService checks stock, decrements it, and returns an inventoryReservationId.

// InventoryService response
{
  "inventoryReservationId": "INV456",
  "orderId": "ORD123",
  "status": "RESERVED"
}

OrderService updates its order again:

{
  "orderId": "ORD123",
  "customerId": "CUST456",
  "items": [
    {"productId": "PROD789", "quantity": 2}
  ],
  "status": "COMPLETED",
  "paymentId": "PAY789",
  "inventoryReservationId": "INV456"
}

The Saga is complete. Each step was a local transaction.

Now, what if InventoryService fails? Suppose it returns an error:

// InventoryService response (error)
{
  "orderId": "ORD123",
  "error": "INSUFFICIENT_STOCK"
}

OrderService sees this failure. The order cannot be completed. This is where compensation comes in. OrderService must initiate a rollback.

Compensation Steps:

1. Compensate Payment (Triggered by OrderService): OrderService sends a command to PaymentService to refund the payment using the paymentId.

// PaymentService compensation request
{
  "paymentId": "PAY789",
  "reason": "ORDER_CANCELLED"
}

PaymentService marks the payment as refunded.

// PaymentService response
{
  "paymentId": "PAY789",
  "status": "REFUNDED"
}

OrderService updates its order:

{
  "orderId": "ORD123",
  "customerId": "CUST456",
  "items": [
    {"productId": "PROD789", "quantity": 2}
  ],
  "status": "CANCELLED",
  "paymentId": "PAY789",
  "inventoryReservationId": null
}

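The Saga is now rolled back. InventoryService never reserved anything, so the refund is the only compensation needed. The overall state is consistent again, even though no single atomic transaction ever spanned the three services.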
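The flow above, including the refund on failure, can be sketched in a few lines of Python. This is a minimal illustration, not a production design: `StubPaymentService`, `StubInventoryService`, `Order`, and `run_order_saga` are hypothetical names, the services are in-memory stand-ins for what would be remote calls, and the generated IDs are made up.

```python
from dataclasses import dataclass
from typing import Optional


class InsufficientStock(Exception):
    """Raised by the stub inventory service when stock is too low."""


class StubPaymentService:
    """Hypothetical in-memory stand-in for PaymentService."""

    def __init__(self) -> None:
        self.payments: dict[str, str] = {}  # paymentId -> status

    def create_payment(self, order_id: str, amount: float) -> str:
        payment_id = f"PAY-{order_id}"
        self.payments[payment_id] = "COMPLETED"
        return payment_id

    def refund(self, payment_id: str) -> None:
        self.payments[payment_id] = "REFUNDED"


class StubInventoryService:
    """Hypothetical in-memory stand-in for InventoryService."""

    def __init__(self, stock: dict[str, int]) -> None:
        self.stock = stock

    def reserve(self, order_id: str, items: list[dict]) -> str:
        # Check every line first so a partial reservation is never left behind.
        for item in items:
            if self.stock.get(item["productId"], 0) < item["quantity"]:
                raise InsufficientStock(item["productId"])
        for item in items:
            self.stock[item["productId"]] -= item["quantity"]
        return f"INV-{order_id}"


@dataclass
class Order:
    order_id: str
    items: list
    amount: float
    status: str = "PENDING"
    payment_id: Optional[str] = None
    inventory_reservation_id: Optional[str] = None


def run_order_saga(order: Order, payments: StubPaymentService,
                   inventory: StubInventoryService) -> Order:
    # Step 2: payment is a local transaction inside PaymentService.
    order.payment_id = payments.create_payment(order.order_id, order.amount)
    order.status = "PAYMENT_SUCCESSFUL"
    try:
        # Step 3: inventory reservation is a local transaction inside InventoryService.
        order.inventory_reservation_id = inventory.reserve(order.order_id, order.items)
    except InsufficientStock:
        # Compensation: undo the only step that succeeded.
        payments.refund(order.payment_id)
        order.status = "CANCELLED"
        return order
    order.status = "COMPLETED"
    return order
```

Calling `run_order_saga` with one unit in stock and an order for two leaves the order `CANCELLED` with its payment marked `REFUNDED`, matching the final JSON record above.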

The problem this solves is maintaining data consistency across multiple independent microservices without the tight coupling and blocking behavior of 2PC. 2PC requires every participant to be available and to hold its locks from the prepare phase until it learns the outcome. If the coordinator or a participant fails in between, the remaining participants can be left waiting with resources locked. A Saga instead runs a sequence of local transactions, each atomic only within its own service. If a step fails, compensating transactions run in reverse order to undo the work of the steps that already succeeded.
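Generalizing beyond the order example, the reverse-order rule can be sketched as a small runner. `SagaStep` and `run_saga` are hypothetical names chosen for illustration, and the sketch assumes every compensation succeeds, which a real system cannot take for granted.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SagaStep:
    name: str
    action: Callable[[], None]        # the forward local transaction
    compensation: Callable[[], None]  # undoes the action after it has committed


def run_saga(steps: list[SagaStep]) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    completed: list[SagaStep] = []
    for step in steps:
        try:
            step.action()
        except Exception:
            # The failed step committed nothing, so only earlier steps are undone.
            for done in reversed(completed):
                done.compensation()
            return False
        completed.append(step)
    return True


log: list[str] = []


def fail_reservation() -> None:
    raise RuntimeError("INSUFFICIENT_STOCK")


steps = [
    SagaStep("order", lambda: log.append("create order"), lambda: log.append("cancel order")),
    SagaStep("payment", lambda: log.append("charge"), lambda: log.append("refund")),
    SagaStep("inventory", fail_reservation, lambda: log.append("release stock")),
]
ok = run_saga(steps)
# log is now: create order, charge, refund, cancel order
```

The refund runs before the order is cancelled because compensations are applied newest first, and "release stock" never runs because the inventory step did not complete.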

You control two things: the local transaction logic inside each service, and how the steps are coordinated, by choreography or by orchestration. In choreography, each service publishes an event when its local transaction finishes, and that event triggers the next service. In orchestration, a central orchestrator (OrderService in our example) explicitly calls each service and manages the compensation flow.
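Since the walkthrough above is orchestrated, here is a rough sketch of how the same flow might look under choreography. Everything in it is an assumption for illustration: the `EventBus` class is a synchronous in-memory stand-in for a message broker, and the event names (`OrderCreated`, `PaymentCompleted`, and so on) are invented, not taken from any particular framework.

```python
from collections import defaultdict
from typing import Callable

Handler = Callable[[dict], None]


class EventBus:
    """Hypothetical synchronous in-memory bus; a real system would use a broker."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Handler]] = defaultdict(list)
        self.history: list[str] = []  # event types in publish order

    def subscribe(self, event_type: str, handler: Handler) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event_type: str, payload: dict) -> None:
        self.history.append(event_type)
        for handler in list(self._handlers[event_type]):
            handler(payload)


def wire_services(bus: EventBus, stock: dict[str, int]) -> None:
    """Each handler stands in for one service reacting to another's event."""

    def on_order_created(evt: dict) -> None:  # PaymentService
        bus.publish("PaymentCompleted", {**evt, "paymentId": "PAY789"})

    def on_payment_completed(evt: dict) -> None:  # InventoryService
        short = any(stock.get(i["productId"], 0) < i["quantity"] for i in evt["items"])
        if short:
            bus.publish("InventoryReservationFailed", evt)
            return
        for i in evt["items"]:
            stock[i["productId"]] -= i["quantity"]
        bus.publish("InventoryReserved", evt)

    def on_reservation_failed(evt: dict) -> None:  # PaymentService compensates
        bus.publish("PaymentRefunded", evt)

    bus.subscribe("OrderCreated", on_order_created)
    bus.subscribe("PaymentCompleted", on_payment_completed)
    bus.subscribe("InventoryReservationFailed", on_reservation_failed)
```

No component here knows the whole flow. The order of steps exists only in the subscriptions, which is what makes choreography loosely coupled and also harder to trace than an orchestrator.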

The most surprising thing about Sagas is that they don’t guarantee atomicity in the traditional sense; instead, they guarantee eventual consistency through explicit error handling and compensation. At any point during the Saga, the system may be in an intermediate state that a single ACID transaction would never expose to other readers. For instance, a customer might see an order in the PAYMENT_SUCCESSFUL state that has been paid for but has no inventory reserved yet. The system leaves such intermediate states either by completing the remaining steps or by compensating the ones already done.

The pattern also depends on idempotency for both forward and compensating actions. Every operation (creating a payment, reserving inventory, refunding a payment) must be safe to execute more than once, because network failures and service restarts lead to duplicate messages and retries. For example, if PaymentService processes a refund but its confirmation never reaches OrderService, OrderService may retry the refund. PaymentService must recognize that the payment has already been refunded and simply acknowledge the request without refunding it a second time.
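A minimal sketch of an idempotent refund, assuming the `paymentId` itself serves as the deduplication key. The `PaymentService` class below is a hypothetical in-memory model, and `refunds_issued` exists only to make the number of real money movements observable.

```python
class PaymentService:
    """Hypothetical in-memory PaymentService with idempotent operations."""

    def __init__(self) -> None:
        self.status: dict[str, str] = {}  # paymentId -> status
        self.refunds_issued = 0           # counts actual refunds, not requests

    def create_payment(self, payment_id: str) -> None:
        # setdefault keeps a retried create from resetting an existing payment.
        self.status.setdefault(payment_id, "COMPLETED")

    def refund(self, payment_id: str) -> dict:
        current = self.status.get(payment_id)
        if current is None:
            raise KeyError(f"unknown payment {payment_id}")
        if current != "REFUNDED":
            # First time through: perform the refund and record it.
            self.refunds_issued += 1
            self.status[payment_id] = "REFUNDED"
        # Retries fall through to the same acknowledgement without side effects.
        return {"paymentId": payment_id, "status": "REFUNDED"}
```

Calling `refund("PAY789")` twice returns the same response both times while `refunds_issued` stays at 1, which is the behavior a retrying OrderService needs.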

The next conceptual hurdle you’ll encounter is managing the complexity of compensating transactions, especially when business logic evolves or when compensating a failed step itself fails.
