SRE PagerDuty Setup: On-Call Schedules and Escalation (2026)

PagerDuty schedules are the linchpin of your on-call system, dictating who gets alerted and when.

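Let’s see how it works in practice. Imagine a simple weekly on-call rotation for a single service, "WebFrontend," with handoffs at 9 AM. The configuration below is a simplified illustration, with field names chosen for readability rather than copied from the PagerDuty API. It covers only the weekly rotation; restricting coverage to business hours (9 AM to 5 PM, Monday to Friday) would need additional settings that are not shown here.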

{
  "name": "WebFrontend On-Call",
  "description": "Primary on-call for WebFrontend service",
  "schedules": [
    {
      "name": "Weekly Rotation",
      "description": "Rotating weekly primary on-call",
      "schedule_type": "rotation",
      "users": ["user1@example.com", "user2@example.com"],
      "rotation_turn_length_days": 7,
      "start_time": "2023-10-27T09:00:00-07:00"
    }
  ]
}

In this setup, user1@example.com is on-call for the first week, starting on Friday, October 27th, 2023, at 9 AM Pacific Time. Each turn lasts 7 days, so user2@example.com takes over the following Friday at 9 AM, and the two alternate from then on. The schedule is then referenced by an escalation policy, and that policy is assigned to the PagerDuty service. When an alert for "WebFrontend" fires, PagerDuty follows the service’s escalation policy and checks the schedules it references to find out who is on call.
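To make the rotation arithmetic concrete, here is a minimal Python sketch that answers "who is on call at time T?" for the simplified configuration above. The `ROTATION` dictionary and the `on_call_at` helper are hypothetical names used for illustration; this is not PagerDuty’s implementation or API, only the modular arithmetic the description implies.

```python
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical mirror of the simplified config above (not a PagerDuty API payload).
ROTATION = {
    "users": ["user1@example.com", "user2@example.com"],
    "rotation_turn_length_days": 7,
    "start_time": "2023-10-27T09:00:00-07:00",
}


def on_call_at(rotation: dict, when: datetime) -> Optional[str]:
    """Return the user whose turn covers `when`, or None before the rotation starts."""
    start = datetime.fromisoformat(rotation["start_time"])
    turn = timedelta(days=rotation["rotation_turn_length_days"])
    if when < start:
        return None
    # Number of complete turns since the anchor; timedelta // timedelta yields an int.
    turns_elapsed = (when - start) // turn
    users = rotation["users"]
    # Cycle through the user list, wrapping around after the last user.
    return users[turns_elapsed % len(users)]
```

Calling `on_call_at(ROTATION, datetime.fromisoformat("2023-11-03T09:00:00-07:00"))` returns `user2@example.com`, since exactly one 7-day turn has elapsed at that moment.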

The problem PagerDuty solves is avoiding alert fatigue and ensuring prompt response by routing incidents to the right people at the right time. It moves beyond simple "who gets paged" to a sophisticated system of "who gets paged, how, and when do we bother someone else." This is managed through Services, Incidents, Escalation Policies, and Schedules. A Service represents a system or application you’re monitoring. An Incident is a triggered alert. An Escalation Policy defines the multi-level response process for an incident, referencing Schedules for who is actually on-call at each level.

Consider an escalation policy for the "WebFrontend" service:

Level 1: Page the primary on-call engineer for "WebFrontend" immediately. If they don’t acknowledge within 5 minutes, escalate.
Level 2: Page the secondary on-call engineer for "WebFrontend" (could be a different schedule or user) immediately. If they don’t acknowledge within 10 minutes, escalate.
Level 3: Page the entire "WebFrontend Support Team" (a user group) immediately. If they don’t acknowledge within 15 minutes, escalate.
Level 4: Send an email notification to the "WebFrontend Mailing List."

This policy would be defined as an escalation policy in PagerDuty and then assigned to the "WebFrontend" service in that service’s settings. Each level specifies a user, group, or schedule, and a delay before escalating to the next level. That escalation delay is critical here: it’s the window during which the on-call person must acknowledge the incident before the next level is notified.
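The four levels above can be modeled as a small table of targets and delays. The sketch below is a toy simulation, not PagerDuty code: `Level`, `POLICY`, and `active_level` are hypothetical names, and it assumes each level’s timer starts when that level is notified, so the delays add up (level 2 at 5 minutes, level 3 at 15, level 4 at 30).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Level:
    target: str
    # Minutes to wait for an acknowledgement before escalating; None marks the last level.
    escalate_after_min: Optional[int]


# Hypothetical encoding of the example policy for "WebFrontend".
POLICY = [
    Level("primary on-call engineer", 5),
    Level("secondary on-call engineer", 10),
    Level("WebFrontend Support Team", 15),
    Level("WebFrontend Mailing List (email)", None),
]


def active_level(policy: List[Level], minutes_unacknowledged: float) -> int:
    """Return the 1-based level most recently notified for an unacknowledged incident."""
    threshold = 0
    for index, level in enumerate(policy, start=1):
        if level.escalate_after_min is None:
            return index  # final level: nothing further to escalate to
        # Assumption: each level's delay is counted from when that level was notified.
        threshold += level.escalate_after_min
        if minutes_unacknowledged < threshold:
            return index
    return len(policy)
```

Under these assumptions, an incident that has gone 12 minutes without acknowledgement is at level 2, and one that has gone 30 minutes or more has reached the level 4 email.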

The most surprising truth about PagerDuty scheduling is that the start_time on a rotation schedule is not just when the first person’s turn begins, but the anchor for every subsequent turn. If you have a 7-day rotation and set start_time to 2023-10-27T09:00:00-07:00, the first turn starts then, the second begins exactly 7 days later at 2023-11-03T09:00:00-07:00, and so on. A 7-day rotation therefore flips on the same day of the week and at the same time each week. Likewise, a 1-day rotation with start_time set to a Monday at 9 AM flips at 9 AM every single day. This fixed anchor is what makes handoffs predictable and keeps them from drifting over time, and it is something many users initially struggle with when trying to align rotations with specific days or times.

This predictable anchor allows for precise control over when shifts change, preventing confusion about whether a rotation flips at midnight or at the beginning of the business day.
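The anchor behavior is easy to check with a few lines of Python. This is a sketch of the arithmetic only; `handoff_times` is a hypothetical helper, and it uses the fixed UTC offset from the example, so it says nothing about how daylight saving changes are handled.

```python
from datetime import datetime, timedelta
from typing import List


def handoff_times(start_time: str, turn_length_days: int, count: int) -> List[datetime]:
    """Return the first `count` handoff times, each a whole number of turns after the anchor."""
    anchor = datetime.fromisoformat(start_time)
    turn = timedelta(days=turn_length_days)
    return [anchor + n * turn for n in range(count)]


# Weekly rotation anchored on Friday, 2023-10-27 at 9 AM (-07:00).
weekly = handoff_times("2023-10-27T09:00:00-07:00", 7, 3)

# Daily rotation anchored on Monday, 2023-10-30 at 9 AM (-07:00).
daily = handoff_times("2023-10-30T09:00:00-07:00", 1, 3)

for t in weekly:
    print(t.isoformat())
```

Every entry in `weekly` falls on a Friday at 09:00, and every entry in `daily` falls at 09:00 on consecutive days, which is the "no drift" property described above.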

The next concept you’ll grapple with is the interplay between different schedule types, like overrides and exceptions, and how they interact with your primary rotations.
