systemd-analyze can tell you what’s taking so long to boot, but understanding why is the real trick.
Let’s see what’s happening under the hood. Imagine you’ve got a fresh server, or maybe you just upgraded your OS. You reboot, and it feels like it’s taking ages. You want to know which service is the culprit.
Here’s a worked example. We’ll take a hypothetical system and use systemd-analyze to dissect its boot process.
$ systemd-analyze
Startup finished in 3.567s (kernel) + 15.123s (userspace) = 18.690s
graphical.target reached after 12.345s in userspace
This tells us the kernel took 3.567 seconds to initialize, and userspace (everything after the kernel hands off to systemd) took 15.123 seconds, for a total boot time of 18.690 seconds. graphical.target (your desktop environment or login screen) was reached 12.345 seconds into userspace startup. The gap between that and the 15.123-second userspace total is time systemd spent finishing units that graphical.target does not wait for.
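To make the arithmetic in that summary line explicit, here is a minimal sketch that splits it into phases and prints each phase's share of the total. The helper name boot_phases is made up for illustration, and the parsing assumes the "N (phase) + N (phase) = total" layout shown above; other systemd versions may format the line differently.

```shell
# Sketch: split the "Startup finished" line into phases and their share of boot.
# boot_phases is a hypothetical helper name, not part of systemd.
boot_phases() {
  awk '
    # Convert one systemd time token (500ms, 3.567s, 1min, ...) to seconds.
    function to_seconds(tok) {
      if (tok ~ /ms$/)  return (tok + 0) / 1000
      if (tok ~ /us$/)  return (tok + 0) / 1000000
      if (tok ~ /min$/) return (tok + 0) * 60
      if (tok ~ /h$/)   return (tok + 0) * 3600
      if (tok ~ /s$/)   return tok + 0
      return 0
    }
    /^Startup finished in/ {
      n = 0; acc = 0; total = 0
      for (i = 4; i <= NF; i++) {
        if ($i == "=") break            # what follows is the printed total
        if ($i == "+") continue
        if ($i ~ /^\(.*\)$/) {          # "(kernel)" closes the current phase
          name[++n] = substr($i, 2, length($i) - 2)
          secs[n] = acc; total += acc; acc = 0
        } else {
          acc += to_seconds($i)         # "1min 2.345s" arrives as two tokens
        }
      }
      if (total > 0)
        for (j = 1; j <= n; j++)
          printf "%s %.3f %.1f\n", name[j], secs[j], 100 * secs[j] / total
    }
  '
}

# Demo on the summary line from the example above.
echo 'Startup finished in 3.567s (kernel) + 15.123s (userspace) = 18.690s' | boot_phases
# kernel 3.567 19.1
# userspace 15.123 80.9
```

On a real machine the input would come from systemd-analyze time | boot_phases. The columns are phase name, seconds, and percentage of the summed phases.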
Now, let’s dive deeper to find the slowpoke.
$ systemd-analyze blame
10.234s postgresql.service
3.456s networking.service
1.123s docker.service
...
The blame command lists all running units (services, targets, etc.) sorted by the time they took to initialize. Here, postgresql.service is clearly the biggest offender, taking over 10 seconds. networking.service and docker.service are also notable.
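On a busy system the blame list can run to hundreds of lines. The following sketch keeps only the units at or above a threshold in seconds. The helper name slow_units and the threshold value are illustrative choices, and the parser assumes each line is a duration followed by a unit name, as in the output above.

```shell
# Sketch: keep only blame entries that took at least N seconds.
# slow_units is a hypothetical helper name, not part of systemd.
slow_units() {
  awk -v limit="$1" '
    # Convert one systemd time token (500ms, 3.567s, 1min, ...) to seconds.
    function to_seconds(tok) {
      if (tok ~ /ms$/)  return (tok + 0) / 1000
      if (tok ~ /us$/)  return (tok + 0) / 1000000
      if (tok ~ /min$/) return (tok + 0) * 60
      if (tok ~ /h$/)   return (tok + 0) * 3600
      if (tok ~ /s$/)   return tok + 0
      return 0
    }
    NF >= 2 {
      secs = 0
      # Every field except the last is part of the duration; the last is the unit.
      for (i = 1; i < NF; i++) secs += to_seconds($i)
      if (secs >= limit) printf "%.3f %s\n", secs, $NF
    }
  '
}

# Demo on the sample blame output from above, with a 2-second threshold.
slow_units 2 <<'EOF'
10.234s postgresql.service
3.456s networking.service
1.123s docker.service
EOF
# 10.234 postgresql.service
# 3.456 networking.service
```

On a real machine you would pipe the command in directly, for example systemd-analyze --no-pager blame | slow_units 2.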
But blame only shows individual service times. What if a service is waiting for another, or multiple services are running in parallel but one is blocking others? That’s where systemd-analyze critical-chain comes in.
$ systemd-analyze critical-chain
graphical.target @12.345s
└─ multi-user.target @12.300s
  └─ docker.service @11.177s +1.123s
    └─ network-online.target @11.000s
      └─ systemd-networkd.service @10.500s +500ms
        └─ systemd-udev-settle.service @10.400s
          └─ systemd-udevd.service @10.200s +200ms
This shows the critical path: the chain of units that had to finish before the target could be reached. The time after @ is when a unit became active or started, and the time after + is how long it took to start. graphical.target depends on multi-user.target, which waits for docker.service, and so on. Notice how docker.service starts only after network-online.target is reached, and network-online.target in turn waits for systemd-networkd.service. This chain reveals dependencies that might not be obvious from blame alone. postgresql.service might be taking a long time, but if it’s not on the critical path, it’s not directly delaying your login screen.
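To see which links in the chain actually cost time, you can pull out the "+" durations and rank them. This is a rough sketch: chain_costs is a made-up helper name, and it assumes each line contains a unit name immediately before the "@" timestamp, as in the output above.

```shell
# Sketch: rank the units on the critical chain by their own start-up cost.
# chain_costs is a hypothetical helper name, not part of systemd.
chain_costs() {
  LC_ALL=C awk '
    # Convert one systemd time token (500ms, 3.567s, 1min, ...) to seconds.
    function to_seconds(tok) {
      if (tok ~ /ms$/)  return (tok + 0) / 1000
      if (tok ~ /us$/)  return (tok + 0) / 1000000
      if (tok ~ /min$/) return (tok + 0) * 60
      if (tok ~ /h$/)   return (tok + 0) * 3600
      if (tok ~ /s$/)   return tok + 0
      return 0
    }
    {
      at = 0; plus = 0
      for (i = 1; i <= NF; i++) {
        if (!at && $i ~ /^@/) at = i
        if (!plus && $i ~ /^\+/) plus = i
      }
      if (at < 2 || plus < at) next       # no unit name, or no "+" cost on this line
      unit = $(at - 1)
      sub(/^[^ -~]+/, "", unit)           # drop tree-drawing bytes glued to the name
      secs = 0
      for (i = plus; i <= NF; i++) {
        tok = $i; sub(/^\+/, "", tok)
        secs += to_seconds(tok)
      }
      printf "%.3f %s\n", secs, unit
    }
  ' | LC_ALL=C sort -rn
}

# Demo on the sample chain from above.
chain_costs <<'EOF'
graphical.target @12.345s
└─ multi-user.target @12.300s
  └─ docker.service @11.177s +1.123s
    └─ network-online.target @11.000s
      └─ systemd-networkd.service @10.500s +500ms
        └─ systemd-udev-settle.service @10.400s
          └─ systemd-udevd.service @10.200s +200ms
EOF
# 1.123 docker.service
# 0.500 systemd-networkd.service
# 0.200 systemd-udevd.service
```

Units without a "+" value (the targets here) are skipped because they add no start-up time of their own. Note that the three costs sum to under 2 seconds, while the chain's earliest unit starts at 10.200s, so in this example most of the delay lies before the part of the chain shown.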
To visualize this, systemd-analyze plot generates an SVG file you can open in a web browser.
$ systemd-analyze plot > boot.svg
The resulting chart shows each unit as a bar spanning its start and end times. Colors indicate what the unit was doing (activating, active, deactivating), as explained in the chart’s legend. You can see parallel execution, dependencies, and precisely where delays occur. You’ll see postgresql.service as a long bar, but if it starts after graphical.target has already been reached, it’s not a boot time problem. It might be a performance problem, but not a boot speed problem.
The most common culprits for slow boot times are:
- Network services waiting for network: Services like NetworkManager-wait-online.service or systemd-networkd-wait-online.service intentionally block until the network is up. If your network hardware or DHCP is slow, this adds significant delay.
  - Diagnosis: Look for *-wait-online.service units high in the critical-chain.
  - Fix: If you don’t strictly need the network before other services start, disable the *-wait-online.service units. For example, sudo systemctl disable NetworkManager-wait-online.service. This allows services to start concurrently with network bring-up.
  - Why it works: Boot proceeds without waiting for a network that might not be immediately available or necessary for early services.
- Disk I/O bottlenecks: Slow storage (especially HDDs) or a filesystem check (fsck) taking a long time can delay everything.
  - Diagnosis: systemd-analyze blame will show long times for mounting filesystems or for systemd-fsck@.service instances. The plot SVG will show long bars for these.
  - Fix: Ensure your fstab entries have nofail for non-critical partitions and consider running fsck manually during maintenance windows if it’s consistently slow. For critical partitions, investigate hardware health. On ext2/3/4 filesystems, sudo tune2fs -c 0 -i 0 /dev/sdXN turns off both the mount-count and the time-based check triggers (use with caution).
  - Why it works: nofail allows boot to continue even if a disk is absent or fails. Reducing fsck frequency or running it offline minimizes its impact on boot.
- Heavy initialization services: Databases (like PostgreSQL, MySQL), container runtimes (Docker, Podman), or complex application servers starting up can take a long time.
  - Diagnosis: systemd-analyze blame and critical-chain will highlight these services. The plot shows their duration.
  - Fix: If the service isn’t needed immediately at boot, disable it and start it manually or via a timer: sudo systemctl disable postgresql.service. You can also use systemctl edit --full postgresql.service to adjust TimeoutStartSec if it’s timing out, but this doesn’t speed up the actual initialization.
  - Why it works: Disabling defers the initialization cost until the service is actually required, significantly shortening boot.
- Systemd-udevd delays: systemd-udevd.service and systemd-udev-settle.service can sometimes take a long time if there are many devices to enumerate or if device initialization is slow.
  - Diagnosis: Look for systemd-udevd.service and systemd-udev-settle.service in blame and critical-chain. systemd-udev-settle.service only runs when another unit pulls it in, so systemctl list-dependencies --reverse systemd-udev-settle.service shows what is asking for it.
  - Fix: This is harder to fix directly. Ensure all hardware is properly detected and drivers are loaded. Sometimes, removing unnecessary hardware or disabling unused kernel modules can help.
  - Why it works: Faster device enumeration and driver loading reduces the time udevd needs to process device events.
- Over-reliance on multi-user.target dependencies: Some services might be incorrectly configured to depend on multi-user.target when they could start earlier or later.
  - Diagnosis: Examine systemd-analyze critical-chain for services that seem out of place or are unnecessarily delaying multi-user.target. Check their .service files for After= and Wants= directives.
  - Fix: Edit the service file (using systemctl edit --full <service-name>.service) to remove or adjust dependencies. For instance, if a GUI application doesn’t need the network, ensure it doesn’t depend on network-online.target.
  - Why it works: Realigning dependencies allows systemd to start services in a more optimal, parallelized order.
- Frequent or slow filesystem checks: If fsck runs on partitions at every boot or takes a long time, it can be a major bottleneck.
  - Diagnosis: systemd-analyze blame will show systemd-fsck@<partition>.service taking a long time.
  - Fix: Configure fsck to run less often. Edit /etc/fstab and change the last field (pass number) from 2 to 0 for non-root filesystems to disable automatic checks. For an ext2/3/4 root filesystem, you can raise the thresholds with tune2fs, for example tune2fs -c 30 -i 6m /dev/sdXY, so that a check is triggered after 30 mounts or 6 months, whichever comes first (adjust as needed). Avoid -c 1, which forces a check on every mount.
  - Why it works: Skipping checks or spacing them out significantly reduces the time spent on disk validation during boot.
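The fstab advice above (nofail for non-critical partitions, pass number 0 to skip boot-time checks) can be audited with a short script. This is a sketch under assumptions: fstab_boot_risks is a made-up helper name, the sample entries and their UUIDs are invented for the demo, and the script only reports candidates. Whether a given filesystem really is non-critical is your call.

```shell
# Sketch: list non-root fstab entries that can hold up boot.
# fstab_boot_risks is a hypothetical helper name; it reads fstab text on stdin.
fstab_boot_risks() {
  awk '
    NF == 0 || $1 ~ /^#/ { next }          # skip blank lines and comments
    $2 == "/" || $3 == "swap" { next }     # root and swap are out of scope here
    {
      passno = (NF >= 6) ? $6 : 0          # a missing sixth field means "no check"
      nofail = (("," $4 ",") ~ /,nofail,/) ? "yes" : "no"
      if (passno != 0 || nofail == "no")
        printf "%s passno=%s nofail=%s\n", $2, passno, nofail
    }
  '
}

# Demo on an invented fstab; the UUIDs and mount points are placeholders.
fstab_boot_risks <<'EOF'
# <device>  <mount>  <type>  <options>        <dump> <pass>
UUID=aaaa   /        ext4    defaults         0      1
UUID=bbbb   /data    ext4    defaults         0      2
UUID=cccc   /backup  ext4    defaults,nofail  0      0
UUID=dddd   none     swap    sw               0      0
EOF
# /data passno=2 nofail=no
```

On a real machine you would run fstab_boot_risks < /etc/fstab. An entry is printed if it is checked at boot (non-zero pass number) or would block boot when the device is missing (no nofail option).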
After addressing these, your next step will likely be optimizing individual service configurations or exploring advanced systemd features like timers and socket activation.