The standalone watchdog package only checks that the PID is still running from an OS point of view. It does not check that the service is still working from a service point of view (accepting connections and responding normally). Services can sometimes hang without crashing, for a variety of reasons. The watchdog package isn't intended to deal with that; it's intended to detect a problem with the system as a whole and reboot.

A systemd-aware service daemon sends a periodic sd_notify(3) "WATCHDOG=1" message to systemd, allowing systemd to detect a service which has hung but not crashed (theoretically, at least; it depends on a good implementation in the service daemon). See WatchdogSec= in systemd.service(5).

How is that different and more useful than watchdog's process checking?

Systemd has more useful functionality for this. You can set "Restart=on-failure", for example, to automatically restart the process if it exits with an unclean exit code.

The functionality is similar, but the system level watchdog won't detect all problems with a service, and it reboots rather than restarting the service (which the OP desires). It's also inconvenient when you want to administer the service; the systemd service watchdog allows you to simply "systemctl stop servicename" without needing to take extra steps to disarm the watchdog.
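
For anyone wanting to see what the sd_notify(3) "WATCHDOG=1" heartbeat mentioned above looks like from the service side, here is a minimal sketch in Python. It talks to the socket that systemd passes in NOTIFY_SOCKET rather than using a library. The function names (sd_notify, watchdog_interval, heartbeat_loop), the is_healthy callback and the 30 second WatchdogSec= value in the comments are my own choices for illustration, not anything from the OP's setup.

```python
import os
import socket
import time

# Assumed unit settings for this sketch (values are illustrative only):
#   [Service]
#   Type=notify
#   WatchdogSec=30
#   Restart=on-failure


def sd_notify(message: str) -> bool:
    """Send one datagram to systemd's notify socket.

    Returns False when NOTIFY_SOCKET is unset (not started by systemd).
    """
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    # A leading "@" means a Linux abstract-namespace socket (leading NUL byte).
    if path.startswith("@"):
        addr = b"\0" + path[1:].encode()
    else:
        addr = path
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode(), addr)
    return True


def watchdog_interval(default: float = 10.0) -> float:
    """Return half of the WatchdogSec= timeout, in seconds.

    systemd exports the timeout in microseconds as WATCHDOG_USEC.
    """
    usec = os.environ.get("WATCHDOG_USEC")
    return int(usec) / 1_000_000 / 2 if usec else default


def heartbeat_loop(is_healthy, max_beats=None):
    """Report readiness, then ping the watchdog only while the service is healthy.

    is_healthy is a hypothetical callback supplied by the service, for example
    a check that it can still accept and answer a connection.
    max_beats limits the loop (None means run forever).
    """
    sd_notify("READY=1")
    interval = watchdog_interval()
    beats = 0
    while max_beats is None or beats < max_beats:
        if is_healthy():
            sd_notify("WATCHDOG=1")
        beats += 1
        time.sleep(interval)
```

The important part is that the ping is only sent when the health check passes. If the service hangs, the pings stop, systemd's watchdog timeout expires, and Restart=on-failure restarts just that service instead of rebooting the whole machine. A real daemon would run its health check from its main loop rather than a separate sleep loop, so that a stuck main loop also stops the heartbeat.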
Statistics: Posted by Murph9000 — Tue Sep 03, 2024 9:13 pm