Goal
Survive a power loss without corrupting disks or VMs. The UPS signals a controller, which drains guests, parks storage, and brings things back up cleanly.
Topology
- UPS connected over USB or serial to a small controller (Raspberry Pi, thin client, or VM).
- NUT server or apcupsd running on that controller.
- NUT clients on: Proxmox hypervisor(s), NAS, key Docker host.
Install & Configure
- Pick NUT (multi-vendor) or apcupsd (APC-focused); install on controller.
- Confirm the driver works: upsc ups@localhost (NUT) or apcaccess status (apcupsd).
- Enable the TCP listener; bind it to the management VLAN; set strong NUT passwords.
- On clients, install NUT/apcupsd in net-client mode; point to controller IP.
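The driver check above can also be scripted. The following is a minimal sketch, assuming NUT is in use and the UPS is named ups on the controller (as in the upsc command above); the function names parse_upsc and read_ups are illustrative, not part of NUT.

```python
import subprocess


def parse_upsc(text: str) -> dict[str, str]:
    """Parse upsc output (one "name: value" pair per line) into a dict."""
    variables = {}
    for line in text.splitlines():
        # Split on the first colon only, since values may contain colons.
        name, sep, value = line.partition(":")
        if sep:
            variables[name.strip()] = value.strip()
    return variables


def read_ups(target: str = "ups@localhost") -> dict[str, str]:
    """Run `upsc <target>` and return its variables; raises if upsc fails."""
    result = subprocess.run(
        ["upsc", target],
        capture_output=True,
        text=True,
        check=True,   # a non-zero exit (driver or server problem) raises
        timeout=10,
    )
    return parse_upsc(result.stdout)
```

Calling read_ups() on the controller should return entries such as ups.status and battery.charge when the driver is healthy; from a client, pass the controller address instead, for example read_ups("ups@CONTROLLER_IP").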
Shutdown Ordering
- Set the low-battery threshold and the UPS kill (power-off) delay on the controller, so the UPS cuts output only after clients have finished shutting down.
- On Proxmox: create hook script to gracefully stop VMs/CTs by priority.
- On NAS: trigger dataset sync and stop SMB/NFS before shutdown.
- On Docker host: stop containers; flush logs if you centralize to Loki.
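As a rough illustration of the Proxmox step, the sketch below builds an ordered list of qm shutdown and pct shutdown commands. The Guest class, its priority field (lower stops first), and the default timeout are assumptions made for this example; only the qm/pct shutdown commands and their --timeout option come from Proxmox itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Guest:
    vmid: int
    kind: str           # "vm" (handled by qm) or "ct" (handled by pct)
    priority: int       # hypothetical field: lower number stops first
    timeout: int = 120  # seconds to wait for a clean guest shutdown


def shutdown_plan(guests):
    """Return shutdown commands as argv lists, ordered by priority, then vmid."""
    tools = {"vm": "qm", "ct": "pct"}
    plan = []
    for g in sorted(guests, key=lambda g: (g.priority, g.vmid)):
        plan.append(
            [tools[g.kind], "shutdown", str(g.vmid), "--timeout", str(g.timeout)]
        )
    return plan
```

A hook script on the Proxmox node could pass each entry of the plan to subprocess.run in turn. Keeping plan construction separate from execution makes the ordering easy to check before a real outage.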
Restore Flow
- BIOS: power-on after AC loss.
- Bring up the NAS first, then the hypervisor, then app hosts; delay application stacks until storage is ready.
- Check time sync (NTP), then start proxies and tunnels last.
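One way to delay stacks until storage is ready is to poll the NAS service port before starting anything that depends on it. This is a minimal sketch, assuming the NAS exports over a TCP service such as NFS on port 2049; the function name, host name, and timing values are placeholders.

```python
import socket
import time


def wait_for_port(host, port, timeout=300.0, interval=5.0, connect_timeout=3.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means the service is accepting connections.
            with socket.create_connection((host, port), timeout=connect_timeout):
                return True
        except OSError:
            pass  # not reachable yet; retry until the deadline passes
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A startup script on the hypervisor or app host might call wait_for_port("nas.lan", 2049) and start VMs or containers only when it returns True. An open port shows the service is listening, not that every dataset is mounted, so treat it as a first gate rather than a full health check.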
Testing
- Simulate UPS on battery for 2-3 minutes; watch events on controller and clients.
- Verify Proxmox tasks log clean stops; NAS shows clean export unmounts.
- After power return, confirm services come back without fsck or ZFS scrub errors.
Hardening & Monitoring
- Alert on battery tests, low runtime, and communication loss.
- Restrict NUT listener to management VLAN; firewall it off from IoT/Guest.
- Quarterly battery self-test; replace batteries on schedule.
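The alert conditions above could be evaluated roughly as follows. This sketch assumes NUT, with one poll of UPS variables already parsed into a dict of strings (for example from upsc output); the function name and the 600-second runtime threshold are illustrative choices, while ups.status, battery.runtime (in seconds), and the OB/LB/RB status flags are standard NUT names.

```python
def ups_alerts(variables, min_runtime_s=600):
    """Return a list of alert strings for one poll of UPS variables."""
    status = variables.get("ups.status")
    if status is None:
        # No status at all usually means the driver or server is unreachable.
        return ["communication loss: no ups.status reported"]

    alerts = []
    flags = set(status.split())
    if "OB" in flags:
        alerts.append("on battery")
    if "LB" in flags:
        alerts.append("low battery")
    if "RB" in flags:
        alerts.append("replace battery")

    runtime = variables.get("battery.runtime")
    if runtime is not None and float(runtime) < min_runtime_s:
        alerts.append(f"low runtime: {runtime}s")
    return alerts
```

Running this on a schedule and forwarding any non-empty result to your existing notification channel covers low runtime and communication loss; battery test results would need an additional check against whatever your UPS reports.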