Unified Logs with Loki + Promtail

Centralize logs from Proxmox, TrueNAS, Docker hosts, and Pi-hole with Grafana dashboards and sane retention.

Goal

Stand up a lightweight Loki stack, ship logs with Promtail from your lab nodes, and view them in Grafana with retention and compaction sized for homelab gear.

Architecture

  • Loki single-binary on a VM or container (8-16 GB RAM recommended).
  • Promtail on: Proxmox host, NAS (if supported), Docker hosts, Pi-hole.
  • Grafana on the same VM or adjacent; datasource points to Loki.
  • Optional: Nginx/Traefik front with SSO (Authelia/ForwardAuth).

Install Loki & Grafana

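  1. Deploy Loki via Docker Compose or a systemd unit; use boltdb-shipper (or the TSDB index on recent Loki releases) with filesystem storage.
  2. Set retention (e.g., 14d-30d) and enable retention in the compactor, since the retention period is only enforced when the compactor applies it; keep chunk_target_size modest for low-RAM nodes.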
  3. Install Grafana; add Loki datasource; import basic log panels.

Promtail Targets

  • Proxmox: tail /var/log/syslog, /var/log/pve*, and the journal (if using the systemd journal scrape).
  • TrueNAS: forward syslog to Promtail's syslog listener (TCP or UDP), or run Promtail in a jail/VM.
  • Docker hosts: use Docker service discovery in Promtail; scrape container logs.
  • Pi-hole: tail /var/log/pihole.log and FTL.log.

Labeling Strategy

  • Keep labels low-cardinality: job, host, service.
  • Drop noisy labels (container ids, request ids) with relabel_configs.
  • Add an env=lab label to separate lab logs from production.
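
Why low cardinality matters can be sketched in a few lines of Python. Loki keeps one stream per distinct label set, so a label such as a container id multiplies streams without adding anything you would filter on. The label names container_id and request_id and the sample data below are made up for illustration; the sketch only mimics what a label-dropping relabel rule achieves and is not Promtail code.

```python
from typing import Iterable

# Hypothetical label names treated as high-cardinality for this illustration.
NOISY_LABELS = {"container_id", "request_id"}


def drop_labels(labels: dict[str, str], noisy: set[str] = NOISY_LABELS) -> dict[str, str]:
    """Return a copy of the label set without the noisy keys."""
    return {k: v for k, v in labels.items() if k not in noisy}


def count_streams(label_sets: Iterable[dict[str, str]]) -> int:
    """Count distinct label sets; each distinct set is one stream."""
    return len({frozenset(ls.items()) for ls in label_sets})


# Made-up sample: one service on one host, recreated as 50 different containers.
samples = [
    {"job": "docker", "host": "pve1", "service": "nginx", "env": "lab",
     "container_id": f"c{i:03d}"}
    for i in range(50)
]

before = count_streams(samples)
after = count_streams(drop_labels(s) for s in samples)
print(before, after)  # 50 1
```

With the container id kept as a label, 50 container restarts produce 50 streams; with it dropped, the same logs land in a single stream keyed by job, host, service, and env.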

Dashboards & Alerts

  • Grafana panels: error rates per service, failed logins, DHCP/DNS anomalies.
  • Alerts: bursts of auth failures, service restarts, disk errors on hypervisor/NAS.
  • Use silence windows during maintenance; route alerts via email, webhook, or ntfy.
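
What a "burst" alert means can be made concrete with a small sliding-window check. This is only an illustration of the logic an alert rule would express, not Grafana or Loki code; the 5-minute window and the threshold of 10 events are assumed values to tune for your lab.

```python
from collections import deque


def burst_detected(timestamps, window_s=300, threshold=10):
    """Return True if any span of window_s seconds holds at least threshold events."""
    window = deque()
    for ts in sorted(timestamps):
        window.append(ts)
        # Discard events that fell out of the window ending at ts.
        while ts - window[0] > window_s:
            window.popleft()
        if len(window) >= threshold:
            return True
    return False


# Made-up event times in seconds: ten failures spread out vs. ten within 45 s.
steady = [i * 100 for i in range(10)]
burst = [1000 + i * 5 for i in range(10)]
print(burst_detected(steady), burst_detected(burst))  # False True
```

The same number of failures triggers or stays quiet depending on how tightly they cluster, which is why a burst alert is usually less noisy than a plain count.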

Security

  • Front Loki/Grafana with SSO; disable anonymous access.
  • mTLS or at least HTTPS between Promtail and Loki if crossing subnets.
  • Limit Loki to LAN/VPN; no WAN exposure.

Capacity Tips

  • Size disk for retention: daily volume times retention days, plus headroom; e.g., a small lab ingesting a few GB/day needs roughly 14 times that for 14d.
  • Keep Promtail's client batching modest (batchsize, batchwait).
  • Monitor Loki WAL/chunks directories; alert when disk usage reaches 80%.
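
The sizing arithmetic above can be written out as a quick estimate. The 3 GB/day figure and the 25% headroom are assumptions chosen for the example, not measurements; substitute your own ingest rate.

```python
def required_disk_gb(daily_gb: float, retention_days: int, headroom: float = 0.25) -> float:
    """Retained volume plus fractional headroom (assumed 25% by default)."""
    return daily_gb * retention_days * (1 + headroom)


def volume_gb_for_threshold(needed_gb: float, alert_fraction: float = 0.80) -> float:
    """Smallest volume on which needed_gb stays at or below the alert threshold."""
    return needed_gb / alert_fraction


needed = required_disk_gb(daily_gb=3, retention_days=14)  # 3 * 14 * 1.25
volume = volume_gb_for_threshold(needed)                  # needed / 0.80
print(f"{needed:.1f} GB retained, {volume:.1f} GB volume")
# prints "52.5 GB retained, 65.6 GB volume"
```

Sizing the volume against the 80% alert line rather than against raw usage keeps the disk alert quiet during normal operation.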

Validation

  • Query logs in Grafana by host and service; confirm timestamps are correct.
  • Kill a test container: logs should appear; alert fires if configured.
  • Reboot a node: ensure Promtail reconnects and backfills from its positions file.
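
The first check can also be scripted against Loki's HTTP API. The sketch below builds a test log entry for /loki/api/v1/push and a matching /loki/api/v1/query_range URL. The address loki.lab.example:3100 and the label values are hypothetical; it assumes Loki is reachable without authentication, so adjust if it sits behind SSO. Nothing is sent unless you call push() yourself.

```python
import json
import time
import urllib.parse
import urllib.request

LOKI_URL = "http://loki.lab.example:3100"  # hypothetical address, replace with yours


def build_push_payload(labels, line, ts_ns=None):
    """Build a push body: one stream with one [timestamp_ns, line] pair."""
    ts_ns = time.time_ns() if ts_ns is None else ts_ns
    return {"streams": [{"stream": labels, "values": [[str(ts_ns), line]]}]}


def build_query_url(base, selector, start_ns, end_ns, limit=10):
    """Build a query_range URL for a LogQL selector over [start_ns, end_ns]."""
    params = urllib.parse.urlencode(
        {"query": selector, "start": start_ns, "end": end_ns, "limit": limit}
    )
    return f"{base}/loki/api/v1/query_range?{params}"


def push(base, payload):
    """POST the payload to Loki and return the HTTP status (2xx on success)."""
    req = urllib.request.Request(
        f"{base}/loki/api/v1/push",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return resp.status


# Build (but do not send) a test entry and the query that should find it.
now_ns = 1_700_000_000_000_000_000  # fixed timestamp so the example is reproducible
payload = build_push_payload({"job": "validation", "host": "pve1"}, "hello loki", now_ns)
url = build_query_url(LOKI_URL, '{job="validation", host="pve1"}',
                      now_ns - 60 * 10**9, now_ns + 60 * 10**9)
print(json.dumps(payload))
print(url)
```

If the line pushed with push(LOKI_URL, payload) shows up in Grafana under the same host label and with the expected timestamp, ingestion, labeling, and clocks all line up.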

- Crafted by Axiom|Spectre