Alerting channel
Platform: ntfy.sh, push notifications without any infrastructure to maintain.
Topic: play-army-alerts
Alert sources
| Source | Type | Priority |
|---|---|---|
| restic-backup.sh | Backup OK/FAIL | default / urgent |
| restic-restore-drill.sh | Restore drill OK/FAIL | default / urgent |
| play-army-alert@.service | Systemd OnFailure hook | urgent |
| fail2ban ntfy-local | Ban event (sshd, nginx) | high |
| lynis-weekly-audit.sh | Lynis score alert | high (if score < 70) |
| CF Worker play-army-status | Site DOWN detection | urgent |
Scripts
- send-ntfy-alert.sh: shared notifier (used by all sources)
- fail2ban-ntfy-event.sh: formats ban event details
- systemd-notify-failure.sh: journal excerpt with unit info
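As a rough illustration of what a shared notifier has to do, the sketch below builds the HTTP request for one ntfy notification. It is written in Python for readability and is not the contents of send-ntfy-alert.sh. ntfy accepts a POST to the topic URL with `Title` and `Priority` headers; the function names `priority_for` and `build_alert` are hypothetical, and the OK → default / FAIL → urgent mapping is read off the table above.

```python
import urllib.request

NTFY_BASE = "https://ntfy.sh"
TOPIC = "play-army-alerts"

# Priority names accepted by ntfy ("urgent" is an alias of "max").
VALID_PRIORITIES = {"min", "low", "default", "high", "max", "urgent"}


def priority_for(ok: bool) -> str:
    """Map a job result to a priority: OK -> default, FAIL -> urgent."""
    return "default" if ok else "urgent"


def build_alert(title: str, message: str, priority: str = "default") -> urllib.request.Request:
    """Build (but do not send) the POST request for one ntfy notification."""
    if priority not in VALID_PRIORITIES:
        raise ValueError(f"unknown ntfy priority: {priority!r}")
    return urllib.request.Request(
        url=f"{NTFY_BASE}/{TOPIC}",
        data=message.encode("utf-8"),  # the message body is the notification text
        headers={"Title": title, "Priority": priority},
        method="POST",
    )


if __name__ == "__main__":
    req = build_alert("restic backup", "Backup FAILED", priority_for(ok=False))
    # urllib.request.urlopen(req, timeout=10)  # uncomment to actually send
    print(req.full_url, req.get_header("Priority"))
```

Keeping request construction separate from sending makes the priority mapping easy to check without touching the network.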
External uptime monitor
Stack: Cloudflare Worker + KV + Cron Trigger
The CF Worker play-army-status checks 3 endpoints over HTTPS every minute:
| Target | Check |
|---|---|
| play.army | HTTP 200 |
| panel.play.army | HTTP 200 (CF Access redirect = OK) |
| node.play.army | HTTP 401 = UP (Wings auth required) |
If a target is DOWN → urgent ntfy alert.
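The per-target rules above can be sketched as a small lookup, shown here in Python rather than as Worker code. The names `EXPECTED_UP`, `is_up` and `down_targets` are hypothetical. Treating a 302 response as the "CF Access redirect" for panel.play.army is an assumption, since the table does not name the redirect code.

```python
from typing import Dict, List, Optional, Set

# Status codes that count as UP for each target, taken from the table above.
# Assumption: the CF Access redirect for panel.play.army shows up as a 302.
EXPECTED_UP: Dict[str, Set[int]] = {
    "play.army": {200},
    "panel.play.army": {200, 302},
    "node.play.army": {401},  # Wings demands auth, so 401 means it is alive
}


def is_up(target: str, status_code: Optional[int]) -> bool:
    """Return True if the observed status counts as UP for this target.

    status_code is None when the request itself failed (timeout, DNS, TLS).
    """
    if status_code is None:
        return False
    return status_code in EXPECTED_UP[target]


def down_targets(results: Dict[str, Optional[int]]) -> List[str]:
    """Return the sorted list of targets that should trigger an urgent alert."""
    return sorted(t for t, code in results.items() if not is_up(t, code))


if __name__ == "__main__":
    print(down_targets({"play.army": 200, "panel.play.army": 302, "node.play.army": 502}))
```

Note that a 200 from node.play.army counts as DOWN under this rule, because an unauthenticated request to Wings is expected to be rejected.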
VPS Heartbeat
The VPS sends a heartbeat every minute with the status of 8 services:
nginx, mariadb, redis, wings, crowdsec, cloudflared, fail2ban, ssh
If no heartbeat arrives for more than 3 minutes → VPS unreachable alert.
Systemd: play-army-heartbeat.timer + play-army-heartbeat.service
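A minimal sketch of the two halves of the heartbeat, in Python: building the payload on the VPS side and deciding staleness on the receiver side. The JSON field names (`ts`, `services`) and the function names are assumptions, as the actual payload format is not documented here; only the 8 service names and the 3-minute threshold come from the text above.

```python
import json
import time
from typing import Dict, Optional

SERVICES = ["nginx", "mariadb", "redis", "wings",
            "crowdsec", "cloudflared", "fail2ban", "ssh"]
STALE_AFTER_SECONDS = 3 * 60  # ">3 min" threshold from the text above


def build_heartbeat(active: Dict[str, bool], now: Optional[float] = None) -> str:
    """Serialize one heartbeat; services missing from `active` are reported as down."""
    ts = time.time() if now is None else now
    payload = {
        "ts": int(ts),  # hypothetical field: Unix time of the heartbeat
        "services": {name: bool(active.get(name, False)) for name in SERVICES},
    }
    return json.dumps(payload, sort_keys=True)


def is_stale(last_heartbeat_ts: float, now: float) -> bool:
    """True when the last heartbeat is more than 3 minutes old."""
    return now - last_heartbeat_ts > STALE_AFTER_SECONDS


if __name__ == "__main__":
    print(build_heartbeat({name: True for name in SERVICES}))
```

With a one-minute send interval, the 3-minute threshold tolerates two missed heartbeats before the VPS is reported as unreachable.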
Status page
The public status page shows:
- Web service latency, status code, and uptime history (1h)
- VPS service heartbeat grid (8 services)
- Overall status: operational / degraded / major outage
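The three overall states could be derived from the two data sources roughly as follows. The rule itself is hypothetical and not taken from the status page: here any web endpoint being down counts as a major outage, and any VPS service being down counts as degraded. The function name `overall_status` is likewise an assumption.

```python
from typing import Dict


def overall_status(web_up: Dict[str, bool], services_up: Dict[str, bool]) -> str:
    """Collapse per-endpoint and per-service state into one overall label.

    Hypothetical rule: a down web endpoint outranks a down VPS service.
    """
    if not all(web_up.values()):
        return "major outage"
    if not all(services_up.values()):
        return "degraded"
    return "operational"


if __name__ == "__main__":
    web = {"play.army": True, "panel.play.army": True, "node.play.army": True}
    vps = {"nginx": True, "redis": False}
    print(overall_status(web, vps))
```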
API: https://status.play.army/api/status (JSON)
Internet → status.play.army
│
├── GET / → HTML status page
├── GET /api/status → JSON API
└── POST /heartbeat → VPS heartbeat receiver (Bearer auth)
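The route layout above can be sketched as a dispatch function with a Bearer check on the heartbeat route. This is a Python illustration of the logic, not the Worker's code: the handler names, the error status codes (401, 404, 405) and the exact header key are assumptions. The token comparison uses `hmac.compare_digest` so that it runs in constant time.

```python
import hmac
from typing import Dict, Tuple


def check_bearer(headers: Dict[str, str], expected_token: str) -> bool:
    """Constant-time check of an 'Authorization: Bearer <token>' header.

    Assumes the header key is spelled exactly 'Authorization'.
    """
    value = headers.get("Authorization", "")
    scheme, _, token = value.partition(" ")
    if scheme.lower() != "bearer" or not token:
        return False
    return hmac.compare_digest(token.encode("utf-8"), expected_token.encode("utf-8"))


def route(method: str, path: str, headers: Dict[str, str], token: str) -> Tuple[int, str]:
    """Return (status code, handler name) for a request to status.play.army."""
    if method == "GET" and path == "/":
        return 200, "html_status_page"
    if method == "GET" and path == "/api/status":
        return 200, "json_status"
    if path == "/heartbeat":
        if method != "POST":
            return 405, "method_not_allowed"
        if not check_bearer(headers, token):
            return 401, "unauthorized"
        return 200, "heartbeat_received"
    return 404, "not_found"


if __name__ == "__main__":
    print(route("POST", "/heartbeat", {"Authorization": "Bearer s3cret"}, "s3cret"))
```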