ops: nightly DB backup + self-hosted uptime monitoring
CI/CD / CI · API (dotnet build + test) (push) Successful in 41s
CI/CD / CI · Admin API (dotnet build) (push) Successful in 30s
CI/CD / CI · Dashboard (tsc) (push) Successful in 1m10s
CI/CD / CI · Admin Web (tsc) (push) Successful in 37s
CI/CD / CI · Website (tsc) (push) Successful in 44s
CI/CD / CI · Koja (tsc) (push) Successful in 50s
CI/CD / Deploy · all services (push) Successful in 1m48s
Backup (production data-loss protection — was none):
- meezi-backup sidecar in docker-compose.yml runs pg_dump nightly at 02:00
Tehran, gzip, 14-day rotation, atomic .partial→final, into ./backups
(persists across deploys; rsync off-box per RESTORE.md).
- Wired into the deploy job (up -d --no-deps backup); takes one dump on boot.
- scripts/backup/pg-backup-loop.sh + RESTORE.md (restore + off-box guidance).
Monitoring:
- docker-compose.monitoring.yml: Uptime Kuma stack (own volume), stood up
once, independent of app deploys.
- Caddyfile status.{$DOMAIN} route; docs/monitoring.md lists the exact
monitors (incl. /q guest-menu 200 check) + TLS-expiry alerts (catches the
~90-day cert breakage early) + alert-channel setup.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@@ -0,0 +1,47 @@
# Meezi uptime monitoring (Uptime Kuma)

Self-hosted uptime + TLS-expiry monitoring with alerting. Runs as a separate
compose stack so it stays up independently of app deploys.

## Stand it up (one time, on the prod host)
```bash
cd /path/to/meezi
docker compose -f docker-compose.monitoring.yml up -d
```
Then either:
- add a DNS A record `status.meezi.ir → server IP` and reload Caddy
  (`docker exec meezi-caddy caddy reload` or restart the caddy stack); the
  `status.{$DOMAIN}` block is already in the Caddyfile, **or**
- reach it directly at `http://SERVER:3201` for the initial setup.

First visit creates the admin account; set a strong password.
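If you took the DNS route, it can help to confirm the record points at the right machine before that first visit. The following is a minimal sketch, not part of the repo: the function names are hypothetical, `203.0.113.10` is a placeholder for the real server IP, and it assumes `dig` is installed.

```bash
# Sketch only: dns_matches/check_dns are illustrative names, not repo scripts.

# Succeeds when a non-empty resolved address equals the expected server IP.
dns_matches() {
  local resolved="$1" expected="$2"
  [ -n "$resolved" ] && [ "$resolved" = "$expected" ]
}

# Resolve the A record for a host and compare it with the expected IP.
# tail -n1 keeps the final address if the answer includes a CNAME chain.
check_dns() {
  local host="$1" expected="$2" resolved
  resolved="$(dig +short A "$host" | tail -n1)"
  if dns_matches "$resolved" "$expected"; then
    echo "dns ok: $host -> $resolved"
  else
    echo "dns mismatch: $host -> ${resolved:-<none>} (expected $expected)"
    return 1
  fi
}
```

Run it as `check_dns status.meezi.ir 203.0.113.10`, substituting the real IP. On the host itself, `docker compose -f docker-compose.monitoring.yml ps` shows whether the Uptime Kuma container is running.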
## Monitors to add (in the Uptime Kuma UI)
Add one **HTTP(s)** monitor per public surface, interval 60s, accept 2xx/3xx:

| Name | URL | Notes |
|------|-----|-------|
| Website | https://meezi.ir/fa | marketing |
| Dashboard | https://app.meezi.ir/fa/login | merchant panel |
| API health | https://api.meezi.ir/api/public/security-config | returns JSON 200 |
| Koja | https://koja.meezi.ir/fa | public discovery |
| Admin | https://admin.meezi.ir | internal panel |
| Guest menu | https://app.meezi.ir/q/healthcheck | should be 200 (not 500) |

For each HTTPS monitor enable **"Certificate Expiry Notification"**; this
catches the recurring ~90-day Let's Encrypt cert-chain breakages early
(see the mirror-cert runbook). Set the threshold to 14 days.
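Before entering the monitors in the UI, the same two conditions (a 2xx/3xx status and more than 14 days of certificate lifetime) can be spot-checked by hand. This is a rough sketch under stated assumptions: the function names are hypothetical, and it needs `curl`, `openssl`, and GNU `date` (for `date -d`). It does not reproduce how Uptime Kuma itself performs these checks.

```bash
# Sketch only: a manual pre-flight check, not Uptime Kuma's own logic.

# Accept the same range the monitors accept: 2xx or 3xx.
status_ok() {
  case "$1" in
    2[0-9][0-9]|3[0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

# Whole days between two epoch timestamps (integer division).
days_between() {
  local now="$1" expiry="$2"
  echo $(( (expiry - now) / 86400 ))
}

# Print the HTTP status code for a URL without following redirects.
http_code() {
  curl -s -o /dev/null --max-time 10 -w '%{http_code}' "$1"
}

# Print the days left on the certificate served by host:443.
cert_days_left() {
  local host="$1" not_after
  not_after="$(echo | openssl s_client -servername "$host" -connect "$host:443" 2>/dev/null \
    | openssl x509 -noout -enddate | cut -d= -f2)"
  days_between "$(date +%s)" "$(date -d "$not_after" +%s)"
}

# Check one https:// URL against the status rule and the expiry threshold.
check_url() {
  local url="$1" threshold="${2:-14}" host code days
  host="${url#https://}"; host="${host%%/*}"
  code="$(http_code "$url")"
  days="$(cert_days_left "$host")"
  if status_ok "$code" && [ "$days" -gt "$threshold" ]; then
    echo "OK   $url ($code, cert ${days}d)"
  else
    echo "WARN $url ($code, cert ${days}d)"
    return 1
  fi
}
```

For example, `check_url https://app.meezi.ir/q/healthcheck` prints an `OK` or `WARN` line for the guest-menu endpoint; pass a second argument to use a threshold other than 14 days.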
## Alerts
Settings → Notifications → add a channel (Telegram bot or email/SMTP), then
attach it to every monitor. Telegram is simplest: create a bot via @BotFather,
get the chat id, then paste the bot token and the chat id into Uptime Kuma.
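One way to find the chat id and confirm the pair works before wiring it into Uptime Kuma is to call the Telegram Bot API directly. The sketch below uses the API's `getUpdates` and `sendMessage` methods; the function names are hypothetical and the token and chat id are values you supply.

```bash
# Sketch only: helper names are illustrative; token and chat id are yours.

# Build the Bot API endpoint for a given token and method.
tg_endpoint() {
  local token="$1" method="$2"
  printf 'https://api.telegram.org/bot%s/%s' "$token" "$method"
}

# List recent updates. After you send the bot a message, the JSON reply
# carries the chat id under message.chat.id.
tg_get_updates() {
  curl -s "$(tg_endpoint "$1" getUpdates)"
}

# Send a plain-text message to confirm the token and chat id work together.
tg_send_test() {
  local token="$1" chat_id="$2" text="${3:-Uptime Kuma test alert}"
  curl -s "$(tg_endpoint "$token" sendMessage)" \
    --data-urlencode "chat_id=$chat_id" \
    --data-urlencode "text=$text"
}
```

Message the bot once, run `tg_get_updates "$BOT_TOKEN"` to read the chat id, then `tg_send_test "$BOT_TOKEN" "$CHAT_ID"` should deliver a test message to that chat.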
## What this does NOT replace
- **Backups**: see `scripts/backup/RESTORE.md`.
- **Crash auto-recovery**: Docker `restart: unless-stopped` already restarts
  crashed containers; Uptime Kuma tells you when one is flapping or down.
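Since crash recovery depends on every container actually carrying a restart policy, a quick audit on the host can confirm that. This is a sketch with hypothetical function names; it reads the policy Docker has recorded via `docker inspect` and makes no assumption about which containers the compose files define.

```bash
# Sketch only: audit helper names are illustrative, not repo scripts.

# Print the restart policy name Docker has recorded for a container.
restart_policy() {
  docker inspect --format '{{.HostConfig.RestartPolicy.Name}}' "$1"
}

# Succeeds for policies that bring a crashed container back automatically.
auto_restarts() {
  case "$1" in
    always|unless-stopped|on-failure) return 0 ;;
    *) return 1 ;;
  esac
}

# Report every running container whose policy would not restart it.
audit_restart_policies() {
  local name policy
  docker ps --format '{{.Names}}' | while read -r name; do
    policy="$(restart_policy "$name")"
    auto_restarts "$policy" || echo "no auto-restart: $name (policy: ${policy:-none})"
  done
}
```

Running `audit_restart_policies` prints nothing when every running container has an auto-restart policy.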
## Status page (optional)
Uptime Kuma can publish a public status page (Settings → Status Pages) at
`status.meezi.ir/status/meezi` if you want customers to see uptime.