Operations

The gateway has no database, no migrations, and no workers. Operating it means deploying the proxy and portal, watching Traefik metrics, and tuning resilience.

1. Deployment

| Artifact | Image | Port | Built from |
| --- | --- | --- | --- |
| Traefik | `traefik:v3.6` | `:80` web, `:8080` dashboard/metrics | `config/traefik.yml` + `config/dynamic/` mounted |
| Portal | static build → `nginxinc/nginx-unprivileged:1.27-alpine` | `:8080` | `portal/Dockerfile` |
| Local-dev gateway | `nginx:1.27-alpine` (`local-nx-gateway`) | `:80` (host net) | `local/docker-compose.yml` + `local/nginx.conf` |

Portal build & run

```bash
cd packages/gateway/portal
bun install        # separate dependency tree from monorepo root
bun run dev        # Astro dev server on :3003
bun run rebuild    # clean + production static build
bun run lint       # scripts/lint.sh
```

The Traefik and Nginx configs require no build step - they are mounted directly. Backend services self-register with Traefik via Docker labels; no gateway redeploy is needed when a service is added (see Routing).

Native-dev gateway

```bash
# Linux only - host networking lets Nginx reach 127.0.0.1:31xx and bind :80
docker compose -f packages/gateway/local/docker-compose.yml up -d
curl http://localhost/__gateway_health   # → {"status":"ok","gateway":"local-nx-gateway"}
```
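When the gateway is started from a script, it can help to wait for the health endpoint before running anything that depends on it. The following is a minimal sketch, not part of the repository: the function names, the `GATEWAY_URL` variable, and the retry defaults are assumptions, and the check assumes the compact JSON body shown above.

```bash
# Sketch only: names and defaults here are illustrative assumptions.
GATEWAY_URL="${GATEWAY_URL:-http://localhost/__gateway_health}"

# Succeeds when the response body passed as $1 reports status "ok".
is_gateway_healthy() {
  case "$1" in
    *'"status":"ok"'*) return 0 ;;
    *) return 1 ;;
  esac
}

# Polls the health endpoint once per second, up to $1 attempts (default 30).
# Returns 0 as soon as the gateway is healthy, 1 if it never answers.
wait_for_gateway() {
  local attempts="${1:-30}" i=0 body
  while [ "$i" -lt "$attempts" ]; do
    body="$(curl -fsS --max-time 2 "$GATEWAY_URL" 2>/dev/null)" || body=""
    if is_gateway_healthy "$body"; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}
```

A typical call would be `wait_for_gateway 10 && echo "gateway is up"`, placed right after `docker compose ... up -d`.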

2. Observability

| Signal | Source | Where to look |
| --- | --- | --- |
| Access logs | Traefik JSON access log (`Authorization`/`Cookie` dropped) | container stdout / Loki |
| App logs | Traefik JSON log, `level: INFO` | container stdout |
| Metrics | Prometheus on Traefik `:8080` | Prometheus + Grafana |
| Dashboard | Traefik dashboard `:8080` (basic-auth) | `/dashboard/` |
| Per-service health | gateway-portal | portal Monitor page |

For the full metric list, Prometheus scrape config, and Grafana provisioning, see Observability. Key metrics: `traefik_service_requests_total`, `traefik_service_request_duration_seconds`, `traefik_service_server_up`.
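For a quick look at these metrics without Grafana, the raw Prometheus text output can be filtered on the command line. This is a sketch under assumptions: `filter_key_metrics` is a hypothetical helper, and the URL in the usage comment assumes Traefik serves metrics at `/metrics` on `:8080` of the local host, which may differ in your deployment.

```bash
# Keep only the three metric families named above from Prometheus text
# output on stdin. The duration metric is a histogram, so its _bucket,
# _sum, and _count series are kept as well. Comment lines (# HELP, # TYPE)
# and all other metric families are dropped.
filter_key_metrics() {
  grep -E '^traefik_service_(requests_total|request_duration_seconds(_bucket|_sum|_count)?|server_up)[{ ]'
}

# Usage (assumed endpoint; adjust host, port, and auth as needed):
#   curl -fsS http://localhost:8080/metrics | filter_key_metrics
```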

3. Security

| Concern | Mitigation | Source |
| --- | --- | --- |
| TLS | Terminated at edge Nginx - Traefik has no HTTPS entrypoint | `config/traefik.yml` |
| Rate limit (general) | `rate-limit` 200/s, burst 400, per-IP | `middlewares.yml` |
| Rate limit (auth) | `rate-limit-auth` 30/min, burst 60, per-IP | `middlewares.yml` |
| Circuit breaker | `NetworkErrorRatio() > 0.10 \|\| LatencyAtQuantileMS(95.0) > 3000` | `middlewares.yml` |
| Security headers | XSS filter, nosniff, frame deny; strip `Server`/`X-Powered-By` | `middlewares.yml` |
| Dashboard / portal access | basic auth (`dashboard-auth`) | `middlewares.yml` |
| Docker socket | read-only mount; `exposedByDefault: false` | `config/traefik.yml` |
| Real client IP | `ipStrategy.depth=1` reads `X-Forwarded-For` behind Nginx | `middlewares.yml` |

Per-IP rate limiting works correctly only because `ipStrategy.depth=1` extracts the real client IP from `X-Forwarded-For`. Without it, every request would fall into the single bucket keyed by Nginx's IP. See Resilience.
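The depth rule is easier to see with a concrete header. The sketch below imitates the selection as Traefik's documentation describes it (take the `X-Forwarded-For` entry at position `depth`, counted from the right). It is an illustration, not Traefik's implementation. `client_ip_at_depth` is a hypothetical helper and the addresses are documentation examples.

```bash
# Print the X-Forwarded-For entry at position `depth`, counted from the right.
# Prints an empty line when depth is below 1 or exceeds the number of entries.
#   $1 = header value, e.g. "1.2.3.4, 203.0.113.7"
#   $2 = depth (1 = rightmost entry, the one appended by the edge Nginx)
client_ip_at_depth() {
  local xff="$1" depth="$2"
  printf '%s\n' "$xff" | awk -F',' -v d="$depth" '{
    idx = NF - d + 1
    if (d < 1 || idx < 1) { print ""; exit }
    ip = $idx
    gsub(/^[ \t]+|[ \t]+$/, "", ip)   # trim spaces around the entry
    print ip
  }'
}

# A client that forges "X-Forwarded-For: 1.2.3.4" still gets bucketed by the
# address Nginx saw, because Nginx appends it on the right:
#   client_ip_at_depth '1.2.3.4, 203.0.113.7' 1   # prints 203.0.113.7
```

With `depth=1` only the entry written by the trusted proxy is used, so entries supplied by the client cannot move a request into a different rate-limit bucket.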

4. Runbook

4.1 Alert classes (recommended - not yet deployed)

| Alert | Trigger | Check | Fix |
| --- | --- | --- | --- |
| `GatewayHighErrorRate` | 5xx ratio > 5% over 5m | Traefik access logs, target service health | Inspect the failing backend; circuit breaker may already be open |
| `GatewayHighLatency` | p99 > 2s over 5m | per-service latency histogram | Backend slowness; check DB / downstream |
| `BackendDown` | `traefik_service_server_up == 0` | service `/health` | Restart/scale the backend; Traefik auto-re-adds on recovery |

Example PromQL + alert rules: see Observability §8.

4.2 Common operations

| Operation | Command / action |
| --- | --- |
| Tail Traefik logs | `docker logs -f <traefik-container>` |
| Verify a route registered | Traefik dashboard `/dashboard/` → HTTP Routers |
| Reload shared middleware | edit `config/dynamic/middlewares.yml` - file provider hot-reloads (no restart) |
| Add a backend route (prod) | add Traefik labels to the service compose; auto-discovered |
| Add a backend route (dev) | add `upstream` + `location /v1/api/<svc>/` to `local/nginx.conf`, restart `local-nx-gateway` |
| Add a service to the portal | add an entry to `portal/src/constants/services.constant.ts`, rebuild portal |
| Check dev gateway liveness | `curl http://localhost/__gateway_health` |
| Inspect a tripped circuit | dashboard → router middlewares; watch `NetworkErrorRatio` / p95 latency |
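Route registration can also be checked from a terminal instead of the dashboard. The sketch below assumes Traefik's API endpoint `/api/http/routers` is reachable on `:8080` behind the same basic auth as the dashboard, which may not hold in every deployment. `router_registered`, the router name `billing`, and the `DASH_USER`/`DASH_PASS` variables are hypothetical.

```bash
# Succeeds if the routers JSON on stdin contains a router named "<name>@<provider>",
# for example "billing@docker". $1 is the router name without the provider suffix
# and is interpolated into a regex, so keep it to plain letters, digits, and dashes.
router_registered() {
  grep -Eq "\"name\"[[:space:]]*:[[:space:]]*\"$1@[a-z]+\""
}

# Usage (assumed URL and credentials):
#   curl -fsS -u "$DASH_USER:$DASH_PASS" http://localhost:8080/api/http/routers \
#     | router_registered billing && echo "billing route is registered"
```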

5. Related Pages

Configuration
Observability
Resilience
/runbook/ - central runbook for cross-service incidents
Decisions
