
Operations

The gateway has no database, no migrations, and no workers. Operating it means deploying the proxy and portal, watching Traefik metrics, and tuning resilience.

1. Deployment

| Artifact | Image | Port | Built from |
| --- | --- | --- | --- |
| Traefik | traefik:v3.6 | :80 web, :8080 dashboard/metrics | config/traefik.yml + config/dynamic/ mounted |
| Portal | static build → nginxinc/nginx-unprivileged:1.27-alpine | :8080 | portal/Dockerfile |
| Local-dev gateway | nginx:1.27-alpine (local-nx-gateway) | :80 (host net) | local/docker-compose.yml + local/nginx.conf |

Portal build & run

```bash
cd packages/gateway/portal
bun install        # separate dependency tree from monorepo root
bun run dev        # Astro dev server on :3003
bun run rebuild    # clean + production static build
bun run lint       # scripts/lint.sh
```

The Traefik and Nginx configs require no build step — they are mounted directly. Backend services self-register with Traefik via Docker labels; no gateway redeploy is needed when a service is added (see Routing).
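As a rough sketch of what self-registration might look like, the helper below prints a compose `labels:` block for one backend. The service name `orders`, the port 3000, and the use of the `/v1/api/<svc>` prefix on the production side are assumptions for illustration, not values taken from the repository. `traefik.enable=true` is included because `exposedByDefault` is false, and `web` is the :80 entrypoint listed in the deployment table.

```bash
# Print a compose "labels:" block for one backend (hypothetical helper).
# $1 = service name, $2 = container port the service listens on.
traefik_labels() {
  svc=$1
  port=$2
  cat <<EOF
labels:
  - "traefik.enable=true"
  - "traefik.http.routers.${svc}.rule=PathPrefix(\`/v1/api/${svc}\`)"
  - "traefik.http.routers.${svc}.entrypoints=web"
  - "traefik.http.services.${svc}.loadbalancer.server.port=${port}"
EOF
}

# Example: labels for an assumed "orders" service on port 3000.
traefik_labels orders 3000
```

The printed block can be pasted under the service definition in its compose file; the actual router rule and any shared middlewares should follow whatever the Routing page prescribes.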

Native-dev gateway

```bash
# Linux only — host networking lets Nginx reach 127.0.0.1:31xx and bind :80
docker compose -f packages/gateway/local/docker-compose.yml up -d
curl http://localhost/__gateway_health   # → {"status":"ok","gateway":"local-nx-gateway"}
```

2. Observability

| Signal | Source | Where to look |
| --- | --- | --- |
| Access logs | Traefik JSON access log (Authorization/Cookie dropped) | container stdout / Loki |
| App logs | Traefik JSON log, level: INFO | container stdout |
| Metrics | Prometheus on Traefik :8080 | Prometheus + Grafana |
| Dashboard | Traefik dashboard :8080 (basic-auth) | /dashboard/ |
| Per-service health | gateway-portal | portal Monitor page |

Full metric list, Prometheus scrape config, and Grafana provisioning: see Observability. Key metrics: traefik_service_requests_total, traefik_service_request_duration_seconds, traefik_service_server_up.
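For a quick check without Grafana, the `traefik_service_server_up` gauge can be read straight from the metrics endpoint. The filter below is a minimal sketch: it assumes the standard Prometheus text format (one `name{labels} value` sample per line) and that `/metrics` is served on the :8080 entrypoint and reachable from where you run it. The function name is hypothetical.

```bash
# Read Prometheus text-format metrics on stdin and print the label set of
# every backend server whose traefik_service_server_up gauge is 0.
down_backends() {
  awk '/^traefik_service_server_up[{]/ && $NF == 0 {
    sub(/ [^ ]+$/, "")   # drop the trailing sample value
    print
  }'
}

# Usage against a running gateway (assumed URL; add credentials if the
# endpoint is protected):
#   curl -s http://localhost:8080/metrics | down_backends
```

An empty result means every server Traefik currently knows about is reporting up.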

3. Security

| Concern | Mitigation | Source |
| --- | --- | --- |
| TLS | Terminated at edge Nginx; Traefik has no HTTPS entrypoint | config/traefik.yml |
| Rate limit (general) | rate-limit 200/s, burst 400, per-IP | middlewares.yml |
| Rate limit (auth) | rate-limit-auth 30/min, burst 60, per-IP | middlewares.yml |
| Circuit breaker | NetworkErrorRatio() > 0.10 \|\| LatencyAtQuantileMS(95.0) > 3000 | middlewares.yml |
| Security headers | XSS filter, nosniff, frame deny; strip Server/X-Powered-By | middlewares.yml |
| Dashboard / portal access | basic auth (dashboard-auth) | middlewares.yml |
| Docker socket | read-only mount; exposedByDefault: false | config/traefik.yml |
| Real client IP | ipStrategy.depth=1 reads X-Forwarded-For behind Nginx | middlewares.yml |

Per-IP rate limiting only works correctly because ipStrategy.depth=1 extracts the real client IP — without it every request would share Nginx's IP bucket. See Resilience.
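To make the depth rule concrete: Traefik counts X-Forwarded-For entries from the right, so depth 1 selects the address that the edge Nginx appended. The function below only mimics that selection for illustration; it is not Traefik's code, and the function name and sample addresses are made up.

```bash
# Return the X-Forwarded-For entry at the given depth, counted from the right
# (depth 1 = rightmost entry, i.e. the address the trusted edge proxy appended).
# $1 = header value, $2 = depth
xff_ip_at_depth() {
  printf '%s\n' "$1" | awk -F',' -v depth="$2" '{
    idx = NF - depth + 1
    if (idx < 1) exit 1              # header has fewer entries than depth
    ip = $idx
    gsub(/^[ \t]+|[ \t]+$/, "", ip)  # trim the space after each comma
    print ip
  }'
}

# A client sends a spoofed 198.51.100.9; edge Nginx appends the real peer.
xff_ip_at_depth "198.51.100.9, 203.0.113.7" 1
```

With depth 1 the spoofed leftmost entry is ignored, which is why a client cannot pick its own rate-limit bucket by forging the header.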

4. Runbook

| Alert | Trigger | Check | Fix |
| --- | --- | --- | --- |
| GatewayHighErrorRate | 5xx ratio > 5% over 5m | Traefik access logs, target service health | Inspect the failing backend; circuit breaker may already be open |
| GatewayHighLatency | p99 > 2s over 5m | per-service latency histogram | Backend slowness; check DB / downstream |
| BackendDown | traefik_service_server_up == 0 | service /health | Restart/scale the backend; Traefik auto-re-adds on recovery |

Example PromQL + alert rules: see Observability §8.
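When GatewayHighErrorRate fires, the 5xx share can also be estimated directly from the access log. This is a sketch that assumes the JSON access log writes the status as a compact `"DownstreamStatus":<code>` field on each line; the function name is hypothetical, and the result is a rough fraction, not a substitute for the alert's PromQL.

```bash
# Read Traefik JSON access-log lines on stdin and print the share of
# responses with a 5xx DownstreamStatus, as a fraction with two decimals.
five_xx_ratio() {
  awk '
    /"DownstreamStatus":[0-9]/       { total++ }
    /"DownstreamStatus":5[0-9][0-9]/ { errors++ }
    END {
      if (total > 0) printf "%.2f\n", errors / total
      else print "0.00"
    }
  '
}

# Usage over the last five minutes of container output:
#   docker logs --since 5m <traefik-container> 2>/dev/null | five_xx_ratio
```

A value above 0.05 corresponds to the 5% trigger in the table above; lines without a DownstreamStatus field (for example application log lines) are ignored.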

4.2 Common operations

| Operation | Command / action |
| --- | --- |
| Tail Traefik logs | docker logs -f <traefik-container> |
| Verify a route registered | Traefik dashboard /dashboard/ → HTTP Routers |
| Reload shared middleware | edit config/dynamic/middlewares.yml; the file provider hot-reloads (no restart) |
| Add a backend route (prod) | add Traefik labels to the service compose; auto-discovered |
| Add a backend route (dev) | add upstream + location /v1/api/<svc>/ to local/nginx.conf, restart local-nx-gateway |
| Add a service to the portal | add an entry to portal/src/constants/services.constant.ts, rebuild portal |
| Check dev gateway liveness | curl http://localhost/__gateway_health |
| Inspect a tripped circuit | dashboard → router middlewares; watch NetworkErrorRatio / p95 latency |
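When inspecting a circuit, it can help to test observed numbers against the breaker expression from the Security section (`NetworkErrorRatio() > 0.10 || LatencyAtQuantileMS(95.0) > 3000`). The helper below is an offline sketch of that comparison only; it does not query Traefik, and its name and the sample values are hypothetical.

```bash
# Exit 0 if the breaker expression would be true for the given values.
# $1 = network error ratio (0..1), $2 = p95 latency in milliseconds
breaker_would_trip() {
  awk -v ratio="$1" -v p95="$2" \
    'BEGIN { exit !(ratio > 0.10 || p95 > 3000) }'
}

# Example: 15% network errors trips the breaker even with healthy latency.
if breaker_would_trip 0.15 1200; then
  echo "breaker would open"
else
  echo "breaker stays closed"
fi
```

Either condition alone is enough to open the circuit, so a backend with a clean error ratio can still be cut off if its p95 latency exceeds 3000 ms.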

Proprietary and Confidential. Unauthorized copying, distribution, or use of this software is strictly prohibited.