Operations
1. Deployment
| Property | Value |
|---|---|
| Image | registry/helpdesk:<tag> |
| Processes | API (RUN_MODE=startup) + Worker (RUN_MODE=worker) + one-shot Migration (RUN_MODE=migrate) |
| Container Port | 3000 (external 31130) |
| Snowflake node id | 12 |
| Probes | GET /healthz (live), GET /readyz (ready) — IGNIS defaults |
| Migration mode | run-once job (bun run migrate) |
The API and Worker share one image; the role is selected by RUN_MODE. The Worker process must run for SLA monitoring, assignment, escalation, notifications, context enrichment, and survey triggers to function.
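As a rough illustration of the shared-image arrangement, the sketch below shows one way an entrypoint could dispatch on RUN_MODE. The three mode values come from the deployment table; the helper names (`parseRunMode`, `describeRole`) and the choice to fail fast on an unknown value are assumptions for illustration, not the package's actual bootstrap code.

```typescript
// Hypothetical entrypoint helper. Only the mode values are taken from the
// deployment table; the function names and error handling are assumptions.
type RunMode = "startup" | "worker" | "migrate";

const RUN_MODES: readonly RunMode[] = ["startup", "worker", "migrate"];

// Validate the raw environment value. Throwing on a missing or unknown
// mode is an assumed policy, chosen so a misconfigured pod fails loudly.
function parseRunMode(raw: string | undefined): RunMode {
  const match = RUN_MODES.find((mode) => mode === raw);
  if (match === undefined) {
    throw new Error(`Unsupported RUN_MODE: ${String(raw)}`);
  }
  return match;
}

// Map each mode to the role it selects inside the shared image.
function describeRole(mode: RunMode): string {
  switch (mode) {
    case "startup":
      return "API server on container port 3000";
    case "worker":
      return "background worker (SLA monitoring, assignment, escalation, notifications)";
    case "migrate":
      return "one-shot migration job";
  }
}
```

In a real entrypoint the raw value would typically be read from the process environment and passed to `parseRunMode` before any component is booted.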
Traefik labels
labels:
- "traefik.enable=true"
- "traefik.http.routers.helpdesk.rule=PathPrefix(`/v1/api/helpdesk`)"
- "traefik.http.services.helpdesk.loadbalancer.server.port=3000"

2. Observability
| Signal | Source | Where to look |
|---|---|---|
| Logs | stdout, prefixed [ComponentName] (e.g. [QueueComponent], [ProcessEscalation]) via console.log/console.error | kubectl logs <pod> / Loki |
| Metrics | IGNIS defaults if exposed | Grafana |
| Traces | None wired | — |
| Health | GET /healthz, GET /readyz | Gateway portal |
| Queue depth | BullMQ (Redis) | Bull board / Redis inspection |
Key log markers
| Marker | Meaning |
|---|---|
| [QueueComponent] SLA Monitor cron job scheduled | Worker booted SLA cron |
| [ProcessEscalation] Reassigning ticket … | Escalation reassign path (see Known Issues) |
| [EventHandlerRegistry] Handler failed for <event> | In-process listener error (caught) |
Logging is unstructured console.* output (no key-value logger in workers/listeners). Cross-service request-id propagation applies at the IGNIS HTTP layer only.
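Because the logs carry only a [ComponentName] prefix and no structured fields, filtering has to work on the raw text. The sketch below shows one possible way to split a line into component and message when post-processing log output; the helper names and the regular expression are assumptions, not part of the service.

```typescript
// Hypothetical helper for post-processing unstructured log lines.
interface ParsedLogLine {
  component: string | null; // e.g. "QueueComponent"; null when no prefix is present
  message: string;
}

// Matches a leading "[Name]" prefix followed by the rest of the line.
const PREFIX_PATTERN = /^\[([^\]]+)\]\s*(.*)$/;

function parseLogLine(line: string): ParsedLogLine {
  const match = PREFIX_PATTERN.exec(line);
  if (match === null) {
    return { component: null, message: line };
  }
  return { component: match[1] ?? null, message: match[2] ?? "" };
}

// Keep only the lines emitted by one component.
function filterByComponent(lines: string[], component: string): string[] {
  return lines.filter((line) => parseLogLine(line).component === component);
}
```

For example, `filterByComponent(lines, "QueueComponent")` would keep the SLA cron marker line and drop lines from other components or lines without a prefix.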
3. Security
| Concern | Mitigation |
|---|---|
| AuthN | JWT (ES256) verified against identity JWKS — VerifierApplication |
| AuthZ | Per-merchant scoping via assertMerchantAccess() + useRequestContext(); helpdesk PermissionService for finer checks. No per-route @authenticate decorator — controllers enforce access in handlers |
| Secrets | Env (APP_ENV_*) incl. SMTP + Redis passwords; never in code. .env.development contains real-looking dev secrets — rotate before any non-dev use |
| Tenancy | All ticket queries scoped by merchantId / organizerId |
| Soft-delete | deletedAt — no hard-delete by default |
| Internal/private notes | TicketMessage.isInternal hides notes from reporters |
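The tenancy, soft-delete, and internal-note rows can be read as three filters applied before data reaches a caller. The sketch below expresses them in memory purely for illustration; in the service they would presumably live in the query layer. Only merchantId, deletedAt, and isInternal come from the table; the record shapes, the viewer kinds, and the function names are assumptions.

```typescript
// Only `merchantId`, `deletedAt`, and `isInternal` come from the table above;
// the record shapes and viewer kinds are illustrative assumptions.
interface TicketRecord {
  id: string;
  merchantId: string;
  deletedAt: Date | null; // null means the ticket is live (soft-delete marker)
}

interface TicketMessage {
  ticketId: string;
  body: string;
  isInternal: boolean;
}

type ViewerKind = "agent" | "reporter";

// Tenancy + soft-delete: keep only live tickets owned by the caller's merchant.
function scopeTickets(tickets: TicketRecord[], merchantId: string): TicketRecord[] {
  return tickets.filter((t) => t.merchantId === merchantId && t.deletedAt === null);
}

// Internal notes: reporters never see messages flagged isInternal.
function visibleMessages(messages: TicketMessage[], viewer: ViewerKind): TicketMessage[] {
  return viewer === "agent" ? messages : messages.filter((m) => !m.isInternal);
}
```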
4. Runbook
4.1 Alert classes
| Alert | Trigger | Check | Fix | Escalate |
|---|---|---|---|---|
| helpdeskWorkerDown | No SLA-monitor cron logs | Worker pod status | Restart worker (RUN_MODE=worker) | on-call backend |
| helpdeskQueueLag | BullMQ pending high | Redis queue depth | Scale workers / inspect failures | on-call SRE |
| helpdeskNotificationFail | notification jobs hitting max retries (5) | SMTP creds / notification-delivery-log | Fix SMTP, replay DLQ | on-call backend |
| helpdeskSlaBreachSpike | breach rate up | SlaTracker status counts | Check agent staffing / policies | support-team |
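For the helpdeskQueueLag check, a queue-depth probe might look like the sketch below. `QueueLike` is a local interface shaped after BullMQ's `Queue.getJobCounts`, declared here so the example has no dependencies. The thresholds, the definition of "pending" as waiting plus delayed jobs, and the function names are all assumptions, since the actual alert rule is not documented here.

```typescript
// Structural subset of a BullMQ Queue: getJobCounts resolves to a map of
// job state -> count. Declared locally so the sketch has no dependencies.
interface QueueLike {
  getJobCounts(...types: string[]): Promise<Record<string, number>>;
}

interface LagThresholds {
  maxPending: number; // hypothetical alert threshold for waiting + delayed jobs
  maxFailed: number; // hypothetical alert threshold for failed jobs
}

interface LagReport {
  pending: number;
  failed: number;
  lagging: boolean;
}

// Pure evaluation step: "pending" is assumed to mean waiting + delayed.
function evaluateQueueLag(counts: Record<string, number>, thresholds: LagThresholds): LagReport {
  const pending = (counts["waiting"] ?? 0) + (counts["delayed"] ?? 0);
  const failed = counts["failed"] ?? 0;
  const lagging = pending > thresholds.maxPending || failed > thresholds.maxFailed;
  return { pending, failed, lagging };
}

// Fetch the counts from a queue-like object and evaluate them.
async function checkQueue(queue: QueueLike, thresholds: LagThresholds): Promise<LagReport> {
  const counts = await queue.getJobCounts("waiting", "delayed", "failed");
  return evaluateQueueLag(counts, thresholds);
}
```

Keeping the evaluation separate from the Redis call means the thresholds can be exercised with plain objects, without a running queue.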
4.2 Common operations
| Operation | Command / action |
|---|---|
| Tail API logs | kubectl logs -n <ns> -f deploy/helpdesk-api |
| Tail worker logs | kubectl logs -n <ns> -f deploy/helpdesk-worker |
| Trigger SLA check | QueueComponent.triggerSlaCheck({ ticketId? }) (manual job, priority HIGH) |
| Reset SLA cron | Worker restart re-registers sla-monitor-cron (removes stale repeatables) |
| Run migration | bun run migrate:dev (per-package; never run by the agent) |
| Replay dead-lettered jobs | via HandleDeadLetterUseCase / dlq.helper.ts |
5. Known Issues
⚠ The TypeScript build of @nx/helpdesk currently fails. Per the package AGENTS.md, the cause is a dead assignTicketUseCase reference path. The Level-2+ escalation reassignment call (this.assignTicketUseCase.execute(...)) in src/application/use-cases/sla-policy/process-escalation.use-case.ts is commented out, and that use-case is not injected into ProcessEscalationUseCase. As a result, SLA escalation does not reassign tickets to a senior agent; it only sends an escalation-assignment notification. Do not fix the source as part of documentation work.
| Impact | Detail |
|---|---|
| Build | bun run rebuild / tsc fails; deployable artifacts cannot be produced until repaired |
| Escalation reassignment | Disabled — senior-agent reassignment on Level 2/3 is a no-op (notification only) |
| Identity drift | .env.development carries stale SVC-00030 / 31032 / 0; reconcile to app-info.json (SVC-00120 / 31130 / 12) when build is fixed (see Configuration) |
| Dead dependency | @platformatic/kafka declared but unused — candidate for removal |
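The identity-drift row lends itself to a mechanical check. The sketch below compares the values quoted in the table; the field names (`serviceCode`, `externalPort`, `snowflakeNodeId`) are a guess at what each position in the triple means, and the real key names in .env.development and app-info.json may differ.

```typescript
// Field names are hypothetical; the values are the ones quoted in the table.
interface ServiceIdentity {
  serviceCode: string;
  externalPort: number;
  snowflakeNodeId: number;
}

// Return one human-readable line per field that differs between the two sources.
function findIdentityDrift(expected: ServiceIdentity, actual: ServiceIdentity): string[] {
  const keys: (keyof ServiceIdentity)[] = ["serviceCode", "externalPort", "snowflakeNodeId"];
  return keys
    .filter((key) => expected[key] !== actual[key])
    .map((key) => `${key}: expected ${expected[key]}, found ${actual[key]}`);
}

// Values from app-info.json (treated as the source of truth) and .env.development.
const appInfo: ServiceIdentity = { serviceCode: "SVC-00120", externalPort: 31130, snowflakeNodeId: 12 };
const envDevelopment: ServiceIdentity = { serviceCode: "SVC-00030", externalPort: 31032, snowflakeNodeId: 0 };

const drift = findIdentityDrift(appInfo, envDevelopment);
```

With the values above, all three fields are reported as drifted; an empty result would indicate the two files agree.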
6. Related Pages
- Configuration
- API Events
- /runbook/: central runbook for cross-service incidents
- Decisions