Operations
1. Deployment
| Property | Value |
|---|---|
| Image | registry/nx-seller-inventory:<tag> |
| Container Port | 3000 |
| External Port | 31050 |
| Snowflake ID | 5 |
| Replicas (default) | 1 (dev) / 2+ (staging+) |
| Resources (req/lim) | 100m / 500m CPU, 256Mi / 1Gi memory |
| HPA target | CPU 70% (when scaled) |
| Migration mode | RUN_MODE=migrate job before rollout; on-boot for dev |
| Live probe | GET /v1/api/inventory/healthz |
| Ready probe | GET /v1/api/inventory/readyz |
| Graceful shutdown | SIGTERM/SIGINT → close Kafka producer + consumer (isForce=false) |
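
The graceful-shutdown row can be sketched roughly as follows. This is an illustration under stated assumptions, not the service's actual code: `Closable`, `installGracefulShutdown`, and the injected `register` and `exit` callbacks are hypothetical names. Only the two signals and the `isForce=false` argument come from the table above.

```typescript
// Hypothetical shape: any client exposing an async close(isForce) method.
interface Closable {
  name: string;
  close(isForce: boolean): Promise<void>;
}

type ShutdownSignal = "SIGTERM" | "SIGINT";
type SignalRegistrar = (signal: ShutdownSignal, handler: () => void) => void;

function installGracefulShutdown(
  clients: Closable[],
  register: SignalRegistrar,
  exit: (code: number) => void,
): void {
  let shuttingDown = false;

  const shutdown = async (): Promise<void> => {
    if (shuttingDown) return; // a second signal must not close clients twice
    shuttingDown = true;
    // isForce=false: ask each client to drain in-flight work before closing.
    const outcomes = await Promise.all(
      clients.map((c) => c.close(false).then(() => true, () => false)),
    );
    exit(outcomes.every(Boolean) ? 0 : 1);
  };

  register("SIGTERM", () => void shutdown());
  register("SIGINT", () => void shutdown());
}
```

In a Bun or Node process the callbacks would typically be `(sig, h) => process.on(sig, h)` and `(code) => process.exit(code)`. Injecting them keeps the sketch testable without sending real signals.
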
Traefik routing labels
```yaml
labels:
  - "traefik.enable=true"
  - "traefik.http.routers.inventory.rule=PathPrefix(`/v1/api/inventory`)"
  - "traefik.http.services.inventory.loadbalancer.server.port=3000"
```
Required infrastructure
| Dependency | Why |
|---|---|
| PostgreSQL | Primary datastore (schema inventory) |
| Kafka brokers | Mandatory — service refuses to start if APP_ENV_KAFKA_BROKERS empty |
| Debezium | Required for CDC topics (MERCHANT, PRODUCT_VARIANT) |
| Redis | Optional — auth cache; service starts without it |
| @nx/identity reachable | JWKS verification on every JWT |
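
The "refuses to start" behaviour for Kafka could look something like the check below. It is a sketch only: the function name `parseKafkaBrokers` is hypothetical, and the comma-separated `host:port` format for `APP_ENV_KAFKA_BROKERS` is an assumption, since the table only states that an empty value blocks startup.

```typescript
// Assumption: APP_ENV_KAFKA_BROKERS holds a comma-separated broker list.
function parseKafkaBrokers(env: { [key: string]: string | undefined }): string[] {
  const raw = env["APP_ENV_KAFKA_BROKERS"] ?? "";
  const brokers = raw
    .split(",")
    .map((b) => b.trim())
    .filter((b) => b.length > 0);

  if (brokers.length === 0) {
    // Fail fast at boot instead of starting without a message bus.
    throw new Error("APP_ENV_KAFKA_BROKERS is empty; refusing to start");
  }
  return brokers;
}
```

Calling this once at boot (for example with `process.env`) surfaces a misconfigured deployment as a crash loop rather than as silently missing events.
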
2. Observability
| Signal | Source | Where to look |
|---|---|---|
| Logs | stdout (IGNIS structured logger, key: %s format) | kubectl logs deploy/inventory / Loki |
| Health | GET /v1/api/inventory/healthz, GET /v1/api/inventory/readyz | Gateway portal |
| OpenAPI live spec | GET /v1/api/inventory/doc/openapi.json | Gateway portal explorer |
| Metrics | Traefik gateway :30800 (Prometheus scrape) | Grafana — gateway dashboard |
| Kafka lag | Kafka cluster broker / Burrow | TBD |
Key log fields
| Field | Source | Notes |
|---|---|---|
| requestId | header X-Request-Id | Propagated cross-service |
| userId | JWT subject | — |
| merchantId | request scope | — |
| topic / partition / offset | Kafka consumer | Logged on every message |
| referenceType / referenceId | inventory tracking writes | For audit trail correlation |
Useful log search queries
| Question | Query |
|---|---|
| Stock deduct failures (oversell-blocked) | note=OVERSELL_BLOCKED |
| Re-deliveries | topic=payment.success AND idempotent skip |
| PO emit failures | Kafka emit failed AND topic=purchase-order.received |
3. Security
| Concern | Mitigation |
|---|---|
| AuthN | JWT (ES256, JWKS pulled from identity at boot + on-demand) |
| AuthZ | Casbin via PolicyDefinitionService; permissions cached in Redis (or in-memory if Redis disabled) |
| Service-to-service | BASIC strategy (commerce → inventory direct calls in shared TX) |
| Secrets | K8s Secret mounted as env (APP_ENV_DB_URL, APP_ENV_KAFKA_SASL_PASSWORD, etc.) — never in code |
| TLS | Terminated at Nginx; traffic from Nginx → Traefik → service is plaintext (intra-cluster) |
| Rate limit | Traefik middleware (default 100 rps/IP) |
| Network policy | Cilium — allow only gateway + Kafka + Postgres + Redis + identity (JWKS) |
| Soft-delete | deletedAt — no hard-delete by default; InventoryTracking immutable |
| Idempotency | All Kafka handlers idempotent — safe under at-least-once delivery |
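
To make the idempotency row concrete, here is a minimal sketch of a stock-deduct handler that tolerates re-delivery. All names (`DeductMessage`, `StockLedger`) are hypothetical, and the in-memory `Set` merely stands in for whatever persisted lookup the service performs (presumably against tracking rows keyed by reference type and ID).

```typescript
// Hypothetical message shape; field names mirror the log fields, not a real schema.
interface DeductMessage {
  referenceType: string;
  referenceId: string;
  inventoryItemId: string;
  quantity: number;
}

class StockLedger {
  // Stand-in for a persisted idempotency lookup.
  private readonly applied = new Set<string>();
  private readonly onHand = new Map<string, number>();

  constructor(initial: Array<[string, number]>) {
    for (const [itemId, qty] of initial) this.onHand.set(itemId, qty);
  }

  handle(msg: DeductMessage): "applied" | "skipped" {
    const key = `${msg.referenceType}:${msg.referenceId}:${msg.inventoryItemId}`;
    if (this.applied.has(key)) return "skipped"; // re-delivery: no second deduction
    const current = this.onHand.get(msg.inventoryItemId) ?? 0;
    this.onHand.set(msg.inventoryItemId, current - msg.quantity);
    this.applied.add(key);
    return "applied";
  }

  quantityOnHand(itemId: string): number {
    return this.onHand.get(itemId) ?? 0;
  }
}
```

In a real handler the lookup and the stock write would need to share one database transaction; otherwise a crash between the two could still double-apply a message.
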
4. Runbook
4.1 Alert classes
| Alert | Trigger | Check | Fix | Escalate |
|---|---|---|---|---|
| InventoryHighErrorRate | 5xx >5% over 5m | kubectl logs deploy/inventory \| grep level=error | Identify failing endpoint; restart if stuck | on-call backend |
| InventoryKafkaLag | Consumer group lag >10k on any topic | Burrow / Kafka admin tools | Scale replicas if CPU-bound; check handler errors | on-call SRE |
| InventoryDBConnectionExhausted | pool exhausted errors | pg_stat_activity | Bump APP_ENV_DB_POOL_MAX; check long-running TX | on-call backend |
| InventoryStockNegative | quantityOnHand < 0 rows in DB | Run reconciliation query | Audit InventoryTracking for cause; manual adjustment | on-call backend + finance |
| InventoryOversellSpike | note=OVERSELL_BLOCKED count rising | Log search | Check sale order patterns; consider raising allowOversell for fast-moving items | on-call business + backend |
| InventoryPOEmitFailure | purchase-order.received emit log errors | Grep logs | Manual replay job; verify Kafka cluster health | on-call SRE |
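
The two numeric triggers above (5xx >5% over 5m, lag >10k) can be expressed as a small predicate, which may help when sanity-checking dashboards by hand. This is an illustrative sketch: `FiveMinuteWindow` and `firingAlerts` are hypothetical names, and the real alert rules presumably live in the monitoring stack rather than in service code.

```typescript
// Hypothetical aggregate of one 5-minute observation window.
interface FiveMinuteWindow {
  totalRequests: number;
  serverErrors: number; // count of 5xx responses
  maxConsumerLag: number; // highest lag across consumed topics
}

function firingAlerts(w: FiveMinuteWindow): string[] {
  const alerts: string[] = [];
  // 5xx strictly above 5% of requests; integer math avoids float rounding.
  if (w.totalRequests > 0 && w.serverErrors * 100 > w.totalRequests * 5) {
    alerts.push("InventoryHighErrorRate");
  }
  // Lag strictly above 10k messages on any topic.
  if (w.maxConsumerLag > 10000) {
    alerts.push("InventoryKafkaLag");
  }
  return alerts;
}
```
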
4.2 Common operations
| Operation | Command |
|---|---|
| Tail logs | kubectl logs -n <ns> -f deploy/inventory |
| Run migrations manually | kubectl exec -it deploy/inventory -- bun run migrate |
| Reset Kafka consumer offset | Stop the consumers first (the group must be inactive), then use Kafka admin: kafka-consumer-groups --bootstrap-server <brokers> --group SVC-00050-INVENTORY_CONSUMER_GROUP --topic <topic> --reset-offsets --to-earliest --execute |
| Replay a PAYMENT_SUCCESS for a single order | Manually re-emit message with same key from sale; idempotency lookup will skip if already processed |
| Force re-seed permissions | Bump migration version + re-run migrate |
| Inspect stock for an item across locations | SELECT * FROM "InventoryStock" WHERE inventory_item_id = '...' AND deleted_at IS NULL; |
| Audit movements for a SaleOrder | SELECT * FROM "InventoryTracking" WHERE reference_type = 'SALE_ORDER' AND reference_id = '...'; |
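
When the two inspection queries above are run from a script rather than typed into psql, it is safer to bind the IDs as parameters than to paste them into the SQL string. A minimal sketch follows, assuming PostgreSQL-style `$1` placeholders; the `SqlQuery` type and both builder functions are hypothetical helpers, not part of the service.

```typescript
// Hypothetical query container: SQL text plus positional parameter values.
interface SqlQuery {
  text: string;
  values: string[];
}

// Stock rows for one item across locations, excluding soft-deleted rows.
function stockByItemQuery(inventoryItemId: string): SqlQuery {
  return {
    text: 'SELECT * FROM "InventoryStock" WHERE inventory_item_id = $1 AND deleted_at IS NULL',
    values: [inventoryItemId],
  };
}

// Tracking rows (movements) recorded against one reference, e.g. a sale order.
function auditMovementsQuery(referenceType: string, referenceId: string): SqlQuery {
  return {
    text: 'SELECT * FROM "InventoryTracking" WHERE reference_type = $1 AND reference_id = $2',
    values: [referenceType, referenceId],
  };
}
```

The resulting `text` and `values` pair can be handed to any client that accepts positional parameters, so a pasted ID is never interpreted as SQL.
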
4.3 Recovery scenarios
| Scenario | Recovery |
|---|---|
| Service crash mid-handler | Kafka offset NOT committed → message re-delivered on restart; idempotency lookup skips already-processed entries |
| DB transaction succeeded, Kafka emit failed (PO receive) | Log records the failed emit; manual replay tool publishes from PurchaseOrder snapshot |
| Wrong stock count detected | (1) Pause sale traffic, (2) run cycle count via InventoryTicket type=CYCLE_COUNT, (3) write ADJUSTMENT_NEUTRAL tracking rows |
| Lost merchant default location | Re-emit Merchant CDC event → ensureDefaultLocation recreates |
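
The "service crash mid-handler" row relies on committing an offset only after its handler succeeds. The simulation below shows that ordering with an in-memory log. `ConsumedMessage` and `consumeFrom` are hypothetical names, and a thrown error stands in for a process crash; a real consumer would use the Kafka client's own commit mechanism.

```typescript
interface ConsumedMessage {
  offset: number;
  value: string;
}

// Processes messages in order from the committed offset and returns the
// new committed offset (the next offset to read).
function consumeFrom(
  log: ConsumedMessage[],
  committedOffset: number,
  handle: (m: ConsumedMessage) => void,
): number {
  let committed = committedOffset;
  for (const m of log) {
    if (m.offset < committed) continue; // already committed, not re-delivered
    try {
      handle(m);
    } catch {
      return committed; // simulated crash: this message's offset is NOT committed
    }
    committed = m.offset + 1; // commit only after the handler succeeded
  }
  return committed;
}
```

Restarting from the returned offset re-delivers the message that was in flight, which is why the handlers themselves have to be idempotent.
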
5. Cross-Service Runbook
For incidents that span multiple services, see the central runbook/ directory:
- Sale ↔ Inventory payment-success not deducting stock — TBD
- PO receive but no Finance EXPENSE created — TBD
6. Related Pages
- Configuration
- API Events — Kafka topic constants for replay commands
- Inventory Tracking — audit query patterns
- Decisions