Cluster Design
Staging Cluster (3 nodes)
Minimal cluster for internal testing, demos, and integration testing.
| Node Pool | Count | Spec | Taint | Workloads |
|---|---|---|---|---|
| default | 2 | 4 vCPU, 8 GB | - | nginx-ingress, Traefik, backend, frontend, cert-manager, monitoring |
| stateful | 1 | 8 vCPU, 16 GB | - | PostgreSQL, Redis, Kafka ×3, Typesense |
Staging keeps it simple: there are no dedicated system or monitoring nodes. Everything except the data services runs on the default nodes. All data services (including 3 Kafka brokers) are colocated on the single stateful node, which has extra memory. Deployment is manual via kubectl apply -f.
Production Cluster (8+ nodes)
Comprehensive production-grade cluster with dedicated node pools, HA, autoscaling, and full observability.
| Node Pool | Count | Spec | Taint | Workloads |
|---|---|---|---|---|
| system | 2 | 2 vCPU, 4 GB | dedicated=system:NoSchedule | nginx-ingress (HA), cert-manager, Sealed Secrets controller |
| app | 3+ | 4 vCPU, 8 GB | - | Traefik, backend services, frontend apps |
| stateful | 2 | 8 vCPU, 16 GB | - | PostgreSQL, Redis, Kafka, Typesense |
| monitoring | 1 | 4 vCPU, 8 GB | dedicated=monitoring:NoSchedule | Prometheus, Grafana, Loki, Tempo, OTel, Promtail |
Why Dedicated Node Pools (Production)
- System nodes (tainted): Ingress and cert-manager must never be evicted by app workloads. The taint keeps out every pod that lacks a matching toleration, so only system pods schedule here.
- Monitoring node (tainted): Observability stack is resource-hungry. Isolating it prevents monitoring from stealing app resources (and vice versa).
- App nodes (autoscalable): Cluster autoscaler can add app-4, app-5, etc. during traffic spikes. No risk of scaling a node that has stateful data on it.
- Stateful nodes: Dedicated to data services. Pod anti-affinity spreads Kafka brokers across both nodes for fault tolerance.
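As a rough sketch of how a system workload might opt in to the tainted pool: the pod template needs a toleration for the taint plus a node selector for the pool label, because a toleration only permits scheduling on the tainted nodes and does not require it. The Deployment name, app label, and image below are hypothetical; the taint and the pool label are the ones listed in this document.

```yaml
# Hypothetical Deployment pinned to the system pool (illustrative only).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-system-workload          # hypothetical name
  namespace: nx-internal
spec:
  replicas: 2
  selector:
    matchLabels:
      app.kubernetes.io/name: example-system-workload
  template:
    metadata:
      labels:
        app.kubernetes.io/name: example-system-workload
    spec:
      # Require the system pool; without this the pod could land on any untainted node.
      nodeSelector:
        node.kubernetes.io/pool: system
      # Permit scheduling despite the dedicated=system:NoSchedule taint.
      tolerations:
        - key: dedicated
          operator: Equal
          value: system
          effect: NoSchedule
      containers:
        - name: main
          image: registry.example.com/placeholder:latest   # placeholder image
```

The monitoring pool would follow the same pattern with value: monitoring and the monitoring pool label.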
Node Labels & Taints
# System nodes
node.kubernetes.io/pool: system
taint: dedicated=system:NoSchedule
# App nodes
node.kubernetes.io/pool: app
# Stateful nodes
node.kubernetes.io/pool: stateful
# Monitoring node
node.kubernetes.io/pool: monitoring
taint: dedicated=monitoring:NoSchedule
Stateful Node Distribution
Staging — single node, all data colocated:
Production — spread across 2 nodes for fault tolerance:
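One way to get the production spread, sketched under assumptions: with 3 Kafka brokers and only 2 stateful nodes, a required one-broker-per-node anti-affinity rule would leave the third broker unschedulable, so this fragment uses a topology spread constraint that allows a 2/1 split instead. The pod label app.kubernetes.io/name: kafka is hypothetical, and where the fragment is attached (a StatefulSet pod template, or an operator's pod template field) depends on how Kafka is deployed.

```yaml
# Pod spec fragment (illustrative); belongs under a pod template's `spec`.
nodeSelector:
  node.kubernetes.io/pool: stateful
topologySpreadConstraints:
  - maxSkew: 1                            # broker counts per node may differ by at most 1
    topologyKey: kubernetes.io/hostname
    whenUnsatisfiable: DoNotSchedule
    labelSelector:
      matchLabels:
        app.kubernetes.io/name: kafka     # hypothetical broker pod label
```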
Namespace Strategy
Both clusters use the same 7 namespaces:
| Namespace | Purpose | Contents |
|---|---|---|
| nx-internal | Infrastructure | nginx-ingress, Traefik (API gateway), cert-manager, API Portal |
| nx-backend | Backend workloads | All backend services |
| nx-app | Frontend workloads | Frontend nginx pods (client, bo, overture, sale-renderer, wiki) |
| nx-persistent | Database | PostgreSQL primary + replica, PgBouncer |
| nx-broker | Message broker & cache | Redis Cluster, Kafka KRaft |
| nx-search | Search & CDC | Typesense, Debezium |
| nx-watcher | Observability stack | (not yet deployed) |
Namespace YAML
apiVersion: v1
kind: Namespace
metadata:
  name: nx-internal
  labels:
    app.kubernetes.io/part-of: bana
---
apiVersion: v1
kind: Namespace
metadata:
  name: nx-backend
  labels:
    app.kubernetes.io/part-of: bana
---
apiVersion: v1
kind: Namespace
metadata:
  name: nx-app
  labels:
    app.kubernetes.io/part-of: bana
---
apiVersion: v1
kind: Namespace
metadata:
  name: nx-persistent
  labels:
    app.kubernetes.io/part-of: bana
---
apiVersion: v1
kind: Namespace
metadata:
  name: nx-broker
  labels:
    app.kubernetes.io/part-of: bana
---
apiVersion: v1
kind: Namespace
metadata:
  name: nx-search
  labels:
    app.kubernetes.io/part-of: bana
---
apiVersion: v1
kind: Namespace
metadata:
  name: nx-watcher
  labels:
    app.kubernetes.io/part-of: bana
Resource Allocation
App Workloads (default/app nodes)
| Workload | Replicas (staging) | Replicas (prod) | CPU req | CPU lim | Mem req | Mem lim |
|---|---|---|---|---|---|---|
| traefik | 1 | 2 | 100m | 1 | 128Mi | 512Mi |
| identity | 1 | 2 | 200m | 2 | 256Mi | 1Gi |
| commerce | 1 | 1 | 200m | 2 | 256Mi | 1Gi |
| sale | 1 | 2 | 200m | 2 | 256Mi | 1Gi |
| finance | 1 | 1 | 200m | 2 | 256Mi | 1Gi |
| inventory | 1 | 1 | 200m | 2 | 256Mi | 1Gi |
| ledger | 1 | 1 | 200m | 2 | 384Mi | 1Gi |
| pricing | 1 | 1 | 200m | 2 | 320Mi | 1Gi |
| payment-api | 1 | 2 | 200m | 2 | 256Mi | 1Gi |
| payment-worker | 1 | 1 | 100m | 1 | 256Mi | 1Gi |
| signal | 1 | 2 | 100m | 1 | 256Mi | 512Mi |
| client | 1 | 1 | 50m | 500m | 64Mi | 256Mi |
| bo | 1 | 1 | 50m | 500m | 64Mi | 256Mi |
| overture | 1 | 1 | 50m | 500m | 64Mi | 256Mi |
| sale-renderer | 1 | 1 | 50m | 500m | 64Mi | 256Mi |
| wiki | 1 | 1 | 50m | 500m | 64Mi | 256Mi |
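For illustration, the identity row above would translate into a container resources stanza along these lines. This is a fragment of a container spec, and the container name is a placeholder; the values are copied from the table.

```yaml
# Container spec fragment for the identity service (values from the table above).
containers:
  - name: identity          # placeholder container name
    resources:
      requests:
        cpu: 200m
        memory: 256Mi
      limits:
        cpu: "2"
        memory: 1Gi
```

Because the requests are lower than the limits, a pod configured this way falls into the Burstable QoS class.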
Data Workloads (stateful nodes)
Staging:
| Workload | Instances | CPU req | Mem req | Storage |
|----------|-----------|---------|---------|---------|
| PG Primary | 1 | 500m | 1Gi | 20Gi |
| PG Replica | 1 | 250m | 512Mi | 20Gi |
| PgBouncer | 1 | 100m | 256Mi | - |
| Redis | 3 | 300m | 1.125Gi | 15Gi |
| Kafka | 3 | 1.5 CPU | 3.75Gi | 30Gi |
| Typesense | 1 | 200m | 512Mi | 5Gi |
| Debezium | 1 | 250m | 512Mi | - |
Production:
| Workload | Instances | CPU req | Mem req | Storage |
|----------|-----------|---------|---------|---------|
| PostgreSQL (CNPG) | 3 (1 primary + 2 replicas) | 1.5 CPU | 3Gi | 60Gi + 15Gi WAL |
| Redis + Sentinel | 3 + 3 | 750m | 1.7Gi | 15Gi |
| Kafka (Strimzi) | 3 | 1.5 CPU | 3.75Gi | 30Gi |
| Typesense (raft) | 3 | 600m | 1.5Gi | 15Gi |
See Data Layer for full operator configs, failover mechanics, and backup strategy.
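A minimal sketch of how the PostgreSQL (CNPG) row might map onto a CloudNativePG Cluster resource, assuming the CPU and memory figures in the table are totals across the 3 instances (so 500m CPU and 1Gi memory each). The resource name is hypothetical, and this is not the actual operator configuration, which is covered in Data Layer.

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: pg-example            # hypothetical name
  namespace: nx-persistent
spec:
  instances: 3                # 1 primary + 2 replicas
  storage:
    size: 20Gi                # per instance, 60Gi in total
  walStorage:
    size: 5Gi                 # per instance, 15Gi in total
  resources:
    requests:
      cpu: 500m               # assumed per-instance share of 1.5 CPU
      memory: 1Gi             # assumed per-instance share of 3Gi
```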
Storage Allocation
| PVC | Staging | Production |
|---|---|---|
| pg-data | 20Gi (×1) | 20Gi (×3) + 5Gi WAL (×3) |
| redis-data | 5Gi (×1) | 5Gi (×3) |
| kafka-data | 10Gi (×3) | 10Gi (×3) |
| typesense-data | 5Gi (×1) | 5Gi (×3) |
| prometheus-data | 10Gi | 10Gi |
| grafana-data | 5Gi | 5Gi |
| loki-data | 5Gi | 5Gi |
| tempo-data | - | 10Gi |
| Total | 80Gi | 165Gi |
Resource Governance
ResourceQuota
Each namespace has a ResourceQuota to prevent runaway workloads from consuming all cluster resources.
# nx-backend: largest namespace (all backend services)
apiVersion: v1
kind: ResourceQuota
metadata:
  name: nx-backend-quota
  namespace: nx-backend
spec:
  hard:
    requests.cpu: "12"
    requests.memory: 16Gi
    limits.cpu: "24"
    limits.memory: 32Gi
    pods: "30"
---
# nx-app: frontend apps (lightweight nginx pods)
apiVersion: v1
kind: ResourceQuota
metadata:
  name: nx-app-quota
  namespace: nx-app
spec:
  hard:
    requests.cpu: "2"
    requests.memory: 4Gi
    limits.cpu: "4"
    limits.memory: 8Gi
    pods: "12"
---
# nx-persistent: PostgreSQL + PgBouncer
apiVersion: v1
kind: ResourceQuota
metadata:
  name: nx-persistent-quota
  namespace: nx-persistent
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
    pods: "8"
    persistentvolumeclaims: "10"
---
# nx-broker: Redis + Kafka
apiVersion: v1
kind: ResourceQuota
metadata:
  name: nx-broker-quota
  namespace: nx-broker
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 12Gi
    pods: "12"
    persistentvolumeclaims: "12"
---
# nx-search: Typesense + Debezium
apiVersion: v1
kind: ResourceQuota
metadata:
  name: nx-search-quota
  namespace: nx-search
spec:
  hard:
    requests.cpu: "1"
    requests.memory: 2Gi
    limits.cpu: "2"
    limits.memory: 4Gi
    pods: "4"
---
# nx-internal: nginx-ingress, Traefik, cert-manager, API Portal
apiVersion: v1
kind: ResourceQuota
metadata:
  name: nx-internal-quota
  namespace: nx-internal
spec:
  hard:
    requests.cpu: "2"
    requests.memory: 4Gi
    limits.cpu: "4"
    limits.memory: 8Gi
    pods: "6"
---
# nx-watcher: observability stack (not yet deployed)
apiVersion: v1
kind: ResourceQuota
metadata:
  name: nx-watcher-quota
  namespace: nx-watcher
spec:
  hard:
    requests.cpu: "2"
    requests.memory: 4Gi
    limits.cpu: "4"
    limits.memory: 8Gi
    pods: "8"
LimitRange
LimitRange sets default requests/limits for containers that don't specify them, and enforces min/max boundaries.
# Default for nx-backend
apiVersion: v1
kind: LimitRange
metadata:
  name: nx-backend-limits
  namespace: nx-backend
spec:
  limits:
    - type: Container
      default:
        cpu: 500m
        memory: 512Mi
      defaultRequest:
        cpu: 100m
        memory: 128Mi
      min:
        cpu: 50m
        memory: 64Mi
      max:
        cpu: "2"
        memory: 2Gi
---
# Default for nx-app (lighter limits for frontend nginx)
apiVersion: v1
kind: LimitRange
metadata:
  name: nx-app-limits
  namespace: nx-app
spec:
  limits:
    - type: Container
      default:
        cpu: 200m
        memory: 256Mi
      defaultRequest:
        cpu: 50m
        memory: 64Mi
      min:
        cpu: 25m
        memory: 32Mi
      max:
        cpu: "1"
        memory: 1Gi
---
# Default for nx-broker
apiVersion: v1
kind: LimitRange
metadata:
  name: nx-broker-limits
  namespace: nx-broker
spec:
  limits:
    - type: Container
      default:
        cpu: 500m
        memory: 768Mi
      defaultRequest:
        cpu: 200m
        memory: 256Mi
      min:
        cpu: 100m
        memory: 128Mi
      max:
        cpu: "2"
        memory: 2Gi
---
# Default for nx-persistent
apiVersion: v1
kind: LimitRange
metadata:
  name: nx-persistent-limits
  namespace: nx-persistent
spec:
  limits:
    - type: Container
      default:
        cpu: 500m
        memory: 1Gi
      defaultRequest:
        cpu: 250m
        memory: 512Mi
      min:
        cpu: 100m
        memory: 256Mi
      max:
        cpu: "2"
        memory: 4Gi
---
# Default for nx-search
apiVersion: v1
kind: LimitRange
metadata:
  name: nx-search-limits
  namespace: nx-search
spec:
  limits:
    - type: Container
      default:
        cpu: 200m
        memory: 512Mi
      defaultRequest:
        cpu: 100m
        memory: 256Mi
      min:
        cpu: 50m
        memory: 128Mi
      max:
        cpu: "1"
        memory: 2Gi
---
# Default for nx-internal
apiVersion: v1
kind: LimitRange
metadata:
  name: nx-internal-limits
  namespace: nx-internal
spec:
  limits:
    - type: Container
      default:
        cpu: 500m
        memory: 512Mi
      defaultRequest:
        cpu: 100m
        memory: 128Mi
      min:
        cpu: 50m
        memory: 64Mi
      max:
        cpu: "2"
        memory: 2Gi
---
# Default for nx-watcher
apiVersion: v1
kind: LimitRange
metadata:
  name: nx-watcher-limits
  namespace: nx-watcher
spec:
  limits:
    - type: Container
      default:
        cpu: 200m
        memory: 256Mi
      defaultRequest:
        cpu: 50m
        memory: 64Mi
      min:
        cpu: 25m
        memory: 32Mi
      max:
        cpu: "1"
        memory: 1Gi
TIP
LimitRange defaults apply only to containers that don't explicitly declare resources, while the min/max bounds apply to every container. All BANA workloads specify resources, so the defaults act as a safety net for ad-hoc debugging pods or jobs.
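To make the safety-net behaviour concrete, here is a sketch of an ad-hoc pod that declares no resources. The pod name, image, and command are placeholders. Created in nx-backend, its container would be admitted with the defaults from nx-backend-limits.

```yaml
# Ad-hoc debugging pod with no resources block (illustrative).
apiVersion: v1
kind: Pod
metadata:
  name: debug-shell           # placeholder name
  namespace: nx-backend
spec:
  restartPolicy: Never
  containers:
    - name: shell
      image: busybox:1.36     # placeholder image
      command: ["sleep", "3600"]
      # No resources declared. The LimitRange admission step fills in:
      #   requests: cpu 100m, memory 128Mi   (defaultRequest)
      #   limits:   cpu 500m, memory 512Mi   (default)
```

Without those defaults, the namespace's ResourceQuota would reject the pod, since a quota on CPU and memory requests and limits requires every container to declare them.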
Manifest Directory Structure
The actual deployment uses numbered manifest directories applied sequentially with kubectl apply -f, not Kustomize overlays.
infrastructure/deployments/staging/
├── 00-cluster-setup/ # Namespaces, ResourceQuotas, LimitRanges
├── 01-network-policies/ # NetworkPolicy per namespace
├── 02-persistent/ # PostgreSQL, PgBouncer StatefulSets & Services
├── 03-broker/ # Redis Cluster, Kafka KRaft StatefulSets
├── 04-search/ # Typesense, Debezium
├── 05-internal/ # nginx-ingress, Traefik, cert-manager, API Portal
├── 06-backend/ # All backend service Deployments & Services
├── 07-app/ # Frontend nginx Deployments (client, bo, overture, etc.)
├── 08-watcher/ # Observability stack (not yet deployed)
├── 09-jobs/ # One-off Jobs (DDL migrations, seed, Kafka topic init)
└── kc # kubectl wrapper script (sets kubeconfig)
Deployment
# Apply all manifests in order
for dir in 00-* 01-* 02-* 03-* 04-* 05-* 06-* 07-*; do
  ./kc apply -f "$dir/"
done
Production (planned)
- System + monitoring taints applied
- nginx-ingress HA (2 replicas on system nodes)
- Traefik HA (2 replicas on app nodes)
- HPA for critical services (identity, sale, payment-api, signal)
- PodDisruptionBudgets (minAvailable: 1 for HA services)
- topologySpreadConstraints + podAntiAffinity
- Deployed via GitLab CI/CD pipeline
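A sketch of what two of the planned items could look like for the identity service. Only minAvailable: 1 and the production replica count of 2 come from this document; the resource names, the pod label, maxReplicas, and the 70% CPU target are assumptions.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: identity              # assumed name
  namespace: nx-backend
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: identity            # assumed Deployment name
  minReplicas: 2              # matches the production replica count
  maxReplicas: 4              # assumption
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # assumption; percentage of the CPU request
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: identity              # assumed name
  namespace: nx-backend
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: identity   # assumed pod label
```

Any replicas the autoscaler adds still count against the nx-backend ResourceQuota, so maxReplicas should be chosen with the quota in mind.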