Cluster Design

Staging Cluster (3 workers)

Minimal cluster for internal testing, demos, and integration testing. The 3 control-plane nodes are managed by VNPAY Cloud; workloads schedule onto 3 worker nodes (2 default + 1 stateful).

| Node Pool | Count | Spec | Taint | Workloads |
|-----------|-------|------|-------|-----------|
| `default` | 2 | 4 vCPU, 8 GB | - | nginx-ingress, Traefik, backend, frontend, cert-manager, monitoring |
| `stateful` | 1 | 8 vCPU, 16 GB | - | PostgreSQL, Redis, Kafka ×3, Typesense |

Staging keeps it simple - no dedicated system or monitoring nodes. Everything except data runs on default nodes. All data services (including 3 Kafka brokers) colocate on a single stateful node with extra memory. Deployment is manual: the numbered manifest directories are applied with kubectl through the ./kc wrapper.

Production Cluster (7+ nodes)

Comprehensive production-grade cluster with dedicated node pools, HA, autoscaling, and full observability.

| Node Pool | Count | Spec | Taint | Workloads |
|-----------|-------|------|-------|-----------|
| `system` | 2 | 2 vCPU, 4 GB | `dedicated=system:NoSchedule` | nginx-ingress (HA), cert-manager, Sealed Secrets controller |
| `app` | 3+ | 4 vCPU, 8 GB | - | Traefik, backend services, frontend apps |
| `stateful` | 2 | 8 vCPU, 16 GB | - | PostgreSQL, Redis, Kafka, Typesense |
| `monitoring` | 1 | 4 vCPU, 8 GB | `dedicated=monitoring:NoSchedule` | Prometheus, Grafana, Loki, Tempo, OTel, Promtail |

Why Dedicated Node Pools (Production)

System nodes (tainted): Ingress and cert-manager must never be evicted by app workloads. The taint repels any pod without a matching toleration, so only system pods schedule here.
Monitoring node (tainted): The observability stack is resource-hungry. Isolating it prevents monitoring from stealing app resources (and vice versa).
App nodes (autoscalable): The cluster autoscaler can add app-4, app-5, etc. during traffic spikes. Because data services live on their own pool, scaling app nodes up or down never touches a node that holds stateful data.
Stateful nodes: Dedicated to data services. Pod anti-affinity spreads Kafka brokers across both nodes for fault tolerance.
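
The Kafka spreading described above could be expressed with a pod anti-affinity rule. The fragment below is a minimal sketch, not the project's actual manifest: the `app: kafka` pod label is a hypothetical placeholder, and the node selector assumes the `node.kubernetes.io/pool` label is applied to the stateful nodes. With 3 brokers and only 2 stateful nodes, a hard (`required...`) rule would leave the third broker unschedulable, so the sketch uses the soft form.

```yaml
# Pod template fragment for a Kafka broker (sketch; the label "app: kafka" is hypothetical)
spec:
  nodeSelector:
    node.kubernetes.io/pool: stateful   # pin brokers to the stateful pool
  affinity:
    podAntiAffinity:
      # Soft rule: prefer a node with no other broker, but still schedule
      # when both stateful nodes already host one (3 brokers, 2 nodes).
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            topologyKey: kubernetes.io/hostname
            labelSelector:
              matchLabels:
                app: kafka
```

Since production Kafka is listed as Strimzi-managed, the equivalent rule would likely sit in the operator's pod template rather than in a hand-written StatefulSet.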

Node Labels & Taints

yaml

# System nodes
node.kubernetes.io/pool: system
taint: dedicated=system:NoSchedule

# App nodes
node.kubernetes.io/pool: app

# Stateful nodes
node.kubernetes.io/pool: stateful

# Monitoring node
node.kubernetes.io/pool: monitoring
taint: dedicated=monitoring:NoSchedule
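
A taint only repels pods; it does not attract them. A workload meant for the system pool therefore needs both a toleration for the taint and a node selector for the pool label. A minimal sketch of the pod template fragment, assuming the labels and taints listed above are applied to the nodes as written:

```yaml
# Pod template fragment for a system workload such as nginx-ingress (sketch)
spec:
  nodeSelector:
    node.kubernetes.io/pool: system      # land only on system nodes
  tolerations:
    - key: dedicated
      operator: Equal
      value: system
      effect: NoSchedule                 # matches dedicated=system:NoSchedule
```

For the monitoring node, the same fragment would use `monitoring` in both places.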

Stateful Node Distribution

Staging - a single stateful node, with all data services colocated.

Production - data services spread across 2 stateful nodes for fault tolerance.

Namespace Strategy

Both clusters use the same 7 namespaces:

| Namespace | Purpose | Contents |
|-----------|---------|----------|
| `nx-internal` | Infrastructure | nginx-ingress, Traefik (API gateway), cert-manager, API Portal |
| `nx-backend` | Backend workloads | All backend services |
| `nx-app` | Frontend workloads | Frontend nginx pods (client, bo, overture, sale-renderer, wiki) |
| `nx-persistent` | Database | PostgreSQL primary + replica, PgBouncer |
| `nx-broker` | Message broker & cache | Redis Cluster, Kafka KRaft |
| `nx-search` | Search & CDC | Typesense, Debezium |
| `nx-watcher` | Observability stack | (not yet deployed) |

Namespace YAML

yaml

apiVersion: v1
kind: Namespace
metadata:
  name: nx-internal
  labels:
    app.kubernetes.io/part-of: bana
---
apiVersion: v1
kind: Namespace
metadata:
  name: nx-backend
  labels:
    app.kubernetes.io/part-of: bana
---
apiVersion: v1
kind: Namespace
metadata:
  name: nx-app
  labels:
    app.kubernetes.io/part-of: bana
---
apiVersion: v1
kind: Namespace
metadata:
  name: nx-persistent
  labels:
    app.kubernetes.io/part-of: bana
---
apiVersion: v1
kind: Namespace
metadata:
  name: nx-broker
  labels:
    app.kubernetes.io/part-of: bana
---
apiVersion: v1
kind: Namespace
metadata:
  name: nx-search
  labels:
    app.kubernetes.io/part-of: bana
---
apiVersion: v1
kind: Namespace
metadata:
  name: nx-watcher
  labels:
    app.kubernetes.io/part-of: bana

Resource Allocation

App Workloads (default/app nodes)

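| Workload | Replicas (staging) | Replicas (prod) | CPU req | CPU lim | Mem req | Mem lim |
|----------|--------------------|-----------------|---------|---------|---------|---------|
| traefik | 1 | 2 | 100m | 1 | 128Mi | 512Mi |
| identity | 1 | 2 | 200m | 2 | 256Mi | 1Gi |
| commerce | 1 | 1 | 200m | 2 | 256Mi | 1Gi |
| sale | 1 | 2 | 200m | 2 | 256Mi | 1Gi |
| finance | 1 | 1 | 200m | 2 | 256Mi | 1Gi |
| inventory | 1 | 1 | 200m | 2 | 256Mi | 1Gi |
| ledger | 1 | 1 | 200m | 2 | 384Mi | 1Gi |
| pricing | 1 | 1 | 200m | 2 | 320Mi | 1Gi |
| payment-api | 1 | 2 | 200m | 2 | 256Mi | 1Gi |
| payment-worker | 1 | 1 | 100m | 1 | 256Mi | 1Gi |
| signal | 1 | 2 | 100m | 1 | 256Mi | 512Mi |
| client | 1 | 1 | 50m | 500m | 64Mi | 256Mi |
| bo | 1 | 1 | 50m | 500m | 64Mi | 256Mi |
| overture | 1 | 1 | 50m | 500m | 64Mi | 256Mi |
| sale-renderer | 1 | 1 | 50m | 500m | 64Mi | 256Mi |
| wiki | 1 | 1 | 50m | 500m | 64Mi | 256Mi |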
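
Each row maps directly onto a container's `resources` block. As an illustration, the `identity` row would translate to the fragment below. The values come from the table; the container name is an assumption.

```yaml
# Container fragment for the identity service (values from the table above)
containers:
  - name: identity            # assumed container name
    resources:
      requests:
        cpu: 200m
        memory: 256Mi
      limits:
        cpu: "2"
        memory: 1Gi
```

Because requests are lower than limits, such a pod runs in the Burstable QoS class: it can use spare CPU up to the limit but is guaranteed only the requested amount.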

Data Workloads (stateful nodes)

CPU, memory, and storage figures are totals across all instances of a workload.

Staging (13 pods, 100Gi)

| Workload | Instances | CPU req | Mem req | Storage |
|----------|-----------|---------|---------|---------|
| PG Primary | 1 | 500m | 1Gi | 20Gi |
| PG Replica | 1 | 250m | 512Mi | 20Gi |
| PgBouncer | 1 | 100m | 256Mi | - |
| Redis | 3 | 300m | 1.125Gi | 15Gi |
| Kafka | 3 | 1.5 CPU | 3.75Gi | 30Gi |
| Typesense (raft) | 3 | 600m | 1.5Gi | 15Gi |
| Debezium | 1 | 250m | 512Mi | - |

Production (15 pods, 135Gi)

| Workload | Instances | CPU req | Mem req | Storage |
|----------|-----------|---------|---------|---------|
| PostgreSQL (CNPG) | 3 (1 primary + 2 replicas) | 1.5 CPU | 3Gi | 60Gi + 15Gi WAL |
| Redis + Sentinel | 3 + 3 | 750m | 1.7Gi | 15Gi |
| Kafka (Strimzi) | 3 | 1.5 CPU | 3.75Gi | 30Gi |
| Typesense (raft) | 3 | 600m | 1.5Gi | 15Gi |

See Data Layer for full operator configs, failover mechanics, and backup strategy.

Storage Allocation

| PVC | Staging | Production |
|-----|---------|------------|
| `pg-data` | 20Gi (×1) | 20Gi (×3) + 5Gi WAL (×3) |
| `redis-data` | 5Gi (×3) | 5Gi (×3) |
| `kafka-data` | 10Gi (×3) | 10Gi (×3) |
| `typesense-data` | 5Gi (×3) | 5Gi (×3) |
| `prometheus-data` | 10Gi | 10Gi |
| `grafana-data` | 5Gi | 5Gi |
| `loki-data` | 5Gi | 5Gi |
| `tempo-data` | - | 10Gi |
| Total | 100Gi | 165Gi |
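
For StatefulSet-managed services, a per-replica claim of this kind is typically declared through `volumeClaimTemplates`. A sketch for the `kafka-data` row, assuming a hand-written StatefulSet and the cluster's default StorageClass (neither is confirmed by this page):

```yaml
# StatefulSet fragment: one 10Gi claim per Kafka broker (3 replicas -> 30Gi total)
volumeClaimTemplates:
  - metadata:
      name: kafka-data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi
      # storageClassName omitted: falls back to the cluster default (assumption)
```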

Resource Governance

ResourceQuota

Each namespace has a ResourceQuota to prevent runaway workloads from consuming all cluster resources.

yaml

# nx-backend - largest namespace (all backend services)
apiVersion: v1
kind: ResourceQuota
metadata:
  name: nx-backend-quota
  namespace: nx-backend
spec:
  hard:
    requests.cpu: "12"
    requests.memory: 16Gi
    limits.cpu: "24"
    limits.memory: 32Gi
    pods: "30"
---
# nx-app - frontend apps (lightweight nginx pods)
apiVersion: v1
kind: ResourceQuota
metadata:
  name: nx-app-quota
  namespace: nx-app
spec:
  hard:
    requests.cpu: "2"
    requests.memory: 4Gi
    limits.cpu: "4"
    limits.memory: 8Gi
    pods: "12"
---
# nx-persistent - PostgreSQL + PgBouncer
apiVersion: v1
kind: ResourceQuota
metadata:
  name: nx-persistent-quota
  namespace: nx-persistent
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
    pods: "8"
    persistentvolumeclaims: "10"
---
# nx-broker - Redis + Kafka
apiVersion: v1
kind: ResourceQuota
metadata:
  name: nx-broker-quota
  namespace: nx-broker
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 12Gi
    pods: "12"
    persistentvolumeclaims: "12"
---
# nx-search - Typesense + Debezium
apiVersion: v1
kind: ResourceQuota
metadata:
  name: nx-search-quota
  namespace: nx-search
spec:
  hard:
    requests.cpu: "1"
    requests.memory: 2Gi
    limits.cpu: "2"
    limits.memory: 4Gi
    pods: "4"
---
# nx-internal - nginx-ingress, Traefik, cert-manager, API Portal
apiVersion: v1
kind: ResourceQuota
metadata:
  name: nx-internal-quota
  namespace: nx-internal
spec:
  hard:
    requests.cpu: "2"
    requests.memory: 4Gi
    limits.cpu: "4"
    limits.memory: 8Gi
    pods: "6"
---
# nx-watcher - observability stack (not yet deployed)
apiVersion: v1
kind: ResourceQuota
metadata:
  name: nx-watcher-quota
  namespace: nx-watcher
spec:
  hard:
    requests.cpu: "2"
    requests.memory: 4Gi
    limits.cpu: "4"
    limits.memory: 8Gi
    pods: "8"

LimitRange

LimitRange sets default requests/limits for containers that don't specify them, and enforces min/max boundaries.

yaml

# Default for nx-backend
apiVersion: v1
kind: LimitRange
metadata:
  name: nx-backend-limits
  namespace: nx-backend
spec:
  limits:
    - type: Container
      default:
        cpu: 500m
        memory: 512Mi
      defaultRequest:
        cpu: 100m
        memory: 128Mi
      min:
        cpu: 50m
        memory: 64Mi
      max:
        cpu: "2"
        memory: 2Gi
---
# Default for nx-app (lighter limits for frontend nginx)
apiVersion: v1
kind: LimitRange
metadata:
  name: nx-app-limits
  namespace: nx-app
spec:
  limits:
    - type: Container
      default:
        cpu: 200m
        memory: 256Mi
      defaultRequest:
        cpu: 50m
        memory: 64Mi
      min:
        cpu: 25m
        memory: 32Mi
      max:
        cpu: "1"
        memory: 1Gi
---
# Default for nx-broker
apiVersion: v1
kind: LimitRange
metadata:
  name: nx-broker-limits
  namespace: nx-broker
spec:
  limits:
    - type: Container
      default:
        cpu: 500m
        memory: 768Mi
      defaultRequest:
        cpu: 200m
        memory: 256Mi
      min:
        cpu: 100m
        memory: 128Mi
      max:
        cpu: "2"
        memory: 2Gi
---
# Default for nx-persistent
apiVersion: v1
kind: LimitRange
metadata:
  name: nx-persistent-limits
  namespace: nx-persistent
spec:
  limits:
    - type: Container
      default:
        cpu: 500m
        memory: 1Gi
      defaultRequest:
        cpu: 250m
        memory: 512Mi
      min:
        cpu: 100m
        memory: 256Mi
      max:
        cpu: "2"
        memory: 4Gi
---
# Default for nx-search
apiVersion: v1
kind: LimitRange
metadata:
  name: nx-search-limits
  namespace: nx-search
spec:
  limits:
    - type: Container
      default:
        cpu: 200m
        memory: 512Mi
      defaultRequest:
        cpu: 100m
        memory: 256Mi
      min:
        cpu: 50m
        memory: 128Mi
      max:
        cpu: "1"
        memory: 2Gi
---
# Default for nx-internal
apiVersion: v1
kind: LimitRange
metadata:
  name: nx-internal-limits
  namespace: nx-internal
spec:
  limits:
    - type: Container
      default:
        cpu: 500m
        memory: 512Mi
      defaultRequest:
        cpu: 100m
        memory: 128Mi
      min:
        cpu: 50m
        memory: 64Mi
      max:
        cpu: "2"
        memory: 2Gi
---
# Default for nx-watcher
apiVersion: v1
kind: LimitRange
metadata:
  name: nx-watcher-limits
  namespace: nx-watcher
spec:
  limits:
    - type: Container
      default:
        cpu: 200m
        memory: 256Mi
      defaultRequest:
        cpu: 50m
        memory: 64Mi
      min:
        cpu: 25m
        memory: 32Mi
      max:
        cpu: "1"
        memory: 1Gi

TIP

LimitRange defaults apply only to containers that don't declare their own resources; the min/max bounds are enforced on every container. All BANA workloads specify resources, so the defaults act as a safety net for ad-hoc debugging pods or jobs.
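
To illustrate the safety net: a throwaway pod created in `nx-backend` without a `resources` block would be admitted with the namespace defaults filled in. The pod name and image below are placeholders, not part of the project.

```yaml
# Ad-hoc debug pod with no resources declared (name and image are placeholders)
apiVersion: v1
kind: Pod
metadata:
  name: debug-shell
  namespace: nx-backend
spec:
  restartPolicy: Never
  containers:
    - name: shell
      image: busybox:1.36
      command: ["sleep", "3600"]
      # No resources block. On admission, nx-backend-limits injects:
      #   requests: cpu 100m, memory 128Mi   (defaultRequest)
      #   limits:   cpu 500m, memory 512Mi   (default)
```

Without those defaults the pod would be rejected outright, because a namespace whose ResourceQuota constrains `requests.*` and `limits.*` requires every container to declare them.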

Manifest Directory Structure

Deployment uses numbered manifest directories applied in order with kubectl via the ./kc kubeconfig wrapper. There is no Kustomize - no kustomization.yaml and no k8s/overlays anywhere in the repo.

infrastructure/deployments/staging/
├── kc                           # kubectl wrapper (loads the staging kubeconfig)
├── deploy.sh                    # single-service image bump + rollout
└── manifests/
    ├── 00-cluster-setup/        # Namespaces, PriorityClasses, ResourceQuotas, LimitRanges
    ├── 01-network-policies/     # NetworkPolicy + CiliumNetworkPolicy per namespace
    ├── 02-secrets/              # Per-service secret creation scripts & templates
    ├── 03-data-layer/           # PostgreSQL + PgBouncer, Redis Cluster, Kafka, Typesense, Debezium
    ├── 04-ingress-controller/   # nginx ingress controller + IngressClass
    ├── 05-gateway/              # Traefik API gateway (config, deployment, service)
    ├── 06-services/             # Backend Deployments, frontend nginx Deployments, shared-config
    └── 07-ingress/              # Domain routing Ingress (backend, frontend, webhook)

Deployment

Secrets (02-secrets/) are created first via the interactive create-*.sh scripts; the remaining numbered directories are plain manifests applied in order:

bash

# From infrastructure/deployments/staging - apply numbered manifests in order via ./kc
for step in 00-cluster-setup 01-network-policies 03-data-layer \
            04-ingress-controller 05-gateway 06-services 07-ingress; do
  ./kc apply -R -f "manifests/$step/"
done

Stateful components in 03-data-layer/ come up with readiness gates between StatefulSets (PostgreSQL primary before replica, Kafka and Redis cluster-init jobs). Redeploying a single service uses deploy.sh <service> [tag], which bumps the image and waits for the rollout.

Production (planned)

System + monitoring taints applied
nginx-ingress HA (2 replicas on system nodes)
Traefik HA (2 replicas on app nodes)
HPA for critical services (identity, sale, payment-api, signal)
PodDisruptionBudgets (minAvailable: 1 for HA services)
topologySpreadConstraints + podAntiAffinity
Deployed via GitLab CI/CD pipeline
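
Two of the planned items, sketched under assumptions. The `app: identity` selector label, the resource names, the `maxReplicas` value, and the 70% CPU target are all illustrative. Only `minAvailable: 1` and the 2-replica production baseline for `identity` come from this page.

```yaml
# PodDisruptionBudget: keep at least one identity pod up during voluntary disruptions
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: identity-pdb
  namespace: nx-backend
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: identity            # hypothetical label
---
# HorizontalPodAutoscaler: scale identity on CPU utilization
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: identity-hpa
  namespace: nx-backend
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: identity             # assumed Deployment name
  minReplicas: 2               # production replica count from the Resource Allocation table
  maxReplicas: 4               # illustrative
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # illustrative; measured against the CPU request
```

Utilization is computed relative to the container's CPU request, so with the 200m request listed for `identity`, a 70% target corresponds to roughly 140m of average usage per pod.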
