Workload

Every service that runs in the BANA cluster, with full specifications for replicas, resources, health checks, Snowflake ID allocation, and production HA features.

Backend Services

All backend services share a common template:

Image: bcr.bana.com.vn/nx-<service>:<tag>
Base image: oven/bun:1.3.10-alpine
Port: 3000 (internal)
Health check path: /v1/api/<service>/health
Restart policy: Always
Namespace: nx-backend

Staging vs Production

Aspect	Staging	Production
Node selector	`node.kubernetes.io/pool: default`	`node.kubernetes.io/pool: app`
PodDisruptionBudget	No	Yes (HA services)
HPA	No	Yes (critical services)
topologySpreadConstraints	No	Yes (HA services)
podAntiAffinity	Soft (preferred), all services	Hard (required), HA services
Replicas	Minimum	More for HA services

Deployment Template

Deployment: nx-

yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nx-<service>
  namespace: nx-backend
  labels:
    app.kubernetes.io/name: <service>
    app.kubernetes.io/part-of: bana
    app.kubernetes.io/component: backend
spec:
  replicas: <count>
  selector:
    matchLabels:
      app.kubernetes.io/name: <service>
  template:
    metadata:
      labels:
        app.kubernetes.io/name: <service>
    spec:
      nodeSelector:
        node.kubernetes.io/pool: default  # staging: default, production: app
      initContainers:
        - name: wait-for-identity
          image: busybox:1.37
          command: ['sh', '-c', 'until wget -qO- http://nx-identity.nx-backend.svc.cluster.local:3000/v1/api/identity/health; do sleep 2; done']
      containers:
        - name: <service>
          image: bcr.bana.com.vn/nx-<service>:<tag>
          ports:
            - containerPort: 3000
          envFrom:
            - configMapRef:
                name: nx-<service>-config
            - secretRef:
                name: nx-<service>-secret
          resources:
            requests:
              cpu: <cpu-req>
              memory: <mem-req>
            limits:
              cpu: <cpu-lim>
              memory: <mem-lim>
          readinessProbe:
            httpGet:
              path: /v1/api/<service>/health
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 10
          livenessProbe:
            httpGet:
              path: /v1/api/<service>/health
              port: 3000
            initialDelaySeconds: 15
            periodSeconds: 30
            failureThreshold: 3
          startupProbe:
            httpGet:
              path: /v1/api/<service>/health
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 5
            failureThreshold: 36  # 5s × 36 = 180s maximum startup time
          lifecycle:
            preStop:
              exec:
                command: ["sh", "-c", "sleep 15"]  # Wait for endpoint deregistration
          imagePullPolicy: IfNotPresent  # staging; production: Always
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            capabilities:
              drop:
                - ALL
          volumeMounts:
            - name: tmp
              mountPath: /tmp
      terminationGracePeriodSeconds: 45  # 15s preStop + 30s application shutdown
      automountServiceAccountToken: false
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
        runAsGroup: 1000
        fsGroup: 1000
        seccompProfile:
          type: RuntimeDefault
      volumes:
        - name: tmp
          emptyDir:
            sizeLimit: 64Mi

Soft Anti-Affinity (All Deployments)

All backend Deployments now include soft pod anti-affinity so that pods spread across nodes when possible. This applies to both staging and production:

yaml

affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app.kubernetes.io/name: <service>
          topologyKey: kubernetes.io/hostname

In staging (2 default nodes), this is best-effort spreading. In production, HA services additionally use hard anti-affinity (requiredDuringSchedulingIgnoredDuringExecution) to enforce the spread.
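
The hard variant mentioned above is not spelled out on this page. A minimal sketch of what it could look like, assuming it reuses the same label selector and topology key as the soft rule:

```yaml
# Sketch only: hard anti-affinity for an HA service.
# Selector and topologyKey are assumed to mirror the soft rule above.
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app.kubernetes.io/name: <service>
        topologyKey: kubernetes.io/hostname
```

With a required rule, the scheduler never places two matching pods on the same node, so any replica beyond the number of eligible nodes stays Pending until another node becomes available.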

Why startupProbe + preStop + terminationGracePeriod?

startupProbe: Gives slow-starting services (identity initializing JWKS, Kafka connections) up to 180s to become ready without being killed by the liveness probe.
preStop sleep 15: When a pod is terminating, the Kubernetes endpoint controller needs time to remove it from the Service. The 15s sleep ensures in-flight requests complete before SIGTERM arrives.
terminationGracePeriodSeconds: 45: 15s preStop + 30s for the application to drain connections and shut down gracefully.

INFO

The wait-for-identity init container is present on every service except identity. Identity is the IssuerApplication (JWKS issuer); all other services are VerifierApplications that need identity's public keys to validate tokens.

Production HA Features

For HA services (identity, sale, payment-api, signal), production adds:

PodDisruptionBudget

PodDisruptionBudget: nx-{service}-pdb

yaml

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: nx-<service>-pdb
  namespace: nx-backend
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: <service>

HorizontalPodAutoscaler

HorizontalPodAutoscaler: nx-{service}-hpa

yaml

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nx-<service>-hpa
  namespace: nx-backend
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nx-<service>
  minReplicas: 2
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80

Topology & Anti-Affinity (Production Only)

yaml

spec:
  template:
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: kubernetes.io/hostname
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app.kubernetes.io/name: <service>
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app.kubernetes.io/name: <service>
                topologyKey: kubernetes.io/hostname

Per-Service Specifications

identity

JWKS issuer; it must start first. It has no wait-for-identity init container.

Property	Staging	Production
Replicas	1	2 (HPA: 2-5)
CPU request/limit	200m / 2	200m / 2
Memory request/limit	256Mi / 1Gi	256Mi / 1Gi
Health path	`/v1/api/identity/health`	`/v1/api/identity/health`
HA	Yes	Yes (PDB + HPA + topology)
Snowflake range	10-19	10-19
PriorityClass	`nx-high`	`nx-high`
Middleware	`rate-limit-auth`, `circuit-breaker`, `security-headers`	`rate-limit-auth`, `circuit-breaker`, `security-headers`

yaml

env:
  - name: SNOWFLAKE_MACHINE_ID
    valueFrom:
      fieldRef:
        fieldPath: metadata.annotations['snowflake-id']
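
The fieldRef above reads a pod annotation named snowflake-id through the Downward API, but this page does not show where that annotation is set. A minimal sketch, assuming it is set on the pod template and that identity uses the first ID of its range:

```yaml
# Sketch only: pod template annotation consumed by the fieldRef above.
spec:
  template:
    metadata:
      annotations:
        snowflake-id: "10"  # assumed value: first ID of the identity range (10-19)
```

Every replica of a Deployment is created from the same pod template, so an annotation set this way gives all replicas the same value. Per-replica IDs need an extra mechanism.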

commerce

Property	Value
Replicas	1
CPU request/limit	200m / 2
Memory request/limit	256Mi / 1Gi
Health path	`/v1/api/commerce/health`
HA	No
Snowflake range	20-29
Middleware	`rate-limit`, `circuit-breaker`, `security-headers`

sale

Property	Staging	Production
Replicas	1	2 (HPA: 2-5)
CPU request/limit	200m / 2	200m / 2
Memory request/limit	256Mi / 1Gi	256Mi / 1Gi
Health path	`/v1/api/sale/health`	`/v1/api/sale/health`
HA	Yes	Yes (PDB + HPA + topology)
Snowflake range	30-39	30-39
PriorityClass	`nx-high`	`nx-high`
Middleware	`rate-limit`, `circuit-breaker`, `security-headers`	`rate-limit`, `circuit-breaker`, `security-headers`

finance

Property	Value
Replicas	1
CPU request/limit	200m / 2
Memory request/limit	256Mi / 1Gi
Health path	`/v1/api/finance/health`
HA	No
Snowflake range	40-49

inventory

Property	Value
Replicas	1
CPU request/limit	200m / 2
Memory request/limit	256Mi / 1Gi
Health path	`/v1/api/inventory/health`
HA	No
Snowflake range	50-59

ledger

Property	Value
Replicas	1
CPU request/limit	200m / 2
Memory request/limit	384Mi / 1Gi
Health path	`/v1/api/ledger/health`
HA	No
Snowflake range	60-69

pricing

Property	Value
Replicas	1
CPU request/limit	200m / 2
Memory request/limit	320Mi / 1Gi
Health path	`/v1/api/pricing/health`
HA	No
Snowflake range	70-79

Payment (2 Deployments, 1 Image)

Payment uses a single container image that is deployed in two modes, selected through the APP_MODE environment variable.

payment-api

Property	Staging	Production
Replicas	1	2 (HPA: 2-4)
CPU request/limit	200m / 2	200m / 2
Memory request/limit	256Mi / 1Gi	256Mi / 1Gi
Health path	`/v1/api/payment/health`	`/v1/api/payment/health`
HA	Yes	Yes (PDB + HPA + topology)
Snowflake ID	80	80-84
Extra env	`APP_ENV_MQ_PAY_MODE=api`	`APP_MODE=api`
PriorityClass	`nx-high`	`nx-high`
Middleware	`rate-limit`, `circuit-breaker`, `security-headers`	`rate-limit`, `circuit-breaker`, `security-headers`

An additional IngressRoute rewrites the webhook path:

yaml

# Rewrites hook.staging.bana.com.vn/v1/api/* -> /v1/api/payment/*
- name: payment-webhook
  match: Host(`hook.staging.bana.com.vn`)
  priority: 100
  middlewares:
    - name: payment-add-prefix
    - name: security-headers
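
The payment-add-prefix middleware referenced above is not defined on this page. Traefik's addPrefix middleware prepends to the whole path, so it cannot by itself turn /v1/api/* into /v1/api/payment/*. One way to get the mapping in the comment is replacePathRegex. The sketch below uses Traefik's file-provider format, and the regular expression is an assumption rather than the project's actual definition:

```yaml
# Sketch only: file-provider dynamic config for the webhook path rewrite.
http:
  middlewares:
    payment-add-prefix:
      replacePathRegex:
        regex: "^/v1/api/(.*)"              # assumed pattern
        replacement: "/v1/api/payment/$1"   # /v1/api/foo -> /v1/api/payment/foo
```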

payment-worker

Property	Value
Replicas	1
CPU request/limit	100m / 1
Memory request/limit	256Mi / 1Gi
HA	No
Snowflake ID	90
Extra env	`APP_ENV_MQ_PAY_MODE=worker`
PriorityClass	`nx-low`
Liveness	Process check (`kill -0 1`), not HTTP
Traefik	Disabled (no Service/IngressRoute)
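
Because the worker exposes no HTTP endpoint, its liveness check has to be an exec probe. A minimal sketch of the relevant part of the pod template, assuming the image is named nx-payment and that the probe timings mirror the backend template:

```yaml
# Sketch only: values marked "assumed" do not come from this page.
spec:
  template:
    spec:
      priorityClassName: nx-low
      containers:
        - name: payment-worker
          image: bcr.bana.com.vn/nx-payment:<tag>  # assumed image name (shared with payment-api)
          env:
            - name: APP_ENV_MQ_PAY_MODE
              value: worker
          livenessProbe:
            exec:
              command: ["sh", "-c", "kill -0 1"]   # succeeds while PID 1 exists
            periodSeconds: 30                      # assumed
            failureThreshold: 3                    # assumed
```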

Signal (Dual-Route)

Property	Staging	Production
Replicas	1	2 (HPA: 2-5)
CPU request/limit	100m / 1	100m / 1
Memory request/limit	256Mi / 512Mi	256Mi / 512Mi
Health path	`/v1/api/signal/health`	`/v1/api/signal/health`
HA	Yes	Yes (PDB + HPA + topology)
Snowflake range	90-99	90-99
PriorityClass	`nx-high`	`nx-high`

INFO

Signal routing (a REST API with middleware and a WebSocket route without rate limiting) is configured in Traefik's file-based dynamic config, not through IngressRoute CRDs.
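
The dynamic config itself is not reproduced here. As a rough sketch of the dual-route idea, two routers can point at the same service with different middleware lists. The router names, the WebSocket path, and the middleware lists are all assumptions:

```yaml
# Sketch only: file-provider dynamic config with two routes to signal.
http:
  routers:
    signal-ws:                                  # hypothetical router name
      rule: "PathPrefix(`/v1/api/signal/ws`)"   # hypothetical WebSocket path
      priority: 110                             # evaluated before the REST router
      entryPoints: [web]
      service: signal
      middlewares:
        - security-headers                      # no rate-limit middleware on this route
    signal-rest:                                # hypothetical router name
      rule: "PathPrefix(`/v1/api/signal`)"
      priority: 100
      entryPoints: [web]
      service: signal
      middlewares:
        - rate-limit                            # assumed, by analogy with other services
        - security-headers
  services:
    signal:
      loadBalancer:
        servers:
          - url: "http://nx-signal.nx-backend.svc.cluster.local:3000"
```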

Traefik API Gateway

Traefik runs as a Deployment in the nx-internal namespace. It acts as the API gateway for backend routing, not as the ingress controller. External traffic enters through nginx-ingress in nx-internal and is then forwarded to Traefik.

Deployment: nx-traefik

yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nx-traefik
  namespace: nx-internal
  labels:
    app.kubernetes.io/name: traefik
    app.kubernetes.io/part-of: bana
    app.kubernetes.io/component: gateway
spec:
  replicas: 1  # staging: 1, production: 2
  selector:
    matchLabels:
      app.kubernetes.io/name: traefik
  template:
    metadata:
      labels:
        app.kubernetes.io/name: traefik
    spec:
      nodeSelector:
        node.kubernetes.io/pool: default  # staging: default, production: app
      containers:
        - name: traefik
          image: traefik:v3.6
          args:
            - --api.dashboard=true
            - --api.insecure=true
            - --ping=true  # enables the /ping endpoint used by the readiness probe
            - --entrypoints.web.address=:8000
            - --entrypoints.traefik.address=:8080
            - --providers.file.directory=/etc/traefik/dynamic
            - --providers.file.watch=true
            - --log.format=json
            - --log.level=INFO
            - --metrics.prometheus=true
            - --metrics.prometheus.addEntryPointsLabels=true
            - --metrics.prometheus.addServicesLabels=true
            - --accesslog=true
            - --accesslog.format=json
          ports:
            - name: web
              containerPort: 8000
            - name: traefik
              containerPort: 8080
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: "1"
              memory: 512Mi
          readinessProbe:
            httpGet:
              path: /ping
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 10
---
apiVersion: v1
kind: Service
metadata:
  name: nx-traefik
  namespace: nx-internal
spec:
  selector:
    app.kubernetes.io/name: traefik
  ports:
    - name: web
      port: 80
      targetPort: 8000
      protocol: TCP
    - name: traefik
      port: 8080
      targetPort: 8080
      protocol: TCP
  type: ClusterIP

TIP

In staging, Traefik runs a single replica in nx-internal on the default nodes. In production, it runs 2 replicas on the app nodes with podAntiAffinity to spread them across hosts.
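
The production differences described in this tip could be expressed as an overlay on the Deployment above. This sketch assumes the soft (preferred) form of anti-affinity, since the page does not say which form Traefik uses:

```yaml
# Sketch only: production overrides for nx-traefik.
spec:
  replicas: 2
  template:
    spec:
      nodeSelector:
        node.kubernetes.io/pool: app
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:  # assumed: soft rule
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app.kubernetes.io/name: traefik
                topologyKey: kubernetes.io/hostname
```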

Snowflake ID Allocation

Each service is assigned its own machine ID range to avoid ID collisions between replicas.

Service	Range	Pod 0	Pod 1
identity	10-19	10	-
commerce	20-29	20	-
sale	30-39	30	-
finance	40-49	40	-
inventory	50-59	50	-
ledger	60-69	60	-
pricing	70-79	70	-
payment-api	80-84	80	-
payment-worker	90	90	-
signal	90-99	90	-

Allocation is derived from the pod ordinal index taken from the pod name:

yaml

env:
  - name: POD_NAME
    valueFrom:
      fieldRef:
        fieldPath: metadata.name
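  - name: SNOWFLAKE_MACHINE_ID
    # Pseudocode: Kubernetes does not run shell inside `value`. The expression
    # only describes the calculation, which must run inside the container.
    value: "$(echo $POD_NAME | grep -oE '[0-9]+$' | awk '{print $1 + <base>}')"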

For a Deployment (as opposed to a StatefulSet), use an init script supplied through a ConfigMap, or the Downward API with a sidecar, to compute the ID.
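
The init-script approach could be sketched as an init container that computes the ID and hands it to the main container through a shared volume. The container name, the SNOWFLAKE_BASE variable, and the /snowflake path are hypothetical. The ordinal lookup only works when the pod name ends in a number (StatefulSet-style, for example nx-identity-0); Deployment pod names end in a random suffix and need a different source for the ordinal.

```yaml
# Sketch only: compute the machine ID in an init container.
initContainers:
  - name: snowflake-id                # hypothetical name
    image: busybox:1.37
    env:
      - name: POD_NAME
        valueFrom:
          fieldRef:
            fieldPath: metadata.name
      - name: SNOWFLAKE_BASE          # hypothetical variable; 10 is the identity base
        value: "10"
    command:
      - sh
      - -c
      - |
        # Take the text after the last "-" in the pod name as the ordinal.
        ordinal="${POD_NAME##*-}"
        echo "SNOWFLAKE_MACHINE_ID=$(( $SNOWFLAKE_BASE + $ordinal ))" > /snowflake/env
    volumeMounts:
      - name: snowflake
        mountPath: /snowflake
containers:
  - name: <service>
    volumeMounts:
      - name: snowflake
        mountPath: /snowflake         # the entrypoint must load /snowflake/env before starting
        readOnly: true
volumes:
  - name: snowflake
    emptyDir: {}
```

How the main container loads the file depends on the image's entrypoint, which this page does not show.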

Frontend Services

All frontend services use nginx:1.27-alpine to serve static assets in the nx-app namespace.

Service	Path	Port	CPU request/limit	Memory request/limit
client	`/client`	8080	50m/500m	64Mi/256Mi
bo	`/bo`	8080	50m/500m	64Mi/256Mi
sale-renderer	`/sale`	8080	50m/500m	64Mi/256Mi
overture	`/`	8080	50m/500m	64Mi/256Mi
wiki	`/wiki`	8080	50m/500m	64Mi/256Mi

Deployment: nx-client

yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nx-client
  namespace: nx-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: client
  template:
    metadata:
      labels:
        app.kubernetes.io/name: client  # must match spec.selector.matchLabels
    spec:
      nodeSelector:
        node.kubernetes.io/pool: default  # staging: default, production: app
      containers:
        - name: client
          image: bcr.bana.com.vn/nx-client:latest
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: 50m
              memory: 64Mi
            limits:
              cpu: 500m
              memory: 256Mi
          readinessProbe:
            httpGet:
              path: /
              port: 8080
            initialDelaySeconds: 2
            periodSeconds: 10
          volumeMounts:
            - name: nginx-config
              mountPath: /etc/nginx/conf.d/default.conf
              subPath: default.conf
              readOnly: true
      volumes:
        - name: nginx-config
          configMap:
            name: nx-client-nginx

INFO

Frontend images are built by CI/CD and pushed to bcr.bana.com.vn/nx-<app>:<tag>. Static assets are baked into the image during the CI build step.

Service Objects

Each Deployment has a corresponding ClusterIP Service:

Service: nx-

yaml

apiVersion: v1
kind: Service
metadata:
  name: nx-<service>
  namespace: nx-backend
spec:
  selector:
    app.kubernetes.io/name: <service>
  ports:
    - port: 3000      # backend
      targetPort: 3000
      protocol: TCP
  type: ClusterIP

Frontend Services use port 8080 instead of 3000. Backend DNS names resolve as nx-<service>.nx-backend.svc.cluster.local. Frontend DNS names resolve as nx-<service>.nx-app.svc.cluster.local.
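
For comparison with the backend template, a frontend Service for the client app would look roughly like this. The name, namespace, selector label, and port are taken from the nx-client Deployment above; the rest mirrors the backend Service:

```yaml
# Sketch: ClusterIP Service for the nx-client frontend.
apiVersion: v1
kind: Service
metadata:
  name: nx-client
  namespace: nx-app
spec:
  selector:
    app.kubernetes.io/name: client
  ports:
    - port: 8080      # frontend
      targetPort: 8080
      protocol: TCP
  type: ClusterIP
```

This Service would be reachable inside the cluster as nx-client.nx-app.svc.cluster.local:8080.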
