CDC / Debezium

1. Overview

Change Data Capture (CDC) keeps the Typesense search index in sync with PostgreSQL. Debezium monitors PostgreSQL's WAL (Write-Ahead Log) and publishes row-level changes to Kafka topics. The Search service consumes these events, transforms them, and upserts/deletes documents in Typesense.

PostgreSQL (WAL) → Debezium → Kafka → Search Service → Typesense

2. Architecture

3. CDC Topic Registry

All topics follow the Debezium naming convention: {prefix}.{schema}.{table}

| Kafka Topic | PostgreSQL Table | Typesense Collection | Notes |
|---|---|---|---|
| nx.seller.public.Organizer | public.Organizer | organizers | |
| nx.seller.public.Merchant | public.Merchant | merchants | |
| nx.seller.public.Category | public.Category | categories | |
| nx.seller.public.Device | public.Device | devices | |
| nx.seller.public.SaleChannel | public.SaleChannel | sale-channels | |
| nx.seller.public.Product | public.Product | products | |
| nx.seller.public.ProductInfo | public.ProductInfo | products | Partial updates only |

Topic Prefix: nx.seller (configured in the Debezium connector)
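
Because the prefix itself contains a dot, splitting a topic name on the first dot would give the wrong schema and table. The following is a minimal sketch of building and parsing topic names under the convention above. The function names and the `TopicParts` type are illustrative and are not taken from the codebase.

```typescript
// Assumption: the prefix is fixed and known ahead of time.
const TOPIC_PREFIX = "nx.seller";

interface TopicParts {
  prefix: string;
  schema: string;
  table: string;
}

// Build a topic name following {prefix}.{schema}.{table}.
function buildTopicName(schema: string, table: string, prefix: string = TOPIC_PREFIX): string {
  return `${prefix}.${schema}.${table}`;
}

// Parse a topic name by reading the schema and table from the right-hand end,
// so that dots inside the prefix do not shift the result.
function parseTopicName(topic: string): TopicParts | null {
  const tableDot = topic.lastIndexOf(".");
  if (tableDot <= 0) return null;
  const schemaDot = topic.lastIndexOf(".", tableDot - 1);
  if (schemaDot <= 0) return null;
  return {
    prefix: topic.slice(0, schemaDot),
    schema: topic.slice(schemaDot + 1, tableDot),
    table: topic.slice(tableDot + 1),
  };
}

// parseTopicName("nx.seller.public.Product")
// → { prefix: "nx.seller", schema: "public", table: "Product" }
```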

4. Debezium Payload Structure

4.1. Envelope Format

```json
{
  "before": { ... },
  "after": { ... },
  "op": "c",
  "ts_ms": 1711785600000,
  "source": {
    "version": "2.x",
    "connector": "postgresql",
    "name": "nx.seller",
    "ts_ms": 1711785600000,
    "snapshot": "false",
    "db": "nx_seller",
    "schema": "public",
    "table": "Product"
  }
}
```
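
For consumers written in TypeScript, the envelope can be described with a small set of types. This is a sketch that covers only the fields shown above. The type names and the `parseEnvelope` helper are hypothetical, and it assumes the message value is the bare envelope (not wrapped in a `schema`/`payload` pair by the Kafka Connect JSON converter).

```typescript
type DebeziumOp = "c" | "u" | "d" | "r";

interface DebeziumSource {
  version: string;
  connector: string;
  name: string;
  ts_ms: number;
  snapshot: string;
  db: string;
  schema: string;
  table: string;
}

interface DebeziumEnvelope<Row = Record<string, unknown>> {
  before: Row | null;
  after: Row | null;
  op: DebeziumOp;
  ts_ms: number;
  source: DebeziumSource;
}

// Parse a raw message value. Returns null for invalid JSON, for a JSON null
// value, or when the minimum fields needed for routing are missing.
// Only `op` and `source.table` are checked; this is not full validation.
function parseEnvelope(raw: string): DebeziumEnvelope | null {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof value !== "object" || value === null) return null;

  const candidate = value as Partial<DebeziumEnvelope>;
  const op = candidate.op;
  const opIsKnown = op === "c" || op === "u" || op === "d" || op === "r";
  const source = candidate.source;
  if (!opIsKnown || typeof source !== "object" || source === null) return null;
  if (typeof source.table !== "string") return null;

  return candidate as DebeziumEnvelope;
}
```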

4.2. Operation Types

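| Code | Operation | before | after | Action |
|---|---|---|---|---|
| c | Create | null | row data | Upsert document |
| u | Update | old data | new data | Upsert document |
| d | Delete | old data | null | Delete document |
| r | Snapshot (read) | null | row data | Upsert document |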
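
The mapping in this table can be sketched as a single function. The `IndexAction` type and `planAction` name are illustrative. The "skip" branch is an assumption about how to treat events whose expected row image is missing, which the table does not cover.

```typescript
type DebeziumOp = "c" | "u" | "d" | "r";
type Row = Record<string, unknown>;

type IndexAction =
  | { kind: "upsert"; document: Row }
  | { kind: "delete"; row: Row }
  | { kind: "skip"; reason: string };

// Decide what to do with one change event.
// Note: how much of the old row appears in `before` depends on the table's
// REPLICA IDENTITY setting in PostgreSQL.
function planAction(op: DebeziumOp, before: Row | null, after: Row | null): IndexAction {
  if (op === "d") {
    // Deletes carry the removed row in `before`.
    return before
      ? { kind: "delete", row: before }
      : { kind: "skip", reason: 'delete event without a "before" image' };
  }
  // Create, update and snapshot reads all carry the current row in `after`.
  return after
    ? { kind: "upsert", document: after }
    : { kind: "skip", reason: `op "${op}" without an "after" image` };
}
```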

5. Processing Pipeline

5.1. Consumer Configuration

| Setting | Value |
|---|---|
| Consumer ID | cdc-consumer |
| Auto-commit | false (manual offset management) |
| Max Wait Time | 500ms |
| Max Bytes | 5MB |
| Fallback Mode | earliest |
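
As a rough illustration, these settings could be captured in a typed constant. The field names below are hypothetical and are not tied to a specific Kafka client library, and reading "5MB" as 5 MiB is an assumption. With auto-commit disabled, the consumer commits offsets itself; by Kafka convention the committed offset is the next offset to read, that is, the last processed offset plus one.

```typescript
type OffsetFallback = "earliest" | "latest";

interface CdcConsumerSettings {
  consumerId: string;
  autoCommit: boolean;
  maxWaitTimeMs: number;
  maxBytes: number;
  fallbackMode: OffsetFallback;
}

const CDC_CONSUMER_SETTINGS: CdcConsumerSettings = {
  consumerId: "cdc-consumer",
  autoCommit: false,            // offsets are committed manually after processing
  maxWaitTimeMs: 500,
  maxBytes: 5 * 1024 * 1024,    // assumption: "5MB" interpreted as 5 MiB
  fallbackMode: "earliest",     // used when no committed offset exists
};

// Offset to commit after a message has been fully processed.
// Sketch only: real offsets may exceed the safe integer range of `number`.
function nextCommitOffset(lastProcessedOffset: number): number {
  return lastProcessedOffset + 1;
}
```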

5.2. Batching

Messages are buffered and flushed in batches for efficient Typesense operations:

| Setting | Value |
|---|---|
| Max Batch Size | 200 messages |
| Flush Interval | 2000ms |
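
One way to combine a size limit with a time limit is a buffer that flushes when either is reached. The sketch below is not the service's implementation: the `BatchBuffer` class is hypothetical, the clock is passed in explicitly so the logic can be tested without timers, and measuring the interval from the first buffered message is an assumption.

```typescript
interface BatchLimits {
  maxSize: number;
  flushIntervalMs: number;
}

const DEFAULT_LIMITS: BatchLimits = { maxSize: 200, flushIntervalMs: 2000 };

class BatchBuffer<T> {
  private items: T[] = [];
  private firstItemAt: number | null = null;
  private readonly limits: BatchLimits;

  constructor(limits: BatchLimits = DEFAULT_LIMITS) {
    this.limits = limits;
  }

  // Buffer one message. Returns the full batch once the size limit is reached.
  add(item: T, nowMs: number): T[] | null {
    if (this.firstItemAt === null) this.firstItemAt = nowMs;
    this.items.push(item);
    return this.items.length >= this.limits.maxSize ? this.drain() : null;
  }

  // Call periodically. Returns the pending batch once the oldest buffered
  // message has waited for at least the flush interval.
  poll(nowMs: number): T[] | null {
    if (this.firstItemAt === null) return null;
    return nowMs - this.firstItemAt >= this.limits.flushIntervalMs ? this.drain() : null;
  }

  // Hand over everything buffered so far and reset the buffer.
  drain(): T[] {
    const batch = this.items;
    this.items = [];
    this.firstItemAt = null;
    return batch;
  }
}
```

A caller would send each returned batch to Typesense and commit Kafka offsets only after that write succeeds, which is consistent with auto-commit being disabled.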

5.3. Processing Flow

6. Entity Registry

The CDC service uses an extensible registry pattern to map database tables to Typesense collections:

| Table | Collection | Transform | Special Behavior |
|---|---|---|---|
| Organizer | organizers | Full document mapping | |
| Merchant | merchants | Full document mapping | |
| Category | categories | Full document mapping | |
| Device | devices | Full document mapping | |
| SaleChannel | sale-channels | Full document mapping | |
| Product | products | Full document mapping | |
| ProductInfo | products | Partial update only | Deletes ignored (see below) |

6.1. ProductInfo Special Case

ProductInfo rows are supplementary data for Product documents. When a ProductInfo row is deleted, the parent Product document should NOT be removed from the search index. Therefore:

  • Create/Update/Snapshot operations trigger partial updates to the parent Product document
  • Delete operations on ProductInfo are intentionally ignored
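
A registry of this kind, including the ProductInfo rule, might look like the following sketch. Everything here is illustrative: the `EntityConfig` shape, the `REGISTRY_SKETCH` name, and in particular the `productId` column used to locate the parent document are assumptions, not details taken from the real `CDC_ENTITY_REGISTRY`.

```typescript
type DebeziumOp = "c" | "u" | "d" | "r";
type Row = Record<string, unknown>;

interface EntityConfig {
  collection: string;
  mode: "full" | "partial";
  ignoreDeletes: boolean;
  transform: (row: Row) => Row;
}

type PlannedWrite =
  | { kind: "upsert"; collection: string; document: Row }
  | { kind: "update"; collection: string; document: Row }
  | { kind: "delete"; collection: string; row: Row }
  | { kind: "ignore"; reason: string };

// Hypothetical registry with two of the seven entities.
const REGISTRY_SKETCH: Record<string, EntityConfig> = {
  Product: {
    collection: "products",
    mode: "full",
    ignoreDeletes: false,
    transform: (row) => ({ ...row }),
  },
  ProductInfo: {
    collection: "products",
    mode: "partial",
    ignoreDeletes: true,
    // Assumption: a `productId` column identifies the parent Product document.
    transform: (row) => {
      const { productId, ...rest } = row;
      return { ...rest, id: productId };
    },
  },
};

function planWrite(table: string, op: DebeziumOp, before: Row | null, after: Row | null): PlannedWrite {
  const entity = REGISTRY_SKETCH[table];
  if (!entity) return { kind: "ignore", reason: `no registry entry for ${table}` };

  if (op === "d") {
    if (entity.ignoreDeletes) return { kind: "ignore", reason: `deletes on ${table} are ignored` };
    if (!before) return { kind: "ignore", reason: 'delete without a "before" image' };
    return { kind: "delete", collection: entity.collection, row: before };
  }

  if (!after) return { kind: "ignore", reason: `op "${op}" without an "after" image` };
  const document = entity.transform(after);
  return entity.mode === "partial"
    ? { kind: "update", collection: entity.collection, document }
    : { kind: "upsert", collection: entity.collection, document };
}
```

Keeping the delete rule as a flag on the registry entry means the special case lives in data rather than in a table-specific branch, so another supplementary table could reuse it.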

7. Dead Letter Queue (DLQ)

Failed messages are sent to a DLQ topic for investigation:

| Setting | Value |
|---|---|
| DLQ Topic | nx.seller.cdc.dlq |

DLQ messages include:

  • Original Debezium payload
  • Error message and stack trace
  • Source topic and partition
  • Timestamp of failure
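
A DLQ record carrying the four items listed above could be assembled as in this sketch. The field names and the ISO 8601 timestamp format are assumptions; only the topic name comes from the table above.

```typescript
const DLQ_TOPIC = "nx.seller.cdc.dlq";

interface DlqMessage {
  payload: string;                                   // original Debezium payload, unmodified
  error: { message: string; stack: string | null };
  source: { topic: string; partition: number };
  failedAt: string;                                  // ISO 8601 timestamp (assumed format)
}

function buildDlqMessage(
  rawPayload: string,
  error: unknown,
  topic: string,
  partition: number,
  now: Date = new Date(),
): DlqMessage {
  // Thrown values are not guaranteed to be Error instances.
  const message = error instanceof Error ? error.message : String(error);
  const stack = error instanceof Error && error.stack ? error.stack : null;
  return {
    payload: rawPayload,
    error: { message, stack },
    source: { topic, partition },
    failedAt: now.toISOString(),
  };
}
```

Storing the payload as the raw string, rather than a parsed object, keeps messages that failed because of malformed JSON intact for later inspection.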

8. Adding New CDC Entities

To add a new table to CDC sync:

  1. Debezium Connector — Add the table to the connector's table.include.list
  2. CDC Topic — Define the new topic name in @nx/search constants
  3. Typesense Collection — Create the collection schema in SearchCollections
  4. Entity Registry — Add a new entry to CDC_ENTITY_REGISTRY with the collection name and transform function
  5. Consumer — Add the topic to the consumer's subscription list
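
For step 1, `table.include.list` is a comma-separated list of fully qualified table names (`schema.table`). The sketch below shows one way to add an entry without duplicating it. The `includeTable` helper and the `public.Venue` table are hypothetical. Steps 2 to 5 depend on internal modules that are not shown in this document, so they are not sketched here.

```typescript
type ConnectorConfig = Record<string, string>;

// Return a config whose table.include.list contains `qualifiedTable`.
// The input object is returned unchanged if the table is already listed.
function includeTable(config: ConnectorConfig, qualifiedTable: string): ConnectorConfig {
  const current = (config["table.include.list"] ?? "")
    .split(",")
    .map((entry) => entry.trim())
    .filter((entry) => entry.length > 0);

  if (current.indexOf(qualifiedTable) !== -1) return config;

  return {
    ...config,
    "table.include.list": [...current, qualifiedTable].join(","),
  };
}

// includeTable({ "table.include.list": "public.Product" }, "public.Venue")
// → { "table.include.list": "public.Product,public.Venue" }
```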

9. i18n Field Handling

Typesense does not index nested JSON objects unless nested fields are explicitly enabled on the collection. For i18n fields (e.g., { en: "Coffee", vi: "Ca phe" }), the CDC transform flattens each locale into its own top-level field:

name: { en: "Coffee", vi: "Ca phe" }
  → name_en: "Coffee"
  → name_vi: "Ca phe"

This enables language-specific search and filtering in Typesense.
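
The flattening step can be sketched as a small pure function. How the real transform decides which fields are i18n fields is not described here, so the explicit `i18nFields` list is an assumption, as are the function names.

```typescript
type Doc = Record<string, unknown>;

// True for non-empty plain objects whose values are all strings,
// e.g. { en: "Coffee", vi: "Ca phe" }.
function isLocaleMap(value: unknown): value is Record<string, string> {
  if (typeof value !== "object" || value === null || Array.isArray(value)) return false;
  const record = value as Record<string, unknown>;
  const locales = Object.keys(record);
  return locales.length > 0 && locales.every((locale) => typeof record[locale] === "string");
}

// Replace each listed i18n field with one `{field}_{locale}` field per locale.
// All other fields are copied through unchanged.
function flattenI18n(doc: Doc, i18nFields: string[]): Doc {
  const out: Doc = {};
  for (const key of Object.keys(doc)) {
    const value = doc[key];
    if (i18nFields.indexOf(key) !== -1 && isLocaleMap(value)) {
      for (const locale of Object.keys(value)) {
        out[`${key}_${locale}`] = value[locale];
      }
    } else {
      out[key] = value;
    }
  }
  return out;
}

// flattenI18n({ id: "1", name: { en: "Coffee", vi: "Ca phe" } }, ["name"])
// → { id: "1", name_en: "Coffee", name_vi: "Ca phe" }
```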

| Document | Description |
|---|---|
| Kafka Architecture | Kafka topics and consumer groups |
| Search Service | Typesense integration |
| Data Layer | PostgreSQL and Typesense setup |

Proprietary and Confidential. Unauthorized copying, distribution, or use of this software is strictly prohibited.