CDC / Debezium
1. Overview
Change Data Capture (CDC) keeps the Typesense search index in sync with PostgreSQL. Debezium monitors PostgreSQL's WAL (Write-Ahead Log) and publishes row-level changes to Kafka topics. The Search service consumes these events, transforms them, and upserts/deletes documents in Typesense.
PostgreSQL (WAL) → Debezium → Kafka → Search Service → Typesense
2. Architecture
3. CDC Topic Registry
All topics follow the Debezium naming convention: {prefix}.{schema}.{table}
| Kafka Topic | PostgreSQL Table | Typesense Collection | Notes |
|---|---|---|---|
| nx.seller.public.Organizer | public.Organizer | organizers | |
| nx.seller.public.Merchant | public.Merchant | merchants | |
| nx.seller.public.Category | public.Category | categories | |
| nx.seller.public.Device | public.Device | devices | |
| nx.seller.public.SaleChannel | public.SaleChannel | sale-channels | |
| nx.seller.public.Product | public.Product | products | |
| nx.seller.public.ProductInfo | public.ProductInfo | products | Partial updates only |
Topic Prefix: nx.seller (configured in the Debezium connector)
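Because the prefix itself contains a dot (nx.seller), splitting a topic name on the first dot would give the wrong schema and table. The sketch below shows one way to recover the three parts by reading from the right. The helper name parseCdcTopic and the TopicParts type are hypothetical and not part of the Search service.

```typescript
// Hypothetical helper: splits a Debezium topic name into its parts.
interface TopicParts {
  prefix: string;
  schema: string;
  table: string;
}

function parseCdcTopic(topic: string): TopicParts {
  const segments = topic.split(".");
  if (segments.length < 3 || segments.some((s) => s.length === 0)) {
    throw new Error(`Not a {prefix}.{schema}.{table} topic: ${topic}`);
  }
  // The prefix may contain dots (e.g. "nx.seller"), so take the schema and
  // table from the right and treat everything before them as the prefix.
  const table = segments[segments.length - 1];
  const schema = segments[segments.length - 2];
  const prefix = segments.slice(0, -2).join(".");
  return { prefix, schema, table };
}
```

For example, parseCdcTopic("nx.seller.public.Product") yields the prefix nx.seller, the schema public, and the table Product.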
4. Debezium Payload Structure
4.1. Envelope Format
{
  "before": { ... },
  "after": { ... },
  "op": "c",
  "ts_ms": 1711785600000,
  "source": {
    "version": "2.x",
    "connector": "postgresql",
    "name": "nx.seller",
    "ts_ms": 1711785600000,
    "snapshot": "false",
    "db": "nx_seller",
    "schema": "public",
    "table": "Product"
  }
}
4.2. Operation Types
| Code | Operation | before | after | Action |
|---|---|---|---|---|
| c | Create | null | row data | Upsert document |
| u | Update | old data | new data | Upsert document |
| d | Delete | old data | null | Delete document |
| r | Snapshot (read) | null | row data | Upsert document |
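The table above can be read as a small dispatch on the op code. The following is a minimal sketch of that mapping, not the Search service's actual code: the type names (CdcRow, CdcEnvelope, IndexAction) and the function toIndexAction are hypothetical, and only the envelope fields needed for the decision are modeled.

```typescript
// Hypothetical, reduced model of the Debezium envelope shown in 4.1.
type CdcRow = Record<string, unknown>;

interface CdcEnvelope {
  before: CdcRow | null;
  after: CdcRow | null;
  op: "c" | "u" | "d" | "r";
}

type IndexAction =
  | { kind: "upsert"; document: CdcRow }
  | { kind: "delete"; row: CdcRow };

function toIndexAction(event: CdcEnvelope): IndexAction {
  switch (event.op) {
    case "c":
    case "u":
    case "r":
      // Create, update and snapshot events carry the new row in "after".
      if (event.after === null) {
        throw new Error(`op "${event.op}" arrived without "after"`);
      }
      return { kind: "upsert", document: event.after };
    case "d":
      // Delete events only carry the old row in "before".
      if (event.before === null) {
        throw new Error('op "d" arrived without "before"');
      }
      return { kind: "delete", row: event.before };
  }
}
```

Throwing on a missing before or after is one possible design choice; it would let such events surface as failures rather than being silently dropped.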
5. Processing Pipeline
5.1. Consumer Configuration
| Setting | Value |
|---|---|
| Consumer ID | cdc-consumer |
| Auto-commit | false (manual offset management) |
| Max Wait Time | 500ms |
| Max Bytes | 5MB |
| Fallback Mode | earliest |
5.2. Batching
Messages are buffered and flushed in batches for efficient Typesense operations:
| Setting | Value |
|---|---|
| Max Batch Size | 200 messages |
| Flush Interval | 2000ms |
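A buffer with these two limits could look roughly like the sketch below. It assumes the flush timer starts when the first message enters an empty buffer; the document does not say whether the real service uses this or a fixed repeating interval. The class name BatchBuffer and its callback are hypothetical.

```typescript
// Hypothetical buffer: flushes at maxBatchSize messages or after
// flushIntervalMs, whichever comes first.
class BatchBuffer<T> {
  private items: T[] = [];
  private timer: ReturnType<typeof setTimeout> | undefined;
  private readonly onFlush: (batch: T[]) => void;
  private readonly maxBatchSize: number;
  private readonly flushIntervalMs: number;

  constructor(onFlush: (batch: T[]) => void, maxBatchSize = 200, flushIntervalMs = 2000) {
    this.onFlush = onFlush;
    this.maxBatchSize = maxBatchSize;
    this.flushIntervalMs = flushIntervalMs;
  }

  add(item: T): void {
    this.items.push(item);
    if (this.items.length >= this.maxBatchSize) {
      this.flush();
    } else if (this.timer === undefined) {
      // Assumption: the countdown starts with the first buffered message.
      this.timer = setTimeout(() => this.flush(), this.flushIntervalMs);
    }
  }

  flush(): void {
    if (this.timer !== undefined) {
      clearTimeout(this.timer);
      this.timer = undefined;
    }
    if (this.items.length === 0) return;
    const batch = this.items;
    this.items = [];
    this.onFlush(batch);
  }
}
```

Since auto-commit is disabled, a consumer built this way would presumably commit Kafka offsets only after onFlush has succeeded, so that a crash mid-batch leads to reprocessing rather than data loss.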
5.3. Processing Flow
6. Entity Registry
The CDC service uses an extensible registry pattern to map database tables to Typesense collections:
| Table | Collection | Transform | Special Behavior |
|---|---|---|---|
| Organizer | organizers | Full document mapping | None |
| Merchant | merchants | Full document mapping | None |
| Category | categories | Full document mapping | None |
| Device | devices | Full document mapping | None |
| SaleChannel | sale-channels | Full document mapping | None |
| Product | products | Full document mapping | None |
| ProductInfo | products | Partial update only | Deletes ignored (see 6.1) |
6.1. ProductInfo Special Case
ProductInfo rows are supplementary data for Product documents. When a ProductInfo row is deleted, the parent Product document should NOT be removed from the search index. Therefore:
- Create/Update/Snapshot operations trigger partial updates to the parent Product document
- Delete operations on ProductInfo are intentionally ignored
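One way to express this rule is as per-entity flags in the registry, so the special case stays data-driven instead of being hard-coded in the consumer. The sketch below is illustrative only: the entry shape (partialUpdateOnly, ignoreDeletes), the sampleRegistry constant, and resolveAction are hypothetical, and the real CDC_ENTITY_REGISTRY may be structured differently.

```typescript
type CdcOp = "c" | "u" | "d" | "r";
type SyncAction = "upsert" | "partial-update" | "delete" | "ignore";

// Hypothetical registry entry shape.
interface EntityConfig {
  collection: string;
  partialUpdateOnly: boolean;
  ignoreDeletes: boolean;
}

// Two illustrative entries taken from the table in section 6.
const sampleRegistry: Record<string, EntityConfig> = {
  Product: { collection: "products", partialUpdateOnly: false, ignoreDeletes: false },
  ProductInfo: { collection: "products", partialUpdateOnly: true, ignoreDeletes: true },
};

function resolveAction(table: string, op: CdcOp): SyncAction {
  const entity = sampleRegistry[table];
  // Assumption: events for unregistered tables are skipped.
  if (entity === undefined) return "ignore";
  if (op === "d") return entity.ignoreDeletes ? "ignore" : "delete";
  return entity.partialUpdateOnly ? "partial-update" : "upsert";
}
```

With these entries, a delete on Product resolves to "delete", while a delete on ProductInfo resolves to "ignore" and leaves the parent Product document in the index.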
7. Dead Letter Queue (DLQ)
Failed messages are sent to a DLQ topic for investigation:
| Setting | Value |
|---|---|
| DLQ Topic | nx.seller.cdc.dlq |
DLQ messages include:
- Original Debezium payload
- Error message and stack trace
- Source topic and partition
- Timestamp of failure
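The four items above suggest a message shape along the following lines. This is a sketch under assumptions: the field names and the buildDlqMessage helper are hypothetical, and the actual DLQ schema is not specified in this document.

```typescript
// Hypothetical DLQ message shape covering the four listed items.
interface DlqMessage {
  originalPayload: string; // raw Debezium payload as received
  errorMessage: string;
  stackTrace: string | null;
  sourceTopic: string;
  sourcePartition: number;
  failedAt: string; // ISO-8601 timestamp of the failure
}

function buildDlqMessage(
  rawPayload: string,
  error: unknown,
  sourceTopic: string,
  sourcePartition: number,
  now: Date = new Date(),
): DlqMessage {
  // Normalize non-Error throwables so message and stack are always available.
  const err = error instanceof Error ? error : new Error(String(error));
  return {
    originalPayload: rawPayload,
    errorMessage: err.message,
    stackTrace: err.stack ?? null,
    sourceTopic,
    sourcePartition,
    failedAt: now.toISOString(),
  };
}
```

Keeping the original payload as an unparsed string means a message that failed because it could not be parsed can still be forwarded intact.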
8. Adding New CDC Entities
To add a new table to CDC sync:
- Debezium Connector: Add the table to the connector's table.include.list
- CDC Topic: Define the new topic name in @nx/search constants
- Typesense Collection: Create the collection schema in SearchCollections
- Entity Registry: Add a new entry to CDC_ENTITY_REGISTRY with the collection name and transform function
- Consumer: Add the topic to the consumer's subscription list
9. i18n Field Handling
By default, Typesense does not index nested JSON objects (nested-field support has to be enabled per collection). For i18n fields (e.g., { en: "Coffee", vi: "Ca phe" }), the CDC transform therefore flattens them into separate top-level fields:
name: { en: "Coffee", vi: "Ca phe" }
  → name_en: "Coffee"
  → name_vi: "Ca phe"
This enables language-specific search and filtering in Typesense.
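The flattening step can be sketched as a small pure function. The function name flattenI18nFields and the idea of passing the list of i18n field names explicitly are assumptions made for illustration; the real transform may detect or configure these fields differently.

```typescript
type SearchDoc = Record<string, unknown>;

// Hypothetical transform: expands each listed i18n field into
// "<field>_<locale>" keys and copies every other field unchanged.
function flattenI18nFields(row: SearchDoc, i18nFields: string[]): SearchDoc {
  const flat: SearchDoc = {};
  for (const [key, value] of Object.entries(row)) {
    const isLocaleMap =
      i18nFields.includes(key) &&
      value !== null &&
      typeof value === "object" &&
      !Array.isArray(value);
    if (isLocaleMap) {
      for (const [locale, text] of Object.entries(value as Record<string, unknown>)) {
        flat[`${key}_${locale}`] = text;
      }
    } else {
      flat[key] = value;
    }
  }
  return flat;
}
```

Calling flattenI18nFields({ id: "p1", name: { en: "Coffee", vi: "Ca phe" } }, ["name"]) produces a document with id, name_en, and name_vi, and no nested name field.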
10. Related Documentation
| Document | Description |
|---|---|
| Kafka Architecture | Kafka topics and consumer groups |
| Search Service | Typesense integration |
| Data Layer | PostgreSQL and Typesense setup |