ADR-0002. CDC-driven denormalized Typesense index with cascade fan-out
| Field | Value |
|---|---|
| Status | Accepted |
| Date | 2026-02-15 |
| Deciders | @search-team |
| Supersedes | — |
Context
- Source data is normalized across many tables in three schemas (`public`, `pricing`, `inventory`); a product document needs fields from Product, ProductInfo, ProductCategory, MetaLink, FareSet/Fare, ProductBundler.
- Typesense documents must be flat/denormalized for fast keyword + facet queries; `reference` fields only support a single scalar FK.
- A Debezium → Kafka CDC pipeline already streams row changes; search should consume it rather than poll or be written to synchronously.
- Some source tables are doc principals (Product → `products`); others are join/related tables that only modify a principal's fields (ProductCategory affects `categoryIds`).
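
To make the principal versus join-table distinction concrete, here is a minimal TypeScript sketch of how a flat `products` document might be assembled from normalized rows. The row shapes, column names, and the `buildProductDoc` helper are hypothetical and are not taken from the codebase; only Product, ProductCategory, `products`, and `categoryIds` come from this ADR.

```typescript
// Hypothetical row shapes: the real column names are not given in this ADR.
interface ProductRow {
  id: string;
  name: string;
}

interface ProductCategoryRow {
  product_id: string;
  category_id: string;
}

// Flat document as it might be stored in the `products` collection.
interface ProductDoc {
  id: string;
  name: string;
  categoryIds: string[];
}

// The principal row supplies the document identity and its own fields;
// join rows only contribute a derived field (here, categoryIds).
function buildProductDoc(
  product: ProductRow,
  links: ProductCategoryRow[]
): ProductDoc {
  const categoryIds = links
    .filter((link) => link.product_id === product.id)
    .map((link) => link.category_id)
    .sort(); // stable order so recomputing the same rows yields the same document
  return { id: product.id, name: product.name, categoryIds };
}
```

In this sketch a change to a ProductCategory row never creates or deletes a document; it only changes what `categoryIds` evaluates to for the affected product.
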
Decision
We will maintain denormalized Typesense documents fed by CDC. `TableToCollectionMap` routes direct sources to a collection via entity mappers. Cascade-only sources (join/related tables) are fanned out by `CDCCascadeService`, which recomputes the affected fields on the principal documents, while `CDCEnrichmentService` hydrates cross-collection data. Out-of-order safety comes from LSN/version guards (`source_lsn`, `deleted_at`).
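
The routing and guard behaviour described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual implementation: the `Route` type, the `routes` table, the `ChangeEvent` shape, and the `handle` function are all hypothetical, and the real `TableToCollectionMap` may be shaped differently. The field names `source_lsn` and `deleted_at` come from this ADR, but their exact semantics here (strictly-greater LSN wins, deletes kept as tombstones) are assumed.

```typescript
// Hypothetical routing shape: a table either owns a collection (direct)
// or only affects fields on a principal's documents (cascade).
type Route =
  | { kind: "direct"; collection: string }
  | { kind: "cascade"; principal: string; fields: string[] };

const routes: Record<string, Route> = {
  Product: { kind: "direct", collection: "products" },
  ProductCategory: { kind: "cascade", principal: "products", fields: ["categoryIds"] },
};

// Guard state kept per document. LSNs are plain numbers here for brevity;
// real LSNs are 64-bit and may need a wider type.
interface GuardedDoc {
  source_lsn: number;
  deleted_at: number | null;
}

interface ChangeEvent {
  table: string;
  docId: string; // id of the principal document this event touches
  lsn: number;
  op: "upsert" | "delete";
  ts: number;
}

// Returns a description of the action taken, so the sketch is easy to inspect.
function handle(store: Record<string, GuardedDoc>, ev: ChangeEvent): string {
  const route = routes[ev.table];
  if (route === undefined) return "ignored"; // table is not indexed

  if (route.kind === "direct") {
    const current = store[ev.docId];
    // Guard: skip anything not strictly newer than what was already applied,
    // which makes replays and late-arriving events no-ops.
    if (current !== undefined && ev.lsn <= current.source_lsn) return "skipped";
    const deleted = ev.op === "delete";
    // Assumed: deletes leave a tombstone so an older upsert cannot resurrect the doc.
    store[ev.docId] = { source_lsn: ev.lsn, deleted_at: deleted ? ev.ts : null };
    return (deleted ? "delete " : "upsert ") + route.collection;
  }

  // Cascade: recompute only the dependent fields on the principal document.
  // How cascade recomputes interact with the guard is not specified in this ADR.
  return "recompute " + route.principal + "." + route.fields.join(",");
}
```

With this shape, replaying the same Product event twice yields one `upsert` followed by `skipped`, and an upsert whose LSN is lower than an already-applied delete is dropped rather than re-creating the document.
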
Consequences
| Pros | Cons |
|---|---|
| Fast flat queries + facets without runtime joins | Eventual consistency (CDC delay) vs the source DB |
| Real-time-ish sync without synchronous writes from sources | Cascade logic must know every dependent field per principal |
| LSN guards make replays idempotent and out-of-order events safe to skip | Denormalized fields (e.g. parent names, flags) can drift and need backfill/reindex |
| Single Kafka consumer covers 18 topics across 3 schemas | Coupling to Debezium topic naming + connector config |
Alternatives Considered
| Option | Pros | Cons | Why rejected |
|---|---|---|---|
| Query-time joins in Typesense / app | Always fresh | Slow; reference limited to single scalar FK | Defeats the point of a search index |
| Synchronous dual-write from source services | Immediate consistency | Couples every writer to Typesense; partial-failure risk | Fragile; spreads search concerns everywhere |
| Periodic full reindex from DB | Simple | Stale between runs; heavy load | Not real-time enough for POS |
References
- `packages/search/src/common/kafka-topics.ts`: `TableToCollectionMap`, `ALL_CDC_TOPICS`
- API Events
- Domain Model