
ADR-0002. CDC-driven denormalized Typesense index with cascade fan-out

| Field | Value |
|---|---|
| Status | Accepted |
| Date | 2026-02-15 |
| Deciders | @search-team |
| Supersedes | |

Context

  • Source data is normalized across many tables in three schemas (public, pricing, inventory); a product document needs fields from Product, ProductInfo, ProductCategory, MetaLink, FareSet/Fare, ProductBundler.
  • Typesense documents must be flat/denormalized for fast keyword + facet queries; reference fields only support a single scalar FK.
  • A Debezium → Kafka CDC pipeline already streams row changes; search should consume that stream rather than poll the source DB or receive synchronous writes from it.
  • Some source tables are doc principals (Product → products); others are join/related tables that only modify a principal's fields (ProductCategory affects categoryIds).
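
To make the principal vs. related-table distinction concrete, here is a minimal sketch of how a flat product document might be assembled. Only Product, ProductCategory, and categoryIds come from this ADR; the row shapes (`id`, `name`, `product_id`, `category_id`) and both function names are assumptions for illustration, not the actual mappers.

```python
from typing import Any, Dict, Iterable, List


def recompute_category_ids(
    product_id: int, join_rows: Iterable[Dict[str, int]]
) -> List[int]:
    """Collect the category ids linked to one product, de-duplicated and sorted."""
    return sorted(
        {row["category_id"] for row in join_rows if row["product_id"] == product_id}
    )


def build_product_document(
    product: Dict[str, Any], join_rows: Iterable[Dict[str, int]]
) -> Dict[str, Any]:
    """Flatten a principal row plus its join rows into one search document.

    The Product row is the principal: it decides that the document exists.
    ProductCategory rows never create a document; they only feed categoryIds.
    """
    return {
        "id": str(product["id"]),  # Typesense document ids are strings
        "name": product["name"],   # hypothetical column
        "categoryIds": recompute_category_ids(product["id"], join_rows),
    }
```

A change to a ProductCategory row would then mean calling `recompute_category_ids` again for the affected product and updating only that field on the existing document.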

Decision

We will maintain denormalized Typesense documents fed by CDC. TableToCollectionMap routes each direct source table to a collection through an entity mapper. Cascade-only sources (join/related tables) are fanned out by CDCCascadeService, which recomputes the affected fields on the principal documents, while CDCEnrichmentService hydrates cross-collection data. Out-of-order safety comes from LSN/version guards (source_lsn, deleted_at).
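
The routing and guard described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the real services: the ADR names TableToCollectionMap but does not show its shape, so the two dictionaries, `route`, and `apply_change` are hypothetical, and an in-memory dict stands in for a Typesense collection. The guard semantics are also an assumption: an event is applied only when its LSN is strictly newer than the stored source_lsn, and a delete is kept as a tombstone carrying deleted_at so that a late, older upsert cannot resurrect the document.

```python
from typing import Any, Dict, Optional, Tuple

# Assumed shape: direct sources map a table to its own collection.
TABLE_TO_COLLECTION: Dict[str, str] = {"Product": "products"}
# Assumed shape: cascade-only sources map to (principal collection, field).
CASCADE_SOURCES: Dict[str, Tuple[str, str]] = {
    "ProductCategory": ("products", "categoryIds"),
}


def route(table: str) -> Tuple[str, Optional[str]]:
    """Classify a source table as 'direct', 'cascade', or 'ignored'."""
    if table in TABLE_TO_COLLECTION:
        return "direct", TABLE_TO_COLLECTION[table]
    if table in CASCADE_SOURCES:
        return "cascade", CASCADE_SOURCES[table][0]
    return "ignored", None


def apply_change(
    store: Dict[str, Dict[str, Any]],
    doc_id: str,
    fields: Dict[str, Any],
    source_lsn: int,
    deleted_at: Optional[int] = None,
) -> bool:
    """Apply a change only if it is newer than what the store already holds.

    Returns True when the change was applied, False when it was skipped
    as a replay (same LSN) or an out-of-order event (older LSN).
    """
    current = store.get(doc_id)
    if current is not None and source_lsn <= current["source_lsn"]:
        return False
    doc = dict(current) if current is not None else {}
    doc.update(fields)
    doc["id"] = doc_id
    doc["source_lsn"] = source_lsn
    doc["deleted_at"] = deleted_at  # None means the document is live
    store[doc_id] = doc
    return True
```

Because the comparison is strict, replaying the same event is a no-op, which is what makes reprocessing a Kafka partition safe in this sketch.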

Consequences

| Pros | Cons |
|---|---|
| Fast flat queries + facets without runtime joins | Eventual consistency (CDC delay) vs the source DB |
| Real-time-ish sync without synchronous writes from sources | Cascade logic must know every dependent field per principal |
| LSN guards make replays/out-of-order events idempotent | Denormalized fields (e.g. parent names, flags) can drift and need backfill/reindex |
| Single Kafka consumer covers 18 topics across 3 schemas | Coupling to Debezium topic naming + connector config |
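
The coupling to topic naming can be illustrated with a small parser. The Debezium PostgreSQL connector names change-event topics `<topic prefix>.<schema>.<table>` by default; the prefix `pos` used in the usage note and the function name `parse_topic` are assumptions, and a connector that rewrites topic names (for example with a routing transform) would break this sketch, which is exactly the coupling noted as a con.

```python
from typing import Tuple


def parse_topic(topic: str) -> Tuple[str, str]:
    """Split a Debezium-style topic name into (schema, table).

    Splits from the right so that a prefix containing dots is tolerated.
    Raises ValueError when the name does not have at least three parts.
    """
    parts = topic.rsplit(".", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"unexpected topic name: {topic!r}")
    _prefix, schema, table = parts
    return schema, table
```

With the assumed prefix, `parse_topic("pos.pricing.Fare")` yields the schema and table that a router could then look up.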

Alternatives Considered

| Option | Pros | Cons | Why rejected |
|---|---|---|---|
| Query-time joins in Typesense / app | Always fresh | Slow; reference limited to single scalar FK | Defeats the point of a search index |
| Synchronous dual-write from source services | Immediate consistency | Couples every writer to Typesense; partial-failure risk | Fragile; spreads search concerns everywhere |
| Periodic full reindex from DB | Simple | Stale between runs; heavy load | Not real-time enough for POS |

References

Proprietary and Confidential. Unauthorized copying, distribution, or use of this software is strictly prohibited.