Data Sources
This page documents where each ledger form's data is fetched from, what every field means at the source level, and non-obvious behaviors you must account for when working on fetchers or debugging output. It complements the rendering reference in HKD Templates.
Each fetcher implements AbstractLedgerDataFetcherService.fetch() and is registered in LedgerDataFetcherService via the FETCHER_MAP keyed by TLedgerIdentifierCode.
S1a-HKD
Form title: SỔ DOANH THU BÁN HÀNG, DỊCH VỤ Legal basis: Thông tư 152/2025/TT-BTC Fetcher: S1aHkdDataFetcherService (packages/ledger/src/services/fetchers/s1a-hkd-data-fetcher.service.ts) Data status: Real (live DB query)
Overview
S1a is the primary revenue ledger — one row per completed SaleOrder within the period. It does not break down tax; it records gross revenue totals only. This form applies to any HKD merchant regardless of tax method.
The fetch runs two parallel queries: merchant header data (from TaxInfo) and completed orders (from SaleOrder).
Quick Reference — Field Meanings
Merchant Header Fields
These fields are shared across all forms and are resolved from TaxInfo joined with VnWard and VnProvince.
| Output Field | Source | Notes |
|---|---|---|
businessName | TaxInfo.fullName | Falls back to Merchant.name (default → vi → en) if fullName is empty |
businessAddress | Composed | [TaxInfo.addressLine, VnWard.fullName, VnProvince.fullName].join(', ') when ward and province exist; otherwise TaxInfo.fullAddress or TaxInfo.addressLine |
businessTaxCode | TaxInfo.taxCode | Vietnam MST number |
currentDay/Month/Year | dayjs() at generation time | Signature date on the printed form — not the period end date |
periodDescription | Computed from Ledger.period | Vietnamese label, e.g. Quý 1 năm 2026, Tháng 5 năm 2026 |
title | Constant 'SỔ DOANH THU BÁN HÀNG, DỊCH VỤ' | Hardcoded in the fetcher |
Entry Fields (one row per SaleOrder)
Sourced from SaleOrderRepository.findCompletedInPeriod() — selects only orderNumber, total, completedAt from SaleOrder.
| Output Field | Source Column | Type | Notes |
|---|---|---|---|
entries[].code | SaleOrder.orderNumber | string | The human-readable order code shown on the receipt |
entries[].transDate | SaleOrder.completedAt | string (ISO) | When the order was completed, not created |
entries[].description | Constant 'Thanh toán giao dịch' | string | Hardcoded — no per-order description is stored |
entries[].amount | SaleOrder.total | number | Number(order.total) — total is stored as numeric in Postgres, returned as string by Drizzle |
Aggregate Fields
| Output Field | Source | Notes |
|---|---|---|
totalRevenue | entries.reduce((sum, e) => sum + e.amount, 0) | Sum of all SaleOrder.total values in the period |
Resolution Pipeline
periodStart/periodEnd
→ _getStartAndEndOfPeriod() [start of day → exclusive end of next day]
→ SaleOrder WHERE status=COMPLETED
AND completedAt >= start
AND completedAt < end
AND merchantId = merchantId
AND deletedAt IS NULL
→ entries[] (one entry per order, ordered by completedAt ASC)Period bounds are normalized at the fetcher: periodStart is set to 00:00:00 of the start date, periodEnd is set to 00:00:00 of the day after the end date (exclusive upper bound). This avoids time-zone edge cases.
Notices
| # | Issue | Detail |
|---|---|---|
| 1 | total is a string from Drizzle | Postgres numeric columns are returned as strings. The fetcher wraps with Number(order.total). If a value fails numeric coercion, entries[].amount will be NaN silently — this would surface at Zod validation. |
| 2 | completedAt is the order completion time, not payment time | An order may have been placed on day N-1 but completed (payment confirmed) on day N. The period filter uses completedAt, so split-day orders land on their completion date. |
| 3 | No tax breakdown | S1a records gross revenue only. Tax detail goes in S2a. Do not confuse SaleOrder.total (which may include VAT) with the taxable base. |
| 4 | Soft-deleted orders are excluded | findCompletedInPeriod explicitly adds isNull(SaleOrderSchema.deletedAt). Cancelled and soft-deleted orders do not appear. |
S2a-HKD
Form title: SỔ DOANH THU BÁN HÀNG HOÁ, DỊCH VỤ Legal basis: Thông tư 152/2025/TT-BTC, applies to merchants on the DIRECT tax method (khai theo tỷ lệ % trên doanh thu) Fetcher: S2aHkdDataFetcherService (packages/ledger/src/services/fetchers/s2a-hkd-data-fetcher.service.ts) Data status: Real (live DB query)
Overview
S2a breaks revenue down by tax sector (ngành nghề) — each sector gets its own column group showing revenue, VAT, and PIT. The sectors are discovered dynamically from the orders in the period; there is no fixed list. The form requires the merchant to be on the DIRECT tax method — the fetcher validates this before generating.
The fetch runs three phases:
- Fetch items (one row per
SaleOrderItem) and merchant header in parallel - Resolve each item's
taxSetIdto aTaxGroup(the sector) - Aggregate items into per-order entries and per-sector totals
Quick Reference — Field Meanings
Merchant Header Fields
Same source as S1a — see Merchant Header Fields above.
The periodDescription for S2a uses the same buildVietnamesePeriodDescription(period) utility.
Raw Query — What findCompletedItemsInPeriod Returns
This query joins SaleOrder ⨝ SaleOrderItem (INNER JOIN), so it returns one row per item, not per order. A single order with 3 items produces 3 rows, each with the same orderNumber and completedAt but a different priceMetadata.
| Column | Source | Type |
|---|---|---|
orderNumber | SaleOrder.orderNumber | string |
completedAt | SaleOrder.completedAt | string (ISO) |
priceMetadata | SaleOrderItem.priceMetadata | TPriceMetadata | null (JSONB) |
priceMetadata JSONB Structure
SaleOrderItem.priceMetadata is a frozen snapshot written at checkout time. The fields relevant to S2a are:
| JSONB Path | Type | Meaning |
|---|---|---|
priceMetadata.pricing.taxSetId | string | undefined | ID of the pricing.TaxSet row that was active for this item at checkout — the anchor for sector resolution |
priceMetadata.pricing.appliedTaxes[] | array | All taxes applied to this item at checkout — frozen; later tax changes do not affect these values |
appliedTaxes[].amount | string (decimal) | Tax amount in VND — stored as a decimal string, must be wrapped with Number() before arithmetic |
appliedTaxes[].taxableBase | string (decimal) | The revenue amount on which this tax was calculated — also a decimal string |
appliedTaxes[].isVat | boolean | true = GTGT (VAT), false = TNCN (PIT/personal income tax) |
appliedTaxes[].taxId | string | ID of the Tax row — not used for sector resolution (see notice below) |
Sector (Tax Group) Fields
After resolution, each sector gets a key used throughout the three aggregation maps:
| Concept | Value | Description |
|---|---|---|
groupKey | TaxGroup.id (UUID) | The unique key for a resolved sector |
groupKey | 'other' | Fallback key for items that could not be resolved to any TaxGroup |
groupName | TaxGroup.name.vi (or en / default) | Human-readable sector name shown in the column header |
label | Computed rate string | Rate annotation in the column header — see Rate Derivation below |
Output — taxGroupChunks
Sectors are sorted (named groups first, 'other' always last) then split into chunks of GROUPS_PER_CHUNK = 3 for horizontal PDF layout. Each chunk becomes one landscape page.
| Output Field | Type | Description |
|---|---|---|
taxGroupChunks | ITaxGroup[][] | 2-D array — outer = pages, inner = sectors on that page |
taxGroupChunks[i][j].key | string | groupKey (UUID or 'other') |
taxGroupChunks[i][j].groupName | string | Sector display name |
taxGroupChunks[i][j].label | string | Rate annotation (empty string if no annotation needed) |
taxGroupChunks[i][j].totalRevenue | number | Period total taxable revenue for this sector |
taxGroupChunks[i][j].totalVat | number | Period total VAT for this sector |
taxGroupChunks[i][j].totalPit | number | Period total PIT for this sector |
Output — entries
One entry per unique orderNumber, regardless of how many items it contained.
| Output Field | Type | Description |
|---|---|---|
entries[].code | string | SaleOrder.orderNumber |
entries[].transDate | string | SaleOrder.completedAt (ISO) |
entries[].description | string | Constant 'Thanh toán giao dịch' |
entries[].taxValues | Record<groupKey, TSectorTotal> | Per-sector totals for this order only; keys are groupKey values (UUID or 'other'); missing key = order had no items in that sector |
Sector Resolution Chain
Each SaleOrderItem is mapped to exactly one sector through a three-step lookup:
Why taxSetId and not appliedTaxes[].taxId:
taxSetIdis a single value on the item's pricing snapshot — one lookup per unique set of tax IDs used for this itemtaxIdis per applied-tax entry: an item with two applied taxes has twotaxIdvalues, requiring three hops (Tax → TaxGroupItem → TaxGroup) eachtaxSetIdrequires only two hops and is always item-scoped, not tax-entry-scoped
Both lookups are batched — all unique taxSetId values across all items in the period are resolved in two parallel queries (TaxSet then TaxGroup).
Rate Derivation & Label Logic
The effective tax rate for a sector is derived from the observed appliedTaxes data — not from any stored rate in the catalog. This is by design: TaxProvisioningService freezes the rate onto Tax.value at provisioning time, and later edits do not propagate. The snapshot is the source of truth.
How rates are tracked:
For each appliedTax entry, the fetcher computes rate = amount / taxableBase and records min and max per (groupKey, isVat):
if first observation → { min: rate, max: rate }
else → { min: Math.min(cur.min, rate), max: Math.max(cur.max, rate) }How labels are built (_buildRateLabel):
| Condition | Label output |
|---|---|
| Named sector, uniform rate | '' (empty) — the sector name already encodes the rate |
| Named sector, mixed VAT rate (min ≠ max) | 'VAT 8.0%–10.0%' |
| Named sector, mixed PIT rate | 'TNCN 0.3%–0.5%' |
| Named sector, both mixed | 'VAT 8.0%–10.0% - TNCN 0.3%–0.5%' |
'other' sector, VAT present | 'VAT 10.0%' (always shown — the name carries no rate information) |
'other' sector, both present | 'VAT 10.0% - TNCN 0.5%' |
The threshold for "mixed" is Math.abs(max - min) >= 1e-9 (epsilon guard for floating-point drift).
Notices
| # | Issue | Detail |
|---|---|---|
| 1 | Soft-deleted TaxSet → silent fallback to "Khác" | TaxSetRepository extends SoftDeletableRepository, which filters WHERE deleted_at IS NULL. If a product's pricing config was updated after an order was placed, the old TaxSet may have been soft-deleted. The priceMetadata snapshot still holds the original taxSetId, but the repository query silently excludes it — so all items referencing that TaxSet fall to 'other'. The raw SQL will find the record; the ORM will not. |
| 2 | appliedTaxes amounts are decimal strings | amount and taxableBase are stored as Postgres numeric and returned by Drizzle as strings. Always wrap with Number() before arithmetic. Failure to do so results in string concatenation rather than addition — '100' + '50' = '10050'. |
| 3 | Rate variation within a period is real | Two items in the same TaxGroup can legitimately have different amount/taxableBase ratios. This happens when the tax rate was changed between provisioning events (each product provisioning freezes the rate at that moment), or when inclusive vs. exclusive tax calculation was applied differently. The rate range label (VAT 8.0%–10.0%) is how this surfaces to the end user. |
| 4 | One SaleOrder can span multiple tax sectors | Resolution is item-level, not order-level. An order with two items belonging to different TaxGroups contributes to two sector columns. In entries[].taxValues, that order's entry will have two keys. |
| 5 | GROUPS_PER_CHUNK = 3 controls pagination | More than 3 sectors in a period creates additional landscape PDF pages. The XLSX flattens all sectors into a single sheet regardless of chunk count. |
| 6 | 'other' is always the last sector | The groups loop defers the "other" bucket and appends it after all named sectors, regardless of Map insertion order. This is intentional — named sectors should be visually stable across period regenerations. |
| 7 | Tax method guard | validate() throws MERCHANT_TAX_METHOD_NOT_DIRECT if Merchant.taxMethod !== 'DIRECT'. This is enforced before fetch — the generation job is rejected at the validation step, not mid-fetch. |
S2b–S2e-HKD
These forms are currently served from fixture files in resources/fixtures/ and do not have live fetchers. Data source documentation will be added when real fetchers are implemented.
Related Pages
- HKD Templates — rendering shapes, PDF/XLSX structure
- Generation Pipeline — where fetchers fit in the worker flow
- Domain Model —
Ledger,LedgerJob,LedgerSnapshottables