Skip to main content

Transaction History Data Specification


For sales analytics, elasticity modeling, and retail intelligence
Check the sample sample_transactions

1. Objective

Use this dataset for:
  • Sales performance analytics
  • Price elasticity modeling
  • Demand curve estimation
  • Price optimization
  • Promotion impact analysis
  • Store performance benchmarking
You must provide data at transaction line level.

2. Data granularity requirement

Each row must represent:
One product in one transaction
Correct:
transaction_idproduct_idquantity
TRX_001P0012
TRX_001P0021
Incorrect:
  • Daily aggregated totals
  • Store-level summaries
  • Pivoted exports

3. Transaction History – Unified Data Specification

Field NameRequiredTypeFormatDescriptionExampleAlternative Names
transaction_id✅ YesstringUniqueUnique identifier of the transaction. Must be stable and not reused.TRX_000123order id, receipt id, invoice id, ticket id
store_id✅ YesstringTextIdentifier of store or sales location.S001location id, branch id, outlet id
product_id✅ YesstringTextProduct identifier. Must match product_id in Product Master file.P001item id, sku id, article id
transaction_timestamp✅ YesdatetimeYYYY-MM-DD HH:MM:SSDate and time of transaction in ISO 8601 format. Preferably UTC.2026-02-18 14:22:05sale timestamp, order datetime, event time
quantity✅ Yesdecimal(12,2) or integerPositive for sales, negative for returnsUnits sold (positive) or returned (negative). Must not be zero.3 / -1units, qty, quantity sold, units sold
selling_price✅ Yesdecimal(12,2)Dot separatorActual price paid per unit in that transaction (after discount).129.99unit price, final price, transaction price
regular_price⬜ Optionaldecimal(12,2)Dot separatorStandard non-discounted product price at time of sale.139.99base price, list price, shelf price
unit_cost⬜ Optionaldecimal(12,2)2 decimalsCost per unit at time of transaction (for margin calculation).80.00cost price, cogs unit, purchase cost
discount_rate⬜ Optionaldecimal(5,4)0–1Discount applied as a rate (0.15 = 15%).0.15discount pct, promo rate
sales_value⬜ Optionaldecimal(12,2)Dot separatorTotal transaction value per row. If provided, must equal quantity × selling_price.389.97revenue, line revenue, gross sales
gross_profit⬜ Optionaldecimal(12,2)Dot separatorGross profit per row. If provided, must equal (selling_price - unit_cost) × quantity.149.97margin value, contribution margin
stock_after_sale⬜ Optionalinteger≥ 0Inventory level after transaction occurred.37stock on_hand, inventory after
currency⬜ Optional*string(3)ISO 4217Currency code used for price fields. Required if multi-currency.EURcurrency code
tax_rate⬜ Optionaldecimal(5,4)0–1VAT/tax rate applied (0.20 = 20%).0.20vat rate, tax percentage
transaction_type⬜ Optionalstringsale / returnType of transaction. If return, quantity must be negative.saleorder type, operation type

4. Return handling policy (mandatory if returns exist)

Returns must be provided as separate transaction rows.

4.1 Format for Returns

FieldValue
transaction_typereturn
quantityNegative value
selling_pricePositive
transaction_timestampActual return time
product_idSame as original sale

4.2 Example

Original Sale
transaction_idproduct_idquantityselling_pricetransaction_type
TRX_1001P0012129.99sale
Return
transaction_idproduct_idquantityselling_pricetransaction_type
TRX_1055P001-1129.99return

4.3 Important Rules

  • Do NOT delete original sale rows
  • Do NOT aggregate net values before export
  • Do NOT use positive quantity for returns
  • unit_price must remain positive
  • Returns must reflect the actual return date

5. Data format requirements (strict)

5.1 Timestamp

Must follow ISO 8601:
YYYY-MM-DD HH:MM:SS
Example:
2026-02-18 14:22:05
Timezone:
  • Preferably UTC
  • Or timezone must be clearly specified

5.2 Numeric Fields

  • Decimal separator: .
  • No thousand separators
  • No currency symbols
  • No localized formats
Correct:
129.99
Incorrect:
129,99
1 299.99
$129.99

5.3 Discount Format

Must be a rate:
0.15  = 15%
0.00  = no discount
Not:
15%
-20

6. File format

Accepted:
  • CSV (UTF-8 encoding) — preferred
  • XLSX
No pivot tables
No formulas
Raw data only

7. Historical data requirements

Minimum:
  • 6–12 months of history
  • Preferably 24 months
Recommended minimum volume:
  • 10,000+ transaction rows per store

8. Data validation rules

  • quantity ≠ 0
  • quantity > 0 for sales
  • quantity < 0 for returns
  • unit_price ≥ 0
  • discount_rate between 0 and 1
  • transaction_timestamp not null
  • product_id must exist in product master file

9. SQL schema example

CREATE TABLE transaction_history (
    transaction_id VARCHAR(64) NOT NULL,
    store_id VARCHAR(32) NOT NULL,
    product_id VARCHAR(64) NOT NULL,
    transaction_timestamp TIMESTAMP NOT NULL,
    quantity INT NOT NULL,
    unit_price DECIMAL(12,2) NOT NULL,
    unit_cost DECIMAL(12,2),
    discount_rate DECIMAL(5,4),
    stock_after_sale INT,
    currency CHAR(3),
    transaction_type VARCHAR(16),
    sales_channel VARCHAR(16)
);

10. Minimal dataset (if system limitations exist)

If the system cannot provide full structure, minimum required:
transaction_id
store_id
product_id
transaction_timestamp
quantity
selling_price
Returns must still follow negative quantity logic.