Transaction History Data Specification
For sales analytics, elasticity modeling, and retail intelligence Check the sample sample_transactions
1. Objective
Use this dataset for:- Sales performance analytics
- Price elasticity modeling
- Demand curve estimation
- Price optimization
- Promotion impact analysis
- Store performance benchmarking
2. Data granularity requirement
Each row must represent:One product in one transactionCorrect:
| transaction_id | product_id | quantity |
|---|---|---|
| TRX_001 | P001 | 2 |
| TRX_001 | P002 | 1 |
- Daily aggregated totals
- Store-level summaries
- Pivoted exports
3. Transaction History – Unified Data Specification
| Field Name | Required | Type | Format | Description | Example | Alternative Names |
|---|---|---|---|---|---|---|
| transaction_id | ✅ Yes | string | Unique | Unique identifier of the transaction. Must be stable and not reused. | TRX_000123 | order id, receipt id, invoice id, ticket id |
| store_id | ✅ Yes | string | Text | Identifier of store or sales location. | S001 | location id, branch id, outlet id |
| product_id | ✅ Yes | string | Text | Product identifier. Must match product_id in Product Master file. | P001 | item id, sku id, article id |
| transaction_timestamp | ✅ Yes | datetime | YYYY-MM-DD HH:MM:SS | Date and time of transaction in ISO 8601 format. Preferably UTC. | 2026-02-18 14:22:05 | sale timestamp, order datetime, event time |
| quantity | ✅ Yes | decimal(12,2) or integer | Positive for sales, negative for returns | Units sold (positive) or returned (negative). Must not be zero. | 3 / -1 | units, qty, quantity sold, units sold |
| selling_price | ✅ Yes | decimal(12,2) | Dot separator | Actual price paid per unit in that transaction (after discount). | 129.99 | unit price, final price, transaction price |
| regular_price | ⬜ Optional | decimal(12,2) | Dot separator | Standard non-discounted product price at time of sale. | 139.99 | base price, list price, shelf price |
| unit_cost | ⬜ Optional | decimal(12,2) | 2 decimals | Cost per unit at time of transaction (for margin calculation). | 80.00 | cost price, cogs unit, purchase cost |
| discount_rate | ⬜ Optional | decimal(5,4) | 0–1 | Discount applied as a rate (0.15 = 15%). | 0.15 | discount pct, promo rate |
| sales_value | ⬜ Optional | decimal(12,2) | Dot separator | Total transaction value per row. If provided, must equal quantity × selling_price. | 389.97 | revenue, line revenue, gross sales |
| gross_profit | ⬜ Optional | decimal(12,2) | Dot separator | Gross profit per row. If provided, must equal (selling_price - unit_cost) × quantity. | 149.97 | margin value, contribution margin |
| stock_after_sale | ⬜ Optional | integer | ≥ 0 | Inventory level after transaction occurred. | 37 | stock on_hand, inventory after |
| currency | ⬜ Optional* | string(3) | ISO 4217 | Currency code used for price fields. Required if multi-currency. | EUR | currency code |
| tax_rate | ⬜ Optional | decimal(5,4) | 0–1 | VAT/tax rate applied (0.20 = 20%). | 0.20 | vat rate, tax percentage |
| transaction_type | ⬜ Optional | string | sale / return | Type of transaction. If return, quantity must be negative. | sale | order type, operation type |
4. Return handling policy (mandatory if returns exist)
Returns must be provided as separate transaction rows.4.1 Format for Returns
| Field | Value |
|---|---|
| transaction_type | return |
| quantity | Negative value |
| selling_price | Positive |
| transaction_timestamp | Actual return time |
| product_id | Same as original sale |
4.2 Example
Original Sale| transaction_id | product_id | quantity | selling_price | transaction_type |
|---|---|---|---|---|
| TRX_1001 | P001 | 2 | 129.99 | sale |
| transaction_id | product_id | quantity | selling_price | transaction_type |
|---|---|---|---|---|
| TRX_1055 | P001 | -1 | 129.99 | return |
4.3 Important Rules
- Do NOT delete original sale rows
- Do NOT aggregate net values before export
- Do NOT use positive quantity for returns
- unit_price must remain positive
- Returns must reflect the actual return date
5. Data format requirements (strict)
5.1 Timestamp
Must follow ISO 8601:- Preferably UTC
- Or timezone must be clearly specified
5.2 Numeric Fields
- Decimal separator:
. - No thousand separators
- No currency symbols
- No localized formats
5.3 Discount Format
Must be a rate:6. File format
Accepted:- CSV (UTF-8 encoding) — preferred
- XLSX
No formulas
Raw data only
7. Historical data requirements
Minimum:- 6–12 months of history
- Preferably 24 months
- 10,000+ transaction rows per store
8. Data validation rules
- quantity ≠ 0
- quantity > 0 for sales
- quantity < 0 for returns
- unit_price ≥ 0
- discount_rate between 0 and 1
- transaction_timestamp not null
- product_id must exist in product master file
