Store named datetime columns as INTEGER microseconds (datetime_columns)
This commit is contained in:
@@ -6,6 +6,24 @@ All notable changes to this project will be documented in this file.
|
||||
|
||||
---
|
||||
|
||||
## [1.12.0] - 2026-06-09
|
||||
|
||||
### ⚠️ Breaking
|
||||
- **`SCHEMA_VERSION` bumped `3` → `4`** — on upgrade the existing cache is wiped automatically (disk mode wipes the file in place, in-memory discards the backup) and reloaded from the source on next use. For a large cache (e.g. a multi-hundred-million-row table) the full reload can take a while; deploy in a maintenance window.
|
||||
- **`datetime_columns` change the public output contract for the chosen columns** — a column listed in `datetime_columns` is stored and returned as an **INTEGER (microseconds since the Unix epoch, UTC)**, not an ISO `TEXT` string. This is opt-in per column, so no table is affected unless you name its columns; consumers that read or filter such a column must adapt (compare against integer µs, or convert on read).
|
||||
|
||||
### Added
|
||||
- **`datetime_columns=` parameter on `CachingEngine` / `CacheManager`** — `datetime_columns={"VW_X": ["CHANGE_DATE"]}` stores the named datetime columns as INTEGER µs-since-epoch instead of ~28-byte ISO `TEXT`. Saves ~20 bytes per row and makes index comparisons on the column operate on native integers instead of string collation — worthwhile for a pure datetime column on a very large table (e.g. a delta change column that is also range-scanned).
|
||||
- `_coerce.to_sqlite_datetime()` converts datetimes (and ISO/`date` values) to exact integer microseconds via integer arithmetic (no float rounding); a naive datetime is treated as UTC, `None` passes through.
|
||||
- `load_table` declares those columns `INTEGER` and `upsert_rows` coerces them the same way, so full loads and delta upserts agree on the on-disk representation.
|
||||
- The delta high-watermark for such a column is the stored integer; `delta._bind_watermark(..., epoch_us=True)` reconstructs a real UTC `datetime` before binding, so the source still receives a typed timestamp (and the watermark fix from 1.8.0 keeps holding).
|
||||
|
||||
### Changed
|
||||
- `pyproject.toml` — bumped version to `1.12.0`.
|
||||
- `CacheManager.max_value` / `set_last_synced_at` now accept/return `int` watermarks alongside `str` (the INTEGER-µs watermark round-trips through the `last_synced_at` TEXT column as its digit string).
|
||||
|
||||
---
|
||||
|
||||
## [1.11.0] - 2026-06-09
|
||||
|
||||
### Added
|
||||
|
||||
Reference in New Issue
Block a user