138 lines
9.9 KiB
Markdown
138 lines
9.9 KiB
Markdown
# Changelog
|
||
|
||
All notable changes to this project will be documented in this file.
|
||
|
||
## [Unreleased]
|
||
|
||
---
|
||
|
||
## [1.6.0] - 2026-06-05
|
||
|
||
### Added
|
||
- **Secondary indexes** — `CachingEngine(engine, indexes={"VW_X": ["col", ["a", "b"]]})` creates indexes on the in-memory cache to accelerate `WHERE`/`JOIN` lookups. Index columns are auto-loaded so the index exists from the first load, and indexes are recreated after every (re)load and persist in `cache.db`. Combines freely with `delta` and `ttl`.
|
||
|
||
### Changed
|
||
- `pyproject.toml` — bumped version to `1.6.0`
|
||
|
||
---
|
||
|
||
## [1.5.0] - 2026-06-05
|
||
|
||
### Added
|
||
- **Per-table processing state in `stats`** — `TableStats` now carries `state` (`loading` / `refreshing` / `ready` / `stale` / `error`) and `tracking` (`delta` / `ttl` / `static`), so callers can see whether each table is up to date or being processed. In-progress first loads and failed loads also surface in `stats.tables`.
|
||
- `SQLMEM_FETCH_BATCH` env var (default `10000`) — rows fetched per batch when loading a table.
|
||
|
||
### Changed
|
||
- `pyproject.toml` — bumped version to `1.5.0`
|
||
- **Large-table loads are streamed in batches** — `load_table` no longer `fetchall()`s the whole table (which double-buffered every row in Python and could OOM/crash on tens of millions of rows). Rows are now fetched `SQLMEM_FETCH_BATCH` at a time into a staging table and swapped in atomically, so peak memory stays bounded, the previous copy stays queryable during a reload, and the network fetch no longer holds the cache lock. Delta catch-ups are streamed the same way.
|
||
- Orphan staging tables left by an interrupted load (crash/backup mid-load) are dropped on startup.
|
||
- Delta upserts compute `row_count` once per refresh instead of a full `COUNT(*)` after every batch (avoids O(rows×batches) work on large catch-ups).
|
||
|
||
---
|
||
|
||
## [1.4.0] - 2026-06-05
|
||
|
||
### Fixed
|
||
- **`decimal.Decimal` (and `datetime`) binding error** — `NUMERIC`/`DECIMAL`/`MONEY` columns from SQL Server (pyodbc) arrive as `decimal.Decimal`, which `sqlite3` cannot bind, crashing the cache load with `type 'decimal.Decimal' is not supported`. Values are now coerced to sqlite-bindable types (`Decimal`→`str`, `datetime`/`date`/`time`→ISO, `uuid.UUID`→`str`, `bytearray`→`bytes`) at the cache boundary — on full load, on delta upsert, and for WHERE parameters. Coercion is local (no global `sqlite3.register_adapter`), so the host application's `sqlite3` behaviour is untouched. Cache columns are `TEXT`, so the conversion is lossless and exact (no rounding).
|
||
|
||
### Added
|
||
- **Incremental (delta) refresh** — `CachingEngine(engine, delta={...})` with `DeltaConfig(change_column, key_columns)`. Delta-tracked tables are kept in sync by pulling only changed rows (`WHERE change_column >= watermark`) and upserting them by key, instead of full reloads.
|
||
- Data-driven high-watermark = `max(change_column)` cached, persisted in `cache.db`; `>=` overlap + idempotent upsert so no row is missed and boundary rows are harmlessly re-read.
|
||
- Catch-up on startup (since last shutdown) and a background thread refreshing every `SQLMEM_REFRESH_INTERVAL` seconds (default 300); `engine.refresh()` triggers a pull on demand.
|
||
- Primary key is auto-discovered from the source DB (`inspect(engine).get_pk_constraint`) when `key_columns` is omitted; required explicitly for views (raises `ValueError`).
|
||
- **Per-table TTL (time-based refresh)** — `CachingEngine(engine, ttl={"VW_X": 300})` for tables with no change column that can't be delta-synced. The cached copy is guaranteed never older than the TTL: a query touching an expired table triggers a full reload before it is answered (read-time guarantee), and the background thread proactively reloads expired tables. TTL age uses the persisted `last_refresh_at`, so the bound holds across restarts. A table in both `delta` and `ttl` raises `ValueError`.
|
||
- `DeltaConfig` exported from the public API.
|
||
- `engine.reset()` — wipes the whole cache (RAM + `cache.db`) for a clean rebuild after structural source changes.
|
||
- `SQLMEM_REFRESH_INTERVAL` env var (default `300`) — background refresh tick for delta pulls and proactive TTL reloads.
|
||
|
||
### Changed
|
||
- `pyproject.toml` — bumped version to `1.4.0`
|
||
- `cache.py` — schema version bumped to `3`; `_sqlmem_tables` gained a `last_synced_at` watermark column. New methods: `execute_in_memory` (lock-serialized read), `get_table_columns`, `create_unique_index`, `get/set_last_synced_at`, `max_value`, `upsert_rows`, `seconds_since_refresh`, `reset`. Existing on-disk caches are discarded and rebuilt on load.
|
||
- `executor.py` — delta-tracked tables augment their column set with key/change columns (unique key index + initial watermark); TTL-tracked tables full-reload at read time when expired; in-memory reads go through the cache lock.
|
||
|
||
---
|
||
|
||
## [1.2.0] - 2026-06-04
|
||
|
||
### Added
|
||
- **Parametrized queries (R1)** — `execute(sql, params)` accepts positional (`?` tuple/list) and named (`:name` dict) parameters; passed straight to SQLite during in-memory filtering. Cache loads still fetch the full table (parameters are not applied to source fetches).
|
||
- **JOIN support (R2)** — multi-table SELECTs are parsed into per-table column sets; each table is cached independently and the JOIN runs in the in-memory SQLite. Columns in a multi-table query must be qualified by table or alias.
|
||
- **`SELECT *` support (R3)** — wildcard (and `alias.*`) queries discover all columns from the source DB, cache the whole table, and mark it `is_full` so later column queries are guaranteed cache hits without re-fetch.
|
||
- **Three-part table names (R4)** — `[catalog].[schema].[table]` is parsed to its base name for caching; the in-memory query is rewritten to strip catalog/schema prefixes so it runs under SQLite.
|
||
- `SQLMEM_SQL_DIALECT` env var (default `tsql`) — sqlglot dialect used to parse incoming SQL; T-SQL also accepts ANSI SQL and MSSQL bracket quoting.
|
||
- `CacheManager.discover_columns()` and `CacheManager.is_table_full()`; `load_table()` gained a `full` flag.
|
||
|
||
### Changed
|
||
- `pyproject.toml` — bumped version to `1.2.0`
|
||
- `parser.py` — `ParsedQuery.table: str` replaced by `tables: list[str]` plus `columns_by_table`, `sqlite_sql`, `params`, and `wildcard_tables`; SQL is parsed with the configured dialect and rendered to SQLite for execution.
|
||
- `executor.py` — loads each referenced table independently and applies query parameters during in-memory execution.
|
||
- `cache.py` — schema version bumped to `2`; `_sqlmem_tables` gained an `is_full` column (existing on-disk caches are discarded and rebuilt on load).
|
||
|
||
---
|
||
|
||
## [1.1.0] - 2026-06-03
|
||
|
||
### Added
|
||
- `Stats` and `TableStats` frozen dataclasses — snapshot of runtime cache statistics (hit/miss/refetch counts, per-table row count, columns, last refresh timestamp)
|
||
- `StatsCollector` — internal thread-safe counter; increments on every cache hit, miss, and re-fetch
|
||
- `engine.stats` property — returns a `Stats` snapshot at any point in time
|
||
- `Stats` and `TableStats` exported from the public API
|
||
|
||
### Changed
|
||
- `pyproject.toml` — bumped version to `1.1.0`
|
||
|
||
---
|
||
|
||
## [1.0.0] - 2026-06-03
|
||
|
||
### Changed
|
||
- `pyproject.toml` — bumped version to `1.0.0`
|
||
|
||
---
|
||
|
||
## [0.4.0] - 2026-06-03
|
||
|
||
### Added
|
||
- `add_sink(sink, *, level, **kwargs)` — public API for routing sqlmem log records to any loguru-compatible sink (stream, file, callable); supports all loguru `logger.add()` kwargs including `rotation`, `retention`, etc.
|
||
|
||
### Changed
|
||
- `pyproject.toml` — bumped version to `0.4.0`
|
||
- `config.py` — replaced destructive `logger.remove()` + forced default sink with `logger.disable("sqlmem")`; sqlmem is now silent by default and does not interfere with the host application's logging setup
|
||
|
||
---
|
||
|
||
## [0.3.0] - 2026-06-03
|
||
|
||
### Added
|
||
- `README.md` — full project documentation: architecture overview, quick start, cache behaviour, persistence, configuration, exceptions, logging, and limitations
|
||
|
||
### Changed
|
||
- `pyproject.toml` — bumped version to `0.3.0`
|
||
- `parser.py` — `_extract_columns` now deduplicates column names while preserving order
|
||
- `.gitignore` — added `.env` and `.env.*` to prevent accidental commit of environment files
|
||
|
||
### Security
|
||
- Removed `.env` from git tracking (`git rm --cached`)
|
||
|
||
---
|
||
|
||
## [0.2.0] - 2026-06-01
|
||
|
||
### Added
|
||
- Project specification in `project.md` — architecture, API design, cache backend, metadata schema, logging strategy, and TODO for future features (JOIN, SELECT * support)
|
||
- `.gitignore` for Python/Poetry project
|
||
- `pyproject.toml` dependencies: `sqlglot`, `sqlalchemy`, `loguru`, `python-dotenv`; dev dependencies: `pytest`, `ruff`, `mypy`
|
||
- `src/sqlmem/` package structure with src layout
|
||
- `src/sqlmem/exceptions.py` — `ReadOnlyError` (blocks INSERT/UPDATE/DELETE), `UnsupportedQueryError` (blocks JOIN and SELECT *)
|
||
- `src/sqlmem/config.py` — loads `.env`, configures `loguru` with DEBUG/INFO level based on `SQLMEM_DEBUG`
|
||
- `src/sqlmem/_meta.py` — package version constant
|
||
- `src/sqlmem/parser.py` — SQL Parser using `sqlglot`; extracts table and columns from SELECT, raises on writes/JOIN/wildcard
|
||
- `src/sqlmem/registry.py` — Column Registry; accumulates requested columns per table, detects missing columns requiring re-fetch
|
||
- `src/sqlmem/cache.py` — Cache Manager; SQLite in-memory storage, load from `cache.db` on startup (with schema version check), hourly backup thread, `atexit`/SIGTERM flush, metadata tables (`_sqlmem_meta`, `_sqlmem_tables`, `_sqlmem_columns`)
|
||
- `src/sqlmem/executor.py` — Query Executor; cache hit/miss logic, re-fetch on new columns with WARNING log
|
||
- `src/sqlmem/engine.py` — `CachingEngine` wrapper; public API compatible with SQLAlchemy, `invalidate(table)` for manual cache clearing
|
||
- `src/sqlmem/__init__.py` — public exports: `CachingEngine`, `ReadOnlyError`, `UnsupportedQueryError`
|
||
- `tests/test_parser.py` — parser tests: SELECT parsing, ReadOnlyError, UnsupportedQueryError
|
||
- `tests/test_cache.py` — cache tests: load, data correctness, metadata, disk backup/reload
|
||
- `tests/test_registry.py` — registry tests: accumulation, needs_refetch, table isolation
|