Fix cache stampede with double-checked locking in load_table
@@ -6,6 +6,17 @@ All notable changes to this project will be documented in this file.

---

## [1.15.0] - 2026-06-11

### Fixed

- **Cache stampede (thundering herd) on cold loads**: the decision to load a table was made *before* the load lock was taken, and `load_table` never re-checked it after acquiring the lock. During a slow cold load of a large table (observed: 212M rows, ~2 h), a second query for the same table passed the pre-lock "not cached" check, queued on the load lock, and then ran a **redundant second full reload** instead of noticing that the first load had already finished, doubling a multi-hour load. `load_table` now uses **double-checked locking**: after acquiring the load lock it re-evaluates a caller-supplied predicate (table cached, all needed columns present, not TTL-expired) and skips the load when the predicate is already satisfied. The change is invisible on small tables; on large ones it removes hours of redundant indexing under concurrent cold-start traffic.
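
The double-checked locking pattern described in this entry can be sketched roughly as follows. This is a minimal illustration and not the project's code: the toy `CacheManager` body, the `loader` argument, the `load_count` counter, and `run_demo` are all assumptions made for the example; only the names `load_table` and `recheck` come from this changelog. A barrier forces every thread past the unlocked first check before any load finishes, which is the stampede condition.

```python
import threading
from typing import Callable, Optional


class CacheManager:
    """Toy stand-in (hypothetical); only the locking pattern matters here."""

    def __init__(self) -> None:
        self._load_lock = threading.Lock()
        self.tables: dict = {}
        self.load_count = 0  # demo instrumentation, not part of any real API

    def load_table(
        self,
        name: str,
        loader: Callable[[], list],
        recheck: Optional[Callable[[], bool]] = None,
    ) -> bool:
        """Return True if a load ran, False if the recheck made it unnecessary."""
        with self._load_lock:
            # Second check, now under the lock: another thread may have
            # finished the same load while this one was waiting.
            if recheck is not None and recheck():
                return False
            self.tables[name] = loader()
            self.load_count += 1
            return True


def run_demo(n_threads: int = 4) -> int:
    """Simulate concurrent cold-start queries; return how many loads ran."""
    cache = CacheManager()
    barrier = threading.Barrier(n_threads)

    def worker() -> None:
        needs_load = "events" not in cache.tables  # first check, no lock held
        barrier.wait()  # every thread has now passed the first check
        if needs_load:
            cache.load_table(
                "events",
                loader=lambda: [1, 2, 3],
                recheck=lambda: "events" in cache.tables,
            )

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return cache.load_count
```

In this sketch, passing `recheck=None` reproduces the old behavior: every queued caller reloads the table once it obtains the lock.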

### Changed

- `pyproject.toml`: bumped version to `1.15.0`.
- `CacheManager.load_table` gained an optional `recheck` callback (the double-check predicate); `QueryExecutor` supplies it for both column and `SELECT *` loads.
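
As an illustration of what such a `recheck` predicate might look like on the caller side, here is a hedged sketch covering the three conditions named in the 1.15.0 fix (table cached, needed columns present, not TTL-expired). The `CachedTable` record, the `make_recheck` helper, and the convention that `needed=None` stands for `SELECT *` are hypothetical and are not taken from `QueryExecutor`.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Set


@dataclass
class CachedTable:
    """Hypothetical cache entry used only for this sketch."""
    columns: Set[str]
    all_columns: bool  # True when the table was loaded via SELECT *
    loaded_at: float   # timestamp from the same clock as `now`


def make_recheck(
    cache: Dict[str, CachedTable],
    table: str,
    needed: Optional[Set[str]],  # None is assumed to mean SELECT *
    ttl_seconds: float,
    now: Callable[[], float] = time.monotonic,
) -> Callable[[], bool]:
    """Build a predicate that is True when no load is needed any more."""

    def recheck() -> bool:
        entry = cache.get(table)
        if entry is None:
            return False  # table not cached
        if now() - entry.loaded_at >= ttl_seconds:
            return False  # cached but TTL-expired
        if needed is None:
            return entry.all_columns  # SELECT * needs a full load
        # A column load is satisfied by a full load or a covering column set.
        return entry.all_columns or needed <= entry.columns

    return recheck
```

Building the predicate as a closure keeps the cache-state details on the caller side, so the loading function only needs to call it once it holds the lock.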

---

## [1.14.0] - 2026-06-10

Follow-up to 1.12.0 from running `datetime_columns` in production: the feature was only half-wired (writes were coerced, reads and query params were not).