Add support for query parameters, JOINs, SELECT * and three-part table names
@@ -6,6 +6,24 @@ All notable changes to this project will be documented in this file.
|
|||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
|
## [1.2.0] - 2026-06-04
|
||||||
|
|
||||||
|
### Added
|
||||||
|
- **Parametrized queries (R1)** — `execute(sql, params)` accepts positional (`?` tuple/list) and named (`:name` dict) parameters; passed straight to SQLite during in-memory filtering. Cache loads still fetch the full table (parameters are not applied to source fetches).
|
||||||
|
- **JOIN support (R2)** — multi-table SELECTs are parsed into per-table column sets; each table is cached independently and the JOIN runs in the in-memory SQLite. Columns in a multi-table query must be qualified by table or alias.
|
||||||
|
- **`SELECT *` support (R3)** — wildcard (and `alias.*`) queries discover all columns from the source DB, cache the whole table, and mark it `is_full` so later column queries are guaranteed cache hits without re-fetch.
|
||||||
|
- **Three-part table names (R4)** — `[catalog].[schema].[table]` is parsed to its base name for caching; the in-memory query is rewritten to strip catalog/schema prefixes so it runs under SQLite.
|
||||||
|
- `SQLMEM_SQL_DIALECT` env var (default `tsql`) — sqlglot dialect used to parse incoming SQL; T-SQL also accepts ANSI SQL and MSSQL bracket quoting.
|
||||||
|
- `CacheManager.discover_columns()` and `CacheManager.is_table_full()`; `load_table()` gained a `full` flag.
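The column discovery named in the last entry can be sketched with the standard-library `sqlite3` module. This is a minimal illustration, assuming a SQLite source database; the `users` table and its columns are invented for the demo, and the function is a standalone stand-in rather than the actual `CacheManager` method. A query with an always-false predicate returns no rows but still populates `cursor.description`, which is enough to list the columns.

```python
import sqlite3


def discover_columns(conn: sqlite3.Connection, table: str) -> list[str]:
    """Return the column names of *table* without fetching any rows."""
    # The table name is interpolated into the SQL text, so it must come from
    # a trusted source (for example a parsed query), never from raw user input.
    cursor = conn.execute(f"SELECT * FROM {table} WHERE 1 = 0")
    return [desc[0] for desc in cursor.description]


# Hypothetical source table, used only for this demonstration.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE users (id TEXT, name TEXT, status TEXT)")
source.execute("INSERT INTO users VALUES ('42', 'Ada', 'active')")

columns = discover_columns(source, "users")
print(columns)
```

The discovered list can then be handed to a full-table load, so the wildcard path reuses the same loading code as the explicit-column path.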
|
||||||
|
|
||||||
|
### Changed
|
||||||
|
- `pyproject.toml` — bumped version to `1.2.0`
|
||||||
|
- `parser.py` — `ParsedQuery.table: str` replaced by `tables: list[str]` plus `columns_by_table`, `sqlite_sql`, `params`, and `wildcard_tables`; SQL is parsed with the configured dialect and rendered to SQLite for execution.
|
||||||
|
- `executor.py` — loads each referenced table independently and applies query parameters during in-memory execution.
|
||||||
|
- `cache.py` — schema version bumped to `2`; `_sqlmem_tables` gained an `is_full` column (existing on-disk caches are discarded and rebuilt on load).
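The `is_full` bookkeeping described in the `cache.py` entry can be sketched as an upsert on the metadata table. This is a reduced illustration, assuming SQLite 3.24 or newer for `ON CONFLICT ... DO UPDATE`; the `last_refresh_at` column is left out for brevity, and the module-level connection and helper names are simplifications, not the actual `CacheManager` internals.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE _sqlmem_tables (
        table_name TEXT PRIMARY KEY,
        row_count INTEGER,
        is_full INTEGER NOT NULL DEFAULT 0
    )
    """
)


def mark_refreshed(table: str, row_count: int, full: bool = False) -> None:
    """Insert or update the metadata row, recording whether the load was full."""
    conn.execute(
        """
        INSERT INTO _sqlmem_tables (table_name, row_count, is_full)
        VALUES (?, ?, ?)
        ON CONFLICT(table_name) DO UPDATE SET
            row_count = excluded.row_count,
            is_full = excluded.is_full
        """,
        (table, row_count, int(full)),
    )
    conn.commit()


def is_table_full(table: str) -> bool:
    """True only if a metadata row exists and its is_full flag is set."""
    row = conn.execute(
        "SELECT is_full FROM _sqlmem_tables WHERE table_name = ?", (table,)
    ).fetchone()
    return bool(row and row[0])


mark_refreshed("orders", 10)             # partial load: a subset of columns
partial = is_table_full("orders")
mark_refreshed("orders", 10, full=True)  # full load, as after SELECT *
full = is_table_full("orders")
unknown = is_table_full("missing")       # no metadata row at all
```

Because the upsert overwrites `is_full` on every refresh, a later partial re-fetch of the same table would reset the flag to `0`; whether that can happen depends on the executor's control flow.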
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
## [1.1.0] - 2026-06-03
|
## [1.1.0] - 2026-06-03
|
||||||
|
|
||||||
### Added
|
### Added
|
||||||
|
|||||||
@@ -22,6 +22,8 @@ Application (SQLAlchemy)
|
|||||||
|
|
||||||
On the first SELECT for a table, SQLmem fetches the required rows from the database and stores them in an in-memory SQLite instance. Subsequent queries for the same columns hit the in-memory cache with no database round-trip. When a query requests a column not yet in cache, SQLmem re-fetches the table with the expanded column set.
|
On the first SELECT for a table, SQLmem fetches the required rows from the database and stores them in an in-memory SQLite instance. Subsequent queries for the same columns hit the in-memory cache with no database round-trip. When a query requests a column not yet in cache, SQLmem re-fetches the table with the expanded column set.
|
||||||
|
|
||||||
|
Parametrized queries, JOINs and `SELECT *` are all supported. Each table referenced in a JOIN is cached independently; the JOIN itself runs in the in-memory SQLite. Query parameters are applied during in-memory filtering, so cache loads always fetch the full table regardless of the `WHERE` values.
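The flow described above can be sketched with two standard-library `sqlite3` connections, one standing in for the source database and one for the in-memory cache. This is an illustration under stated assumptions: the `users` and `orders` tables, their rows and the `load_table` helper are invented for the demo and are not SQLmem's actual code. Each table is copied independently with all of its rows, and the JOIN and parameter binding happen only in the cache.

```python
import sqlite3

# Hypothetical source database standing in for the real backend.
source = sqlite3.connect(":memory:")
source.executescript(
    """
    CREATE TABLE users (id TEXT, name TEXT, email TEXT);
    CREATE TABLE orders (id TEXT, user_id TEXT, total REAL);
    INSERT INTO users VALUES ('42', 'Ada', 'ada@example.com'),
                             ('7', 'Bob', 'bob@example.com');
    INSERT INTO orders VALUES ('1', '42', 9.5), ('2', '42', 20.0), ('3', '7', 5.0);
    """
)

cache = sqlite3.connect(":memory:")


def load_table(table: str, columns: list[str]) -> None:
    """Copy the requested columns of *table* (every row) into the cache."""
    cols = ", ".join(columns)
    rows = source.execute(f"SELECT {cols} FROM {table}").fetchall()
    cache.execute(f"DROP TABLE IF EXISTS {table}")
    cache.execute(f"CREATE TABLE {table} ({cols})")
    placeholders = ", ".join("?" for _ in columns)
    cache.executemany(f"INSERT INTO {table} VALUES ({placeholders})", rows)


# Each table is loaded on its own, with only the columns the query references.
# No WHERE clause is applied here: the parameter value never reaches the source.
load_table("users", ["name", "id"])
load_table("orders", ["total", "user_id"])

sql = (
    "SELECT u.name, o.total FROM users u "
    "JOIN orders o ON o.user_id = u.id WHERE u.id = ? ORDER BY o.total"
)
positional = cache.execute(sql, ("42",)).fetchall()
named = cache.execute(sql.replace("?", ":id"), {"id": "42"}).fetchall()
print(positional)
```

Since the filter is applied only in memory, a second call with a different parameter value can be answered from the same cached rows.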
|
||||||
|
|
||||||
## Installation
|
## Installation
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
@@ -45,9 +47,25 @@ engine = CachingEngine(base_engine)
|
|||||||
results = engine.execute("SELECT id, name FROM users WHERE status = 'active'")
|
results = engine.execute("SELECT id, name FROM users WHERE status = 'active'")
|
||||||
for row in results:
|
for row in results:
|
||||||
print(row["id"], row["name"])
|
print(row["id"], row["name"])
|
||||||
|
|
||||||
|
# Positional parameters (?):
|
||||||
|
engine.execute("SELECT id, name FROM users WHERE id = ?", ("42",))
|
||||||
|
|
||||||
|
# Named parameters (:name):
|
||||||
|
engine.execute("SELECT id, name FROM users WHERE id = :id", {"id": "42"})
|
||||||
|
|
||||||
|
# JOINs — each table is cached independently:
|
||||||
|
engine.execute(
|
||||||
|
"SELECT u.name, o.total FROM users u "
|
||||||
|
"JOIN orders o ON o.user_id = u.id WHERE u.id = ?",
|
||||||
|
("42",),
|
||||||
|
)
|
||||||
|
|
||||||
|
# SELECT * — loads and caches the whole table:
|
||||||
|
engine.execute("SELECT * FROM users")
|
||||||
```
|
```
|
||||||
|
|
||||||
`execute()` returns a list of dicts. Results are compatible with standard iteration patterns.
|
`execute()` returns a list of dicts. Parameters are passed straight through to SQLite, so positional (`?`) and named (`:name`) styles both work. Results are compatible with standard iteration patterns.
|
||||||
|
|
||||||
## Cache behaviour
|
## Cache behaviour
|
||||||
|
|
||||||
@@ -57,10 +75,12 @@ for row in results:
|
|||||||
Query 1: SELECT a, b FROM orders → cache miss → fetch orders(a, b) from DB
|
Query 1: SELECT a, b FROM orders → cache miss → fetch orders(a, b) from DB
|
||||||
Query 2: SELECT a, d FROM orders → new column d → re-fetch orders(a, b, d)
|
Query 2: SELECT a, d FROM orders → new column d → re-fetch orders(a, b, d)
|
||||||
Query 3: SELECT b FROM orders → cache hit, no DB query
|
Query 3: SELECT b FROM orders → cache hit, no DB query
|
||||||
Query 4: SELECT * FROM orders → UnsupportedQueryError (wildcard not supported)
|
Query 4: SELECT * FROM orders → fetches all columns, marks the table fully cached
|
||||||
Query 5: SELECT a FROM orders JOIN … → UnsupportedQueryError (JOIN not supported)
|
Query 5: SELECT a FROM orders → cache hit (table already full)
|
||||||
```
|
```
|
||||||
|
|
||||||
|
**`SELECT *`** loads every column and marks the table as fully cached, so any later column query is a guaranteed cache hit with no re-fetch.
|
||||||
|
|
||||||
**Writes are blocked** — INSERT, UPDATE, and DELETE raise `ReadOnlyError`. SQLmem is a read-only cache.
|
**Writes are blocked** — INSERT, UPDATE, and DELETE raise `ReadOnlyError`. SQLmem is a read-only cache.
|
||||||
|
|
||||||
## Persistence
|
## Persistence
|
||||||
@@ -89,13 +109,14 @@ Set via environment variables or a `.env` file:
|
|||||||
| `SQLMEM_DEBUG` | `false` | `true` enables DEBUG-level logging |
|
| `SQLMEM_DEBUG` | `false` | `true` enables DEBUG-level logging |
|
||||||
| `SQLMEM_CACHE_DB` | `cache.db` | Path to the on-disk persistence file |
|
| `SQLMEM_CACHE_DB` | `cache.db` | Path to the on-disk persistence file |
|
||||||
| `SQLMEM_BACKUP_INTERVAL` | `3600` | Backup interval in seconds |
|
| `SQLMEM_BACKUP_INTERVAL` | `3600` | Backup interval in seconds |
|
||||||
|
| `SQLMEM_SQL_DIALECT` | `tsql` | sqlglot dialect used to parse incoming SQL (e.g. `tsql`, `postgres`, `mysql`) |
|
||||||
|
|
||||||
## Exceptions
|
## Exceptions
|
||||||
|
|
||||||
| Exception | When raised |
|
| Exception | When raised |
|
||||||
|---|---|
|
|---|---|
|
||||||
| `ReadOnlyError` | INSERT, UPDATE, or DELETE statement |
|
| `ReadOnlyError` | INSERT, UPDATE, or DELETE statement |
|
||||||
| `UnsupportedQueryError` | `SELECT *` or any JOIN |
|
| `UnsupportedQueryError` | non-SELECT statement, `SELECT` without `FROM`, or an unqualified column in a multi-table query |
|
||||||
|
|
||||||
```python
|
```python
|
||||||
from sqlmem import ReadOnlyError, UnsupportedQueryError
|
from sqlmem import ReadOnlyError, UnsupportedQueryError
|
||||||
@@ -118,7 +139,8 @@ Set `SQLMEM_DEBUG=true` in `.env` to make the default level DEBUG when no explic
|
|||||||
|
|
||||||
## Limitations
|
## Limitations
|
||||||
|
|
||||||
- `SELECT *` and JOIN queries are not supported.
|
- In a multi-table (JOIN) query, every column must be qualified with its table or alias; unqualified columns raise `UnsupportedQueryError`.
|
||||||
|
- Tables are keyed by their base name — two tables with the same name in different schemas share one cache entry.
|
||||||
- No distributed cache backend (Redis etc.).
|
- No distributed cache backend (Redis etc.).
|
||||||
- No transactional consistency guarantees.
|
- No transactional consistency guarantees.
|
||||||
- Write operations (INSERT/UPDATE/DELETE) are always blocked.
|
- Write operations (INSERT/UPDATE/DELETE) are always blocked.
|
||||||
|
|||||||
+20
-13
@@ -59,11 +59,12 @@ with engine.connect() as conn:
|
|||||||
## Components
|
## Components
|
||||||
|
|
||||||
### 1. SQL Parser
|
### 1. SQL Parser
|
||||||
- Detects the query type (SELECT / write).
|
- Detects the query type (SELECT / write); writes raise `ReadOnlyError`.
|
||||||
- Extracts table names from the FROM and JOIN clauses.
|
- Extracts table names from the FROM and JOIN clauses (multiple tables supported).
|
||||||
- Extracts the list of requested columns.
|
- Maps the requested columns to tables via aliases (`columns_by_table`).
|
||||||
- Detects `SELECT *` (wildcard) and JOIN, and raises `UnsupportedQueryError`.
|
- Detects `SELECT *` and `alias.*` → the table is loaded in full (`wildcard_tables`).
|
||||||
- Decides whether the query can be served from the cache.
|
- Parses with the `SQLMEM_SQL_DIALECT` dialect (default `tsql`) and renders the in-memory query to SQLite (stripping catalog/schema prefixes).
|
||||||
|
- Passes parameters (`?` / `:name`) unchanged to the in-memory SQLite.
|
||||||
|
|
||||||
### 2. Column Registry
|
### 2. Column Registry
|
||||||
|
|
||||||
@@ -71,12 +72,12 @@ The module **learns at runtime** which columns of each table the application needs
|
|||||||
|
|
||||||
**Logic for each incoming query:**
|
**Logic for each incoming query:**
|
||||||
|
|
||||||
1. The parser detects `SELECT *` or JOIN → raises `UnsupportedQueryError` (not implemented).
|
1. The parser extracts `(table, columns)` for every table in the query (including across JOINs).
|
||||||
2. The parser extracts `(table, columns)` from the query.
|
2. The registry takes the **union** of the newly requested columns and those already known.
|
||||||
3. The registry takes the **union** of the newly requested columns and those already known.
|
3. The Cache Manager checks whether the cache for the table contains all required columns:
|
||||||
4. The Cache Manager checks whether the cache for the table contains all required columns:
|
|
||||||
- **Yes** → the query goes straight to the in-RAM SQLite (cache hit).
|
- **Yes** → the query goes straight to the in-RAM SQLite (cache hit).
|
||||||
- **No** → re-fetch the table from the DB with the expanded column set → overwrite the cache → run the query in the in-RAM SQLite.
|
- **No** → re-fetch the table from the DB with the expanded column set → overwrite the cache → run the query in the in-RAM SQLite.
|
||||||
|
4. `SELECT *` loads the whole table and marks it `is_full` → later queries for any column are cache hits.
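The decision logic in the steps above can be sketched as a small pure-Python function. This is a hypothetical stand-in for the Column Registry and Cache Manager working together: the `plan` function, the module-level `known` and `full_tables` containers, and the `"hit"`/`"miss"`/`"refetch"` labels are illustrative names, not the project's API. Passing `None` models a `SELECT *` request.

```python
from typing import Optional

# Illustrative state: columns known per table, and tables loaded in full.
known: dict[str, set[str]] = {}
full_tables: set[str] = set()


def plan(table: str, requested: Optional[list[str]]) -> str:
    """Return 'hit', 'miss' or 'refetch' and update the registry state.

    requested=None models SELECT * (every column is needed).
    """
    if table in full_tables:
        return "hit"  # a fully cached table serves any column
    if requested is None:
        outcome = "refetch" if table in known else "miss"
        full_tables.add(table)
        known.setdefault(table, set())
        return outcome
    cached = known.get(table)
    if cached is None:
        known[table] = set(requested)
        return "miss"  # first load of this table
    missing = set(requested) - cached
    if not missing:
        return "hit"
    cached |= missing  # union of known and newly requested columns
    return "refetch"


outcomes = [
    plan("T3", ["A", "B"]),  # first query for T3
    plan("T3", ["A", "D"]),  # D is new
    plan("T3", ["B"]),       # already cached
    plan("T3", None),        # SELECT *
    plan("T3", ["A"]),       # table is full now
]
print(outcomes)
```

The sketch only decides what to do; in the real flow a `miss` or `refetch` would be followed by loading the table from the source database before the query runs in memory.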
|
||||||
|
|
||||||
**Example of column accumulation:**
|
**Example of column accumulation:**
|
||||||
|
|
||||||
@@ -84,8 +85,8 @@ The module **learns at runtime** which columns of each table the application needs
|
|||||||
Query 1: SELECT A, B FROM T3 → Registry: T3 = {A, B} → fetch T3(A,B) from DB
|
Query 1: SELECT A, B FROM T3 → Registry: T3 = {A, B} → fetch T3(A,B) from DB
|
||||||
Query 2: SELECT A, D FROM T3 → Registry: T3 = {A, B, D} → re-fetch T3(A,B,D) from DB
|
Query 2: SELECT A, D FROM T3 → Registry: T3 = {A, B, D} → re-fetch T3(A,B,D) from DB
|
||||||
Query 3: SELECT B FROM T3 → cache hit, no DB query
|
Query 3: SELECT B FROM T3 → cache hit, no DB query
|
||||||
Query 4: SELECT * FROM T3 → UnsupportedQueryError (wildcard not supported)
|
Query 4: SELECT * FROM T3 → full load of all columns, table marked is_full
|
||||||
Query 5: SELECT A FROM T3 JOIN T4 ... → UnsupportedQueryError (JOIN not supported)
|
Query 5: SELECT A FROM T3 JOIN T4 ON … → each table cached separately, JOIN runs in RAM
|
||||||
```
|
```
|
||||||
|
|
||||||
**Metadata table `_sqlmem_columns`** (stored in SQLite):
|
**Metadata table `_sqlmem_columns`** (stored in SQLite):
|
||||||
@@ -184,10 +185,16 @@ SQLMEM_DEBUG=true # DEBUG level: detailed log of every query, cache o
|
|||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
|
## Completed features (formerly TODO)
|
||||||
|
|
||||||
|
- [x] **Parametrized queries**: `execute(sql, params)` with positional `?` and named `:name` parameters.
|
||||||
|
- [x] **`SELECT *` (wildcard) support**: Loads the whole table into the cache and marks it `is_full`; later queries for any column are always cache hits with no re-fetch.
|
||||||
|
- [x] **JOIN support**: The parser extracts the columns of each joined table separately, and the Column Registry tracks them independently. The Cache Manager ensures that all required tables are in memory before the query runs.
|
||||||
|
- [x] **Three-part table names**: `[catalog].[schema].[table]` is cached under its base name; the in-memory query has the prefix stripped.
|
||||||
|
|
||||||
## TODO: future features
|
## TODO: future features
|
||||||
|
|
||||||
- **`SELECT *` (wildcard) support**: Loads the whole table into the cache and marks it `full`; later queries for any column are always cache hits with no re-fetch.
|
- **Table-level TTL**: automatic cache expiry after a configured time.
|
||||||
- **JOIN support**: The parser extracts the columns of each joined table separately, and the Column Registry tracks them independently. The Cache Manager ensures that all required tables are in memory before the query runs.
|
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
|
|||||||
+1
-1
@@ -1,6 +1,6 @@
|
|||||||
[project]
|
[project]
|
||||||
name = "sqlmem"
|
name = "sqlmem"
|
||||||
version = "1.1.0"
|
version = "1.2.0"
|
||||||
description = ""
|
description = ""
|
||||||
authors = [
|
authors = [
|
||||||
{name = "jan.doubravsky@gmail.com"}
|
{name = "jan.doubravsky@gmail.com"}
|
||||||
|
|||||||
+32
-9
@@ -9,7 +9,7 @@ from loguru import logger
|
|||||||
|
|
||||||
import sqlmem._meta as _meta
|
import sqlmem._meta as _meta
|
||||||
|
|
||||||
SCHEMA_VERSION = 1
|
SCHEMA_VERSION = 2
|
||||||
|
|
||||||
|
|
||||||
class CacheManager:
|
class CacheManager:
|
||||||
@@ -40,7 +40,8 @@ class CacheManager:
|
|||||||
CREATE TABLE IF NOT EXISTS _sqlmem_tables (
|
CREATE TABLE IF NOT EXISTS _sqlmem_tables (
|
||||||
table_name TEXT PRIMARY KEY,
|
table_name TEXT PRIMARY KEY,
|
||||||
last_refresh_at TEXT NOT NULL,
|
last_refresh_at TEXT NOT NULL,
|
||||||
row_count INTEGER
|
row_count INTEGER,
|
||||||
|
is_full INTEGER NOT NULL DEFAULT 0
|
||||||
);
|
);
|
||||||
CREATE TABLE IF NOT EXISTS _sqlmem_columns (
|
CREATE TABLE IF NOT EXISTS _sqlmem_columns (
|
||||||
table_name TEXT NOT NULL,
|
table_name TEXT NOT NULL,
|
||||||
@@ -112,17 +113,18 @@ class CacheManager:
|
|||||||
logger.info("SIGTERM received — flushing cache to disk.")
|
logger.info("SIGTERM received — flushing cache to disk.")
|
||||||
self._backup_to_disk()
|
self._backup_to_disk()
|
||||||
|
|
||||||
def mark_table_refreshed(self, table: str, row_count: int) -> None:
|
def mark_table_refreshed(self, table: str, row_count: int, full: bool = False) -> None:
|
||||||
with self._lock:
|
with self._lock:
|
||||||
self._mem_conn.execute(
|
self._mem_conn.execute(
|
||||||
"""
|
"""
|
||||||
INSERT INTO _sqlmem_tables (table_name, last_refresh_at, row_count)
|
INSERT INTO _sqlmem_tables (table_name, last_refresh_at, row_count, is_full)
|
||||||
VALUES (?, ?, ?)
|
VALUES (?, ?, ?, ?)
|
||||||
ON CONFLICT(table_name) DO UPDATE SET
|
ON CONFLICT(table_name) DO UPDATE SET
|
||||||
last_refresh_at = excluded.last_refresh_at,
|
last_refresh_at = excluded.last_refresh_at,
|
||||||
row_count = excluded.row_count
|
row_count = excluded.row_count,
|
||||||
|
is_full = excluded.is_full
|
||||||
""",
|
""",
|
||||||
(table, _now(), row_count),
|
(table, _now(), row_count, int(full)),
|
||||||
)
|
)
|
||||||
self._mem_conn.commit()
|
self._mem_conn.commit()
|
||||||
|
|
||||||
@@ -132,7 +134,28 @@ class CacheManager:
|
|||||||
).fetchone()
|
).fetchone()
|
||||||
return row is not None
|
return row is not None
|
||||||
|
|
||||||
def load_table(self, table: str, columns: list[str], source_conn: sqlite3.Connection) -> None:
|
def is_table_full(self, table: str) -> bool:
|
||||||
|
"""True if the whole table (all columns) is cached — a SELECT * cache hit."""
|
||||||
|
row = self._mem_conn.execute(
|
||||||
|
"SELECT is_full FROM _sqlmem_tables WHERE table_name = ?", (table,)
|
||||||
|
).fetchone()
|
||||||
|
return bool(row and row[0])
|
||||||
|
|
||||||
|
def discover_columns(self, table: str, source_conn: sqlite3.Connection) -> list[str]:
|
||||||
|
"""Return all column names of *table* from the source DB without fetching rows."""
|
||||||
|
logger.debug(f"Discovering columns of {table!r} from source DB")
|
||||||
|
cursor = source_conn.execute(f"SELECT * FROM {table} WHERE 1 = 0")
|
||||||
|
columns = [desc[0] for desc in cursor.description]
|
||||||
|
logger.debug(f"{table!r} has columns: {columns}")
|
||||||
|
return columns
|
||||||
|
|
||||||
|
def load_table(
|
||||||
|
self,
|
||||||
|
table: str,
|
||||||
|
columns: list[str],
|
||||||
|
source_conn: sqlite3.Connection,
|
||||||
|
full: bool = False,
|
||||||
|
) -> None:
|
||||||
cols = ", ".join(columns)
|
cols = ", ".join(columns)
|
||||||
logger.info(f"Fetching {table!r} columns [{cols}] from source DB")
|
logger.info(f"Fetching {table!r} columns [{cols}] from source DB")
|
||||||
rows = source_conn.execute(f"SELECT {cols} FROM {table}").fetchall()
|
rows = source_conn.execute(f"SELECT {cols} FROM {table}").fetchall()
|
||||||
@@ -145,7 +168,7 @@ class CacheManager:
|
|||||||
self._mem_conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})", rows)
|
self._mem_conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})", rows)
|
||||||
self._mem_conn.commit()
|
self._mem_conn.commit()
|
||||||
|
|
||||||
self.mark_table_refreshed(table, len(rows))
|
self.mark_table_refreshed(table, len(rows), full)
|
||||||
logger.info(f"Table {table!r} cached ({len(rows)} rows, columns: {columns})")
|
logger.info(f"Table {table!r} cached ({len(rows)} rows, columns: {columns})")
|
||||||
|
|
||||||
def close(self) -> None:
|
def close(self) -> None:
|
||||||
|
|||||||
@@ -9,6 +9,9 @@ load_dotenv()
|
|||||||
DEBUG = os.getenv("SQLMEM_DEBUG", "false").lower() == "true"
|
DEBUG = os.getenv("SQLMEM_DEBUG", "false").lower() == "true"
|
||||||
CACHE_DB_PATH = Path(os.getenv("SQLMEM_CACHE_DB", "cache.db"))
|
CACHE_DB_PATH = Path(os.getenv("SQLMEM_CACHE_DB", "cache.db"))
|
||||||
BACKUP_INTERVAL_SECONDS = int(os.getenv("SQLMEM_BACKUP_INTERVAL", "3600"))
|
BACKUP_INTERVAL_SECONDS = int(os.getenv("SQLMEM_BACKUP_INTERVAL", "3600"))
|
||||||
|
# Dialect used by sqlglot to parse incoming SQL. Defaults to T-SQL (SQL Server),
|
||||||
|
# which also accepts ANSI SQL. In-memory queries are always rendered to SQLite.
|
||||||
|
SQL_DIALECT = os.getenv("SQLMEM_SQL_DIALECT", "tsql")
|
||||||
|
|
||||||
# Silent by default — callers opt in via add_sink().
|
# Silent by default — callers opt in via add_sink().
|
||||||
logger.disable("sqlmem")
|
logger.disable("sqlmem")
|
||||||
|
|||||||
@@ -1,4 +1,5 @@
|
|||||||
import sqlite3
|
import sqlite3
|
||||||
|
from typing import cast
|
||||||
|
|
||||||
from loguru import logger
|
from loguru import logger
|
||||||
from sqlalchemy.engine import Engine
|
from sqlalchemy.engine import Engine
|
||||||
@@ -6,7 +7,7 @@ from sqlalchemy.engine import Engine
|
|||||||
from .cache import CacheManager
|
from .cache import CacheManager
|
||||||
from .config import BACKUP_INTERVAL_SECONDS, CACHE_DB_PATH
|
from .config import BACKUP_INTERVAL_SECONDS, CACHE_DB_PATH
|
||||||
from .executor import QueryExecutor
|
from .executor import QueryExecutor
|
||||||
from .parser import parse
|
from .parser import Params, parse
|
||||||
from .registry import ColumnRegistry
|
from .registry import ColumnRegistry
|
||||||
from .stats import Stats, StatsCollector
|
from .stats import Stats, StatsCollector
|
||||||
|
|
||||||
@@ -25,10 +26,10 @@ class CachingEngine:
|
|||||||
def stats(self) -> Stats:
|
def stats(self) -> Stats:
|
||||||
return self._stats.snapshot(self._cache.connection)
|
return self._stats.snapshot(self._cache.connection)
|
||||||
|
|
||||||
def execute(self, sql: str) -> list[dict]:
|
def execute(self, sql: str, params: Params = None) -> list[dict]:
|
||||||
parsed = parse(sql)
|
parsed = parse(sql, params)
|
||||||
with self._source_engine.connect() as sa_conn:
|
with self._source_engine.connect() as sa_conn:
|
||||||
raw_conn: sqlite3.Connection = sa_conn.connection.dbapi_connection
|
raw_conn = cast(sqlite3.Connection, sa_conn.connection.dbapi_connection)
|
||||||
executor = QueryExecutor(self._cache, self._registry, raw_conn, self._stats)
|
executor = QueryExecutor(self._cache, self._registry, raw_conn, self._stats)
|
||||||
return executor.execute(parsed)
|
return executor.execute(parsed)
|
||||||
|
|
||||||
|
|||||||
+40
-10
@@ -22,13 +22,43 @@ class QueryExecutor:
|
|||||||
self._stats = stats
|
self._stats = stats
|
||||||
|
|
||||||
def execute(self, parsed: ParsedQuery) -> list[dict]:
|
def execute(self, parsed: ParsedQuery) -> list[dict]:
|
||||||
table = parsed.table
|
for table in parsed.tables:
|
||||||
columns = parsed.columns
|
self._ensure_table(table, parsed)
|
||||||
|
return self._run_in_memory(parsed)
|
||||||
|
|
||||||
|
def _ensure_table(self, table: str, parsed: ParsedQuery) -> None:
|
||||||
|
if table in parsed.wildcard_tables:
|
||||||
|
self._ensure_full(table)
|
||||||
|
else:
|
||||||
|
self._ensure_columns(table, parsed.columns_by_table[table])
|
||||||
|
|
||||||
|
def _ensure_full(self, table: str) -> None:
|
||||||
|
"""Load every column of *table* (SELECT * / t.*), refetching unless already full."""
|
||||||
|
if self._cache.is_table_cached(table) and self._cache.is_table_full(table):
|
||||||
|
logger.debug(f"Cache hit (full): {table!r}")
|
||||||
|
self._stats.record_hit()
|
||||||
|
return
|
||||||
|
|
||||||
|
if self._cache.is_table_cached(table):
|
||||||
|
logger.warning(f"Re-fetching {table!r} in full — SELECT * requested.")
|
||||||
|
self._stats.record_refetch()
|
||||||
|
else:
|
||||||
|
self._stats.record_miss()
|
||||||
|
|
||||||
|
columns = self._cache.discover_columns(table, self._source_conn)
|
||||||
|
self._cache.load_table(table, columns, self._source_conn, full=True)
|
||||||
|
self._registry.update(table, columns)
|
||||||
|
|
||||||
|
def _ensure_columns(self, table: str, columns: list[str]) -> None:
|
||||||
|
"""Load *table* with at least *columns*, refetching only when columns are missing."""
|
||||||
missing = self._registry.needs_refetch(table, columns)
|
missing = self._registry.needs_refetch(table, columns)
|
||||||
table_cached = self._cache.is_table_cached(table)
|
table_cached = self._cache.is_table_cached(table)
|
||||||
|
|
||||||
if missing or not table_cached:
|
if not missing and table_cached:
|
||||||
|
logger.debug(f"Cache hit: {table!r} columns={columns}")
|
||||||
|
self._stats.record_hit()
|
||||||
|
return
|
||||||
|
|
||||||
if table_cached and missing:
|
if table_cached and missing:
|
||||||
logger.warning(
|
logger.warning(
|
||||||
f"Re-fetching {table!r} — new columns requested: {missing}. "
|
f"Re-fetching {table!r} — new columns requested: {missing}. "
|
||||||
@@ -37,18 +67,18 @@ class QueryExecutor:
|
|||||||
self._stats.record_refetch()
|
self._stats.record_refetch()
|
||||||
else:
|
else:
|
||||||
self._stats.record_miss()
|
self._stats.record_miss()
|
||||||
|
|
||||||
all_columns = list(self._registry.get_columns(table)) + missing
|
all_columns = list(self._registry.get_columns(table)) + missing
|
||||||
self._cache.load_table(table, all_columns, self._source_conn)
|
self._cache.load_table(table, all_columns, self._source_conn)
|
||||||
self._registry.update(table, all_columns)
|
self._registry.update(table, all_columns)
|
||||||
else:
|
|
||||||
logger.debug(f"Cache hit: {table!r} columns={columns}")
|
|
||||||
self._stats.record_hit()
|
|
||||||
|
|
||||||
return self._run_in_memory(parsed)
|
|
||||||
|
|
||||||
def _run_in_memory(self, parsed: ParsedQuery) -> list[dict]:
|
def _run_in_memory(self, parsed: ParsedQuery) -> list[dict]:
|
||||||
logger.debug(f"Executing in SQLite RAM: {parsed.original_sql!r}")
|
logger.debug(f"Executing in SQLite RAM: {parsed.sqlite_sql!r} params={parsed.params!r}")
|
||||||
cursor = self._cache.connection.execute(parsed.original_sql)
|
conn = self._cache.connection
|
||||||
|
if parsed.params is None:
|
||||||
|
cursor = conn.execute(parsed.sqlite_sql)
|
||||||
|
else:
|
||||||
|
cursor = conn.execute(parsed.sqlite_sql, parsed.params)
|
||||||
col_names = [desc[0] for desc in cursor.description]
|
col_names = [desc[0] for desc in cursor.description]
|
||||||
rows = cursor.fetchall()
|
rows = cursor.fetchall()
|
||||||
return [dict(zip(col_names, row)) for row in rows]
|
return [dict(zip(col_names, row)) for row in rows]
|
||||||
|
|||||||
+105
-39
@@ -1,25 +1,34 @@
|
|||||||
from dataclasses import dataclass
|
from dataclasses import dataclass, field
|
||||||
|
|
||||||
import sqlglot
|
import sqlglot
|
||||||
import sqlglot.expressions as exp
|
import sqlglot.expressions as exp
|
||||||
from loguru import logger
|
from loguru import logger
|
||||||
|
|
||||||
|
from .config import SQL_DIALECT
|
||||||
from .exceptions import ReadOnlyError, UnsupportedQueryError
|
from .exceptions import ReadOnlyError, UnsupportedQueryError
|
||||||
|
|
||||||
WRITE_TYPES = (exp.Insert, exp.Update, exp.Delete)
|
WRITE_TYPES = (exp.Insert, exp.Update, exp.Delete)
|
||||||
|
SQLITE_DIALECT = "sqlite"
|
||||||
|
|
||||||
|
# Parameters accepted by execute(): positional (tuple/list of ``?``) or named (dict of ``:name``).
|
||||||
|
Params = tuple | list | dict | None
|
||||||
|
|
||||||
|
|
||||||
@dataclass
|
@dataclass
|
||||||
class ParsedQuery:
|
class ParsedQuery:
|
||||||
table: str
|
tables: list[str]
|
||||||
columns: list[str]
|
columns_by_table: dict[str, list[str]]
|
||||||
|
sqlite_sql: str
|
||||||
original_sql: str
|
original_sql: str
|
||||||
|
params: Params = None
|
||||||
|
# Tables that must be loaded in full (SELECT * / t.* / referenced without explicit columns).
|
||||||
|
wildcard_tables: set[str] = field(default_factory=set)
|
||||||
|
|
||||||
|
|
||||||
def parse(sql: str) -> ParsedQuery:
|
def parse(sql: str, params: Params = None) -> ParsedQuery:
|
||||||
logger.debug(f"Parsing SQL: {sql!r}")
|
logger.debug(f"Parsing SQL: {sql!r}")
|
||||||
|
|
||||||
statement = sqlglot.parse_one(sql)
|
statement = sqlglot.parse_one(sql, dialect=SQL_DIALECT)
|
||||||
|
|
||||||
if isinstance(statement, WRITE_TYPES):
|
if isinstance(statement, WRITE_TYPES):
|
||||||
raise ReadOnlyError(
|
raise ReadOnlyError(
|
||||||
@@ -29,47 +38,104 @@ def parse(sql: str) -> ParsedQuery:
|
|||||||
if not isinstance(statement, exp.Select):
|
if not isinstance(statement, exp.Select):
|
||||||
raise UnsupportedQueryError(f"Only SELECT statements are supported, got: {sql!r}")
|
raise UnsupportedQueryError(f"Only SELECT statements are supported, got: {sql!r}")
|
||||||
|
|
||||||
_check_joins(statement)
|
tables, alias_map = _extract_tables(statement)
|
||||||
_check_wildcard(statement)
|
if not tables:
|
||||||
|
raise UnsupportedQueryError("SELECT without FROM is not supported.")
|
||||||
|
|
||||||
table = _extract_table(statement)
|
wildcard_tables = _extract_wildcards(statement, tables, alias_map)
|
||||||
columns = _extract_columns(statement)
|
columns_by_table = _extract_columns(statement, tables, alias_map, wildcard_tables)
|
||||||
|
|
||||||
logger.debug(f"Parsed → table={table!r}, columns={columns}")
|
# A table that appears in FROM/JOIN but contributes no explicit column must
|
||||||
return ParsedQuery(table=table, columns=columns, original_sql=sql)
|
# still be present for the in-memory query — load it in full.
|
||||||
|
for table in tables:
|
||||||
|
if table not in wildcard_tables and not columns_by_table.get(table):
|
||||||
|
wildcard_tables.add(table)
|
||||||
|
columns_by_table.pop(table, None)
|
||||||
|
|
||||||
|
sqlite_sql = _to_sqlite(statement)
|
||||||
|
|
||||||
|
logger.debug(
|
||||||
|
f"Parsed → tables={tables}, columns={columns_by_table}, "
|
||||||
|
f"wildcard={wildcard_tables}, params={params!r}"
|
||||||
|
)
|
||||||
|
return ParsedQuery(
|
||||||
|
tables=tables,
|
||||||
|
columns_by_table=columns_by_table,
|
||||||
|
sqlite_sql=sqlite_sql,
|
||||||
|
original_sql=sql,
|
||||||
|
params=params,
|
||||||
|
wildcard_tables=wildcard_tables,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
def _check_joins(statement: exp.Select) -> None:
|
def _extract_tables(statement: exp.Select) -> tuple[list[str], dict[str, str]]:
|
||||||
if statement.find(exp.Join):
|
"""Return real table names (first-seen order) and an alias→real-name map."""
|
||||||
raise UnsupportedQueryError("JOIN is not supported yet. Use simple single-table SELECT.")
|
real_names: list[str] = []
|
||||||
|
alias_map: dict[str, str] = {}
|
||||||
|
for table in statement.find_all(exp.Table):
|
||||||
|
name = table.name
|
||||||
|
if name not in real_names:
|
||||||
|
real_names.append(name)
|
||||||
|
alias_map[name] = name
|
||||||
|
if table.alias:
|
||||||
|
alias_map[table.alias] = name
|
||||||
|
return real_names, alias_map
|
||||||
|
|
||||||
|
|
||||||
def _check_wildcard(statement: exp.Select) -> None:
|
def _extract_wildcards(
|
||||||
|
statement: exp.Select, tables: list[str], alias_map: dict[str, str]
|
||||||
|
) -> set[str]:
|
||||||
|
"""Detect ``SELECT *`` (all tables) and ``alias.*`` (one table) in the projection."""
|
||||||
|
wildcard: set[str] = set()
|
||||||
|
for projection in statement.expressions:
|
||||||
|
if isinstance(projection, exp.Star):
|
||||||
|
return set(tables)
|
||||||
|
if isinstance(projection, exp.Column) and isinstance(projection.this, exp.Star):
|
||||||
|
qualifier = projection.table
|
||||||
|
wildcard.add(alias_map.get(qualifier, qualifier))
|
||||||
|
return wildcard
|
||||||
|
|
||||||
|
|
||||||
|
def _extract_columns(
|
||||||
|
statement: exp.Select,
|
||||||
|
tables: list[str],
|
||||||
|
alias_map: dict[str, str],
|
||||||
|
wildcard_tables: set[str],
|
||||||
|
) -> dict[str, list[str]]:
|
||||||
|
"""Map each table to the deduplicated columns referenced anywhere in the query."""
|
||||||
|
+    single = tables[0] if len(tables) == 1 else None
+    columns: dict[str, list[str]] = {}
+    seen: dict[str, set[str]] = {}
 
     for col in statement.find_all(exp.Column):
         if isinstance(col.this, exp.Star):
-            raise UnsupportedQueryError("SELECT * is not supported yet. Specify columns explicitly.")
-    if statement.find(exp.Star):
-        raise UnsupportedQueryError("SELECT * is not supported yet. Specify columns explicitly.")
+            continue
+        qualifier = col.table
+        if qualifier:
+            table = alias_map.get(qualifier, qualifier)
+        elif single is not None:
+            table = single
+        else:
+            raise UnsupportedQueryError(
+                f"Unqualified column {col.name!r} is ambiguous in a multi-table query; "
+                "qualify it with its table or alias."
+            )
+        if table in wildcard_tables:
+            continue
+        bucket = seen.setdefault(table, set())
+        if col.name not in bucket:
+            bucket.add(col.name)
+            columns.setdefault(table, []).append(col.name)
 
-
-def _extract_table(statement: exp.Select) -> str:
-    from_clause = statement.find(exp.From)
-    if not from_clause:
-        raise UnsupportedQueryError("SELECT without FROM is not supported.")
-    table = from_clause.find(exp.Table)
-    if not table:
-        raise UnsupportedQueryError("Could not extract table name from query.")
-    return table.name
-
-
-def _extract_columns(statement: exp.Select) -> list[str]:
-    seen: set[str] = set()
-    columns: list[str] = []
-    for col in statement.find_all(exp.Column):
-        name = col.name
-        if name not in seen:
-            seen.add(name)
-            columns.append(name)
-    if not columns:
-        raise UnsupportedQueryError("Could not extract column names from query.")
     return columns
+
+
+def _to_sqlite(statement: exp.Select) -> str:
+    """Render the statement as SQLite SQL, stripping catalog/schema prefixes.
+
+    Mutates *statement* in place; callers must extract metadata beforehand.
+    """
+    for table in statement.find_all(exp.Table):
+        table.set("db", None)
+        table.set("catalog", None)
+    return statement.sql(dialect=SQLITE_DIALECT)
@@ -1,5 +1,4 @@
 import sqlite3
-from pathlib import Path
 
 import pytest
 
@@ -1,6 +1,5 @@
 import importlib
 
-import pytest
 
 import sqlmem.config as cfg
 
+48
-5
@@ -1,5 +1,4 @@
 import sqlite3
-from pathlib import Path
 
 import pytest
 from sqlalchemy import create_engine
@@ -215,16 +214,60 @@ def test_delete_raises_readonly(engine):
         engine.execute("DELETE FROM products WHERE id = '1'")
 
 
-def test_join_raises_unsupported(engine):
+def test_ambiguous_unqualified_join_column_raises(engine):
     with pytest.raises(UnsupportedQueryError):
         engine.execute(
-            "SELECT p.name, o.qty FROM products p JOIN orders o ON p.id = o.product_id"
+            "SELECT name FROM products p JOIN orders o ON p.id = o.product_id"
         )
 
 
-def test_select_star_raises_unsupported(engine):
-    with pytest.raises(UnsupportedQueryError):
-        engine.execute("SELECT * FROM products")
+# ---------------------------------------------------------------------------
+# R1 — parametrized queries
+# ---------------------------------------------------------------------------
+
+def test_positional_param(engine):
+    rows = engine.execute("SELECT id, name FROM products WHERE id = ?", ("1",))
+    assert rows == [{"id": "1", "name": "Widget"}]
+
+
+def test_named_param(engine):
+    rows = engine.execute("SELECT name FROM products WHERE id = :id", {"id": "2"})
+    assert rows == [{"name": "Gadget"}]
+
+
+# ---------------------------------------------------------------------------
+# R2 — JOIN support
+# ---------------------------------------------------------------------------
+
+def test_join_two_tables(engine):
+    rows = engine.execute(
+        "SELECT p.name, o.qty FROM products p "
+        "JOIN orders o ON p.id = o.product_id WHERE p.id = ?",
+        ("1",),
+    )
+    assert rows == [{"name": "Widget", "qty": "2"}]
+
+
+def test_join_caches_both_tables(engine):
+    engine.execute(
+        "SELECT p.name, o.qty FROM products p JOIN orders o ON p.id = o.product_id"
+    )
+    assert engine._cache.is_table_cached("products") is True
+    assert engine._cache.is_table_cached("orders") is True
+
+
+# ---------------------------------------------------------------------------
+# R3 — SELECT *
+# ---------------------------------------------------------------------------
+
+def test_select_star_returns_all_columns(engine):
+    rows = engine.execute("SELECT * FROM products WHERE id = '1'")
+    assert rows == [{"id": "1", "name": "Widget", "price": "9.99"}]
+
+
+def test_select_star_marks_table_full(engine):
+    engine.execute("SELECT * FROM products")
+    assert engine._cache.is_table_full("products") is True
 
 
 # ---------------------------------------------------------------------------
@@ -0,0 +1,122 @@
+import sqlite3
+
+import pytest
+
+from sqlmem.cache import CacheManager
+from sqlmem.executor import QueryExecutor
+from sqlmem.parser import parse
+from sqlmem.registry import ColumnRegistry
+from sqlmem.stats import StatsCollector
+
+
+@pytest.fixture
+def source_conn():
+    conn = sqlite3.connect(":memory:")
+    conn.executescript(
+        """
+        CREATE TABLE users (id TEXT, name TEXT, status TEXT);
+        INSERT INTO users VALUES ('1', 'alice', 'active'), ('2', 'bob', 'inactive');
+        CREATE TABLE orders (id TEXT, user_id TEXT, total TEXT, title TEXT);
+        INSERT INTO orders VALUES ('10', '1', '99', 'first'), ('11', '2', '5', 'second');
+        """
+    )
+    conn.commit()
+    yield conn
+    conn.close()
+
+
+@pytest.fixture
+def executor(tmp_path, source_conn):
+    cache = CacheManager(db_path=tmp_path / "cache.db", backup_interval=9999)
+    registry = ColumnRegistry(cache.connection)
+    stats = StatsCollector()
+    ex = QueryExecutor(cache, registry, source_conn, stats)
+    yield ex
+    cache.close()
+
+
+def run(executor, sql, params=None):
+    return executor.execute(parse(sql, params))
+
+
+# --- R1: parameters ---------------------------------------------------------
+
+
+def test_param_filters_in_memory(executor):
+    rows = run(executor, "SELECT id, name FROM users WHERE id = ?", ("1",))
+    assert rows == [{"id": "1", "name": "alice"}]
+
+
+def test_param_no_match(executor):
+    rows = run(executor, "SELECT name FROM users WHERE id = ?", ("999",))
+    assert rows == []
+
+
+def test_named_params(executor):
+    rows = run(executor, "SELECT name FROM users WHERE id = :id", {"id": "2"})
+    assert rows == [{"name": "bob"}]
+
+
+# --- cache hit / miss / refetch --------------------------------------------
+
+
+def test_cache_hit_does_not_refetch(executor):
+    run(executor, "SELECT name FROM users")
+    run(executor, "SELECT name FROM users")
+    assert executor._stats.hits == 1
+    assert executor._stats.misses == 1
+
+
+def test_new_column_triggers_refetch(executor):
+    run(executor, "SELECT name FROM users")
+    run(executor, "SELECT name, status FROM users")
+    assert executor._stats.misses == 1
+    assert executor._stats.refetches == 1
+
+
+# --- R2: JOINs --------------------------------------------------------------
+
+
+def test_join_across_two_tables(executor):
+    rows = run(
+        executor,
+        "SELECT u.name, o.title FROM users u "
+        "JOIN orders o ON o.user_id = u.id WHERE u.id = ?",
+        ("1",),
+    )
+    assert rows == [{"name": "alice", "title": "first"}]
+
+
+def test_join_caches_each_table_independently(executor):
+    run(
+        executor,
+        "SELECT u.name, o.title FROM users u JOIN orders o ON o.user_id = u.id",
+    )
+    # two distinct tables loaded → two misses
+    assert executor._stats.misses == 2
+    assert executor._cache.is_table_cached("users")
+    assert executor._cache.is_table_cached("orders")
+
+
+# --- R3: SELECT * -----------------------------------------------------------
+
+
+def test_select_star_returns_all_columns(executor):
+    rows = run(executor, "SELECT * FROM users WHERE id = ?", ("1",))
+    assert rows == [{"id": "1", "name": "alice", "status": "active"}]
+
+
+def test_select_star_marks_table_full_and_hits(executor):
+    run(executor, "SELECT * FROM users")
+    run(executor, "SELECT * FROM users")
+    assert executor._cache.is_table_full("users")
+    assert executor._stats.misses == 1
+    assert executor._stats.hits == 1
+
+
+def test_column_query_after_star_is_a_hit(executor):
+    run(executor, "SELECT * FROM users")
+    run(executor, "SELECT name FROM users")
+    # full table already cached → specific column is a hit, no refetch
+    assert executor._stats.refetches == 0
+    assert executor._stats.hits == 1
+81
-9
@@ -6,16 +6,22 @@ from sqlmem.parser import parse
 
 def test_simple_select():
     result = parse("SELECT name, email FROM users WHERE status = 'active'")
-    assert result.table == "users"
+    assert result.tables == ["users"]
+    cols = result.columns_by_table["users"]
     # WHERE columns are also extracted — needed for in-memory SQLite filtering
-    assert {"name", "email"}.issubset(set(result.columns))
-    assert "status" in result.columns
+    assert {"name", "email"}.issubset(set(cols))
+    assert "status" in cols
 
 
 def test_multiple_columns():
     result = parse("SELECT a, b, c FROM orders")
-    assert result.table == "orders"
-    assert set(result.columns) == {"a", "b", "c"}
+    assert result.tables == ["orders"]
+    assert set(result.columns_by_table["orders"]) == {"a", "b", "c"}
 
 
+def test_columns_deduplicated_in_order():
+    result = parse("SELECT a, a, b FROM t WHERE a > 1")
+    assert result.columns_by_table["t"] == ["a", "b"]
+
+
 def test_insert_raises_readonly():
@@ -33,11 +39,77 @@ def test_delete_raises_readonly():
         parse("DELETE FROM users WHERE id = 1")
 
 
-def test_wildcard_raises_unsupported():
+def test_select_without_from_raises():
     with pytest.raises(UnsupportedQueryError):
-        parse("SELECT * FROM users")
+        parse("SELECT 1")
 
 
-def test_join_raises_unsupported():
+# --- R1: parameters ---------------------------------------------------------
+
+
+def test_params_stored():
+    result = parse("SELECT name FROM users WHERE id = ?", ("7189790",))
+    assert result.params == ("7189790",)
+    assert "?" in result.sqlite_sql
+
+
+def test_named_params_preserved():
+    result = parse("SELECT name FROM users WHERE id = :id", {"id": 1})
+    assert ":id" in result.sqlite_sql
+
+
+# --- R2: JOINs --------------------------------------------------------------
+
+
+def test_join_extracts_all_tables():
+    result = parse(
+        "SELECT a.id, b.title FROM users a "
+        "JOIN orders b ON a.id = b.user_id WHERE a.id = ?",
+        (1,),
+    )
+    assert set(result.tables) == {"users", "orders"}
+    assert "id" in result.columns_by_table["users"]
+    assert "title" in result.columns_by_table["orders"]
+    # join + where columns resolved to their tables via alias
+    assert "user_id" in result.columns_by_table["orders"]
+
+
+def test_join_unqualified_column_is_ambiguous():
     with pytest.raises(UnsupportedQueryError):
-        parse("SELECT a.name, b.title FROM users a JOIN orders b ON a.id = b.user_id")
+        parse("SELECT name FROM users a JOIN orders b ON a.id = b.user_id")
+
+
+# --- R3: SELECT * -----------------------------------------------------------
+
+
+def test_wildcard_marks_table_full():
+    result = parse("SELECT * FROM users")
+    assert result.wildcard_tables == {"users"}
+    assert result.columns_by_table == {}
+
+
+def test_qualified_wildcard_marks_only_that_table():
+    result = parse(
+        "SELECT u.*, o.total FROM users u JOIN orders o ON u.id = o.user_id"
+    )
+    assert "users" in result.wildcard_tables
+    assert "orders" not in result.wildcard_tables
+    assert "total" in result.columns_by_table["orders"]
+
+
+# --- R4: three-part names (MSSQL brackets) ----------------------------------
+
+
+def test_three_part_name_uses_base_table():
+    result = parse(
+        "SELECT [PRODUCT_PRODUCTNR], [PRAT_NAME] "
+        "FROM [DP_PIM].[dbo].[VW_P_PRATVALUES] WHERE PRODUCT_PRODUCTNR = ?",
+        ("7189790",),
+    )
+    assert result.tables == ["VW_P_PRATVALUES"]
+    cols = result.columns_by_table["VW_P_PRATVALUES"]
+    assert {"PRODUCT_PRODUCTNR", "PRAT_NAME"}.issubset(set(cols))
+    # in-memory SQL must drop the catalog/schema prefix
+    assert "DP_PIM" not in result.sqlite_sql
+    assert "dbo" not in result.sqlite_sql
+    assert "VW_P_PRATVALUES" in result.sqlite_sql