Add ČSFD Anubis bypass, drop legacy preset tags, rename Země → Země původu

2026-06-12 20:30:14 +02:00
parent 22a14b1e41
commit 86c689b9f1
14 changed files with 349 additions and 146 deletions
+24 -1
@@ -44,7 +44,7 @@ Each version entry uses these sections (include only those that apply):
- Project `README.md` (overview, concepts, workflow, run/build instructions).
- **ČSFD scraping** (`csfd.py`, ported from the Tagger devel branch): fetches
movie data from a ČSFD link (JSON-LD + HTML parsing). `File.apply_csfd_tags`
-  assigns Žánr / Rok / Země tags and caches the fetched data in the metadata.
+  assigns Žánr / Rok / Země původu tags and caches the fetched data in the metadata.
The GUI auto-fetches on import when a link is given and offers "Načíst tagy
z ČSFD" for selected movies.
- App startup injects `truststore` so HTTPS uses the OS certificate store —
@@ -56,12 +56,35 @@ Each version entry uses these sections (include only those that apply):
- ČSFD parsing updated for the current site HTML: year is read from JSON-LD
`dateCreated`, and the origin line (now bullet-separated, no commas) is
tokenized so country / year / duration are extracted correctly.
- **ČSFD anti-bot wall (Anubis):** ČSFD now serves a proof-of-work challenge
page instead of the movie, so fetches returned a film with no genres/year
("načteno 0 tagů"). `csfd.py` now detects the Anubis challenge, solves the
SHA-256 proof-of-work the way the bundled worker JS does, and replays the
request through a `requests.Session` (reused across a batch so only the first
fetch pays the PoW cost). Žánr / Rok / Země původu tags load again.
- "Assign tags" dialog crashed on PySide6/Qt6 — `Qt.ItemIsTristate` was renamed
to `Qt.ItemIsAutoTristate`.
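
The proof-of-work step in the Anubis entry can be illustrated with a minimal sketch. It is not the code in `csfd.py`, which is not shown here. It assumes the challenge is satisfied by a nonce for which SHA-256 of the challenge string followed by the nonce has a given number of leading zero hex digits; the function name `solve_pow` and its parameters are hypothetical.

```python
import hashlib
from itertools import count


def solve_pow(challenge: str, difficulty: int) -> tuple[int, str]:
    """Brute-force a nonce whose hash has `difficulty` leading zero hex digits.

    Assumption: the hash input is the challenge string followed by the
    decimal nonce. The real challenge format may differ.
    """
    prefix = "0" * difficulty
    for nonce in count():
        digest = hashlib.sha256(f"{challenge}{nonce}".encode("utf-8")).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
    raise RuntimeError("unreachable: count() is unbounded")
```

Each extra hex digit of difficulty multiplies the expected work by 16, which is why reusing one `requests.Session` (and the cookie obtained after the first solve) across a batch matters.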
### Changed
- ČSFD country tag category renamed **Země → Země původu**. Added
`scripts/migrate_tag_category.py` to rewrite the category in an existing pool
index (backs up `.Curator.!index` first); run against the live pool.
- Filmotéka tree now also builds the **Země původu** branch — it was missing
from `FILMOTEKA_CATEGORIES`, so the country level was never generated. Tree
categories are now Rok / Žánr / Země původu / Hodnocení.
- Movie table trimmed to **Název / Štítky / Velikost** — the Datum and ČSFD
columns were dropped (a ČSFD link is a prerequisite, so its indicator was
always the same).
- All references to "Tagger" renamed to "Curator" (code, spec, config filenames
`.Curator.!gtag` / `.Curator.!ftag`, tests).
- `requires-python` narrowed to `>=3.14,<3.15` (PySide6 compatibility).
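
The backup-then-rewrite pattern behind the category migration might look like the sketch below. It is not the contents of `scripts/migrate_tag_category.py`. The JSON layout of the index (`files` → `tags` → `category`), the `.bak` suffix, and the function name `migrate_category` are all assumptions made for illustration.

```python
import json
import shutil
from pathlib import Path


def migrate_category(index_path: Path, old: str, new: str) -> int:
    """Rename a tag category in a JSON index; return the number of tags changed."""
    # Back up first so a failed rewrite can be rolled back by hand.
    backup = index_path.with_name(index_path.name + ".bak")
    shutil.copy2(index_path, backup)

    data = json.loads(index_path.read_text(encoding="utf-8"))
    changed = 0
    # Hypothetical layout: {"files": [{"tags": [{"category": ..., "name": ...}]}]}
    for entry in data.get("files", []):
        for tag in entry.get("tags", []):
            if tag.get("category") == old:
                tag["category"] = new
                changed += 1

    index_path.write_text(
        json.dumps(data, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return changed
```

Under these assumptions a call would be `migrate_category(Path(".Curator.!index"), "Země", "Země původu")`, which leaves the backup next to the index.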
### Removed
- Legacy Tagger predefined tags: the always-available **Hodnocení** (⭐ rating)
and **Barva** (color) categories in `TagManager`, and the automatic
**Stav/Nové** tag assigned to every newly imported file. `DEFAULT_TAGS` is now
empty; the pool is driven by ČSFD-derived tags (Žánr / Rok / Země původu).
### Dependencies
- Added `pyside6` (GUI), `requests` + `beautifulsoup4` (ČSFD scraping),
`truststore` (OS cert store for HTTPS). Declared `python-dotenv`, `pillow`,