# PROJECT.md

This file is project-specific. Only include information directly related to the concrete project: goals, current state, architecture decisions, known issues, and tasks.

## Origin

Curator is a fork of the former **Tagger** project. The tagging, filtering and hardlink-tree parts are inherited and keep working as before. On top of that, Curator becomes a full **movie library manager (Filmotéka)**.

## Core idea

Curator manages a personal movie library based on two folders:

- **Pool**: the managed repository of video files. This is the **single source of truth**. Curator manages the pool itself (insert/remove file), so files are never moved by hand. The pool has exactly two top-level folders: **Filmy** (movies, organized as a tag-based tree) and **Seriály** (series, a "copy-as-is" folder mirrored 1:1 into the output; see Design decisions). Every file lives here exactly once.
- **Filmotéka (output)**: a generated, browsable directory tree made only of **hardlinks** into the pool (the same mechanism as today's hardlink manager). It is fully disposable: deleting the Filmotéka folder loses nothing, because it can always be regenerated from the pool.

### Workflow

1. The user configures two folders: the **pool** and the **Filmotéka output**.
2. The user picks a video file via "Open file".
3. Curator opens a dialog to fill in basic info: at minimum the **title/name** and a **ČSFD link**.
4. Curator **renames** the file, **copies** it (default) or **moves** it into the managed pool, and records its **metadata** in the pool index.
5. From the pool, Curator **generates the Filmotéka**: a tree of hardlinks built from each file's tags/metadata (like the current hardlink manager, but driven by the pool).
6. Deleting the Filmotéka has no effect on the pool; the tree is regenerated on demand.

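
The pool/output relationship described above can be illustrated with a minimal sketch. It is not the actual `HardlinkManager` code: the function name `generate_tree` and the `pool_files` mapping (pool file → list of output folders) are hypothetical, chosen only to show why the output is disposable and why one film can appear under several folders at no extra disk cost.

```python
from __future__ import annotations

import os
import shutil
import tempfile
from pathlib import Path


def generate_tree(pool_files: dict[Path, list[str]], output: Path) -> list[Path]:
    """Rebuild `output` from scratch as hardlinks into the pool.

    `pool_files` maps each pool file to the folder names it should appear
    under (hypothetical shape; the real index format may differ).
    """
    if output.exists():
        shutil.rmtree(output)  # safe: the pool still holds every file
    created = []
    for src, folders in pool_files.items():
        for folder in folders:
            dest = output / folder / src.name
            dest.parent.mkdir(parents=True, exist_ok=True)
            os.link(src, dest)  # hardlink: same data on disk, second name
            created.append(dest)
    return created


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        pool = Path(tmp) / "pool" / "Filmy"
        pool.mkdir(parents=True)
        movie = pool / "Matrix.mkv"
        movie.write_bytes(b"video bytes")
        for link in generate_tree({movie: ["Akční", "Sci-Fi"]}, Path(tmp) / "output"):
            print(link.relative_to(tmp))
```

Because every entry in the output is only a second name for data owned by the pool, removing the output directory and calling the function again restores the same tree. Hardlinks require the pool and the output to live on the same filesystem.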
## Current state

- Inherited from Tagger: `Tag`, `TagManager`, `File` (sidecar metadata), `FileManager` (folder scan, filtering, ignore patterns), 3-level config, `HardlinkManager` (create/sync/cleanup), pytest suite.
- Rename Tagger → Curator done across code, spec, config filenames (`.Curator.!gtag` / `.Curator.!ftag`) and tests.
- **PySide6 GUI** (`src/ui/qt_app.py`) reframed around the Filmotéka workflow is the entry point; the old tkinter `src/ui/gui.py` is retained for reference.
- **Pool + Filmotéka wired up:** global config holds `pool_dir` / `filmoteka_dir`; `FileManager` creates `Filmy`/`Seriály`, imports movies (copy → `Title.ext`), loads the pool, and the GUI generates the Filmotéka tree via `HardlinkManager`.
- `File` carries `title` + `csfd_link`. **Pool metadata lives in a unified index** (`/.Curator.!index`, see `pool_index.py`); `File` writes there when an index is injected, and still falls back to per-file `.!tag` sidecars for arbitrary (non-pool) folders.

### GUI decision

The GUI was **reframed around the Filmotéka** (not kept as a generic tagger) and **rewritten in PySide6**: Pool/Filmotéka setup, Import movie, tag-filter sidebar, movie table, and one-click Filmotéka generation.

## Design decisions

- **Metadata storage:** one **unified metadata file** for the whole pool (a central index), not per-file sidecars. Justified because Curator owns the pool and files are never moved manually, so it is not exposed to path drift.
- **Import dialog:** **multi-file**. Pick several videos at once and give each its own **Title** + **ČSFD link** (one row per file, more can be added from the dialog), or auto-fill the links with **"Najít ČSFD odkazy"** (cleans each filename into a query and fills the first ČSFD search hit; existing links are kept). A single **copy/move** toggle decides whether the sources are copied (default) or moved into the pool. Each file is renamed to `Title.ext`.
  When a ČSFD link is given, Curator fetches the movie and assigns Žánr / Rok / Země původu / Hodnocení (ten-point band) tags automatically; further tags can be added via the UI. Directors and the first 10 actors are fetched and cached too, but **deliberately not turned into tags/folders** (there would be too many).
- **Genres / countries:** a movie can have **multiple genres** and, for a co-production, **multiple countries of origin** (ČSFD writes them slash-separated, e.g. "USA / Velká Británie"). Each becomes its own tag, so the film appears under every matching genre and country branch in the Filmotéka (multiple hardlinks).
- **Pool layout:** two top-level folders, **Filmy** and **Seriály**. Movies are the first target; the Seriály branch follows the "copy-as-is" rule below.
- **Copy-as-is folders (Seriály):** a subfolder inside the pool can be marked as **copy / as-is**. For such a folder Curator does **not** build a tag-based tree; instead it **mirrors the exact directory hierarchy** from the pool into the Filmotéka output, with the files materialized as **hardlinks** into the pool. So `pool/Seriály/...` is cloned 1:1 into `output/Seriály/...` (same structure, hardlinked files). This is how Seriály work.
- **File naming:** imported movies are renamed to **`Title.ext`** (no year in the filename; year lives in metadata/tags).
- **Import copy vs move:** by default the original file is **copied** into the pool (non-destructive); the import dialog also offers a **move** option that relocates the source into the pool instead.
- **Filmotéka tree layout:** driven by a category → root-folder map (`FILMOTEKA_CATEGORY_ROOTS`). At the output root sit the **genre folders directly** (`output/Akční/film`, …), next to the copy-as-is mirrors (**Seriály**), plus three grouping folders: **`Dle roku`** (`output/Dle roku//film`), **`Dle země původu`** (`output/Dle země původu//film`) and `Dle hodnocení`. Every film entry in these folders is a hardlink.
  `HardlinkManager` supports an empty root (tag folders placed directly at the output root) and restricts obsolete cleanup to the tag-tree's own top-level folders so mirrors are never touched.
- **Tag schema (config-driven, not hard-coded):** the categories, their ČSFD source field + transform, and their Filmotéka folder mapping all live in `tag_schema` in the global config (default `config.DEFAULT_TAG_SCHEMA`, edited via *Nastavení → Tag schéma…*). Both `apply_csfd_tags` (which fields → tags) and the Filmotéka layout (`FileManager.filmoteka_category_roots`) read from it, so adding a category or changing a folder rule needs no code change. A category can be made filter-only (no folders) by setting its `filmoteka_root` to null. The `transform` (e.g. `decade_band`) shapes only the **folder name**; tags keep the **exact value** (rating → tag `Hodnocení/90`, folder `Dle hodnocení/90–100 %`). It is applied at Filmotéka generation via `filmoteka_category_transforms`.
- **Per-category filename template** (`filename_template` in a schema entry): the hardlink name **inside that category's folders only** is rendered from the movie's metadata (`File.name_context`: title/year/rating/ext/stem/filename plus any free-form attributes), e.g. a Kolekce with `"{collection_sort} - {title}{ext}"`. Other folders and the pool file keep the plain name; applied via `filmoteka_category_filename_templates`.
- **Free-form per-movie attributes** (`File.attributes`, set in the GUI): arbitrary `key → value` metadata stored in the index and merged into `name_context`, so custom fields like `collection_sort` can drive filename templates.
- **Tag provenance (ČSFD vs user):** each file records which tags came from ČSFD (`csfd_tags`). Re-fetching regenerates only those; user-added tags are kept, so changing a movie's ČSFD link refreshes ČSFD tags without losing manual ones.

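
The schema-driven rules above (root folder per category, folder-name transform, per-category filename template, filter-only categories) can be sketched as follows. This is an illustration only: the `SCHEMA` dict shape, the `link_paths` helper, the `Viděno` category and the exact banding arithmetic in `decade_band` are assumptions, not the contents of `config.DEFAULT_TAG_SCHEMA` or the project's real transform.

```python
from __future__ import annotations


def decade_band(value: str) -> str:
    """Assumed banding: floor a 0-100 rating to its ten-point band."""
    low = min(int(value) // 10 * 10, 90)  # 100 falls into the top band
    return f"{low}–{low + 10} %"


TRANSFORMS = {"decade_band": decade_band}

# Hypothetical schema shape; only the key names filmoteka_root, transform
# and filename_template come from the project notes.
SCHEMA = {
    "Žánr": {"filmoteka_root": "", "transform": None, "filename_template": None},
    "Hodnocení": {"filmoteka_root": "Dle hodnocení", "transform": "decade_band",
                  "filename_template": None},
    "Kolekce": {"filmoteka_root": "Kolekce", "transform": None,
                "filename_template": "{collection_sort} - {title}{ext}"},
    "Viděno": {"filmoteka_root": None, "transform": None,
               "filename_template": None},  # filter-only: no folders
}


def link_paths(tags: list[str], context: dict[str, str], schema: dict) -> list[str]:
    """Return output-relative hardlink paths for one movie."""
    paths = []
    for tag in tags:
        category, _, value = tag.partition("/")
        entry = schema.get(category)
        if entry is None or entry["filmoteka_root"] is None:
            continue  # unknown or filter-only category
        transform = TRANSFORMS.get(entry["transform"])
        folder = transform(value) if transform else value  # tag value itself is untouched
        template = entry["filename_template"] or "{title}{ext}"
        name = template.format(**context)
        parts = [entry["filmoteka_root"], folder, name]
        paths.append("/".join(p for p in parts if p))  # empty root = output root
    return paths


context = {"title": "Matrix", "ext": ".mkv", "collection_sort": "01"}
tags = ["Žánr/Akční", "Hodnocení/90", "Kolekce/Matrix", "Viděno/Ano"]
for path in link_paths(tags, context, SCHEMA):
    print(path)
```

In this sketch the tag `Hodnocení/90` keeps its exact value while the folder becomes `Dle hodnocení/90–100 %`, the empty root places genre folders directly at the output root, and only the Kolekce folder uses the custom filename.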
## Tasks

(no open tasks; see Done)

## Done

- Pool-root and Filmotéka-output folder settings in the global config
- Filmy / Seriály top-level folder handling in the pool
- "Import movie" dialog (Title + ČSFD link), copy into pool/Filmy as Title.ext
- Rename a pooled movie from the app (`FileManager.rename_movie`): renames the file in pool/Filmy and moves its metadata to the new index key
- Remove-from-pool (delete file + its metadata)
- Generate the Filmotéka hardlink tree from the pool (Rok / Žánr / Země původu / Hodnocení)
- Filmotéka fully regenerable from the pool alone (delete output = no loss)
- GUI reframed around the Filmotéka and rewritten in PySide6
- Seriály "copy-as-is" mirror: pool/Seriály cloned 1:1 into the output as hardlinks (`HardlinkManager.mirror_as_is`), wired into Filmotéka generation
- Fixed `media_utils` missing `subprocess` import
- Unified pool metadata index (`pool_index.py`): one `.Curator.!index` per pool; `File` reads/writes it when injected, `FileManager` uses it for the pool
- Configurable copy-as-is folders (`copyasis_folders` in global config, editable from the GUI); each is mirrored 1:1 during Filmotéka generation (Seriály default)
- README.md written (overview, concepts, workflow, run/build instructions)
- ČSFD scraping (`csfd.py`, ported from Tagger devel): `File.apply_csfd_tags` fetches a movie and assigns Žánr / Rok / Země původu tags (cached in metadata); wired into the GUI (auto-fetch on import with a ČSFD link, plus "Načíst tagy z ČSFD"). Parsing updated for current ČSFD HTML and verified live against Matrix (film/9499); HTTPS uses the OS cert store via `truststore` (corporate SSL)
- ČSFD Anubis anti-bot wall handled: `csfd.py` detects the proof-of-work challenge page, solves it (SHA-256 PoW matching the bundled worker JS) and replays via a shared `requests.Session`, so Žánr / Rok / Země původu tags load again (the "nalezeno 1 film, načteno 0 tagů" symptom).
  Verified live (Matrix 1999)
- Removed the inherited Tagger predefined tags: `DEFAULT_TAGS` is now empty (no Hodnocení ⭐ / Barva categories) and new files no longer get an automatic `Stav/Nové` tag. Tags now come from ČSFD (Žánr / Rok / Země původu) and manual edits. Note: `Hodnocení` is still listed in `FILMOTEKA_CATEGORIES`, so that branch is simply empty until something assigns a Hodnocení tag again
- Fixed template cruft: `src/constants.py` made consistent (Curator values, `get_version`/`get_debug_mode` API) and `test_constants.py` aligned; removed the imported `tagger/` devel dump