# PROJECT.md

This file is project-specific. Only include information directly related to the concrete project: goals, current state, architecture decisions, known issues, and tasks.

## Origin

Curator is a fork of the former **Tagger** project. The tagging, filtering and
hardlink-tree parts are inherited and keep working as before. On top of that,
Curator becomes a full **movie library manager (Filmotéka)**.

## Core idea

Curator manages a personal movie library based on two folders:

- **Pool**: the managed repository of video files. This is the **single source
  of truth**. Curator manages the pool itself (insert/remove file), so files are
  never moved by hand. The pool has exactly two top-level folders: **Filmy**
  (movies, a tag-based tree) and **Seriály** (series, a "copy-as-is" folder
  mirrored 1:1 into the output; see Design decisions). Every file lives here
  exactly once.
- **Filmotéka (output)**: a generated, browsable directory tree made only of
  **hardlinks** into the pool (the same mechanism as today's hardlink manager).
  It is fully disposable: deleting the Filmotéka folder loses nothing, because
  it can always be regenerated from the pool.
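
The "disposable output" property follows directly from how hardlinks behave, and a minimal sketch may make that concrete. The helper `link_into_output` and the `demo` function below are hypothetical illustrations, not Curator's actual `HardlinkManager` API; the folder and file names are sample values.

```python
import os
import shutil
import tempfile
from pathlib import Path


def link_into_output(pool_file: Path, output_dir: Path) -> Path:
    """Create a hardlink to pool_file inside output_dir and return its path."""
    output_dir.mkdir(parents=True, exist_ok=True)
    target = output_dir / pool_file.name
    if not target.exists():
        os.link(pool_file, target)  # both names now refer to the same file data
    return target


def demo() -> bool:
    """Build a tiny pool and output, delete the output, report whether the pool file survived."""
    root = Path(tempfile.mkdtemp())
    try:
        pool_file = root / "pool" / "Filmy" / "Matrix.mkv"
        pool_file.parent.mkdir(parents=True)
        pool_file.write_bytes(b"video data")

        link = link_into_output(pool_file, root / "output" / "Akční")
        assert os.path.samefile(pool_file, link)  # one file, two directory entries

        shutil.rmtree(root / "output")  # throw the whole output tree away
        return pool_file.read_bytes() == b"video data"
    finally:
        shutil.rmtree(root, ignore_errors=True)
```

Because a hardlink is only a second directory entry for the same data, removing the output tree removes names, not content. One practical consequence is that the pool and the output must live on the same filesystem (volume) for `os.link` to succeed.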

### Workflow

1. The user configures two folders: the **pool** and the **Filmotéka output**.
2. The user picks a video file via "Open file".
3. Curator opens a dialog to fill in basic info: at minimum the **title/name**
   and a **ČSFD link**.
4. Curator **renames** the file, **copies** it into the managed pool (or **moves**
   it, if that option is chosen in the dialog), and records its **metadata** in
   the pool index.
5. From the pool, Curator **generates the Filmotéka**: a tree of
   hardlinks built from each file's tags/metadata (like the current hardlink
   manager, but driven by the pool).
6. Deleting the Filmotéka has no effect on the pool; the tree is regenerated on
   demand.
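
Step 4 of the workflow can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the project's `FileManager` code: the function name `import_movie`, the JSON layout of the index and the `Filmy/<name>` index key are all hypothetical. Only the `Filmy` folder, the `Title.ext` naming and the `.Curator.!index` filename come from this document.

```python
import json
import shutil
from pathlib import Path

INDEX_NAME = ".Curator.!index"  # filename from this document; the JSON layout is assumed


def import_movie(src: Path, title: str, csfd_link: str, pool: Path, move: bool = False) -> Path:
    """Copy (default) or move src into pool/Filmy as Title.ext and record it in the index."""
    filmy = pool / "Filmy"
    filmy.mkdir(parents=True, exist_ok=True)
    dest = filmy / f"{title}{src.suffix}"
    if dest.exists():
        raise FileExistsError(dest)  # never overwrite a pooled file silently

    if move:
        shutil.move(str(src), str(dest))
    else:
        shutil.copy2(src, dest)  # non-destructive default

    # Read-modify-write of the single pool-wide index (assumed to be a JSON object).
    index_path = pool / INDEX_NAME
    index = json.loads(index_path.read_text(encoding="utf-8")) if index_path.exists() else {}
    index[f"Filmy/{dest.name}"] = {"title": title, "csfd_link": csfd_link, "tags": []}
    index_path.write_text(json.dumps(index, ensure_ascii=False, indent=2), encoding="utf-8")
    return dest
```

A real implementation would also need to sanitize titles that contain characters not allowed in filenames; the sketch leaves that out.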

## Current state

- Inherited from Tagger: `Tag`, `TagManager`, `File` (sidecar metadata),
  `FileManager` (folder scan, filtering, ignore patterns), 3-level config,
  `HardlinkManager` (create/sync/cleanup), pytest suite.
- The Tagger → Curator rename is done across code, spec, config filenames
  (`.Curator.!gtag` / `.Curator.!ftag`) and tests.
- The **PySide6 GUI** (`src/ui/qt_app.py`), reframed around the Filmotéka workflow,
  is the entry point; the old tkinter `src/ui/gui.py` is retained for reference.
- **Pool + Filmotéka wired up:** global config holds `pool_dir` / `filmoteka_dir`;
  `FileManager` creates `Filmy`/`Seriály`, imports movies (copy → `Title.ext`)
  and loads the pool; the GUI generates the Filmotéka tree via `HardlinkManager`.
- `File` carries `title` + `csfd_link`. **Pool metadata lives in a unified index**
  (`<pool>/.Curator.!index`, see `pool_index.py`); `File` writes there when an
  index is injected, and still falls back to per-file `.!tag` sidecars for
  arbitrary (non-pool) folders.
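
The "index when injected, sidecar otherwise" behaviour described in the last bullet could look roughly like this. The classes `PoolIndex` and `FileMeta` are hypothetical stand-ins, not the real `pool_index.py` or `File` classes, and the sidecar naming (`<filename>.!tag`) and JSON encoding are assumptions made for the sketch.

```python
import json
from pathlib import Path
from typing import Optional


class PoolIndex:
    """Hypothetical in-memory stand-in for the unified pool index."""

    def __init__(self) -> None:
        self.entries: dict = {}


class FileMeta:
    """Hypothetical metadata holder: uses the index when one is injected, a sidecar otherwise."""

    def __init__(self, path: Path, index: Optional[PoolIndex] = None) -> None:
        self.path = path
        self.index = index

    @property
    def sidecar(self) -> Path:
        # Assumed naming: the sidecar sits next to the file as "<filename>.!tag".
        return self.path.with_name(self.path.name + ".!tag")

    def save(self, meta: dict) -> None:
        if self.index is not None:
            self.index.entries[self.path.name] = meta  # pool file: central index
        else:
            self.sidecar.write_text(json.dumps(meta, ensure_ascii=False), encoding="utf-8")

    def load(self) -> dict:
        if self.index is not None:
            return self.index.entries.get(self.path.name, {})
        if self.sidecar.exists():
            return json.loads(self.sidecar.read_text(encoding="utf-8"))
        return {}
```

Injecting the index keeps the per-file class usable for both cases: pool files never produce sidecars, while folders outside the pool keep working without any index at all.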

### GUI decision

The GUI was **reframed around the Filmotéka** (not kept as a generic tagger) and
**rewritten in PySide6**: Pool/Filmotéka setup, Import movie, tag-filter sidebar,
movie table, and one-click Filmotéka generation.

## Design decisions

- **Metadata storage:** one **unified metadata file** for the whole pool (a
  central index), not per-file sidecars. Justified because Curator owns the pool
  and files are never moved manually, so index entries cannot drift away from
  the file paths they describe.
- **Import dialog:** **multi-file**. Pick several videos at once and give each
  its own **Title** + **ČSFD link** (one row per file; more rows can be added from
  the dialog), or auto-fill the links with **"Najít ČSFD odkazy"** ("Find ČSFD
  links"), which cleans each filename into a query and fills in the first ČSFD
  search hit; existing links are kept. A single **copy/move** toggle decides
  whether the sources are copied (default) or moved into the pool. Each file is
  renamed to `Title.ext`. When a ČSFD link is given, Curator fetches the movie
  and automatically assigns Žánr (genre) / Rok (year) / Země původu (country of
  origin) / Hodnocení (rating, ten-point band) tags; further tags can be
  added via the UI. Directors and the first 10 actors are fetched and cached too,
  but **deliberately not turned into tags/folders** (there would be too many).
- **Genres / countries:** a movie can have **multiple genres** and, for a
  co-production, **multiple countries of origin** (ČSFD writes them
  slash-separated, e.g. "USA / Velká Británie"). Each becomes its own tag, so the
  film appears under every matching genre and country branch in the Filmotéka
  (multiple hardlinks).
- **Pool layout:** two top-level folders, **Filmy** and **Seriály**. Movies are
  the first target; the Seriály branch follows the "copy-as-is" rule below.
- **Copy-as-is folders (Seriály):** a subfolder inside the pool can be marked as
  **copy / as-is**. For such a folder Curator does **not** build a tag-based tree;
  instead it **mirrors the exact directory hierarchy** from the pool into the
  Filmotéka output, with the files materialized as **hardlinks** into the pool.
  So `pool/Seriály/...` is cloned 1:1 into `output/Seriály/...` (same structure,
  hardlinked files). This is how Seriály work.
- **File naming:** imported movies are renamed to **`Title.ext`** (no year in the
  filename; year lives in metadata/tags).
- **Import copy vs move:** by default the original file is **copied** into the
  pool (non-destructive); the import dialog also offers a **move** option that
  relocates the source into the pool instead.
- **Filmotéka tree layout:** driven by a category → root-folder map
  (`FILMOTEKA_CATEGORY_ROOTS`). At the output root sit the **genre folders
  directly** (`output/Akční/film`, …), next to the copy-as-is mirrors
  (**Seriály**), plus three grouping folders: **`Dle roku`** ("by year",
  `output/Dle roku/<rok>/film`), **`Dle země původu`** ("by country of origin",
  `output/Dle země původu/<země>/film`) and **`Dle hodnocení`** ("by rating").
  Every `film` entry in these folders is a hardlink into the pool.
  `HardlinkManager` supports an empty root (tag folders placed directly at the
  output root) and restricts obsolete cleanup to the tag-tree's own top-level
  folders so mirrors are never touched.
- **Tag schema (config-driven, not hard-coded):** the categories, their ČSFD
  source field + transform, and their Filmotéka folder mapping all live in
  `tag_schema` in the global config (default `config.DEFAULT_TAG_SCHEMA`, edited
  via *Nastavení → Tag schéma…*). Both `apply_csfd_tags` (which fields → tags)
  and the Filmotéka layout (`FileManager.filmoteka_category_roots`) read from it,
  so adding a category or changing a folder rule needs no code change. A category
  can be made filter-only (no folders) by setting its `filmoteka_root` to null.
  The `transform` (e.g. `decade_band`) shapes only the **folder name**; tags keep
  the **exact value** (rating → tag `Hodnocení/90`, folder `Dle hodnocení/90–100 %`).
  It is applied at Filmotéka generation via `filmoteka_category_transforms`.
- **Per-category filename template** (`filename_template` in a schema entry): the
  hardlink name **inside that category's folders only** is rendered from the
  movie's metadata (`File.name_context`: title/year/rating/ext/stem/filename plus
  any free-form attributes), e.g. a Kolekce (collection) category with
  `"{collection_sort} - {title}{ext}"`. Other folders and the pool file keep the
  plain name. The template is applied via `filmoteka_category_filename_templates`.
- **Free-form per-movie attributes** (`File.attributes`, set in the GUI): arbitrary
  `key → value` metadata stored in the index and merged into `name_context`, so
  custom fields like `collection_sort` can drive filename templates.
- **Tag provenance (ČSFD vs user):** each file records which tags came from ČSFD
  (`csfd_tags`). Re-fetching regenerates only those; user-added tags are kept, so
  changing a movie's ČSFD link refreshes ČSFD tags without losing manual ones.
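
To illustrate how the schema-driven layout, the folder-name transform and the per-category filename template fit together, here is a small sketch. It is not the project's code: the schema keys mirror names used in the bullets above (`filmoteka_root`, `transform`, `filename_template`), but the exact shape of `config.DEFAULT_TAG_SCHEMA` is assumed, the `Jazyk` category and the `Kolekce` root folder are invented for the example, and `split_csfd_values`, `decade_band` and `plan_links` are hypothetical helpers.

```python
from pathlib import PurePosixPath

# Hypothetical schema; the real config.DEFAULT_TAG_SCHEMA may use different keys.
TAG_SCHEMA = {
    "Žánr": {"filmoteka_root": ""},  # empty root: genre folders sit at the output root
    "Rok": {"filmoteka_root": "Dle roku"},
    "Země původu": {"filmoteka_root": "Dle země původu"},
    "Hodnocení": {"filmoteka_root": "Dle hodnocení", "transform": "decade_band"},
    "Kolekce": {"filmoteka_root": "Kolekce",
                "filename_template": "{collection_sort} - {title}{ext}"},
    "Jazyk": {"filmoteka_root": None},  # invented filter-only category: no folders
}


def split_csfd_values(raw: str) -> list:
    """Split a slash-separated ČSFD field such as 'USA / Velká Británie' into values."""
    return [part.strip() for part in raw.split("/") if part.strip()]


def decade_band(value: str) -> str:
    """Map an exact rating such as '90' to a ten-point band used as the folder name."""
    low = min(int(value) // 10 * 10, 90)  # 100 falls into the top band
    return f"{low}–{low + 10} %"


TRANSFORMS = {"decade_band": decade_band}


def plan_links(tags, context, schema=TAG_SCHEMA):
    """Return the relative output paths at which one movie would be hardlinked."""
    plain_name = f"{context['title']}{context['ext']}"
    paths = []
    for tag in tags:
        category, _, value = tag.partition("/")
        entry = schema.get(category)
        if entry is None or entry.get("filmoteka_root") is None:
            continue  # unknown or filter-only category: no folder
        transform = TRANSFORMS.get(entry.get("transform"))
        folder = transform(value) if transform else value  # the tag keeps its exact value
        template = entry.get("filename_template")
        name = template.format(**context) if template else plain_name
        paths.append(str(PurePosixPath(entry["filmoteka_root"]) / folder / name))
    return paths
```

With tags `["Žánr/Akční", "Rok/1999", "Hodnocení/90"]` and a context of `{"title": "Matrix", "ext": ".mkv"}`, the plan contains one path per tag, and only the rating path uses the banded folder name. Keeping the plan as pure path computation, separate from the actual `os.link` calls, makes the layout rules easy to unit-test.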

## Tasks

(no open tasks; see Done)

## Done

- Pool-root and Filmotéka-output folder settings in the global config
- Filmy / Seriály top-level folder handling in the pool
- "Import movie" dialog (Title + ČSFD link), copy into pool/Filmy as Title.ext
- Rename a pooled movie from the app (`FileManager.rename_movie`): renames the
  file in pool/Filmy and moves its metadata to the new index key
- Remove-from-pool (delete file + its metadata)
- Generate the Filmotéka hardlink tree from the pool (Rok / Žánr / Země původu /
  Hodnocení)
- Filmotéka fully regenerable from the pool alone (delete output = no loss)
- GUI reframed around the Filmotéka and rewritten in PySide6
- Seriály "copy-as-is" mirror: pool/Seriály cloned 1:1 into the output as
  hardlinks (`HardlinkManager.mirror_as_is`), wired into Filmotéka generation
- Fixed the missing `subprocess` import in `media_utils`
- Unified pool metadata index (`pool_index.py`): one `.Curator.!index` per pool;
  `File` reads/writes it when injected, `FileManager` uses it for the pool
- Configurable copy-as-is folders (`copyasis_folders` in global config, editable
  from the GUI); each is mirrored 1:1 during Filmotéka generation (Seriály default)
- README.md written (overview, concepts, workflow, run/build instructions)
- ČSFD scraping (`csfd.py`, ported from Tagger devel): `File.apply_csfd_tags`
  fetches a movie and assigns Žánr / Rok / Země původu tags (cached in metadata);
  wired into the GUI (auto-fetch on import with a ČSFD link, plus "Načíst tagy z
  ČSFD", i.e. "Load tags from ČSFD"). Parsing updated for current ČSFD HTML and
  verified live against Matrix (film/9499); HTTPS uses the OS cert store via
  `truststore` (corporate SSL)
- ČSFD Anubis anti-bot wall handled: `csfd.py` detects the proof-of-work
  challenge page, solves it (SHA-256 PoW matching the bundled worker JS) and
  replays the request via a shared `requests.Session`, so Žánr / Rok / Země
  původu tags load again. This fixes the "nalezeno 1 film, načteno 0 tagů"
  ("found 1 film, loaded 0 tags") symptom. Verified live (Matrix 1999)
- Removed the inherited Tagger predefined tags: `DEFAULT_TAGS` is now empty
  (no Hodnocení ⭐ / Barva categories) and new files no longer get an automatic
  `Stav/Nové` tag. Tags now come from ČSFD (Žánr / Rok / Země původu) and manual edits.
  Note: `Hodnocení` is still listed in `FILMOTEKA_CATEGORIES`, so that branch is
  simply empty until something assigns a Hodnocení tag again
- Fixed template cruft: `src/constants.py` made consistent (Curator values,
  `get_version`/`get_debug_mode` API) and `test_constants.py` aligned; removed
  the imported `tagger/` devel dump
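
For reference, the Seriály "copy-as-is" mirror listed above amounts to walking the source tree and hardlinking each file at the same relative path. The function below is a minimal sketch of that idea with a hypothetical name (`mirror_tree`); it is not the actual `HardlinkManager.mirror_as_is` implementation and it does not remove entries that have disappeared from the pool.

```python
import os
from pathlib import Path


def mirror_tree(src_root: Path, dst_root: Path) -> int:
    """Recreate src_root's directory tree under dst_root, hardlinking every file.

    Returns the number of hardlinks created; existing targets are left alone,
    so running it again on an unchanged tree creates nothing.
    """
    created = 0
    for dirpath, _dirnames, filenames in os.walk(src_root):
        rel = Path(dirpath).relative_to(src_root)
        target_dir = dst_root / rel
        target_dir.mkdir(parents=True, exist_ok=True)  # empty folders are mirrored too
        for name in filenames:
            target = target_dir / name
            if not target.exists():
                os.link(Path(dirpath) / name, target)
                created += 1
    return created
```

Calling `mirror_tree(pool / "Seriály", output / "Seriály")` would reproduce the series folder 1:1 without copying any video data.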