PROJECT.md
This file is project-specific. Only include information directly related to the concrete project — goals, current state, architecture decisions, known issues, and tasks.
Origin
Curator is a fork of the former Tagger project. The tagging, filtering and hardlink-tree parts are inherited and keep working as before. On top of that, Curator becomes a full movie library manager (Filmotéka).
Core idea
Curator manages a personal movie library based on two folders:
- Pool — the managed repository of video files. This is the single source of truth. Curator manages the pool itself (insert/remove file), so files are never moved by hand. The pool has exactly two top-level folders: Filmy (movies — tag-based tree) and Seriály (series — a "copy-as-is" folder mirrored 1:1 into the output; see Design decisions). Every file lives here exactly once.
- Filmotéka (output) — a generated, browsable directory tree made only of hardlinks into the pool (the same mechanism as today's hardlink manager). It is fully disposable: deleting the Filmotéka folder loses nothing, because it can always be regenerated from the pool.
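The "fully disposable" property follows from how hardlinks work: a hardlink is just a second directory entry for the same file data, so removing the link in the output leaves the pool file intact. A minimal sketch using only the standard library; the folder and file names are illustrative, and `demo_disposable_output` is a hypothetical helper, not part of Curator. Hardlinks require the pool and the output to be on the same filesystem.

```python
import os
import tempfile
from pathlib import Path


def demo_disposable_output() -> bool:
    """Return True if the pool file survives deletion of its output hardlink."""
    with tempfile.TemporaryDirectory() as root:
        pool = Path(root) / "pool" / "Filmy"
        output = Path(root) / "output" / "Akční"
        pool.mkdir(parents=True)
        output.mkdir(parents=True)

        # The single real copy lives in the pool.
        movie = pool / "Matrix.mkv"
        movie.write_bytes(b"video data")

        # The output holds only a hardlink to the same file data.
        link = output / movie.name
        os.link(movie, link)
        same_file = os.path.samefile(movie, link)

        # Deleting the output entry does not touch the pool file.
        link.unlink()
        return same_file and movie.read_bytes() == b"video data"
```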
Workflow
- The user configures two folders: the pool and the Filmotéka output.
- The user picks a video file via "Open file".
- Curator opens a dialog to fill in basic info — at minimum the title/name and a ČSFD link.
- Curator renames the file and moves it into the managed pool, and writes a metadata file describing it.
- From the pool, Curator generates the Filmotéka — a complex tree of hardlinks built from each file's tags/metadata (like the current hardlink manager, but driven by the pool).
- Deleting the Filmotéka has no effect on the pool; the tree is regenerated on demand.
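The import step of this workflow (rename to `Title.ext`, place in the pool, record metadata) can be sketched roughly as below. This is an illustration under assumptions, not Curator's actual code: `import_movie` and `demo` are hypothetical names, and the JSON layout of the metadata file is invented for the example.

```python
import json
import shutil
import tempfile
from pathlib import Path

INDEX_NAME = ".Curator.!index"  # file name from this document; JSON layout below is assumed


def import_movie(source: Path, pool_dir: Path, title: str,
                 csfd_link: str, move: bool = False) -> Path:
    """Copy (default) or move `source` into pool/Filmy as Title.ext and record metadata."""
    filmy = pool_dir / "Filmy"
    filmy.mkdir(parents=True, exist_ok=True)
    target = filmy / f"{title}{source.suffix}"
    if target.exists():
        raise FileExistsError(target)  # never overwrite a pooled file silently

    if move:
        shutil.move(str(source), str(target))
    else:
        shutil.copy2(source, target)  # non-destructive default

    # Hypothetical index format: one JSON object keyed by pool-relative path.
    index_path = pool_dir / INDEX_NAME
    index = {}
    if index_path.exists():
        index = json.loads(index_path.read_text(encoding="utf-8"))
    key = target.relative_to(pool_dir).as_posix()
    index[key] = {"title": title, "csfd_link": csfd_link}
    index_path.write_text(json.dumps(index, ensure_ascii=False, indent=2),
                          encoding="utf-8")
    return target


def demo() -> dict:
    """Import one throwaway file into a temporary pool and report the result."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        source = root / "the.matrix.1999.mkv"
        source.write_bytes(b"video data")
        # "film/9499" is a placeholder link value, not a full URL.
        target = import_movie(source, root / "pool", "Matrix", "film/9499")
        index = json.loads((root / "pool" / INDEX_NAME).read_text(encoding="utf-8"))
        return {"target": target.name, "source_kept": source.exists(), "index": index}
```

With the default `move=False` the source file stays where it was, which matches the non-destructive default described in this document.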
Current state
- Inherited from Tagger: `Tag`, `TagManager`, `File` (sidecar metadata), `FileManager` (folder scan, filtering, ignore patterns), 3-level config, `HardlinkManager` (create/sync/cleanup), pytest suite.
- Rename Tagger → Curator done across code, spec, config filenames (`.Curator.!gtag` / `.Curator.!ftag`) and tests.
- PySide6 GUI (`src/ui/qt_app.py`), reframed around the Filmotéka workflow, is the entry point; the old tkinter `src/ui/gui.py` is retained for reference.
- Pool + Filmotéka wired up: global config holds `pool_dir` / `filmoteka_dir`; `FileManager` creates `Filmy` / `Seriály`, imports movies (copy → `Title.ext`), loads the pool, and the GUI generates the Filmotéka tree via `HardlinkManager`. `File` carries `title` + `csfd_link`. Pool metadata lives in a unified index (`<pool>/.Curator.!index`, see `pool_index.py`); `File` writes there when an index is injected, and still falls back to per-file `.!tag` sidecars for arbitrary (non-pool) folders.
GUI decision
The GUI was reframed around the Filmotéka (not kept as a generic tagger) and rewritten in PySide6: Pool/Filmotéka setup, Import movie, tag-filter sidebar, movie table, and one-click Filmotéka generation.
Design decisions
- Metadata storage: one unified metadata file for the whole pool (a central index), not per-file sidecars. Justified because Curator owns the pool and files are never moved manually, so it is not exposed to path drift.
- Import dialog: multi-file. Pick several videos at once and give each its own Title + ČSFD link (one row per file; more can be added from the dialog), or auto-fill the links with "Najít ČSFD odkazy" ("Find ČSFD links"), which cleans each filename into a query and fills in the first ČSFD search hit; existing links are kept. A single copy/move toggle decides whether the sources are copied (default) or moved into the pool. Each file is renamed to `Title.ext`. When a ČSFD link is given, Curator fetches the movie and assigns Žánr / Rok / Země původu / Hodnocení (ten-point band) tags automatically; further tags can be added via the UI. Directors and the first 10 actors are fetched and cached too, but deliberately not turned into tags/folders (there would be too many).
- Genres / countries: a movie can have multiple genres and, for a co-production, multiple countries of origin (ČSFD writes them slash-separated, e.g. "USA / Velká Británie"). Each becomes its own tag, so the film appears under every matching genre and country branch in the Filmotéka (multiple hardlinks).
- Pool layout: two top-level folders — Filmy and Seriály. Movies are the first target; the Seriály branch follows the "copy-as-is" rule below.
- Copy-as-is folders (Seriály): a subfolder inside the pool can be marked as copy / as-is. For such a folder Curator does not build a tag-based tree; instead it mirrors the exact directory hierarchy from the pool into the Filmotéka output, with the files materialized as hardlinks into the pool. So `pool/Seriály/...` is cloned 1:1 into `output/Seriály/...` (same structure, hardlinked files). This is how Seriály work.
- File naming: imported movies are renamed to `Title.ext` (no year in the filename; year lives in metadata/tags).
- Import copy vs move: by default the original file is copied into the pool (non-destructive); the import dialog also offers a move option that relocates the source into the pool instead.
- Filmotéka tree layout: driven by a category → root-folder map (`FILMOTEKA_CATEGORY_ROOTS`). The genre folders sit directly at the output root (`output/Akční/film`, …), next to the copy-as-is mirrors (Seriály) and three grouping folders: `Dle roku` (`output/Dle roku/<rok>/film`), `Dle země původu` (`output/Dle země původu/<země>/film`) and `Dle hodnocení`. Each `film` entry is a hardlink. `HardlinkManager` supports an empty root (tag folders placed directly at the output root) and restricts obsolete cleanup to the tag tree's own top-level folders, so mirrors are never touched.
- Tag schema (config-driven, not hard-coded): the categories, their ČSFD source field + transform, and their Filmotéka folder mapping all live in `tag_schema` in the global config (default `config.DEFAULT_TAG_SCHEMA`, edited via Nastavení → Tag schéma…). Both `apply_csfd_tags` (which fields → tags) and the Filmotéka layout (`FileManager.filmoteka_category_roots`) read from it, so adding a category or changing a folder rule needs no code change. A category can be made filter-only (no folders) by setting its `filmoteka_root` to null. The `transform` (e.g. `decade_band`) shapes only the folder name; tags keep the exact value (rating → tag `Hodnocení/90`, folder `Dle hodnocení/90–100 %`). It is applied at Filmotéka generation via `filmoteka_category_transforms`.
- Per-category filename template (`filename_template` in a schema entry): only inside that category's folders, the hardlink name is rendered from the movie's metadata (`File.name_context`: title/year/rating/ext/stem/filename plus any free-form attributes), e.g. a Kolekce with `"{collection_sort} - {title}{ext}"`. Other folders and the pool file keep the plain name; applied via `filmoteka_category_filename_templates`.
- Free-form per-movie attributes (`File.attributes`, set in the GUI): arbitrary `key → value` metadata stored in the index and merged into `name_context`, so custom fields like `collection_sort` can drive filename templates.
- Tag provenance (ČSFD vs user): each file records which tags came from ČSFD (`csfd_tags`). Re-fetching regenerates only those; user-added tags are kept, so changing a movie's ČSFD link refreshes ČSFD tags without losing manual ones.
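The layout rules above (empty root for genres, grouping folders, filter-only categories, folder-name transforms, per-category filename templates, slash-separated countries) can be illustrated with a small path generator. This is a sketch under assumptions: the schema dict is a trimmed, hypothetical stand-in for `tag_schema`, the `Kolekce` root folder and the `Stav` entry are invented for the example, and `decade_band`, `split_multi` and `filmoteka_paths` are illustrative implementations rather than Curator's own.

```python
def decade_band(value: str) -> str:
    """Map a rating tag value such as '90' to a folder name such as '90–100 %'."""
    low = min(int(value) // 10 * 10, 90)  # assumption: 100 falls into the top band
    return f"{low}–{low + 10} %"


# Hypothetical, trimmed-down schema; the real tag_schema may differ.
TAG_SCHEMA = {
    "Žánr": {"filmoteka_root": ""},               # empty root: folders at output root
    "Rok": {"filmoteka_root": "Dle roku"},
    "Země původu": {"filmoteka_root": "Dle země původu"},
    "Hodnocení": {"filmoteka_root": "Dle hodnocení", "transform": "decade_band"},
    "Kolekce": {"filmoteka_root": "Kolekce",
                "filename_template": "{collection_sort} - {title}{ext}"},
    "Stav": {"filmoteka_root": None},             # filter-only: no folders
}
TRANSFORMS = {"decade_band": decade_band}


def split_multi(raw: str) -> list:
    """Split a slash-separated ČSFD value into individual tag values."""
    return [part.strip() for part in raw.split("/") if part.strip()]


def filmoteka_paths(tags: dict, name_context: dict) -> list:
    """Return the output-relative hardlink paths for one movie."""
    plain_name = f"{name_context['title']}{name_context['ext']}"
    paths = []
    for category, values in tags.items():
        entry = TAG_SCHEMA.get(category)
        if entry is None or entry["filmoteka_root"] is None:
            continue  # unknown or filter-only category
        transform = TRANSFORMS.get(entry.get("transform", ""), lambda v: v)
        template = entry.get("filename_template")
        name = template.format(**name_context) if template else plain_name
        for value in values:
            parts = [entry["filmoteka_root"], transform(value), name]
            paths.append("/".join(p for p in parts if p))
    return paths
```

In this sketch a movie tagged with two genres and two countries yields one path per tag value, which corresponds to the "multiple hardlinks" behaviour described above, while the tag itself keeps the exact value and only the folder name is transformed.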
Tasks
(no open tasks — see Done)
Done
- Pool-root and Filmotéka-output folder settings in the global config
- Filmy / Seriály top-level folder handling in the pool
- "Import movie" dialog (Title + ČSFD link), copy into pool/Filmy as Title.ext
- Rename a pooled movie from the app (`FileManager.rename_movie`): renames the file in pool/Filmy and moves its metadata to the new index key
- Remove-from-pool (delete file + its metadata)
- Generate the Filmotéka hardlink tree from the pool (Rok / Žánr / Země původu / Hodnocení)
- Filmotéka fully regenerable from the pool alone (delete output = no loss)
- GUI reframed around the Filmotéka and rewritten in PySide6
- Seriály "copy-as-is" mirror: pool/Seriály cloned 1:1 into the output as hardlinks (`HardlinkManager.mirror_as_is`), wired into Filmotéka generation
- Fixed `media_utils` missing `subprocess` import
- Unified pool metadata index (`pool_index.py`): one `.Curator.!index` per pool; `File` reads/writes it when injected, `FileManager` uses it for the pool
- Configurable copy-as-is folders (`copyasis_folders` in global config, editable from the GUI); each is mirrored 1:1 during Filmotéka generation (Seriály default)
- README.md written (overview, concepts, workflow, run/build instructions)
- ČSFD scraping (`csfd.py`, ported from Tagger devel): `File.apply_csfd_tags` fetches a movie and assigns Žánr / Rok / Země původu tags (cached in metadata); wired into the GUI (auto-fetch on import with a ČSFD link, plus "Načíst tagy z ČSFD", i.e. "Load tags from ČSFD"). Parsing updated for current ČSFD HTML and verified live against Matrix (film/9499); HTTPS uses the OS cert store via `truststore` (corporate SSL)
- ČSFD Anubis anti-bot wall handled: `csfd.py` detects the proof-of-work challenge page, solves it (SHA-256 PoW matching the bundled worker JS) and replays via a shared `requests.Session`, so Žánr / Rok / Země původu tags load again (the "nalezeno 1 film, načteno 0 tagů" symptom, i.e. "found 1 film, loaded 0 tags"). Verified live (Matrix 1999)
- Removed the inherited Tagger predefined tags: `DEFAULT_TAGS` is now empty (no Hodnocení ⭐ / Barva categories) and new files no longer get an automatic `Stav/Nové` tag. Tags now come from ČSFD (Žánr / Rok / Země původu) and manual edits. Note: `Hodnocení` is still listed in `FILMOTEKA_CATEGORIES`, so that branch is simply empty until something assigns a Hodnocení tag again
- Fixed template cruft: `src/constants.py` made consistent (Curator values, `get_version` / `get_debug_mode` API) and `test_constants.py` aligned; removed the imported `tagger/` devel dump
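For reference, the copy-as-is mirror listed above (same directory structure, files materialized as hardlinks) can be approximated with `os.walk` and `os.link`. This is a standalone sketch, not the actual `HardlinkManager.mirror_as_is`; `mirror_tree` and `demo` are hypothetical names, and cleanup of files that disappeared from the pool is left out.

```python
import os
import tempfile
from pathlib import Path


def mirror_tree(pool_folder: Path, output_folder: Path) -> int:
    """Recreate pool_folder's tree under output_folder; return the number of new links."""
    created = 0
    for dirpath, _dirnames, filenames in os.walk(pool_folder):
        rel = Path(dirpath).relative_to(pool_folder)
        target_dir = output_folder / rel
        target_dir.mkdir(parents=True, exist_ok=True)
        for name in filenames:
            src = Path(dirpath) / name
            dst = target_dir / name
            if dst.exists():
                if os.path.samefile(src, dst):
                    continue      # already a link to the pool file
                dst.unlink()      # stale entry: replace it with a fresh link
            os.link(src, dst)
            created += 1
    return created


def demo() -> tuple:
    """Mirror a one-episode tree twice; the second pass should create nothing."""
    with tempfile.TemporaryDirectory() as tmp:
        pool = Path(tmp) / "pool" / "Seriály"
        output = Path(tmp) / "output" / "Seriály"
        episode = pool / "Show" / "S01" / "E01.mkv"
        episode.parent.mkdir(parents=True)
        episode.write_bytes(b"episode")
        first = mirror_tree(pool, output)
        second = mirror_tree(pool, output)
        linked = os.path.samefile(episode, output / "Show" / "S01" / "E01.mkv")
        return first, second, linked
```

Skipping entries that already point at the pool file keeps repeated generation cheap and idempotent in this sketch.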