Add ČSFD Anubis bypass, drop legacy preset tags, rename Země → Země původu

2026-06-12 20:30:14 +02:00
parent 22a14b1e41
commit 86c689b9f1
14 changed files with 349 additions and 146 deletions
@@ -34,4 +34,4 @@ build/
 AGENTS.md
 CLAUDE.md
 DESIGN_DOCUMENT.md
-.claude/
+.claude/
@@ -44,7 +44,7 @@ Each version entry uses these sections (include only those that apply):
 - Project `README.md` (overview, concepts, workflow, run/build instructions).
 - **ČSFD scraping** (`csfd.py`, ported from the Tagger devel branch): fetches
  movie data from a ČSFD link (JSON-LD + HTML parsing). `File.apply_csfd_tags`
-  assigns Žánr / Rok / Země tags and caches the fetched data in the metadata.
+  assigns Žánr / Rok / Země původu tags and caches the fetched data in the metadata.
  The GUI auto-fetches on import when a link is given and offers "Načíst tagy
  z ČSFD" for selected movies.
 - App startup injects `truststore` so HTTPS uses the OS certificate store —
@@ -56,12 +56,35 @@ Each version entry uses these sections (include only those that apply):
 - ČSFD parsing updated for the current site HTML: year is read from JSON-LD
  `dateCreated`, and the origin line (now bullet-separated, no commas) is
  tokenized so country / year / duration are extracted correctly.
 - **ČSFD anti-bot wall (Anubis):** ČSFD now serves a proof-of-work challenge
  page instead of the movie, so fetches came back with no genres or year
  ("načteno 0 tagů", i.e. "0 tags loaded"). `csfd.py` now detects the Anubis
  challenge, solves the SHA-256 proof-of-work the way the bundled worker JS
  does, and replays the request through a `requests.Session` (reused across a
  batch so only the first fetch pays the PoW cost). Žánr / Rok / Země původu
  tags load again.
 - "Assign tags" dialog crashed on PySide6/Qt6 — `Qt.ItemIsTristate` was renamed
  to `Qt.ItemIsAutoTristate`.
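The leading-zero check behind the Anubis proof-of-work fix listed above can be illustrated with a minimal sketch. It assumes the hash input is the challenge string concatenated with the decimal nonce, as the entry describes. `meets_difficulty` and `solve` are hypothetical names used only for illustration; they are not functions from `csfd.py`, and the nonce cap is an arbitrary assumption.

```python
import hashlib


def meets_difficulty(digest_hex: str, difficulty: int) -> bool:
    """True when the hex digest starts with `difficulty` zero nibbles (hex digits)."""
    return digest_hex.startswith("0" * difficulty)


def solve(random_data: str, difficulty: int, max_nonce: int = 1_000_000) -> tuple[str, int]:
    """Return (digest_hex, nonce) for the smallest nonce that passes the check.

    max_nonce is an illustrative safety cap so the loop always terminates.
    """
    for nonce in range(max_nonce):
        digest_hex = hashlib.sha256(f"{random_data}{nonce}".encode()).hexdigest()
        if meets_difficulty(digest_hex, difficulty):
            return digest_hex, nonce
    raise ValueError(f"no nonce below {max_nonce} satisfies difficulty {difficulty}")
```

Each extra nibble of difficulty multiplies the expected number of attempts by 16, which is why a cap on the nonce range is a sensible guard against a difficulty bump.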
 ### Changed
 - ČSFD country tag category renamed **Země → Země původu**. Added
  `scripts/migrate_tag_category.py` to rewrite the category in an existing pool
  index (backs up `.Curator.!index` first); run against the live pool.
 - Filmotéka tree now also builds the **Země původu** branch — it was missing
  from `FILMOTEKA_CATEGORIES`, so the country level was never generated. Tree
  categories are now Rok / Žánr / Země původu / Hodnocení.
 - Movie table trimmed to **Název / Štítky / Velikost** — the Datum and ČSFD
  columns were dropped (a ČSFD link is a prerequisite, so its indicator was
  always the same).
 - All references to "Tagger" renamed to "Curator" (code, spec, config filenames
  `.Curator.!gtag` / `.Curator.!ftag`, tests).
 - `requires-python` narrowed to `>=3.14,<3.15` (PySide6 compatibility).
 ### Removed
 - Legacy Tagger predefined tags: the always-available **Hodnocení** (⭐ rating)
  and **Barva** (color) categories in `TagManager`, and the automatic
  **Stav/Nové** tag assigned to every newly imported file. `DEFAULT_TAGS` is now
  empty; the pool is driven by ČSFD-derived tags (Žánr / Rok / Země původu).
 ### Dependencies
 - Added `pyside6` (GUI), `requests` + `beautifulsoup4` (ČSFD scraping),
  `truststore` (OS cert store for HTTPS). Declared `python-dotenv`, `pillow`,
@@ -67,7 +67,7 @@ movie table, and one-click Filmotéka generation.
  and files are never moved manually, so it is not exposed to path drift.
 - **Import dialog:** collects only **Title** + **ČSFD link**. The file is renamed
  to `Title.ext`. When a ČSFD link is given, Curator fetches the movie and assigns
-  Žánr / Rok / Země tags automatically; further tags can be added via the UI.
+  Žánr / Rok / Země původu tags automatically; further tags can be added via the UI.
 - **Genres:** a movie can have **multiple genres**, so it appears under each of
  its genre branches in the Filmotéka (multiple hardlinks).
 - **Pool layout:** two top-level folders — **Filmy** and **Seriály**. Movies are
@@ -84,7 +84,7 @@ movie table, and one-click Filmotéka generation.
  the source is left in place.
 - **Filmotéka tree:** **one level per category** — `output/Category/Tag/film`
  (hardlink), same shape as the current hardlink manager. For now the tree is
-  built from these categories: **Rok**, **Žánr**, **Hodnocení**.
+  built from these categories: **Rok**, **Žánr**, **Země původu**, **Hodnocení**.
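The `output/Category/Tag/film` layout described above can be sketched in a few lines. This is only an illustration under stated assumptions: tags are taken to be `"Category/Name"` strings, the link is assumed to keep the pool filename, and `link_paths` is a hypothetical helper, not part of the hardlink manager.

```python
from pathlib import Path

# Categories the text names as driving the generated tree.
FILMOTEKA_CATEGORIES = ["Rok", "Žánr", "Země původu", "Hodnocení"]


def link_paths(output: Path, filename: str, tags: list[str]) -> list[Path]:
    """Return one output/Category/Tag/filename path per tag in a tree category."""
    paths: list[Path] = []
    for tag in tags:
        # Split "Category/Name" at the first slash; skip malformed tags.
        category, sep, name = tag.partition("/")
        if sep and name and category in FILMOTEKA_CATEGORIES:
            paths.append(output / category / name / filename)
    return paths
```

A movie tagged with two genres yields two paths under `Žánr`, which matches the "multiple hardlinks" behaviour noted for genres; tags from other categories produce no path.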
 ## Tasks
@@ -96,7 +96,8 @@ movie table, and one-click Filmotéka generation.
 - Filmy / Seriály top-level folder handling in the pool
 - "Import movie" dialog (Title + ČSFD link), copy into pool/Filmy as Title.ext
 - Remove-from-pool (delete file + its metadata)
-- Generate the Filmotéka hardlink tree from the pool (Rok / Žánr / Hodnocení)
+- Generate the Filmotéka hardlink tree from the pool (Rok / Žánr / Země původu /
+  Hodnocení)
 - Filmotéka fully regenerable from the pool alone (delete output = no loss)
 - GUI reframed around the Filmotéka and rewritten in PySide6
 - Seriály "copy-as-is" mirror: pool/Seriály cloned 1:1 into the output as
@@ -108,10 +109,19 @@ movie table, and one-click Filmotéka generation.
  from the GUI); each is mirrored 1:1 during Filmotéka generation (Seriály default)
 - README.md written (overview, concepts, workflow, run/build instructions)
 - ČSFD scraping (`csfd.py`, ported from Tagger devel): `File.apply_csfd_tags`
-  fetches a movie and assigns Žánr / Rok / Země tags (cached in metadata); wired
+  fetches a movie and assigns Žánr / Rok / Země původu tags (cached in metadata); wired
  into the GUI (auto-fetch on import with a ČSFD link, plus "Načíst tagy z ČSFD").
  Parsing updated for current ČSFD HTML and verified live against Matrix
  (film/9499); HTTPS uses the OS cert store via `truststore` (corporate SSL)
 - ČSFD Anubis anti-bot wall handled: `csfd.py` detects the proof-of-work
  challenge page, solves it (SHA-256 PoW matching the bundled worker JS) and
  replays via a shared `requests.Session`, so Žánr / Rok / Země původu tags load again.
  This fixes the "nalezeno 1 film, načteno 0 tagů" ("found 1 film, loaded 0 tags")
  symptom. Verified live (Matrix 1999)
 - Removed the inherited Tagger predefined tags: `DEFAULT_TAGS` is now empty
  (no Hodnocení ⭐ / Barva categories) and new files no longer get an automatic
  `Stav/Nové` tag. Tags now come from ČSFD (Žánr / Rok / Země původu) and manual edits.
  Note: `Hodnocení` is still listed in `FILMOTEKA_CATEGORIES`, so that branch is
  simply empty until something assigns a Hodnocení tag again
 - Fixed template cruft: `src/constants.py` made consistent (Curator values,
  `get_version`/`get_debug_mode` API) and `test_constants.py` aligned; removed
  the imported `tagger/` devel dump
@@ -27,7 +27,7 @@ hardlink-tree machinery is inherited and extended into a full library workflow.
 2. **Import a movie**: pick a video, enter its **Title** and a **ČSFD link**. The
   file is copied (non-destructively) into `pool/Filmy` as `Title.ext` and
   recorded in the index. If a ČSFD link is given, Curator fetches the movie from
-   [ČSFD.cz](https://www.csfd.cz) and assigns **Žánr / Rok / Země** tags
+   [ČSFD.cz](https://www.csfd.cz) and assigns **Žánr / Rok / Země původu** tags
   automatically (use "Načíst tagy z ČSFD" to (re)fetch later).
 3. **Tag** movies (Rok, Žánr, Hodnocení, …) and filter them in the UI.
 4. **Generate the Filmotéka**: movies become a `Category/Tag/film` hardlink tree
@@ -0,0 +1,107 @@
 """One-off migration: rename a tag category inside a pool's metadata index.
 Tags are stored in ``<pool>/.Curator.!index`` as ``"Category/Name"`` strings.
 This rewrites every tag whose category matches ``--old`` to use ``--new``,
 leaving the tag name untouched. A timestamped backup of the index is written
 before saving.
 Usage:
    poetry run python scripts/migrate_tag_category.py <pool_dir> \
        --old "Země" --new "Země původu"
 If ``<pool_dir>`` is omitted, the pool from the global config is used.
 """
 from __future__ import annotations
 import sys
 import json
 import shutil
 import argparse
 from pathlib import Path
 from datetime import datetime
 from loguru import logger
 # Allow running as a plain script (``python scripts/...``) by exposing the repo root.
 sys.path.insert(0, str(Path(__file__).resolve().parent.parent))
 from src.core.config import load_global_config  # noqa: E402
 from src.core.pool_index import INDEX_FILENAME  # noqa: E402
 def _rename_in_tags(tags: list[str], old: str, new: str) -> tuple[list[str], int]:
    """Return (rewritten tags, number of tags changed) for one record."""
    prefix = f"{old}/"
    changed = 0
    result: list[str] = []
    for tag in tags:
        if isinstance(tag, str) and tag.startswith(prefix):
            result.append(f"{new}/{tag[len(prefix):]}")
            changed += 1
        else:
            result.append(tag)
    return result, changed
 def migrate(index_path: Path, old: str, new: str) -> int:
    """Rewrite the category in place and return the number of tags changed."""
    with open(index_path, "r", encoding="utf-8") as f:
        data = json.load(f)
    movies: dict[str, dict] = data.get("movies", {})
    total_changed = 0
    affected_records = 0
    for key, record in movies.items():
        tags = record.get("tags", [])
        new_tags, changed = _rename_in_tags(tags, old, new)
        if changed:
            record["tags"] = new_tags
            total_changed += changed
            affected_records += 1
            logger.debug(f"{key}: {changed} tag(s) renamed")
    if total_changed == 0:
        logger.info(f"No '{old}/…' tags found — nothing to migrate")
        return 0
    backup = index_path.with_suffix(
        index_path.suffix + f".bak-{datetime.now():%Y%m%d-%H%M%S}"
    )
    shutil.copy2(index_path, backup)
    logger.info(f"Backup written: {backup}")
    with open(index_path, "w", encoding="utf-8") as f:
        json.dump(data, f, indent=2, ensure_ascii=False)
    logger.info(
        f"Migrated '{old}' → '{new}': {total_changed} tag(s) "
        f"across {affected_records} record(s)"
    )
    return total_changed
 def main() -> None:
    parser = argparse.ArgumentParser(description=__doc__)
    parser.add_argument(
        "pool_dir",
        nargs="?",
        help="Pool root (default: pool_dir from the global config)",
    )
    parser.add_argument("--old", default="Země", help="Category to rename from")
    parser.add_argument("--new", default="Země původu", help="Category to rename to")
    args = parser.parse_args()
    pool_dir = args.pool_dir or load_global_config().get("pool_dir")
    if not pool_dir:
        parser.error("No pool_dir given and none configured in the global config")
    index_path = Path(pool_dir) / INDEX_FILENAME
    if not index_path.exists():
        parser.error(f"No index found at {index_path}")
    migrate(index_path, args.old, args.new)
 if __name__ == "__main__":
    main()
@@ -8,10 +8,14 @@ from __future__ import annotations
 import re
 import json
 import time
 import hashlib
 from dataclasses import dataclass, field
 from typing import Optional, TYPE_CHECKING
 from urllib.parse import urljoin
 from loguru import logger
 try:
    import requests
    from bs4 import BeautifulSoup
@@ -34,6 +38,16 @@ HEADERS = {
    "Accept-Language": "cs,en;q=0.9",
 }
 # Anubis is the proof-of-work anti-bot wall ČSFD now puts in front of every page.
 # A plain request gets a 200 with a JS challenge page (title "Ujišťujeme se, že
 # nejste robot!") instead of the movie, so JSON-LD/genres/year all parse empty.
 # We detect that page, solve the PoW the way the bundled worker JS does, and
 # replay the request through the same session to obtain the auth cookie.
 ANUBIS_CHALLENGE_MARKER = 'id="anubis_challenge"'
 ANUBIS_PASS_PATH = "/.within.website/x/cmd/anubis/api/pass-challenge"
 # Safety cap so a difficulty bump can never spin forever (difficulty 1 needs ~16).
 ANUBIS_MAX_NONCE = 50_000_000
 @dataclass
 class CSFDMovie:
@@ -123,12 +137,103 @@ def _parse_duration(duration_str: str) -> Optional[int]:
    return int(match.group(1)) if match else None
-def fetch_movie(url: str) -> CSFDMovie:
+def _extract_json_blob(html: str, element_id: str):
    """Return the parsed JSON from an Anubis ``<script id=...>`` blob, or None."""
    match = re.search(
        rf'<script id="{re.escape(element_id)}" type="application/json">(.*?)</script>',
        html,
        re.S,
    )
    if not match:
        return None
    try:
        return json.loads(match.group(1))
    except json.JSONDecodeError:
        return None
 def _solve_anubis_pow(random_data: str, difficulty: int) -> tuple[str, int, int]:
    """Brute-force the Anubis proof-of-work.
    Mirrors the bundled ``sha256-purejs`` worker: find the smallest ``nonce``
    such that ``sha256(random_data + str(nonce))`` has ``difficulty`` leading
    zero nibbles. Returns ``(hash_hex, nonce, elapsed_ms)``.
    """
    full_zero_bytes = difficulty // 2
    needs_half_byte = difficulty % 2 != 0
    start = time.monotonic()
    for nonce in range(ANUBIS_MAX_NONCE):
        digest = hashlib.sha256(f"{random_data}{nonce}".encode()).digest()
        if any(digest[i] != 0 for i in range(full_zero_bytes)):
            continue
        if needs_half_byte and digest[full_zero_bytes] >> 4 != 0:
            continue
        elapsed_ms = int((time.monotonic() - start) * 1000)
        return digest.hex(), nonce, elapsed_ms
    raise ValueError(
        f"Anubis PoW unsolved within {ANUBIS_MAX_NONCE} attempts (difficulty {difficulty})"
    )
 def _solve_anubis_challenge(session, html: str, url: str):
    """Solve the Anubis challenge in ``html`` and return the real page response.
    Posts the proof-of-work back to the pass-challenge endpoint through
    ``session`` (which stores the resulting auth cookie) and follows the
    redirect to the originally requested page.
    """
    payload = _extract_json_blob(html, "anubis_challenge")
    if not payload:
        raise ValueError("ČSFD anti-bot stránka bez čitelné Anubis challenge")
    rules = payload.get("rules", {})
    challenge = payload.get("challenge", {})
    random_data = challenge.get("randomData")
    difficulty = int(rules.get("difficulty", 1))
    if not random_data:
        raise ValueError("Anubis challenge neobsahuje randomData")
    base_prefix = _extract_json_blob(html, "anubis_base_prefix") or ""
    logger.debug(f"Solving Anubis challenge (difficulty {difficulty}) for {url}")
    hash_hex, nonce, elapsed_ms = _solve_anubis_pow(random_data, difficulty)
    logger.debug(f"Anubis solved: nonce={nonce}, elapsed={elapsed_ms}ms")
    pass_url = urljoin(CSFD_BASE_URL, f"{base_prefix}{ANUBIS_PASS_PATH}")
    response = session.get(
        pass_url,
        params={
            "id": challenge.get("id"),
            "response": hash_hex,
            "nonce": nonce,
            "redir": url,
            "elapsedTime": elapsed_ms,
        },
        headers=HEADERS,
        timeout=10,
    )
    response.raise_for_status()
    if ANUBIS_CHALLENGE_MARKER in response.text:
        raise ValueError("ČSFD Anubis challenge se nepodařilo vyřešit (odmítnuto)")
    return response
 def _get_page(session, url: str):
    """GET ``url`` through ``session``, transparently clearing an Anubis wall."""
    response = session.get(url, headers=HEADERS, timeout=10)
    response.raise_for_status()
    if ANUBIS_CHALLENGE_MARKER in response.text:
        response = _solve_anubis_challenge(session, response.text, url)
    return response
 def fetch_movie(url: str, session=None) -> CSFDMovie:
    """
    Fetch movie information from a CSFD.cz URL.
    Args:
        url: Full CSFD.cz movie URL (e.g., https://www.csfd.cz/film/9423-pane-vy-jste-vdova/)
        session: Optional ``requests.Session`` to reuse (keeps the Anubis auth
            cookie across calls so only the first fetch pays the PoW cost).
    Returns:
        CSFDMovie object with extracted data
@@ -140,8 +245,14 @@ def fetch_movie(url: str) -> CSFDMovie:
    """
    _check_dependencies()
-    response = requests.get(url, headers=HEADERS, timeout=10)
-    response.raise_for_status()
+    own_session = session is None
+    if own_session:
+        session = requests.Session()
+    try:
+        response = _get_page(session, url)
+    finally:
+        if own_session:
+            session.close()
    soup = BeautifulSoup(response.text, "html.parser")
@@ -378,8 +489,8 @@ def search_movies(query: str, limit: int = 10) -> list[CSFDMovie]:
    _check_dependencies()
    search_url = f"{CSFD_SEARCH_URL}?q={requests.utils.quote(query)}"
-    response = requests.get(search_url, headers=HEADERS, timeout=10)
-    response.raise_for_status()
+    with requests.Session() as session:
+        response = _get_page(session, search_url)
    soup = BeautifulSoup(response.text, "html.parser")
    results = []
@@ -51,9 +51,6 @@ class File:
        self.title = None
        self.csfd_link = None
        self.csfd_cache = None
-        if self.tagmanager:
-            tag = self.tagmanager.add_tag("Stav", "Nové")
-            self.tags.append(tag)
    def _build_record(self) -> dict:
        data = {
@@ -142,7 +139,7 @@ class File:
    def apply_csfd_tags(
        self, add_genres: bool = True, add_year: bool = True, add_country: bool = True
    ) -> dict:
-        """Fetch movie info from CSFD and assign tags (Žánr, Rok, Země); caches the data.
+        """Fetch movie info from CSFD and assign tags (Žánr, Rok, Země původu); caches the data.
        Returns:
            dict with keys 'success', 'movie'/'error', 'tags_added'
@@ -173,7 +170,7 @@ class File:
        if add_year and movie.year:
            _add("Rok", str(movie.year))
        if add_country and movie.country:
-            _add("Země", movie.country)
+            _add("Země původu", movie.country)
        # Use the CSFD title if we don't have one yet
        if movie.title and not self.title:
@@ -1,15 +1,15 @@
 from .tag import Tag
-# Default tags that are always available (order in list = display order)
-DEFAULT_TAGS = {
-    "Hodnocení": ["⭐", "⭐⭐", "⭐⭐⭐", "⭐⭐⭐⭐", "⭐⭐⭐⭐⭐"],
-    "Barva": ["🔴 Červená", "🟠 Oranžová", "🟡 Žlutá", "🟢 Zelená", "🔵 Modrá", "🟣 Fialová"],
-}
+# Default tags that are always available (order in list = display order).
+# The legacy Tagger presets (Hodnocení / Barva) were removed for Curator; the
+# pool is driven by ČSFD-derived tags (Žánr / Rok / Země původu). Add entries here to
+# reintroduce always-available predefined tags.
+DEFAULT_TAGS: dict[str, list[str]] = {}
 # Tag sort order for default categories (preserves display order)
-DEFAULT_TAG_ORDER = {
-    "Hodnocení": {name: i for i, name in enumerate(DEFAULT_TAGS["Hodnocení"])},
-    "Barva": {name: i for i, name in enumerate(DEFAULT_TAGS["Barva"])},
+DEFAULT_TAG_ORDER: dict[str, dict[str, int]] = {
+    category: {name: i for i, name in enumerate(names)}
+    for category, names in DEFAULT_TAGS.items()
 }
@@ -31,7 +31,7 @@ from src.core.constants import APP_NAME, VERSION
 from src.core.hardlink_manager import HardlinkManager
 # Categories that drive the generated Filmotéka tree (see PROJECT.md)
-FILMOTEKA_CATEGORIES = ["Rok", "Žánr", "Hodnocení"]
+FILMOTEKA_CATEGORIES = ["Rok", "Žánr", "Země původu", "Hodnocení"]
 class ImportMovieDialog(QDialog):
@@ -101,7 +101,7 @@ class AssignTagsDialog(QDialog):
                else:
                    state = Qt.PartiallyChecked
                item = QTreeWidgetItem([tag.name])
-                item.setFlags(Qt.ItemIsUserCheckable | Qt.ItemIsEnabled | Qt.ItemIsTristate)
+                item.setFlags(Qt.ItemIsUserCheckable | Qt.ItemIsEnabled | Qt.ItemIsAutoTristate)
                item.setCheckState(0, state)
                cat_item.addChild(item)
                self._items.append((tag.full_path, item))
@@ -213,8 +213,8 @@ class QtApp(QMainWindow):
        search_row.addWidget(import_btn)
        main_layout.addLayout(search_row)
-        self.table = QTableWidget(0, 5)
+        self.table = QTableWidget(0, 3)
-        self.table.setHorizontalHeaderLabels(["Název", "Datum", "Štítky", "Velikost", "ČSFD"])
+        self.table.setHorizontalHeaderLabels(["Název", "Štítky", "Velikost"])
        self.table.setSelectionBehavior(QAbstractItemView.SelectRows)
        self.table.setSelectionMode(QAbstractItemView.ExtendedSelection)
        self.table.setEditTriggers(QAbstractItemView.NoEditTriggers)
@@ -223,8 +223,8 @@ class QtApp(QMainWindow):
        self.table.doubleClicked.connect(lambda _: self.open_movies())
        self.table.itemSelectionChanged.connect(self._update_selection_status)
        header = self.table.horizontalHeader()
-        header.setSectionResizeMode(0, QHeaderView.Stretch)
+        header.setSectionResizeMode(0, QHeaderView.Stretch)  # Název
-        header.setSectionResizeMode(2, QHeaderView.Stretch)
+        header.setSectionResizeMode(1, QHeaderView.Stretch)  # Štítky
        main_layout.addWidget(self.table)
        splitter.addWidget(main)
@@ -300,8 +300,7 @@ class QtApp(QMainWindow):
                size = self._format_size(f.file_path.stat().st_size)
            except OSError:
                size = "?"
-            csfd = "🔗" if f.csfd_link else ""
-            for col, value in enumerate([name, f.date or "", tags, size, csfd]):
+            for col, value in enumerate([name, tags, size]):
                self.table.setItem(row, col, QTableWidgetItem(value))
        self.refresh_sidebar()
@@ -16,9 +16,19 @@ from src.core.csfd import (
    _extract_genres,
    _extract_origin_info,
    _check_dependencies,
    _solve_anubis_pow,
 )
 def _mock_session(mock_requests):
    """Wire ``mock_requests`` so ``requests.Session()`` (also as a context
    manager) yields a single configurable session mock and return it."""
    session = MagicMock()
    session.__enter__.return_value = session
    mock_requests.Session.return_value = session
    return session
 # Sample HTML for testing
 SAMPLE_JSON_LD = """
 {
@@ -219,7 +229,8 @@ class TestFetchMovie:
        mock_response = MagicMock()
        mock_response.text = SAMPLE_HTML
        mock_response.raise_for_status = MagicMock()
-        mock_requests.get.return_value = mock_response
+        session = _mock_session(mock_requests)
+        session.get.return_value = mock_response
        movie = fetch_movie("https://www.csfd.cz/film/123-test/")
@@ -227,13 +238,14 @@ class TestFetchMovie:
        assert movie.csfd_id == 123
        assert movie.rating == 86
        assert "Drama" in movie.genres
-        mock_requests.get.assert_called_once()
+        session.get.assert_called_once()
    @patch("src.core.csfd.requests")
    def test_fetch_movie_network_error(self, mock_requests):
        """Test network error handling."""
        import requests as real_requests
-        mock_requests.get.side_effect = real_requests.RequestException("Network error")
+        session = _mock_session(mock_requests)
+        session.get.side_effect = real_requests.RequestException("Network error")
        with pytest.raises(real_requests.RequestException):
            fetch_movie("https://www.csfd.cz/film/123/")
@@ -254,7 +266,8 @@ class TestSearchMovies:
        mock_response = MagicMock()
        mock_response.text = search_html
        mock_response.raise_for_status = MagicMock()
-        mock_requests.get.return_value = mock_response
+        session = _mock_session(mock_requests)
+        session.get.return_value = mock_response
        mock_requests.utils.quote = lambda x: x
        results = search_movies("test", limit=10)
@@ -277,6 +290,24 @@ class TestFetchMovieById:
        assert movie.title == "Test"
 class TestAnubisPoW:
    """Tests for the Anubis proof-of-work solver."""
    def test_solve_pow_difficulty_one(self):
        """Difficulty 1 requires a single leading zero nibble in the hash."""
        import hashlib
        random_data = "abc123"
        hash_hex, nonce, _ = _solve_anubis_pow(random_data, difficulty=1)
        assert hash_hex[0] == "0"
        assert hashlib.sha256(f"{random_data}{nonce}".encode()).hexdigest() == hash_hex
    def test_solve_pow_difficulty_two(self):
        """Difficulty 2 requires two leading zero nibbles (one zero byte)."""
        hash_hex, _, _ = _solve_anubis_pow("seed", difficulty=2)
        assert hash_hex[:2] == "00"
 class TestDependencyCheck:
    """Tests for dependency checking."""
@@ -40,10 +40,9 @@ class TestFile:
        assert file_obj.metadata_filename == expected
    def test_file_initial_tags(self, test_file, tag_manager):
-        """Test that a new file has the Stav/Nové tag"""
+        """Test that a new file has no automatic tags (Stav/Nové was removed)"""
        file_obj = File(test_file, tag_manager)
-        assert len(file_obj.tags) == 1
-        assert file_obj.tags[0].full_path == "Stav/Nové"
+        assert file_obj.tags == []
    def test_file_metadata_saved(self, test_file, tag_manager):
        """Test that metadata is saved on creation"""
@@ -75,13 +74,12 @@ class TestFile:
        # Create a new object; it should load the metadata
        file_obj2 = File(test_file, tag_manager)
-        assert len(file_obj2.tags) == 2  # Stav/Nové + Video/HD
+        assert len(file_obj2.tags) == 1  # Video/HD
        assert file_obj2.date == "2025-01-15"
        # Check that the tags contain the right values
        tag_paths = {tag.full_path for tag in file_obj2.tags}
        assert "Video/HD" in tag_paths
-        assert "Stav/Nové" in tag_paths
    def test_file_set_date(self, test_file, tag_manager):
        """Test setting the date"""
@@ -115,7 +113,7 @@ class TestFile:
        file_obj.add_tag(tag)
        assert tag in file_obj.tags
-        assert len(file_obj.tags) == 2  # Stav/Nové + Video/4K
+        assert len(file_obj.tags) == 1  # Video/4K
    def test_file_add_tag_string(self, test_file, tag_manager):
        """Test adding a tag as a string"""
@@ -182,7 +180,7 @@ class TestFile:
        """Test File without a TagManager"""
        file_obj = File(test_file, tagmanager=None)
        assert file_obj.tagmanager is None
-        assert len(file_obj.tags) == 0  # Without a TagManager, Stav/Nové is not added
+        assert len(file_obj.tags) == 0  # a new file has no automatic tags
    def test_file_metadata_persistence(self, test_file, tag_manager):
        """Test that metadata survives a reload"""
@@ -42,7 +42,7 @@ class TestHardlinkManager:
        # File 1 with multiple tags
        f1 = File(temp_source_dir / "file1.txt", tag_manager)
-        f1.tags.clear()  # Remove default "Stav/Nové" tag
+        f1.tags.clear()  # ensure a clean tag set
        f1.add_tag(Tag("žánr", "Komedie"))
        f1.add_tag(Tag("žánr", "Akční"))
        f1.add_tag(Tag("rok", "1988"))
@@ -50,13 +50,13 @@ class TestHardlinkManager:
        # File 2 with one tag
        f2 = File(temp_source_dir / "file2.txt", tag_manager)
-        f2.tags.clear()  # Remove default "Stav/Nové" tag
+        f2.tags.clear()  # ensure a clean tag set
        f2.add_tag(Tag("žánr", "Drama"))
        files.append(f2)
        # File 3 with no tags
        f3 = File(temp_source_dir / "file3.txt", tag_manager)
-        f3.tags.clear()  # Remove default "Stav/Nové" tag
+        f3.tags.clear()  # ensure a clean tag set
        files.append(f3)
        return files
@@ -63,7 +63,7 @@ class TestFileWithIndex:
        assert not f.metadata_filename.exists()  # no sidecar
        assert index.get(movie) is not None  # record created in index
-        assert f.tags[0].full_path == "Stav/Nové"
+        assert f.tags == []  # no automatic tags
    def test_index_backed_metadata_persists_across_reload(self, tmp_path):
        index = PoolIndex(tmp_path)
@@ -13,24 +13,12 @@ class TestTagManager:
    @pytest.fixture
    def empty_tag_manager(self):
-        """Fixture for an empty TagManager (no default tags)"""
-        tm = TagManager()
-        # Remove the default tags for tests that need an empty manager
-        for category in list(tm.tags_by_category.keys()):
-            tm.remove_category(category)
-        return tm
+        """Fixture for an empty TagManager (alias of tag_manager, no default tags)"""
+        return TagManager()
-    def test_tag_manager_creation_has_defaults(self, tag_manager):
-        """Test that a new TagManager contains the default tags"""
-        assert "Hodnocení" in tag_manager.tags_by_category
-        assert "Barva" in tag_manager.tags_by_category
-    def test_tag_manager_default_tags_count(self, tag_manager):
-        """Test the number of default tags"""
-        # Hodnocení has 5 star levels
-        assert len(tag_manager.tags_by_category["Hodnocení"]) == 5
-        # Barva has 6 colors
-        assert len(tag_manager.tags_by_category["Barva"]) == 6
+    def test_tag_manager_creation_has_no_defaults(self, tag_manager):
+        """Test that a new TagManager has no predefined tags"""
+        assert tag_manager.tags_by_category == {}
    def test_add_category(self, tag_manager):
        """Test adding a category"""
@@ -141,11 +129,9 @@ class TestTagManager:
        assert "Video/4K" in tags
        assert "Audio/MP3" in tags
-    def test_get_all_tags_includes_defaults(self, tag_manager):
-        """Test that get_all_tags includes the default tags"""
-        tags = tag_manager.get_all_tags()
-        # At least 11 default tags (5 ratings + 6 colors)
-        assert len(tags) >= 11
+    def test_get_all_tags_empty_on_fresh_manager(self, tag_manager):
+        """Test that a fresh TagManager has no tags (no defaults)"""
+        assert tag_manager.get_all_tags() == []
    def test_get_categories_empty(self, empty_tag_manager):
        """Test getting categories (empty manager)"""
@@ -164,11 +150,9 @@ class TestTagManager:
        assert "Audio" in categories
        assert "Foto" in categories
-    def test_get_categories_includes_defaults(self, tag_manager):
-        """Test that get_categories includes the default categories"""
-        categories = tag_manager.get_categories()
-        assert "Hodnocení" in categories
-        assert "Barva" in categories
+    def test_get_categories_empty_on_fresh_manager(self, tag_manager):
+        """Test that a fresh TagManager has no categories (no defaults)"""
+        assert tag_manager.get_categories() == []
    def test_get_tags_in_category_empty(self, tag_manager):
        """Test getting tags from an empty category"""
@@ -230,81 +214,33 @@ class TestTagManager:
 class TestDefaultTags:
-    """Tests for default tags"""
+    """Tests for default tags (the legacy Tagger presets were removed)"""
    def test_default_tags_constant_exists(self):
-        """Test that the DEFAULT_TAGS constant exists"""
+        """Test that the DEFAULT_TAGS constant exists and is empty"""
        assert DEFAULT_TAGS is not None
        assert isinstance(DEFAULT_TAGS, dict)
+        assert DEFAULT_TAGS == {}
-    def test_default_tags_has_hodnoceni(self):
-        """Test that DEFAULT_TAGS contains Hodnocení"""
-        assert "Hodnocení" in DEFAULT_TAGS
-        assert len(DEFAULT_TAGS["Hodnocení"]) == 5
-    def test_default_tags_has_barva(self):
-        """Test that DEFAULT_TAGS contains Barva"""
-        assert "Barva" in DEFAULT_TAGS
-        assert len(DEFAULT_TAGS["Barva"]) == 6
-    def test_hodnoceni_stars_content(self):
-        """Test the star values in Hodnocení"""
-        stars = DEFAULT_TAGS["Hodnocení"]
-        assert "⭐" in stars
-        assert "⭐⭐⭐⭐⭐" in stars
-    def test_barva_colors_content(self):
-        """Test the color values in Barva"""
-        colors = DEFAULT_TAGS["Barva"]
-        # Check that it contains some of the colors
-        color_names = " ".join(colors)
-        assert "Červená" in color_names
-        assert "Zelená" in color_names
-        assert "Modrá" in color_names
-    def test_tag_manager_loads_all_default_tags(self):
-        """Test that TagManager loads all default tags"""
+    def test_legacy_presets_removed(self):
+        """Test that the old predefined categories (Hodnocení, Barva) are gone"""
+        assert "Hodnocení" not in DEFAULT_TAGS
+        assert "Barva" not in DEFAULT_TAGS
+    def test_tag_manager_starts_empty(self):
+        """Test that a TagManager without defaults starts empty"""
        tm = TagManager()
-        for category, tag_names in DEFAULT_TAGS.items():
-            assert category in tm.tags_by_category
-            tags_in_category = tm.get_tags_in_category(category)
-            assert len(tags_in_category) == len(tag_names)
-    def test_can_add_custom_tags_alongside_defaults(self):
-        """Test that custom tags can be added alongside the defaults"""
+        assert tm.get_all_tags() == []
+        assert tm.get_categories() == []
+    def test_can_add_custom_tags(self):
+        """Test that custom tags can be added to an empty manager"""
        tm = TagManager()
-        initial_count = len(tm.get_all_tags())
        tm.add_tag("Custom", "MyTag")
-        assert len(tm.get_all_tags()) == initial_count + 1
+        assert tm.get_all_tags() == ["Custom/MyTag"]
        assert "Custom" in tm.get_categories()
-    def test_can_remove_default_category(self):
-        """Test that a default category can be removed"""
-        tm = TagManager()
-        tm.remove_category("Hodnocení")
-        assert "Hodnocení" not in tm.tags_by_category
-    def test_hodnoceni_tags_are_sorted_by_stars(self):
-        """Test that the Hodnocení tags are sorted from 1 to 5 stars"""
-        tm = TagManager()
-        tags = tm.get_tags_in_category("Hodnocení")
-        tag_names = [t.name for t in tags]
-        assert tag_names == ["⭐", "⭐⭐", "⭐⭐⭐", "⭐⭐⭐⭐", "⭐⭐⭐⭐⭐"]
-    def test_barva_tags_are_sorted_in_predefined_order(self):
-        """Test that the Barva tags are sorted in the predefined order"""
-        tm = TagManager()
-        tags = tm.get_tags_in_category("Barva")
-        tag_names = [t.name for t in tags]
-        expected = ["🔴 Červená", "🟠 Oranžová", "🟡 Žlutá", "🟢 Zelená", "🔵 Modrá", "🟣 Fialová"]
-        assert tag_names == expected
    def test_custom_category_tags_sorted_alphabetically(self):
        """Test that tags in a custom category are sorted alphabetically"""
        tm = TagManager()
@@ -316,12 +252,3 @@ class TestDefaultTags:
        tag_names = [t.name for t in tags]
        assert tag_names == ["4K", "HD", "SD"]
-    def test_can_add_tag_to_default_category(self):
-        """Test that a tag can be added to a default category"""
-        tm = TagManager()
-        initial_count = len(tm.get_tags_in_category("Hodnocení"))
-        tm.add_tag("Hodnocení", "Custom Rating")
-        assert len(tm.get_tags_in_category("Hodnocení")) == initial_count + 1