Add ČSFD Anubis bypass, drop legacy preset tags, rename Země → Země původu

2026-06-12 20:30:14 +02:00
parent 22a14b1e41
commit 86c689b9f1
14 changed files with 349 additions and 146 deletions
@@ -44,7 +44,7 @@ Each version entry uses these sections (include only those that apply):
 - Project `README.md` (overview, concepts, workflow, run/build instructions).
 - **ČSFD scraping** (`csfd.py`, ported from the Tagger devel branch): fetches
  movie data from a ČSFD link (JSON-LD + HTML parsing). `File.apply_csfd_tags`
-  assigns Žánr / Rok / Země tags and caches the fetched data in the metadata.
+  assigns Žánr / Rok / Země původu tags and caches the fetched data in the metadata.
  The GUI auto-fetches on import when a link is given and offers "Načíst tagy
  z ČSFD" for selected movies.
 - App startup injects `truststore` so HTTPS uses the OS certificate store —
@@ -56,12 +56,35 @@ Each version entry uses these sections (include only those that apply):
 - ČSFD parsing updated for the current site HTML: year is read from JSON-LD
  `dateCreated`, and the origin line (now bullet-separated, no commas) is
  tokenized so country / year / duration are extracted correctly.
+- **ČSFD anti-bot wall (Anubis):** ČSFD now serves a proof-of-work challenge
+  page instead of the movie, so fetches returned a film with no genres/year
+  ("načteno 0 tagů"). `csfd.py` now detects the Anubis challenge, solves the
+  SHA-256 proof-of-work the way the bundled worker JS does, and replays the
+  request through a `requests.Session` (reused across a batch so only the first
+  fetch pays the PoW cost). Žánr / Rok / Země původu tags load again.
+- "Assign tags" dialog crashed on PySide6/Qt6 — `Qt.ItemIsTristate` was renamed
+  to `Qt.ItemIsAutoTristate`.

 ### Changed
+- ČSFD country tag category renamed **Země → Země původu**. Added
+  `scripts/migrate_tag_category.py` to rewrite the category in an existing pool
+  index (backs up `.Curator.!index` first); run against the live pool.
+- Filmotéka tree now also builds the **Země původu** branch — it was missing
+  from `FILMOTEKA_CATEGORIES`, so the country level was never generated. Tree
+  categories are now Rok / Žánr / Země původu / Hodnocení.
+- Movie table trimmed to **Název / Štítky / Velikost** — the Datum and ČSFD
+  columns were dropped (a ČSFD link is a prerequisite, so its indicator was
+  always the same).
 - All references to "Tagger" renamed to "Curator" (code, spec, config filenames
  `.Curator.!gtag` / `.Curator.!ftag`, tests).
 - `requires-python` narrowed to `>=3.14,<3.15` (PySide6 compatibility).

+### Removed
+- Legacy Tagger predefined tags: the always-available **Hodnocení** (⭐ rating)
+  and **Barva** (color) categories in `TagManager`, and the automatic
+  **Stav/Nové** tag assigned to every newly imported file. `DEFAULT_TAGS` is now
+  empty; the pool is driven by ČSFD-derived tags (Žánr / Rok / Země původu).
+
 ### Dependencies
 - Added `pyside6` (GUI), `requests` + `beautifulsoup4` (ČSFD scraping),
  `truststore` (OS cert store for HTTPS). Declared `python-dotenv`, `pillow`,
@@ -67,7 +67,7 @@ movie table, and one-click Filmotéka generation.
  and files are never moved manually, so it is not exposed to path drift.
 - **Import dialog:** collects only **Title** + **ČSFD link**. The file is renamed
  to `Title.ext`. When a ČSFD link is given, Curator fetches the movie and assigns
-  Žánr / Rok / Země tags automatically; further tags can be added via the UI.
+  Žánr / Rok / Země původu tags automatically; further tags can be added via the UI.
 - **Genres:** a movie can have **multiple genres**, so it appears under each of
  its genre branches in the Filmotéka (multiple hardlinks).
 - **Pool layout:** two top-level folders — **Filmy** and **Seriály**. Movies are
@@ -84,7 +84,7 @@ movie table, and one-click Filmotéka generation.
  the source is left in place.
 - **Filmotéka tree:** **one level per category** — `output/Category/Tag/film`
  (hardlink), same shape as the current hardlink manager. For now the tree is
-  built from these categories: **Rok**, **Žánr**, **Hodnocení**.
+  built from these categories: **Rok**, **Žánr**, **Země původu**, **Hodnocení**.

 ## Tasks

@@ -96,7 +96,8 @@ movie table, and one-click Filmotéka generation.
 - Filmy / Seriály top-level folder handling in the pool
 - "Import movie" dialog (Title + ČSFD link), copy into pool/Filmy as Title.ext
 - Remove-from-pool (delete file + its metadata)
-- Generate the Filmotéka hardlink tree from the pool (Rok / Žánr / Hodnocení)
+- Generate the Filmotéka hardlink tree from the pool (Rok / Žánr / Země původu /
+  Hodnocení)
 - Filmotéka fully regenerable from the pool alone (delete output = no loss)
 - GUI reframed around the Filmotéka and rewritten in PySide6
 - Seriály "copy-as-is" mirror: pool/Seriály cloned 1:1 into the output as
@@ -108,10 +109,19 @@ movie table, and one-click Filmotéka generation.
  from the GUI); each is mirrored 1:1 during Filmotéka generation (Seriály default)
 - README.md written (overview, concepts, workflow, run/build instructions)
 - ČSFD scraping (`csfd.py`, ported from Tagger devel): `File.apply_csfd_tags`
-  fetches a movie and assigns Žánr / Rok / Země tags (cached in metadata); wired
+  fetches a movie and assigns Žánr / Rok / Země původu tags (cached in metadata); wired
  into the GUI (auto-fetch on import with a ČSFD link, plus "Načíst tagy z ČSFD").
  Parsing updated for current ČSFD HTML and verified live against Matrix
  (film/9499); HTTPS uses the OS cert store via `truststore` (corporate SSL)
+- ČSFD Anubis anti-bot wall handled: `csfd.py` detects the proof-of-work
+  challenge page, solves it (SHA-256 PoW matching the bundled worker JS) and
+  replays via a shared `requests.Session`, so Žánr / Rok / Země původu tags load again
+  (the "nalezeno 1 film, načteno 0 tagů" symptom). Verified live (Matrix 1999)
+- Removed the inherited Tagger predefined tags: `DEFAULT_TAGS` is now empty
+  (no Hodnocení ⭐ / Barva categories) and new files no longer get an automatic
+  `Stav/Nové` tag. Tags now come from ČSFD (Žánr / Rok / Země původu) and manual edits.
+  Note: `Hodnocení` is still listed in `FILMOTEKA_CATEGORIES`, so that branch is
+  simply empty until something assigns a Hodnocení tag again
 - Fixed template cruft: `src/constants.py` made consistent (Curator values,
  `get_version`/`get_debug_mode` API) and `test_constants.py` aligned; removed
  the imported `tagger/` devel dump
@@ -27,7 +27,7 @@ hardlink-tree machinery is inherited and extended into a full library workflow.
 2. **Import a movie**: pick a video, enter its **Title** and a **ČSFD link**. The
   file is copied (non-destructively) into `pool/Filmy` as `Title.ext` and
   recorded in the index. If a ČSFD link is given, Curator fetches the movie from
-   [ČSFD.cz](https://www.csfd.cz) and assigns **Žánr / Rok / Země** tags
+   [ČSFD.cz](https://www.csfd.cz) and assigns **Žánr / Rok / Země původu** tags
   automatically (use "Načíst tagy z ČSFD" to (re)fetch later).
 3. **Tag** movies (Rok, Žánr, Hodnocení, …) and filter them in the UI.
 4. **Generate the Filmotéka**: movies become a `Category/Tag/film` hardlink tree
@@ -0,0 +1,107 @@
+"""One-off migration: rename a tag category inside a pool's metadata index.
+
+Tags are stored in ``<pool>/.Curator.!index`` as ``"Category/Name"`` strings.
+This rewrites every tag whose category matches ``--old`` to use ``--new``,
+leaving the tag name untouched. A timestamped backup of the index is written
+before saving.
+
+Usage:
+    poetry run python scripts/migrate_tag_category.py <pool_dir> \
+        --old "Země" --new "Země původu"
+
+If ``<pool_dir>`` is omitted, the pool from the global config is used.
+"""
+
+from __future__ import annotations
+
+import sys
+import json
+import shutil
+import argparse
+from pathlib import Path
+from datetime import datetime
+
+from loguru import logger
+
+# Allow running as a plain script (``python scripts/...``) by exposing the repo root.
+sys.path.insert(0, str(Path(__file__).resolve().parent.parent))
+
+from src.core.config import load_global_config  # noqa: E402
+from src.core.pool_index import INDEX_FILENAME  # noqa: E402
+
+
+def _rename_in_tags(tags: list[str], old: str, new: str) -> tuple[list[str], int]:
+    """Return (rewritten tags, number of tags changed) for one record."""
+    prefix = f"{old}/"
+    changed = 0
+    result: list[str] = []
+    for tag in tags:
+        if isinstance(tag, str) and tag.startswith(prefix):
+            result.append(f"{new}/{tag[len(prefix):]}")
+            changed += 1
+        else:
+            result.append(tag)
+    return result, changed
+
+
+def migrate(index_path: Path, old: str, new: str) -> int:
+    """Rewrite the category in place and return the number of tags changed."""
+    with open(index_path, "r", encoding="utf-8") as f:
+        data = json.load(f)
+
+    movies: dict[str, dict] = data.get("movies", {})
+    total_changed = 0
+    affected_records = 0
+    for key, record in movies.items():
+        tags = record.get("tags", [])
+        new_tags, changed = _rename_in_tags(tags, old, new)
+        if changed:
+            record["tags"] = new_tags
+            total_changed += changed
+            affected_records += 1
+            logger.debug(f"{key}: {changed} tag(s) renamed")
+
+    if total_changed == 0:
+        logger.info(f"No '{old}/…' tags found — nothing to migrate")
+        return 0
+
+    backup = index_path.with_suffix(
+        index_path.suffix + f".bak-{datetime.now():%Y%m%d-%H%M%S}"
+    )
+    shutil.copy2(index_path, backup)
+    logger.info(f"Backup written: {backup}")
+
+    with open(index_path, "w", encoding="utf-8") as f:
+        json.dump(data, f, indent=2, ensure_ascii=False)
+
+    logger.info(
+        f"Migrated '{old}' → '{new}': {total_changed} tag(s) "
+        f"across {affected_records} record(s)"
+    )
+    return total_changed
+
+
+def main() -> None:
+    parser = argparse.ArgumentParser(description=__doc__)
+    parser.add_argument(
+        "pool_dir",
+        nargs="?",
+        help="Pool root (default: pool_dir from the global config)",
+    )
+    parser.add_argument("--old", default="Země", help="Category to rename from")
+    parser.add_argument("--new", default="Země původu", help="Category to rename to")
+    args = parser.parse_args()
+
+    pool_dir = args.pool_dir or load_global_config().get("pool_dir")
+    if not pool_dir:
+        parser.error("No pool_dir given and none configured in the global config")
+
+    index_path = Path(pool_dir) / INDEX_FILENAME
+    if not index_path.exists():
+        parser.error(f"No index found at {index_path}")
+
+    migrate(index_path, args.old, args.new)
+
+
+if __name__ == "__main__":
+    main()
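
Reviewer note: the migration matches on the prefix `"<old>/"` rather than on the bare category name, which is what keeps the rename safe to re-run. A minimal standalone sketch of that behaviour follows; `rename_category` and the sample record are hypothetical stand-ins for illustration, not code from this commit.

```python
def rename_category(tags, old, new):
    """Rewrite "old/Name" tags to "new/Name"; leave every other tag untouched."""
    prefix = f"{old}/"
    return [
        f"{new}/{tag[len(prefix):]}"
        if isinstance(tag, str) and tag.startswith(prefix)
        else tag
        for tag in tags
    ]


# Hypothetical record: one legacy tag plus one tag that is already migrated.
record = {"tags": ["Žánr/Sci-Fi", "Země/USA", "Rok/1999", "Země původu/Austrálie"]}

migrated = rename_category(record["tags"], "Země", "Země původu")
# "Země původu/Austrálie" does not start with "Země/" (the next character is a
# space, not a slash), so a second pass changes nothing.
migrated_again = rename_category(migrated, "Země", "Země původu")
```

Because the slash is part of the prefix, a category whose name merely begins with the old name is left alone, so running the script twice against the same index should be a no-op.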
@@ -8,10 +8,14 @@ from __future__ import annotations

 import re
 import json
+import time
+import hashlib
 from dataclasses import dataclass, field
 from typing import Optional, TYPE_CHECKING
 from urllib.parse import urljoin

+from loguru import logger
+
 try:
    import requests
    from bs4 import BeautifulSoup
@@ -34,6 +38,16 @@ HEADERS = {
    "Accept-Language": "cs,en;q=0.9",
 }

+# Anubis is the proof-of-work anti-bot wall ČSFD now puts in front of every page.
+# A plain request gets a 200 with a JS challenge page (title "Ujišťujeme se, že
+# nejste robot!") instead of the movie, so JSON-LD/genres/year all parse empty.
+# We detect that page, solve the PoW the way the bundled worker JS does, and
+# replay the request through the same session to obtain the auth cookie.
+ANUBIS_CHALLENGE_MARKER = 'id="anubis_challenge"'
+ANUBIS_PASS_PATH = "/.within.website/x/cmd/anubis/api/pass-challenge"
+# Safety cap so a difficulty bump can never spin forever (difficulty 1 needs ~16).
+ANUBIS_MAX_NONCE = 50_000_000
+

@dataclass
 class CSFDMovie:
@@ -123,12 +137,103 @@ def _parse_duration(duration_str: str) -> Optional[int]:
    return int(match.group(1)) if match else None


-def fetch_movie(url: str) -> CSFDMovie:
+def _extract_json_blob(html: str, element_id: str):
+    """Return the parsed JSON from an Anubis ``<script id=...>`` blob, or None."""
+    match = re.search(
+        rf'<script id="{re.escape(element_id)}" type="application/json">(.*?)</script>',
+        html,
+        re.S,
+    )
+    if not match:
+        return None
+    try:
+        return json.loads(match.group(1))
+    except json.JSONDecodeError:
+        return None
+
+
+def _solve_anubis_pow(random_data: str, difficulty: int) -> tuple[str, int, int]:
+    """Brute-force the Anubis proof-of-work.
+
+    Mirrors the bundled ``sha256-purejs`` worker: find the smallest ``nonce``
+    such that ``sha256(random_data + str(nonce))`` has ``difficulty`` leading
+    zero nibbles. Returns ``(hash_hex, nonce, elapsed_ms)``.
+    """
+    full_zero_bytes = difficulty // 2
+    needs_half_byte = difficulty % 2 != 0
+    start = time.monotonic()
+    for nonce in range(ANUBIS_MAX_NONCE):
+        digest = hashlib.sha256(f"{random_data}{nonce}".encode()).digest()
+        if any(digest[i] != 0 for i in range(full_zero_bytes)):
+            continue
+        if needs_half_byte and digest[full_zero_bytes] >> 4 != 0:
+            continue
+        elapsed_ms = int((time.monotonic() - start) * 1000)
+        return digest.hex(), nonce, elapsed_ms
+    raise ValueError(
+        f"Anubis PoW unsolved within {ANUBIS_MAX_NONCE} attempts (difficulty {difficulty})"
+    )
+
+
+def _solve_anubis_challenge(session, html: str, url: str):
+    """Solve the Anubis challenge in ``html`` and return the real page response.
+
+    Posts the proof-of-work back to the pass-challenge endpoint through
+    ``session`` (which stores the resulting auth cookie) and follows the
+    redirect to the originally requested page.
+    """
+    payload = _extract_json_blob(html, "anubis_challenge")
+    if not payload:
+        raise ValueError("ČSFD anti-bot stránka bez čitelné Anubis challenge")
+
+    rules = payload.get("rules", {})
+    challenge = payload.get("challenge", {})
+    random_data = challenge.get("randomData")
+    difficulty = int(rules.get("difficulty", 1))
+    if not random_data:
+        raise ValueError("Anubis challenge neobsahuje randomData")
+
+    base_prefix = _extract_json_blob(html, "anubis_base_prefix") or ""
+    logger.debug(f"Solving Anubis challenge (difficulty {difficulty}) for {url}")
+    hash_hex, nonce, elapsed_ms = _solve_anubis_pow(random_data, difficulty)
+    logger.debug(f"Anubis solved: nonce={nonce}, elapsed={elapsed_ms}ms")
+
+    pass_url = urljoin(CSFD_BASE_URL, f"{base_prefix}{ANUBIS_PASS_PATH}")
+    response = session.get(
+        pass_url,
+        params={
+            "id": challenge.get("id"),
+            "response": hash_hex,
+            "nonce": nonce,
+            "redir": url,
+            "elapsedTime": elapsed_ms,
+        },
+        headers=HEADERS,
+        timeout=10,
+    )
+    response.raise_for_status()
+    if ANUBIS_CHALLENGE_MARKER in response.text:
+        raise ValueError("ČSFD Anubis challenge se nepodařilo vyřešit (odmítnuto)")
+    return response
+
+
+def _get_page(session, url: str):
+    """GET ``url`` through ``session``, transparently clearing an Anubis wall."""
+    response = session.get(url, headers=HEADERS, timeout=10)
+    response.raise_for_status()
+    if ANUBIS_CHALLENGE_MARKER in response.text:
+        response = _solve_anubis_challenge(session, response.text, url)
+    return response
+
+
+def fetch_movie(url: str, session=None) -> CSFDMovie:
    """
    Fetch movie information from a CSFD.cz URL.

    Args:
        url: Full CSFD.cz movie URL (e.g., https://www.csfd.cz/film/9423-pane-vy-jste-vdova/)
+        session: Optional ``requests.Session`` to reuse (keeps the Anubis auth
+            cookie across calls so only the first fetch pays the PoW cost).

    Returns:
        CSFDMovie object with extracted data
@@ -140,8 +245,14 @@ def fetch_movie(url: str) -> CSFDMovie:
    """
    _check_dependencies()

-    response = requests.get(url, headers=HEADERS, timeout=10)
-    response.raise_for_status()
+    own_session = session is None
+    if own_session:
+        session = requests.Session()
+    try:
+        response = _get_page(session, url)
+    finally:
+        if own_session:
+            session.close()

    soup = BeautifulSoup(response.text, "html.parser")

@@ -378,8 +489,8 @@ def search_movies(query: str, limit: int = 10) -> list[CSFDMovie]:
    _check_dependencies()

    search_url = f"{CSFD_SEARCH_URL}?q={requests.utils.quote(query)}"
-    response = requests.get(search_url, headers=HEADERS, timeout=10)
-    response.raise_for_status()
+    with requests.Session() as session:
+        response = _get_page(session, search_url)

    soup = BeautifulSoup(response.text, "html.parser")
    results = []
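
Reviewer note: the byte-level check in `_solve_anubis_pow` (whole zero bytes plus an optional high nibble) should be equivalent to counting leading `0` characters in the hex digest, since each hex character encodes one nibble. The sketch below restates the solver in that form for reading purposes only. The names `meets_difficulty` and `solve`, the seed string and the smaller attempt cap are assumptions made for the example, and nothing here talks to ČSFD or Anubis.

```python
import hashlib


def meets_difficulty(hash_hex: str, difficulty: int) -> bool:
    """True when the hex digest starts with `difficulty` zero nibbles."""
    return hash_hex.startswith("0" * difficulty)


def solve(random_data: str, difficulty: int, max_nonce: int = 1_000_000) -> tuple[str, int]:
    """Return (hash_hex, nonce) for the smallest nonce that meets the difficulty."""
    for nonce in range(max_nonce):
        hash_hex = hashlib.sha256(f"{random_data}{nonce}".encode()).hexdigest()
        if meets_difficulty(hash_hex, difficulty):
            return hash_hex, nonce
    raise ValueError(f"no nonce found within {max_nonce} attempts")


# Illustrative seed; a real challenge supplies its own randomData value.
hash_hex, nonce = solve("example-seed", difficulty=2)
```

Each extra nibble of difficulty multiplies the expected number of attempts by 16, which is why the cap (`ANUBIS_MAX_NONCE` in the commit) matters if the site raises the difficulty.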
@@ -51,9 +51,6 @@ class File:
        self.title = None
        self.csfd_link = None
        self.csfd_cache = None
-        if self.tagmanager:
-            tag = self.tagmanager.add_tag("Stav", "Nové")
-            self.tags.append(tag)

    def _build_record(self) -> dict:
        data = {
@@ -142,7 +139,7 @@ class File:
    def apply_csfd_tags(
        self, add_genres: bool = True, add_year: bool = True, add_country: bool = True
    ) -> dict:
-        """Fetch information from CSFD and assign tags (Žánr, Rok, Země); caches the data.
+        """Fetch information from CSFD and assign tags (Žánr, Rok, Země původu); caches the data.

        Returns:
            dict with keys 'success', 'movie'/'error', 'tags_added'
@@ -173,7 +170,7 @@ class File:
        if add_year and movie.year:
            _add("Rok", str(movie.year))
        if add_country and movie.country:
-            _add("Země", movie.country)
+            _add("Země původu", movie.country)

        # Use the CSFD title if we don't have one yet
        if movie.title and not self.title:
@@ -1,15 +1,15 @@
 from .tag import Tag

-# Default tags that are always available (order in list = display order)
-DEFAULT_TAGS = {
-    "Hodnocení": ["⭐", "⭐⭐", "⭐⭐⭐", "⭐⭐⭐⭐", "⭐⭐⭐⭐⭐"],
-    "Barva": ["🔴 Červená", "🟠 Oranžová", "🟡 Žlutá", "🟢 Zelená", "🔵 Modrá", "🟣 Fialová"],
-}
+# Default tags that are always available (order in list = display order).
+# The legacy Tagger presets (Hodnocení / Barva) were removed for Curator; the
+# pool is driven by ČSFD-derived tags (Žánr / Rok / Země původu). Add entries here to
+# reintroduce always-available predefined tags.
+DEFAULT_TAGS: dict[str, list[str]] = {}

 # Tag sort order for default categories (preserves display order)
-DEFAULT_TAG_ORDER = {
-    "Hodnocení": {name: i for i, name in enumerate(DEFAULT_TAGS["Hodnocení"])},
-    "Barva": {name: i for i, name in enumerate(DEFAULT_TAGS["Barva"])},
+DEFAULT_TAG_ORDER: dict[str, dict[str, int]] = {
+    category: {name: i for i, name in enumerate(names)}
+    for category, names in DEFAULT_TAGS.items()
 }
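
Reviewer note: with the comprehension, `DEFAULT_TAG_ORDER` follows `DEFAULT_TAGS` automatically, so the empty dict yields an empty order table and any category added later gets positions for free. A small sketch of that derivation is below. The `Kvalita` category and the sort key are hypothetical, and how `TagManager` actually consumes the table is not shown in this diff.

```python
def build_tag_order(default_tags: dict[str, list[str]]) -> dict[str, dict[str, int]]:
    """Map each category to {tag name: display position}, mirroring the comprehension."""
    return {
        category: {name: i for i, name in enumerate(names)}
        for category, names in default_tags.items()
    }


# With no presets the order table is empty as well.
empty_order = build_tag_order({})

# Hypothetical preset category, listed in the intended display order.
order = build_tag_order({"Kvalita": ["SD", "HD", "4K"]})

# Assumed usage: sort by predefined position, unknown names go last.
positions = order["Kvalita"]
sorted_names = sorted(["4K", "SD", "HD"], key=lambda n: positions.get(n, len(positions)))
```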


@@ -31,7 +31,7 @@ from src.core.constants import APP_NAME, VERSION
 from src.core.hardlink_manager import HardlinkManager

 # Categories that drive the generated Filmotéka tree (see PROJECT.md)
-FILMOTEKA_CATEGORIES = ["Rok", "Žánr", "Hodnocení"]
+FILMOTEKA_CATEGORIES = ["Rok", "Žánr", "Země původu", "Hodnocení"]


 class ImportMovieDialog(QDialog):
@@ -101,7 +101,7 @@ class AssignTagsDialog(QDialog):
                else:
                    state = Qt.PartiallyChecked
                item = QTreeWidgetItem([tag.name])
-                item.setFlags(Qt.ItemIsUserCheckable | Qt.ItemIsEnabled | Qt.ItemIsTristate)
+                item.setFlags(Qt.ItemIsUserCheckable | Qt.ItemIsEnabled | Qt.ItemIsAutoTristate)
                item.setCheckState(0, state)
                cat_item.addChild(item)
                self._items.append((tag.full_path, item))
@@ -213,8 +213,8 @@ class QtApp(QMainWindow):
        search_row.addWidget(import_btn)
        main_layout.addLayout(search_row)

-        self.table = QTableWidget(0, 5)
-        self.table.setHorizontalHeaderLabels(["Název", "Datum", "Štítky", "Velikost", "ČSFD"])
+        self.table = QTableWidget(0, 3)
+        self.table.setHorizontalHeaderLabels(["Název", "Štítky", "Velikost"])
        self.table.setSelectionBehavior(QAbstractItemView.SelectRows)
        self.table.setSelectionMode(QAbstractItemView.ExtendedSelection)
        self.table.setEditTriggers(QAbstractItemView.NoEditTriggers)
@@ -223,8 +223,8 @@ class QtApp(QMainWindow):
        self.table.doubleClicked.connect(lambda _: self.open_movies())
        self.table.itemSelectionChanged.connect(self._update_selection_status)
        header = self.table.horizontalHeader()
-        header.setSectionResizeMode(0, QHeaderView.Stretch)
-        header.setSectionResizeMode(2, QHeaderView.Stretch)
+        header.setSectionResizeMode(0, QHeaderView.Stretch)  # Název
+        header.setSectionResizeMode(1, QHeaderView.Stretch)  # Štítky
        main_layout.addWidget(self.table)

        splitter.addWidget(main)
@@ -300,8 +300,7 @@ class QtApp(QMainWindow):
                size = self._format_size(f.file_path.stat().st_size)
            except OSError:
                size = "?"
-            csfd = "🔗" if f.csfd_link else ""
-            for col, value in enumerate([name, f.date or "", tags, size, csfd]):
+            for col, value in enumerate([name, tags, size]):
                self.table.setItem(row, col, QTableWidgetItem(value))

        self.refresh_sidebar()
@@ -16,9 +16,19 @@ from src.core.csfd import (
    _extract_genres,
    _extract_origin_info,
    _check_dependencies,
+    _solve_anubis_pow,
 )


+def _mock_session(mock_requests):
+    """Wire ``mock_requests`` so ``requests.Session()`` (also as a context
+    manager) yields a single configurable session mock and return it."""
+    session = MagicMock()
+    session.__enter__.return_value = session
+    mock_requests.Session.return_value = session
+    return session
+
+
 # Sample HTML for testing
 SAMPLE_JSON_LD = """
 {
@@ -219,7 +229,8 @@ class TestFetchMovie:
        mock_response = MagicMock()
        mock_response.text = SAMPLE_HTML
        mock_response.raise_for_status = MagicMock()
-        mock_requests.get.return_value = mock_response
+        session = _mock_session(mock_requests)
+        session.get.return_value = mock_response

        movie = fetch_movie("https://www.csfd.cz/film/123-test/")

@@ -227,13 +238,14 @@ class TestFetchMovie:
        assert movie.csfd_id == 123
        assert movie.rating == 86
        assert "Drama" in movie.genres
-        mock_requests.get.assert_called_once()
+        session.get.assert_called_once()

    @patch("src.core.csfd.requests")
    def test_fetch_movie_network_error(self, mock_requests):
        """Test network error handling."""
        import requests as real_requests
-        mock_requests.get.side_effect = real_requests.RequestException("Network error")
+        session = _mock_session(mock_requests)
+        session.get.side_effect = real_requests.RequestException("Network error")

        with pytest.raises(real_requests.RequestException):
            fetch_movie("https://www.csfd.cz/film/123/")
@@ -254,7 +266,8 @@ class TestSearchMovies:
        mock_response = MagicMock()
        mock_response.text = search_html
        mock_response.raise_for_status = MagicMock()
-        mock_requests.get.return_value = mock_response
+        session = _mock_session(mock_requests)
+        session.get.return_value = mock_response
        mock_requests.utils.quote = lambda x: x

        results = search_movies("test", limit=10)
@@ -277,6 +290,24 @@ class TestFetchMovieById:
        assert movie.title == "Test"


+class TestAnubisPoW:
+    """Tests for the Anubis proof-of-work solver."""
+
+    def test_solve_pow_difficulty_one(self):
+        """Difficulty 1 requires a single leading zero nibble in the hash."""
+        import hashlib
+
+        random_data = "abc123"
+        hash_hex, nonce, _ = _solve_anubis_pow(random_data, difficulty=1)
+        assert hash_hex[0] == "0"
+        assert hashlib.sha256(f"{random_data}{nonce}".encode()).hexdigest() == hash_hex
+
+    def test_solve_pow_difficulty_two(self):
+        """Difficulty 2 requires two leading zero nibbles (one zero byte)."""
+        hash_hex, _, _ = _solve_anubis_pow("seed", difficulty=2)
+        assert hash_hex[:2] == "00"
+
+
 class TestDependencyCheck:
    """Tests for dependency checking."""

@@ -40,10 +40,9 @@ class TestFile:
        assert file_obj.metadata_filename == expected

    def test_file_initial_tags(self, test_file, tag_manager):
-        """Test that a new file has the Stav/Nové tag"""
+        """Test that a new file has no automatic tags (Stav/Nové removed)"""
        file_obj = File(test_file, tag_manager)
-        assert len(file_obj.tags) == 1
-        assert file_obj.tags[0].full_path == "Stav/Nové"
+        assert file_obj.tags == []

    def test_file_metadata_saved(self, test_file, tag_manager):
        """Test that metadata is saved on creation"""
@@ -75,13 +74,12 @@ class TestFile:

         # Create a new object - it should load the metadata
        file_obj2 = File(test_file, tag_manager)
-        assert len(file_obj2.tags) == 2  # Stav/Nové + Video/HD
+        assert len(file_obj2.tags) == 1  # Video/HD
        assert file_obj2.date == "2025-01-15"

        # Check that the tags contain the correct values
        tag_paths = {tag.full_path for tag in file_obj2.tags}
        assert "Video/HD" in tag_paths
-        assert "Stav/Nové" in tag_paths

    def test_file_set_date(self, test_file, tag_manager):
        """Test setting the date"""
@@ -115,7 +113,7 @@ class TestFile:
        file_obj.add_tag(tag)

        assert tag in file_obj.tags
-        assert len(file_obj.tags) == 2  # Stav/Nové + Video/4K
+        assert len(file_obj.tags) == 1  # Video/4K

    def test_file_add_tag_string(self, test_file, tag_manager):
        """Test adding a tag as a string"""
@@ -182,7 +180,7 @@ class TestFile:
         """Test File without a TagManager"""
        file_obj = File(test_file, tagmanager=None)
        assert file_obj.tagmanager is None
-        assert len(file_obj.tags) == 0  # Without a TagManager, Stav/Nové is not added
+        assert len(file_obj.tags) == 0  # a new file has no automatic tags

    def test_file_metadata_persistence(self, test_file, tag_manager):
        """Test that metadata survives a reload"""
@@ -42,7 +42,7 @@ class TestHardlinkManager:

        # File 1 with multiple tags
        f1 = File(temp_source_dir / "file1.txt", tag_manager)
-        f1.tags.clear()  # Remove default "Stav/Nové" tag
+        f1.tags.clear()  # ensure a clean tag set
        f1.add_tag(Tag("žánr", "Komedie"))
        f1.add_tag(Tag("žánr", "Akční"))
        f1.add_tag(Tag("rok", "1988"))
@@ -50,13 +50,13 @@ class TestHardlinkManager:

        # File 2 with one tag
        f2 = File(temp_source_dir / "file2.txt", tag_manager)
-        f2.tags.clear()  # Remove default "Stav/Nové" tag
+        f2.tags.clear()  # ensure a clean tag set
        f2.add_tag(Tag("žánr", "Drama"))
        files.append(f2)

        # File 3 with no tags
        f3 = File(temp_source_dir / "file3.txt", tag_manager)
-        f3.tags.clear()  # Remove default "Stav/Nové" tag
+        f3.tags.clear()  # ensure a clean tag set
        files.append(f3)

        return files
@@ -63,7 +63,7 @@ class TestFileWithIndex:

        assert not f.metadata_filename.exists()  # no sidecar
        assert index.get(movie) is not None  # record created in index
-        assert f.tags[0].full_path == "Stav/Nové"
+        assert f.tags == []  # no automatic tags

    def test_index_backed_metadata_persists_across_reload(self, tmp_path):
        index = PoolIndex(tmp_path)
@@ -13,24 +13,12 @@ class TestTagManager:

    @pytest.fixture
    def empty_tag_manager(self):
-        """Fixture for an empty TagManager (without default tags)"""
-        tm = TagManager()
-        # Remove the default tags for tests that need an empty manager
-        for category in list(tm.tags_by_category.keys()):
-            tm.remove_category(category)
-        return tm
+        """Fixture for an empty TagManager (alias of tag_manager, no default tags)"""
+        return TagManager()

-    def test_tag_manager_creation_has_defaults(self, tag_manager):
-        """Test that a newly created TagManager contains the default tags"""
-        assert "Hodnocení" in tag_manager.tags_by_category
-        assert "Barva" in tag_manager.tags_by_category
-
-    def test_tag_manager_default_tags_count(self, tag_manager):
-        """Test the number of default tags"""
-        # Hodnocení has 5 star ratings
-        assert len(tag_manager.tags_by_category["Hodnocení"]) == 5
-        # Barva has 6 colors
-        assert len(tag_manager.tags_by_category["Barva"]) == 6
+    def test_tag_manager_creation_has_no_defaults(self, tag_manager):
+        """Test that a new TagManager has no predefined tags"""
+        assert tag_manager.tags_by_category == {}

    def test_add_category(self, tag_manager):
        """Test adding a category"""
@@ -141,11 +129,9 @@ class TestTagManager:
        assert "Video/4K" in tags
        assert "Audio/MP3" in tags

-    def test_get_all_tags_includes_defaults(self, tag_manager):
-        """Test that get_all_tags includes the default tags"""
-        tags = tag_manager.get_all_tags()
-        # At least 11 default tags (5 ratings + 6 colors)
-        assert len(tags) >= 11
+    def test_get_all_tags_empty_on_fresh_manager(self, tag_manager):
+        """Test that a fresh TagManager has no tags (no defaults)"""
+        assert tag_manager.get_all_tags() == []

    def test_get_categories_empty(self, empty_tag_manager):
        """Test getting categories (empty manager)"""
@@ -164,11 +150,9 @@ class TestTagManager:
        assert "Audio" in categories
        assert "Foto" in categories

-    def test_get_categories_includes_defaults(self, tag_manager):
-        """Test that get_categories includes the default categories"""
-        categories = tag_manager.get_categories()
-        assert "Hodnocení" in categories
-        assert "Barva" in categories
+    def test_get_categories_empty_on_fresh_manager(self, tag_manager):
+        """Test that a fresh TagManager has no categories (no defaults)"""
+        assert tag_manager.get_categories() == []

    def test_get_tags_in_category_empty(self, tag_manager):
        """Test getting tags from an empty category"""
@@ -230,81 +214,33 @@ class TestTagManager:


 class TestDefaultTags:
-    """Tests for default tags"""
+    """Tests for default tags (the legacy Tagger presets were removed)"""

    def test_default_tags_constant_exists(self):
-        """Test that the DEFAULT_TAGS constant exists"""
-        assert DEFAULT_TAGS is not None
+        """Test that the DEFAULT_TAGS constant exists and is empty"""
        assert isinstance(DEFAULT_TAGS, dict)
+        assert DEFAULT_TAGS == {}

-    def test_default_tags_has_hodnoceni(self):
-        """Test that DEFAULT_TAGS contains Hodnocení"""
-        assert "Hodnocení" in DEFAULT_TAGS
-        assert len(DEFAULT_TAGS["Hodnocení"]) == 5
+    def test_legacy_presets_removed(self):
+        """Test that the old predefined categories (Hodnocení, Barva) are gone"""
+        assert "Hodnocení" not in DEFAULT_TAGS
+        assert "Barva" not in DEFAULT_TAGS

-    def test_default_tags_has_barva(self):
-        """Test that DEFAULT_TAGS contains Barva"""
-        assert "Barva" in DEFAULT_TAGS
-        assert len(DEFAULT_TAGS["Barva"]) == 6
-
-    def test_hodnoceni_stars_content(self):
-        """Test the star contents of Hodnocení"""
-        stars = DEFAULT_TAGS["Hodnocení"]
-        assert "⭐" in stars
-        assert "⭐⭐⭐⭐⭐" in stars
-
-    def test_barva_colors_content(self):
-        """Test the color contents of Barva"""
-        colors = DEFAULT_TAGS["Barva"]
-        # Check that it contains some of the colors
-        color_names = " ".join(colors)
-        assert "Červená" in color_names
-        assert "Zelená" in color_names
-        assert "Modrá" in color_names
-
-    def test_tag_manager_loads_all_default_tags(self):
-        """Test that TagManager loads all default tags"""
+    def test_tag_manager_starts_empty(self):
+        """Test that a TagManager without defaults starts empty"""
        tm = TagManager()
+        assert tm.get_all_tags() == []
+        assert tm.get_categories() == []

-        for category, tag_names in DEFAULT_TAGS.items():
-            assert category in tm.tags_by_category
-            tags_in_category = tm.get_tags_in_category(category)
-            assert len(tags_in_category) == len(tag_names)
-
-    def test_can_add_custom_tags_alongside_defaults(self):
-        """Test that custom tags can be added alongside the defaults"""
+    def test_can_add_custom_tags(self):
+        """Test that custom tags can be added to an empty manager"""
        tm = TagManager()
-        initial_count = len(tm.get_all_tags())

        tm.add_tag("Custom", "MyTag")

-        assert len(tm.get_all_tags()) == initial_count + 1
+        assert tm.get_all_tags() == ["Custom/MyTag"]
        assert "Custom" in tm.get_categories()

-    def test_can_remove_default_category(self):
-        """Test that a default category can be removed"""
-        tm = TagManager()
-        tm.remove_category("Hodnocení")
-
-        assert "Hodnocení" not in tm.tags_by_category
-
-    def test_hodnoceni_tags_are_sorted_by_stars(self):
-        """Test that the Hodnocení tags are sorted from 1 to 5 stars"""
-        tm = TagManager()
-        tags = tm.get_tags_in_category("Hodnocení")
-
-        tag_names = [t.name for t in tags]
-        assert tag_names == ["⭐", "⭐⭐", "⭐⭐⭐", "⭐⭐⭐⭐", "⭐⭐⭐⭐⭐"]
-
-    def test_barva_tags_are_sorted_in_predefined_order(self):
-        """Test that the Barva tags are sorted in the predefined order"""
-        tm = TagManager()
-        tags = tm.get_tags_in_category("Barva")
-
-        tag_names = [t.name for t in tags]
-        expected = ["🔴 Červená", "🟠 Oranžová", "🟡 Žlutá", "🟢 Zelená", "🔵 Modrá", "🟣 Fialová"]
-        assert tag_names == expected
-
    def test_custom_category_tags_sorted_alphabetically(self):
        """Test that tags in a custom category are sorted alphabetically"""
        tm = TagManager()
@@ -316,12 +252,3 @@ class TestDefaultTags:
        tag_names = [t.name for t in tags]

        assert tag_names == ["4K", "HD", "SD"]
-
-    def test_can_add_tag_to_default_category(self):
-        """Test that a tag can be added to a default category"""
-        tm = TagManager()
-        initial_count = len(tm.get_tags_in_category("Hodnocení"))
-
-        tm.add_tag("Hodnocení", "Custom Rating")
-
-        assert len(tm.get_tags_in_category("Hodnocení")) == initial_count + 1