diff --git a/docs/07-pre-import-cleanup.md b/docs/07-pre-import-cleanup.md
new file mode 100644
index 0000000..321b836
--- /dev/null
+++ b/docs/07-pre-import-cleanup.md
@@ -0,0 +1,1002 @@
+# 07 — Pre-Import Cleanup Ruleset (tv.s8n.ru)
+
+Last updated: 2026-05-08
+Server: Jellyfin 10.10.3 on nullstone, container `jellyfin`
+Library root inside container: `/media`
+Library root on host: `/home/user/media`
+
+This document defines the **normative pre-import cleanup ruleset** for the
+personal Jellyfin deploy. The owner downloads scene/group releases (e.g.
+`Futurama Season 1 [1080p AI x265 10bit FS99 Joy]/`) which contain a mixture
+of media files and non-media junk (codec readmes, release-group brags, Windows
+installer shortcuts, comparison images, OS thumbnail caches, etc.). This junk
+must NOT land in `/home/user/media/` because:
+
+1. It clutters the library and confuses scrapers.
+2. Promo PNGs may be mis-identified as artwork.
+3. Release-group `.nfo` files break the NFO-override flow (doc 02 § 11).
+4. **Windows executables and installer shortcuts (`.exe`, `.msi`, `.website`,
+ `.url`, `.lnk`, `.scr`, `.bat`, `.ps1`) are a real security vector.** Even
+ though the Linux server cannot execute them, friends with a Jellyfin
+ account can download them through the web UI and run them on their PC.
+
+Cross-linked to:
+
+- [`01-artwork-and-images.md`](01-artwork-and-images.md) — what counts as a
+ recognised poster / backdrop on disk.
+- [`02-metadata-and-titles.md`](02-metadata-and-titles.md) — NFO sidecar
+ override flow; what a "real" Jellyfin NFO looks like.
+- [`03-subtitles.md`](03-subtitles.md) — which subtitle files to keep.
+- [`05-file-structure-rules.md`](05-file-structure-rules.md) — canonical
+ folder layout. § 8 of doc 05 defines the recognised extras subfolders;
+ this doc enforces them at import time.
+- [`08-filename-normalization.md`](08-filename-normalization.md) — the
+ **next** stage of the pipeline (sibling agent), called after this doc's
+ `cleanup-import.sh` has produced a clean staging tree.
+
+Sources of truth:
+
+- — extras subfolders
+ and artwork filename patterns.
+- — same for series.
+- — NFO XML schema;
+ used here to distinguish a real metadata NFO from a release-group brag.
+
+---
+
+## 0. Top-level cleanup rules
+
+These are non-negotiable. They wrap the doc 05 top-level rules with one
+guarantee: **nothing leaves staging until cleanup has run and been
+confirmed.**
+
+1. **Never clean in-place on the source download.** The download directory
+ (`/home/admin/Downloads/...`) is treated as a read-only artefact until
+ the user explicitly approves deletion. The cleanup script copies into a
+ staging area and operates there.
+2. **Quarantine first, delete later.** First run of the cleanup script on a
+ release moves junk to `~/.jellyfin-quarantine///`
+ instead of deleting. The user reviews, then a second pass empties the
+ quarantine after sign-off. Subsequent runs on the same release are
+ idempotent.
+3. **Two-list policy.** Every file is matched against an `ALLOW` list (KEEP)
+ or a `DENY` list (DELETE). Anything not on either list is **flagged** and
+ surfaced in the audit report — a human decides. Never auto-delete on
+ "unknown".
+4. **Never run cleanup as root.** All operations are as the unprivileged
+ `admin` (onyx) or `user` (nullstone) account. The live `/home/user/media/`
+ tree is touched only by the rename step in doc 08, after cleanup has
+ produced an intermediate staging copy.
+5. **Idempotent.** Running cleanup twice on the same source must produce the
+ same staging tree byte-for-byte (same `find -printf '%p %s\n' | sort`
+ output, modulo timestamps).
+6. **Dry-run is the default.** The cleanup script with no flags lists what it
+ *would* do and exits without writing. `--apply` is required to actually
+ move/quarantine files.
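+
+Rule 5 can be spot-checked by fingerprinting two staging trees and comparing
+the results. The sketch below is an illustration only, not part of
+`cleanup-import.sh`: `tree_fingerprint` and `same_tree` are hypothetical
+helper names, and GNU `find` is assumed (for `-printf`). It uses `%P` (path
+relative to the start directory) instead of `%p` so that two different
+staging roots can be compared directly.
+
+```bash
+# Hypothetical helper: print "relative/path size" for every regular file
+# under directory $1, sorted so the output does not depend on walk order.
+tree_fingerprint() {
+  ( cd "$1" && find . -type f -printf '%P %s\n' | LC_ALL=C sort )
+}
+
+# Hypothetical helper: succeeds (exit 0) when both trees hold the same
+# relative paths with the same sizes. Timestamps are ignored on purpose.
+same_tree() {
+  [ "$(tree_fingerprint "$1")" = "$(tree_fingerprint "$2")" ]
+}
+```
+
+Usage, with two hypothetical staging paths: `same_tree "$RUN_1" "$RUN_2" &&
+echo idempotent`. Comparing sizes is a cheap proxy; hashing each file with
+`sha256sum` instead would make the check strictly byte-for-byte.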
+
+---
+
+## 1. Categorical taxonomy of non-media files in scene/group releases
+
+Scene and group ("p2p") releases follow loose conventions. The following
+categories cover everything observed in the wild plus everything in the
+Futurama download set:
+
+### 1.1 Codec / player promotion
+
+Text files and Windows shortcut files steering the user toward a specific
+codec pack or media player (often K-Lite + MPC-HC). Frequently the file is
+an `.url` or `.website` (Internet Shortcut) pointing to a third-party
+installer. **Always DELETE.**
+
+Real-world examples (`/home/admin/Downloads/futrama/`):
+
+- `How to play HEVC (THIS FILE).txt` — 65 lines of MPC-HC marketing.
+- `Ninite K-Lite Codecs Unattended Silent Installer and Updater.website`
+ — `URL=https://ninite.com/klitecodecs/` Internet Shortcut.
+
+Patterns:
+
+- `How to play *.txt`, `Read*Me*.txt`, `INSTALL*.txt`, `PLAY*.txt`
+- `*.website`, `*.url`, `*.lnk`
+- `K-Lite*`, `MPC-HC*`, `VLC*`, `MX Player*`, `LAV*`
+
+### 1.2 Release-group brag
+
+Plain-text or `.nfo` files where the release group identifies itself,
+documents encoder settings, or pumps its tracker URL. Distinguishable from a
+**Jellyfin-compatible metadata NFO** (XML, root `<movie>` / `<tvshow>` /
+`<episodedetails>`) by content — see § 3.
+
+Real-world examples:
+
+- `Encoded by JoyBell (UTR).txt` — 41-line manifesto from "Unity Team
+ Release group" pointing to `UNITEAM.CO`.
+- `RARBG.txt`, `WWW.YIFY-TORRENTS.COM.url`, `.nfo` with ASCII art.
+
+Patterns:
+
+- `Encoded by *.txt`, `Ripped by *.txt`, `.txt`
+- `RARBG.txt`, `RARBG_DO_NOT_MIRROR.exe` (yes, those exist; § 1.10)
+- `*-readme.txt`, `release notes.txt`
+- `*.nfo` containing only ASCII art (no `<movie>` / `<tvshow>` /
+ `<episodedetails>` root element)
+- `*.diz`, `file_id.diz` — old "BBS description" file, scene leftover
+
+### 1.3 Promo images that are NOT poster artwork
+
+Images that LOOK like artwork to a naive globber but are actually before/after
+comparisons, group banners, or screenshot proofs. **Delete unless they live
+inside a recognised extras folder (§ 4) or match the strict allow-list of
+poster/backdrop names from doc 01.**
+
+Real-world example:
+
+- `Futurama Compare.png` (1.05 MB) — encoder before/after comparison.
+
+Patterns to delete:
+
+- `*Compare*.{png,jpg,jpeg,webp}`
+- `*Sample*.{png,jpg,jpeg}` (when not in a `samples/` extras folder)
+- `*Screen*.{png,jpg}`, `*Screens/*`, `*Proof/*`, `*Preview/*`
+- `*-banner.png` from a group (NOT the same as Jellyfin's `banner.jpg`;
+ group banners typically have the group name in the filename — heuristic
+ match `*JoyBell*`, `*UTR*`, `*JoY*`, etc.)
+- Stray `*.gif` files (animated previews); Jellyfin doesn't use GIF.
+
+### 1.4 OS-generated thumbnail caches
+
+Per-OS file managers (Windows Explorer, macOS Finder, GNOME Files) leave
+turds in every directory they browse. **Always DELETE — never useful, never
+metadata.**
+
+Patterns:
+
+- `Thumbs.db`, `ehthumbs.db`, `ehthumbs_vista.db`
+- `.DS_Store`, `._*` (macOS resource forks)
+- `Desktop.ini`, `desktop.ini`
+- `.directory` (KDE)
+- `.fseventsd/`, `.Spotlight-V100/`, `.Trashes/` (macOS)
+- `$RECYCLE.BIN/`, `System Volume Information/` (Windows mount)
+
+### 1.5 Sample files (lower-quality previews)
+
+Scene releases sometimes ship a 30-second sample file at lower bitrate.
+Jellyfin treats a `samples/` subfolder as extras (doc 05 § 8.2), but a stray
+`Movie.sample.mkv` next to the main file would scrape as "another version".
+
+**Default: DELETE.** Reasoning: we have the full file; the sample is dead
+weight. If the user genuinely wants samples, drop them into a `samples/`
+subfolder before running cleanup and the script will preserve the folder.
+
+Patterns to delete (when at the top level of a release):
+
+- `sample.{mkv,mp4,avi,m4v}`
+- `*-sample.{mkv,mp4,avi,m4v}`, `*.sample.{mkv,mp4,avi,m4v}`
+- `*_sample.{mkv,mp4,avi,m4v}`
+- `Sample/` directory (rename to `samples/` to preserve as extras, OR delete)
+
+### 1.6 Subtitle leftovers
+
+VobSub subtitles (DVD bitmap subs) are shipped as a pair: `en.idx` (index) +
+`en.sub` (bitmap stream). Jellyfin can render them, but if a `.srt` exists
+with the same language tag the bitmap pair is redundant and slow.
+
+**Default: KEEP all `.srt` and `.ass`. KEEP `.idx`/`.sub` only if no `.srt`
+of the same language exists.** This is a per-file decision — surface to the
+user in the audit report rather than auto-pruning.
+
+Patterns:
+
+- `*.srt`, `*.ass`, `*.ssa`, `*.vtt` — KEEP (per doc 03).
+- `*.sup` (PGS bitmap, Blu-ray) — KEEP (Jellyfin renders).
+- `*.idx` + `*.sub` (VobSub) — KEEP if no `.srt` with same lang code; else
+ flag for human review.
+- `*.smi`, `*.rt` — DELETE (obsolete formats Jellyfin doesn't support).
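+
+The VobSub rule above can be sketched as a small check. This is an
+illustration under stated assumptions, not the logic in `cleanup-import.sh`:
+`lang_of` and `vobsub_action` are hypothetical names, and the language tag is
+assumed to be the last dot-separated field of the file stem (`en.idx` and
+`Show.en.idx` both give `en`).
+
+```bash
+# Hypothetical helper: derive a lowercase language tag from a subtitle path.
+# Assumption: the tag is the last dot-separated field before the extension.
+lang_of() {
+  local stem="${1##*/}"   # drop the directory part
+  stem="${stem%.*}"       # drop the extension
+  printf '%s\n' "${stem##*.}" | tr '[:upper:]' '[:lower:]'
+}
+
+# Hypothetical helper: print KEEP for a .idx file when no .srt with the same
+# language tag sits in the same directory, otherwise FLAG for human review.
+vobsub_action() {
+  local idx="$1" dir lang f
+  dir="$(dirname -- "$idx")"
+  lang="$(lang_of "$idx")"
+  for f in "$dir"/*.srt "$dir"/*.SRT; do
+    [ -e "$f" ] || continue          # glob matched nothing
+    if [ "$(lang_of "$f")" = "$lang" ]; then
+      echo FLAG
+      return
+    fi
+  done
+  echo KEEP
+}
+```
+
+The function never deletes anything; it only produces the label that the
+audit report would show, which matches the per-file, human-decides policy.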
+
+### 1.7 Torrent residue
+
+Files left by the torrent client itself. None are useful to Jellyfin.
+
+Patterns to delete:
+
+- `*.torrent`, `*.magnet`
+- `*.parts`, `*.!ut`, `*.!qB`, `*.bc!` (in-progress fragments)
+- `*.meta`, `*.aria2`
+- `*.pad`, `padding/`, `__padding_file_*` (mktorrent padding)
+- `*.sfv` (checksum manifest; harmless but useless after download)
+- `*.md5`, `*.sha1`, `*.sha256` (release-checksum sidecars)
+
+### 1.8 Test / proof images and folders
+
+Some groups ship a `Proof/` or `Screens/` folder with screenshots to "prove"
+the rip's quality. Useless inside a Jellyfin library.
+
+Patterns to delete (whole folders):
+
+- `Proof/`, `proof/`, `PROOF/`
+- `Screens/`, `screens/`, `Screenshots/`, `Caps/`
+- `Preview/`, `Previews/`
+- `_screens/`, `screenshots-only/`
+
+### 1.9 Multi-disc DVD/Blu-ray cruft
+
+When a release is a straight ISO rip the `VIDEO_TS/` or `BDMV/` directory
+sometimes survives next to the encoded file. Jellyfin can play
+`VIDEO_TS.IFO` directly, but a partial DVD structure left over from the
+encode is just clutter.
+
+Patterns:
+
+- `VIDEO_TS/` — KEEP if it contains a complete DVD file set (`VIDEO_TS.IFO`
+ with its matching `.BUP` and `.VOB` files); otherwise flag.
+- `*.IFO`, `*.BUP`, `*.VOB` — KEEP if inside a complete `VIDEO_TS/`;
+ DELETE if loose.
+- `BDMV/`, `CERTIFICATE/`, `AACS/` — KEEP if complete BD structure;
+ flag if partial.
+- `*.iso` inside a media folder — flag for human review (could be the
+ intentional rip OR a Windows malware vector — see § 8).
+
+### 1.10 Outright malicious / suspicious
+
+Some releases historically shipped Windows executables disguised as
+"DO NOT MIRROR" anti-leech files. Even on a Linux server these must be
+deleted because the friend with a Jellyfin account can download them via
+the web UI ("Download original file" button) and run them locally.
+
+**Always DELETE, never quarantine, never preserve, no exceptions.**
+
+Patterns:
+
+- `*.exe`, `*.msi`, `*.bat`, `*.cmd`, `*.com`, `*.scr`, `*.ps1`, `*.vbs`,
+ `*.wsf`, `*.hta`, `*.jar`
+- `*.app/` (macOS bundle dropped by macOS-using uploader)
+- `*.dll`, `*.sys` (rare, but seen)
+- Anything with a double extension like `Movie.mkv.exe`
+
+---
+
+## 2. KEEP vs DELETE — exhaustive table
+
+This table is the **canonical decision matrix** for `cleanup-import.sh`.
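+Patterns are matched case-insensitively: `ext4` itself is case-sensitive, so
+`cleanup-import.sh` lowercases each basename before matching. `KEEP` means it
+goes to the staging tree; `DELETE` means it goes to quarantine on first run,
+then to the system trash on confirm (§ 5.3).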
+
+| Pattern | Action | Why |
+|---|---|---|
+| `*.mkv`, `*.mp4`, `*.avi`, `*.m4v`, `*.ts`, `*.mov`, `*.webm`, `*.wmv`, `*.flv`, `*.mpg`, `*.mpeg` | **KEEP** | Media — the entire point. |
+| `*.srt`, `*.ass`, `*.ssa`, `*.vtt`, `*.sup` | **KEEP** | Subtitles (doc 03). |
+| `*.idx` + `*.sub` (VobSub pair) | **KEEP** if no `.srt` of same lang exists; else **FLAG** | Bitmap subs; redundant with SRT. |
+| `*.smi`, `*.rt` | **DELETE** | Obsolete subtitle formats; Jellyfin can't render. |
+| `folder.{jpg,png}`, `poster.{jpg,png}`, `cover.{jpg,png}`, `default.{jpg,png}`, `show.{jpg,png}`, `jacket.{jpg,png}`, `movie.{jpg,png}` | **KEEP** | Jellyfin-recognised primary artwork (doc 01). |
+| `backdrop.{jpg,png}`, `fanart.{jpg,png}`, `background.{jpg,png}`, `art.{jpg,png}`, `backdrop[0-9]*.{jpg,png}`, `backdrop-[0-9]*.{jpg,png}` | **KEEP** | Jellyfin-recognised backdrops (doc 01). |
+| `logo.{png,jpg}`, `clearlogo.{png,jpg}`, `banner.{jpg,png}`, `landscape.{jpg,png}`, `thumb.{jpg,png}`, `disc.{png,jpg}`, `clearart.{png,jpg}` | **KEEP** | Jellyfin-recognised auxiliary artwork. |
+| `season[0-9]*-poster.{jpg,png}`, `season[0-9]*.{jpg,png}`, `season-specials-poster.{jpg,png}` | **KEEP** | Per-season artwork (doc 01 / TV layout). |
+| `extrafanart/*.{jpg,png}`, `backdrops/*.{jpg,png,mp4}` | **KEEP** | Multi-backdrop folders (doc 05 § 8). |
+| `*.nfo` with a recognised XML root (`<movie>`, `<tvshow>`, `<episodedetails>`, or another root matched by the § 3.1 discriminator) | **KEEP** | Jellyfin-compatible metadata sidecar (doc 02 § 11). |
+| `*.nfo` without one of the above XML roots | **DELETE** | Release-group ASCII-art brag — pretends to be metadata, isn't. |
+| `*Compare*.{png,jpg,jpeg,webp,gif}` | **DELETE** | Encoder before/after — group promo. |
+| `*Sample*.{png,jpg,jpeg}` (image, top level) | **DELETE** | Group promo (NOT a Jellyfin sample folder). |
+| `*Screen*.{png,jpg}`, `Screens/`, `Screenshots/`, `Caps/` | **DELETE** | Proof shots. |
+| `Proof/`, `proof/`, `PROOF/` | **DELETE** (whole folder) | Quality-proof shots. |
+| `Preview/`, `Previews/` | **DELETE** (whole folder) | Lower-quality teaser. |
+| `*.txt` (any) | **DELETE** | Readme / group brag — Jellyfin doesn't read TXT. |
+| `*.diz`, `file_id.diz` | **DELETE** | Scene description file — obsolete. |
+| `*.website`, `*.url`, `*.lnk` | **DELETE** | Windows Internet Shortcut — points at codec/installer pages. **Security: § 8.** |
+| `*.exe`, `*.msi`, `*.bat`, `*.cmd`, `*.com`, `*.scr`, `*.ps1`, `*.vbs`, `*.wsf`, `*.hta`, `*.jar`, `*.dll`, `*.sys` | **DELETE** | Windows executable. **Security: § 8.** |
+| `*.app/` | **DELETE** (whole folder) | macOS bundle. |
+| `Thumbs.db`, `ehthumbs.db`, `ehthumbs_vista.db` | **DELETE** | Windows Explorer thumbnail cache. |
+| `.DS_Store`, `._*` | **DELETE** | macOS Finder. |
+| `Desktop.ini`, `desktop.ini` | **DELETE** | Windows folder customisation. |
+| `.directory` | **DELETE** | KDE Dolphin. |
+| `.fseventsd/`, `.Spotlight-V100/`, `.Trashes/`, `$RECYCLE.BIN/`, `System Volume Information/` | **DELETE** (whole folder) | OS metadata directories. |
+| `sample.{mkv,mp4,avi,m4v}` (top level) | **DELETE** | Lower-quality preview (doc 05 § 8.1: full file already present). |
+| `*-sample.{mkv,mp4,avi,m4v}`, `*_sample.{mkv,mp4,avi,m4v}`, `*.sample.{mkv,mp4,avi,m4v}` | **DELETE** | Same. |
+| `Sample/` (directory, top level) | **DELETE** | Lower-quality preview folder. |
+| `samples/` (directory, recognised name) | **KEEP** | Jellyfin extras folder (doc 05 § 8.2). |
+| `featurettes/`, `behind the scenes/`, `deleted scenes/`, `interviews/`, `scenes/`, `shorts/`, `clips/`, `trailers/`, `extras/`, `other/`, `theme-music/`, `backdrops/` | **KEEP** (whole folder) | Jellyfin extras (doc 05 § 8.2). |
+| `Featurettes/`, `Behind The Scenes/`, etc. (capitalised) | **KEEP** but **rename to lowercase** | Jellyfin matches case-insensitively, but lowercase is the documented form. |
+| Any other folder name | **FLAG** | Surface to human; might be a typo of an extras folder. |
+| `*.torrent`, `*.magnet` | **DELETE** | Torrent client residue. |
+| `*.parts`, `*.!ut`, `*.!qB`, `*.bc!`, `*.aria2` | **DELETE** | In-progress download fragments (shouldn't be here, but defensive). |
+| `*.meta` | **DELETE** | aria2/torrent metadata. |
+| `*.pad`, `padding/`, `__padding_file_*`, `_____padding_file_*` | **DELETE** | mktorrent padding files. |
+| `*.sfv`, `*.md5`, `*.sha1`, `*.sha256` | **DELETE** | Checksum manifests; harmless but useless after download. |
+| `*.rar`, `*.r[0-9][0-9]`, `*.zip`, `*.7z`, `*.tar`, `*.tar.gz` | **FLAG** | Compressed archive in a media folder is suspicious — release should have been extracted before download. |
+| `*.iso` inside a media folder | **FLAG** | Could be intentional DVD/BD rip OR Windows-installer disguise. Human review. |
+| `VIDEO_TS/` (complete) | **KEEP** | Jellyfin plays DVD structure directly. |
+| `*.IFO`, `*.BUP`, `*.VOB` (loose, no `VIDEO_TS/`) | **DELETE** | Orphan DVD remnants. |
+| `BDMV/` (complete) | **KEEP** | Jellyfin plays BD structure. |
+| `CERTIFICATE/`, `AACS/` (without `BDMV/`) | **DELETE** | Orphan BD remnants. |
+| `RARBG*.{txt,exe}`, `WWW.*.url`, `*.YIFY*.url` | **DELETE** | Tracker promo. |
+| `RARBG_DO_NOT_MIRROR.exe` and similar | **DELETE** (security: § 8) | Historic anti-leech file; sometimes weaponised. |
+| Anything else | **FLAG** | Two-list policy: never auto-delete on "unknown". |
+
+---
+
+## 3. NFO handling — the nuanced case
+
+`.nfo` is overloaded. Two completely different file kinds share the
+extension:
+
+- **Scene release `.nfo`** — plain text, ASCII art, encoder credits, tracker
+ URL. Useless to Jellyfin (and at worst gets scraped as garbage metadata
+ if NFO Saver is enabled).
+- **Jellyfin/Kodi/Emby metadata NFO** — XML, root element is one of those
+ matched by the discriminator in § 3.1 (for example `<movie>`, `<tvshow>`,
+ `<episodedetails>`). Documented in doc 02 § 11.
+
+### 3.1 The discriminator one-liner
+
+```bash
+is_jellyfin_nfo() {
+ # Returns 0 (KEEP) if the file looks like a Jellyfin/Kodi NFO,
+ # 1 (DELETE) if it looks like scene-group ASCII-art brag.
+ head -c 4096 "$1" | tr -d '[:space:]' \
+ | grep -qE '<(movie|tvshow|episodedetails|artist|album|musicvideo|season)\b'
+}
+
+# Usage:
+if is_jellyfin_nfo "$f"; then echo "KEEP $f"; else echo "DELETE $f"; fi
+```
+
+The first 4096 bytes are enough — a real Jellyfin NFO declares its root
+within the first kilobyte. `tr -d '[:space:]'` is needed because some
+encoders pretty-print the XML.
+
+### 3.2 Edge cases
+
+- An NFO that is XML but has an unrecognised root element: DELETE.
+ Jellyfin won't read it; nothing to preserve.
+- An NFO with valid XML but **stale TMDB/IMDB IDs** that conflict with a
+ newer scrape: KEEP, but flag for the user — doc 02 § 11.5 explains how
+ the NFO Saver overwrites these on next scrape.
+- Multiple NFOs in one folder (e.g. `release.nfo` from the group AND
+ `tvshow.nfo` from a previous Jellyfin write): KEEP `tvshow.nfo`,
+ DELETE `release.nfo`. Use the discriminator above on each.
+
+### 3.3 First-100-bytes shortcut
+
+The task brief proposes this:
+
+```bash
+if head -c 100 file.nfo | grep -qE '<(movie|tvshow|episodedetails)\b'; then echo KEEP; else echo DELETE; fi
+```
+
+This works for the common case but misses NFOs that start with an XML
+declaration (`<?xml version="1.0" encoding="UTF-8"?>` plus possibly a comment) before the
+root element — that prologue alone can be > 100 bytes. The 4096-byte
+version above is safer; we use that in `cleanup-import.sh`.
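+
+The difference between the two variants can be reproduced with a small
+fixture. The sketch below copies the § 3.1 function and the 100-byte check
+from this section unchanged; `short_check` and `make_sample_nfo` are
+hypothetical names, and the fixture content (an XML declaration, a long
+comment, then `<tvshow>`) is invented for illustration. GNU `grep` is assumed
+for `\b`.
+
+```bash
+# Copied from § 3.1: 4096-byte discriminator.
+is_jellyfin_nfo() {
+  head -c 4096 "$1" | tr -d '[:space:]' \
+    | grep -qE '<(movie|tvshow|episodedetails|artist|album|musicvideo|season)\b'
+}
+
+# The 100-byte shortcut from this section, wrapped in a function.
+short_check() {
+  head -c 100 "$1" | grep -qE '<(movie|tvshow|episodedetails)\b'
+}
+
+# Hypothetical fixture: the prologue is longer than 100 bytes, so the root
+# element only appears after the window that short_check reads.
+make_sample_nfo() {
+  printf '%s\n' \
+    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>' \
+    '<!-- hypothetical exporter banner; padded so the root element starts after byte 100 -->' \
+    '<tvshow><title>Futurama</title></tvshow>' > "$1"
+}
+```
+
+On this fixture `is_jellyfin_nfo` succeeds while `short_check` fails, which
+is the false DELETE the 4096-byte window is meant to avoid.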
+
+---
+
+## 4. Featurettes / Extras / Bonus folders — the canonical list
+
+Per the Jellyfin docs (movies and shows pages), these subfolder names are
+recognised and the contained files are tagged with the matching extra
+type. **Folder name match is case-insensitive but lowercase is the
+documented canonical form** — `cleanup-import.sh` lowercases on copy to
+staging.
+
+| Folder name | Extra type | Notes |
+|---|---|---|
+| `behind the scenes` | Behind The Scenes | spaces, not dashes |
+| `deleted scenes` | Deleted Scene | |
+| `interviews` | Interview | |
+| `scenes` | Scene | |
+| `samples` | Sample | distinct from a top-level `Sample/` (§ 1.5) |
+| `shorts` | Short | |
+| `featurettes` | Featurette | |
+| `clips` | Clip | |
+| `other` | Other | catch-all |
+| `extras` | Extra | generic catch-all |
+| `trailers` | Trailer | |
+| `theme-music` | Theme music | `.mp3` files; doc 05 § 8.3 |
+| `backdrops` | Backdrop video | rotating video backgrounds |
+
+Anything else (e.g. `Bonus Features/`, `BTS/`, `Special Features/`,
+`Featurette/` singular, `behind-the-scenes/` with dashes) is **NOT** matched
+by Jellyfin and the contents won't surface as extras. Cleanup either
+renames to the canonical name (when the mapping is unambiguous) or flags
+for human review.
+
+### 4.1 Canonical-name mapping (auto-rename)
+
+| Found | Renamed to |
+|---|---|
+| `Featurettes/`, `Featurette/`, `FEATURETTES/` | `featurettes/` |
+| `Behind The Scenes/`, `BTS/`, `behind-the-scenes/` | `behind the scenes/` |
+| `Deleted Scenes/`, `Deleted_Scenes/`, `deleted-scenes/` | `deleted scenes/` |
+| `Interviews/`, `Interview/` | `interviews/` |
+| `Trailers/`, `Trailer/` | `trailers/` |
+| `Bonus/`, `Bonus Features/`, `Bonus Material/`, `Special Features/`, `Specials/` | `extras/` (generic catch-all) |
+| `Outtakes/`, `Bloopers/`, `Gag Reel/` | `extras/` (no dedicated folder) |
+
+The `Specials/` rename to `extras/` is **important** — for a TV series,
+`Specials/` looks like a season folder (Season 0 specials), but if the
+files inside are featurettes rather than aired specials, putting them in
+the wrong folder mis-scrapes them as episodes. When in doubt, flag.
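+
+The mapping table can be expressed as a single `case` statement. This is a
+minimal sketch, not the implementation in `cleanup-import.sh`:
+`canonical_extras_name` is a hypothetical name, and as a deliberate
+assumption it does not auto-rename `Specials/`, following the "when in
+doubt, flag" advice above rather than the table row.
+
+```bash
+# Hypothetical helper: print the canonical extras folder name for $1 and
+# return 0, or return 1 when the name is unknown and should be FLAGGED.
+canonical_extras_name() {
+  local lower
+  lower="$(printf '%s' "${1%/}" | tr '[:upper:]' '[:lower:]')"
+  case "$lower" in
+    featurettes|featurette)                        echo "featurettes" ;;
+    "behind the scenes"|bts|behind-the-scenes)     echo "behind the scenes" ;;
+    "deleted scenes"|deleted_scenes|deleted-scenes) echo "deleted scenes" ;;
+    interviews|interview)                          echo "interviews" ;;
+    trailers|trailer)                              echo "trailers" ;;
+    bonus|"bonus features"|"bonus material"|"special features") echo "extras" ;;
+    outtakes|bloopers|"gag reel")                  echo "extras" ;;
+    # Already-canonical names from the § 4 table pass through unchanged.
+    scenes|samples|shorts|clips|other|extras|theme-music|backdrops) echo "$lower" ;;
+    # Everything else, including Specials/, is left for human review.
+    *) return 1 ;;
+  esac
+}
+```
+
+Usage: `new="$(canonical_extras_name "$dir")" || echo "FLAG $dir"`. Keeping
+the unknown case as a non-zero return makes it hard to rename a folder by
+accident.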
+
+### 4.2 Real-world example: Futurama download
+
+The four Futurama season folders all contain a `Featurettes/` subfolder:
+
+```
+Futurama Season 1 [1080p AI x265 10bit FS99 Joy]/Featurettes/
+├── Episode One Animatic.mkv
+└── Welcome to the World of Tomorrow.mkv
+
+Futurama Season 2 .../Featurettes/
+├── Animatic -Why Must I be a Crustacean in Love.mkv
+└── Futurama Game Trailer.mkv
+
+Futurama Season 3 .../Featurettes/
+├── An X-Mas Message From David X. Cohen.mkv
+└── Deleted Scenes.mkv
+
+Futurama Season 4 .../Featurettes/
+├── Futurama Welcome to the World of Tomorrow (x265 Joy).mkv
+├── Outtakes - Kif Gets Knocked Up a Notch [1080p x265 10bit Joy].mkv
+└── Panel on Voice Actors [1080p x265 10bit Joy].mkv
+```
+
+After cleanup these become `featurettes/` (lowercase) inside the season
+folder. Doc 08 (filename normalization) then renames the season folder
+itself to `Season 01/` and may relocate the season-level featurettes to a
+**series-level** `featurettes/` folder if the user prefers extras at the
+series root (this is a doc 05 § 8 / doc 08 decision, not this doc's).
+
+> Note: `Season 3 / Deleted Scenes.mkv` is a single file and should arguably
+> be moved into a `deleted scenes/` subfolder rather than left in
+> `featurettes/`. That's a manual disambiguation — flagged, not auto-moved.
+
+---
+
+## 5. Audit-then-clean workflow
+
+Three-stage pipeline. Stage 1 is mandatory; stage 2 runs on user approval;
+stage 3 is reversible until the quarantine retention window expires.
+
+### 5.1 Stage 1 — Dry-run audit
+
+Lists every file in the source release classified as KEEP / DELETE / FLAG.
+Writes nothing.
+
+```bash
+# Dry-run audit on a single release dir.
+cleanup-import.sh "/home/admin/Downloads/futrama/Futurama Season 1 [1080p AI x265 10bit FS99 Joy]"
+```
+
+Output (one line per file):
+
+```
+KEEP Futurama S01E01 Space Pilot 3000 [1080p x265 10bit Joy].mkv
+KEEP folder.jpg
+KEEP Featurettes/Episode One Animatic.mkv -> featurettes/Episode One Animatic.mkv
+DELETE Encoded by JoyBell (UTR).txt [release-group brag]
+DELETE How to play HEVC (THIS FILE).txt [codec promo .txt]
+DELETE Ninite K-Lite Codecs Unattended Silent ....website [windows .website -- SECURITY]
+DELETE Futurama Compare.png [encoder compare image]
+FLAG SomeUnknownFile.bin [unknown extension]
+```
+
+A **summary** at the bottom:
+
+```
+KEEP 16 files (5.92 GiB)
+DELETE 4 files (1.08 MiB)
+FLAG 0 files
+Run with --apply to quarantine the DELETE set.
+```
+
+Quick one-liner equivalents (for ad-hoc spot checks; the script in § 9 is
+preferred):
+
+```bash
+# What would I delete?
+find "$SRC" \( \
+ -iname '*.txt' -o -iname '*.nfo' -o -iname '*.url' -o -iname '*.website' \
+ -o -iname '*.lnk' -o -iname '*.exe' -o -iname '*.msi' -o -iname '*.bat' \
+ -o -iname '*.scr' -o -iname '*.ps1' -o -iname '*.cmd' -o -iname '*.com' \
+ -o -iname 'Thumbs.db' -o -iname '.DS_Store' -o -iname 'Desktop.ini' \
+ -o -iname '*Compare*.png' -o -iname '*Compare*.jpg' \
+ -o -iname 'sample.mkv' -o -iname '*.sample.mkv' -o -iname '*-sample.mkv' \
+ -o -iname '*.torrent' -o -iname '*.sfv' -o -iname '*.md5' \
+\) -print
+
+# What looks like a real Jellyfin NFO vs a release-group brag?
+find "$SRC" -iname '*.nfo' -print0 | while IFS= read -r -d '' f; do
+ if head -c 4096 "$f" | tr -d '[:space:]' \
+ | grep -qE '<(movie|tvshow|episodedetails|artist|album|musicvideo|season)\b'; then
+ printf 'KEEP %s\n' "$f"
+ else
+ printf 'DELETE %s\n' "$f"
+ fi
+done
+```
+
+### 5.2 Stage 2 — Quarantine apply
+
+```bash
+cleanup-import.sh --apply "/home/admin/Downloads/futrama/Futurama Season 1 [...]"
+```
+
+What it does:
+
+1. **Copies** the source directory tree to
+ `/home/admin/.jellyfin-staging//`. The source is never
+ modified.
+2. Inside the staging copy, **moves** every DELETE-classified file to
+ `/home/admin/.jellyfin-quarantine///`,
+ preserving relative paths so a user can `diff -r` to confirm.
+3. **Renames** non-canonical extras subfolders to canonical lowercase
+ (§ 4.1).
+4. Writes a manifest at
+ `/home/admin/.jellyfin-staging//.cleanup-manifest.json`
+ listing every file action with sha256, source path, action, target
+ path. This is what stage 3 reads.
+5. Returns the staging path on stdout — that's the input to doc 08's
+ filename normalizer.
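+
+Step 2 above, moving one DELETE-classified file while preserving its relative
+path, could look roughly like the sketch below. `quarantine_file` is a
+hypothetical helper name, and the tab-separated record it prints is a
+simplification for illustration; the real manifest is the JSON file described
+in step 4. GNU coreutils (`sha256sum`) is assumed.
+
+```bash
+# Hypothetical helper: move one file from the staging copy into quarantine.
+#   $1 = staging release dir, $2 = quarantine release dir, $3 = relative path
+quarantine_file() {
+  local staging="$1" quarantine="$2" rel="$3" sum
+  # Only regular files; symlinks are refused (see § 6).
+  [ -f "$staging/$rel" ] && [ ! -L "$staging/$rel" ] || return 1
+  # Never overwrite something already quarantined under the same path.
+  [ ! -e "$quarantine/$rel" ] || return 1
+  sum="$(sha256sum -- "$staging/$rel" | cut -d' ' -f1)"
+  mkdir -p -- "$quarantine/$(dirname -- "$rel")"
+  mv -- "$staging/$rel" "$quarantine/$rel"
+  # One record per moved file: sha256, action, relative path.
+  printf '%s\tDELETE\t%s\n' "$sum" "$rel"
+}
+```
+
+Because the relative path is kept, `diff -r` between the source and the
+union of staging and quarantine should show no missing files.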
+
+### 5.3 Stage 3 — Confirm and recycle
+
+After the user reviews the quarantine directory and approves:
+
+```bash
+cleanup-import.sh --confirm-quarantine 2026-05-08
+```
+
+Moves `/home/admin/.jellyfin-quarantine/2026-05-08/` to the system trash
+(via `gio trash`) — still recoverable, but no longer cluttering the
+quarantine root. After 30 days a cron sweep empties trash older than that.
+
+### 5.4 Never delete from source
+
+The source download (`/home/admin/Downloads/futrama/...`) is **never**
+modified by `cleanup-import.sh`. Reasons:
+
+- The user may want to re-seed the torrent.
+- The user may want to re-run cleanup with different rules later.
+- Bugs in the cleanup script must never destroy original artefacts.
+
+Source deletion is a separate manual step the user does AFTER the
+import is verified in Jellyfin and the library is happy. There is no
+script for it on purpose.
+
+---
+
+## 6. Idempotency, edge cases, and "unknown" handling
+
+- **Idempotent.** `cleanup-import.sh --apply` on an already-cleaned staging
+ directory is a no-op (nothing matches DELETE). The script detects this
+ and exits 0 with `nothing to do`.
+- **Re-runnable on source.** Re-running the script on the same source
+ produces a fresh staging copy, overwriting (after backup) the previous
+ staging directory. Quarantine is dated, so two runs on the same day for
+ the same release append rather than overwrite (`.2/`,
+ `.3/`, etc.).
+- **Unknown extension** (e.g. `.dat`, `.bin`) — never auto-deleted.
+ FLAGGED in the audit output and surfaced to the user. The
+ user adds it to the local override file
+ `~/.config/jellyfin-cleanup/local-rules.conf` if they want it
+ classified next time.
+- **Hidden dotfiles** (anything starting with `.` other than known OS
+ caches like `.DS_Store`) — FLAGGED. Don't auto-delete; could be a
+ legitimate `.subliminal.cache` (subtitles plugin) or similar.
+- **Symlinks** — never followed. A symlink in a release directory is
+ always FLAGGED; the script refuses to copy or quarantine it.
+- **Permission denied** — script bails with non-zero exit. Never
+ partially applies.
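+
+The symlink and dotfile rules above can be checked before the pattern tables
+are consulted. The sketch below is an assumption about ordering, not a copy
+of the script: `edge_case_action` is a hypothetical name, and `CLASSIFY` is a
+made-up label meaning "no edge case applies, continue with the normal § 2
+matching".
+
+```bash
+# Hypothetical pre-check: prints FLAG, DELETE, or CLASSIFY for one path.
+edge_case_action() {
+  local path="$1" base
+  base="$(basename -- "$path")"
+  # Symlinks are never followed and never copied: always FLAG.
+  if [ -L "$path" ]; then
+    echo FLAG
+    return
+  fi
+  case "$base" in
+    .DS_Store|.directory|._*) echo DELETE ;;   # known OS caches (§ 1.4)
+    .*)                       echo FLAG ;;     # any other hidden dotfile
+    *)                        echo CLASSIFY ;; # hand over to the § 2 table
+  esac
+}
+```
+
+Testing `-L` first matters: a symlink named `movie.mkv` would otherwise fall
+through to the media KEEP rule.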
+
+---
+
+## 7. The `Futurama Compare.png` problem (artwork false-positive)
+
+`Futurama Compare.png` is a 1.05 MB PNG sitting next to the season's MKV
+files. To a naive image-globber it looks like artwork: an image file like
+`folder.jpg`, larger than the typical poster, sitting in the right
+location. It's actually an encoder comparison shot.
+
+The rule, taken from doc 01 (artwork) and enforced here:
+
+> **An image file in the release root is KEPT only if its name is on the
+> exact recognised-artwork allow-list.** Anything else is DELETED.
+
+Recognised artwork allow-list (top-level of an item folder):
+
+- `folder.{jpg,jpeg,png,webp}`
+- `poster.{jpg,jpeg,png,webp}`
+- `cover.{jpg,jpeg,png,webp}`
+- `default.{jpg,jpeg,png,webp}`
+- `show.{jpg,jpeg,png,webp}` (series only)
+- `jacket.{jpg,jpeg,png,webp}` (series only)
+- `movie.{jpg,jpeg,png,webp}` (movies only)
+- `backdrop.{jpg,jpeg,png,webp}` and `backdrop[0-9]*.{jpg,jpeg,png,webp}`
+- `fanart.{jpg,jpeg,png,webp}`, `background.{jpg,jpeg,png,webp}`,
+ `art.{jpg,jpeg,png,webp}`
+- `logo.{png,jpg}`, `clearlogo.{png,jpg}`
+- `banner.{jpg,png}`, `landscape.{jpg,png}`, `thumb.{jpg,png}`,
+ `disc.{png,jpg}`, `clearart.{png,jpg}`
+- `season[0-9]*-poster.{jpg,png}`, `season[0-9]*.{jpg,png}`,
+ `season-specials-poster.{jpg,png}`
+- `extrafanart/` and `backdrops/` directories (any contents OK)
+
+Exception: images **inside** a recognised extras folder (`extras/`,
+`featurettes/`, etc.) are KEPT regardless of name — they're presumed to be
+intentional content of that extra.
+
+`Futurama Compare.png` matches none of these allow-list patterns and is
+not inside an extras folder, so it's DELETED.
+
+---
+
+## 8. Security rules
+
+The single most important rule in this document:
+
+> **Windows-executable extensions and Internet Shortcut formats are
+> auto-deleted, never quarantined for "review", because the threat model
+> isn't the Linux server, it's the Jellyfin user who downloads them.**
+
+Jellyfin has a "Download original file" button for every item. If a
+release contains `Codec Installer.exe`, Jellyfin will happily serve it to
+any user with library access — including the friend on Windows who might
+not understand that downloading and running an `.exe` from a media library
+is a terrible idea. We don't trust the upload chain (the release group),
+so we strip these on the server side.
+
+Exhaustive auto-delete list (security override — these bypass the
+"FLAG unknown" rule):
+
+| Pattern | Risk |
+|---|---|
+| `*.exe` | Windows executable. Direct code execution on download+run. |
+| `*.msi` | Windows Installer package. Silent install possible. |
+| `*.bat`, `*.cmd` | Windows batch script. Runs in `cmd.exe`. |
+| `*.com` | Old DOS-style executable. Still runs on modern Windows. |
+| `*.scr` | Windows screensaver = .exe in disguise. Classic malware vector. |
+| `*.ps1` | PowerShell script. Common modern malware delivery. |
+| `*.vbs`, `*.wsf`, `*.hta`, `*.js` (Windows Script Host) | Active scripting. |
+| `*.jar` | Java archive — runs as `java -jar` on systems with JRE. |
+| `*.dll`, `*.sys` | Windows libraries / drivers. Side-load attacks. |
+| `*.url`, `*.website`, `*.lnk` | Internet Shortcut / Windows Shortcut. Points at attacker-controlled URL. |
+| `*.iso`, `*.img` (in a media folder, not at the library root) | Mountable disk image. Can carry Windows installers. **FLAG, not auto-delete** — could legitimately be a DVD rip. |
+| `*.app/` | macOS application bundle. Auto-deleted. |
+| `Autorun.inf` | Windows autorun config. **AUTO-DELETE.** |
+
+Total auto-delete categories that are **purely** security-driven (not
+Jellyfin-irrelevance-driven): **15** — `.exe`, `.msi`, `.bat`, `.cmd`,
+`.com`, `.scr`, `.ps1`, `.vbs`, `.wsf`, `.hta`, `.jar`, `.dll`, `.sys`,
+`.url`/`.website`/`.lnk`, `Autorun.inf`. Plus 1 flagged for human review:
+`.iso`/`.img`.
+
+### 8.1 Why `.url` is in the security list
+
+`.url` is a plain-text Internet Shortcut. On Windows, double-clicking it
+opens the target in the default browser. The "target" is whatever the
+release group put in the `URL=` line. Historically this was used to push
+codec-pack download pages with bundled adware. There is no benign reason
+for a `.url` to ship in a media release.
+
+The Futurama release contains exactly this pattern:
+
+```
+[InternetShortcut]
+URL=https://ninite.com/klitecodecs/
+```
+
+Ninite itself is reputable — but the principle is "do not ship clickable
+URLs to third-party installers in a media library, ever".
+
+### 8.2 The `RARBG_DO_NOT_MIRROR.exe` historic case
+
+Some releases historically contained a file named
+`RARBG_DO_NOT_MIRROR.exe`, ostensibly to discourage mirror sites from
+re-uploading. In several documented cases this file was actually adware
+or a cryptominer. Auto-delete, no questions asked.
+
+---
+
+## 9. Prepared cleanup script — `cleanup-import.sh`
+
+Idempotent. Dry-run by default. Quarantine-first. Source-immutable.
+Returns the staging path on stdout for piping to doc 08's normalizer.
+
+Save to `bin/cleanup-import.sh` in the `jellyfin-stack` repo.
+
+```bash
+#!/usr/bin/env bash
+# cleanup-import.sh — Pre-import cleanup for tv.s8n.ru
+# Version 1.0 (2026-05-08) — see docs/07-pre-import-cleanup.md
+#
+# Usage:
+# cleanup-import.sh SRC # dry-run
+# cleanup-import.sh --apply SRC # quarantine
+# cleanup-import.sh --confirm-quarantine YYYY-MM-DD # recycle
+#
+# Exit codes:
+# 0 success / nothing to do
+# 1 user error (bad args, source not found)
+# 2 internal error (permission, partial state)
+# 3 flagged files present — user must review before --apply
+set -euo pipefail
+
+STAGING_ROOT="${JELLYFIN_STAGING_ROOT:-$HOME/.jellyfin-staging}"
+QUARANTINE_ROOT="${JELLYFIN_QUARANTINE_ROOT:-$HOME/.jellyfin-quarantine}"
+TODAY="$(date +%Y-%m-%d)"
+
+# ----- classification -----
+# Returns one of: KEEP DELETE FLAG
+classify() {
+ local path="$1"
+ local base
+ base="$(basename "$path")"
+ local lower
+ lower="$(printf '%s' "$base" | tr '[:upper:]' '[:lower:]')"
+
+ # Security overrides — bypass everything else
+ case "$lower" in
+ *.exe|*.msi|*.bat|*.cmd|*.com|*.scr|*.ps1|*.vbs|*.wsf|*.hta|*.jar|*.dll|*.sys) echo DELETE; return ;;
+ *.url|*.website|*.lnk) echo DELETE; return ;;
+ autorun.inf) echo DELETE; return ;;
+ esac
+
+ # OS junk
+ case "$lower" in
+ thumbs.db|ehthumbs.db|ehthumbs_vista.db|.ds_store|desktop.ini|.directory) echo DELETE; return ;;
+ ._*) echo DELETE; return ;;
+ esac
+
+ # Sample files — tested before the media rule, which would otherwise KEEP them
+ case "$lower" in
+ sample.mkv|sample.mp4|sample.avi|sample.m4v) echo DELETE; return ;;
+ *-sample.mkv|*-sample.mp4|*.sample.mkv|*.sample.mp4|*_sample.mkv|*_sample.mp4) echo DELETE; return ;;
+ esac
+
+ # Media — KEEP
+ case "$lower" in
+ *.mkv|*.mp4|*.avi|*.m4v|*.ts|*.mov|*.webm|*.wmv|*.flv|*.mpg|*.mpeg) echo KEEP; return ;;
+ *.srt|*.ass|*.ssa|*.vtt|*.sup|*.idx|*.sub) echo KEEP; return ;;
+ *.mp3|*.flac|*.ogg|*.opus|*.m4a|*.wav) echo KEEP; return ;;
+ esac
+
+ # Recognised artwork at item root
+ case "$lower" in
+ folder.jpg|folder.jpeg|folder.png|folder.webp) echo KEEP; return ;;
+ poster.jpg|poster.jpeg|poster.png|poster.webp) echo KEEP; return ;;
+ cover.jpg|cover.jpeg|cover.png|cover.webp) echo KEEP; return ;;
+ default.jpg|default.png|show.jpg|show.png|jacket.jpg|jacket.png|movie.jpg|movie.png) echo KEEP; return ;;
+ backdrop.jpg|backdrop.png|backdrop[0-9]*.jpg|backdrop[0-9]*.png) echo KEEP; return ;;
+ fanart.jpg|fanart.png|background.jpg|background.png|art.jpg|art.png) echo KEEP; return ;;
+ logo.png|logo.jpg|clearlogo.png|clearlogo.jpg|banner.jpg|banner.png) echo KEEP; return ;;
+ landscape.jpg|landscape.png|thumb.jpg|thumb.png|disc.png|disc.jpg|clearart.png|clearart.jpg) echo KEEP; return ;;
+ season[0-9]*-poster.jpg|season[0-9]*-poster.png|season[0-9]*.jpg|season[0-9]*.png) echo KEEP; return ;;
+ season-specials-poster.jpg|season-specials-poster.png) echo KEEP; return ;;
+ esac
+
+ # Promo images masquerading as art
+ case "$lower" in
+ *compare*.png|*compare*.jpg|*compare*.jpeg|*compare*.webp|*compare*.gif) echo DELETE; return ;;
+ *sample*.png|*sample*.jpg|*sample*.jpeg) echo DELETE; return ;;
+ *screen*.png|*screen*.jpg|*preview*.png|*preview*.jpg) echo DELETE; return ;;
+ esac
+
+ # Text-flavoured junk
+ case "$lower" in
+ *.txt|*.diz|file_id.diz) echo DELETE; return ;;
+ esac
+
+ # Torrent residue
+ case "$lower" in
+ *.torrent|*.magnet|*.parts|*.aria2|*.meta) echo DELETE; return ;;
+ *.pad|__padding_file_*|_____padding_file_*) echo DELETE; return ;;
+ *.sfv|*.md5|*.sha1|*.sha256) echo DELETE; return ;;
+ esac
+
+ # NFO discriminator — KEEP if Jellyfin-compatible XML, else DELETE
+ case "$lower" in
+ *.nfo)
+ # Whitespace is left intact so that "<movie xmlns=...>" still ends the tag name
+ if head -c 4096 "$path" \
+ | grep -qE '<(movie|tvshow|episodedetails|artist|album|musicvideo|season)([[:space:]/>]|$)'; then
+ echo KEEP
+ else
+ echo DELETE
+ fi
+ return
+ ;;
+ esac
+
+ # Suspicious archives in a media folder
+ case "$lower" in
+ *.rar|*.r[0-9][0-9]|*.zip|*.7z|*.tar|*.tar.gz|*.iso|*.img) echo FLAG; return ;;
+ esac
+
+ echo FLAG
+}
+
+# ----- folder classification -----
+# Returns one of: KEEP_AS-IS RENAME:<target> DELETE FLAG
+classify_dir() {
+ local d="$1"
+ local lower
+ lower="$(basename "$d" | tr '[:upper:]' '[:lower:]')"
+ case "$lower" in
+ behind\ the\ scenes|deleted\ scenes|interviews|scenes|samples|shorts|featurettes|clips|other|extras|trailers|theme-music|backdrops)
+ echo "RENAME:$lower"; return ;;
+ bts|behind-the-scenes) echo "RENAME:behind the scenes"; return ;;
+ deleted-scenes|deleted_scenes) echo "RENAME:deleted scenes"; return ;;
+ bonus|bonus\ features|bonus\ material|special\ features|outtakes|bloopers|gag\ reel) echo "RENAME:extras"; return ;;
+ proof|screens|screenshots|caps|preview|previews) echo DELETE; return ;;
+ sample) echo DELETE; return ;;
+ .fseventsd|.spotlight-v100|.trashes|\$recycle.bin|system\ volume\ information) echo DELETE; return ;;
+ *.app) echo DELETE; return ;; # macOS application bundle, auto-deleted per the § 8 table
+ extrafanart) echo "RENAME:extrafanart"; return ;; # case stays, recognised
+ *) echo FLAG; return ;;
+ esac
+}
+
+# ----- main -----
+APPLY=0
+CONFIRM_DATE=""
+SRC=""
+
+while [[ $# -gt 0 ]]; do
+ case "$1" in
+ --apply) APPLY=1; shift ;;
+ --confirm-quarantine) CONFIRM_DATE="${2:?--confirm-quarantine needs a date (YYYY-MM-DD)}"; shift 2 ;;
+ -h|--help) sed -n '2,14p' "$0"; exit 0 ;; # header comment spans script lines 2-14
+ -*) echo "unknown flag: $1" >&2; exit 1 ;;
+ *) SRC="$1"; shift ;;
+ esac
+done
+
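+if [[ -n "$CONFIRM_DATE" ]]; then
+  # Accept only YYYY-MM-DD, so a value such as ".." can never resolve to a
+  # directory outside the quarantine root.
+  if [[ ! "$CONFIRM_DATE" =~ ^[0-9]{4}-[0-9]{2}-[0-9]{2}$ ]]; then
+    echo "bad date (expected YYYY-MM-DD): $CONFIRM_DATE" >&2; exit 1
+  fi
+  if [[ -d "$QUARANTINE_ROOT/$CONFIRM_DATE" ]]; then
+    gio trash "$QUARANTINE_ROOT/$CONFIRM_DATE"
+    echo "Recycled $QUARANTINE_ROOT/$CONFIRM_DATE"
+  else
+    echo "No quarantine for $CONFIRM_DATE" >&2; exit 1
+  fi
+  exit 0
+fi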
+
+[[ -n "$SRC" && -d "$SRC" ]] || { echo "usage: $0 [--apply] SRC" >&2; exit 1; }
+SRC="${SRC%/}" # drop a trailing slash so relative-path stripping works
+
+RELEASE="$(basename "$SRC")"
+STAGE="$STAGING_ROOT/$RELEASE"
+QUAR="$QUARANTINE_ROOT/$TODAY/$RELEASE"
+
+declare -i KEEP_N=0 DEL_N=0 FLAG_N=0
+
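+# Walk source, classify each entry. The report goes to stderr so that stdout
+# carries nothing but the staging path printed at the very end.
+junk_dir=""
+while IFS= read -r -d '' f; do
+  # Contents of a directory already reported as junk are not classified again
+  if [[ -n "$junk_dir" && "$f" == "$junk_dir"/* ]]; then continue; fi
+  # "$SRC" is quoted inside the expansion: release names contain [brackets],
+  # which would otherwise be read as a glob character class.
+  rel="${f#"$SRC"/}"
+  if [[ -d "$f" ]]; then
+    case "$(classify_dir "$f")" in
+      KEEP_AS-IS|RENAME:*) ;;
+      DELETE) printf 'DELETE %s/ [junk dir]\n' "$rel"; DEL_N+=1; junk_dir="$f" ;;
+      FLAG) printf 'FLAG %s/ [unknown dir name]\n' "$rel"; FLAG_N+=1 ;;
+    esac
+    continue
+  fi
+  case "$(classify "$f")" in
+    KEEP) printf 'KEEP %s\n' "$rel"; KEEP_N+=1 ;;
+    DELETE) printf 'DELETE %s\n' "$rel"; DEL_N+=1 ;;
+    FLAG) printf 'FLAG %s\n' "$rel"; FLAG_N+=1 ;;
+  esac
+done < <(find "$SRC" -mindepth 1 -print0) >&2
+
+{
+  echo "---"
+  echo "KEEP $KEEP_N"
+  echo "DELETE $DEL_N"
+  echo "FLAG $FLAG_N"
+} >&2
+
+if (( FLAG_N > 0 )); then
+  echo "FLAG count > 0; review before re-running with --apply." >&2
+  (( APPLY == 0 )) || exit 3
+fi
+
+if (( APPLY == 0 )); then
+  echo "Dry run only. Re-run with --apply to quarantine." >&2
+  exit 0
+fi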
+
+# --- APPLY path: copy to staging, move DELETE to quarantine ---
+mkdir -p "$STAGE" "$QUAR"
+# rsync -a preserves perms and is idempotent
+rsync -a --delete "$SRC/" "$STAGE/"
+
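+# Snapshot the tree before moving anything, so moves cannot race the walk
+# (mapfile -d needs bash 4.4 or newer).
+mapfile -d '' -t entries < <(find "$STAGE" -mindepth 1 -print0)
+renames=()
+for f in "${entries[@]}"; do
+  [[ -e "$f" || -L "$f" ]] || continue # parent directory already quarantined
+  rel="${f#"$STAGE"/}"
+  if [[ -d "$f" ]]; then
+    case "$(classify_dir "$f")" in
+      RENAME:*) renames+=("$f") ;; # deferred: renaming now would orphan child paths
+      DELETE)
+        mkdir -p "$QUAR/$(dirname "$rel")"
+        mv "$f" "$QUAR/$rel"
+        ;;
+    esac
+    continue
+  fi
+  case "$(classify "$f")" in
+    DELETE)
+      mkdir -p "$QUAR/$(dirname "$rel")"
+      mv "$f" "$QUAR/$rel"
+      ;;
+  esac
+done
+
+# Apply renames deepest-first (find lists parents before their children)
+for (( i=${#renames[@]}-1; i>=0; i-- )); do
+  f="${renames[i]}"
+  res="$(classify_dir "$f")"
+  target="${res#RENAME:}"
+  [[ "$(basename "$f")" == "$target" ]] || mv "$f" "$(dirname "$f")/$target"
+done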
+
+# Manifest
+{
+ echo "{"
+ echo " \"release\": \"$RELEASE\","
+ echo " \"date\": \"$TODAY\","
+ echo " \"source\": \"$SRC\","
+ echo " \"staging\": \"$STAGE\","
+ echo " \"quarantine\": \"$QUAR\""
+ echo "}"
+} > "$STAGE/.cleanup-manifest.json"
+
+# Stdout: the staging path, for piping to doc 08's normalizer
+echo "$STAGE"
+```
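+
+The manifest block in the script builds JSON with `echo`, which stays valid
+only while the release name and paths contain no double quotes or backslashes.
+The following is a minimal Python sketch of an escaping-safe equivalent,
+assuming the same five fields; `build_manifest` is a hypothetical name and the
+sketch is not part of the prepared script.
+
+```python
+import json
+
+
+def build_manifest(release: str, date: str, source: str,
+                   staging: str, quarantine: str) -> str:
+    """Serialize the cleanup manifest with proper JSON string escaping."""
+    manifest = {
+        "release": release,
+        "date": date,
+        "source": source,
+        "staging": staging,
+        "quarantine": quarantine,
+    }
+    # ensure_ascii=False keeps non-ASCII titles readable in the file.
+    return json.dumps(manifest, indent=2, ensure_ascii=False)
+
+
+print(build_manifest(
+    'Some "Quoted" Release',
+    "2026-05-08",
+    "/downloads/Some \"Quoted\" Release",
+    "/staging/Some \"Quoted\" Release",
+    "/quarantine/2026-05-08/Some \"Quoted\" Release",
+))
+```
+
+Any consumer that reads the manifest back with a JSON parser gets the original
+strings, whatever characters the release group put in the folder name.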
+
+### 9.1 Pipeline integration
+
+```bash
+# Full pre-import flow:
+SRC="/home/admin/Downloads/futrama/Futurama Season 1 [1080p AI x265 10bit FS99 Joy]"
+STAGING="$(cleanup-import.sh --apply "$SRC")"
+# STAGING is now ~/.jellyfin-staging/Futurama Season 1.../ with junk gone.
+# Hand off to doc 08:
+normalize-filenames.sh "$STAGING"
+# Then move to live media tree (manual; doc 05 confirms layout):
+mv "$STAGING" "/home/user/media/tv/Futurama (1999)/Season 01"
+```
+
+The `mv` to the live tree is **deliberately manual**. Cleanup and rename
+are reproducible from source; the move into `/home/user/media/` is the
+point of no return and the user runs it consciously.
+
+---
+
+## 10. What this doc explicitly does NOT do
+
+- **Filename normalization** — that's doc 08. This doc only removes junk and
+ normalizes extras folder names; doc 08 renames
+ `Futurama S01E01 Space Pilot 3000 [1080p x265 10bit Joy].mkv`
+ into the canonical `Futurama (1999) - S01E01 - Space Pilot 3000.mkv`.
+- **Subtitle reconciliation** — doc 03 covers per-language naming; this
+ doc only deletes obsolete formats (`.smi`, `.rt`).
+- **Library refresh** — after files land in `/home/user/media/`, run
+ `POST /Library/Refresh` on the Jellyfin API (doc 02 § 2). Cleanup never
+ touches the running container.
+- **NFO writing** — doc 02 § 11 covers writing override NFOs. This doc
+ only filters incoming NFOs.
+- **Source deletion** — never. The source download is read-only to this
+ pipeline; the user removes it manually post-import.
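+
+The library refresh step listed above can be scripted once the files are in
+the live tree. The following is a minimal Python sketch that only builds the
+request. The base URL `http://localhost:8096`, the API key value, and the
+`Authorization: MediaBrowser Token="..."` header form are assumptions to check
+against doc 02 § 2; `build_refresh_request` is a hypothetical name.
+
+```python
+import urllib.request
+
+
+def build_refresh_request(base_url: str, api_key: str) -> urllib.request.Request:
+    """Build (but do not send) a POST /Library/Refresh request."""
+    url = base_url.rstrip("/") + "/Library/Refresh"
+    return urllib.request.Request(
+        url,
+        data=b"",  # empty body; the endpoint takes no payload
+        method="POST",
+        # Assumed header form; the key is a placeholder, not a real credential.
+        headers={"Authorization": f'MediaBrowser Token="{api_key}"'},
+    )
+
+
+req = build_refresh_request("http://localhost:8096/", "REPLACE_WITH_API_KEY")
+print(req.get_method(), req.full_url)
+```
+
+Sending is a separate, deliberate step (`urllib.request.urlopen(req)`), which
+keeps the sketch consistent with the rule that cleanup itself never touches
+the running container.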
+
+---
+
+## 11. TL;DR
+
+| Step | What | Where |
+|---|---|---|
+| 1 | Audit (dry-run) | `cleanup-import.sh "$SRC"` |
+| 2 | Apply (quarantine) | `cleanup-import.sh --apply "$SRC"` → prints staging path |
+| 3 | Review quarantine | `ls ~/.jellyfin-quarantine/$(date +%F)/` |
+| 4 | Normalize filenames | doc 08, takes staging path as input |
+| 5 | Move to live tree | manual `mv "$STAGING" /home/user/media/...` |
+| 6 | Refresh library | `POST /Library/Refresh` (doc 02) |
+| 7 | Confirm quarantine | `cleanup-import.sh --confirm-quarantine YYYY-MM-DD` |
+| 8 | Delete source | manual, only after Jellyfin shows the item correctly |
+
+The hard rule, repeated: **the source download is never modified, the live
+media tree is never written by cleanup, and Windows executables never
+reach a Jellyfin user's browser.**
diff --git a/docs/08-filename-normalization.md b/docs/08-filename-normalization.md
new file mode 100644
index 0000000..cf62291
--- /dev/null
+++ b/docs/08-filename-normalization.md
@@ -0,0 +1,1853 @@
+# 08 — Filename & Folder Normalization Ruleset (tv.s8n.ru)
+
+Last updated: 2026-05-08
+Server: Jellyfin 10.10.3 on nullstone, container `jellyfin`
+Library root inside container: `/media`
+Library root on host: `/home/user/media`
+
+This document is the **normative ruleset** for renaming downloaded media into a
+canonical, predictable, group-tag-free shape before it lands in the live
+library tree. It is the layer between "torrent dump" and "file ready for the
+scanner".
+
+Cross-links:
+
+- [`05-file-structure-rules.md`](05-file-structure-rules.md) — what Jellyfin's
+ parser accepts; this doc picks one of the accepted forms and locks it in.
+- [`07-pre-import-cleanup.md`](07-pre-import-cleanup.md) — the cleanup stage
+ that runs immediately before this one. Doc 07 strips junk and hands over a
+ clean staging tree; doc 08 defines *what* canonical names look like in it.
+- [`02-metadata-and-titles.md`](02-metadata-and-titles.md) — what Jellyfin
+ does after the rename (parse, scrape, lock).
+- [`03-subtitles.md`](03-subtitles.md) — sidecar `.srt` / `.ass` naming
+ (referenced from § 5.6 below).
+
+> **Status of this doc:** specification + reference implementation. The
+> `normalize.py` script in § 11 is canonical. Anything not codified by the
+> script is documentation only — when the doc and the script disagree, the
+> script wins, and the doc gets fixed.
+
+---
+
+## 0. Why a normalization ruleset (and why now)
+
+Doc 05 establishes that Jellyfin's parser is permissive: dots, dashes,
+underscores, and spaces are interchangeable; `S01E01`, `s01e01`, `1x01`, and
+`Season 1 Episode 1` all parse to the same thing. That permissiveness is great
+for *getting Jellyfin to scrape a torrent dump*, but it is a disaster for
+**operating a library at scale**:
+
+1. **Search becomes noisy.** SMB / Syncthing / Dolphin search across mixed
+ patterns surfaces irrelevant matches (`S01E01` vs `1x01` vs `s01.e01`).
+2. **Diff / audit / dedupe scripts** get harder. Every regex needs to handle
+ N forms. The cleanup pass (doc 07) is dramatically cheaper if every file
+ in the tree obeys one shape.
+3. **Visual scan in `ls`** becomes unreadable when half the filenames have
+ `[1080p AI x265 10bit FS99 Joy]` glued on and the other half don't.
+4. **Future migrations** (Plex, Kodi, mobile sync to a Win/Mac client) all
+ have stricter parsers than Jellyfin. The strictest sane shape that
+ Jellyfin accepts is also the most portable. Pay the cost once.
+5. **Cross-platform safety.** This deploy is Linux-only today, but the
+ workspace's Syncthing setup (see ai-lab `SYSTEM.md`) implies future
+ sync to Win/Mac clients. Choose Windows-safe filenames now and never
+ touch this again.
+
+The cost of the ruleset is one Python script and discipline at import time.
+Both are bounded. The cost of *not* having one compounds with every new
+release.
+
+---
+
+## 1. Canonical formats — what the tree must look like
+
+This is the lock-in. **One shape per category. No alternatives. No "but my
+release group did it differently".**
+
+### 1.1 Movies
+
+```
+Movies/<Title> (<Year>)/<Title> (<Year>).<ext>
+Movies/<Title> (<Year>)/<Title> (<Year>) - <Edition>.<ext> (when edition matters)
+Movies/<Title> (<Year>) [<provider-id>]/<Title> (<Year>) [<provider-id>].<ext> (when ambiguous)
+```
+
+- `<Title>` — smart title case (§ 5.1), forbidden chars stripped (§ 5.5).
+- `<Year>` — first theatrical-release year, in parens, single space before `(`.
+ Mandatory in this deploy (doc 05 § 0 rule 5), even when the title is unique.
+- `<Edition>` — when present, exactly one of:
+ `Director's Cut`, `Extended`, `Theatrical`, `IMAX`, `Unrated`, `Final Cut`,
+ `Remastered`. Anything else (e.g. `Snyder Cut`, `Workprint`, `4K
+ Remaster`) is admissible only with a written justification in the import
+ log; otherwise normalize to the closest of the seven canonical labels
+ above.
+- `<provider-id>` — `imdbid-tt0123456` / `tmdbid-12345` / `tvdbid-12345`
+ in square brackets. Optional unless year-based disambiguation isn't
+ enough (§ 6.2).
+- `<ext>` — lowercase: `mkv`, `mp4`, `webm`, `avi`. (`mkv` is the rip
+ default; `mp4` is the streaming-original default.) Never uppercase
+ `.MKV`, `.MP4`.
+
+**Forbidden in the filename**: resolution tags (`1080p`, `2160p`, `720p`,
+`4K`), codec tags (`x264`, `x265`, `h264`, `h265`, `HEVC`, `AVC`), source
+tags (`WEB`, `WEB-DL`, `BluRay`, `BRRip`, `HDTV`, `DVDRip`, `WEBRip`),
+audio tags (`AAC`, `AC3`, `DTS`, `DTS-HD.MA`, `5.1`, `7.1`, `Atmos`,
+`Opus`), bitness/HDR tags (`10bit`, `8bit`, `HDR`, `DV`, `SDR`), release
+tags (`PROPER`, `REPACK`, `INTERNAL`, `LIMITED`, `RERIP`), language tags
+(`MULTi`, `DUBBED`, `SUBBED`), group tags
+(`[YIFY]`, `[RARBG]`, `[FS99 Joy]`, `-NOGRP`, `-EVO`, `-SPARKS`),
+and website refs (`WWW.YIFY-TORRENTS.COM`, `RARBG.txt`-derived names).
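+
+An audit pass can check a basename against part of the forbidden list above.
+The following is a minimal Python sketch, not the `normalize.py` from § 11:
+the pattern covers only a subset of the tags (resolution, codec, source,
+bit-depth, and two release tags), and `forbidden_tags` is a hypothetical name.
+
+```python
+import re
+
+# Subset of the forbidden list; longer alternatives come first so that
+# "WEB-DL" is not reported as a bare "WEB".
+FORBIDDEN = re.compile(
+    r"(?<![A-Za-z0-9])("
+    r"2160p|1080p|720p|4K|x264|x265|h264|h265|HEVC|AVC|"
+    r"WEB-DL|WEBRip|WEB|BluRay|BRRip|HDTV|DVDRip|"
+    r"10bit|8bit|HDR|SDR|PROPER|REPACK"
+    r")(?![A-Za-z0-9])",
+    re.IGNORECASE,
+)
+
+
+def forbidden_tags(name: str) -> list[str]:
+    """Return every forbidden tag found in a filename, in order of appearance."""
+    return [m.group(1) for m in FORBIDDEN.finditer(name)]
+
+
+print(forbidden_tags("Futurama S01E01 Space Pilot 3000 [1080p x265 10bit Joy].mkv"))
+print(forbidden_tags("Blade Runner (1982).mkv"))
+```
+
+The sketch only detects; it does not rename. A caller would also need to
+exempt a resolution used as a multi-version label, and short tags that are
+ordinary words (a title containing "Web", for example) will produce false
+positives.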
+
+**Justification — why no resolution/codec tag:**
+
+Jellyfin reads stream attributes (resolution, codec, bit-depth, HDR, audio
+codec) directly from the file via `ffprobe` when it scans the item. The web UI
+displays them. The mobile clients display them. The transcoder picks
+based on them. The filename contributes **zero new information**.
+Including those tags pollutes search results, breaks the byte-exact
+folder-vs-file match required for multi-version movies (doc 05 § 1.2),
+and makes humans skim past a wall of tags to find the title. The only
+exception is `Movie (Year) - 1080p.mkv` AS the multi-version label
+when two distinct rips of *the same movie* are kept in the same folder
+(e.g. `Blade Runner 2049 (2017) - 2160p.mkv` next to
+`Blade Runner 2049 (2017) - 1080p.mkv`). In that exact case, the
+resolution IS the disambiguation token. Otherwise, no.
+
+#### Examples
+
+```
+Movies/Blade Runner (1982)/Blade Runner (1982).mkv
+Movies/Blade Runner (1982)/Blade Runner (1982) - Final Cut.mkv
+Movies/Blade Runner (1982)/Blade Runner (1982) - Director's Cut.mkv
+Movies/Blade Runner 2049 (2017)/Blade Runner 2049 (2017) - 2160p.mkv
+Movies/Blade Runner 2049 (2017)/Blade Runner 2049 (2017) - 1080p.mkv
+Movies/Dune (1984) [imdbid-tt0087182]/Dune (1984) [imdbid-tt0087182].mkv
+```
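+
+The movie shape can be expressed as a small path builder. The following is a
+minimal Python sketch, not the `normalize.py` from § 11: `movie_path` is a
+hypothetical name, title casing and forbidden-character stripping are assumed
+to have happened already, and the resolution-as-label case is left out.
+Placing the provider id before the edition suffix is also an assumption.
+
+```python
+from typing import Optional
+
+CANONICAL_EDITIONS = {
+    "Director's Cut", "Extended", "Theatrical", "IMAX",
+    "Unrated", "Final Cut", "Remastered",
+}
+
+
+def movie_path(title: str, year: int, ext: str = "mkv",
+               edition: Optional[str] = None,
+               provider_id: Optional[str] = None) -> str:
+    """Build the canonical relative path for one movie file."""
+    if edition is not None and edition not in CANONICAL_EDITIONS:
+        raise ValueError(f"non-canonical edition label: {edition!r}")
+    # The folder name and the file stem share the same "Title (Year) [id]" prefix.
+    stem = f"{title} ({year})"
+    if provider_id:
+        stem += f" [{provider_id}]"
+    filename = stem + (f" - {edition}" if edition else "")
+    return f"Movies/{stem}/{filename}.{ext.lower()}"
+
+
+print(movie_path("Blade Runner", 1982, edition="Final Cut"))
+print(movie_path("Dune", 1984, provider_id="imdbid-tt0087182"))
+```
+
+Rejecting unknown edition labels with an exception mirrors the rule that
+anything outside the seven canonical labels needs a written justification
+rather than silent acceptance.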
+
+### 1.2 TV shows
+
+```
+TV/<Show> (<Year>)/Season <SS>/<Show> (<Year>) - S<SS>E<EE> - <Episode Title>.<ext>
+TV/<Show> (<Year>)/Season <SS>/<Show> (<Year>) - S<SS>E<EE>-E<EE> - <Episode Title>.<ext>
+TV/<Show> (<Year>)/Season 00/<Show> (<Year>) - S00E<EE> - <Episode Title>.<ext>
+```
+
+- `<Show>` — smart title case, no provider-id in show folder unless the
+ scraper picks the wrong show twice in a row (then add `[tvdbid-NNNN]`).
+- `<Year>` — series **first-air year**, mandatory even when title is unique
+ (doc 05 § 0 rule 5; this deploy convention is stricter than upstream
+ permissive parsing).
+- `<SS>` — zero-padded two digits. `Season 01`, not `Season 1`. `S01`, not `S1`.
+- `<EE>` — zero-padded two digits. Three digits permissible only for shows
+ that exceed 99 episodes per *season* (rare; e.g. some daily anime). See
+ doc 05 § 3.1.
+- `<Episode Title>` — title from the metadata provider (TVDB/TMDB) with
+ smart title case. Required for human readability; Jellyfin overwrites it
+ during scrape but the file basename is what humans see in `ls`.
+- Multi-episode files: `S<SS>E<EE>-E<EE>` — single hyphen, no spaces.
+ Verified parsing per doc 05 § 2.2 table.
+
+#### Examples
+
+```
+TV/Futurama (1999)/Season 01/Futurama (1999) - S01E01 - Space Pilot 3000.mkv
+TV/Futurama (1999)/Season 01/Futurama (1999) - S01E03-E04 - I, Roommate & Love's Labours Lost in Space.mkv
+TV/Futurama (1999)/Season 00/Futurama (1999) - S00E01 - Bender's Big Score.mkv
+TV/The Office (2005)/Season 02/The Office (2005) - S02E01 - The Dundies.mkv
+```
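+
+The episode shape, including zero-padding and the multi-episode form, can be
+sketched the same way. This is a minimal Python illustration, not the
+`normalize.py` from § 11; `episode_path` and its parameters are hypothetical
+names, and the episode title is assumed to be already cleaned.
+
+```python
+from typing import Optional
+
+
+def episode_path(show: str, year: int, season: int, episode: int,
+                 title: str, ext: str = "mkv",
+                 last_episode: Optional[int] = None) -> str:
+    """Build the canonical relative path for one episode file."""
+    # :02d pads to two digits and grows naturally to three for episode 100+.
+    ep_id = f"S{season:02d}E{episode:02d}"
+    if last_episode is not None:
+        ep_id += f"-E{last_episode:02d}"  # single hyphen, no spaces
+    base = f"{show} ({year})"
+    return f"TV/{base}/Season {season:02d}/{base} - {ep_id} - {title}.{ext.lower()}"
+
+
+print(episode_path("Futurama", 1999, 1, 1, "Space Pilot 3000"))
+print(episode_path("Futurama", 1999, 0, 1, "Bender's Big Score"))
+```
+
+Specials need no special case: season `0` formats to `Season 00` and `S00`
+through the same padding rule.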
+
+#### Why this shape (not the slimmer `Show S01E01.mkv`)
+
+Doc 05 § 2.2 shows three accepted patterns:
+
+```
+Futurama (1999) S01E01.mkv
+Futurama (1999) S01E01 - Space Pilot 3000.mkv
+Futurama (1999) - S01E01 - Space Pilot 3000.mkv ← canonical for this deploy
+```
+
+The third form (with the leading ` - ` before `S01E01` and the title) is
+chosen because:
+
+1. The leading dash visually separates the series-name block from the
+ episode-id block. Important when the show's title contains spaces and
+ numbers (`Star Trek The Next Generation S01E01`) — without the dash, the
+ eye trips over `Generation S01E01`.
+2. Symmetric with the Movies multi-version pattern (`Title (Year) -