# 07 — Pre-Import Cleanup Ruleset (tv.s8n.ru)

Last updated: 2026-05-08
Server: Jellyfin 10.10.3 on nullstone, container `jellyfin`
Library root inside container: `/media`
Library root on host: `/home/user/media`

This document defines the **normative pre-import cleanup ruleset** for the
personal Jellyfin deploy. The owner downloads scene/group releases (e.g.
`Futurama Season 1 [1080p AI x265 10bit FS99 Joy]/`) which contain a mixture
of media files and non-media junk (codec readmes, release-group brags, Windows
installer shortcuts, comparison images, OS thumbnail caches, etc.). This junk
must NOT land in `/home/user/media/` because:

1. It clutters the library and confuses scrapers.
2. Promo PNGs may be mis-identified as artwork.
3. Release-group `.nfo` files break the NFO-override flow (doc 02 § 11).
4. **Windows executables and installer shortcuts (`.exe`, `.msi`, `.website`,
   `.url`, `.lnk`, `.scr`, `.bat`, `.ps1`) are a real security vector.** Even
   though the Linux server cannot execute them, friends with a Jellyfin
   account can download them through the web UI and run them on their PC.

Cross-linked to:

- [`01-artwork-and-images.md`](01-artwork-and-images.md) — what counts as a
  recognised poster / backdrop on disk.
- [`02-metadata-and-titles.md`](02-metadata-and-titles.md) — NFO sidecar
  override flow; what a "real" Jellyfin NFO looks like.
- [`03-subtitles.md`](03-subtitles.md) — which subtitle files to keep.
- [`05-file-structure-rules.md`](05-file-structure-rules.md) — canonical
  folder layout. § 8 of doc 05 defines the recognised extras subfolders;
  this doc enforces them at import time.
- [`08-filename-normalization.md`](08-filename-normalization.md) — the
  **next** stage of the pipeline (sibling agent), called after this doc's
  `cleanup-import.sh` has produced a clean staging tree.

Sources of truth:

- <https://jellyfin.org/docs/general/server/media/movies/> — extras subfolders
  and artwork filename patterns.
- <https://jellyfin.org/docs/general/server/media/shows/> — same for series.
- <https://jellyfin.org/docs/general/server/metadata/nfo/> — NFO XML schema;
  used here to distinguish a real metadata NFO from a release-group brag.

---

## 0. Top-level cleanup rules

These are non-negotiable. They wrap the doc 05 top-level rules with one
guarantee: **nothing leaves staging until cleanup has run and been
confirmed.**

1. **Never clean in-place on the source download.** The download directory
   (`/home/admin/Downloads/...`) is treated as a read-only artefact until
   the user explicitly approves deletion. The cleanup script copies into a
   staging area and operates there.
2. **Quarantine first, delete later.** First run of the cleanup script on a
   release moves junk to `~/.jellyfin-quarantine/<YYYY-MM-DD>/<release-name>/`
   instead of deleting. The user reviews, then a second pass empties the
   quarantine after sign-off. Subsequent runs on the same release are
   idempotent.
3. **Two-list policy.** Every file is matched against an `ALLOW` list (KEEP)
   or a `DENY` list (DELETE). Anything not on either list is **flagged** and
   surfaced in the audit report — a human decides. Never auto-delete on
   "unknown".
4. **Never run cleanup as root.** All operations are as the unprivileged
   `admin` (onyx) or `user` (nullstone) account. The live `/home/user/media/`
   tree is touched only by the rename step in doc 08, after cleanup has
   produced an intermediate staging copy.
5. **Idempotent.** Running cleanup twice on the same source must produce the
   same staging tree byte-for-byte (same `find -printf '%p %s\n' | sort`
   output, modulo timestamps).
6. **Dry-run is the default.** The cleanup script with no flags lists what it
   *would* do and exits without writing. `--apply` is required to actually
   move/quarantine files.

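
Rules 4 and 6 lend themselves to a small guard at the top of the script. The following is a minimal sketch, not the actual `cleanup-import.sh`: the function names `parse_mode` and `refuse_root` are hypothetical, and only the `--apply` flag comes from this document. A real script would call them as `refuse_root "$(id -u)" || exit 1` and `mode=$(parse_mode "$@")` before touching anything.

```bash
# Hypothetical helpers; the real cleanup-import.sh may be structured differently.

# parse_mode ARGS... -> prints "apply" only if --apply was passed, else "dry-run".
parse_mode() {
  local mode="dry-run" arg
  for arg in "$@"; do
    if [ "$arg" = "--apply" ]; then
      mode="apply"
    fi
  done
  printf '%s\n' "$mode"
}

# refuse_root UID -> fails when UID is 0. The UID is passed in as a parameter
# so the check can be exercised without actually being root.
refuse_root() {
  if [ "$1" -eq 0 ]; then
    echo "refusing to run as root (rule 4)" >&2
    return 1
  fi
  return 0
}
```
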

---

## 1. Categorical taxonomy of non-media files in scene/group releases

Scene and group ("p2p") releases follow loose conventions. The following
categories cover everything observed in the wild plus everything in the
Futurama download set:

### 1.1 Codec / player promotion

Text files and Windows shortcut files steering the user toward a specific
codec pack or media player (often K-Lite + MPC-HC). Frequently the file is
an `.url` or `.website` (Internet Shortcut) pointing to a third-party
installer. **Always DELETE.**

Real-world examples (`/home/admin/Downloads/futrama/`):

- `How to play HEVC (THIS FILE).txt` — 65 lines of MPC-HC marketing.
- `Ninite K-Lite Codecs Unattended Silent Installer and Updater.website`
  — `URL=https://ninite.com/klitecodecs/` Internet Shortcut.

Patterns:

- `How to play *.txt`, `Read*Me*.txt`, `INSTALL*.txt`, `PLAY*.txt`
- `*.website`, `*.url`, `*.lnk`
- `K-Lite*`, `MPC-HC*`, `VLC*`, `MX Player*`, `LAV*`

### 1.2 Release-group brag

Plain-text or `.nfo` files where the release group identifies itself,
documents encoder settings, or pumps its tracker URL. Distinguishable from a
**Jellyfin-compatible metadata NFO** (XML, root `<movie>` / `<tvshow>` /
`<episodedetails>`) by content — see § 3.

Real-world examples:

- `Encoded by JoyBell (UTR).txt` — 41-line manifesto from "Unity Team
  Release group" pointing to `UNITEAM.CO`.
- `RARBG.txt`, `WWW.YIFY-TORRENTS.COM.url`, `<group>.nfo` with ASCII art.

Patterns:

- `Encoded by *.txt`, `Ripped by *.txt`, `<GROUP>.txt`
- `RARBG.txt`, `RARBG_DO_NOT_MIRROR.exe` (yes, those exist; § 1.10)
- `*-readme.txt`, `release notes.txt`
- `*.nfo` containing only ASCII art (no `<movie>` / `<tvshow>` /
  `<episodedetails>` root element)
- `*.diz`, `file_id.diz` — old "BBS description" file, scene leftover

### 1.3 Promo images that are NOT poster artwork

Images that LOOK like artwork to a naive globber but are actually before/after
comparisons, group banners, or screenshot proofs. **Delete unless they live
inside a recognised extras folder (§ 4) or match the strict allow-list of
poster/backdrop names from doc 01.**

Real-world example:

- `Futurama Compare.png` (1.05 MB) — encoder before/after comparison.

Patterns to delete:

- `*Compare*.{png,jpg,jpeg,webp}`
- `*Sample*.{png,jpg,jpeg}` (when not in a `samples/` extras folder)
- `*Screen*.{png,jpg}`, `*Screens/*`, `*Proof/*`, `*Preview/*`
- `*-banner.png` from a group (NOT the same as Jellyfin's `banner.jpg`;
  group banners typically have the group name in the filename — heuristic
  match `*JoyBell*`, `*UTR*`, `*JoY*`, etc.)
- Stray `*.gif` files (animated previews); Jellyfin doesn't use GIF.

### 1.4 OS-generated thumbnail caches

Per-OS file managers (Windows Explorer, macOS Finder, GNOME Files) leave
turds in every directory they browse. **Always DELETE — never useful, never
metadata.**

Patterns:

- `Thumbs.db`, `ehthumbs.db`, `ehthumbs_vista.db`
- `.DS_Store`, `._*` (macOS resource forks)
- `Desktop.ini`, `desktop.ini`
- `.directory` (KDE)
- `.fseventsd/`, `.Spotlight-V100/`, `.Trashes/` (macOS)
- `$RECYCLE.BIN/`, `System Volume Information/` (Windows mount)

### 1.5 Sample files (lower-quality previews)

Scene releases sometimes ship a 30-second sample file at lower bitrate.
Jellyfin treats a `samples/` subfolder as extras (doc 05 § 8.2), but a stray
`Movie.sample.mkv` next to the main file would scrape as "another version".

**Default: DELETE.** Reasoning: we have the full file; the sample is dead
weight. If the user genuinely wants samples, drop them into a `samples/`
subfolder before running cleanup and the script will preserve the folder.

Patterns to delete (when at the top level of a release):

- `sample.{mkv,mp4,avi,m4v}`
- `*-sample.{mkv,mp4,avi,m4v}`, `*.sample.{mkv,mp4,avi,m4v}`
- `*_sample.{mkv,mp4,avi,m4v}`
- `Sample/` directory (rename to `samples/` to preserve as extras, OR delete)

### 1.6 Subtitle leftovers

VobSub subtitles (DVD bitmap subs; Blu-ray discs use PGS `.sup` instead) ship
as a pair: `en.idx` (index) + `en.sub` (bitmap stream). Jellyfin can render
them, but if a `.srt` exists with the same language tag the bitmap pair is
redundant and slow.

**Default: KEEP all `.srt` and `.ass`. KEEP `.idx`/`.sub` only if no `.srt`
of the same language exists.** This is a per-file decision — surface to the
user in the audit report rather than auto-pruning.

Patterns:

- `*.srt`, `*.ass`, `*.ssa`, `*.vtt` — KEEP (per doc 03).
- `*.sup` (PGS bitmap, Blu-ray) — KEEP (Jellyfin renders).
- `*.idx` + `*.sub` (VobSub) — KEEP if no `.srt` with same lang code; else
  flag for human review.
- `*.smi`, `*.rt` — DELETE (obsolete formats Jellyfin doesn't support).

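
The VobSub rule can be sketched as a per-file check. This is only an illustration under stated assumptions: the language tag is taken to be either the whole basename (`en.idx`) or the last dot-separated token (`Movie.en.idx`), a matching SRT is assumed to be named `<anything>.<lang>.srt` or `<lang>.srt`, and the function name `vobsub_action` is made up for this example. Matching is case-sensitive here, which a real implementation might want to relax.

```bash
# vobsub_action IDX_FILE -> prints KEEP when no same-language .srt sits next
# to the VobSub pair, FLAG (human review) when one does.
# Hypothetical helper; the naming assumptions are described above.
vobsub_action() {
  local idx="$1" dir name lang f
  dir=$(dirname -- "$idx")
  name=$(basename -- "$idx" .idx)
  lang="${name##*.}"              # "Movie.en" -> "en"; "en" -> "en"
  for f in "$dir"/*."$lang".srt "$dir/$lang.srt"; do
    # An unmatched glob stays literal and simply fails the -e test.
    if [ -e "$f" ]; then
      echo FLAG
      return 0
    fi
  done
  echo KEEP
}
```
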

### 1.7 Torrent residue

Files left by the torrent client itself. None are useful to Jellyfin.

Patterns to delete:

- `*.torrent`, `*.magnet`
- `*.parts`, `*.!ut`, `*.!qB`, `*.bc!` (in-progress fragments)
- `*.meta`, `*.aria2`
- `*.pad`, `padding/`, `__padding_file_*` (mktorrent padding)
- `*.sfv` (checksum manifest; harmless but useless after download)
- `*.md5`, `*.sha1`, `*.sha256` (release-checksum sidecars)

### 1.8 Test / proof images and folders

Some groups ship a `Proof/` or `Screens/` folder with screenshots to "prove"
the rip's quality. Useless inside a Jellyfin library.

Patterns to delete (whole folders):

- `Proof/`, `proof/`, `PROOF/`
- `Screens/`, `screens/`, `Screenshots/`, `Caps/`
- `Preview/`, `Previews/`
- `_screens/`, `screenshots-only/`

### 1.9 Multi-disc DVD/Blu-ray cruft

When a release is a straight ISO rip the `VIDEO_TS/` or `BDMV/` directory
sometimes survives next to the encoded file. Jellyfin can play
`VIDEO_TS.IFO` directly, but a partial DVD structure left over from the
encode is just clutter.

Patterns:

- `VIDEO_TS/` — KEEP if it contains a complete DVD file set
  (`VIDEO_TS.IFO` plus the matching `.BUP` and `.VOB` files);
  otherwise flag.
- `*.IFO`, `*.BUP`, `*.VOB` — KEEP if inside a complete `VIDEO_TS/`;
  DELETE if loose.
- `BDMV/`, `CERTIFICATE/`, `AACS/` — KEEP if complete BD structure;
  flag if partial.
- `*.iso` inside a media folder — flag for human review (could be the
  intentional rip OR a Windows malware vector — see § 8).

### 1.10 Outright malicious / suspicious

Some releases historically shipped Windows executables disguised as
"DO NOT MIRROR" anti-leech files. Even on a Linux server these must be
deleted because the friend with a Jellyfin account can download them via
the web UI ("Download original file" button) and run them locally.

**Always DELETE, never quarantine, never preserve, no exceptions.**

Patterns:

- `*.exe`, `*.msi`, `*.bat`, `*.cmd`, `*.com`, `*.scr`, `*.ps1`, `*.vbs`,
  `*.wsf`, `*.hta`, `*.jar`
- `*.app/` (macOS bundle dropped by macOS-using uploader)
- `*.dll`, `*.sys` (rare, but seen)
- Anything with a double extension like `Movie.mkv.exe`

---

## 2. KEEP vs DELETE — exhaustive table

This table is the **canonical decision matrix** for `cleanup-import.sh`.
Patterns are matched case-insensitively (ext4 itself is case-sensitive, so
the matching has to ignore case explicitly). `KEEP` means it goes to the
staging tree; `DELETE` means it goes to quarantine on first run, then to the
system trash on confirm (§ 5.3).

| Pattern | Action | Why |
|---|---|---|
| `*.mkv`, `*.mp4`, `*.avi`, `*.m4v`, `*.ts`, `*.mov`, `*.webm`, `*.wmv`, `*.flv`, `*.mpg`, `*.mpeg` | **KEEP** | Media — the entire point. |
| `*.srt`, `*.ass`, `*.ssa`, `*.vtt`, `*.sup` | **KEEP** | Subtitles (doc 03). |
| `*.idx` + `*.sub` (VobSub pair) | **KEEP** if no `.srt` of same lang exists; else **FLAG** | Bitmap subs; redundant with SRT. |
| `*.smi`, `*.rt` | **DELETE** | Obsolete subtitle formats; Jellyfin can't render. |
| `folder.{jpg,png}`, `poster.{jpg,png}`, `cover.{jpg,png}`, `default.{jpg,png}`, `show.{jpg,png}`, `jacket.{jpg,png}`, `movie.{jpg,png}` | **KEEP** | Jellyfin-recognised primary artwork (doc 01). |
| `backdrop.{jpg,png}`, `fanart.{jpg,png}`, `background.{jpg,png}`, `art.{jpg,png}`, `backdrop[0-9]*.{jpg,png}`, `backdrop-[0-9]*.{jpg,png}` | **KEEP** | Jellyfin-recognised backdrops (doc 01). |
| `logo.{png,jpg}`, `clearlogo.{png,jpg}`, `banner.{jpg,png}`, `landscape.{jpg,png}`, `thumb.{jpg,png}`, `disc.{png,jpg}`, `clearart.{png,jpg}` | **KEEP** | Jellyfin-recognised auxiliary artwork. |
| `season[0-9]*-poster.{jpg,png}`, `season[0-9]*.{jpg,png}`, `season-specials-poster.{jpg,png}` | **KEEP** | Per-season artwork (doc 01 / TV layout). |
| `extrafanart/*.{jpg,png}`, `backdrops/*.{jpg,png,mp4}` | **KEEP** | Multi-backdrop folders (doc 05 § 8). |
| `*.nfo` with XML root `<movie>` / `<tvshow>` / `<episodedetails>` / `<artist>` / `<album>` / `<musicvideo>` | **KEEP** | Jellyfin-compatible metadata sidecar (doc 02 § 11). |
| `*.nfo` without one of the above XML roots | **DELETE** | Release-group ASCII-art brag — pretends to be metadata, isn't. |
| `*Compare*.{png,jpg,jpeg,webp,gif}` | **DELETE** | Encoder before/after — group promo. |
| `*Sample*.{png,jpg,jpeg}` (image, top level) | **DELETE** | Group promo (NOT a Jellyfin sample folder). |
| `*Screen*.{png,jpg}`, `Screens/`, `Screenshots/`, `Caps/` | **DELETE** | Proof shots. |
| `Proof/`, `proof/`, `PROOF/` | **DELETE** (whole folder) | Quality-proof shots. |
| `Preview/`, `Previews/` | **DELETE** (whole folder) | Lower-quality teaser. |
| `*.txt` (any) | **DELETE** | Readme / group brag — Jellyfin doesn't read TXT. |
| `*.diz`, `file_id.diz` | **DELETE** | Scene description file — obsolete. |
| `*.website`, `*.url`, `*.lnk` | **DELETE** | Windows Internet Shortcut — points at codec/installer pages. **Security: § 8.** |
| `*.exe`, `*.msi`, `*.bat`, `*.cmd`, `*.com`, `*.scr`, `*.ps1`, `*.vbs`, `*.wsf`, `*.hta`, `*.jar`, `*.dll`, `*.sys` | **DELETE** | Windows executable. **Security: § 8.** |
| `*.app/` | **DELETE** (whole folder) | macOS bundle. |
| `Thumbs.db`, `ehthumbs.db`, `ehthumbs_vista.db` | **DELETE** | Windows Explorer thumbnail cache. |
| `.DS_Store`, `._*` | **DELETE** | macOS Finder. |
| `Desktop.ini`, `desktop.ini` | **DELETE** | Windows folder customisation. |
| `.directory` | **DELETE** | KDE Dolphin. |
| `.fseventsd/`, `.Spotlight-V100/`, `.Trashes/`, `$RECYCLE.BIN/`, `System Volume Information/` | **DELETE** (whole folder) | OS metadata directories. |
| `sample.{mkv,mp4,avi,m4v}` (top level) | **DELETE** | Lower-quality preview (doc 05 § 8.1: full file already present). |
| `*-sample.{mkv,mp4,avi,m4v}`, `*_sample.{mkv,mp4,avi,m4v}`, `*.sample.{mkv,mp4,avi,m4v}` | **DELETE** | Same. |
| `Sample/` (directory, top level) | **DELETE** | Lower-quality preview folder. |
| `samples/` (directory, recognised name) | **KEEP** | Jellyfin extras folder (doc 05 § 8.2). |
| `featurettes/`, `behind the scenes/`, `deleted scenes/`, `interviews/`, `scenes/`, `shorts/`, `clips/`, `trailers/`, `extras/`, `other/`, `theme-music/`, `backdrops/` | **KEEP** (whole folder) | Jellyfin extras (doc 05 § 8.2). |
| `Featurettes/`, `Behind The Scenes/`, etc. (capitalised) | **KEEP** but **rename to lowercase** | Jellyfin matches case-insensitive but lowercase is the documented form. |
| Any other folder name | **FLAG** | Surface to human; might be a typo of an extras folder. |
| `*.torrent`, `*.magnet` | **DELETE** | Torrent client residue. |
| `*.parts`, `*.!ut`, `*.!qB`, `*.bc!`, `*.aria2` | **DELETE** | In-progress download fragments (shouldn't be here, but defensive). |
| `*.meta` | **DELETE** | aria2/torrent metadata. |
| `*.pad`, `padding/`, `__padding_file_*`, `_____padding_file_*` | **DELETE** | mktorrent padding files. |
| `*.sfv`, `*.md5`, `*.sha1`, `*.sha256` | **DELETE** | Checksum manifests; harmless but useless after download. |
| `*.rar`, `*.r[0-9][0-9]`, `*.zip`, `*.7z`, `*.tar`, `*.tar.gz` | **FLAG** | Compressed archive in a media folder is suspicious — release should have been extracted before download. |
| `*.iso` inside a media folder | **FLAG** | Could be intentional DVD/BD rip OR Windows-installer disguise. Human review. |
| `VIDEO_TS/` (complete) | **KEEP** | Jellyfin plays DVD structure directly. |
| `*.IFO`, `*.BUP`, `*.VOB` (loose, no `VIDEO_TS/`) | **DELETE** | Orphan DVD remnants. |
| `BDMV/` (complete) | **KEEP** | Jellyfin plays BD structure. |
| `CERTIFICATE/`, `AACS/` (without `BDMV/`) | **DELETE** | Orphan BD remnants. |
| `RARBG*.{txt,exe}`, `WWW.*.url`, `*.YIFY*.url` | **DELETE** | Tracker promo. |
| `RARBG_DO_NOT_MIRROR.exe` and similar | **DELETE** (security: § 8) | Historic anti-leech file; sometimes weaponised. |
| Anything else | **FLAG** | Two-list policy: never auto-delete on "unknown". |

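
A subset of the matrix can be expressed as a name-only classifier. The sketch below is illustrative rather than the real `cleanup-import.sh` logic: `classify_name` is a hypothetical function, it only looks at the basename, and it deliberately leaves out rows that need more context (images, `.nfo` files, folders, VobSub pairs), which therefore fall through to FLAG. The security patterns come first so that a double extension such as `Movie.mkv.exe` never reaches the media rule.

```bash
# classify_name BASENAME -> prints KEEP, DELETE or FLAG for a subset of the
# table above. Hypothetical helper; rows that need content or location
# checks are intentionally not covered and end up as FLAG.
classify_name() {
  local n
  n=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$n" in
    *.exe|*.msi|*.bat|*.cmd|*.com|*.scr|*.ps1|*.vbs|*.wsf|*.hta|*.jar|*.dll|*.sys) echo DELETE ;;
    *.website|*.url|*.lnk) echo DELETE ;;
    sample.mkv|sample.mp4|sample.avi|sample.m4v) echo DELETE ;;
    *[-_.]sample.mkv|*[-_.]sample.mp4|*[-_.]sample.avi|*[-_.]sample.m4v) echo DELETE ;;
    *.mkv|*.mp4|*.avi|*.m4v|*.ts|*.mov|*.webm|*.wmv|*.flv|*.mpg|*.mpeg) echo KEEP ;;
    *.srt|*.ass|*.ssa|*.vtt|*.sup) echo KEEP ;;
    *.txt|*.diz|*.smi|*.rt|*.torrent|*.magnet|*.sfv|*.md5|*.sha1|*.sha256) echo DELETE ;;
    thumbs.db|ehthumbs.db|ehthumbs_vista.db|.ds_store|._*|desktop.ini|.directory) echo DELETE ;;
    *) echo FLAG ;;
  esac
}
```
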

---

## 3. NFO handling — the nuanced case

`.nfo` is overloaded. Two completely different file kinds share the
extension:

- **Scene release `.nfo`** — plain text, ASCII art, encoder credits, tracker
  URL. Useless to Jellyfin (and at worst gets scraped as garbage metadata
  if NFO Saver is enabled).
- **Jellyfin/Kodi/Emby metadata NFO** — XML, root element is one of
  `<movie>`, `<tvshow>`, `<episodedetails>`, `<artist>`, `<album>`,
  `<musicvideo>`. Documented in doc 02 § 11.

### 3.1 The discriminator one-liner

```bash
is_jellyfin_nfo() {
  # Returns 0 (KEEP) if the file looks like a Jellyfin/Kodi NFO,
  # 1 (DELETE) if it looks like scene-group ASCII-art brag.
  # The element name must be followed by whitespace, '/', '>' or end of
  # line, so '<movie xmlns="...">' matches but '<movies>' does not.
  # 'season' covers season.nfo. LC_ALL=C and -a make grep treat the bytes
  # as plain text regardless of the file's encoding.
  head -c 4096 -- "$1" \
    | LC_ALL=C grep -aqE '<(movie|tvshow|episodedetails|artist|album|musicvideo|season)([[:space:]/>]|$)'
}

# Usage:
if is_jellyfin_nfo "$f"; then echo "KEEP   $f"; else echo "DELETE $f"; fi
```

The first 4096 bytes are enough: a real Jellyfin NFO declares its root
within the first kilobyte. Whitespace is deliberately left in place, because
stripping it would glue a root element's name to its first attribute
(`<movie xmlns=...>` would become `<moviexmlns=...`) and make the end of the
element name impossible to detect.

### 3.2 Edge cases

- An NFO with both ASCII art **and** an XML root: KEEP. Jellyfin's parser
  ignores leading non-XML noise as long as the XML element parses.
- An NFO with a different XML root (e.g. `<root>`, `<info>`): DELETE.
  Jellyfin won't read it; nothing to preserve.
- An NFO with valid XML but **stale TMDB/IMDB IDs** that conflict with a
  newer scrape: KEEP, but flag for the user — doc 02 § 11.5 explains how
  the NFO Saver overwrites these on next scrape.
- Multiple NFOs in one folder (e.g. `release.nfo` from the group AND
  `tvshow.nfo` from a previous Jellyfin write): KEEP `tvshow.nfo`,
  DELETE `release.nfo`. Use the discriminator above on each.

### 3.3 First-100-bytes shortcut

The task brief proposes this:

```bash
if head -c 100 file.nfo | grep -qE '<(movie|tvshow|episodedetails)\b'; then echo KEEP; else echo DELETE; fi
```

This works for the common case but misses NFOs that start with an XML
declaration (`<?xml version="1.0"?>` plus possibly a comment) before the
root element — that prologue alone can be > 100 bytes. The 4096-byte
version above is safer; we use that in `cleanup-import.sh`.

---

## 4. Featurettes / Extras / Bonus folders — the canonical list

Per the Jellyfin docs (movies and shows pages), these subfolder names are
recognised and the contained files are tagged with the matching extra
type. **Folder name match is case-insensitive but lowercase is the
documented canonical form** — `cleanup-import.sh` lowercases on copy to
staging.

| Folder name | Extra type | Notes |
|---|---|---|
| `behind the scenes` | Behind The Scenes | spaces, not dashes |
| `deleted scenes` | Deleted Scene | |
| `interviews` | Interview | |
| `scenes` | Scene | |
| `samples` | Sample | distinct from a top-level `Sample/` (§ 1.5) |
| `shorts` | Short | |
| `featurettes` | Featurette | |
| `clips` | Clip | |
| `other` | Other | catch-all |
| `extras` | Extra | generic catch-all |
| `trailers` | Trailer | |
| `theme-music` | Theme music | `.mp3` files; doc 05 § 8.3 |
| `backdrops` | Backdrop video | rotating video backgrounds |

Anything else (e.g. `Bonus Features/`, `BTS/`, `Special Features/`,
`Featurette/` singular, `behind-the-scenes/` with dashes) is **NOT** matched
by Jellyfin and the contents won't surface as extras. Cleanup either
renames to the canonical name (when the mapping is unambiguous) or flags
for human review.

### 4.1 Canonical-name mapping (auto-rename)

| Found | Renamed to |
|---|---|
| `Featurettes/`, `Featurette/`, `FEATURETTES/` | `featurettes/` |
| `Behind The Scenes/`, `BTS/`, `behind-the-scenes/` | `behind the scenes/` |
| `Deleted Scenes/`, `Deleted_Scenes/`, `deleted-scenes/` | `deleted scenes/` |
| `Interviews/`, `Interview/` | `interviews/` |
| `Trailers/`, `Trailer/` | `trailers/` |
| `Bonus/`, `Bonus Features/`, `Bonus Material/`, `Special Features/`, `Specials/` | `extras/` (generic catch-all) |
| `Outtakes/`, `Bloopers/`, `Gag Reel/` | `extras/` (no dedicated folder) |

The `Specials/` rename to `extras/` is **important** — for a TV series,
`Specials/` looks like a season folder (Season 0 specials), but if the
files inside are featurettes rather than aired specials, putting them in
the wrong folder mis-scrapes them as episodes. When in doubt, flag.

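
The mapping table translates fairly directly into a lookup function. This is a sketch under assumptions, not the script's actual code: `canonical_extras_name` is a hypothetical name, already-canonical folder names are passed through unchanged, and `Specials/` is intentionally left unmapped so that it fails the lookup and gets flagged, following the "when in doubt, flag" caution above.

```bash
# canonical_extras_name FOLDER -> prints the canonical lowercase extras name,
# or returns 1 (caller should FLAG) when there is no unambiguous mapping.
# Hypothetical helper; "Specials" is deliberately not mapped here.
canonical_extras_name() {
  local n
  n=$(printf '%s' "${1%/}" | tr '[:upper:]' '[:lower:]')
  case "$n" in
    featurettes|featurette) echo "featurettes" ;;
    "behind the scenes"|bts|behind-the-scenes) echo "behind the scenes" ;;
    "deleted scenes"|deleted_scenes|deleted-scenes) echo "deleted scenes" ;;
    interviews|interview) echo "interviews" ;;
    trailers|trailer) echo "trailers" ;;
    bonus|"bonus features"|"bonus material"|"special features") echo "extras" ;;
    outtakes|bloopers|"gag reel") echo "extras" ;;
    scenes|samples|shorts|clips|other|extras|theme-music|backdrops) echo "$n" ;;
    *) return 1 ;;
  esac
}
```

A caller might use it as `new=$(canonical_extras_name "$dir") || echo "FLAG $dir"`.
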

### 4.2 Real-world example: Futurama download

The four Futurama season folders all contain a `Featurettes/` subfolder:

```
Futurama Season 1 [1080p AI x265 10bit FS99 Joy]/Featurettes/
├── Episode One Animatic.mkv
└── Welcome to the World of Tomorrow.mkv

Futurama Season 2 .../Featurettes/
├── Animatic -Why Must I be a Crustacean in Love.mkv
└── Futurama Game Trailer.mkv

Futurama Season 3 .../Featurettes/
├── An X-Mas Message From David X. Cohen.mkv
└── Deleted Scenes.mkv

Futurama Season 4 .../Featurettes/
├── Futurama Welcome to the World of Tomorrow (x265 Joy).mkv
├── Outtakes - Kif Gets Knocked Up a Notch [1080p x265 10bit Joy].mkv
└── Panel on Voice Actors [1080p x265 10bit Joy].mkv
```

After cleanup these become `featurettes/` (lowercase) inside the season
folder. Doc 08 (filename normalization) then renames the season folder
itself to `Season 01/` and may relocate the season-level featurettes to a
**series-level** `featurettes/` folder if the user prefers extras at the
series root (this is a doc 05 § 8 / doc 08 decision, not this doc's).

> Note: `Season 3 / Deleted Scenes.mkv` is a single file and should arguably
> be moved into a `deleted scenes/` subfolder rather than left in
> `featurettes/`. That's a manual disambiguation — flagged, not auto-moved.

---

## 5. Audit-then-clean workflow

Three-stage pipeline. Stage 1 is mandatory; stage 2 runs on user approval;
stage 3 is reversible until the quarantine retention window expires.

### 5.1 Stage 1 — Dry-run audit

Lists every file in the source release classified as KEEP / DELETE / FLAG.
Writes nothing.

```bash
# Dry-run audit on a single release dir.
cleanup-import.sh "/home/admin/Downloads/futrama/Futurama Season 1 [1080p AI x265 10bit FS99 Joy]"
```

Output (one line per file):

```
KEEP    Futurama S01E01 Space Pilot 3000 [1080p x265 10bit Joy].mkv
KEEP    folder.jpg
KEEP    Featurettes/Episode One Animatic.mkv -> featurettes/Episode One Animatic.mkv
DELETE  Encoded by JoyBell (UTR).txt [release-group brag]
DELETE  How to play HEVC (THIS FILE).txt [codec promo .txt]
DELETE  Ninite K-Lite Codecs Unattended Silent ....website [windows .website -- SECURITY]
DELETE  Futurama Compare.png [encoder compare image]
FLAG    SomeUnknownFile.bin [unknown extension]
```

A **summary** at the bottom:

```
KEEP    16 files (5.92 GiB)
DELETE  4 files (1.08 MiB)
FLAG    0 files
Run with --apply to quarantine the DELETE set.
```

Quick one-liner equivalents (for ad-hoc spot checks; the script § 9 is
preferred):

```bash
# What would I delete? (.nfo files are only candidates here; the loop
# below decides KEEP vs DELETE for each one.)
find "$SRC" \( \
    -iname '*.txt' -o -iname '*.nfo' -o -iname '*.url' -o -iname '*.website' \
    -o -iname '*.lnk' -o -iname '*.exe' -o -iname '*.msi' -o -iname '*.bat' \
    -o -iname '*.scr' -o -iname '*.ps1' -o -iname '*.cmd' -o -iname '*.com' \
    -o -iname 'Thumbs.db' -o -iname '.DS_Store' -o -iname 'Desktop.ini' \
    -o -iname '*Compare*.png' -o -iname '*Compare*.jpg' \
    -o -iname 'sample.mkv' -o -iname '*.sample.mkv' -o -iname '*-sample.mkv' \
    -o -iname '*.torrent' -o -iname '*.sfv' -o -iname '*.md5' \
  \) -print

# What looks like a real Jellyfin NFO vs a release-group brag?
# The element name must be followed by whitespace, '/', '>' or end of line.
find "$SRC" -iname '*.nfo' -print0 | while IFS= read -r -d '' f; do
  if head -c 4096 -- "$f" \
      | LC_ALL=C grep -aqE '<(movie|tvshow|episodedetails|artist|album|musicvideo|season)([[:space:]/>]|$)'; then
    printf 'KEEP   %s\n' "$f"
  else
    printf 'DELETE %s\n' "$f"
  fi
done
```

### 5.2 Stage 2 — Quarantine apply

```bash
cleanup-import.sh --apply "/home/admin/Downloads/futrama/Futurama Season 1 [...]"
```

What it does:

1. **Copies** the source directory tree to
   `/home/admin/.jellyfin-staging/<release-name>/`. The source is never
   modified.
2. Inside the staging copy, **moves** every DELETE-classified file to
   `/home/admin/.jellyfin-quarantine/<YYYY-MM-DD>/<release-name>/`,
   preserving relative paths so a user can `diff -r` to confirm.
3. **Renames** non-canonical extras subfolders to canonical lowercase
   (§ 4.1).
4. Writes a manifest at
   `/home/admin/.jellyfin-staging/<release-name>/.cleanup-manifest.json`
   listing every file action with sha256, source path, action, target
   path. This is what stage 3 reads.
5. Returns the staging path on stdout — that's the input to doc 08's
   filename normalizer.

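
Step 2 above, moving a DELETE-classified file while preserving its relative path, could look roughly like the following. It is a hedged sketch: `quarantine_file` and its three-argument interface are assumptions made for illustration, and manifest writing (step 4) is not shown.

```bash
# quarantine_file STAGING_ROOT QUARANTINE_ROOT RELATIVE_PATH
# Moves STAGING_ROOT/RELATIVE_PATH to QUARANTINE_ROOT/RELATIVE_PATH, creating
# parent directories as needed. Refuses to overwrite an existing quarantined
# file. Hypothetical helper, not taken from cleanup-import.sh.
quarantine_file() {
  local staging="$1" quarantine="$2" rel="$3"
  if [ -e "$quarantine/$rel" ]; then
    echo "already quarantined: $rel" >&2
    return 1
  fi
  mkdir -p -- "$(dirname -- "$quarantine/$rel")"
  mv -- "$staging/$rel" "$quarantine/$rel"
}
```

Because the relative path is kept intact, the quarantined file ends up at the same position under the quarantine root as it had under the staging root, which keeps the two trees easy to compare.
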

### 5.3 Stage 3 — Confirm and recycle

After the user reviews the quarantine directory and approves:

```bash
cleanup-import.sh --confirm-quarantine 2026-05-08
```

Moves `/home/admin/.jellyfin-quarantine/2026-05-08/` to the system trash
(via `gio trash`) — still recoverable, but no longer cluttering the
quarantine root. After 30 days a cron sweep empties trash older than that.

### 5.4 Never delete from source

The source download (`/home/admin/Downloads/futrama/...`) is **never**
modified by `cleanup-import.sh`. Reasons:

- The user may want to re-seed the torrent.
- The user may want to re-run cleanup with different rules later.
- Bugs in the cleanup script must never destroy original artefacts.

Source deletion is a separate manual step the user does AFTER the
import is verified in Jellyfin and the library is happy. There is no
script for it on purpose.

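
One way to gain confidence that a run left the source untouched is to fingerprint the tree before and after and compare the two listings. The sketch below reuses the `find -printf '%p %s\n' | sort` listing from rule 5 in § 0; the wrapper name `tree_fingerprint` is hypothetical, and `-printf` assumes GNU find. Note that a path-plus-size listing detects added, removed, renamed, or resized files, but not same-size content changes.

```bash
# tree_fingerprint DIR -> prints one "relative-path size" line per regular
# file, sorted bytewise so the output is stable across locales.
# Hypothetical wrapper around the listing from rule 5; requires GNU find.
tree_fingerprint() {
  ( cd -- "$1" && find . -type f -printf '%p %s\n' | LC_ALL=C sort )
}

# Example (paths are placeholders):
#   before=$(tree_fingerprint "$SRC")
#   cleanup-import.sh --apply "$SRC"
#   [ "$(tree_fingerprint "$SRC")" = "$before" ] || echo "source changed!" >&2
```
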

---

## 6. Idempotency, edge cases, and "unknown" handling

- **Idempotent.** `cleanup-import.sh --apply` on an already-cleaned staging
  directory is a no-op (nothing matches DELETE). The script detects this
  and exits 0 with `nothing to do`.
- **Re-runnable on source.** Re-running the script on the same source
  produces a fresh staging copy, overwriting (after backup) the previous
  staging directory. Quarantine is dated, so two runs on the same day for
  the same release append rather than overwrite (`<release-name>.2/`,
  `.3/`, etc.).
- **Unknown extension** (e.g. `.dat`, `.bin`, `.iso`, `.bin.txt`) — never
  auto-deleted. FLAGGED in the audit output, surfaced to the user. The
  user adds it to the local override file
  `~/.config/jellyfin-cleanup/local-rules.conf` if they want it
  classified next time.
- **Hidden dotfiles** (anything starting with `.` other than known OS
  caches like `.DS_Store`) — FLAGGED. Don't auto-delete; could be a
  legitimate `.subliminal.cache` (subtitles plugin) or similar.
- **Symlinks** — never followed. A symlink in a release directory is
  always FLAGGED; the script refuses to copy or quarantine it.
- **Permission denied** — script bails with non-zero exit. Never
  partially applies.

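
The "append rather than overwrite" behaviour for same-day reruns can be sketched as a search for the first unused directory name. The function name `next_quarantine_dir` and its argument order are assumptions for illustration; only the `<release-name>.2/`, `.3/` naming comes from the list above.

```bash
# next_quarantine_dir ROOT DATE RELEASE
# Prints ROOT/DATE/RELEASE if that path is unused, otherwise the first free
# ROOT/DATE/RELEASE.N with N starting at 2. Does not create anything.
# Hypothetical helper, not taken from cleanup-import.sh.
next_quarantine_dir() {
  local base="$1/$2/$3" candidate n=2
  candidate="$base"
  while [ -e "$candidate" ]; do
    candidate="$base.$n"
    n=$((n + 1))
  done
  printf '%s\n' "$candidate"
}
```
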
|
|
|
||

---

## 7. The `Futurama Compare.png` problem (artwork false-positive)

`Futurama Compare.png` is a 1.05 MB PNG sitting next to the season's MKV
files. To a naive image-globber it looks like artwork — same extension as
`folder.jpg`, larger than the typical poster, sitting in the right
location. It's actually an encoder comparison shot.

The rule, defined in doc 01 (artwork) and enforced here:

> **An image file in the release root is KEPT only if its name is on the
> exact recognised-artwork allow-list.** Anything else is DELETED.

Recognised artwork allow-list (top-level of an item folder):

- `folder.{jpg,jpeg,png,webp}`
- `poster.{jpg,jpeg,png,webp}`
- `cover.{jpg,jpeg,png,webp}`
- `default.{jpg,jpeg,png,webp}`
- `show.{jpg,jpeg,png,webp}` (series only)
- `jacket.{jpg,jpeg,png,webp}` (series only)
- `movie.{jpg,jpeg,png,webp}` (movies only)
- `backdrop.{jpg,jpeg,png,webp}` and `backdrop[0-9]*.{jpg,jpeg,png,webp}`
- `fanart.{jpg,jpeg,png,webp}`, `background.{jpg,jpeg,png,webp}`,
  `art.{jpg,jpeg,png,webp}`
- `logo.{png,jpg}`, `clearlogo.{png,jpg}`
- `banner.{jpg,png}`, `landscape.{jpg,png}`, `thumb.{jpg,png}`,
  `disc.{png,jpg}`, `clearart.{png,jpg}`
- `season[0-9]*-poster.{jpg,png}`, `season[0-9]*.{jpg,png}`,
  `season-specials-poster.{jpg,png}`
- `extrafanart/` and `backdrops/` directories (any contents OK)

Exception: images **inside** a recognised extras folder (`extras/`,
`featurettes/`, etc.) are KEPT regardless of name — they're presumed to be
intentional content of that extra.

`Futurama Compare.png` matches none of these allow-list patterns and is
not inside an extras folder, so it's DELETED.

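The allow-list can be expressed as a name check that splits the file name into stem and extension. This is a minimal sketch, assuming the caller passes a bare file name from the top level of an item folder and handles the extras-folder exception itself. The function name `is_recognised_artwork` and its optional `series`/`movie` argument are illustrative and do not appear in the script in § 9. The globs are as loose as the list's own patterns, so `season[0-9]*` matches anything that starts with `season` plus a digit.

```bash
# is_recognised_artwork NAME [KIND]: succeed if NAME is on the allow-list.
# KIND is "series" (default) or "movie"; both names are hypothetical.
is_recognised_artwork() {
  local name kind stem ext
  name="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
  kind="${2:-series}"
  stem="${name%.*}"
  ext="${name##*.}"
  [ "$stem" != "$name" ] || return 1   # no extension at all

  case "$stem" in
    folder|poster|cover|default|backdrop|backdrop[0-9]*|fanart|background|art)
      case "$ext" in jpg|jpeg|png|webp) return 0 ;; esac ;;
    show|jacket)
      [ "$kind" = series ] || return 1
      case "$ext" in jpg|jpeg|png|webp) return 0 ;; esac ;;
    movie)
      [ "$kind" = movie ] || return 1
      case "$ext" in jpg|jpeg|png|webp) return 0 ;; esac ;;
    logo|clearlogo|banner|landscape|thumb|disc|clearart|season[0-9]*|season-specials-poster)
      case "$ext" in jpg|png) return 0 ;; esac ;;
  esac
  return 1
}

# Usage:
#   is_recognised_artwork "Futurama Compare.png" || echo "not artwork: DELETE"
```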

---

## 8. Security rules

The single most important rule in this document:

> **Windows-executable extensions and Internet Shortcut formats are
> auto-deleted, never quarantined for "review", because the threat model
> isn't the Linux server, it's the Jellyfin user who downloads them.**

Jellyfin has a "Download original file" button for every item. If a
release contains `Codec Installer.exe`, Jellyfin will happily serve it to
any user with library access — including the friend on Windows who might
not understand that downloading and running an `.exe` from a media library
is a terrible idea. We don't trust the upload chain (the release group),
so we strip these on the server side.

Exhaustive security list. Entries are auto-deleted unless marked FLAG, and
auto-deleted entries bypass the "FLAG unknown" rule:

| Pattern | Risk |
|---|---|
| `*.exe` | Windows executable. Direct code execution on download+run. |
| `*.msi` | Windows Installer package. Silent install possible. |
| `*.bat`, `*.cmd` | Windows batch script. Runs in `cmd.exe`. |
| `*.com` | Old DOS-style executable. Still runs on modern Windows. |
| `*.scr` | Windows screensaver = .exe in disguise. Classic malware vector. |
| `*.ps1` | PowerShell script. Common modern malware delivery. |
| `*.vbs`, `*.wsf`, `*.hta`, `*.js` (Windows Script Host) | Active scripting. |
| `*.jar` | Java archive — runs as `java -jar` on systems with JRE. |
| `*.dll`, `*.sys` | Windows libraries / drivers. Side-load attacks. |
| `*.url`, `*.website`, `*.lnk` | Internet Shortcut / Windows Shortcut. Points at attacker-controlled URL. |
| `*.iso`, `*.img` (in a media folder, not at the library root) | Mountable disk image. Can carry Windows installers. **FLAG, not auto-delete** — could legitimately be a DVD rip. |
| `*.app/` | macOS application bundle. Auto-deleted. |
| `Autorun.inf` | Windows autorun config. **AUTO-DELETE.** |

Total auto-delete categories that are **purely** security-driven (not
Jellyfin-irrelevance-driven): **15** — `.exe`, `.msi`, `.bat`, `.cmd`,
`.com`, `.scr`, `.ps1`, `.vbs`, `.wsf`, `.hta`, `.jar`, `.dll`, `.sys`,
`.url`/`.website`/`.lnk`, `Autorun.inf`. Plus 1 flagged for human review:
`.iso`/`.img`.

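A useful companion to the auto-delete list is a read-only sweep that confirms none of these file types reached a given tree. The sketch below assumes GNU `find` (for `-iname`) and uses a hypothetical function name, `find_dangerous`. It covers the file patterns from the table, including `*.js`; it leaves out `*.app/` bundles (directories) and the FLAG-only `*.iso`/`*.img` row.

```bash
# find_dangerous DIR: list files (and symlinks) under DIR whose names match
# the security table's file patterns, case-insensitively. Read-only.
find_dangerous() {
  find "$1" \( -type f -o -type l \) \( \
       -iname '*.exe' -o -iname '*.msi' -o -iname '*.bat' -o -iname '*.cmd' \
    -o -iname '*.com' -o -iname '*.scr' -o -iname '*.ps1' -o -iname '*.vbs' \
    -o -iname '*.wsf' -o -iname '*.hta' -o -iname '*.js'  -o -iname '*.jar' \
    -o -iname '*.dll' -o -iname '*.sys' -o -iname '*.url' -o -iname '*.website' \
    -o -iname '*.lnk' -o -iname 'autorun.inf' \) -print
}
```

Running `find_dangerous /home/user/media` against the live tree should print nothing if the import rules held; any output is a file that slipped through.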
### 8.1 Why `.url` is in the security list

`.url` is a plain-text Internet Shortcut. On Windows, double-clicking it
opens the target in the default browser. The "target" is whatever the
release group put in the `URL=` line. Historically this was used to push
codec-pack download pages with bundled adware. There is no benign reason
for a `.url` to ship in a media release.

The Futurama release contains exactly this pattern:

```
[InternetShortcut]
URL=https://ninite.com/klitecodecs/
```

Ninite itself is reputable — but the principle is "do not ship clickable
URLs to third-party installers in a media library, ever".

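Because a `.url` file is plain text, its target can be read safely on the Linux side, for example to log where a quarantined shortcut pointed. The following is a small sketch, assuming the key is spelled exactly `URL=` at the start of a line and that the file may use CRLF line endings; `shortcut_target` is a hypothetical helper name.

```bash
# shortcut_target FILE: print the value of the first "URL=" line in an
# Internet Shortcut file, with any carriage returns removed.
shortcut_target() {
  tr -d '\r' < "$1" | sed -n 's/^URL=//p' | head -n 1
}

# Usage:
#   shortcut_target "Codec Pack.url"
```

The function only reads the file; the rule above still applies, and the shortcut itself is deleted regardless of where it points.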
### 8.2 The `RARBG_DO_NOT_MIRROR.exe` historic case

Some releases historically contained a file named
`RARBG_DO_NOT_MIRROR.exe`, ostensibly to discourage mirror sites from
re-uploading. In several documented cases this file was actually adware
or a cryptominer. Auto-delete, no questions asked.

---

## 9. Prepared cleanup script — `cleanup-import.sh`

Idempotent. Dry-run by default. Quarantine-first. Source-immutable.
With `--apply`, prints the staging path as the last line of stdout (after
the audit report) for handing off to doc 08's normalizer.

Save to `bin/cleanup-import.sh` in the `jellyfin-stack` repo.

```bash
#!/usr/bin/env bash
# cleanup-import.sh — Pre-import cleanup for tv.s8n.ru
# Version 1.0 (2026-05-08) — see docs/07-pre-import-cleanup.md
#
# Usage:
#   cleanup-import.sh SRC                              # dry-run
#   cleanup-import.sh --apply SRC                      # quarantine
#   cleanup-import.sh --confirm-quarantine YYYY-MM-DD  # recycle
#
# Exit codes:
#   0  success / nothing to do
#   1  user error (bad args, source not found)
#   2  internal error (permission, partial state)
#   3  flagged files present — user must review before --apply
set -euo pipefail

STAGING_ROOT="${JELLYFIN_STAGING_ROOT:-$HOME/.jellyfin-staging}"
QUARANTINE_ROOT="${JELLYFIN_QUARANTINE_ROOT:-$HOME/.jellyfin-quarantine}"
TODAY="$(date +%Y-%m-%d)"

# ----- classification -----
# Returns one of: KEEP DELETE FLAG
classify() {
  local path="$1"
  local base
  base="$(basename "$path")"
  local lower
  lower="$(printf '%s' "$base" | tr '[:upper:]' '[:lower:]')"

  # Security overrides — bypass everything else
  case "$lower" in
    *.exe|*.msi|*.bat|*.cmd|*.com|*.scr|*.ps1|*.vbs|*.wsf|*.hta|*.jar|*.dll|*.sys) echo DELETE; return ;;
    *.url|*.website|*.lnk) echo DELETE; return ;;
    autorun.inf) echo DELETE; return ;;
  esac

  # OS junk
  case "$lower" in
    thumbs.db|ehthumbs.db|ehthumbs_vista.db|.ds_store|desktop.ini|.directory) echo DELETE; return ;;
    ._*) echo DELETE; return ;;
  esac

  # Sample files (checked before the media KEEP rule below, which would
  # otherwise match them first and keep them)
  case "$lower" in
    sample.mkv|sample.mp4|sample.avi|sample.m4v) echo DELETE; return ;;
    *-sample.mkv|*-sample.mp4|*.sample.mkv|*.sample.mp4|*_sample.mkv|*_sample.mp4) echo DELETE; return ;;
  esac

  # Media — KEEP
  case "$lower" in
    *.mkv|*.mp4|*.avi|*.m4v|*.ts|*.mov|*.webm|*.wmv|*.flv|*.mpg|*.mpeg) echo KEEP; return ;;
    *.srt|*.ass|*.ssa|*.vtt|*.sup|*.idx|*.sub) echo KEEP; return ;;
    *.mp3|*.flac|*.ogg|*.opus|*.m4a|*.wav) echo KEEP; return ;;
  esac

  # Recognised artwork at item root
  case "$lower" in
    folder.jpg|folder.jpeg|folder.png|folder.webp) echo KEEP; return ;;
    poster.jpg|poster.jpeg|poster.png|poster.webp) echo KEEP; return ;;
    cover.jpg|cover.jpeg|cover.png|cover.webp) echo KEEP; return ;;
    default.jpg|default.png|show.jpg|show.png|jacket.jpg|jacket.png|movie.jpg|movie.png) echo KEEP; return ;;
    backdrop.jpg|backdrop.png|backdrop[0-9]*.jpg|backdrop[0-9]*.png) echo KEEP; return ;;
    fanart.jpg|fanart.png|background.jpg|background.png|art.jpg|art.png) echo KEEP; return ;;
    logo.png|logo.jpg|clearlogo.png|clearlogo.jpg|banner.jpg|banner.png) echo KEEP; return ;;
    landscape.jpg|landscape.png|thumb.jpg|thumb.png|disc.png|disc.jpg|clearart.png|clearart.jpg) echo KEEP; return ;;
    season[0-9]*-poster.jpg|season[0-9]*-poster.png|season[0-9]*.jpg|season[0-9]*.png) echo KEEP; return ;;
    season-specials-poster.jpg|season-specials-poster.png) echo KEEP; return ;;
  esac

  # Promo images masquerading as art
  case "$lower" in
    *compare*.png|*compare*.jpg|*compare*.jpeg|*compare*.webp|*compare*.gif) echo DELETE; return ;;
    *sample*.png|*sample*.jpg|*sample*.jpeg) echo DELETE; return ;;
    *screen*.png|*screen*.jpg|*preview*.png|*preview*.jpg) echo DELETE; return ;;
  esac

  # Text-flavoured junk
  case "$lower" in
    *.txt|*.diz|file_id.diz) echo DELETE; return ;;
  esac

  # Torrent residue
  case "$lower" in
    *.torrent|*.magnet|*.parts|*.aria2|*.meta) echo DELETE; return ;;
    *.pad|__padding_file_*|_____padding_file_*) echo DELETE; return ;;
    *.sfv|*.md5|*.sha1|*.sha256) echo DELETE; return ;;
  esac

  # NFO discriminator — KEEP if Jellyfin-compatible XML, else DELETE
  case "$lower" in
    *.nfo)
      # Collapse whitespace runs (rather than deleting them) so a root tag
      # with attributes, e.g. <tvshow xmlns=...>, still ends at a word boundary.
      if head -c 4096 "$path" | tr -s '[:space:]' ' ' \
          | grep -qE '<(movie|tvshow|episodedetails|artist|album|musicvideo|season)\b'; then
        echo KEEP
      else
        echo DELETE
      fi
      return
      ;;
  esac

  # Suspicious archives in a media folder
  case "$lower" in
    *.rar|*.r[0-9][0-9]|*.zip|*.7z|*.tar|*.tar.gz|*.iso|*.img) echo FLAG; return ;;
  esac

  echo FLAG
}

# ----- folder classification -----
# Returns one of: KEEP_AS-IS RENAME:<target> DELETE FLAG
classify_dir() {
  local d="$1"
  local lower
  lower="$(basename "$d" | tr '[:upper:]' '[:lower:]')"
  case "$lower" in
    behind\ the\ scenes|deleted\ scenes|interviews|scenes|samples|shorts|featurettes|clips|other|extras|trailers|theme-music|backdrops)
      echo "RENAME:$lower"; return ;;
    bts|behind-the-scenes) echo "RENAME:behind the scenes"; return ;;
    deleted-scenes|deleted_scenes) echo "RENAME:deleted scenes"; return ;;
    bonus|bonus\ features|bonus\ material|special\ features|outtakes|bloopers|gag\ reel) echo "RENAME:extras"; return ;;
    proof|screens|screenshots|caps|preview|previews) echo DELETE; return ;;
    sample) echo DELETE; return ;;
    .fseventsd|.spotlight-v100|.trashes|\$recycle.bin|system\ volume\ information) echo DELETE; return ;;
    extrafanart) echo "RENAME:extrafanart"; return ;;  # case stays, recognised
    *) echo FLAG; return ;;
  esac
}

# ----- main -----
APPLY=0
CONFIRM_DATE=""
SRC=""

while [[ $# -gt 0 ]]; do
  case "$1" in
    --apply) APPLY=1; shift ;;
    --confirm-quarantine)
      # Require the date argument; a bare "$2" would trip `set -u`.
      [[ $# -ge 2 ]] || { echo "--confirm-quarantine needs YYYY-MM-DD" >&2; exit 1; }
      CONFIRM_DATE="$2"; shift 2 ;;
    -h|--help) sed -n '2,14p' "$0"; exit 0 ;;
    -*) echo "unknown flag: $1" >&2; exit 1 ;;
    *) SRC="$1"; shift ;;
  esac
done

if [[ -n "$CONFIRM_DATE" ]]; then
  # Accept only a literal date so the argument cannot point outside QUARANTINE_ROOT.
  [[ "$CONFIRM_DATE" =~ ^[0-9]{4}-[0-9]{2}-[0-9]{2}$ ]] \
    || { echo "bad date: $CONFIRM_DATE" >&2; exit 1; }
  if [[ -d "$QUARANTINE_ROOT/$CONFIRM_DATE" ]]; then
    gio trash "$QUARANTINE_ROOT/$CONFIRM_DATE"
    echo "Recycled $QUARANTINE_ROOT/$CONFIRM_DATE"
  else
    echo "No quarantine for $CONFIRM_DATE" >&2; exit 1
  fi
  exit 0
fi

[[ -n "$SRC" && -d "$SRC" ]] || { echo "usage: $0 [--apply] SRC" >&2; exit 1; }
SRC="${SRC%/}"  # drop a trailing slash so the prefix stripping below lines up

RELEASE="$(basename "$SRC")"
STAGE="$STAGING_ROOT/$RELEASE"
QUAR="$QUARANTINE_ROOT/$TODAY/$RELEASE"

declare -i KEEP_N=0 DEL_N=0 FLAG_N=0

# Walk source, classify each entry
while IFS= read -r -d '' f; do
  # Quote $SRC inside the expansion: release names contain [...] which would
  # otherwise be read as a glob pattern and leave the prefix in place.
  rel="${f#"$SRC"/}"
  # Symlinks are never followed, copied, or quarantined; flag them for review.
  if [[ -L "$f" ]]; then
    printf 'FLAG %s [symlink]\n' "$rel"; FLAG_N+=1
    continue
  fi
  if [[ -d "$f" ]]; then
    case "$(classify_dir "$f")" in
      KEEP_AS-IS|RENAME:*) ;;
      DELETE) printf 'DELETE %s/ [junk dir]\n' "$rel"; DEL_N+=1 ;;
      FLAG) printf 'FLAG %s/ [unknown dir name]\n' "$rel"; FLAG_N+=1 ;;
    esac
    continue
  fi
  case "$(classify "$f")" in
    KEEP) printf 'KEEP %s\n' "$rel"; KEEP_N+=1 ;;
    DELETE) printf 'DELETE %s\n' "$rel"; DEL_N+=1 ;;
    FLAG) printf 'FLAG %s\n' "$rel"; FLAG_N+=1 ;;
  esac
done < <(find "$SRC" -mindepth 1 -print0)

echo "---"
echo "KEEP $KEEP_N"
echo "DELETE $DEL_N"
echo "FLAG $FLAG_N"

if (( FLAG_N > 0 )); then
  echo "FLAG count > 0; review before re-running with --apply." >&2
  (( APPLY == 0 )) || exit 3
fi

if (( APPLY == 0 )); then
  echo "Dry run only. Re-run with --apply to quarantine."
  exit 0
fi

# --- APPLY path: copy to staging, move DELETE to quarantine ---
mkdir -p "$STAGE" "$QUAR"
# rsync -a preserves perms and is idempotent
rsync -a --delete "$SRC/" "$STAGE/"

while IFS= read -r -d '' f; do
  # Skip paths that already moved along with a quarantined or renamed parent.
  [[ -e "$f" ]] || continue
  rel="${f#"$STAGE"/}"
  if [[ -d "$f" ]]; then
    res="$(classify_dir "$f")"
    case "$res" in
      RENAME:*)
        target="${res#RENAME:}"
        parent="$(dirname "$f")"
        [[ "$(basename "$f")" == "$target" ]] || mv "$f" "$parent/$target"
        ;;
      DELETE)
        mkdir -p "$QUAR/$(dirname "$rel")"
        mv "$f" "$QUAR/$rel"
        ;;
    esac
    continue
  fi
  case "$(classify "$f")" in
    DELETE)
      mkdir -p "$QUAR/$(dirname "$rel")"
      mv "$f" "$QUAR/$rel"
      ;;
  esac
done < <(find "$STAGE" -mindepth 1 -print0)

# Manifest
{
  echo "{"
  echo "  \"release\": \"$RELEASE\","
  echo "  \"date\": \"$TODAY\","
  echo "  \"source\": \"$SRC\","
  echo "  \"staging\": \"$STAGE\","
  echo "  \"quarantine\": \"$QUAR\""
  echo "}"
} > "$STAGE/.cleanup-manifest.json"

# Stdout: the staging path, for piping to doc 08's normalizer
echo "$STAGE"
```

### 9.1 Pipeline integration

```bash
# Full pre-import flow:
SRC="/home/admin/Downloads/futrama/Futurama Season 1 [1080p AI x265 10bit FS99 Joy]"
# The audit report also goes to stdout, so keep only the last line (the staging path).
STAGING="$(cleanup-import.sh --apply "$SRC" | tail -n 1)"
# STAGING is now ~/.jellyfin-staging/Futurama Season 1.../ with junk gone.
# Hand off to doc 08:
normalize-filenames.sh "$STAGING"
# Then move to live media tree (manual; doc 05 confirms layout):
mv "$STAGING" "/home/user/media/tv/Futurama (1999)/Season 01"
```

The `mv` to the live tree is **deliberately manual**. Cleanup and rename
are reproducible from source; the move into `/home/user/media/` is the
point of no return and the user runs it consciously.

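Since the script prints its audit report on stdout before the staging path, a wrapper can take the last line and refuse to continue unless it names a real directory. This is a sketch under assumptions: `staging_path` is a hypothetical helper, and it relies only on the staging path being the final line of stdout. If the run stops early (for example with exit code 3), the last line is not a directory and the helper fails, so nothing is handed to the normalizer.

```bash
# staging_path: read cleanup output on stdin and print its last line, but
# only if that line names an existing directory (hypothetical helper).
staging_path() {
  local last
  last="$(tail -n 1)"
  [ -n "$last" ] && [ -d "$last" ] || return 1
  printf '%s\n' "$last"
}

# Usage (assumes SRC is set and both scripts are on PATH):
#   if STAGING="$(cleanup-import.sh --apply "$SRC" | staging_path)"; then
#     normalize-filenames.sh "$STAGING"
#   else
#     echo "cleanup did not produce a staging directory" >&2
#   fi
```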

---

## 10. What this doc explicitly does NOT do

- **Filename normalization** — that's doc 08. This doc only deletes; doc 08
  renames `Futurama S01E01 Space Pilot 3000 [1080p x265 10bit Joy].mkv`
  into the canonical `Futurama (1999) - S01E01 - Space Pilot 3000.mkv`.
- **Subtitle reconciliation** — doc 03 covers per-language naming; this
  doc only deletes obsolete formats (`.smi`, `.rt`).
- **Library refresh** — after files land in `/home/user/media/`, run
  `POST /Library/Refresh` on the Jellyfin API (doc 02 § 2). Cleanup never
  touches the running container.
- **NFO writing** — doc 02 § 11 covers writing override NFOs. This doc
  only filters incoming NFOs.
- **Source deletion** — never. The source download is read-only to this
  pipeline; the user removes it manually post-import.

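For reference, the library refresh mentioned in the list above can be triggered with `curl`. This is a hedged sketch, not the procedure from doc 02: it assumes an API key created in the Jellyfin dashboard, assumes the server accepts the `Authorization: MediaBrowser Token="..."` header for API keys, and takes the base URL as an argument. The function name `refresh_library` and the `CURL` override variable are hypothetical.

```bash
# refresh_library BASE_URL API_KEY: send POST /Library/Refresh.
# CURL may be overridden (e.g. CURL=echo) to preview the request.
refresh_library() {
  local base_url="$1"
  local api_key="$2"
  "${CURL:-curl}" --fail --silent --show-error -X POST \
    -H "Authorization: MediaBrowser Token=\"$api_key\"" \
    "${base_url%/}/Library/Refresh"
}

# Usage (base URL and key variable are placeholders):
#   refresh_library "https://tv.s8n.ru" "$JELLYFIN_API_KEY"
```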

---

## 11. TL;DR

| Step | What | Where |
|---|---|---|
| 1 | Audit (dry-run) | `cleanup-import.sh "$SRC"` |
| 2 | Apply (quarantine) | `cleanup-import.sh --apply "$SRC"` → prints staging path |
| 3 | Review quarantine | `ls ~/.jellyfin-quarantine/$(date +%F)/` |
| 4 | Normalize filenames | doc 08, takes staging path as input |
| 5 | Move to live tree | manual `mv "$STAGING" /home/user/media/...` |
| 6 | Refresh library | `POST /Library/Refresh` (doc 02) |
| 7 | Confirm quarantine | `cleanup-import.sh --confirm-quarantine YYYY-MM-DD` |
| 8 | Delete source | manual, only after Jellyfin shows the item correctly |

The hard rule, repeated: **the source download is never modified, the live
media tree is never written by cleanup, and Windows executables never
reach a Jellyfin user's browser.**