From 6a54cae82e57ac0c25d001bde524db77d7d1f09a Mon Sep 17 00:00:00 2001
From: s8n
Date: Fri, 8 May 2026 17:47:28 +0100
Subject: [PATCH] doc 22: jellyfin runtime perf audit (read-only)

Server-runtime focus; supplements doc 13. Headline: 4 concurrent ffmpeg
processes for ONE viewer, all transcoding 1080p->2160p with PGS subtitle
burn-in, on an uncapped jellyfin container sharing a 12-core host with an
uncapped Forgejo BlueBuild CI runner (88-99 % CPU). Load avg 15.4 on 12
cores. Throttling+SegmentDeletion still off (doc 13 finding 03 now
non-optional). Top quick-win: enable transcode throttling + segment
deletion + cap RemoteClientBitrateLimit.
---
 docs/22-jellyfin-runtime-perf-audit.md | 517 +++++++++++++++++++++++++
 1 file changed, 517 insertions(+)
 create mode 100644 docs/22-jellyfin-runtime-perf-audit.md

diff --git a/docs/22-jellyfin-runtime-perf-audit.md b/docs/22-jellyfin-runtime-perf-audit.md
new file mode 100644
index 0000000..a34d975
--- /dev/null
+++ b/docs/22-jellyfin-runtime-perf-audit.md
@@ -0,0 +1,517 @@

# 22 — Jellyfin Runtime Performance Audit (server scope)

> Status: **read-only audit**, executed 2026-05-08 ~17:30–17:45 BST against
> `https://arrflix.s8n.ru` (Jellyfin 10.10.3 on nullstone, container `jellyfin`).
> Scope: server runtime — CPU, RAM, container limits, FFmpeg, scheduled
> tasks, plugins. Network/edge, storage, and color/HDR are out of scope
> (sibling agents). Supplements doc 13 (2026-05-08, host-capacity scan);
> does not repeat findings already in 13 unless the data has materially
> changed. **No fixes applied. No state mutated. No container restart.**

---

## 1. Executive summary — top 3 perf culprits

| # | Culprit | Severity | Evidence (one line) |
|---|---|:-:|---|
| 1 | **4 concurrent ffmpeg processes for ONE viewer**, each upscaling 1080p → 2160p with PGS subtitle burn-in, no throttling, no segment deletion | **CRITICAL** | `ps`: PIDs 1681949 (643 % CPU), 1685275 (135 %), 1685316 (133 %), 1685478 (132 %) — all transcoding `Rick and Morty S01E01.mkv`, all `-vf scale=3840:2160` + `[0:4]overlay` subtitle burn. Container CPU 690–876 % across 3 samples |
| 2 | **Forgejo BlueBuild CI container running uncapped on the same 12-core host** (noisy neighbor) | **HIGH** | `docker stats`: `FORGEJO-ACTIONS-TASK-202_..._Build-push-OCI` 88–99 % CPU, 4.3 GiB RAM, 5 GB net-in. Both jellyfin and the build container have `Memory=0 NanoCpus=0 CpuQuota=0` (no limits). Aggregate load 15.43 / 14.61 / 8.85 on 12 cores |
| 3 | **GPU acceleration still off** (already in doc 13 finding 02; quantified here) — every CPU transcode spawns one ffmpeg burning 6–8 cores per stream because of the 4K-upscale + sub-overlay filtergraph | **HIGH** | `HardwareAccelerationType=none`. Per-ffmpeg cost on this filtergraph: ~6.4 cores at `preset=veryfast`. 2 viewers transcoding = full host pegged |

**Biggest quick-win:** turn on **transcode throttling + segment deletion**
(doc 13 finding 03 already flags this; new evidence here makes it
non-optional). The 4-stream pile-up in §3 is exactly what those two
flags exist to prevent — without them, every client seek/reload spawns a
fresh ffmpeg and the previous one keeps burning a core for up to 720 s
(`SegmentKeepSeconds=720`). Two checkbox flips in Playback settings.
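Both flags live in the server's encoding options and can be verified
read-only before the UI flips. A minimal sketch; the element names match
what Jellyfin 10.10 writes, but the exact in-container path
(`/config/config/encoding.xml`) is an assumption about this image's layout:

```
# Read-only: confirm the two flags and their companion timers as currently set.
docker exec jellyfin grep -E \
  'EnableThrottling|ThrottleDelaySeconds|EnableSegmentDeletion|SegmentKeepSeconds' \
  /config/config/encoding.xml
# Expected today (per doc 13 findings 03/05):
#   <EnableThrottling>false</EnableThrottling>
#   <EnableSegmentDeletion>false</EnableSegmentDeletion>
```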
---

## 2. Resource snapshot (3 samples, 10 s apart)

| Sample @time | jellyfin CPU% | jellyfin MEM | NET I/O | BLOCK I/O | PIDs |
|---|---:|---:|---:|---:|---:|
| t=0 | **834.3 %** | 2.635 GiB / 31.27 GiB (8.42 %) | 5.36 / 158 MB | 1.14 / 855 MB | 101 |
| t=10s | **690.5 %** | 2.637 GiB | 5.37 / 158 MB | 1.22 / 894 MB | 102 |
| t=20s | **876.7 %** | 2.646 GiB | 5.37 / 158 MB | 1.32 / 942 MB | 101 |

**Container limits:** `Memory=0 NanoCpus=0 CpuQuota=0 CpuPeriod=0
PidsLimit= RestartPolicy=unless-stopped`. **No CPU or RAM cap on
the jellyfin container.** Same for the Forgejo build container.

**Host (nullstone, AMD Ryzen 5 2600X (6c/12t = 12 logical cores), 32 GiB RAM, 24 GiB swap):**
- `uptime`: load avg **15.43 / 14.61 / 8.85** — 1-min load ~29 % above
  core count. 5-min trend confirms sustained load. Doc 13 logged 11.40 /
  9.59 / 6.19 ~13 h ago, so the host has been getting *worse*, not better.
- `free -h`: 31 GiB total, 10 GiB used, 8.2 GiB free, 13 GiB buff/cache;
  swap **7.8 GiB / 24 GiB used** (32 %). `SwapCached=771 MB` (kernel is
  actively servicing swap-in from cache — i.e. a swap-thrash signature).
- `vmstat 1 5`: `r=3–27`, `cs=30 K–41 K/s` (very high context-switch
  rate), `si≤24 KB/s so≈0` (paging in but not out — recovering, not
  thrashing right this second), `us=70–72 % sy=10–13 % id=16–18 %
  wa=0 %`.
- `iostat -x`: `nvme0n1` w/s ≈ 38–433, `wkB/s` ≈ 364–2 272, util `0.4 %–
  0.9 %`. **Disk is not the bottleneck — CPU is.**

**All-container CPU% (sorted, top 5):**

| Container | CPU% | MEM | Notes |
|---|---:|---:|---|
| jellyfin | **773–876** | 2.6 GiB | this audit's target |
| FORGEJO-ACTIONS-TASK-202_..._Build-push-OCI | **88–99** | 4.3 GiB | uncapped CI build, see §3 culprit 2 |
| traefik | 9 | 48 MiB | routine reverse proxy |
| forgejo | 9 | 207 MiB | git web |
| minecraft-mc | 7 | 4 GiB | racked.ru server |
| (28 other containers) | < 5 % combined | | none material |

The two CPU monsters together (jellyfin + bluebuild) account for **~90 %
of the 12-core host's user time** during this audit window.

---

## 3. Active sessions + active transcodes

**Sessions (within last 600 s):** **1**

| User | Client | Device | RemoteIP | NowPlaying | PlayMethod | Pos |
|---|---|---|---|---|---|---|
| s8n | Jellyfin Web | Chrome | 192.168.0.10 | Rick and Morty S01E01 — Pilot | DirectPlay (claimed) / **Transcoding** (actual) | 8 s |

**TranscodingInfo on the active session:**

```
VideoCodec → h264 (libx264, preset=veryfast, crf=23)
AudioCodec → aac (libfdk_aac, 256 kbps stereo, +6 dB volume gain)
Resolution → 3840 × 2160 (UPSCALE — source is 1080p)
Bitrate → 13.8 Mbps
Container → fmp4 / hls
HW → none
Reasons → VideoCodecNotSupported, AudioCodecNotSupported, SubtitleCodecNotSupported
Direct → IsVideoDirect=False, IsAudioDirect=False
Completion → 0 % (just started)
```

**Active ffmpeg processes on host: 4** (all for the same viewer, same
file — see §5).

The session reports `PlayMethod=DirectPlay` while *also* presenting a
`TranscodingInfo` block — Jellyfin's session DTO carries the last-set
state, so this is the client navigating into the page; the actual
decision was **transcode** (the 4 ffmpeg processes confirm it). The HLS
player sometimes flips `PlayMethod=Transcode` only after the first
segment downloads; the pre-roll state matches the 4-process pile-up in §5.
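The claimed-vs-actual mismatch is cheap to re-check from the shell. A
minimal read-only sketch against the Sessions endpoint; `TOKEN` is a
placeholder API key, and the `jq` paths assume the standard 10.10
session DTO:

```
# One session expected; surface claimed play method vs actual transcode state.
curl -s -H 'X-Emby-Token: TOKEN' \
  'https://arrflix.s8n.ru/Sessions?activeWithinSeconds=600' |
  jq '.[] | {user: .UserName,
             claimed: .PlayState.PlayMethod,
             video_direct: .TranscodingInfo.IsVideoDirect,
             reasons: .TranscodingInfo.TranscodeReasons}'
```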
---

## 4. Scheduled tasks

All tasks **Idle**. None in progress. Last-run durations are tiny — no
scheduled task is the culprit. Library scan runs every 6 h (last
`14:14:04`, 0.3 s wall — only 187 items, so it converges instantly).

| Name | State | Last end (UTC+1) | Last duration | Trigger |
|---|---|---|---:|---|
| Audio Normalization | Idle | 2026-05-08T00:58 | 0.0 s | IntervalTrigger |
| Clean Cache Directory | Idle | 2026-05-08T00:58 | 0.1 s | IntervalTrigger |
| Clean Log Directory | Idle | 2026-05-08T00:58 | 0.0 s | IntervalTrigger |
| Clean Transcode Directory | Idle | 2026-05-08T16:22 | 0.0 s | StartupTrigger |
| Clean up collections and playlists | Idle | 2026-05-08T16:22 | 0.0 s | StartupTrigger |
| Download missing lyrics | Idle | 2026-05-08T00:58 | 0.1 s | IntervalTrigger |
| Download missing subtitles | Idle | 2026-05-08T00:58 | 0.0 s | IntervalTrigger |
| Extract Chapter Images | Idle | 2026-05-08T01:00 | 0.0 s | DailyTrigger |
| Generate Trickplay Images | Idle | 2026-05-08T02:00 | 0.1 s | DailyTrigger |
| Media Segment Scan | Idle | 2026-05-08T14:14 | 0.0 s | IntervalTrigger |
| Optimize database | Idle | 2026-05-08T00:58 | 0.2 s | IntervalTrigger |
| Refresh Guide | Idle | 2026-05-08T00:58 | 3.2 s | IntervalTrigger |
| Refresh People | Idle | 2026-05-08T00:58 | 0.3 s | IntervalTrigger |
| Scan Media Library | Idle | 2026-05-08T14:14 | 0.3 s | IntervalTrigger |
| TasksRefreshChannels | Idle | 2026-05-08T00:58 | 0.1 s | IntervalTrigger |
| Update Plugins | Idle | 2026-05-08T16:22 | 1.2 s | StartupTrigger |
| Clean Activity Log / Keyframe Extractor / Migrate Trickplay Image Location | Idle | (never run) | — | — |

**Container restarted at 16:22:06 today** (StartupTrigger task end-times
imply a restart — last audit had `StartedAt=02:13:01`, and doc 13 finding
30 expected 0 restarts). Operator likely restarted the container at
~16:22, roughly 80 minutes before this audit window — not material to
perf but worth noting.

**Verdict:** culprit (a) "scheduled task hogging CPU" → **ruled out**.

---

## 5. FFmpeg processes on host (snapshot)

**4 simultaneous ffmpeg processes, all transcoding the same source for
the same viewer.** This is the smoking gun. Process tree from the
container shows just `1 jellyfin` (parent) + `1579 ffmpeg` + `1725
ffmpeg` (the others are still spawning); host `ps -ef` shows four
ffmpeg processes owned by `user` uid 1000.

| PID | %CPU | %MEM | RSS | etime | What | Subs filter |
|---:|---:|---:|---:|---:|---|---|
| 1681949 | **643** | 6.9 | 2.27 GB | 53 s | `-ss 33s` HLS seek | **yes** — `[0:4]scale,scale=3840:2160:fast_bilinear[sub] ; [0:0]scale=3840:2160 [main] ; overlay` |
| 1685275 | **135** | 4.4 | 1.45 GB | 6 s | `-ss 15s` HLS seek | yes — same chain |
| 1685316 | **133** | 4.4 | 1.45 GB | 6 s | full transcode (no -ss) | no — plain `setparams + scale + format=yuv420p` |
| 1685478 | **132** | 3.9 | 1.29 GB | 4 s | full transcode `-canvas_size 1920x1080` | yes — same chain |
| 1669243 (earlier sample, since exited) | ~759 | 7.0 | 2.30 GB | 254 s | full transcode | no |

**What every ffmpeg is doing:**
- Decoding source 1080p H.265 (or H.264 — Pilot is an x264 Blu-ray rip).
- **Upscaling video to 3840×2160 with `scale=...:fast_bilinear`.**
- **Burning PGS subtitle stream `0:4`, ALSO upscaled to 3840×2160, onto
  the video.** This is the heaviest overlay path the JF filtergraph
  produces.
- Re-encoding to H.264 `libx264 preset=veryfast crf=23 high@L5.1` with
  `maxrate=13.5 Mbps`.
- `-threads 0` (= use all cores), `-max_muxing_queue_size 2048`.
- HLS fmp4 segments to `/cache/transcodes/.mp4`.
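The per-process figures above can be re-captured in one pass. A minimal
read-only sketch; the cache path is assumed from this container's bind
layout, and the prefix grouping is deliberately crude:

```
# Live ffmpeg inventory: CPU, RSS and age per process (read-only).
ps -eo pid,pcpu,pmem,rss,etimes,args --sort=-pcpu | awk 'NR==1 || /[f]fmpeg/'

# How many of them carry an overlay stage (overlay => subtitle burn-in).
ps -eo args | grep '[f]fmpeg' | grep -c overlay

# Distinct HLS session prefixes still holding segments in the transcode
# cache (strips trailing segment numbers; good enough to count sessions).
docker exec jellyfin sh -c \
  'ls /cache/transcodes | sed "s/[0-9]*\.mp4$//;s/\.m3u8$//" | sort | uniq -c'
```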
**Why 4 of them at once for one user:** every time the client seeks or
reloads, JF starts a new ffmpeg with a new sessionId and a new segment
file prefix. Because `EnableThrottling=false` and
`EnableSegmentDeletion=false` (doc 13 findings 03/05), the old ffmpeg
keeps producing segments to its own cache prefix and **does not exit
until `SegmentKeepSeconds=720` elapses**. Five observed cache prefixes
right now: `8e8a8538…`, `ef1caecc…` (already produced segments 0–30 →
~73 MiB), `3ba3fce4…`, `b6f150cb…`, `fcc6137e…` — five session-IDs
across the last ~5 minutes for one viewer.

**Why each ffmpeg is so expensive:**
- 1080p → 4K upscale ≈ 4× pixel volume.
- PGS subtitle stream is also being scaled to 4K and overlaid (alpha
  blend) every frame.
- `libfdk_aac` 256 kbps is fine; the cost is essentially all video.
- On 12 logical cores at `preset=veryfast`, this filtergraph costs
  **~6.4 cores per ffmpeg** (643 % observed). Two simultaneous
  transcodes = full host. Four = swap thrash + the load avg of 15.

**Why is it upscaling to 4K at all?** Likely the client requested a
profile that picked the "max bitrate / max-resolution" capability of
the device (a desktop Chrome will report 4K-capable). The Jellyfin
ladder is either (a) "always pick highest profile" or (b) the user's
client is set to "Auto" with no max-resolution cap. **No client-side
bitrate cap is set on this user** (doc 13 reported
`RemoteClientBitrateLimit=0`). Combine that with PGS subs the client
can't render → forced burn-in → the 4K-overlay tax kicks in.

**ffprobe storms:** at 13:31 the log shows **7 simultaneous ffprobe
calls** (Mandalorian S2 episodes, all at once); at 17:37 **another 7
simultaneous ffprobes** (Mandalorian S3). Each ffprobe with
`-analyzeduration 200M -probesize 1G` reads up to 1 GiB into RAM. Cause:
operator clicked into the season 2/3 page → JF kicks off a subtitle
search for every episode at once because
`LibraryMetadataRefreshConcurrency=0` (= 12). Doc 13 finding 14 already
calls for the concurrency-cap fix; this audit confirms the symptom.

**Verdict:** the **single biggest user-visible "loads kinda slow"** is
the 4K-upscale subtitle-burn pile-up.

---

## 6. Plugin status

All 6 plugins **Active**. None in Faulted/Restart. No exception loops in
the log from plugin assemblies.

| Name | Version | Status |
|---|---|---|
| AudioDB | 10.10.3.0 | Active |
| MusicBrainz | 10.10.3.0 | Active |
| OMDb | 10.10.3.0 | Active |
| Open Subtitles | 20.0.0.0 | Active *(but mis-configured — see §7)* |
| Studio Images | 10.10.3.0 | Active |
| TMDb | 10.10.3.0 | Active |

**Verdict:** culprit (e) "plugin throwing repeated exceptions in log
spam loop" → **partially confirmed for OpenSubtitles** (it throws on
every probe — 234 today already), but the cost is per-probe RTT, not
sustained CPU. Fix is doc 13 finding 04.

---

## 7. Log error / warning summary (last 24 h, today's `log_20260508.log`)

`/config/log/log_20260508.log` is **3 968 lines**.
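The tally below is reproducible with plain `grep` counts. A minimal
sketch; log path as mounted inside the container, patterns as they
appear in today's log:

```
# Reproduce the headline counts from today's log (read-only).
LOG=/config/log/log_20260508.log
docker exec jellyfin sh -c "
  grep -c '\[ERR\]' $LOG
  grep -c '\[WRN\]' $LOG
  grep -c 'Error downloading subtitles from \"Open Subtitles\"' $LOG
  grep -c 'No space left on device' $LOG
  grep -ci 'SQLITE_BUSY\|database is locked' $LOG || true   # expect 0
"
```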
Filtered tally:

| Pattern | Count today | Notes |
|---|---:|---|
| `[ERR]` total | **266** | |
| `[WRN]` total | **124** | |
| `Error downloading subtitles from "Open Subtitles"` | **234** | doc 13 finding 04 — `Username/Password` empty, throws `AuthenticationException` per file probed; **88 % of all errors today are this one cause** |
| `No space left on device : '/config/metadata/library/...'` | **2** | at 13:53:10 — transient ENOSPC during a metadata write; disk is now 62 % full (146 GiB free), so this was a transient pressure burst (probably 73 MiB+ of transcode segments accumulating in `/cache/transcodes` while a metadata write tried to extend a small file). **Worth watching** but not the current bottleneck |
| `Invalid username or password entered` (auth fail) | 5 | three distinct minutes — looks like a user retrying creds, not a brute-forcer |
| `WS ... error receiving data` (websocket abrupt close) | ~25 | normal: clients closing tabs / dropping carrier. Noise, not a defect |
| `Compiling a query which loads related collections...` (EF Core warning, slow query) | 1 | EF Core's `QuerySplittingBehavior` warning — Jellyfin upstream issue, harmless on this dataset |
| `task was canceled` on `/videos/.../hls1/main/-1.mp4` | 1 (17:41) | client gave up mid-segment-init — same 499 family as doc 13's evidence |
| `SQLITE_BUSY` / `database is locked` | **0** | culprit (d) DB lock contention → **ruled out** |

**Verdict:**
- culprit (e) "plugin log spam" → confirmed (234 OS errors / day = a
  scan or page-into-season triggers a loop of failures).
- culprit (d) "DB lock contention" → ruled out (0 SQLITE_BUSY).
- the **2 ENOSPC errors are NEW vs doc 13** and warrant tracking — see
  the §9.3 watch-list.

---

## 8. DB and cache sizes

```
/config/data/jellyfin.db 288 K (was 208 K in doc 13 — fine)
/config/data/library.db 3.4 M (was 3.3 M — fine)
/config/data/library.db-wal 6.2 M (was 4.4 M — STILL LARGER THAN MAIN, see below)
/config/data/library.db-shm 32 K
/config/metadata 99 M (was 92 M — fine)
/config/log 4.2 M (was 1.3 M — 3× growth in 14 h driven by §7 OS spam)
/cache/transcodes 84 M / 43 files (snapshot)
/cache total not measurable from in-container du (mount appears empty due to bind layout)
```

**library.db-wal (6.2 MB) is now ~1.8× the main `.db` (3.4 MB).** Doc 13
finding 08 already raised this — the situation is slightly worse now
(the WAL grew faster than the main file over 14 h). Cause: SQLite only
folds the WAL back at checkpoint time, and with continuous transcode +
ffprobe activity and library refreshes there is rarely a quiet moment
for a full checkpoint. **A manual `Optimize database` run will collapse
the WAL into the main file.**

**`/cache/transcodes` 84 MB / 43 files** is the residue of three-plus
abandoned ffmpeg sessions. Without `EnableSegmentDeletion=true`, each
session's segments persist for up to `SegmentKeepSeconds=720`, and every
seek spawns a new session. Worst case for one viewer: 4 zombie sessions
× 720 s × ~13.5 Mbps ≈ **4.5 GiB of transient cache** from a single
pile-up. **This is exactly how the 13:53 ENOSPC happened** (cache +
metadata fighting for the same 146-GiB free pool).
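Both the inspection and the remediation fit in two lines; reading is
safe live, but the checkpoint sketch below assumes Jellyfin is stopped
first (paths per this bind layout):

```
# Read-only: WAL vs main-db size on disk.
ls -lh /config/data/library.db*

# Remediation sketch, NOT run in this audit: fold the WAL back with the
# server stopped, then restart.
# docker stop jellyfin
# sqlite3 /config/data/library.db 'PRAGMA wal_checkpoint(TRUNCATE);'
# docker start jellyfin
```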
---

## 9. Concrete remediation list (ranked impact / effort)

### 9.1 Quick-wins (rank 1 → 4 — all are minutes of work and non-destructive to apply)

1. **Enable two transcode flags** (Settings → Playback):
   - `EnableThrottling = true`
   - `EnableSegmentDeletion = true`
   *Effect:* a zombie ffmpeg from a stale session is killed instead of
   producing 720 s of segments after the client has moved on. **This
   single change directly addresses §5's 4-process pile-up.** Doc 13
   already noted this; the new evidence escalates it from "S effort,
   cleanup" to **"non-optional"**.

2. **Cap concurrency knobs** (Settings → Server / Library):
   - `LibraryScanFanoutConcurrency = 4`
   - `LibraryMetadataRefreshConcurrency = 4`
   - `ParallelImageEncodingLimit = 4`
   *Effect:* the 7-up ffprobe bursts at 13:31 / 17:37 (§5) are capped
   to 4 parallel probes, not 12. Doc 13 already noted this as S effort.

3. **Set `RemoteClientBitrateLimit`** (Dashboard → Playback → Streaming
   → "Internet streaming bitrate limit"):
   - Suggest `8 Mbps` (covers 1080p Blu-ray rips, kills 4K-upscale
     decisions on remote sessions). LAN clients that want full bitrate
     can be flagged via per-user policy.
   *Effect:* the 13.8 Mbps maxrate-on-WAN session becomes an 8 Mbps
   session that **doesn't need the 4K-upscale path** — JF stops asking
   ffmpeg to produce 3840×2160. **This is what makes §5's per-stream
   cost drop by ~70 %.** Independent of GPU.

4. **Disable the Open Subtitles plugin OR populate creds** (already in
   doc 13 finding 04). Removes 234 ERR/day, restores log signal, and
   removes the per-probe RTT.

### 9.2 Investments (rank 5 → 7 — half-day to multi-day, structural)

5. **Add CPU + memory limits to BOTH `jellyfin` and the Forgejo
   `BlueBuild` build container in compose** — currently both are
   uncapped, fighting for the same 12 cores. Suggest:
   - `jellyfin`: `cpus: 8.0`, `mem_limit: 12G`, `mem_reservation: 4G`
   - `forgejo-runner` build pods: `cpus: 4.0`, `mem_limit: 8G`
   *Effect:* a noisy CI build cannot drag interactive playback
   latency to the floor; the viewer always has 8 cores even when
   BlueBuild is hot. Note that the BlueBuild container is short-lived
   (forgejo-actions spawns it per job), so the limit goes in the
   runner's `container_options` in the runner config, not on a static
   compose service. A live-trial sketch follows after this list.

6. **Re-enable GPU transcoding on the host** (doc 13 finding 02 — L
   effort). With H.264 NVENC at preset `p4` the same filtergraph
   collapses from ~6.4 CPU cores to ~0.3 CPU cores + GPU. Without
   GPU, quick-wins 1–3 above are the ceiling; with GPU, the host can
   serve 4 simultaneous viewers comfortably.

7. **Cap the maximum supported resolution in client policy** (Dashboard
   → Users → each user → Playback → "Maximum allowed video bitrate" /
   "Maximum allowed video resolution"). Set non-admin users to
   `1080p` max. Closes the foot-gun where any client says "I can do
   4K" and Jellyfin obliges with a 4K-upscale CPU bomb.
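For item 5, the durable home for the caps is compose / the runner
config, but they can be trialled live with stock `docker update` flags.
A sketch using the values from item 5 (container name as it appears in
`docker stats`):

```
# Trial caps on the long-lived jellyfin container; survives until the
# container is recreated, so persist the same values in compose afterwards.
docker update --cpus 8 --memory 12g --memory-swap 12g \
  --memory-reservation 4g jellyfin

# The BlueBuild task containers are ephemeral (one per CI job), so
# updating a running instance is pointless; the cap belongs in the
# forgejo-runner container_options so every spawned job inherits it.
```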
### 9.3 Watch-list (no immediate action, monitor next audit)

- ENOSPC at 13:53 (only 2 occurrences, and the host has 146 GiB free
  now, so it was a transient pressure burst). Re-check post-quick-wins
  (1 + 2 remove the cache pile-up that caused it).
- `library.db-wal` at 1.8× the main db — run a manual `Optimize
  database` after the above tasks land, or tighten its schedule from
  24 h to 6 h.
- Container restart at 16:22 (was 02:13 in doc 13) — was this operator-
  initiated or did `unless-stopped` re-spin a crash? Check
  `docker logs jellyfin --since 6h` for `panic`/`crash` next time.

---

## 10. Quick-win vs investment summary

| Bucket | Action | Effort | Expected impact |
|---|---|---|---|
| **Quick-win** | Throttling + SegmentDeletion ON | 2 clicks | Kills §5 zombie ffmpegs immediately; expected load-avg drop 40–50 % under one active viewer |
| **Quick-win** | Concurrency caps 12 → 4 | 3 fields | Removes the 7-up ffprobe bursts at season-page navigation |
| **Quick-win** | RemoteClientBitrateLimit = 8 Mbps | 1 field | Stops Jellyfin choosing 4K-upscale paths for WAN clients; ~70 % drop in per-stream CPU |
| **Quick-win** | OpenSubs disable / cred | 30 sec | 234 ERR/day → 0; cleaner log; faster library scans |
| **Investment** | Compose CPU/MEM caps for jellyfin + bluebuild | 30 min compose + 1 restart per container | Removes noisy-neighbor head-of-line blocking by the CI runner |
| **Investment** | GPU transcode reactivation | days (driver work, host) | ~20× per-stream CPU efficiency on the 1080p-and-up paths |
| **Investment** | Per-user max-resolution policy | 5 min × N users | Prevents the admin foot-gun and any future invitee from triggering the 4K-upscale path |

---

## Appendix — raw evidence

### Container limits (the absence is the finding)

```
docker inspect jellyfin --format '{{.HostConfig.Memory}} {{.HostConfig.NanoCpus}}
  {{.HostConfig.CpuQuota}} {{.HostConfig.CpuPeriod}}
  {{.HostConfig.PidsLimit}} {{.HostConfig.RestartPolicy.Name}}'
→ 0 0 0 0 unless-stopped
```

### Host CPU + load + memory

```
nproc: 12
lscpu Model: AMD Ryzen 5 2600X Six-Core Processor (6c / 12t, NUMA0=0–11)
uptime: 17:42:14 up 4 days 17:59, 2 users, load average: 15.43, 14.61, 8.85
free -h: total 31Gi, used 10Gi, free 8.2Gi, buff/cache 13Gi
         swap total 24Gi, used 7.8Gi (32 %), SwapCached 789 976 kB
vmstat 1 5 (us / sy / id / wa, last sample): 71 / 13 / 16 / 0
         r=11, b=1, cs ≈ 30 K/s
iostat (nvme0n1): 38–433 w/s, 364–2 272 wkB/s, util 0.4–0.9 % — disk idle
```

### Top processes on host (snapshot)

```
ps -eo pid,user,pcpu,pmem,rss,etimes,args --sort=-pcpu | head:
1681949 user 643 % 6.9 % 2.30 GB 53 s ffmpeg [Rick & Morty S01E01, 4K-upscale + sub burn]
1662267 root 52 % 0.1 % — 426 s fuse-overlayfs (BlueBuild rootfs mount)
1661952 root 36 % 0.1 % — 431 s fuse-overlayfs (BlueBuild rootfs)
1485847 git 8 % 0.8 % 266 MB — gitea web (forgejo)
 364785 user 8 % 2.6 % 867 MB — openclaw-gateway
1901802 java 8 % 12.7 % 4.2 GB — minecraft jvm (-Xmx14336M)
1660709 root 7 % 0.3 % 100 MB 442 s buildah build (BlueBuild)
1626511 user 4 % 1.6 % 544 MB — /jellyfin/jellyfin (server proc)
```

### All 4 active ffmpeg processes (full filter chain shown for the heaviest one)

```
PID 1681949 (643 % CPU):
  -ss 33s -noaccurate_seek -canvas_size 1920x1080
  -i Rick.and.Morty.S01E01.mkv
  -threads 0 -map 0:0 -map 0:1 -map -0:0
  -codec:v libx264 -preset veryfast -crf 23 -maxrate 13546858 -bufsize 27093716
  -profile:v high -level 51
  -filter_complex
    [0:4]scale,scale=3840:2160:fast_bilinear[sub] ;
    [0:0]setparams=color_primaries=bt709:color_trc=bt709:colorspace=bt709,
    scale=trunc(min(max(iw,ih*a),min(3840,2160*a))/2)*2
    :trunc(min(max(iw/a,ih),min(3840/a,2160))/2)*2,
    format=yuv420p[main] ;
    [main][sub]overlay=eof_action=pass:repeatlast=0
  -codec:a libfdk_aac -ac 2 -ab 256000 -af volume=2
  -f hls -hls_time 3 -hls_segment_type fmp4 ...
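  # --- audit annotation; comments added here, not part of the captured args ---
  # [0:4] -> [sub]       PGS subtitle stream, upscaled to 3840x2160
  # [0:0] -> [main]      main video: bt709 tags, fit-scaled to 3840x2160, yuv420p
  # [main][sub]overlay   alpha-blends the 4K subtitle canvas onto every frame
  # -threads 0           lets this single process claim all 12 logical cores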
+``` + +### Sessions API (exactly 1 user, mismatched `PlayMethod` vs `TranscodingInfo`) + +``` +GET /Sessions?activeWithinSeconds=600 → 1 session + user=s8n client=Jellyfin Web/Chrome remote=192.168.0.10 + PlayMethod=DirectPlay (claimed) + TranscodingInfo: + VideoCodec=h264 AudioCodec=aac Container=fmp4/hls + 3840x2160 @ 13.8 Mbps HW=none IsVideoDirect=False IsAudioDirect=False + Reasons = [VideoCodecNotSupported, AudioCodecNotSupported, SubtitleCodecNotSupported] + Completion = 0.0 % +``` + +### Scheduled tasks (none in progress) + +(Full table in §4 — every task is `Idle`, last-run durations 0–3.2 s.) + +### Plugins (all 6 Active, no faulted) + +``` +AudioDB 10.10.3.0 Active +MusicBrainz 10.10.3.0 Active +OMDb 10.10.3.0 Active +Open Subtitles 20.0.0.0 Active ← 234 ERR/day from auth-empty creds (doc 13 finding 04) +Studio Images 10.10.3.0 Active +TMDb 10.10.3.0 Active +``` + +### Log tally (today's `log_20260508.log`, 3 968 lines) + +``` +[ERR] lines: 266 +[WRN] lines: 124 +"Error downloading subtitles from Open Subtitles": 234 ← 88 % of all ERR +"No space left on device": 2 ← 13:53:10, transient +"Invalid username or password entered" (login): 5 +"WS ... error receiving data": ~25 ← noise +"task was canceled" / 499: 1 ← 17:41 +"SQLITE_BUSY" / "database is locked": 0 +EF Core "QuerySplittingBehavior" warning: 1 ← upstream JF +``` + +### Disk (host vs container view) + +``` +host df -h /home: 399G 233G 146G 62 % (was 90 % in doc 13 — improved) +host df -i /home: ~1.49M used / ~26.6M 6 % inodes healthy +container df -h /config /cache /media: same FS, same 146G free +``` + +### Items / counts + +``` +GET /Items/Counts → MovieCount=2 SeriesCount=6 EpisodeCount=181 + ArtistCount=0 ProgramCount=0 TrailerCount=0 + SongCount=0 AlbumCount=0 MusicVideoCount=0 +``` + +### Container restart (StartedAt today) + +``` +Implied from ScheduledTasks where Trigger=StartupTrigger: + Clean Transcode Directory → end 16:22:06 ← container start ≈ 16:22:05 + Clean up collections and playlists → end 16:22:06 + Update Plugins → end 16:22:07 +(doc 13 had StartedAt = 02:13:01) +``` + +### Forgejo BlueBuild container (noisy neighbor, no limits) + +``` +docker stats: CPU 88–99 % MEM 4.3 GiB NET in 5 GB BLOCK in/out 296 MB / 35.3 GB +docker inspect: Memory=0 NanoCpus=0 CpuQuota=0 ← uncapped +``` + +--- + +## Sign-off + +- Audit: 2026-05-08, read-only, ~15 min wall. +- No fixes applied. No state mutated. No container restart. No plugins + reloaded. No tasks executed. +- Scope respected: server runtime only. Color/HDR, edge/network, and + storage findings deferred to sibling agents. +- Next audit due: **2026-08-08** (quarterly, paired with doc 13).