All 8 owner-reported symptoms resolved across 5 iterations (INC1-5):
INC1 — index.html drift revert + :has() transparent-scope + Cineplex
Abspielen override + encoding.xml HLS 499 fix
INC2 — pin .backdropContainer position:fixed (persistent backdrop)
INC3 — extend transparent-scope through detail-page sub-sections
INC4 — .emby-scroller transparent (kill black band behind carousels) +
EnableTonemapping=false + 20Mbps RemoteClientBitrateLimit cap +
headless-test-v2.py (admin+guest+click-play+bg-sweep)
INC5 — AV1 source re-encode (MNS S1E2/E4/E5 to H.264/AAC) +
enableHlsFmp4=false localStorage shim +
::-webkit-scrollbar styled to ARRFLIX palette
Verification: headless playwright on Chrome + Firefox UA confirms MNS
S1E4 plays 1920x1080 readyState=4 currentTime advancing. Owner
double-confirmed solved.
Doc 26 final state section + 18-item forbidden-pattern checklist added
for future operators.
# 26 — Incident 2026-05-09: Page Unresponsive + Posters Missing + Playback Black-Screen

> Session log. Live document, updated as the fix proceeds. Goal: future-me + other operators can read this and skip every dead end I already walked.

Status: **CLOSED 2026-05-09** — owner double-confirmed all symptoms resolved.
See "## Final state" at bottom for the consolidated outcome.

---

## Symptoms reported by owner (in order)

1. "Browser arrflix is broken videos don't play at all"
2. "I can't even see a preview of the TV series / movie"
3. After first fix: page loads, posters render, but **"Page Unresponsive"** Chrome dialog before posters paint (screenshot 1)
4. After second fix attempt: posters render, but **"Abspielen"** (German Play button) instead of "Play"; **all backdrop art replaced by black**; **video plays as black screen** (screenshot 2)

---

## Root causes identified so far

### A — Browser hangs (resolved by fix #1)

The deployed copy of `/opt/docker/jellyfin/web-overrides/index.html` was AHEAD of repo HEAD: md5 deployed `b97c1cb4` ≠ repo `d77c106b`. Someone hot-patched a `forceEnglishUI()` text-walker MutationObserver onto `document.body` with `subtree:true, characterData:true`. The walker rewrote `alt`/`title`/`aria-label` on every DOM mutation. Poster grid lazy-load fired it hundreds of times → main thread frozen → Chrome "Page Unresponsive".

**Fix applied:** scp'd repo HEAD `index.html` over the deployed copy, restarted the container. Verified md5 matches.

**Lesson:** never hot-patch the bind-mount. Always commit + redeploy from repo. Drift is invisible until something breaks.

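The md5 comparison that exposed the drift is easy to automate. The sketch below is a minimal illustration, not the check that was actually run during the incident; `md5_of` and `has_drifted` are hypothetical helper names, and the two paths in the usage note are assumptions about where the deployed and repo copies live.

```python
import hashlib
from pathlib import Path


def md5_of(path: Path) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def has_drifted(deployed: Path, repo: Path) -> bool:
    """True when the deployed override no longer matches the repo copy."""
    return md5_of(deployed) != md5_of(repo)
```

A possible use, assuming the repo checkout sits at `/home/admin/arrflix-repo`: call `has_drifted(Path("/opt/docker/jellyfin/web-overrides/index.html"), Path("/home/admin/arrflix-repo/web-overrides/index.html"))` before every container restart and refuse to proceed when it returns `True`.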
### B — DB write failures (auto-resolved before this session)

Agent investigation found `jellyfin.db` had been owned by uid 101000 (userns-remap leftover, see `~/.claude/projects/-home-admin-ai-lab/memory/project_nullstone_docker_userns.md`). Container ran as 1000 → SQLite Error 8: `attempt to write a readonly database`. By the time we re-checked, the file was already `user:user`. Probably fixed during the 23:22 container restart.

**Lesson:** if `jellyfin.db` is unwritable, EVERY user-config save silently fails (HTTP 204 success, value not persisted). Check ownership FIRST when config writes don't stick.

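The "check ownership first" step can be a one-liner. This is a minimal sketch under the assumption that the container user is uid 1000, as stated above; `owner_mismatch` is a hypothetical helper and the database path has to be supplied by the operator.

```python
import os
from pathlib import Path


def owner_mismatch(db_path: Path, expected_uid: int) -> bool:
    """True when the file is owned by a uid other than the one the container runs as."""
    return os.stat(db_path).st_uid != expected_uid
```

With the values from this incident, a file owned by uid 101000 checked against `expected_uid=1000` returns `True`, which is the state that produced SQLite Error 8.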
### C — German "Abspielen" leak (NOT YET FIXED — current focus)

`Configuration.UICulture` is `<absent>` for ALL 12 users. Tried POST `/Users/{id}/Configuration` with a `UICulture: en-US` payload via `bin/force-english-all-users.sh`. Server returned HTTP 204 but the field did NOT persist on subsequent GET. **POST silently drops UICulture**.

Possible explanations: the `UserConfiguration` model in 10.10.3 may have removed the per-user UICulture field, OR the value has nowhere to land, since the `Users` table schema (verified) has no UICulture column AND no Preferences row stores it. Doc 15 claims `Configuration.UICulture` is authoritative, but that doc dates from when the fix worked. Behavior may have shifted.

Traefik DOES rewrite `Accept-Language: en-US,en;q=0.9` on every request (`force-en-accept-lang@file` middleware) AND rewrites the locale chunk JS path so `de-json.X.chunk.js` → `en-us-json.667484b4a441712c7e05.chunk.js`. Verified via curl: `de-json.X.chunk.js` returns 107425 bytes of English content.

**So why is German still leaking?** Service Worker cache. The browser's SW serves a stale German chunk from CacheStorage, never hits the network, never sees the Traefik rewrite. The SW dates from before the lockdown was deployed.

Tried: `Clear-Site-Data: "cache", "cookies", "storage"` Traefik response header on `/web/index.html`. Verified live via curl. **But the user's browser STILL has the SW cache**: the SW intercepts the GET to `/web/index.html` and serves from cache, so the server response (with Clear-Site-Data) never reaches the browser cache layer. SW prevents its own death.

### D — Backdrops missing (NOT YET INVESTIGATED)

User reports backdrop art (the wide background image behind episode cards) is now black for every show. Could be:
- Image not in DB/cache (server returning empty)
- CSS hiding backdrop element
- SW serving stale 404 from a bad earlier session
- Jellyfin metadata refresh interrupted

### E — Video black screen on play (NOT YET FIXED)

Server logs show ffmpeg IS transcoding HEVC source → H.264 high@5.1 + libfdk_aac. But the browser shows black. Earlier `/Sessions` proved DirectPlay worked for one client (RemoteEndPoint 82.31.156.86). Recent attempts: HLS segment 186.mp4 returned **499 (client closed connection)** + `POST /Sessions/Playing/Progress` returned **502 Bad Gateway** at 23:31:49 (during a momentary Traefik upstream-missing window).

Possible causes:
- SW intercepting HLS init segment, serving stale/wrong-mime
- 10-bit HEVC source → H.264 transcode timing issue
- CSS hiding `<video>` element
- HLS init.mp4 vs segment naming bug (`hls_fmp4_init_filename "X-1.mp4"` + `hls_segment_filename "X%d.mp4"`: collision risk)

---

## Actions taken this session

| # | Action | Outcome |
|---|---|---|
| 1 | scp repo `index.html` → deployed; `docker restart jellyfin` | DOM-walker shim gone. Page no longer hangs. |
| 2 | Insert temp ApiKeys row in jellyfin.db, run `bin/force-english-all-users.sh` | POST 204 but UICulture NOT persisted. Possibly server-model dropped field. |
| 3 | Add `clear-site-data@file` Traefik middleware to `jellyfin-html-nocache` router | Header is live. But SW intercepts before browser cache layer can apply it. |
| 4 | Revoke temp ApiKey | Done. |

---

## What did NOT work (don't repeat)

- `bin/force-english-all-users.sh` against 10.10.3: POST 204 but field dropped server-side. Either the model changed or the DB write path is broken in a different way than the uid-101000 issue.
- `Clear-Site-Data` response header alone: SW intercepts and the header never reaches browser cache eviction. Need to kill the SW BEFORE it can intercept.

## Forbidden patterns

- Hot-patching `web-overrides/index.html` without committing to repo. Bug A came from this exact pattern. Repo MUST = deployed.
- Trusting HTTP 204 as success. Verify with GET.
- Client-side DOM-walker MutationObservers without debounce + scope. Will tank performance + freeze browser.

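The "verify with GET" rule can be written as a read-back check. The sketch below is illustrative only: `write_and_verify` is a hypothetical helper, and `post` / `get` are assumed to be small callables the operator wraps around the real write and read endpoints (for example `POST /Users/{id}/Configuration` and whatever GET returns the same configuration). A 2xx status alone never counts as success; only a matching read-back does.

```python
from typing import Any, Callable, Dict


def write_and_verify(
    post: Callable[[Dict[str, Any]], int],
    get: Callable[[], Dict[str, Any]],
    field: str,
    value: Any,
) -> bool:
    """Send a write, then read the value back; True only if it persisted."""
    status = post({field: value})
    if not 200 <= status < 300:
        return False  # the write was rejected outright
    # A 204 says nothing about persistence, so compare the stored value.
    return get().get(field) == value
```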
---

## Plan (in flight)

1. Read every prior doc (`docs/01..25`) — extract what was tried + outcome (agent task)
2. Read git log of `web-overrides/`, `bin/force-english-all-users.sh`, `bin/inject-shim.py` (agent task)
3. Online: how to kill a Jellyfin Service Worker definitively (agent task)
4. Read `/web/serviceworker.js` source — what does it cache? (agent task)
5. Diagnose backdrop missing — server vs CSS vs SW (agent task)
6. Diagnose HEVC playback black screen — codec + segment + HLS (agent task)
7. Compare jellyfin-dev vs jellyfin (agent task — dev MAY be working, look at what's different)
8. Apply consolidated fix from agent findings
9. Verify in user browser
10. Commit doc 26 + any code changes; push to `git.s8n.ru/s8n/ARRFLIX`

---

## Findings from agents

### Repo archeology

Reference compiled 2026-05-09 from docs/13-25 + bin/* + git log. Use this to skip dead ends.

**A - Locale lockdown - what's been tried + outcomes**

Chronological history (paths absolute):

1. `/home/admin/arrflix-repo/docs/15-force-english.md` (commit 14f63e8, 2026-05-08 04:22) - diagnosis: per-user `Configuration.UICulture` absent on all 5 users -> SPA falls back to `Accept-Language`. **Built `bin/force-english-all-users.sh`** (read-modify-write `POST /Users/{id}/Configuration` with `UICulture: en-US`, expect 204). Shipped one-line wrapper patch for `bin/add-jellyfin-user.sh` step 3/4 (`c['UICulture']='en-US'`). **Status at write-time: plan-only, script never executed.**
2. `/home/admin/arrflix-repo/docs/19-english-only-audit.md` (a3f82df) - confirmed UICulture still absent on 8/8 users; identified that **92 non-English `<lang>-json.<hash>.chunk.js` chunks are reachable** (`de-json.1afccc006ab8bb6c5953.chunk.js` contains `"Play":"Abspielen"`). Proposed three orthogonal fixes: (a) Path-A Traefik `customrequestheaders.Accept-Language=en-US` middleware, (b) Path-B 1-byte chunk stub bind-mounts (brittle - chunk hashes rotate per JF image), (c) `navigator.language` shim in `inject-shim.py`. **Outcome: recommendations only.**
3. `/home/admin/arrflix-repo/docs/20-english-only-lockdown.md` (d5d6856) - operator doc declaring 4 layers (server, per-user, web SPA shim, Accept-Language). Ships `bin/english-lockdown-runner.sh` (idempotent re-apply for layers 1+2). Layer 3 = `web-overrides/english-lockdown.{js,css}` (sibling commit d2120c6). **Outcome: claimed working at write-time.**
4. `/home/admin/arrflix-repo/docs/25-english-leak-deep-dive-2026-05-08.md` (117fa33) - **critical retraction**: grepped the live web bundle and proved the SPA NEVER reads `Configuration.UICulture`. Only `wizard-start.<hash>.chunk.js` and `25583.<hash>.chunk.js` reference it, both for the admin `/System/Configuration` form, NOT user UI. Actual locale resolver reads `document.documentElement.getAttribute("data-culture")` -> `navigator.language` -> `navigator.userLanguage` -> `navigator.languages[0]` -> `localStorage.getItem("language")` (no user prefix). **Per-user UICulture POST = theatre. Only the shim's `Object.defineProperty(Navigator.prototype, 'language', ...)` actually pins SPA UI.** Verified with headless Trivalent `--lang=de-DE --accept-lang=de-DE,de,en` -> only `en-us-json.667484b4a441712c7e05.chunk.js` requested.
5. **Today's deployed shim** (`/home/admin/arrflix-repo/bin/inject-shim.py` lines 13-114) - does ALL of the above: `localStorage.setItem` for 6 keys (`appLanguage,selectedlanguage,selectedlocale,language,locale,culture`), `Object.defineProperty(Navigator.prototype, 'language')`, `Object.defineProperty(Navigator.prototype, 'languages')`, fallback `navigator.X` redefine, fetch+XHR wrappers stripping `Accept-Language` and rewriting `POST /Users/{id}/Configuration` body to force `UICulture:'en-US'`, `pinLocale()` re-runs every 1 s + on visibility-change. **This is the canonical recipe - anything that works lives here.** Doc 26 sec C confirms Traefik `force-en-accept-lang@file` middleware also rewrites `Accept-Language` per request, AND rewrites `de-json.X.chunk.js` -> `en-us-json.667484b4a441712c7e05.chunk.js` (curl-verified: de URL returns 107 425 bytes of English).

**B - Service worker handling - what's been tried + outcomes**

- `docs/13` finding 11 + `docs/23` sec 5 + `docs/25` hypothesis 2 - `/web/serviceworker.js` is **768 bytes**, `Last-Modified: 2024-11-19` (Jellyfin 10.10.3 ship). Source confirmed: only `notificationclick` handler + `clients.claim()`, **no `fetch` listener, no precache, no `cache.put`**. Stock SW cannot poison posters/HLS by design.
- `bin/inject-shim.py` lines 174-188 - shim already calls `navigator.serviceWorker.getRegistrations().then(regs => regs.forEach(r => if scriptURL.includes('serviceworker.js') r.unregister()))` AND `caches.keys().then(keys => keys.forEach(caches.delete))`. **Built-in SW kill + cache wipe runs every page load.** In production now.
- `docs/25` R1 - proposed `Cache-Control: no-cache` on `/web/index.html` to stop heuristic caching of pre-shim HTML (Path-A label-scoped Traefik middleware). **Status: not applied at doc-25 write-time.**
- Doc 26 sec C - added `clear-site-data@file` Traefik middleware. Header reaches curl, but **SW intercepts before browser cache layer can apply Clear-Site-Data - SW prevents its own death**. SW kill must come from inside the SW (self-destruct) or via Update fetch returning 404. See SW kill recipe section below.

**C - Backdrop / artwork issues - any prior doc covers this?**

- `docs/14` - only doc that touches detail-page backdrops. Diagnosed Finity-parent's `--detail-page-backdrop-offset: 17%` + `mask.png` from `raw.githubusercontent.com/prism2001/finity/main/assets/mask.png`. Two CSS culprits clamping the band hard-black: (a) `:root --primary-background-color: #000 !important`, (b) `html, body, .preload, .skinBody, ..., #reactRoot, .mainAnimatedPages, .dashboardDocument { bg:#000 !important }`.
- `docs/14` sec 7 proposed CSS fix (`linear-gradient` overlay, `body.itemDetailPage` scope-out for bg-clamp). Doc 21 sec 4 cross-ref says "just landed".
- `docs/23` finding 6 - `/Items/{id}/Images/Primary` returns `Cache-Control: public` with NO max-age (heuristic = 0 s); cold poster transcode 350-470 ms; on-disk image cache `/cache/images/resized-images/` is 39 MB / 412 files / 16 h retention.
- `docs/24` sec 4 - image cache 39 MB total, 412 files, no GC pressure, oldest 16 h old.
- **No prior doc covers "all backdrops replaced by black" as a regression.** Closest precedents: doc 14 hard-black left band (CSS layer), doc 23 poster timing (cold-cache layer). New investigation territory for doc 26.

**D - Video playback / HLS / transcode issues - any prior doc?**

- `docs/13` finding 03 - `EnableThrottling=false`, `EnableSegmentDeletion=false`, `MaxMuxingQueueSize=2048`, `SegmentKeepSeconds=720`. Two 499 client-cancels in 1 h (HLS segments at 6.4 s + 2.9 s).
- `docs/21` - full HDR/HEVC diagnosis for Rick & Morty. Source = HDR10 (`smpte2084`, `bt2020nc`, `yuv420p10le`, `color_range=pc`, no MasteringDisplay/CLL - fake AI-upscale HDR). `EnableTonemapping=false` + `HardwareAccelerationType=none` -> HDR pixels delivered as SDR -> washed-out (NOT pure black). PlaybackInfo: `TranscodeReasons=ContainerNotSupported, AudioCodecNotSupported, SubtitleCodecNotSupported`. Fix: `EnableTonemapping=true` (`bt2390` already selected).
- `docs/22` sec 5 - 4 concurrent ffmpegs on ONE viewer of R&M S01E01. Filtergraph: `[0:4]scale,scale=3840:2160:fast_bilinear[sub]; [0:0]...format=yuv420p[main]; [main][sub]overlay`, `libx264 preset=veryfast crf=23 maxrate=13.5Mbps`, fmp4 HLS. 643 % CPU each. Cause: `EnableThrottling=false` + `EnableSegmentDeletion=false`.
- `docs/22` sec 3 - `TranscodingSubProtocol: hls`, `Container: fmp4/hls`, `IsVideoDirect=False, IsAudioDirect=False`. `PlayMethod` reports `DirectPlay` while `TranscodingInfo` is populated - race in Sessions DTO; actual decision is transcode.
- `docs/23` sec 7 - every Traefik request > 50 ms is `/videos/.../hls1/main/*.mp4` HLS-segment GET. AV1+HEVC at 360-550 Mbit. 15 x 499 + 8 x 500 in 6 h (CPU-side, not edge).
- **No prior doc covers "video plays as black screen" with audio working.** HLS init/segment naming collision risk (`hls_fmp4_init_filename "X-1.mp4"` + `hls_segment_filename "X%d.mp4"`) is a doc-26-only hypothesis. SW-intercepting-init-segment is also doc-26-only - but stock SW has no `fetch` handler so this requires a poisoned non-stock SW.

**E - Forbidden patterns - things explicitly called out as "do not do"**

- **No bundle modifications** (`docs/16` F5, `docs/19` row 16). Content-hashed filenames rotate per JF image upgrade; breaks source-map; must re-emit per bump.
- **No DOM-walker MutationObservers without debounce + scope** (doc 26 sec A bug A). The hot-patched `forceEnglishUI()` text-walker on `document.body` with `subtree:true, characterData:true` froze the main thread on poster lazy-load. The `inject-shim.py` walker in doc 16 sec C is the safe pattern (`acceptNode` filter + bounded selector).
- **No hot-patching `web-overrides/index.html` without committing to repo** (doc 26 sec A lesson). md5 drift between deployed and repo HEAD is invisible until breakage.
- **No trusting HTTP 204 as success** (doc 26 sec B lesson). `jellyfin.db` owned by uid 101000 (userns leftover) -> SQLite Error 8 readonly - POSTs return 204 but value not persisted. Always GET-verify.
- **No `Cache-Control: immutable` on `/web/index.html`** (doc 25 R1 caveat). Bricks next deploy until users force-reload. Scope to hashed chunks only.
- **No tonemap on SDR sources** (doc 21 sec 7e). If Mandalorian looks oversaturated post-fix, tonemap leaks - set `TonemappingMode` from `auto` to stricter.
- **No relying on per-user `Configuration.UICulture` for UI strings** (doc 25 R3 + sec 4). Server-side metadata theatre. Only the shim pins UI. Keep field for future-proofing but stop expecting it to fix Abspielen.
- **No bundle bind-mount for `<lang>-json.<hash>.chunk.js`** (doc 19 Path B caveat, doc 25 R4). Hashes rotate per image upgrade - must regenerate every bump.
- **No deleting Settings drawer node** (doc 17 sec 3.1). Drawer-renderer rebuilds on next render; remove only via CSS `display:none` + style override. Old `mypreferencesmenu` selectors match **0** elements - use `a.btnSettings, [data-itemid="settings"]`.
- **No theme @import without snapshot** (doc 14 sec 9). `/System/Configuration/branding` is whole-object replace - sibling Cineplex POST overwrote ElegantFin/NeutralFin within minutes (race rule, doc 04 sec 3b).
- **No `bg:#000 !important` on detail pages** (doc 14 sec 2c, doc 21 sec 4) - clamps Finity's intentional 17vw band into hard-black slab. Scope to `body:not(.itemDetailPage)`.
- **No stripping `Accept-Language` at Traefik for shared backends** (doc 15 limit 2; relaxed in doc 19 sec 19 since arrflix is sole consumer of arrflix.s8n.ru router).

### SW kill recipe

Research date 2026-05-09. Treat as authoritative for this incident.

**Q1 — Clear-Site-Data through an active SW:** Per W3C spec and MDN, `Clear-Site-Data` is **only honored on responses fetched over the network**, not those served by a SW. A SW can return arbitrary responses (incl. third-party), so browsers ignore CSD on SW-intercepted responses. Chrome/Firefox/Edge/Opera implement this; Safari support is partial. Conclusion: our existing Traefik header on `/web/index.html` will only fire for users whose SW lets that exact URL through to the network. For stuck SWs that serve cached `index.html`, the header never reaches the browser. **Verified-not-working alone.** ([MDN Clear-Site-Data](https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/Clear-Site-Data), [Chrome Workbox guide](https://developer.chrome.com/docs/workbox/remove-buggy-service-workers))

**Q2 — Self-destruct shim:** **Verified working pattern.** Google's official Workbox guide recommends this as the *primary* approach. The browser performs a byte-for-byte update check on the SW script (max 24h, often immediate when `Cache-Control: max-age=0` or response differs). When the new script unregisters itself, all clients controlled by it lose their controller on next navigation. Canonical NekR snippet ([github.com/NekR/self-destroying-sw](https://github.com/NekR/self-destroying-sw)):
```js
// Activate immediately instead of waiting for old clients to close.
self.addEventListener('install', e => self.skipWaiting());

self.addEventListener('activate', e => {
  // Unregister this SW, then reload every controlled tab so it comes back uncontrolled.
  self.registration.unregister()
    .then(() => self.clients.matchAll())
    .then(cs => cs.forEach(c => c.navigate(c.url)));
});
```

Bind-mount feasibility: Jellyfin official image serves web from `/jellyfin/jellyfin-web/` inside the container. Bind-mounting the *whole directory* is broken (jellyfin/jellyfin#8441), but bind-mounting a *single file* over the existing `serviceworker.js` works the same way `index.html` does for us. Path inside container is `/jellyfin/jellyfin-web/serviceworker.js`. ([Jellyfin container docs](https://jellyfin.org/docs/general/installation/container/), [discussion #8441](https://github.com/jellyfin/jellyfin/discussions/8441))

**Q3 — 404/410 for SW script:** Spec status is **may work, browser-dependent**. W3C ServiceWorker issue #204 was closed wontfix: the spec does NOT mandate auto-unregister on 404/410 during normal navigation. HOWEVER, the *Update* algorithm (run on navigation, ~24h, or `registration.update()`) DOES unregister on 404/410 in Chrome and Firefox today (matches AppCache). The catch: update only runs when the browser checks; a stuck SW serving cached pages may never trigger an update fetch. Less reliable than self-destruct shim. ([w3c/ServiceWorker#204](https://github.com/w3c/ServiceWorker/issues/204))

**Q4 — Jellyfin 10.10.x SW poisoning:** No 10.10-specific SW-poster issue filed. The actual `src/serviceworker.js` in jellyfin-web is **notification-only**: no `fetch` listener, no cache logic. So if `arrflix.s8n.ru/web/serviceworker.js` is intercepting media, it is NOT stock Jellyfin code. It is likely a stale SW from a prior deploy, an injected mod (BobHasNoSoul/jellyfin-mods etc.), or browser-side residue. Stock Jellyfin SW cannot poison posters/HLS by design. Related issues: [jellyfin-web#4549](https://github.com/jellyfin/jellyfin-web/issues/4549) (premature caching), [jellyfin-web#5729](https://github.com/jellyfin/jellyfin-web/issues/5729) (stale `/system/info/public`).

**Q5 — Container path:** Confirmed `/jellyfin/jellyfin-web/serviceworker.js` for the official `jellyfin/jellyfin` image.

### Prod-vs-dev diff

Investigation 2026-05-09 — comparing live `jellyfin` (prod) vs `jellyfin-dev` containers on nullstone. Image tags identical: both `jellyfin/jellyfin:10.10.3`. Network.xml byte-identical. So the differences below are 100% the operator's hardening, not Jellyfin upstream.

**A — docker-compose.yml diff (key items):**
- Prod mounts ~110+ web-override files: `index.html`, `cineplex.css`, AND a `locale-en-only/` directory containing every non-English `*-json.*.chunk.js` (af, ar, as, be, bg, bn, ca, cs, da, de, ... zh-tw, zu) bind-mounted RO over the container's locale chunks. Dev mounts ONLY `index-dev.html` over `index.html`. No CSS, no locale chunks.
- Prod traefik labels: `security-headers@file,compress@file,force-en-accept-lang@file`. Dev: `security-headers@file,no-guest@file`. Prod has NO `no-guest@file` directly on the docker-label router; its no-guest layer is enforced by the higher-priority `jellyfin-html-nocache` file-provider router (which ALSO adds `cache-no-store@file`, `clear-site-data@file`, see below).
- Prod env adds `JELLYFIN_UICulture=en-US`, `LANG=en_US.UTF-8`, `LC_ALL=en_US.UTF-8`. Dev has none.

**B — branding.xml / CustomCss diff:**
- Prod: 30,795 bytes. Full Cineplex CSS via `@import url("/web/cineplex.css")` (LOCAL bind-mount), ARRFLIX logo PNG embedded as base64 data-URI, Cast/Crew hidden, Quick Connect hidden, header buttons hidden, white slider thumbs, pure-black `--primary-background-color`.
- Dev: 26,345 bytes. Cineplex via `@import url("https://cdn.jsdelivr.net/gh/MRunkehl/cineplex@v1.0.6/cineplex.css")` (REMOTE jsDelivr, no /web/cineplex.css bind-mount). Same login disclaimer + Cast/Crew hide. **Confirmed dev has its OWN branding.xml on disk (not empty).**

**C — Per-user UICulture / settings:** Could not run `sqlite3` inside the container (binary not present). Prod and dev have separate config dirs (`/home/docker/jellyfin/` vs `/home/docker/jellyfin-dev/`). Dev config/data tree is a leaner subset: no `keyframes/`, no `splashscreen.png`, no `subtitles/`, no `device.txt`, and no DB `-shm`/`-wal` files (the dev DB sits idle without WAL, i.e. fewer active sessions, as expected). Dev was set up as a fresh first-run wizard per `docs/12-dev-instance.md`, so its user table holds only its own admin.

**D — encoding.xml diff:** Real divergence:
- Prod: `EnableThrottling=true`, `EnableSegmentDeletion=true`, `EnableTonemapping=true`.
- Dev: `EnableThrottling=false`, `EnableSegmentDeletion=false`, `EnableTonemapping=false`.
- Prod is the stricter/lower-resource HLS profile; dev keeps every segment around. Plausible contributor to the **HLS 499 client-disconnect** seen in section E (prod): if a client pauses/seeks while throttling+deletion are both on, segment 186 may be reaped before the re-request lands.

**E — Surprising / smoking gun: Traefik headers prod-only, NOT applied to dev:**
- `curl -sI https://arrflix.s8n.ru/web/index.html` returns:
  - `cache-control: no-cache, no-store, must-revalidate`
  - `clear-site-data: "cache", "cookies", "storage"`
- `curl -sI https://dev.arrflix.s8n.ru/web/index.html` returns NEITHER. Just `x-frame-options: SAMEORIGIN`.
- Source: `/opt/docker/traefik/config/dynamic.yml` defines a HIGH-PRIORITY (priority:100) file-provider router `jellyfin-html-nocache` matching `Host(arrflix.s8n.ru) && Path(/, /web/, /web/index.html, /web/sw.js, /web/manifest.json)` with middlewares `security-headers,compress,cache-no-store,force-en-accept-lang,clear-site-data`. Dev's `dev.arrflix.s8n.ru` host has no equivalent file-provider router; only the docker-label router applies.
- The `clear-site-data` middleware was ADDED 2026-05-09 (today) as a "one-shot" to wipe SW+cache+storage. Comment in dynamic.yml literally says: *"Remove this middleware after owner has visited once and confirmed clean state."*
- **Implication:** Every prod page-load tells the browser to wipe cache + cookies + storage. If the SW intercepts before the header reaches the cache layer (per Q1 finding above) the header is harmless; but if any auth state or in-progress playback state is in storage when the header DOES land (e.g. on a forced refetch), it gets nuked. Dev does not have this and dev "works".
- Prod also has `jellyfin-locale-force-en` (priority:200) doing `replacePathRegex` from any locale-json chunk to `en-us-json.667484b4a441712c7e05.chunk.js`. The hash is hard-coded; if the deployed Jellyfin web bundle ever ships a different en-us-json hash, EVERY locale chunk request is rewritten to a non-existent path and returns 404. Worth verifying the hash matches the live bundle.

**Suggested transplant (smallest reversible change):**
1. Remove the `clear-site-data@file` middleware from the `jellyfin-html-nocache` router in `/opt/docker/traefik/config/dynamic.yml` (one line). Keep `cache-no-store` so the SW-update fetch still bypasses heuristic cache. Traefik hot-reloads.
2. Verify with `curl -sI https://arrflix.s8n.ru/web/index.html` → no `clear-site-data` header.
3. If prod now behaves like dev, the CSD header was a major factor in the unresponsive page (storage wipe in flight while SPA boots = re-auth race + token loss).
4. Re-test playback. If still black-screen, suspect the encoding.xml `EnableThrottling+SegmentDeletion=true` combo and try toggling each off to match dev.
5. Last resort: also drop the `jellyfin-locale-force-en` rewrite and verify the hard-coded en-us-json hash is current with the running 10.10.3 bundle.

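Steps 2 and 5 above are both mechanical checks and can be scripted. The following is a rough sketch, not part of the repo: all function names are hypothetical, `raw` is assumed to be the text printed by `curl -sI`, and `web_root` is assumed to be a local copy of the container's `/jellyfin/jellyfin-web/` directory.

```python
import re
from pathlib import Path
from typing import Dict, List

# Matches en-us-json.<hex hash>.chunk.js and captures the hash.
EN_US_CHUNK = re.compile(r"^en-us-json\.([0-9a-f]+)\.chunk\.js$")


def parse_headers(raw: str) -> Dict[str, str]:
    """Parse `curl -sI` output into a dict keyed by lower-cased header name."""
    headers: Dict[str, str] = {}
    for line in raw.splitlines():
        name, sep, value = line.partition(":")
        if sep:  # the status line and blank lines contain no colon
            headers[name.strip().lower()] = value.strip()
    return headers


def clear_site_data_present(raw: str) -> bool:
    """Step 2: True while the one-shot Clear-Site-Data header is still served."""
    return "clear-site-data" in parse_headers(raw)


def live_en_us_hashes(web_root: Path) -> List[str]:
    """Collect the hash of every en-us-json chunk found under web_root."""
    hashes = []
    for path in web_root.rglob("en-us-json.*.chunk.js"):
        match = EN_US_CHUNK.match(path.name)
        if match:
            hashes.append(match.group(1))
    return sorted(hashes)


def pinned_hash_is_live(web_root: Path, pinned_hash: str) -> bool:
    """Step 5: True when the hash hard-coded in the rewrite exists in the bundle."""
    return pinned_hash in live_en_us_hashes(web_root)
```

For this deployment the pinned value to test would be `667484b4a441712c7e05`, the hash that appears in the `jellyfin-locale-force-en` rewrite target.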
### Online research 2026-05-09
|
||
|
||
Research-only pass against current GitHub state. All URLs verified live this date.
|
||
|
||
**Q1 — UICulture per-user broken in 10.10.3?** No evidence the field was *removed* from `UserConfiguration` in the 10.10.x line. DeepWiki's settings-management page still documents per-user UICulture. The closest live regression is jellyfin/jellyfin#16117 ("Can't change plugins settings - Fixed by disabling **Cloudflare Rocket Loader**"): same shape — POST returns 2xx, body silently dropped, only over reverse proxy. Verdict: **probable** that our symptom is reverse-proxy-side body mangling, not a server-side schema removal. Sanity check: bypass Traefik (`curl --resolve arrflix.s8n.ru:8096:127.0.0.1` direct to container) and POST UICulture; if it persists there but not via Traefik, middleware is mutating the JSON. Discussion #15857 confirms `204 No Content` is the expected return code for these write endpoints — the 204 itself is not the bug. ([#16117](https://github.com/jellyfin/jellyfin/issues/16117), [discussion #15857](https://github.com/orgs/jellyfin/discussions/15857), [DeepWiki settings](https://deepwiki.com/jellyfin/jellyfin-web/5.2-user-settings))
|
||
|
||
**Q2 — Backdrops missing while posters work.** **Confirmed root cause = TMDB API change.** jellyfin/jellyfin#14922 (opened 2025-10-01, CLOSED) and #14951 (2025-10-06, CLOSED): TMDB swapped "no-language" backdrop tag from empty-string to `xx`; Jellyfin 10.10.x scrapes those as **Thumbs**, not Backdrops, so the Backdrops slot is empty. The Jellyfin team explicitly said it will not be backported to 10.10 — fix lands only in 10.11.0+. So our 10.10.3 instance has zero backdrops for any item added after ~Sep 2025 unless a non-`xx` language backdrop happened to exist. Issue #7264 (Movies showing backdrops *instead of* posters) is a separate 10.11.1 regression — opposite symptom, not relevant here, marked "Can't Reproduce" in #15259. Verdict: **confirmed** for our case. Mitigation = upgrade to 10.11.x and run "Replace existing images" on every item *after* upgrading. ([#14922](https://github.com/jellyfin/jellyfin/issues/14922), [#14951](https://github.com/jellyfin/jellyfin/issues/14951), [#7264](https://github.com/jellyfin/jellyfin-web/issues/7264))
|
||
|
||
**Q3 — Service Worker survival despite Clear-Site-Data.** **Confirmed.** Chrome's official Workbox guide states `Clear-Site-Data` "can't be relied on alone" because the SW intercepts the very response that would carry the header. Chromium SW Security FAQ explicitly recommends pairing CSD with a no-op SW. Same conclusion as our SW kill recipe section, validated from a second angle. ([Chrome Workbox](https://developer.chrome.com/docs/workbox/remove-buggy-service-workers), [Chromium SW FAQ](https://chromium.googlesource.com/chromium/src/+/main/docs/security/service-worker-security-faq.md))

**Q4 — Self-destruct SW pattern in Jellyfin community.** No Jellyfin-specific recipe published. Generic NekR self-destroying-sw is the canonical pattern (already cited above). BobHasNoSoul/jellyfin-mods ships a *replacement* SW (not a self-destruct one) — useful only as a reference for how others bind-mount over `/jellyfin/jellyfin-web/serviceworker.js`. Verdict: **no evidence** of a Jellyfin-curated kill recipe; we are first to ship one. ([NekR](https://github.com/NekR/self-destroying-sw), [BobHasNoSoul/jellyfin-mods](https://github.com/BobHasNoSoul/jellyfin-mods))

**Q5 — HLS fmp4 init-segment collision on restart.** **No evidence of collision in practice.** Jellyfin always passes `-start_number 0` and the init filename is `<hash>-1.mp4` (literal `-1`, not `%d`-derived); segments are `<hash>0.mp4`, `<hash>1.mp4`, ... so `-1` cannot collide with any non-negative `%d`. Restart spawns a *new hash* (different session id), so old and new sessions don't share filenames either. The active live bug is jellyfin/jellyfin#16612 — playback breaks after 10–15 s in 10.11.8 with fMP4-HLS — but the cause traced in that thread is FFmpeg/segment-availability, not init-name collision. Tangentially: #12230 (CLOSED) is about the init filename being passed *relative* not absolute — only matters when Jellyfin's CWD ≠ transcode dir (rffmpeg setups). Verdict: **no evidence** that init-name collision causes our black-screen. Look at #16612 and at `Cache-Control: no-store` on `/Videos/*/hls1/*` instead. ([#16612](https://github.com/jellyfin/jellyfin/issues/16612), [#12230](https://github.com/jellyfin/jellyfin/issues/12230))

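
The no-collision argument in Q5 can be checked mechanically. The sketch below is illustrative only: `init_name`, `segment_name`, `init_collides`, and the `abc` prefix are hypothetical stand-ins for the `<hash>` naming described above, not code taken from Jellyfin or ffmpeg.

```python
def init_name(prefix: str) -> str:
    """Init segment name as described in Q5: a literal '-1' suffix."""
    return f"{prefix}-1.mp4"


def segment_name(prefix: str, index: int) -> str:
    """Media segment name: the prefix followed by a %d-style index."""
    return f"{prefix}{index}.mp4"


def init_collides(prefix: str, start_number: int, count: int) -> bool:
    """True if any of `count` segments from `start_number` shares the init name."""
    segments = {segment_name(prefix, start_number + i) for i in range(count)}
    return init_name(prefix) in segments


# With -start_number 0 every index is >= 0, so no segment renders as '-1'.
print(init_collides("abc", start_number=0, count=10_000))
# Only a hypothetical negative start number could reproduce the init name.
print(init_collides("abc", start_number=-1, count=1))
```
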
**Q6 — Cineplex theme repo activity.** Repo `MRunkehl/cineplex` last pushed **2025-09-06** (sha `98c8e71`, "Fixed more styles and script"). Description: "Updated jellyflix theme for newest jellyfin v10.10.7 and better netflix styles". **Zero open or closed issues** (issues tab is empty). No commits since 10.11.0 shipped, so the theme has not been validated against 10.11 image-type changes. Verdict: **probable** that backdrop CSS selectors target 10.10 DOM and may break or hide backdrops on a 10.11 upgrade. Audit `cineplex.css` for `.itemBackdrop`, `.backdropContainer`, `.cardBox-bottompadded` selectors before upgrading. ([repo](https://github.com/MRunkehl/cineplex))

**Q7 — Jellyfin 10.11.8 changelog.** **Does NOT fix our issues directly.** Server 10.11.8 ships only 3 changes: subtitle-language library handling, subtitle saving, and language-filter querying. jellyfin-web 10.11.8: a single PR (#7796) for lazy device-info loading. Released as a regression-revert from 10.11.7 ahead of CVE/GHSA disclosure. None of UICulture persistence, SW poisoning, or fMP4 playback are addressed in .8 itself. However the TMDB-backdrop fix (Q2) lands in the 10.11.0 baseline that .8 inherits. Verdict on .8 specifically: **no evidence** it helps directly; **confirmed** the 10.11 line fixes Q2. Upgrade target = 10.11.8 (latest stable: 10.11.0 backdrop fix + .7 security fixes + .8 regression reverts). ([10.11.8 server](https://github.com/jellyfin/jellyfin/releases/tag/v10.11.8), [10.11.8 web](https://github.com/jellyfin/jellyfin-web/releases/tag/v10.11.8))

### Recommended action sequence

**Option A — Self-destruct shim (RECOMMENDED, verified working):**

```bash
# On nullstone, in the arrflix compose dir:
cat > /opt/docker/arrflix/web-overrides/serviceworker.js <<'EOF'
self.addEventListener('install', e => self.skipWaiting());
self.addEventListener('activate', e => {
  self.registration.unregister()
    .then(() => self.clients.matchAll())
    .then(cs => cs.forEach(c => c.navigate(c.url)));
});
EOF
# Add to compose volumes (same pattern as index.html):
#   - /opt/docker/arrflix/web-overrides/serviceworker.js:/jellyfin/jellyfin-web/serviceworker.js:ro
docker compose -f /opt/docker/arrflix/compose.yml up -d --force-recreate jellyfin
# Force Traefik to send no-cache on the SW script so browsers refetch immediately:
#   middleware: response header Cache-Control: no-cache, no-store, max-age=0 on /web/serviceworker.js
```

- **Side effects:** every existing browser session navigates to its current URL once on next page load — looks like a single auto-refresh. No data loss. New visitors get the shim, immediately unregister, never see it again.
- **Recovery:** revert by removing the bind-mount line + `up -d --force-recreate`. Original SW returns.
- **Verify:** `curl -skI https://arrflix.s8n.ru/web/serviceworker.js` → 200 + `Cache-Control: no-cache`. Body matches the shim. In an incognito window: open DevTools → Application → Service Workers shows registration *then* "redundant" within seconds.

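
The **Verify** step for Option A amounts to a status check plus a `Cache-Control` directive check. A minimal sketch of that check, assuming the status code and headers have already been captured (for example from the `curl -skI` output); `shim_response_ok` is a hypothetical helper, not part of the repo:

```python
def shim_response_ok(status: int, headers: dict) -> bool:
    """Return True if the SW response is a 200 carrying a no-cache directive."""
    # Header names are case-insensitive, so normalise them before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    directives = {
        part.strip().lower()
        for part in lowered.get("cache-control", "").split(",")
    }
    return status == 200 and "no-cache" in directives


print(shim_response_ok(200, {"Cache-Control": "no-cache, no-store, max-age=0"}))
print(shim_response_ok(200, {"Cache-Control": "max-age=86400"}))
```
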
**Option B — Serve 404 (may work, less reliable):**

```bash
# Traefik file-provider snippet:
#   - /web/serviceworker.js → middleware that returns 404 (errors middleware → static 404 service)
# Or simply: bind-mount an empty file and add a Traefik replacePathRegex to a non-existent path.
```

- **Side effects:** Chrome/Firefox unregister on next *Update* fetch (typically next navigation after >24h, or sooner if user reloads). Slow rollout. Some users may stay stuck for a day.
- **Recovery:** remove the rule, original SW returns on next image rebuild.
- **Verify:** `curl -skI https://arrflix.s8n.ru/web/serviceworker.js` → 404. DevTools shows SW going "redundant" after a navigation+reload cycle.

**Option C — Do nothing server-side, force user manual:**

- User opens DevTools → Application → Service Workers → Unregister, OR `chrome://serviceworker-internals` → Unregister, OR clears site data.
- **Side effects:** every user must do this individually; non-technical users can't.
- **Recovery:** trivial, nothing changed.
- **Verify:** per-user; no server signal.

**Decision:** Go with **Option A**. It is the Google-recommended pattern, is the only approach that auto-fixes already-loaded tabs without user action, and is reversible by removing one line from compose.

### SW source + image cache

**(Agent run 2026-05-09 — verifies the stock SW source live on the running container, and probes server-side image health for a known item. Important: contradicts the working assumption that the SW is intercepting fetches.)**

**Part 1 — `/web/serviceworker.js` source + interception map**

Both `docker exec jellyfin cat /jellyfin/jellyfin-web/serviceworker.js` and `curl -sk https://arrflix.s8n.ru/web/serviceworker.js` return the **same** file (~1KB single line):

```js
(self.webpackChunk=self.webpackChunk||[]).push([[82798],{16764:function(n,e,t){
  t(78557),t(90076),
  self.addEventListener("notificationclick", function(n){ /* opens window or calls connectionManager */ }, !1),
  self.addEventListener("activate", function(){ return self.clients.claim() })
}}, function(n){ n.O(0,[59928], function(){ return 16764, n(n.s=16764) }), n.O() }]);
```

**Interception map — there is none.**

- No `fetch` event listener in this file.
- Only listeners: `notificationclick` and `activate` (calls `clients.claim()`).
- `t(78557)` and `t(90076)` are webpack require calls for two other modules — those *might* register fetch handlers, but they are NOT in this bundle (they live in lazy chunks under `/web/*.chunk.js`). The chunk IDs `82798` / `59928` map to the notification module only.
- **No CacheStorage usage anywhere in this bundle.** No `caches.open`, `caches.match`, `cache.put`. So this SW does **NOT** cache `/Items/{id}/Images/*`, `/Videos/{id}/*`, `/web/*-json.*.chunk.js`, or `/web/index.html`.

**Conclusion:** Jellyfin 10.10.3 web's stock SW is push-notification-only. It does not intercept fetches and owns no CacheStorage entries. This **confirms agent Q4 finding** ("notification-only — no `fetch` listener, no cache logic") against the running container — not just spec/source, the literal bytes Jellyfin is shipping.

**Implication for Section C diagnosis:** "SW intercepts the GET to `/web/index.html` and serves from cache" is **false**. With no `fetch` handler the SW cannot intercept. `Clear-Site-Data` would already reach the network response — the real blocker for stale German chunks is **HTTP browser cache** (memory + disk), not Service Worker cache.

**Replacement plan:** The self-unregister shim is still safe and useful as belt-and-braces — installs cleanly, deletes any caches that ever existed, unregisters, force-reloads. Bind-mount path inside container is `/jellyfin/jellyfin-web/serviceworker.js`. But it is **not the missing piece** for the German leak. Real fix: existing `Cache-Control: no-store` + `Clear-Site-Data` headers on `/web/index.html` plus a **hard reload** (Ctrl+Shift+R) or DevTools → Application → Clear storage on user's browser.

**Part 2 — Image cache state**

```
/home/docker/jellyfin/config/metadata = 112M (well-populated)
  /library/<hh>/<item-id>/poster.jpg present in sampled items
/home/docker/jellyfin/cache = 59M
  /images/resized-images/{0..f} = 16 hex subdirs, all populated with .webp tiles
```

Agent 7's earlier note "**only `resized-images` subdir present**" is **still true** — `/cache/images/` contains only `resized-images/`, no `original/` or `remote/`. That is the **expected** Jellyfin layout (originals live under `/config/metadata/library/`, only resizes live under `/cache/images/resized-images/`). Not a bug.

API probe for item `7aa5add2c2d8575eda5280b9b9072071` (The Mike Nolan Show) via temp token (revoked after), all four image types via `https://arrflix.s8n.ru`:

| Endpoint | Status | Content-Type | Notes |
|---|---|---|---|
| `/Items/{id}/Images/Backdrop` | **200** | image/jpeg | served, `age: 5400` (90min upstream cache) |
| `/Items/{id}/Images/Primary` | **200** | image/jpeg | served |
| `/Items/{id}/Images/Logo` | **200** | image/png | served |
| `/Items/{id}/Images/Thumb` | **200** | image/jpeg | served |

**Verdict:** Server-side images are healthy. Backdrop + Primary + Logo + Thumb all 200 with valid content-types for a real item the user is browsing. The "all backdrops black" symptom (Section D) is **NOT** a server-side image problem and **NOT** a SW-cache problem. Likely culprits remaining:

- (a) CSS rule in deployed `index.html` overrides / theme overrides hiding `.itemBackdrop` or setting `opacity: 0`;
- (b) browser HTTP cache holding stale 404s from earlier broken state — same Ctrl+Shift+R fix as Part 1;
- (c) a custom-css.user.css backdrop opacity:0 / display:none rule.

Recommend: in user's browser open one show page, DevTools → Network → filter Img → look for `/Items/{id}/Images/Backdrop` request. If 200 served but invisible → CSS theme leak. If never requested → SPA template not fetching it (theme-side bug).

### Backdrop diagnosis

Investigation 2026-05-09. User reported: detail-page backdrops are pure black on prod (`arrflix.s8n.ru`). Posters render fine. Used a temp ApiKey row (`Name='arrflix-backdrop-diag-2026-05-09'`, deleted after diag) on the live `jellyfin` container.

**Layer A (server) — RULED OUT.**

- Item `7aa5add2c2d8575eda5280b9b9072071` (The Dark Knight) JSON returns `BackdropImageTags: ['76cac7069dc988f7cd54e99b481db3fc']`. Tag exists.
- `HEAD https://arrflix.s8n.ru/Items/.../Images/Backdrop` → `HTTP/2 200`, `content-type: image/jpeg`, `content-length: 560210`, `last-modified: 2026-05-08 22:11:50`.
- Same call against `dev.arrflix.s8n.ru` → also 200 + image/jpeg. Both prod and dev serve backdrop bytes correctly.

**Layer C (browser cache / SW) — RULED OUT.**

- The stock SW (Section "SW source + image cache" above) does not intercept `/Items/*/Images/*`. Backdrop URL also returns fresh on direct curl (no SW in path).

**Layer B (CSS) — CONFIRMED. The CustomCss `BLACK-PASS` block hides the image layer.**

The Jellyfin DOM has two distinct elements (verified by reading `main.jellyfin.bundle.js` + `main.jellyfin.1ed46a7a22b550acaef3.css` inside the running container):

1. `.backdropContainer` — stock CSS: `position:fixed; bottom:0; left:0; right:0; top:0; z-index:-1`. Holds a child `<div class="backdropImage">` whose `style.backgroundImage="url(/Items/.../Backdrop)"` is injected by JS (`r.style.backgroundImage="url('".concat(e,"')")` in the bundle). This is the IMAGE LAYER.
2. `.backgroundContainer` (no `d`) — separate `position:fixed` overlay; gets the `withBackdrop` class toggled by JS. This is the OVERLAY LAYER. Stock CSS sets `body { background-color: transparent !important; }` precisely so the body never occludes the `z-index:-1` backdrop.

Bug 1 — **`!important` blacks override stock body transparency.** CustomCss `BLACK-PASS 2026-05-08` block (lines ~110-202 of branding.xml CustomCss) sets `background-color: #000000 !important` on `html, body, #reactRoot, .skinBody, .preload, .mainAnimatedPages, .pageContainer, .libraryPage, .itemDetailPage, .padded-bottom-page, .layout-desktop, .layout-mobile, .layout-tv` etc. Since `.backdropContainer` is at `z-index:-1`, ANY ancestor with an opaque background paints on top of it, hiding the backdrop image entirely.

Bug 2 — **The transparent-scope rule at lines 102-107 is incomplete.** It scopes to `body.itemDetailPage, body.itemDetailPage #reactRoot, body.itemDetailPage .mainAnimatedPages, body.itemDetailPage .skinBody`, but does NOT include `.layout-desktop` / `.itemDetailPage` itself / `.layout-tv` / `.pageContainer` / `.padded-bottom-page` — so those wrappers remain `#000` on detail pages and continue to occlude the `z-index:-1` layer.

Bug 3 (cosmetic — not the cause of black) — line 89-101 sets `background-image: linear-gradient(...)` on `.layout-desktop .backgroundContainer.withBackdrop`. That's the OVERLAY layer, fine on its own. But because the actual backdrop image is hidden by Bug 1, the gradient now composites against pure black instead of the backdrop, so the user sees only the gradient (which fades from black to transparent) over a black backdrop = solid black with at most a faint gradient edge.

**Cross-check:** dev (`dev.arrflix.s8n.ru`) does NOT mount the `BLACK-PASS` CustomCss block (Section B above confirms dev branding.xml is 4.5KB smaller and uses remote jsDelivr Cineplex without local overrides). Opening dev should show backdrops normally; if it does, that's a clean A/B confirmation that prod's CustomCss is the regression.

**Fix recipe (smallest reversible change).**

In `/home/docker/jellyfin/config/config/branding.xml` `<CustomCss>` block, extend the `body.itemDetailPage` transparent-scope rule (currently lines 102-107) to also cancel the black backgrounds on every wrapper that the BLACK-PASS block paints:

```css
/* Replace existing block at lines 102-107 with: */
body.itemDetailPage,
body.itemDetailPage #reactRoot,
body.itemDetailPage .mainAnimatedPages,
body.itemDetailPage .skinBody,
body.itemDetailPage .layout-desktop,
body.itemDetailPage .layout-mobile,
body.itemDetailPage .layout-tv,
body.itemDetailPage .pageContainer,
body.itemDetailPage .padded-bottom-page,
body.itemDetailPage .itemDetailPage,
body.itemDetailPage #mainPanel,
body.itemDetailPage #mainDrawerPanel {
  background-color: transparent !important;
  background: transparent !important;
}
```

This keeps `#000` everywhere else (library, search, dashboard) but reveals the `.backdropContainer > .backdropImage` layer on detail pages — which is what the gradient overlay (Bug 3) was originally designed to compose against.

**Apply via Dashboard → Branding → Custom CSS** (no container restart needed; CSS reloads on next page render). Editing branding.xml directly works too but Jellyfin re-serializes on save, so use the Dashboard.

**Verify after edit:** open a movie detail page in an incognito window (fresh HTTP cache, no previously registered SW). Expected: full-bleed backdrop visible at right ~70% of viewport, gradient fade from black on the left. If still black: hard-refresh + DevTools → Elements → search `.backdropImage` and confirm its parent chain has no `background-color` other than transparent.

**Recovery:** revert to the original 6-selector block.

---

### Playback diagnosis

Investigation date 2026-05-09 ~00:30–00:45 UTC. Live transcode test against prod jellyfin via temp ApiKey `arrflix-playback-diag-2026-05-09` (deleted at end of session, verified empty SELECT after DELETE).

**A) Source codec verdict — the ItemId is mis-attributed in this incident report.** ItemId `7aa5add2c2d8575eda5280b9b9072071` is **The Dark Knight (2008)**, NOT "The Mike Nolan Show". Confirmed via `/Users/{u}/Items?searchTerm=...`:

- `7aa5add2...` → Movie / `/media/movies/The Dark Knight (2008)/The Dark Knight (2008).mkv` — **HEVC Main 10 / yuv420p10le, 1918x800, TrueHD 24-bit + AC3 + 2× PGS**.
- The Mike Nolan Show series Id is `37cb910f507c4d1f9e365ef1954f99c2`. Episodes (e.g. S01E04 "Ding Dong Delli") are **AV1 Main / yuv420p / Opus**, ~412 kbps total.

(So the prior Section D backdrop-probe line that labelled `7aa5add2...` as MNS is also wrong — those Backdrop/Primary/Logo/Thumb 200s were TDK images. Doesn't change Section D's conclusion that backdrops serve fine.)

Chrome advertises `av1,h264,vp9` (NOT hevc, NOT vp8). So:

- **TDK (HEVC 10-bit)**: must transcode → server picks libx264 High@4.0 yuv420p (8-bit) AAC LC stereo. Fully Chrome-decodable.
- **MNS episodes (AV1+Opus)**: should DirectPlay/DirectStream — Chrome supports both natively.

**B) HLS pipeline verdict — server-side fully working.** PlaybackInfo POST returned `TranscodingUrl=/videos/.../master.m3u8?VideoCodec=h264&...`, `SupportsTranscoding=True`, `TranscodingSubProtocol=hls`. Manual fetches on TDK:

- master.m3u8 → HTTP 200, valid `#EXTM3U`, single variant `BANDWIDTH=13407532, RESOLUTION=1918x800, CODECS="avc1.424029,mp4a.40.2"` (the `424029` decodes to "Baseline 4.1" but actual stream below is High — known cosmetic Jellyfin mislabel, not a Chrome blocker).
- main.m3u8 sub-playlist → HTTP 200, segments `hls1/main/0.ts` … `9.ts`, 3-second EXTINF.
- segment 0.ts → HTTP 200, 269 KB. ffprobe verdict: `h264 High / yuv420p / level 4.0, 1918x800` + `aac LC`. Valid 8-bit H.264. Cache dir during playback contains 40+ valid `.ts` segments. No fmp4 init filename collision (mpegts segments in current run; the earlier fmp4 path's `-1.mp4` init pattern with `start_number=0` is also fine — `-1.mp4` literally has the `-1` infix in filename, while data segments are `0.mp4, 1.mp4...`; no actual name collision).

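
The "Baseline 4.1" reading of `avc1.424029` follows from the three hex bytes in the codec string: `profile_idc`, the constraint flags, and `level_idc`. A small sketch of that decoding; `parse_avc1` and the profile table are illustrative helpers written for this log, not part of Jellyfin or hls.js:

```python
# profile_idc values for the common H.264 profiles.
AVC_PROFILES = {66: "Baseline", 77: "Main", 88: "Extended", 100: "High", 110: "High 10"}


def parse_avc1(codec: str):
    """Decode 'avc1.PPCCLL' into (profile name, constraint byte, level)."""
    tag, _, hex_part = codec.partition(".")
    if tag not in ("avc1", "avc3") or len(hex_part) != 6:
        raise ValueError(f"not an avc1/avc3 codec string: {codec!r}")
    profile_idc = int(hex_part[0:2], 16)
    constraints = int(hex_part[2:4], 16)
    level_idc = int(hex_part[4:6], 16)
    name = AVC_PROFILES.get(profile_idc, f"profile_idc={profile_idc}")
    # level_idc is ten times the level number (41 means level 4.1).
    return name, constraints, level_idc / 10


print(parse_avc1("avc1.424029"))  # the value from the master playlist
print(parse_avc1("avc1.640028"))  # what a High@4.0 stream would advertise
```
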
**C) CSS verdict — video element NOT hidden.** Read `branding.xml` CustomCss + `cineplex.css` (full). All `display:none` / `visibility:hidden` / `opacity:0` / `transform:scale(0)` matches are on UI chrome (`#castCollapsible`, `#guestCastCollapsible`, `.btnQuick`, `.headerSyncButton`, `.headerCastButton`, `.headerUserButton`, MUI drawer items, `.countIndicator`, `#loginPage h1`, etc.). The only `video::*` / `:cue` rules touch subtitle font only. **No hide/scale rule hits `.htmlvideoplayer`, `.videoPlayerContainer`, or the `<video>` element itself.** CustomCss is not the cause of the black screen.

**D) Service Worker verdict — no fetch interception.** `/web/serviceworker.js` is the stock Jellyfin notification-only handler (`notificationclick` + `activate→clients.claim`). No `install` cache, no `fetch` listener. Cannot intercept HLS or video URLs. Already characterised in the prior "SW kill recipe" section — stock SW is harmless for media playback.

**E) Web research findings.** No 10.10.3-specific Chrome black-screen bug surfaced for the HLS path. Closest historical pattern: hls.js + AV1+Opus DirectStream where Jellyfin 10.10 mis-builds the codec attribute on the playlist for AV1, causing hls.js to abort. Common workaround: force transcode via DeviceProfile or restrict AV1 in user policy. No citation strong enough to assert as root cause from outside the live browser.

**F) The actual story — and the fix recipe.**

Timeline reconstruction from server logs for the user's session (192.168.0.10):

- `00:28:46` — PlaybackInfo for `7aa5add2...` (TDK).
- `00:28:47` → ffmpeg launches on `/media/movies/The Dark Knight (2008)/...mkv` (libx264 High@5.1, fmp4).
- `00:28:53`, `00:29:01` — ffmpeg restarts at `-ss 00:04:18` and `00:09:06` (= **user seeking forward** during TDK playback).
- `00:29:07` — *"Playback stopped … playing The Dark Knight. Stopped at 549885 ms"* (= 9:09).
- `00:29:28` — *"Playback stopped … playing F.T.C. Stopped at 39053 ms"* (MNS S01E02).
- `00:42:42` — *"Playback stopped … playing Ding Dong Delli. Stopped at 20905 ms"* (MNS S01E04).

What this means: TDK transcoded and played fine for 9 minutes with seeks — **TDK is not black-screening**. The MNS episodes (AV1+Opus, 20-39 s before stop) match the user-perceived "black screen, give up" pattern. The incident report conflated these — user said "Mike Nolan Show + ItemId 7aa5add2" but the ItemId is TDK and the actual symptom is on the AV1 MNS episodes.

The 00:42:49 ffmpeg launch on TDK that appears AFTER MNS stop is **my own diagnostic curl** — its PlaySessionId `14f52f35eee04cec8146379c0dc6c960` matches the one I generated. Disregard as evidence of user behaviour.

**Recommended fix sequence (ordered by likelihood):**

1. **Re-run with the right item.** Ask user to repro on MNS S01E04 (`Ding Dong Delli`), capture browser DevTools Network panel: was `/Videos/.../master.m3u8` issued (transcode path) or only `/Videos/.../stream.webm` (DirectStream)? What does `/Items/.../PlaybackInfo` return for `SupportsDirectStream` on the AV1 source? Capture the JS console for hls.js / shaka / MediaSource errors.
2. **If DirectStream is on for AV1** → force transcode by adding a `CodecProfile` in the user's DeviceProfile that bans AV1 DirectStream (Type=Video, Codec=av1, Container=mkv,webm → forced conditional Direct=false). Server then falls back to libx264 transcode (CPU-only on nullstone, slow but reliable).
3. **Cross-browser test** — try Firefox. Different hls.js behaviour for AV1. If Firefox plays MNS but Chrome doesn't, confirms client-side AV1 DirectStream bug not server.
4. **TDK is fine** — leave alone, unrelated to this incident.

**Out-of-scope here:** dev.arrflix.s8n.ru `/Sessions` returned 401 with the api_key (Sessions needs a user-token, not just admin api_key). Recommend redoing the dev comparison through the user's browser cookie session.

API key cleanup verified: `SELECT Name FROM ApiKeys` returned empty after DELETE.

---

## Final fix applied (verified via playwright headless)

Status: **CLOSED** for symptoms 1-4. Symptom 5 (video black-screen on AV1+Opus
items) is a separate codec issue tracked for the 10.11.8 migration.

### Three patches landed

1. **`branding.xml` CustomCss**: append `content: "Play"` override on
   `.mainDetailButtons .material-icons.play_arrow::after`. Cineplex theme
   hardcoded German "Abspielen" via CSS `content:` rule — NOT a Jellyfin
   locale issue. Hours of Traefik `Accept-Language` rewrites and
   `force-english-all-users.sh` runs were aimed at the wrong layer entirely.

2. **`branding.xml` CustomCss**: backdrop transparent-scope using `:has()`.
   `body.itemDetailPage` selector (from prior docs) does NOT match in
   10.10.3 — body class is `libraryDocument`. New rule scopes by
   `.layout-desktop:has(.itemDetailPage)` etc so backdrop layer (z-index:-1)
   renders behind detail pages without breaking other surfaces.

3. **`encoding.xml`**: `EnableThrottling=false` + `EnableSegmentDeletion=false`.
   Kills HLS 499 (segments reaped before browser re-requests).

### Headless verification

`bin/headless-test.py` (new) logs in via Jellyfin SPA login form using
playwright Chromium, navigates to detail page, screenshots, and probes
computed styles. Used to bisect:

- baseline screenshot (broken)
- `:has()` selector verified backdrop renders
- "Play" verified replaces "Abspielen"

### Re-apply

`bin/apply-26-incident-fixes.sh` (new, idempotent) re-applies all three
patches if `branding.xml` / `encoding.xml` drift back. Run via:
`ssh user@nullstone "$(cat bin/apply-26-incident-fixes.sh)"`

### What was rolled back

- The `clear-site-data@file` Traefik middleware I added during this session
  was making prod worse: it was wiping cookies+storage on every visit,
  breaking auth+playback session continuity. Reverted by restoring the
  Traefik dynamic.yml backup taken right before the edit.

---

## Do-NOT-repeat checklist (post-mortem)

These are the dead-ends. Future operators (and future me) should skip:

1. **Don't add `Clear-Site-Data` to a Jellyfin route to "force the SW out".**
   Stock Jellyfin SW is notification-only (no fetch handler) — there is no
   SW poisoning to begin with. The middleware just wipes cookies on every
   visit, which breaks auth and playback session continuity.

2. **Don't run `bin/force-english-all-users.sh` to fix "Abspielen".**
   Doc 25 already established per-user `Configuration.UICulture` is theatre
   and the SPA never reads it. The German text was in **Cineplex CSS** via
   `content: "Abspielen"`. Patch the CSS, not the user config.

3. **Don't trust HTTP 204 from POST `/Users/{id}/Configuration` as success.**
   Always GET back and verify. (And see #2 — even if you CAN persist
   UICulture, it doesn't drive UI strings in 10.10.x.)

4. **Don't use `body.itemDetailPage` as a CSS selector in 10.10.3.**
   The body class on detail pages is `libraryDocument`, not `itemDetailPage`.
   Use `.itemDetailPage` directly or `:has(.itemDetailPage)` on ancestors.

5. **Don't paint `#000 !important` on `.layout-desktop` / `.pageContainer`
   without scoping.** They wrap the backdrop layer; an unscoped black
   override occludes the entire backdrop. Always scope with `:has()` or by
   page-specific class.

6. **Don't hot-patch `web-overrides/index.html` on the server without
   committing back to repo same step.** Drift from repo is invisible until
   it breaks. Bug A (the DOM-walker MutationObserver freezing the browser)
   came from this exact pattern — see `~/.claude/projects/.../memory/feedback_always_commit_to_my_git.md`.

7. **Don't write MutationObserver text-walkers without debounce + scope.**
   Walking every text node on every DOM mutation freezes the main thread on
   poster grids. If you need DOM rewriting, use targeted selectors + debounce.

8. **Don't sed-via-python regex on YAML files without strict anchors.**
   I damaged `dynamic.yml` with a too-greedy DOTALL match earlier in this
   session (deleted unrelated routers). Restore-from-backup saved it.
   Always diff before reload.

9. **Don't believe a single-itemId test as "playback works".** Item
   `7aa5add2c2d8575eda5280b9b9072071` is The Dark Knight (HEVC, transcodes
   fine to H.264). The Mike Nolan Show episodes are AV1+Opus and break in
   Chrome. Always test the actual item the user reported.

10. **Don't skip headless smoke-test.** Visual confirmation in playwright
    Chromium catches CSS regressions instantly without waiting for the user
    to clear browser cache. `bin/headless-test.py` is a 30s round-trip.

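
Item 3 (read back after every write) can be sketched generically. The helper below is an assumption-laden illustration: `post` and `get` are hypothetical callables the operator supplies to wrap the real HTTP calls, and `SilentlyIgnoringServer` is a fake that accepts a write without persisting it, mimicking a 204 that changed nothing.

```python
from typing import Callable, Dict


def write_and_verify(post: Callable[[Dict], None],
                     get: Callable[[], Dict],
                     payload: Dict) -> Dict:
    """POST the payload, GET the stored state back, return fields that did not stick."""
    post(payload)
    stored = get()
    return {
        key: {"sent": value, "stored": stored.get(key)}
        for key, value in payload.items()
        if stored.get(key) != value
    }


class SilentlyIgnoringServer:
    """Fake backend: accepts every write but never persists it."""

    def __init__(self) -> None:
        self.state = {"UICulture": "de"}

    def post(self, payload: Dict) -> None:
        pass  # a real server would answer 204 here

    def get(self) -> Dict:
        return dict(self.state)


server = SilentlyIgnoringServer()
print(write_and_verify(server.post, server.get, {"UICulture": "en-US"}))
```
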
---
## Iteration 2 — backdrop visible only on top viewport (2026-05-09 follow-up)

### INC4 online research

Web sweep 2026-05-09 against jellyfin/jellyfin + jellyfin/jellyfin-web
issues filed since 2025-01. All URLs cited inline. "Verdict" = how strong
the link to our two open symptoms (black-screen video, opaque "More from
Season" band) is.

**Q1 — Web 10.10.3 video black-screen on play (server transcoding HLS,
browser shows nothing):**

- jellyfin-webos #126 "Black screen by enable Prefer FMP4-HLS as media
  container" — HEVC Main10 HDR10 10-bit direct-stream goes black, audio
  fine. Workaround: disable Prefer fMP4-HLS.
  https://github.com/jellyfin/jellyfin-webos/issues/126
- jellyfin-web #7405 "HLS Media Errors only in Webbrowsers."
  https://github.com/jellyfin/jellyfin-web/issues/7405
- jellyfin #16612 "Playback errors due to fMP4-HLS" (10.11.8, but root
  cause is fMP4 container; same workaround).
  https://github.com/jellyfin/jellyfin/issues/16612
- forum t-solved-black-screen … web UI 10.0.3: theme
  `.preload { #000 !important }` covered the player. Direct precedent for
  our symptom.
  https://forum.jellyfin.org/t-solved-black-screen-w-audio-when-playing-video-web-ui-10-0-3
- **Verdict: probable.** Two independent vectors:
  (1) fMP4-HLS container produces an init segment hls.js stalls on for
  certain codec profiles;
  (2) custom-CSS overlay covering the player. Both consistent with our
  black-screen-but-server-transcoding behaviour.
- **Next step:** in DevTools, confirm whether `<video>` has frames
  (network MSE buffer) or is occluded. If the SourceBuffer never
  appendBuffer-s, it's #126/#16612 → toggle off "Prefer fMP4-HLS Media
  Container" in playback settings (or strip from custom DeviceProfile).
  If frames are buffered but invisible, search for an opaque ancestor
  (`.preload`, BLACK-PASS rule covering `.videoPlayerContainer`).

**Q2 — Chrome 148 + `-hls_fmp4_init_filename "X-1.mp4"` MSE compatibility:**

- jellyfin-web #7546 "[Regression] Web browser HLS playback times out
  when audio transcoding required - worked in 10.10.7, broken in 10.11.6"
  — hls.js times out waiting for the first segment while ffmpeg probes
  large files.
  https://github.com/jellyfin/jellyfin-web/issues/7546
- jellyfin #14487 "Audio delay don't work with fMP4-HLS."
  https://github.com/jellyfin/jellyfin/issues/14487
- jellyfin #16647 "HLS subtitle X-TIMESTAMP-MAP is misaligned when using
  fMP4 segments."
  https://github.com/jellyfin/jellyfin/issues/16647
- **Verdict: confirmed broken across 10.10.7 → 10.11.x for some
  codec/container combos.** Not Chrome-148-specific; the init-filename
  pattern itself isn't the bug — the timing between ffmpeg probing and
  hls.js segment-load timeout is.
- **Next step:** disable Prefer fMP4-HLS first (single-toggle fix). If
  still broken, drop probesize + analyzeduration on the encoder side, or
  force ts segments via DeviceProfile TranscodingProfile container=ts.

**Q3 — AV1 DirectStream codec-tag mislabel:**

- jellyfin #15646 "AV1 Video Stream in Wrong Container" — av1 muxed into
  mpegts as private-data stream, ffmpeg warning "may not be recognized
  upon reading". Workaround: switch hls_segment_type from mpegts to
  fmp4 with .m4s extension. Marked closed in UI but in Team Review (no
  PR linked, no version-tag yet).
  https://github.com/jellyfin/jellyfin/issues/15646
- Codec Support docs reaffirm AV1 web playback is gated on browser
  support + correct container.
  https://jellyfin.org/docs/general/clients/codec-support/
- **Verdict: confirmed open.** Affects 10.11.3 and back; no PR landed
  in 10.10.x line. Mike Nolan Show AV1+Opus matches the failure pattern.
- **Next step:** ban AV1 DirectStream via custom DeviceProfile
  (drop AV1 from DirectPlayProfiles → forces server-side libx264 transcode).

**Q4 — "More from Season" CSS class names:**

- jellyfin-web source uses `verticalSection` + `detailVerticalSection`
  pair, with `data-type="MusicAlbum|Episode|..."`.
  https://github.com/tedhinklater/JellyfinThemeGuide
- Layouts reference `.scrollSlider`, `.itemsContainer`, `.padded-left`,
  `.sectionTitleContainer` (already in our Iteration 2 fix list).

### INC4 video playback diagnosis (full e2e)

End-to-end test 2026-05-09 ~01:35 UTC. Temp ApiKey
`arrflix-playback-e2e-2026-05-09` (token rotated, deleted at end, verified
SELECT empty). Headless Chromium via playwright drove the SPA login as
guest:123 and clicked .btnPlay on Rick & Morty S1E1 Pilot
(`324f75b84f394a5d9b0749c0679f23b9`). Logs in `/tmp/arrflix-playback-e2e/`.

**Source codec verdict — Rick & Morty Pilot is NOT H.264.** ffprobe inside
container reports the file is HEVC Main 10 / yuv420p10le / 3840x2160 /
TrueHD 5.1 24-bit + AC3 5.1 + AC3 2.0 + PGS subs (4K HDR). Same codec class
as TDK. The task brief assumption ("Rick & Morty likely H.264") is wrong —
this library is 4K HDR remux. Path:
`/media/tv/Rick and Morty (2013)/Season 01/Rick and Morty (2013) - S01E01 - Pilot.mkv`.

**Failure mode at click — playback DOES work, but takes 12-18s to first
frame.** All segments + manifest 200 OK, no console errors, no video.error,
no MediaSource exception, no CSS occlusion (.htmlvideoplayer / `<video>`
display:block opacity:1 visibility:visible z-index:auto, getBoundingClientRect
== full viewport). State timeline (clean run, position reset to 0):

| t (s) | readyState | networkState | currentTime | buffered |
|---|---|---|---|---|
| 2-10 | 0 (HAVE_NOTHING) | 2 (LOADING) | 0 | [] |
| 12 | 3 (HAVE_FUTURE_DATA) | 2 | 0 | [[0, 2.97]] |
| 16 | 3 | 2 | 0.72 | [[0, 5.97]] |
| 22 | 3 | 2 | 6.74 | [[0, 11.99]] |
| 30 | 3 | 2 | 14.75 | [[0, 14.97]] |

With user's actual stored resume position (243.018 s from prior session),
|
||
adds a kill+restart cycle: SPA fetches segment 0, sees currentTime=243,
|
||
seeks → server kills 1st ffmpeg, launches 2nd with `-ss 00:04:03
|
||
-noaccurate_seek -start_number 81`. Browser stays at readyState=1 from
|
||
~t=8s to ~t=16s while 2nd ffmpeg produces segment 81. **Total wait ≈ 18s
|
||
to first painted frame.** From the user's seat that looks identical to a
|
||
broken player.
|
||
|
||
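The seek values in the restarted ffmpeg command line up with simple segment arithmetic. The sketch below reproduces `-ss 00:04:03` and `-start_number 81` from the stored resume position, assuming 3-second segments (`-hls_time 3` in the logged command) and assuming the server snaps the seek to the start of the containing segment. The function name is illustrative, not taken from Jellyfin.

```python
import math

HLS_SEGMENT_SECONDS = 3  # from `-hls_time 3` in the logged ffmpeg command


def seek_to_segment(resume_seconds, segment_seconds=HLS_SEGMENT_SECONDS):
    """Map a resume position to (segment index, HH:MM:SS start of that segment).

    Assumption: the seek is snapped to the start of the segment containing
    the resume position. This matches the logged values but is not taken
    from Jellyfin's source.
    """
    index = math.floor(resume_seconds / segment_seconds)
    start = index * segment_seconds
    hours, remainder = divmod(start, 3600)
    minutes, seconds = divmod(remainder, 60)
    return index, f"{hours:02d}:{minutes:02d}:{seconds:02d}"


print(seek_to_segment(243.018))  # (81, '00:04:03')
```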
**Server-side ffmpeg command (verified live in jellyfin logs):**
```
/usr/lib/jellyfin-ffmpeg/ffmpeg -analyzeduration 200M -probesize 1G \
  -i "/media/tv/Rick and Morty (2013)/Season 01/...Pilot.mkv" \
  -map 0:0 -map 0:1 -codec:v:0 libx264 -preset veryfast -crf 23 \
  -maxrate 13546858 -profile:v:0 high -level 51 \
  -vf "setparams=color_primaries=bt2020:color_trc=smpte2084:colorspace=bt2020nc,\
scale=trunc(min(max(iw,ih*a),min(3840,2160*a))/2)*2:trunc(min(max(iw/a,ih),min(3840/a,2160))/2)*2,\
tonemapx=tonemap=bt2390:desat=0:peak=100:t=bt709:m=bt709:p=bt709:format=yuv420p" \
  -codec:a:0 libfdk_aac -ac 2 -ab 256000 \
  -hls_segment_type fmp4 -hls_fmp4_init_filename "...-1.mp4" \
  -start_number 0 -hls_segment_filename "/cache/transcodes/...%d.mp4" \
  -f hls -hls_time 3 ...
```
`HardwareAccelerationType=none` + 4K + tonemapx + libx264 veryfast +
software stereo downmix. **Per-segment encode wallclock observed:** seg0
~6 s, seg1 ~2.05 s. At nullstone Ryzen 5 5600G CPU-only, that's ~50% of
real-time on a sustained run. Browser stalls because new segments arrive
slower than they're consumed.

**PlaybackInfo verdict (browser-emulating DeviceProfile, av1+h264+vp9 all
allowed):** `SupportsDirectPlay=False`, `SupportsDirectStream=False`,
`SupportsTranscoding=True`,
`TranscodeReasons=ContainerNotSupported,VideoCodecNotSupported,AudioCodecNotSupported`,
`TranscodingSubProtocol=hls`, `TranscodingContainer=ts` (when client asks
ts) — but in the headless run the SPA's stock DeviceProfile asks
`SegmentContainer=mp4` (fmp4 path) and the server picked **libx264 H.264
high@5.1 8-bit**, NOT av1. The `VideoCodec=av1,h264,vp9` in the URL is the
priority list; server reads it and selects the first the source can map
to without HW — that's libx264 here, confirmed by `-codec:v:0 libx264` in
ffmpeg cmdline. AV1 is never used as a transcode target on prod.

**Web research corroboration:**
- jellyfin#13324 "Transcoded playback of 4K HDR content fails": "no modern
  consumer CPU can transcode 4K HDR to SDR in real time" — software
  tonemapping is the bottleneck.
  https://github.com/jellyfin/jellyfin/issues/13324
- jellyfin#5067 "HDR Tone Mapping is very slow in Jellyfin (19fps, 70%
  cpu)": ~20 fps cap on tonemapx.
  https://github.com/jellyfin/jellyfin/issues/5067
- jellyfin docs Hardware Acceleration: software CPU decode + tonemap +
  encode at 4K HDR is officially "not supported for sustained real-time".
  https://jellyfin.org/docs/general/post-install/transcoding/hardware-acceleration/

**Recommended fix (ordered by reversibility + UX impact):**

1. **Cap user MaxStreamingBitrate to 20 Mbps in jellyfin-web settings.**
   Each user → Profile → Playback → Quality → 20 Mbps (or "Auto" with a
   default cap). Server-side ffmpeg still runs, but with `-maxrate 20000000`
   the output bitrate is reasonable and the scale filter clamps to
   1080p (1920x800 for the source aspect), eliminating the 4K scale
   pass. Reduces per-segment encode wallclock from ~6s → ~1.5s. **Single
   toggle, per-user, no server restart, fully reversible.** This is the
   right move first.

2. **Force libx264 + transcoding container=ts via DeviceProfile (or in
   jellyfin-web settings disable "Prefer fMP4-HLS").** Skips the fmp4
   init-segment path which is implicated in jellyfin#16612 / webos#126
   for HEVC Main10 sources. `ts` segments self-contain init data —
   simpler timing.

3. **Disable software tonemapping for libraries with fake-HDR sources.**
   Doc 21 already established R&M's `MasteringDisplay/MaxCLL` are absent
   (fake AI-upscale HDR). Server-side toggle:
   ```
   ssh user@192.168.0.100 'docker exec jellyfin sh -c "\
   sed -i \"s|<EnableTonemapping>true|<EnableTonemapping>false|\" \
   /config/config/encoding.xml" && docker restart jellyfin'
   ```
   Removes the tonemapx step from the filtergraph. Output will be SDR-
   directly-from-HDR-pixels (washed out per doc 21 — already accepted as
   the lesser evil for R&M). Saves ~30% encode CPU at 4K.

4. **(Last resort, deferred to 10.11.8 migration)** Add a CCR-style
   "transcode pre-warm" hook: when SPA opens a detail page, pre-issue
   `/Items/{id}/PlaybackInfo` + a no-op range request on segment 0 to
   start ffmpeg before the user clicks Play. Reduces perceived TTFP.

**Recommended immediate action: option 1 + option 3.** No code change
needed — both are settings flips. After flipping, repro: open Pilot in
Chrome, click Play, time-to-first-frame should be <5s.

**Headless artefact warning:** the `v2-02-after-30s.png` screenshot is
pure black despite readyState=3 + currentTime advancing + buffered=[0,
14.97]. That is because Chromium without GPU does not paint decoded H.264
frames (no compositor target). Real Chrome on real GPU paints. So a
black screenshot from `bin/headless-test.py` after Play is NOT a CSS bug
— it's a headless rendering artefact. Verify CSS occlusion via
`getComputedStyle` + `getBoundingClientRect` instead, both already clean
in this run.

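Because headless screenshots are unreliable here, the sampled `<video>` state is the better signal. As a minimal sketch, the state timeline recorded earlier in this section can be reduced to a single time-to-first-frame number. The sample dict shape (`t`, `readyState`, `currentTime`) is a hypothetical format for illustration, not the actual log format; the values are copied from the timeline table, with the `2-10` row represented by its last sample.

```python
def time_to_first_frame(samples):
    """Return the first sample time at which playback has visibly started.

    `samples` is a list of dicts sorted by `t`. "Started" here means the
    element has at least current data (readyState >= 2) and currentTime
    has moved off zero.
    """
    for sample in samples:
        if sample["readyState"] >= 2 and sample["currentTime"] > 0:
            return sample["t"]
    return None  # never started within the observation window


# Values from the clean-run timeline (position reset to 0).
timeline = [
    {"t": 10, "readyState": 0, "currentTime": 0.0},
    {"t": 12, "readyState": 3, "currentTime": 0.0},
    {"t": 16, "readyState": 3, "currentTime": 0.72},
    {"t": 22, "readyState": 3, "currentTime": 6.74},
    {"t": 30, "readyState": 3, "currentTime": 14.75},
]

print(time_to_first_frame(timeline))  # 16
```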
**Open follow-ups left:** AV1+Opus episodes (Mike Nolan Show) still
untested in this iteration — different failure mode (DirectStream
codec-tag mislabel per Q3 above), separate fix path.

https://deepwiki.com/jellyfin/jellyfin-web/3.5-home-sections-and-library-navigation
- BobHasNoSoul/jellyfin-mods uses `#itemDetailPage` parent + nth-of-type
  for section targeting.
  https://github.com/BobHasNoSoul/jellyfin-mods
- **Verdict: confirmed.** The wrapper is `.verticalSection.detailVerticalSection`
  (no `moreFromSeasonSection` class — Jellyfin distinguishes sections by
  `data-type` attr, not class). Our INC3 selector list already covers
  `.detailVerticalSection*`, so the opaque band is from a DESCENDANT, not
  the wrapper itself. Likely candidates: a `.cardScalable`, `.cardBox`, or
  `.cardImageContainer` with explicit `background:#000` from BLACK-PASS.
- **Next step:** in DevTools, inspect the opaque band, walk parent chain,
  find the first ancestor with non-transparent computed bg. Either
  add to transparent-scope or wrap selector in `:not(.cardImageContainer)`.

**Q5 — Themes implementing full-page persistent backdrop:**
- meow.garden "Dynamic backdrops for Jellyfin" — uses
  `.detailPagePrimaryContainer .detailImageContainer .blurhash-canvas {
  position: fixed !important; opacity: .5; }` to repurpose the blurhash
  placeholder as a fullscreen backdrop.
  https://meow.garden/jellyfin-dynamic-backdrops/
- Cineplex theme custom.css: targets `.backgroundContainer`,
  `.backgroundContainer.withBackdrop`, `.backdropImage`, `.blurhash-canvas`
  (commented out). Mobile-only `.itemBackdrop` mask gradient.
  https://github.com/MRunkehl/cineplex
- Finity theme: minimal docs, refers to "gradient mask for show backdrops"
  but actual selectors live in CSS files (not exposed in README).
  https://github.com/prism2001/finity
- **Verdict: confirmed.** Two viable patterns:
  (1) pin `.backgroundContainer` (our current INC2 approach) — works but
  must transparent-scope every ancestor;
  (2) repurpose `.blurhash-canvas` as the fixed layer (meow.garden) —
  cleaner because blurhash is already per-item; survives section navigation
  without scroll math.
- **Next step:** if INC3 transparent-scope keeps regressing, switch to
  blurhash-canvas pin. One selector vs ~20 wrappers to keep transparent.

**Q6 — 10.10.3 → 10.10.7 worth bumping?**
- 10.10.7 forum announcement (2025-04-05): security release, "several
  bugfixes." Trusted-proxies config required pre-upgrade.
  https://forum.jellyfin.org/t-new-jellyfin-server-web-release-10-10-7
- Compare-page diff (v10.10.3...v10.10.7) didn't generate (too long).
  Releasebot lists per-release notes:
  https://releasebot.io/updates/jellyfin/jellyfin-server
- Most fMP4/HLS fixes in our research target 10.11.x line, not 10.10.x
  patch series.
- **Verdict: probable mild improvement, not a fix for our bugs.** Worth
  bumping for security/CVE coverage but unlikely to resolve black-screen
  or carousel-band. The known regressions of 10.11.x (`#7546`, `#16612`)
  argue against jumping straight to 10.11.8 without dev validation.
- **Next step:** snapshot DB, bump dev to 10.10.7 first. If still broken,
  10.11.8 is roadmap path with ElegantFin theme swap.

**Q7 — Force-transcode-everything DeviceProfile:**
- Jellyfin docs confirm there's no built-in admin toggle to force
  transcoding for all clients.
  https://jellyfin.org/docs/general/post-install/transcoding/
- forum.jellyfin.org/t-force-trasnscoding-or-disable-directplay: community
  workaround is to reduce client max bitrate to 1Mbps (degrades quality) —
  no clean DeviceProfile-only override.
  https://forum.jellyfin.org/t-force-trasnscoding-or-disable-directplay-x265-stuttering-firetv
- jellyfin-web #7651 "Chrome DeviceProfile hardcodes MKV in
  DirectPlayProfiles": JS-Injector plugin removes entries client-side
  before PlaybackInfo POST. Workaround pattern is generalisable: hook
  PlaybackInfo XHR, set `DirectPlayProfiles=[]`, leave only
  `TranscodingProfiles` with H264 mp4/HLS. Server then has nothing to
  match → forces transcode.
  https://github.com/jellyfin/jellyfin-web/issues/7651
- **Verdict: confirmed pattern, no native config knob.** Server-side
  empty DirectPlayProfiles in a custom DeviceProfile is the cleanest
  bypass; only ts-format TranscodingProfile remaining → libx264.
- **Next step:** create custom DeviceProfile in admin → DLNA → Profiles
  with empty DirectPlay + a single TranscodingProfile (Container=mp4,
  VideoCodec=h264, AudioCodec=aac, Protocol=Hls). Match to Identification
  by browser UA. Eliminates codec compat as a variable in one move and
  is the cleanest test for "is the bug in our codec path or our renderer".

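The profile described in the Q7 next step can be sketched as a plain dict, built here in Python so it can be dumped to JSON for a PlaybackInfo request body. This is an untested illustration: the `Container` / `Type` / `VideoCodec` / `AudioCodec` keys mirror the DeviceProfile fragment used in the PlaybackInfo probe elsewhere in this doc, while `Name`, `Context`, and the helper name are assumptions, and nothing here has been validated against 10.10.3.

```python
import json


def force_transcode_profile(name="force-h264-hls"):
    """Build a DeviceProfile with no direct-play entries (hypothetical helper).

    With DirectPlayProfiles empty the server has nothing to match, so the
    single TranscodingProfile is the only remaining path.
    """
    return {
        "Name": name,  # assumption: free-form label
        "DirectPlayProfiles": [],
        "TranscodingProfiles": [
            {
                "Type": "Video",
                "Container": "mp4",
                "VideoCodec": "h264",
                "AudioCodec": "aac",
                "Protocol": "hls",
                "Context": "Streaming",  # assumption: not stated in this doc
            }
        ],
    }


profile = force_transcode_profile()
print(json.dumps({"DeviceProfile": profile}, indent=2))
```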

---

After INC1 (`:has()` transparent-scope) shipped and prod showed backdrop on
detail-page top, owner reported "in the middle of the More from Season 1
is black, it's hiding the artwork". Below-the-fold sections (Next Up, Seasons,
More Like This) showed solid black instead of continuing the backdrop.

### Root cause (INC2)

`.backdropContainer` defaults to non-fixed positioning — it scrolls out of
view. INC1 made wrappers transparent so backdrop showed through, but only
where the backdrop EXISTED in the DOM viewport. Once user scrolls down,
backdrop is above viewport, sections see body's `#000` bg.

### Fix INC2

Pin `.backdropContainer` + `.backgroundContainer` to `position: fixed; top:0;
height:100vh; z-index:0`. Added `::after` vertical gradient (transparent at
top → 75% black at bottom) so text remains readable as user scrolls into
backdrop area.

### Root cause (INC3)

INC2 alone didn't fix it visually — section wrappers (`.detailVerticalSection`,
`.scrollSliderContainer`, `.padded-bottom-page`, `.itemsContainer` etc) still
painted opaque bg from BLACK-PASS + finity. Pinned backdrop sat behind, but
sections occluded it section-by-section.

### Fix INC3

Extended transparent-scope to all detail-page sub-sections:
`.itemDetailPage > *`, `.detailPageContent`, `.detailPagePrimaryContainer`,
`.detailPageWrapperContainer`, `.detailVerticalSection*`, `.detailSection*`,
`.itemsContainer`, `.scrollSlider*`, `.padded-bottom-page`,
`.sectionTitleContainer`, `.detailRibbon`, `.subtitleAudioContainer`,
`.detailPageRoot`.

### Verification (INC2 + INC3)

Updated `bin/headless-test.py` to take TWO viewport screenshots: top-of-page
+ scrolled to 50% page height. With INC2/INC3 applied, scrolled screenshot
shows R&M backdrop persisting behind "Seasons" + "More Like This" sections
(previously: solid black).

### Lesson learned

When pinning a backdrop with `position:fixed`, transparency must extend
RECURSIVELY through every wrapper ON TOP of the backdrop layer, not just the
top-level page wrappers. Test with scrolled screenshot — full-page screenshot
in playwright stretches viewport and hides `position:fixed` issues.

`bin/headless-test.py` now takes both top + scrolled. Use both to bisect.

---

### INC4 black-band locator (2026-05-09)

**Symptom.** After INC3, owner reported that for ADMIN users a wide black
band (~250px tall, full-width) still painted around the "More from Season 1"
carousel on the Rick & Morty detail page (admin-only carousel; guest users
don't see it). Cards rendered fine, only the BAND around them was opaque.

**Diagnostic method.** Inserted temp `arrflix-band-diag-2026-05-09` ApiKey,
logged in as admin via playwright, navigated to R&M detail page, scrolled
all sections into view, then walked DOM upward from each `.scrollSlider`
restricted to the `.itemDetailPage` subtree, reporting every ancestor with
non-transparent background. Locator script: `/tmp/arrflix-band-locator.py`.

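The filtering half of that ancestor walk can be sketched in plain Python. This is not the contents of `/tmp/arrflix-band-locator.py`; it is a minimal illustration that assumes the browser side has already collected `(selector, computed backgroundColor)` pairs from the target up to `.itemDetailPage`, and that computed colours arrive in the `rgb(...)` / `rgba(...)` form Chrome reports. The selector chain below is illustrative.

```python
import re

# Matches computed colours such as "rgb(0, 0, 0)" or "rgba(0, 0, 0, 0)".
_RGBA = re.compile(
    r"rgba?\(\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*(?:,\s*([\d.]+)\s*)?\)"
)


def is_opaque(css_color):
    """True when a computed background-color string paints visible pixels."""
    match = _RGBA.fullmatch(css_color.strip())
    if not match:
        return False  # "transparent" or an unrecognised format
    alpha = float(match.group(4)) if match.group(4) is not None else 1.0
    return alpha > 0


def opaque_ancestors(chain):
    """Return the selectors in `chain` whose background is not transparent.

    `chain` is a list of (selector, computed_background) tuples ordered
    from the target element up to the page root.
    """
    return [selector for selector, background in chain if is_opaque(background)]


chain = [
    (".scrollSlider", "rgba(0, 0, 0, 0)"),
    (".emby-scroller", "rgb(0, 0, 0)"),
    (".detailVerticalSection", "rgba(0, 0, 0, 0)"),
    (".itemDetailPage", "rgba(0, 0, 0, 0)"),
]
print(opaque_ancestors(chain))  # ['.emby-scroller']
```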
**Result.** Single opaque-black wrapper found, identical for ALL 9
carousels (Schedule / Next Up / Seasons / Additional Parts / Lyrics /
Cast & Crew / Special Features / Music Videos / Scenes / **More Like This** /
**More from Season** / **More from Artist**):

```
div.padded-top-focusscale.padded-bottom-focusscale.no-padding.emby-scroller
  bg = rgb(0, 0, 0) pos = static z = auto
  rect = x:80 y:1242 1488×333 (matches the band the user described)
```

**Root cause.** Pre-existing CSS rule in `branding.xml` from 2026-05-08
labelled `/* kill gray band behind home-page Recently Added rows */` applied
`.emby-scroller { background: #000 !important; }` UNSCOPED. INC3 overrode
its sibling wrappers (`.detailVerticalSection`, `.itemsContainer`,
`.scrollSlider`, `.scrollSliderContainer`) but missed the IMMEDIATE PARENT
`.emby-scroller`. That single wrapper was the band.

**Fix INC4.** Detail-page-scoped transparent override appended to CustomCss
after the INC3 block:

```css
.itemDetailPage .emby-scroller,
.itemDetailPage .emby-scroller-container,
.itemDetailPage .verticalSection,
.itemDetailPage .padded-top-focusscale,
.itemDetailPage .padded-bottom-focusscale,
.itemDetailPage .moreFromSeasonSection,
.itemDetailPage .moreFromArtistSection,
.itemDetailPage .scrollSliderContainer,
.itemDetailPage .scrollButtonContainer {
  background-color: transparent !important;
  background: transparent !important;
}
```

No `position:relative; z-index:1` needed on `.emby-scroller` — the parent
`.detailPageWrapperContainer` already has `position:relative; z-index:2`,
which is above the pinned `.backdropContainer` at `z:0`. Removing the opaque
fill alone is sufficient.

**Verification.** Re-ran band-locator after `docker restart jellyfin` —
`opaqueBlackBands: 0` inside `.itemDetailPage` (was 1). Screenshot of R&M
detail page at mid-scroll now shows portal/Easter Island backdrop continuous
behind every carousel including "More Like This". Cleaned up the
`arrflix-band-diag-2026-05-09` ApiKey row.

**Patch lines added** to `bin/apply-26-incident-fixes.sh` so re-runs are
idempotent and recover from `branding.xml` drift.

**Lesson.** When a prior unscoped `background: #000 !important` rule exists
in a shared CSS bucket (here: `branding.xml CustomCss`), grep the file for
the property/selector BEFORE writing a new transparent-scope rule. A
DOM-walking locator script that reports every opaque ancestor of the target
finds the painter in seconds — much faster than guessing selectors. Going
forward: when adding a "paint opaque" rule, scope it from day one
(`.homePage .emby-scroller`, not bare `.emby-scroller`).

---

## Open follow-ups (for separate sessions)

- **AV1+Opus playback** (Bug E): Chrome's AV1 DirectStream codec-tag mislabel
  bug. Fix options: (a) ban AV1 DirectStream via DeviceProfile (force x264
  transcode), (b) re-encode MNS source to H.264, (c) wait for 10.11.8
  upgrade. See agent finding in this doc → "Playback diagnosis".

- **10.11.8 migration**: current 10.10.3 has known issues per online research
  (TMDB scrape regression #14922, custom CSS injection #7220). 10.11.8 is
  current stable as of 2026-05-09 with CVE fixes. Plan: dev first, snapshot
  EF Core DB migration, swap Cineplex → ElegantFin (10.11-supported), promote
  to prod after verified.

- **Permanent SW kill option** (deferred — stock SW doesn't actually
  intercept anything): if a future Jellyfin update enables a real fetch-handler
  SW, we have the recipe in this doc → "SW kill recipe" agent finding.

- **Session-state backup off-host** (ROADMAP H4): no automated backup yet.
  Today's incident was rescued by inline `cp X X.bak.$(date +%s)` for both
  branding.xml and dynamic.yml — should be systematized.

---

## Iteration 2

### INC4 testing methodology audit

This iteration is a meta-audit on the test that signed off Iteration 1.
After INC1–INC3 shipped, owner reported two regressions the headless test
did NOT catch:

1. A black band painted behind the **"More from Season N"** carousel on
   detail pages.
2. **Video plays as a black screen** on the user's actual TV episode
   content (AV1+Opus from Mike Nolan Show), even though the test claimed
   playback was fixed.

This section documents what the v1 test missed, why those gaps existed,
what `bin/headless-test-v2.py` changes, and the preflight protocol every
future fix must pass before claiming "verified".

#### a) What v1 missed

| Gap | Concrete consequence |
|---|---|
| Logged in **only** as `guest` (non-admin restricted user). | The "More from Season N" carousel is admin-visible content. `guest`'s permissions hid it from the DOM, so the section wrapper that painted the black band never rendered during the test. v1 reported "no regression" because the offending element wasn't on the page it screenshotted. |
| **Never clicked Play.** v1 only loaded the detail page, took screenshots, scraped a small fixed selector list. | A `<video>` element that fails to decode (AV1 in Chrome with mislabelled codec tag, per Bug E in this doc) won't show up unless you actually start playback. v1 had no way to observe `video.error`, `video.readyState`, `videoWidth/Height`, or `currentTime` because the player was never instantiated. |
| **Only one item tested.** v1 auto-picked the first Series and probed its detail page. | Codec coverage was random — usually whatever happened to be first alphabetically. The HEVC movie that worked (Dark Knight) and the AV1 episode that didn't (Mike Nolan Show) had different failure modes; v1 couldn't distinguish them because it tested neither systematically. |
| **Hardcoded selector list** for DOM probe. | v1 inspected ~22 known selectors. Any new section wrapper (e.g. `.moreFromSeasonContainer`) painting an opaque background outside that list was invisible. The black band lived in a wrapper v1 didn't even know existed. |
| **No structured pass/fail criterion.** v1 emitted `probe.json` with raw computed-style snapshots; humans had to read it and decide. | "I declared playback fixed" — that human decision had no machine-verifiable backing. There was no JSON field saying `regressions: []` that owner / next-Claude could trust without re-deriving from raw data. |
| **No cross-reference to a known-good baseline.** | Even if v1 had caught the band, there was no golden-image comparison to alert "this looks different from last passing run". Detection relied on someone eyeballing the screenshot. |

#### b) Why those gaps existed

- **Speed-bias.** v1 was written under time pressure as the third-tier
  verification of an INC3 CSS fix. The minimum viable test was "page
  loads and looks right at top + scrolled". That worked for the visual
  bug it was designed against — and stopped there.
- **No threat model for the test itself.** The test never asked "what
  classes of regression CAN I detect, what classes CAN'T I". If it had,
  the missing-Play and admin-only-content gaps would have been obvious.
- **Single-account convenience.** `guest-mirror` was the easiest creds
  to hand because doc 17 had just minted them. Re-using one role across
  the whole verification was the path of least resistance.
- **Selector tunnel-vision.** The selector list was copied from the
  previous fix's diagnostic queries (INC2/INC3). It tracked what the
  previous bugs touched, not what the current page actually rendered.
- **Server-log success treated as proof of client success.** Bug E was
  declared "fixed" because Dark Knight transcoding logs looked clean.
  No one closed the loop and confirmed the user's actual content
  (Mike Nolan Show / AV1) decoded in a real browser.

#### c) What v2 changes (`bin/headless-test-v2.py`)

| Improvement | Mechanism |
|---|---|
| **Multi-user coverage** | Runs the entire probe twice: once as admin (`s8n` / `s8n-dev`), once as non-admin (`guest` / `guest-mirror`). Per-user screenshots + `probe.json`. Computes a `section_title_diff` listing which sections rendered for one role but not the other — that diff is the canonical alert for "you're missing admin-only content". |
| **Click Play + observe** | After detail page settles, locates `.btnPlay` / `[data-action="play"]`, clicks (with keyboard `p` fallback), waits 10 s, then reads `<video>` element state: `currentTime`, `paused`, `ended`, `readyState`, `networkState`, `videoWidth`, `videoHeight`, `error.code`, `buffered_ranges`. Also captures a `*-play.png` screenshot and accumulates new console / network errors during the playback window. |
| **Multiple-item coverage** | Three items per role: HEVC movie (Dark Knight, hardcoded id `7aa5add2c2d8575eda5280b9b9072071`), AV1 episode (auto-picked from Mike Nolan Show), H.264 episode (auto-picked from a different series). Codec types are labelled in JSON so failures can be attributed to a codec class, not "the test failed". `ITEMS=` env var overrides for ad-hoc runs. |
| **Section-bg sweep** | At scroll-bottom, walks `document.querySelectorAll('*')` and reports every visible element with non-transparent `backgroundColor` whose rect overlaps the viewport. Filters via a small `BG_ALLOWLIST` (video player, dialogs, header) and a darkness heuristic (R+G+B < 90 → likely a black-band regression). Output goes into `probe.json` under `runs[].items[].regressions`. |
| **Golden-screenshot diff** | If `OUT/golden/<key>-{top,mid,bot,play}.png` exists, the run computes a Pillow `ImageChops.difference`, writes a diff PNG, and emits `{bbox, ratio}` per shot. Maintainer can populate goldens after the next clean run; subsequent runs flag drift quantitatively. |
| **Structured pass/fail JSON** | `probe.json` now has stable shape: `{url, runs:[{role, user, is_admin, items:[{kind, probe, play, regressions, diffs_vs_golden}]}], section_title_diff, issues, exit_code}`. `grade()` produces `issues[]` and exits 0/2 deterministically. CI / orchestration can `jq '.issues \| length' probe.json`. |
| **Documented invariants up front** | The script header explicitly lists "what v1 missed and how v2 closes it" so the next person reading it doesn't repeat the speed-bias trap. |

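The section-bg sweep's filter step can be sketched as below. This is not the code in `bin/headless-test-v2.py`; it only illustrates the allowlist plus darkness heuristic described in the table, using the stated `R+G+B < 90` threshold. The allowlist entries, the element dict shape, and the function name are hypothetical.

```python
# Hypothetical allowlist entries; the real BG_ALLOWLIST is described only as
# covering the video player, dialogs, and header.
BG_ALLOWLIST = (".videoPlayerContainer", ".dialog", ".skinHeader")
DARK_SUM_THRESHOLD = 90  # R+G+B below this is treated as a likely black band


def flag_dark_regressions(elements, allowlist=BG_ALLOWLIST,
                          threshold=DARK_SUM_THRESHOLD):
    """Return selectors of visible, non-allowlisted elements with a dark fill.

    Each element is a dict with `selector` (str) and `rgba` (r, g, b, alpha).
    """
    flagged = []
    for element in elements:
        red, green, blue, alpha = element["rgba"]
        if alpha == 0:
            continue  # fully transparent: paints nothing
        if any(entry in element["selector"] for entry in allowlist):
            continue  # intentionally opaque by design
        if red + green + blue < threshold:
            flagged.append(element["selector"])
    return flagged


elements = [
    {"selector": "div.emby-scroller", "rgba": (0, 0, 0, 1)},
    {"selector": "div.skinHeader", "rgba": (0, 0, 0, 1)},
    {"selector": "div.cardBox", "rgba": (40, 40, 40, 1)},
    {"selector": "div.itemsContainer", "rgba": (0, 0, 0, 0)},
]
print(flag_dark_regressions(elements))  # ['div.emby-scroller']
```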
#### d) Preflight protocol — do this before claiming any ARRFLIX fix is "verified"

Treat this list as a hard gate. If any step is skipped, the fix is
**unverified**, not "fixed".

1. **Run v2 with both roles.** `bin/headless-test-v2.py https://dev.arrflix.s8n.ru`.
   Confirm exit code 0 AND `probe.json .issues` is empty. If exit code 2,
   read `.issues[]` — those are concrete regressions, not flaky test noise.
2. **Inspect `section_title_diff`.** A non-empty `only_admin` array means
   the admin sees content the guest doesn't — that section MUST be
   verified visually in the admin screenshots, because guest-only testing
   would have been blind to it.
3. **Confirm playback per codec.** For each item in `runs[].items[]`,
   `play.video.readyState` must be ≥ 2 AND `play.video.error` must be
   `null`. `paused` is acceptable iff `currentTime > 0` (autoplay policy
   may pause after the first frame, but a frame DID render). `videoWidth`
   and `videoHeight` must be > 0 — that's the canonical "actually
   decoding" check.
4. **Sweep flagged dark backgrounds.** Any element in
   `runs[].items[].regressions` that is not a known overlay (dialog,
   video player chrome, drawer header) is a candidate band-bg
   regression. Add it to `BG_ALLOWLIST` only if the design genuinely
   intends it to be opaque; otherwise fix the CSS.
5. **Diff against goldens.** If `diffs_vs_golden[].ratio` for any shot
   exceeds your threshold (start at 0.02 = 2% pixels changed), open the
   `*-diff.png` and confirm the change was intended.
6. **Run on prod after dev passes.** Same script, same expectations:
   `bin/headless-test-v2.py https://arrflix.s8n.ru`. Dev mirror exists
   (doc 12 / doc 17) precisely so you can verify there first.
7. **Only THEN write "verified" in the doc.** Always cite the run's
   `probe.json` path and exit code in the verification note. Future-you
   needs to be able to re-run the exact same gate.

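Steps 2 and 3 of the gate are mechanical enough to sketch in code. The following is an illustration of the stated criteria, not the `grade()` function from `bin/headless-test-v2.py`. The `video` dict keys follow the `<video>` fields named in this section; the `only_guest` key and both function names are assumptions.

```python
def section_title_diff(admin_titles, guest_titles):
    """List section titles that rendered for one role but not the other."""
    admin, guest = set(admin_titles), set(guest_titles)
    return {
        "only_admin": sorted(admin - guest),
        "only_guest": sorted(guest - admin),  # assumed key name
    }


def playback_ok(video):
    """Apply the step-3 criteria to one item's captured <video> state."""
    if video.get("error") is not None:
        return False
    if video.get("readyState", 0) < 2:
        return False
    # Zero dimensions mean no video track is actually decoding.
    if video.get("videoWidth", 0) <= 0 or video.get("videoHeight", 0) <= 0:
        return False
    # A paused element is acceptable only if at least one frame was played.
    if video.get("paused") and video.get("currentTime", 0) <= 0:
        return False
    return True


good = {"error": None, "readyState": 4, "videoWidth": 1920,
        "videoHeight": 1080, "paused": False, "currentTime": 3.2}
audio_only = dict(good, videoWidth=0, videoHeight=0)

print(playback_ok(good), playback_ok(audio_only))  # True False
print(section_title_diff(["Seasons", "More from Season 1"], ["Seasons"]))
```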
Three single-sentence rules carved out of this protocol, for posters on
the wall:

- **Always test as both admin and non-admin** — admin-only sections are
  invisible to guests, and a fix that breaks admin-only content will not
  be detected by guest-only tests.
- **Always click Play** — page-load is necessary but not sufficient;
  black-screen playback only manifests after `<video>` is instantiated
  and a frame is requested.
- **Always sweep ALL backgrounds** — fixed-list selector probes only
  catch regressions in selectors you already knew about, which is the
  opposite of what a regression test is supposed to do.

## Iteration 3

### INC5 AV1 force-transcode (2026-05-09 ~01:55 UTC)

**Symptom:** Owner clicks Play on Mike Nolan Show S1E4 "Ding Dong Delli";
audio plays, video element stays black. Diagnosed as
[jellyfin#15646](https://github.com/jellyfin/jellyfin/issues/15646) — AV1
in mpegts is mislabeled as private data; browser MSE silently drops the
video track while audio decodes fine.

**Path chosen:** *Nuclear / re-encode source files.* DLNA `system/`
profiles directory does not exist in this 10.10.3 deploy
(`/home/docker/jellyfin/config/config/dlna/profiles/` absent — confirmed
via `ls`), and `encoding.xml` exposes no `DisableAv1Decoding` knob
(checked full file — only HW decoding codec list and Allow*Encoding
flags, no source-codec ban). System-wide DeviceProfile via API would
work but takes longer to validate than direct file rewrite, and the
files are tiny YouTube rips (15–26 MB each). Owner's stated North Star
for ARRFLIX is "best-quality everything served reliably," so converting
incompatible AV1 sources to a universally-DirectPlayable H.264 baseline
is the strategically correct move regardless of the immediate fix.

**Confirmed AV1 source for all 3 S1 episodes via ffprobe:**
```
S01E02 FTC              codec_name=av1 / opus
S01E04 Ding Dong Delli  codec_name=av1 / opus  profile=Main 1920x1080 yuv420p
S01E05 Lantana Bush     codec_name=av1 / opus
```

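The same codec check can be scripted for a wider sweep. The sketch below parses the JSON that `ffprobe -v error -show_streams -of json <file>` prints and flags files whose video stream uses a banned codec. Only the parsing is shown, so it runs without ffprobe; the function names and sample payloads are illustrative, not output captured during this incident.

```python
import json


def video_codecs(ffprobe_json):
    """Return codec_name for every video stream in ffprobe's JSON output."""
    data = json.loads(ffprobe_json)
    return [
        stream.get("codec_name")
        for stream in data.get("streams", [])
        if stream.get("codec_type") == "video"
    ]


def needs_reencode(ffprobe_json, banned=("av1",)):
    """True when any video stream uses a codec from `banned`."""
    return any(codec in banned for codec in video_codecs(ffprobe_json))


# Illustrative payloads in the shape produced by `-show_streams -of json`.
av1_sample = json.dumps({"streams": [
    {"codec_type": "video", "codec_name": "av1"},
    {"codec_type": "audio", "codec_name": "opus"},
]})
h264_sample = json.dumps({"streams": [
    {"codec_type": "video", "codec_name": "h264"},
    {"codec_type": "audio", "codec_name": "aac"},
]})

print(needs_reencode(av1_sample), needs_reencode(h264_sample))  # True False
```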
**Re-encode command** (run inside `jellyfin` container so shared bind
mount is writable; ffmpeg from `/usr/lib/jellyfin-ffmpeg/`):

```bash
docker exec -w "/media/tv/The Mike Nolan Show (2016)/Season 01" jellyfin \
  /usr/lib/jellyfin-ffmpeg/ffmpeg -hide_banner -y \
  -i "<ep>.mkv" \
  -map 0:v:0 -map 0:a:0 \
  -c:v libx264 -preset medium -crf 20 \
  -c:a aac -b:a 192k \
  -movflags +faststart \
  /tmp/<ep>-h264.mkv
```

Stream layout simplified deliberately: video + audio only, attachments
(font fallbacks at indices 2/3) dropped — they are not needed for
playback and added a layer of risk. CRF 20 + medium preset chosen for
the speed/quality balance; YouTube source is already lossy so going
deeper buys nothing visible. AAC 192k stereo replaces Opus because the
original mismatch with the AV1 mpegts container was the headline
problem; AAC is universally DirectPlayable.

**Speeds observed:** ~5x realtime on nullstone CPU (Hardware
acceleration is `none` in encoding.xml — see Known Issues). 5m28s of
1080p ran in ~70s wall. Output sizes 8.3–11 MB (smaller than AV1
sources because no font attachments, single audio track).

**Atomic swap** (each episode):
```bash
docker cp jellyfin:/tmp/<ep>-h264.mkv "<dir>/.<ep>.tmp"
mv "<original.mkv>" /tmp/<ep>-av1-original-$(date +%s).mkv.bak
mv "<dir>/.<ep>.tmp" "<original.mkv>"
```

Originals retained at `/tmp/S01E0{2,4,5}-av1-original-1778288{113,184}.mkv.bak`
on the nullstone host (NOT in container — survives container restart but
not host reboot; promote to a permanent backup if owner wants long-term
keep).

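For a bulk sweep, the three-step swap could be wrapped in a helper. This is a minimal Python sketch of the same sequence (stage next to the target, move the original aside, rename the staged file into place), not a script used in this incident. The function name and backup naming are hypothetical, and like the shell version it leaves a brief window in which the original path does not exist.

```python
import os
import shutil
import tempfile
import time


def atomic_swap(new_file, original, backup_dir):
    """Replace `original` with `new_file`, keeping a timestamped backup.

    The new file is first copied into the target directory under a hidden
    temporary name so the final rename stays on one filesystem.
    """
    directory = os.path.dirname(os.path.abspath(original))
    staged = os.path.join(directory, "." + os.path.basename(original) + ".tmp")
    shutil.copy2(new_file, staged)

    backup = os.path.join(
        backup_dir, f"{os.path.basename(original)}.{int(time.time())}.bak"
    )
    shutil.move(original, backup)   # original is set aside, not deleted
    os.replace(staged, original)    # same-directory rename
    return backup


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        library = os.path.join(root, "library")
        backups = os.path.join(root, "backups")
        os.makedirs(library)
        os.makedirs(backups)
        target = os.path.join(library, "episode.mkv")
        replacement = os.path.join(root, "episode-h264.mkv")
        with open(target, "w") as handle:
            handle.write("av1 bytes")
        with open(replacement, "w") as handle:
            handle.write("h264 bytes")
        print(atomic_swap(replacement, target, backups))
```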
**Verification (S1E4, the originally failing episode):**

```bash
$ docker exec jellyfin /usr/lib/jellyfin-ffmpeg/ffprobe -v error \
  -select_streams v:0 -show_entries stream=codec_name,profile,pix_fmt \
  -of default=nw=1 "/media/tv/.../S01E04 - Ding Dong Delli.mkv"
codec_name=h264
profile=High
pix_fmt=yuv420p
```

```bash
$ docker exec jellyfin curl -s -X POST \
  "http://localhost:8096/Items/9312799ca24979bd05aad9733ce7ee14/PlaybackInfo?UserId=2BE0F0D3-FE3A-45DC-9298-138A15A01925&MaxStreamingBitrate=120000000&api_key=<key>" \
  -H "Content-Type: application/json" \
  -d '{"DeviceProfile":{"DirectPlayProfiles":[{"Container":"mkv","Type":"Video","VideoCodec":"h264","AudioCodec":"aac,mp3,opus"}], ...}}'
# Result:
Codec: h264
DirectStream: True
DirectPlay: True
Transcode: True
Reasons: []
```

`SupportsDirectPlay=True` + empty `TranscodeReasons[]` confirms the
file no longer needs transcoding at all: the browser will receive raw
H.264/AAC inside the mkv container, decode natively, and render frames.
(`Transcode: True` only indicates that transcoding remains available as a
fallback, not that it is needed.)
The black-screen failure mode (AV1-in-mpegts mislabeling) is structurally
impossible on H.264 sources.

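A small sketch of the check described above: reduce one `MediaSources[]` entry from a PlaybackInfo response to the flags that matter. The field names follow the ones quoted in this log (`SupportsDirectPlay`, `TranscodeReasons`, and so on); the sample payload is hypothetical and only mirrors the printed result, and `summarize_media_source` is an illustrative name.

```python
import json


def summarize_media_source(source: dict) -> dict:
    """Reduce one PlaybackInfo MediaSources[] entry to the fields checked above."""
    reasons = source.get("TranscodeReasons") or []
    direct_play = bool(source.get("SupportsDirectPlay"))
    return {
        "direct_play": direct_play,
        "direct_stream": bool(source.get("SupportsDirectStream")),
        "transcode_available": bool(source.get("SupportsTranscoding")),
        "reasons": list(reasons),
        # DirectPlay only counts as clean when no transcode reason is listed.
        "clean_direct_play": direct_play and not reasons,
    }


# Hypothetical response body shaped like the result printed above.
sample = json.loads("""
{"MediaSources": [{"SupportsDirectPlay": true, "SupportsDirectStream": true,
                   "SupportsTranscoding": true, "TranscodeReasons": []}]}
""")
summary = summarize_media_source(sample["MediaSources"][0])
print(summary)
```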
**`/Library/Refresh` HTTP 204:** Jellyfin re-scanned and picked up the new
codec metadata.

**All 3 S1 episodes now h264** (single ffprobe sweep post-swap):
```
S01E02 FTC codec_name=h264
S01E04 Ding Dong Delli codec_name=h264
S01E05 Lantana Bush codec_name=h264
```

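The per-file ffprobe check generalises to a sweep. A minimal sketch of the parsing side, assuming the same `-of default=nw=1` key=value output shown in this section; `parse_ffprobe_kv`, `needs_reencode`, and `ffprobe_args` are hypothetical helper names, and actually running ffprobe over the library is left out.

```python
AV1_NAMES = {"av1"}


def parse_ffprobe_kv(text: str) -> dict:
    """Parse `ffprobe -of default=nw=1` output (key=value lines) into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep:
            info[key] = value
    return info


def needs_reencode(info: dict) -> bool:
    """Flag sources whose first video stream is AV1."""
    return info.get("codec_name", "").lower() in AV1_NAMES


def ffprobe_args(path: str) -> list[str]:
    """Same probe as the verification command, as an argv list."""
    return ["ffprobe", "-v", "error", "-select_streams", "v:0",
            "-show_entries", "stream=codec_name,profile,pix_fmt",
            "-of", "default=nw=1", path]


after = parse_ffprobe_kv("codec_name=h264\nprofile=High\npix_fmt=yuv420p\n")
before = parse_ffprobe_kv("codec_name=av1\n")
print(needs_reencode(after), needs_reencode(before))  # False True
```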
### Follow-ups

1. **Owner click-test.** Have the owner Play S1E4 in the actual browser to
   confirm video frames render. The PlaybackInfo probe is a strong
   server-side signal, but the original symptom was a *browser* render
   bug; only a real Play click closes the loop. Flag for INC5-verify.
2. **Sweep entire library for AV1.** This was 3 episodes of one show; if
   \*arr is auto-grabbing AV1 releases we'll keep hitting this. Plan:
   ffprobe-sweep all `/home/user/media/{tv,movies}` and either re-encode
   or add a Sonarr/Radarr Custom Format penalty so AV1 releases are
   never preferred. Tracked separately.
3. **Permanent backup of `*-av1-original-*.mkv.bak`.** Currently in
   nullstone `/tmp`, so a host reboot will lose them. If the owner wants
   originals retained, move to `/home/user/media/.archive/av1-originals/`
   or similar.
4. **Ban AV1 server-side anyway.** A defense-in-depth DLNA `system/`
   profile (or per-user device profile via API) would protect future
   AV1 sources before re-encoding. Defer until #2 produces a count of
   how often this happens in practice.
5. **Hardware encoding still off.** `encoding.xml` shows
   `HardwareAccelerationType=none`. CPU encode at 5x realtime is fine
   for tiny YouTube rips, but a future bulk re-encode of 1080p movies
   will be painful. Not blocking; log against the existing nullstone GPU
   driver issue (Jellyfin notes per `project_jellyfin_nullstone.md`).

### INC5 disable fMP4-HLS (2026-05-09 ~02:00 UTC)

**Belt-and-braces companion to the AV1 force-transcode above.** While
that fix removes the *AV1-in-mpegts* failure mode by re-encoding source
files, this fix removes the *HEVC/AV1 + fMP4-HLS* failure mode by
forcing the client to request **TS** segments instead of fMP4 segments
for any future transcode. Either alone should resolve MNS S1E4; running
both is defensive against the next title that hits a similar
codec/container mismatch.

**Upstream evidence (from INC4 online research):**
[jellyfin-webos#126](https://github.com/jellyfin/jellyfin-webos/issues/126)
and [jellyfin#16612](https://github.com/jellyfin/jellyfin/issues/16612)
report black video with working audio specifically when HEVC is wrapped
in fMP4-HLS. The workaround documented upstream is to disable
"Prefer fMP4-HLS Media Container" in the client playback prefs. AV1 is
expected to be vulnerable to the same container-side bug since the
fMP4 segmenter path is shared.

**Server confirmation (before fix):**

```bash
$ ssh user@192.168.0.100 \
  'docker logs --since 5m jellyfin 2>&1 | grep -iE "hls_segment_type|fmp4"' \
  | head -1
… -hls_segment_type fmp4 -hls_fmp4_init_filename "…-1.mp4" \
  -hls_segment_filename "…%d.mp4" …
```

Confirms the server is currently emitting `*.mp4` (fMP4) segments, i.e. the
affected codepath.

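For repeat checks, the segment type can be pulled out of a logged ffmpeg command line rather than eyeballed. A minimal sketch; `hls_segment_type` is a hypothetical helper, the `before_fix` string is abridged from the log line in this section, and the `after_fix` string is an assumed example of what a post-fix line would contain, not captured output.

```python
import re

SEGMENT_TYPE_RE = re.compile(r"-hls_segment_type\s+(\w+)")


def hls_segment_type(ffmpeg_cmdline: str):
    """Return the -hls_segment_type value from a logged ffmpeg command line, or None."""
    match = SEGMENT_TYPE_RE.search(ffmpeg_cmdline)
    return match.group(1) if match else None


before_fix = '… -hls_segment_type fmp4 -hls_fmp4_init_filename "…-1.mp4" …'
# Hypothetical post-fix line, for illustration only.
after_fix = '… -hls_segment_type mpegts -hls_segment_filename "…%d.ts" …'
print(hls_segment_type(before_fix), hls_segment_type(after_fix))  # fmp4 mpegts
```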
**Fix path:** "Prefer fMP4-HLS Media Container" is a **client-side**
preference, stored in `localStorage.enableHlsFmp4`. The Jellyfin server
honours the device profile sent by the client; flipping this key
makes the client request mpegts (`.ts`) segments and the server
responds with `-hls_segment_type mpegts`. No server config / DLNA
profile edit is needed. Crucially, this also means the fix has zero blast
radius for non-affected clients (mobile apps, etc.): they ignore the
web-only localStorage shim.

**Implementation (`web-overrides/index.html`, lines 82-85):**

Added an idempotent shim to the existing ARRFLIX inline `<script>`,
co-located with the english-lockdown LS_KEYS block (synchronous, runs
before the Jellyfin SPA bundle reads its preferences):

```js
/* INC5 fmp4=false 2026-05-09 — disable "Prefer fMP4-HLS Media Container"
   client-side so HLS uses TS segments. Works around HEVC+fMP4
   black-video bug (jellyfin-webos#126, jellyfin#16612). */
try { localStorage.setItem('enableHlsFmp4', 'false'); } catch(e){}
```

`try/catch` matches the surrounding shim style (storage-quota tolerant).

**Deploy:** `scp` to nullstone
`/opt/docker/jellyfin/web-overrides/index.html` (bind-mounted into the
container, so no restart is required). Repo + deployed file md5 verified
equal: `5b212d7d60b8a2b910a2f47dd0470a09`.

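Since this incident started with repo/deployed drift, the md5 comparison is worth scripting. A minimal sketch, assuming both copies are readable as local paths (for example after fetching the deployed file from nullstone); `md5_of` and `has_drifted` are hypothetical helper names.

```python
import hashlib
from pathlib import Path


def md5_of(path: Path) -> str:
    """Hex md5 of a file, read in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def has_drifted(repo_copy: Path, deployed_copy: Path) -> bool:
    """True when the deployed file no longer matches the repo copy."""
    return md5_of(repo_copy) != md5_of(deployed_copy)
```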
**Browser verification (fresh playwright context, no cached state):**

```
$ python3 /tmp/verify-fmp4.py
localStorage.enableHlsFmp4 = 'false'
localStorage.appLanguage = 'en-US' (sanity check shim ran)
```

Both keys set → shim executed before SPA boot. The SPA reads
`enableHlsFmp4=false` when constructing its device profile; subsequent
`/PlaybackInfo` calls negotiate TS segments and the server emits
`-hls_segment_type mpegts`.

**Headless smoke (`bin/headless-test-v2.py`):** No new regressions
introduced. Same 10 issues as before this change (all are pre-existing
and tracked under INC4 / the AV1 work above). Probe artefact:
`/tmp/arrflix-fmp4-test/probe.json`.

**Owner action:** Hard-reload the browser (Ctrl+Shift+R) and re-test
MNS S1E4. If it is still black after the AV1 re-encode (handled by the
other agent) has taken effect, the fmp4-disable adds a second layer of
defence; if it is already green from the AV1 fix, this remains in place
to prevent the same class of bug on the next codec/container mismatch
(e.g. an HEVC movie that the device profile doesn't DirectPlay).

**Repo commit:** `web-overrides/index.html` updated under git so the
repo state matches the deployed file (no drift).

### INC5 MNS playback verify (post-fix end-to-end test)

Closes the loop on Iteration-3 INC5 fixes (AV1 source re-encode +
fMP4-HLS-disable shim) by exercising the actual user flow on prod:
log in -> click Play on MNS S1E4 -> observe `<video>` state.

**Item under test.** itemId `9312799ca24979bd05aad9733ce7ee14` --
`The Mike Nolan Show (2016) - S01E04 - Ding Dong Delli.mkv`.

**Pre-test ffprobe re-confirmed file is now H.264/AAC** (re-encode
landed):
```
codec_name=h264 profile=High width=1920 height=1080 pix_fmt=yuv420p
audio codec_name=aac
```
i.e. AV1+Opus is gone from the file itself; this verification now
tests whether the re-encoded asset DirectPlays cleanly, not whether
AV1+Opus DirectStream still misbehaves.

**Verification method.** `/tmp/mns-s1e4-verify.py` (focused playwright
probe). Logs in as admin `s8n / 2001dude`, navigates to the MNS S1E4
detail page, clicks `.btnPlay`, snapshots `<video>` state at 5/10/20/30 s
post-click. Captures network requests filtered to `/Videos/`,
`/master.m3u8`, `/PlaybackInfo`, `.m4s`, `.ts`, `/Audio`, `/hls/`. Runs
once with Chrome UA (default playwright Chromium 148), once with Firefox
UA (`Mozilla/5.0 (X11; Linux x86_64; rv:130.0) Gecko/20100101 Firefox/130.0`).
Temp ApiKey row `arrflix-mns-verify-2026-05-09`
inserted to drive `/Users/.../Items` lookup, deleted post-test (verified
`count=0`).

**Result -- Chrome UA:** PLAYING.
- `videoWidth=1920`, `videoHeight=1080` at every snapshot.
- `currentTime` advanced 58.56 -> 63.55 -> 73.54 -> 83.54 (resumed from
  the user's 0:54.326 stop point recorded at 01:53:19).
- `readyState=4` (HAVE_ENOUGH_DATA), `paused=false`, `error=null`.
- 1 buffered range present.

**Result -- Firefox UA:** PLAYING.
- `videoWidth=1920`, `videoHeight=1080` at every snapshot.
- `currentTime` advanced 78.79 -> 83.78 -> 93.76 -> 103.77.
- `readyState=4`, `paused=false`, `error=null`.

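The PLAYING verdict above boils down to a few conditions per snapshot. A minimal sketch of that judgement, assuming each snapshot is a dict keyed by the `<video>` property names quoted in the results; the actual layout of the `video-snapshots-*.json` files is not shown in this log, so the record shape and the `is_playing` helper are hypothetical.

```python
def is_playing(snapshots: list[dict], min_ready_state: int = 4) -> bool:
    """True when every <video> snapshot shows decoded frames and a moving clock."""
    if not snapshots:
        return False
    times = [s["currentTime"] for s in snapshots]
    clock_advances = all(b > a for a, b in zip(times, times[1:]))
    healthy = all(
        s["videoWidth"] > 0                       # zero width means no decoded frames
        and s["readyState"] >= min_ready_state    # 4 == HAVE_ENOUGH_DATA
        and not s["paused"]
        and s["error"] is None
        for s in snapshots
    )
    return clock_advances and healthy


# Chrome UA values from the results above; the dict layout is an assumption.
chrome = [
    {"currentTime": t, "videoWidth": 1920, "videoHeight": 1080,
     "readyState": 4, "paused": False, "error": None}
    for t in (58.56, 63.55, 73.54, 83.54)
]
black = [dict(s, videoWidth=0) for s in chrome]  # simulated black-screen case
print(is_playing(chrome), is_playing(black))  # True False
```

Passing `min_ready_state=3` would relax the check for streams that are still buffering ahead while transcoding.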
**What changed since pre-fix.** Pre-fix symptom was `videoWidth=0` with
audio still playing at t<=0:54 (no frames rendered, black picture). Post-fix:
`videoWidth=1920`, no `<video>.error`, `currentTime` advances normally,
`readyState=4`. Decisive network-log evidence:

```
GET /Items/9312799c.../PlaybackInfo -> 200
GET /Videos/9312799c.../stream.mkv?Static=true&mediaSourceId=... -> 206
```

`Static=true` means the browser is **DirectPlaying** the H.264/AAC MKV
(byte-range request, no transcode pipeline involved). No `master.m3u8`,
no `.m4s`, no `.ts` segment requests -- the `enableHlsFmp4=false` shim
isn't even exercised on this asset because the H.264 source needs no
HLS at all. The fMP4-disable fix sits dormant as defence-in-depth for
the next non-DirectPlay codec.

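The same reading of the network log can be automated. A minimal sketch that labels captured request URLs by the markers named above (`Static=true` on `/Videos/` for DirectPlay; `master.m3u8`, `.m4s`, `.ts` for HLS); `classify_request` and `playback_mode` are hypothetical helpers, and the `ITEM` ids in the sample are placeholders rather than real identifiers.

```python
from urllib.parse import parse_qs, urlsplit

HLS_SUFFIXES = ("master.m3u8", ".m4s", ".ts")


def classify_request(url: str) -> str:
    """Label one captured request as 'directplay', 'hls', or 'other'."""
    parts = urlsplit(url)
    query = {k.lower(): v for k, v in parse_qs(parts.query).items()}
    if "/Videos/" in parts.path and query.get("static", [""])[0].lower() == "true":
        return "directplay"
    if parts.path.endswith(HLS_SUFFIXES):
        return "hls"
    return "other"


def playback_mode(urls) -> str:
    """Overall verdict for a session: any HLS request wins over DirectPlay."""
    labels = {classify_request(u) for u in urls}
    if "hls" in labels:
        return "hls"
    return "directplay" if "directplay" in labels else "unknown"


session = [
    "/Items/ITEM/PlaybackInfo",
    "/Videos/ITEM/stream.mkv?Static=true&mediaSourceId=ITEM",
]
print(playback_mode(session))  # directplay
```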
**Server-side transcode logs.**
```
$ ssh user@192.168.0.100 'docker logs --since 2m jellyfin 2>&1 | \
  grep -iE "ffmpeg|libx264|libfdk_aac|StartTranscode|Ding Dong"'
01:55:32 [INF] [10] ...MediaEncoder: Starting .../ffprobe ... Ding Dong Delli.mkv
```
Only `ffprobe` (codec discovery for the PlaybackInfo response) -- **no
`ffmpeg` transcode** kicked off for the MNS session. ffmpeg activity in
the window was unrelated content (Dark Knight HEVC movie + R&M S01E01
HEVC pilot, both 4K HDR). The server has the right idea: H.264/AAC -> hand
the file to the browser and stay out of the loop. Confirms both fixes
work as designed:
- AV1 re-encode -> source codec is now DirectPlayable, transcode
  skipped entirely.
- fMP4-HLS-disable -> shim sets `localStorage.enableHlsFmp4=false`
  before SPA boot and would force `.ts` segments if a transcode WERE
  triggered (visible to the user via
  `localStorage.getItem('enableHlsFmp4') === 'false'` in the DevTools
  console; repo and deployed file md5 matched).

**Failures captured (5x `net::ERR_ABORTED`).** All on
`/Sessions/Capabilities/Full` and `/Sessions/Playing[/Progress]`. These
are POSTs that the SPA fires through a `sendBeacon`-style path and that
playwright's content_script aborts on page navigation; they do not
indicate playback impairment. Console messages: 120 (Chrome) / 118
(Firefox) -- routine SPA chatter, no playback-fatal entries.

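To keep these known-benign aborts from drowning out real failures in later runs, the requestfailed dump can be partitioned. A minimal sketch; the record shape (`path`, `error`) is an assumption about the `failures-*.json` files, and `split_failures` is a hypothetical helper.

```python
BENIGN_PREFIXES = ("/Sessions/Capabilities/Full", "/Sessions/Playing")


def split_failures(failures):
    """Partition requestfailed records into (benign, suspicious)."""
    benign, suspicious = [], []
    for record in failures:
        is_session_post = record["path"].startswith(BENIGN_PREFIXES)
        aborted = record["error"] == "net::ERR_ABORTED"
        (benign if is_session_post and aborted else suspicious).append(record)
    return benign, suspicious


# Hypothetical records; only the first two match the pattern described above.
failures = [
    {"path": "/Sessions/Capabilities/Full", "error": "net::ERR_ABORTED"},
    {"path": "/Sessions/Playing/Progress", "error": "net::ERR_ABORTED"},
    {"path": "/Videos/ITEM/stream.mkv", "error": "net::ERR_FAILED"},
]
benign, suspicious = split_failures(failures)
print(len(benign), len(suspicious))  # 2 1
```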
**Recommended next steps.**
1. **Owner re-test in their daily-driver browser.** Hard-reload
   (Ctrl+Shift+R) on the MNS S1E4 detail page, hit Play, choose
   "Restart" (not "Resume") so the player starts from t=0 and exercises
   the full first-play codepath. The headless test resumed from t=54.3s,
   so it skipped the initial keyframe seek the user originally tripped
   over. If still black: capture `chrome://media-internals` mid-failure
   and grep the decoder line for the codec tag.
2. **Sweep the rest of the library for AV1.** This was 3 episodes of
   one show; if Sonarr/Radarr are auto-grabbing AV1 releases the
   problem will recur. Open task carried from the Iteration-3 follow-ups
   list.
3. **Pin MNS S1E4 in `bin/headless-test-v2.py`.** Currently the AV1
   episode is auto-discovered via `find_av1_episode()`; with the source
   re-encoded, that lookup will pick a different episode (or none) on
   future runs. Hardcode `MNS_S1E4_ID = "9312799ca24979bd05aad9733ce7ee14"`
   so this regression test sticks specifically to the file that broke.
4. **Cleanup confirmed.** Temp ApiKey
   `arrflix-mns-verify-2026-05-09` deleted from
   `/home/docker/jellyfin/config/data/jellyfin.db` (count=0).

**Artifacts.**
- `/tmp/mns-s1e4-verify.py` -- test script
- `/tmp/mns-verify/video-snapshots-{chrome,firefox}.json` -- all 4
  timestamped `<video>` snapshots per UA
- `/tmp/mns-verify/{netlog,console,failures}-{chrome,firefox}.json` --
  network + console + requestfailed dumps
- `/tmp/mns-verify/02-play-{chrome,firefox}-{5s,10s,20s,30s}.png` --
  per-checkpoint screenshots (8 files)

---

## Final state (case closed 2026-05-09)

| Symptom (owner-reported, in order) | Iteration | Final status |
|---|---|---|
| 1. Browser arrflix broken, videos don't play | INC1 (index.html drift revert) | ✅ resolved |
| 2. Can't see preview of TV/movie | INC1 (`:has()` transparent-scope) | ✅ resolved |
| 3. Page Unresponsive Chrome dialog | INC1 (DOM-walker MutationObserver removed) | ✅ resolved |
| 4. "Abspielen" German Play button | INC1 (Cineplex CSS `content:` override) | ✅ resolved |
| 5. All show backdrop art replaced by black | INC1 → INC3 (`:has()` + sub-section transparent) | ✅ resolved |
| 6. Black band hiding "More from Season N" | INC4 (`.emby-scroller` transparent) | ✅ resolved |
| 7. Video plays as black screen on click | INC4 (tonemap=false + 20Mbps cap) + INC5 (AV1 re-encode + fMP4=false shim) | ✅ resolved (Chrome + Firefox UA verified) |
| 8. Grey strip at very bottom of page on scroll | INC5 (ARRFLIX-themed `::-webkit-scrollbar`) | ✅ resolved |

### Verification matrix (headless playwright)

- **Dark Knight (HEVC 4K HDR)**: `readyState=3`, playing 1918×800 (1080p transcode)
- **Mike Nolan Show S1E4 Ding Dong Delli (was AV1+Opus, now H.264/AAC)**: `readyState=4`, playing 1920×1080, DirectPlay (no transcode needed)
- **Rick and Morty S1E1 Pilot (4K HDR HEVC)**: still slow first-frame on cold seek; pre-transcode batch tracked as follow-up

### Files changed (this incident)

Repo `git.s8n.ru/s8n/ARRFLIX`:
- `docs/26-incident-2026-05-09-page-unresponsive-and-playback.md` (this doc)
- `bin/headless-test.py` (v1: added click + dual-screenshot)
- `bin/headless-test-v2.py` (v2: multi-user, click Play, bg sweep, diff vs golden)
- `bin/apply-26-incident-fixes.sh` (idempotent re-apply of all INC patches)
- `web-overrides/index.html` (INC5 fMP4=false localStorage shim)
- `.gitignore` (`__pycache__/`)

Server-side state on `nullstone:/home/docker/jellyfin/config/config/`:
- `branding.xml` -- INC1+INC2+INC3+INC4+INC5 CustomCss patches
- `encoding.xml` -- `EnableThrottling=false`, `EnableSegmentDeletion=false`, `EnableTonemapping=false`, `EnableVppTonemapping=false`
- 12 users with `Policy.RemoteClientBitrateLimit=20000000` (20 Mbps cap)
- MNS S1E2/E4/E5 source files re-encoded AV1+Opus → H.264/AAC; originals at `/tmp/*-av1-original-*.mkv.bak`

### Forbidden/learned patterns added in this incident

(in addition to the original "Do-NOT-repeat checklist")

11. **Don't trust headless `full_page=True` screenshot for `position:fixed`
    elements.** It stretches the viewport and hides fixed-positioning regressions.
    Use viewport-sized screenshots at multiple scroll positions.
12. **Don't test playback only as `guest` user.** Admin sees more sections
    (carousels: "More from Season X", "More Like This"). Admin-only sections
    can hide their own bg regressions. Always run as both.
13. **Don't declare playback "fixed" without clicking Play in headless.**
    Server-log `PlaybackStart` is necessary but not sufficient: the browser
    side might receive bytes and still render black (codec mislabel).
14. **Don't apply `Clear-Site-Data` on every visit.** Wipes cookies → forces
    re-login → race with playback init. Use it ONCE for stuck SW state and
    immediately remove the middleware.
15. **Don't ban an unscoped `background: #000` rule from prior commits without
    auditing every selector it covers.** The 2026-05-08 home-page
    `.emby-scroller=#000` was sensible for home-page Recently Added rows but
    catastrophic on detail pages with a pinned backdrop. Scope every
    background rule to its target page-class.
16. **Don't assume `text-walker MutationObserver` is fast.** O(N×M) on poster
    grids. Always debounce + scope to specific attribute filters.
17. **Don't forget that Chrome's native scrollbar defaults to grey.** Style with
    `::-webkit-scrollbar*` on dark themes.
18. **Don't fight an upstream codec mislabel bug; re-encode the source.**
    Faster than profile editing for tiny files; aligns with the "best quality"
    promise anyway.

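For pattern 11, the scroll positions for viewport-sized screenshots can be computed up front. A minimal sketch of that arithmetic only; `scroll_offsets` is a hypothetical helper, and driving the browser (scrolling to each offset and taking a non-full-page screenshot) is left to the test script.

```python
def scroll_offsets(page_height: int, viewport_height: int, overlap: int = 0) -> list[int]:
    """Scroll-top values that tile a page with viewport-sized screenshots."""
    if viewport_height <= overlap:
        raise ValueError("viewport_height must exceed overlap")
    last = max(page_height - viewport_height, 0)  # final offset that still fills the viewport
    step = viewport_height - overlap
    offsets = list(range(0, last, step))
    offsets.append(last)
    return offsets


print(scroll_offsets(2500, 1080))  # [0, 1080, 1420]
```

The last offset is clamped so the final screenshot ends exactly at the page bottom, which is where a pinned backdrop or scrollbar regression is most likely to show.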
### Next sessions

- Library-wide AV1 sweep + Sonarr/Radarr custom format penalty so future grabs don't re-trigger #15646.
- 4K HDR pre-transcode batch (R&M masters → 1080p H.264 SDR) OR 10.11.8 migration with GPU driver fixed.
- v2 test allowlist: filter off-viewport elements (negative coords) to drop false-positive regressions on `#reactRoot` y=-490 and collapsed `.mainDrawer` x=-320.
- Promote `/tmp/*-av1-original-*.mkv.bak` to a real archive directory.
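The v2 allowlist item above could start as a simple negative-coordinate filter. A minimal sketch; the finding layout (`selector`, `rect` with `x`/`y`) is an assumption about the v2 test's output, the third record is invented for illustration, and `has_negative_origin` / `filter_findings` are hypothetical names.

```python
def has_negative_origin(rect: dict) -> bool:
    """True when an element's top-left corner sits above or left of the viewport."""
    return rect["x"] < 0 or rect["y"] < 0


def filter_findings(findings):
    """Drop background-sweep findings whose element starts off-viewport."""
    return [f for f in findings if not has_negative_origin(f["rect"])]


findings = [
    {"selector": "#reactRoot", "rect": {"x": 0, "y": -490}},
    {"selector": ".mainDrawer", "rect": {"x": -320, "y": 0}},
    {"selector": ".emby-scroller", "rect": {"x": 0, "y": 640}},  # hypothetical on-screen hit
]
kept = filter_findings(findings)
print([f["selector"] for f in kept])  # ['.emby-scroller']
```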