# Real-hardware failure mode audit (post-v0.5.31) **Agent 9 of 9-agent wave, 2026-05-05.** Pessimistic enumeration. ## A. Boot path ### A1. Secure Boot + GRUB_DISTRIBUTOR rebrand - shim chain itself untouched (uses `/EFI/fedora/`), but `grub2-mkconfig` regenerates entries naming `veilor-os` while shim only trusts paths under `/EFI/fedora/grubx64.efi`. Strict UEFI: menu boots, kernel signatures verify via Fedora's MOK chain. Risk: `os-prober` writing dual-boot Windows entries breaks MBR/MOK. - **Symptom:** dual-boot with Windows shows `Verification failed: (0x1A) Security Violation`. - **Prob:** MED. **Fix:** S. **Target:** v0.5.32. ### A2. KMS handoff — `fbcon=nodefer` necessary but not sufficient - On Intel Arc/iGPU late-gen + NVIDIA proprietary chains, 5-15s blank between vt switch and SDDM start because `simpledrm` releases before `i915`/`nvidia-drm` claim. - **Symptom:** ~10s blank pre-SDDM; user thinks crashed. - **Prob:** HIGH. **Fix:** S — add `i915.modeset=1 nvidia-drm.modeset=1 amdgpu.modeset=1`. SDDM `Type=simple` startup. **Target:** v0.5.32. ### A3. USBGuard hash-based rules - `scripts/20-harden-kernel.sh:127-131` ships **empty** rules.conf with `ImplicitPolicyTarget=block`. First boot, admin runs `usbguard generate-policy`. Per `feedback_usbguard_dock.md`, this writes hash+parent-hash rules that break on dock replug. - **Symptom:** keyboard/mouse dies on first dock unplug-replug. - **Prob:** HIGH. **Fix:** M — patch invocation to `--with-hash=false`, or ship `veilor-usbguard-enroll` wrapper. **Target:** v0.5.32 (same bug we already learned). ### A4. Wifi/Bluetooth firmware - `@hardware-support` pulls `linux-firmware` etc. - Realtek RTL8852/MT7921 firmware ships in `linux-firmware-whence` only. - **Prob:** LOW. **Fix:** S (add explicit `linux-firmware-whence`). **Target:** v0.5.32. ### A5. Bluetooth disabled at boot - `scripts/20-harden-kernel.sh:111` disables `bluetooth` service. BT keyboards/mice don't pair until user enables service. - **Prob:** MED (laptop users). **Fix:** S — leave bluetooth.service enabled, mask `obex` only. **Target:** v0.6. ## B. First-boot KDE session ### B1. Plasma 6 Wayland fallback on hybrid graphics - SDDM config doesn't pin session. NVIDIA Optimus + intel-iris triggers Wayland → silent fallback to X11 on some HW. - **Symptom:** screen tearing, no fractional scaling. - **Prob:** MED. **Fix:** S — add `[Autologin] Session=plasma` + `[General] DefaultSession=plasma.desktop`. **Target:** v0.6. ### B2. SUSPEND/RESUME KILLS WIFI — THE BIG ONE 🚨 - `veilor-modules-lock.service` sets `kernel.modules_disabled=1` 30s after graphical.target. `iwlwifi`, `iwlmvm`, `cfg80211` reload on resume from S3/S0ix. With modules locked: **resume → permanent wifi death until reboot**. Same for `nvidia` autoload, `xhci_pci` re-init on dock attach. - **Symptom:** close laptop lid → reopen → no wifi, no dock USB, until reboot. - **Prob:** VERY HIGH (every laptop user, day 1). - **Fix:** M — gate lock on `ConditionACPower=true` + reset on suspend, OR move from `modules_disabled` to `module.sig_enforce=1` kernel cmdline (no runtime lock needed). - **Target:** v0.5.32 — **BLOCKER**. ### B3. Lid-close handling - `logind.conf` not modified. Defaults `HandleLidSwitch=suspend`. Combined with B2, every lid close = wifi loss. - **Prob:** HIGH. **Fix:** S. **Target:** v0.5.32 (paired with B2). ## C. Day-2 ops ### C1. `/etc/default/grub` + `/etc/kernel/cmdline` drift - Kickstart writes `GRUB_CMDLINE_LINUX_DEFAULT=""`. Real installer writes `/etc/kernel/cmdline` with LUKS rd.luks args. `kernel-install` reads the latter; `grub2-mkconfig` re-reads `/etc/default/grub`. - **Symptom:** `dnf upgrade kernel` regenerates grub.cfg from default/grub, drops LUKS unlock args from new entry → unbootable. - **Prob:** HIGH. **Fix:** M — sync both files in `veilor-update`, or migrate fully to BLS without grub-mkconfig. - **Target:** v0.5.32. ### C2. SELinux relabel on first boot - `firstboot.sh` flips to `enforcing` and `touch /.autorelabel`. On large /home (encrypted btrfs), relabel takes 2-5min — user sees frozen screen with cursor. - **Symptom:** stuck "first boot" appears hung. - **Prob:** MED. **Fix:** S (add plymouth message). **Target:** v0.6. ### C3. F44 upgrade - Hardcoded `python3.14` path (kickstart:334) for transaction_progress.py patch. Survives no upgrade. - **Prob:** certainty by Nov 2026. **Fix:** M. **Target:** v0.6. ### C4. chrony NTS unreachable from corp networks - Cloudflare NTS over UDP 4460 blocked by many corp firewalls. chronyd will fail-stop sync. - **Symptom:** clock skew → TLS failures → broken everything. - **Prob:** MED. **Fix:** S (add fallback `pool` line — already present, verify ordering). **Target:** v0.5.32. ## D. Networking ### D1. firewalld drop zone vs Tailscale 🚨 - `tailscale up` requires UDP 41641 + tailscale0 trusted. Default `drop` zone blocks tailscale0. - **Prob:** HIGH (this user uses Tailscale daily). - **Fix:** S — ship `/etc/firewalld/zones/trusted.xml` with `tailscale0` interface. - **Target:** v0.5.32. ### D2. systemd-resolved DoT vs corp split-DNS - No /etc/resolved.conf.d entries shipped (overlay dir empty). - Corp internal hostnames fail. - **Prob:** LOW. **Fix:** M. **Target:** v0.7. ## E. Hardware diversity ### E1. NVMe vs SATA LUKS perf - Argon2id KDF tuned to memory, not IO. - **Prob:** cosmetic. Skip. ### E2. ARM aarch64 - Out of scope for v0.5/0.6. ### E3. TPM2 unlock - Already on roadmap. **Target:** v0.7. ## Top 10 ranked (prob × severity) | # | Issue | Prob | Sev | Target | |---|-------|------|-----|--------| | 1 | **B2 Suspend/resume wifi death** (modules_disabled) | VHIGH | CRITICAL | v0.5.32 | | 2 | **C1 kernel-upgrade grub drift** (LUKS args lost) | HIGH | CRITICAL | v0.5.32 | | 3 | **A3 USBGuard hash rules** (dock replug) | HIGH | HIGH | v0.5.32 | | 4 | **D1 firewalld blocks tailscale0** | HIGH | HIGH | v0.5.32 | | 5 | **A2 KMS blank-screen 10s** | HIGH | MED | v0.5.32 | | 6 | **B3 Lid-close suspend** (compounds B2) | HIGH | MED | v0.5.32 | | 7 | **A1 Secure Boot + os-prober dual-boot** | MED | HIGH | v0.6 | | 8 | **C4 NTS blocked corp** | MED | MED | v0.5.32 | | 9 | **B1 Plasma Wayland fallback** | MED | MED | v0.6 | | 10 | **C3 F44 path-pinned patch** | CERTAIN | LOW (Nov) | v0.6 | ## Top 5 to preempt in v0.5.32 1. **B2 modules-lock vs resume** — gate on no-pending-suspend, OR swap to `module.sig_enforce=1` kernel cmdline. 2. **C1 cmdline drift** — make `veilor-update` fail-loud if `/etc/kernel/cmdline` and `/etc/default/grub` diverge; regen BLS on every kernel install. 3. **A3 USBGuard id-based rules** — `veilor-usb-enroll` wrapper that calls `usbguard generate-policy --with-hash=false`. Same fix that already burned us on onyx. 4. **D1 Tailscale zone** — ship `/etc/firewalld/zones/trusted.xml` listing `tailscale0`, plus NetworkManager dispatcher to assign it. 5. **A2 KMS handoff** — append `i915.modeset=1 amdgpu.modeset=1 nvidia-drm.modeset=1` to bootloader cmdline. **Critical insight:** B2 alone bricks the laptop for any user who closes their lid. Without that fix, v0.5.32 is shippable on desktops only. Same architectural class as the LUKS bug — security feature breaks legitimate kernel state transitions.