Logs the full output of the 9-agent deep-dive run on 2026-05-05 to
docs/research/2026-05-05-agent-wave/. Pulls every actionable finding
into one indexed location so v0.5.32 planning has a paper trail.
Files:
docs/research/2026-05-05-agent-wave/README.md — index
docs/research/2026-05-05-agent-wave/01-...real-hardware.md — Plymouth + LUKS edge cases
docs/research/2026-05-05-agent-wave/02-...firstboot-ux.md — SDDM + first-boot UX
docs/research/2026-05-05-agent-wave/03-...spike-plan.md — bootc-image-builder 1-week spike
docs/research/2026-05-05-agent-wave/04-...tier-2.md — AppArmor + nftables + audit + homed
docs/research/2026-05-05-agent-wave/05-...launch.md — threat model + v0.7 launch checklist
docs/research/2026-05-05-agent-wave/06-...log-capture.md — virtio-9p host-share for anaconda logs
docs/research/2026-05-05-agent-wave/07-...skel-branding.md — /etc/skel gap audit
docs/research/2026-05-05-agent-wave/08-...ci-hardening.md — SHA-pin actions + SBOM + SLSA L3
docs/research/2026-05-05-agent-wave/09-...failure-modes.md — real-hardware pessimistic audit
Plus the prior linter-applied:
docs/ROADMAP.md — Lessons learned section, v0.5.32 active block,
v0.6 promotion of veilor-postinstall + veilor-doctor,
v0.7 bootc spike scheduled
docs/THREAT-MODEL.md — drafted by Agent 5; in/out scope, comparison
matrix, v0.7 launch checklist
Top blockers identified for v0.5.32 (cross-cited in README):
1. Suspend/resume wifi death (kernel.modules_disabled=1)
2. veilor-firstboot.service WantedBy=graphical.target
3. kernel-upgrade grub drift
4. USBGuard hash-rules problem (already learned on onyx)
5. firewalld blocks tailscale0
6. /etc/skel/ empty
7. virtio-9p log capture replaces broken virtio-serial path
Wave + verifier pattern (per ROADMAP lessons learned #4) validated:
9 parallel agents on distinct topics produced converging blocker
list. The same pattern landed v0.5.31 four-bug fix from the prior
4-agent verification wave on v0.5.30 outcome.
7.2 KiB
7.2 KiB
Real-hardware failure mode audit (post-v0.5.31)
Agent 9 of 9-agent wave, 2026-05-05. Pessimistic enumeration.
A. Boot path
A1. Secure Boot + GRUB_DISTRIBUTOR rebrand
- shim chain itself untouched (uses
/EFI/fedora/), butgrub2-mkconfigregenerates entries namingveilor-oswhile shim only trusts paths under/EFI/fedora/grubx64.efi. Strict UEFI: menu boots, kernel signatures verify via Fedora's MOK chain. Risk:os-proberwriting dual-boot Windows entries breaks MBR/MOK. - Symptom: dual-boot with Windows shows
Verification failed: (0x1A) Security Violation. - Prob: MED. Fix: S. Target: v0.5.32.
A2. KMS handoff — fbcon=nodefer necessary but not sufficient
- On Intel Arc/iGPU late-gen + NVIDIA proprietary chains, 5-15s blank
between vt switch and SDDM start because
simpledrmreleases beforei915/nvidia-drmclaim. - Symptom: ~10s blank pre-SDDM; user thinks crashed.
- Prob: HIGH. Fix: S — add
i915.modeset=1 nvidia-drm.modeset=1 amdgpu.modeset=1. SDDMType=simplestartup. Target: v0.5.32.
A3. USBGuard hash-based rules
scripts/20-harden-kernel.sh:127-131ships empty rules.conf withImplicitPolicyTarget=block. First boot, admin runsusbguard generate-policy. Perfeedback_usbguard_dock.md, this writes hash+parent-hash rules that break on dock replug.- Symptom: keyboard/mouse dies on first dock unplug-replug.
- Prob: HIGH. Fix: M — patch invocation to
--with-hash=false, or shipveilor-usbguard-enrollwrapper. Target: v0.5.32 (same bug we already learned).
A4. Wifi/Bluetooth firmware
@hardware-supportpullslinux-firmwareetc.- Realtek RTL8852/MT7921 firmware ships in
linux-firmware-whenceonly. - Prob: LOW. Fix: S (add explicit
linux-firmware-whence). Target: v0.5.32.
A5. Bluetooth disabled at boot
scripts/20-harden-kernel.sh:111disablesbluetoothservice. BT keyboards/mice don't pair until user enables service.- Prob: MED (laptop users). Fix: S — leave bluetooth.service
enabled, mask
obexonly. Target: v0.6.
B. First-boot KDE session
B1. Plasma 6 Wayland fallback on hybrid graphics
- SDDM config doesn't pin session. NVIDIA Optimus + intel-iris triggers Wayland → silent fallback to X11 on some HW.
- Symptom: screen tearing, no fractional scaling.
- Prob: MED. Fix: S — add
[Autologin] Session=plasma[General] DefaultSession=plasma.desktop. Target: v0.6.
B2. SUSPEND/RESUME KILLS WIFI — THE BIG ONE 🚨
veilor-modules-lock.servicesetskernel.modules_disabled=130s after graphical.target.iwlwifi,iwlmvm,cfg80211reload on resume from S3/S0ix. With modules locked: resume → permanent wifi death until reboot. Same fornvidiaautoload,xhci_pcire-init on dock attach.- Symptom: close laptop lid → reopen → no wifi, no dock USB, until reboot.
- Prob: VERY HIGH (every laptop user, day 1).
- Fix: M — gate lock on
ConditionACPower=true+ reset on suspend, OR move frommodules_disabledtomodule.sig_enforce=1kernel cmdline (no runtime lock needed). - Target: v0.5.32 — BLOCKER.
B3. Lid-close handling
logind.confnot modified. DefaultsHandleLidSwitch=suspend. Combined with B2, every lid close = wifi loss.- Prob: HIGH. Fix: S. Target: v0.5.32 (paired with B2).
C. Day-2 ops
C1. /etc/default/grub + /etc/kernel/cmdline drift
- Kickstart writes
GRUB_CMDLINE_LINUX_DEFAULT="". Real installer writes/etc/kernel/cmdlinewith LUKS rd.luks args.kernel-installreads the latter;grub2-mkconfigre-reads/etc/default/grub. - Symptom:
dnf upgrade kernelregenerates grub.cfg from default/grub, drops LUKS unlock args from new entry → unbootable. - Prob: HIGH. Fix: M — sync both files in
veilor-update, or migrate fully to BLS without grub-mkconfig. - Target: v0.5.32.
C2. SELinux relabel on first boot
firstboot.shflips toenforcingandtouch /.autorelabel. On large /home (encrypted btrfs), relabel takes 2-5min — user sees frozen screen with cursor.- Symptom: stuck "first boot" appears hung.
- Prob: MED. Fix: S (add plymouth message). Target: v0.6.
C3. F44 upgrade
- Hardcoded
python3.14path (kickstart:334) for transaction_progress.py patch. Survives no upgrade. - Prob: certainty by Nov 2026. Fix: M. Target: v0.6.
C4. chrony NTS unreachable from corp networks
- Cloudflare NTS over UDP 4460 blocked by many corp firewalls. chronyd will fail-stop sync.
- Symptom: clock skew → TLS failures → broken everything.
- Prob: MED. Fix: S (add fallback
poolline — already present, verify ordering). Target: v0.5.32.
D. Networking
D1. firewalld drop zone vs Tailscale 🚨
tailscale uprequires UDP 41641 + tailscale0 trusted. Defaultdropzone blocks tailscale0.- Prob: HIGH (this user uses Tailscale daily).
- Fix: S — ship
/etc/firewalld/zones/trusted.xmlwithtailscale0interface. - Target: v0.5.32.
D2. systemd-resolved DoT vs corp split-DNS
- No /etc/resolved.conf.d entries shipped (overlay dir empty).
- Corp internal hostnames fail.
- Prob: LOW. Fix: M. Target: v0.7.
E. Hardware diversity
E1. NVMe vs SATA LUKS perf
- Argon2id KDF tuned to memory, not IO.
- Prob: cosmetic. Skip.
E2. ARM aarch64
- Out of scope for v0.5/0.6.
E3. TPM2 unlock
- Already on roadmap. Target: v0.7.
Top 10 ranked (prob × severity)
| # | Issue | Prob | Sev | Target |
|---|---|---|---|---|
| 1 | B2 Suspend/resume wifi death (modules_disabled) | VHIGH | CRITICAL | v0.5.32 |
| 2 | C1 kernel-upgrade grub drift (LUKS args lost) | HIGH | CRITICAL | v0.5.32 |
| 3 | A3 USBGuard hash rules (dock replug) | HIGH | HIGH | v0.5.32 |
| 4 | D1 firewalld blocks tailscale0 | HIGH | HIGH | v0.5.32 |
| 5 | A2 KMS blank-screen 10s | HIGH | MED | v0.5.32 |
| 6 | B3 Lid-close suspend (compounds B2) | HIGH | MED | v0.5.32 |
| 7 | A1 Secure Boot + os-prober dual-boot | MED | HIGH | v0.6 |
| 8 | C4 NTS blocked corp | MED | MED | v0.5.32 |
| 9 | B1 Plasma Wayland fallback | MED | MED | v0.6 |
| 10 | C3 F44 path-pinned patch | CERTAIN | LOW (Nov) | v0.6 |
Top 5 to preempt in v0.5.32
- B2 modules-lock vs resume — gate on no-pending-suspend, OR swap
to
module.sig_enforce=1kernel cmdline. - C1 cmdline drift — make
veilor-updatefail-loud if/etc/kernel/cmdlineand/etc/default/grubdiverge; regen BLS on every kernel install. - A3 USBGuard id-based rules —
veilor-usb-enrollwrapper that callsusbguard generate-policy --with-hash=false. Same fix that already burned us on onyx. - D1 Tailscale zone — ship
/etc/firewalld/zones/trusted.xmllistingtailscale0, plus NetworkManager dispatcher to assign it. - A2 KMS handoff — append
i915.modeset=1 amdgpu.modeset=1 nvidia-drm.modeset=1to bootloader cmdline.
Critical insight: B2 alone bricks the laptop for any user who closes their lid. Without that fix, v0.5.32 is shippable on desktops only. Same architectural class as the LUKS bug — security feature breaks legitimate kernel state transitions.