veilor-os/docs/research/2026-05-05-agent-wave/01-plymouth-luks-real-hardware.md
veilor-org 49a2e2557e docs: 9-agent research wave findings — v0.5.32 blocker map
Logs the full output of the 9-agent deep-dive run on 2026-05-05 to
docs/research/2026-05-05-agent-wave/. Pulls every actionable finding
into one indexed location so v0.5.32 planning has a paper trail.

Files:
  docs/research/2026-05-05-agent-wave/README.md             — index
  docs/research/2026-05-05-agent-wave/01-...real-hardware.md — Plymouth + LUKS edge cases
  docs/research/2026-05-05-agent-wave/02-...firstboot-ux.md  — SDDM + first-boot UX
  docs/research/2026-05-05-agent-wave/03-...spike-plan.md    — bootc-image-builder 1-week spike
  docs/research/2026-05-05-agent-wave/04-...tier-2.md         — AppArmor + nftables + audit + homed
  docs/research/2026-05-05-agent-wave/05-...launch.md         — threat model + v0.7 launch checklist
  docs/research/2026-05-05-agent-wave/06-...log-capture.md    — virtio-9p host-share for anaconda logs
  docs/research/2026-05-05-agent-wave/07-...skel-branding.md  — /etc/skel gap audit
  docs/research/2026-05-05-agent-wave/08-...ci-hardening.md   — SHA-pin actions + SBOM + SLSA L3
  docs/research/2026-05-05-agent-wave/09-...failure-modes.md  — real-hardware pessimistic audit

Plus the prior linter-applied:
  docs/ROADMAP.md      — Lessons learned section, v0.5.32 active block,
                          v0.6 promotion of veilor-postinstall + veilor-doctor,
                          v0.7 bootc spike scheduled
  docs/THREAT-MODEL.md  — drafted by Agent 5; in/out scope, comparison
                          matrix, v0.7 launch checklist

Top blockers identified for v0.5.32 (cross-cited in README):
  1. Suspend/resume wifi death (kernel.modules_disabled=1)
  2. veilor-firstboot.service WantedBy=graphical.target
  3. kernel-upgrade grub drift
  4. USBGuard hash-rules problem (already learned on onyx)
  5. firewalld blocks tailscale0
  6. /etc/skel/ empty
  7. virtio-9p log capture replaces broken virtio-serial path

Wave + verifier pattern (per ROADMAP lessons learned #4) validated:
9 parallel agents on distinct topics produced converging blocker
list. The same pattern landed v0.5.31 four-bug fix from the prior
4-agent verification wave on v0.5.30 outcome.
2026-05-05 14:52:53 +01:00


# Plymouth + LUKS unlock — real-hardware edge cases
**Agent 1 of 9-agent wave, 2026-05-05.**
## State at v0.5.31
- Live ISO cmdline pins `plymouth.enable=0 fbcon=nodefer`.
- Installed system uses Plymouth `details` theme.
- LUKS2 argon2id, no clevis / cryptenroll, no recovery key generation.
- `rd.vconsole.keymap=` not set.
## Findings
### 1. KMS / fbcon races
- **Symptom:** Black screen at LUKS prompt, cursor blinks, keystrokes
swallowed but never accepted.
- **Cause:** `i915` / `amdgpu` / `nvidia-drm` modeset fires *during*
plymouthd handover. With `plymouth.enable=0` we skip the splash but
the ask-password agent still opens `/dev/tty1`, which races `fbcon`
rebind.
- **Fix:** keep `fbcon=nodefer` and append
`nvidia-drm.modeset=1 i915.fastboot=0 amdgpu.dc=1` to the bootloader
kernel command line. On NVIDIA Optimus laptops the parameter that
resolves the race is `nvidia-drm.modeset=1`.
- **Probability:** HIGH on Optimus, MED on AMD APU, LOW on Intel iGPU.
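
The bootloader change above could be applied with `grubby`, which Fedora uses to edit boot entries. This is a minimal sketch, not the project's actual script: `VEILOR_KMS_ARGS`, `apply_kms_args`, and the `DRY_RUN` switch are illustrative names.

```shell
# Kernel arguments from finding 1 (fbcon=nodefer is already pinned; repeated here
# so the full set is visible in one place).
VEILOR_KMS_ARGS="fbcon=nodefer nvidia-drm.modeset=1 i915.fastboot=0 amdgpu.dc=1"

# apply_kms_args: add the arguments to every installed boot entry.
# With DRY_RUN=1 the command is printed instead of executed.
apply_kms_args() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "grubby --update-kernel=ALL --args=\"$VEILOR_KMS_ARGS\""
    else
        grubby --update-kernel=ALL --args="$VEILOR_KMS_ARGS"
    fi
}
```

Running it once with `DRY_RUN=1` before the real call makes it easy to confirm the argument list in installer logs.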
### 2. Plymouth theme choice — keep `details`
- `details` (kernel/systemd journal under prompt) is best for
blind-typing because the user sees `Please enter passphrase…` *as
text*, full echo as `*`.
- `text` is minimal fallback (no echo, no journal).
- `spinner` is the documented "endless loop, no prompt" failure mode
on real laptops (adi1090x/plymouth-themes#10, Arch BBS 296529).
- **No change.** But verify `plymouth-set-default-theme details`
actually ran post-install (Debian #986023 shows it silently fails
when initramfs rebuild is suppressed). Add `dracut --force
--regenerate-all` after the call.
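
The verify-then-rebuild step can be sketched as follows. `ensure_plymouth_theme`, `PLYMOUTH_CMD`, and `DRACUT_CMD` are hypothetical names; the two variables exist only so the commands can be swapped out in a test. The sketch relies on `plymouth-set-default-theme` printing the current theme when called without arguments.

```shell
# Commands are overridable so the function can be exercised without root.
PLYMOUTH_CMD="${PLYMOUTH_CMD:-plymouth-set-default-theme}"
DRACUT_CMD="${DRACUT_CMD:-dracut}"

# ensure_plymouth_theme THEME: re-apply THEME if it is not the active one,
# then always rebuild every initramfs so the theme is actually baked in.
ensure_plymouth_theme() {
    want="$1"
    have="$("$PLYMOUTH_CMD")"
    if [ "$have" != "$want" ]; then
        echo "plymouth theme is '$have', resetting to '$want'" >&2
        "$PLYMOUTH_CMD" "$want" || return 1
    fi
    "$DRACUT_CMD" --force --regenerate-all
}
```

In a post-install script this would be called as `ensure_plymouth_theme details`.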
### 3. Initramfs keymap — HIGH probability for non-US users
- **Symptom:** AZERTY/QWERTZ/Cyrillic user types correct passphrase,
gets "no key available". F43 ships en-US in initramfs by default.
- **Bugs:** RHBZ 1405539, RHBZ 1890085, fedora-silverblue#3.
- **Fix:** drop a placeholder `rd.vconsole.keymap=us` AND have
`firstboot.sh` rewrite it from `/etc/vconsole.conf` after the user
picks a layout. Also `/etc/dracut.conf.d/veilor-keymap.conf` with
`install_items+=" /etc/vconsole.conf "` so keymap is *baked* into
initramfs.
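
A minimal sketch of the rewrite step in `firstboot.sh`, assuming the layout picker has already written `KEYMAP=` to `/etc/vconsole.conf`. The helper names `read_keymap` and `keymap_karg` are illustrative, and the fallback to `us` mirrors the placeholder described above.

```shell
# read_keymap FILE: print the last KEYMAP= value in a vconsole.conf-style file,
# with quotes stripped; fall back to "us" if the key is missing.
read_keymap() {
    km="$(sed -n 's/^KEYMAP=//p' "$1" 2>/dev/null | tail -n 1 | tr -d "\"'")"
    echo "${km:-us}"
}

# keymap_karg [FILE]: build the kernel argument for the initramfs keymap.
keymap_karg() {
    echo "rd.vconsole.keymap=$(read_keymap "${1:-/etc/vconsole.conf}")"
}

# Example wiring (not executed here): grubby replaces the value of an
# argument that is already present, then the initramfs is rebuilt.
#   grubby --update-kernel=ALL --args="$(keymap_karg)"
#   dracut --force --regenerate-all
```

Keeping the parsing separate from the `grubby` call lets the keymap logic be checked against sample files without touching the bootloader.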
### 4. systemd-cryptsetup vs legacy `crypt` — F43 = systemd-cryptsetup
- F40+ unconditionally uses `systemd-cryptsetup@.service` from
`/etc/crypttab`. Old `rd.luks.uuid=` cmdline still parsed. Stable
through 6.x kernels. No change needed.
### 5. argon2id memory cost — MED on old laptops (<8 GB RAM)
- LUKS2 default = 1 GiB memory cost, `iter-time=2000 ms`. On
Core 2 Duo / Pentium-N this becomes an 8 to 15 s unlock + thrash.
Atom-class N4020: 30s+.
- **Fix in installer post-script:**
`cryptsetup luksConvertKey --pbkdf-memory 524288 --iter-time 2000 /dev/X`
(value in KiB). This halves memory to 512 MiB and knocks ~50% off
unlock latency.
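
One way the retune could be made conditional on installed RAM is sketched below. The 8 GiB threshold comes from the heading of this finding; applying it as a hard cutoff, and the helper names `pbkdf_memory_kib` and `mem_total_kib`, are assumptions for illustration.

```shell
# pbkdf_memory_kib MEMTOTAL_KIB: choose the argon2id memory cost in KiB.
# Below 8 GiB of RAM use 512 MiB, otherwise keep the 1 GiB default.
# Note: MemTotal on a nominal "8 GB" machine is slightly under 8 GiB,
# so such machines also land in the reduced tier.
pbkdf_memory_kib() {
    if [ "$1" -lt $((8 * 1024 * 1024)) ]; then
        echo $((512 * 1024))
    else
        echo $((1024 * 1024))
    fi
}

# mem_total_kib: read installed RAM in KiB from /proc/meminfo.
mem_total_kib() {
    awk '/^MemTotal:/ {print $2}' /proc/meminfo
}

# Example wiring (not executed here; /dev/X as in the text):
#   cryptsetup luksConvertKey \
#       --pbkdf-memory "$(pbkdf_memory_kib "$(mem_total_kib)")" \
#       --iter-time 2000 /dev/X
```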
### 6. TPM2 unlock — defer to v0.6
- F43 ships `systemd-cryptenroll --tpm2-device=auto` ([Fedora
Magazine](https://fedoramagazine.org/automatically-decrypt-your-disk-using-tpm2/)).
No clevis required.
- **v0.6 plan:** opt-in via `veilor-firstboot`, which runs
`systemd-cryptenroll --tpm2-device=auto --tpm2-pcrs=7+11`. PCR 7
(secure boot state) + 11 (kernel/initrd). Don't auto-enroll; PCR
pinning is a footgun on kernel updates.
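
The opt-in behaviour could look roughly like this. `offer_tpm2_enroll` and the `ENROLL_CMD` override are hypothetical; the default answer is "no" so that nothing is enrolled without an explicit choice.

```shell
# offer_tpm2_enroll DEVICE: ask before binding the LUKS2 volume to TPM2.
# ENROLL_CMD can be overridden (for example with "echo") to preview the call.
offer_tpm2_enroll() {
    dev="$1"
    printf 'Enroll TPM2 unlock for %s? [y/N] ' "$dev" >&2
    read -r answer || answer=""
    case "$answer" in
        [yY])
            ${ENROLL_CMD:-systemd-cryptenroll} \
                --tpm2-device=auto --tpm2-pcrs=7+11 "$dev"
            ;;
        *)
            echo "TPM2 enrollment skipped"
            ;;
    esac
}
```

Because the passphrase slot stays in place, a PCR mismatch after a kernel update falls back to the normal passphrase prompt rather than locking the user out.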
### 7. FIDO2 unlock — v0.7
- `systemd-cryptenroll --fido2-device=auto` requires `libfido2` +
hmac-secret support. secureblue ships this. Add `libfido2` to
`%packages` + `veilor-fido2-enroll` wrapper.
### 8. Recovery key — MISSING, ship in v0.6
- Today: forgotten passphrase = brick.
- **Fix:** in `firstboot.sh` run
`systemd-cryptenroll --recovery-key /dev/X`, which generates a
high-entropy key and enrolls it in a free LUKS2 slot, and print the
64-character key once to a numbered envelope-style screen. Mirrors
macOS FileVault.
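
A minimal sketch of the recovery-key step, assuming `systemd-cryptenroll --recovery-key` is used to generate and enroll the key and that the key is the only thing written to standard output when that output is captured. `show_recovery_envelope`, `enroll_recovery_key`, and `ENROLL_CMD` are illustrative names, and the screen layout is invented for the example.

```shell
# show_recovery_envelope NUMBER KEY: print the one-time display screen.
show_recovery_envelope() {
    printf '==== veilor recovery key, envelope #%s ====\n' "$1"
    printf '  %s\n' "$2"
    printf 'Write this down now; it will not be shown again.\n'
}

# enroll_recovery_key DEVICE: enroll a generated key and display it once.
# ENROLL_CMD is overridable so the flow can be tested without a LUKS device.
enroll_recovery_key() {
    dev="$1"
    key="$(${ENROLL_CMD:-systemd-cryptenroll} --recovery-key "$dev")" || return 1
    show_recovery_envelope 1 "$key"
}
```

The key is held only in a shell variable and never written to disk; whether it should also be cleared from the terminal scrollback is left open here.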
## Action items
| # | Change | Target |
|---|--------|--------|
| 1 | Append `nvidia-drm.modeset=1 i915.fastboot=0 amdgpu.dc=1 rd.vconsole.keymap=us` to the bootloader kernel command line | v0.5.32 |
| 2 | `/etc/dracut.conf.d/veilor-keymap.conf` with `install_items+=" /etc/vconsole.conf "` | v0.5.32 |
| 3 | Force `dracut -f --regenerate-all` after `plymouth-set-default-theme details` | v0.5.32 |
| 4 | argon2id retune (`40-luks-tune.sh`) | v0.6 |
| 5 | Recovery-key generation in firstboot | v0.6 |
| 6 | TPM2 opt-in via `systemd-cryptenroll --tpm2-pcrs=7+11` | v0.6 |
| 7 | FIDO2 opt-in | v0.7 |
## Sources
- [LUKS keyboard layout — fedora-silverblue/issue-tracker#3](https://github.com/fedora-silverblue/issue-tracker/issues/3)
- [RHBZ 1405539 — keymap not honored on initramfs rebuild](https://bugzilla.redhat.com/show_bug.cgi?id=1405539)
- [RHBZ 1890085 — English keymap forced in initramfs](https://bugzilla.redhat.com/show_bug.cgi?id=1890085)
- [Fedora Magazine — TPM2 autodecrypt with systemd-cryptenroll](https://fedoramagazine.org/automatically-decrypt-your-disk-using-tpm2/)
- [Leo3418 — argon2id LUKS tuning](https://leo3418.github.io/collections/gentoo-config-luks2-grub-systemd/tune-parameters.html)
- [QubesOS#8600 — argon2id parameters](https://github.com/QubesOS/qubes-issues/issues/8600)