Critical install bug fix + cosmetic round-up + first formal test procedure document.

## Critical: LUKS unlock on first boot

Generated installer kickstart's %post was injecting `rd.luks.uuid=…` into `/etc/default/grub` only. Fedora 43 uses BLS (Boot Loader Specification) entries in `/boot/loader/entries/*.conf`; those are NOT regenerated by `grub2-mkconfig`. Result: the kernel boots without `rd.luks.uuid=`, dracut's cryptsetup-generator never spawns the unlock unit, plymouth has no password to ask for, and dracut-initqueue loops on dev-disk-by-uuid for ~3min before dropping to emergency shell.

The fix layers both write paths:

- `/etc/default/grub`: keeps the args around for future kernels (kernel-install reads this when adding new entries).
- `grubby --update-kernel=ALL --args=...`: rewrites the `options` line of every existing BLS entry so the kernel that boots NEXT actually has the args.

Verified by reading `/proc/cmdline` from the dracut emergency shell on a v0.5.26 install; old cmdline had only `root=UUID=… ro rootflags=subvol=root` and was missing the LUKS arg entirely.

## GRUB / branding

- `/etc/default/grub` is sed'd to `GRUB_DISTRIBUTOR="veilor-os"` (was already there, kept).
- BLS entries' `title` line is rewritten in-place to "veilor-os (<kver>)" for every kernel; `grub2-mkconfig` does not touch BLS titles, so this is the only path.
- `/boot/loader/entries/*-0-rescue-*.conf` is removed: the auto-built rescue entry was leaking "Fedora Linux" into the GRUB menu and showing a second boot option that nobody asked for. The rescue kernel image itself is left in /boot.
- Hostname defaults to `veilor` (was inheriting the `localhost-live` name anaconda writes when the kickstart's network directive is ignored under cmdline mode).
- `/etc/machine-info` adds `PRETTY_HOSTNAME="veilor-os"` so `hostnamectl status` and any consumer reading machine-info see the brand.

## Boot UX

- `fbcon=nodefer` added to live-ISO bootloader cmdline. On real laptops with a hardware GPU, the kernel modeset blanks the framebuffer console mid-boot; without `nodefer` the installer banner draws into a frozen framebuffer and the user sees a black screen with a blinking cursor for ~30s. virtio-vga in QEMU doesn't trigger this so it never reproduced in VM. Symptom report on v0.5.26 was the trigger to investigate.

## Installer cosmetics

- `GUM_CHOOSE_CURSOR` and `GUM_INPUT_PROMPT` switched from `❯ ` to `> `. The unicode arrow falls back to a fixed-width block on the linux fbcon font and lipgloss then duplicates that block at col +23, producing the "Install Install" double-render and the stray-T artifact in password fields. Plain ASCII renders identically across fbcon, virtio-vga, and X/Wayland gum runs.
- `VERSION_ID` bumped 0.5.8 → 0.5.27 in the os-release drop-in. The installer banner reads this at runtime, so the live ISO + installed system both now show "veilor-os 0.5.27".

## Test procedure

- `test/TESTING.md`: first canonical test procedure document. Splits VM (cheap iteration, hybrid sendkey + human passwords) from real hardware (mandatory for tag). Documents the standard test passwords (`veilortest1` for both LUKS and admin), the kill-and-relaunch step to skip CD on second boot, and the per-step pass/fail contract.
- `test/METHOD-CHANGELOG.md`: append-only audit trail for changes to the procedure. Future releases that alter the test method must add an entry here with the why.
- `test/test-runs/_TEMPLATE.md`: per-run report template. Each tagged release should land a filled report alongside it.

## test/run-vm.sh

Decoupled QEMU monitor sock setup from auto-inject. Previously `NO_INJECT=1` (used to suppress autotype noise into prompts) also killed the monitor sock, leaving the VM undriveable. Monitor sock is now always exposed; only the inject helper is gated on the pubkey detection.
veilor-os — Testing Procedure
This document is the canonical procedure for validating a veilor-os ISO
end-to-end. Every release that gets a tag MUST have a corresponding
test-run report in test/test-runs/ linked from the release notes.
If reality forces you to deviate from the steps below, do not silently
patch the procedure — open a commit that updates this file and
appends an entry to test/METHOD-CHANGELOG.md explaining what changed
and why. The changelog is what makes the procedure auditable; the
procedure itself is just the latest snapshot.
Two test environments
| Environment | Catches | Doesn't catch |
|---|---|---|
| VM (QEMU + virtio-vga) | install logic, kickstart bugs, %post failures, anaconda transaction failures, GRUB write, BLS entries, package selection, network stack | KMS / fbcon issues, real-firmware Secure Boot, USB controller quirks, GPU driver compatibility, sleep/wake, battery, thermals |
| Real hardware (USB → spare laptop) | everything VM doesn't | install repeatability (you only have so many spare laptops) |
Both are required for any tagged release. VM first (cheap iteration), real hardware second (final sign-off).
VM test — hybrid procedure
The VM cannot type LUKS / admin passwords through QEMU's sendkey
monitor command — plymouth's IPC ignores synthesised keystrokes (we
verified this across 14+ sendkey variants in earlier sessions). The
hybrid procedure splits the work: Claude/automation drives every step
that doesn't need a password; the human types the two passwords
(LUKS + admin) into the QEMU window directly.
Standard test passwords (lab use only — never reuse outside this repo):
| Prompt | Type |
|---|---|
| LUKS passphrase | veilortest1 |
| Admin password | veilortest1 |
Both passwords identical on purpose — easier to remember mid-test, both
satisfy the installer's 8-char min, neither contains shell-special
chars (validate_pw rejects " $ \ \ & | / \n`).
Run a VM test
cd ~/ai-lab/_github/veilor-os
# Pull the ISO you want to test (from a CI release or local build)
ls /home/admin/Downloads/veilor-os-*.iso
# Wipe stale state, launch VM with monitor sock (no auto-inject — we
# don't want sendkey noise typing into prompts)
FRESH=1 NO_INJECT=1 DISPLAY=:0 ./test/run-vm.sh \
/home/admin/Downloads/veilor-os-43-YYYYMMDD-HHMMSS.iso
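Because the ISO filename embeds a sortable YYYYMMDD-HHMMSS stamp, the newest local ISO can be selected by name instead of typed out. A minimal sketch, assuming every build follows the veilor-os-43-YYYYMMDD-HHMMSS.iso pattern shown above; latest_iso is a hypothetical helper, not something the repo ships:

```shell
# latest_iso [DIR]: print the path of the lexicographically last
# veilor-os-*.iso in DIR (hypothetical helper; assumes timestamped names).
latest_iso() {
  dir="${1:-$HOME/Downloads}"
  newest=""
  for f in "$dir"/veilor-os-*.iso; do
    [ -e "$f" ] || continue   # the glob matched nothing
    newest="$f"               # globs expand in sorted order, so the last one wins
  done
  [ -n "$newest" ] || { echo "no veilor-os ISO in $dir" >&2; return 1; }
  printf '%s\n' "$newest"
}
```

It would slot into the launch line as `./test/run-vm.sh "$(latest_iso /home/admin/Downloads)"`.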
Then either (a) drive the install yourself in the QEMU window, or (b) hand the monitor sock to Claude / a script:
- Monitor sock: test/veilor-vm.monitor.sock
- Send a key: echo "sendkey ret" | socat - "UNIX-CONNECT:$SOCK"
- Screendump: echo "screendump /tmp/x.ppm" | socat - "UNIX-CONNECT:$SOCK"; magick /tmp/x.ppm /tmp/x.png
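The socat one-liners above can be wrapped so a script drives the non-password steps with less typing. This is only a sketch: mon, key and shot are hypothetical names, and SOCK is assumed to default to the socket path listed above, relative to the repo root.

```shell
# Hypothetical wrappers around the QEMU monitor socket. SOCK defaults to the
# path from the procedure and can be overridden from the environment.
SOCK="${SOCK:-test/veilor-vm.monitor.sock}"

# mon CMD: send one monitor command line over the socket.
mon() {
  printf '%s\n' "$1" | socat - "UNIX-CONNECT:$SOCK"
}

# key NAME...: press each named key in order, e.g. `key down down ret`.
key() {
  for k in "$@"; do
    mon "sendkey $k" || return 1
  done
}

# shot BASENAME: dump the screen to BASENAME.ppm, then convert it to PNG.
# Use an absolute BASENAME: QEMU writes the file relative to its own
# working directory, not the caller's.
shot() {
  mon "screendump $1.ppm" && magick "$1.ppm" "$1.png"
}
```

For example, `key down ret` moves one menu item down and confirms, and `shot /tmp/x` leaves /tmp/x.png for inspection. None of this helps with the LUKS or admin prompts, which still need a human at the QEMU window.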
Steps to verify
The complete checklist lives in test/boot-checklist.md — that file is
the granular pass/fail list. The high-level flow is:
1. Live boot. GRUB (legacy menu, no Plymouth splash) → text scroll → veilor-installer banner on tty1 within ~30s. No "fedora" branding anywhere on screen.
2. Installer menu. "Install" highlighted by default. No phantom duplicate items, no stray characters in input fields.
3. Disk picker. /dev/vda (or whatever virtio gives you) listed with size + model.
4. Passwords. LUKS + admin prompts; user types veilortest1 twice.
5. Locale. en_GB.UTF-8 picks up.
6. Confirm. Disk shown with WILL BE ERASED, locale + LUKS/admin ticks shown.
7. Anaconda. "Installing veilor-os to /dev/vda · 10–30 min · logs on tty4". Watch for Configuring man-db; if anything fails, this is historically where it dies.
8. Reboot. VM reboots; ISO must NOT boot first this time. Kill QEMU + relaunch without ISO drive (see Boot installed disk below) to skip the GRUB-from-ISO path.
9. GRUB. Single "veilor-os" entry (no rescue, no "Fedora Linux").
10. LUKS prompt. Plymouth details theme: text-mode prompt for passphrase. User types veilortest1 in the QEMU window (sendkey will not work).
11. First boot. SDDM splash → admin user pre-filled → admin types veilortest1 → password-change prompt (chage -d 0 expired the password) → user picks new password → KDE Plasma session.
12. Hardening checks per test/boot-checklist.md (SELinux enforcing, fail2ban active, USBGuard active, tuned profile, etc.).
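The GRUB step can also be checked from inside the installed system by reading the BLS entries directly, rather than relying on the menu alone. A minimal sketch, assuming entries live in /boot/loader/entries and that every title should start with "veilor-os"; check_bls_entries is a hypothetical helper and may need root to read the directory:

```shell
# check_bls_entries [DIR]: fail if any BLS entry in DIR is a rescue entry or
# carries a title that does not start with "veilor-os". Hypothetical helper.
check_bls_entries() {
  dir="${1:-/boot/loader/entries}"
  bad=0
  for f in "$dir"/*.conf; do
    [ -e "$f" ] || { echo "no BLS entries in $dir" >&2; return 1; }
    case "$f" in
      *-0-rescue-*.conf) echo "rescue entry present: $f" >&2; bad=1 ;;
    esac
    # First "title ..." line of the entry, with the keyword stripped.
    title="$(sed -n 's/^title[[:space:]]\{1,\}//p' "$f" | head -n 1)"
    case "$title" in
      veilor-os*) ;;
      *) echo "unexpected title in $f: $title" >&2; bad=1 ;;
    esac
  done
  return "$bad"
}
```

A zero exit status means no rescue entry and no foreign branding were found; anything else prints the offending file on stderr.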
Boot installed disk (skip ISO)
After the install reboots, QEMU's CD-first boot order will land back
in the live ISO. Easiest workaround: kill QEMU and re-launch without
the -drive file=...iso line. The qcow2 retains the install:
pkill -f 'qemu-system.*veilor-os'
cd ~/ai-lab/_github/veilor-os/test
DISPLAY=:0 qemu-system-x86_64 \
-enable-kvm -cpu host -smp 4 -m 4096 \
-machine q35,smm=on \
-global driver=cfi.pflash01,property=secure,value=on \
-drive if=pflash,format=raw,readonly=on,file=/usr/share/edk2/ovmf/OVMF_CODE.fd \
-drive if=pflash,format=raw,file=$PWD/veilor-vm.nvram \
-drive file=$PWD/veilor-vm.qcow2,if=virtio,format=qcow2 \
-monitor unix:$PWD/veilor-vm.monitor.sock,server,nowait \
-netdev user,id=net0,hostfwd=tcp::2222-:22 \
-device virtio-net-pci,netdev=net0 \
-vga virtio -display gtk,gl=on
Real-hardware test — USB → spare laptop
Required for every tagged release. The VM cannot reproduce KMS / fbcon / GPU-driver issues; only real silicon will.
1. Flash USB
# 8GB+ USB stick, identified by lsblk (e.g. /dev/sda — confirm vendor)
sudo umount /dev/sdX* 2>/dev/null
sudo wipefs -a /dev/sdX
sudo dd if=/path/to/veilor-os-*.iso of=/dev/sdX bs=4M status=progress conv=fsync
sync
sudo eject /dev/sdX
Etcher / GNOME Disks also fine. Verify-after-flash is built into
Etcher; for dd, run cmp on the first ISO_SIZE bytes if paranoid.
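For the dd path, the comparison over the first ISO_SIZE bytes can be sketched as below. verify_flash is a hypothetical helper; it assumes the stick is at least as large as the ISO and, for a real block device, that it runs from a root shell.

```shell
# verify_flash ISO DEVICE: compare the first size-of-ISO bytes of DEVICE
# against ISO. Exit status 0 means the flashed image matches.
verify_flash() {
  iso="$1"
  dev="$2"
  size=$(( $(wc -c < "$iso") )) || return 1
  # Read exactly ISO-size bytes from the device and compare them to the file.
  head -c "$size" "$dev" | cmp -s - "$iso"
}
```

Typical use would be `verify_flash /path/to/veilor-os-*.iso /dev/sdX && echo OK`, run before ejecting the stick.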
2. Boot test
- Disable Secure Boot in firmware (until we MOK-enroll our shim, which is v0.5+).
- Boot from USB.
- Walk the same numbered steps as the VM section, except:
  - On "TYPE NOW: passphrase" steps, you actually have a keyboard.
  - At step 8, the laptop will eject the USB and reboot to the installed system without intervention.
  - At step 11, do NOT use veilortest1 for the post-install admin password change: pick something real if this is your daily-driver laptop, or a throwaway if it's a test machine. The kickstart's ChainOfTrust ends here; from this prompt forward you own the password.
3. Capture findings
Fill in a fresh test/test-runs/YYYY-MM-DD-vX.Y.Z.md from the
template. Always capture: GRUB title, kernel cmdline (cat /proc/cmdline), lsblk -f, getenforce, systemctl is-active fail2ban usbguard tuned auditd firewalld, journalctl -b -p err --no-pager.
If anything regressed, that goes at the top of the report under Regressions, with a screenshot if possible.
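The command-line items in that capture list can be gathered in one pass. What follows is a sketch, not a repo script: capture_section, has_luks_arg and capture_findings are hypothetical names, and the rd.luks.uuid= check assumes an encrypted install, where that argument is expected on the kernel cmdline. The GRUB title still has to be noted by hand from the menu.

```shell
# capture_section CMD [ARGS...]: print a heading, then the command's output.
# A failing or missing command is noted instead of aborting the capture.
capture_section() {
  printf '### %s\n' "$*"
  "$@" 2>&1 || printf '(exit %d)\n' "$?"
  printf '\n'
}

# has_luks_arg CMDLINE: succeed if CMDLINE carries an rd.luks.uuid= argument.
has_luks_arg() {
  case " $1 " in
    *" rd.luks.uuid="*) return 0 ;;
    *) return 1 ;;
  esac
}

# capture_findings: emit the items the procedure asks for, in order.
capture_findings() {
  capture_section cat /proc/cmdline
  capture_section lsblk -f
  capture_section getenforce
  capture_section systemctl is-active fail2ban usbguard tuned auditd firewalld
  capture_section journalctl -b -p err --no-pager
  if has_luks_arg "$(cat /proc/cmdline)"; then
    echo "rd.luks.uuid: present"
  else
    echo "rd.luks.uuid: MISSING"
  fi
}
```

Running `capture_findings > /tmp/findings.txt` on the installed system gives a block that can be pasted into the report.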
Per-run report template
Copy test/test-runs/_TEMPLATE.md (created when the first real
test-run lands) and fill in section-by-section. Keep them brief —
this is meant to be a 5-minute write-up, not a thesis.
When to alter this procedure
If a step turns out to be wrong, redundant, or missing:
- Edit this file.
- Append to test/METHOD-CHANGELOG.md with: date, version it first applied to, what changed, and why (cite a specific test-run report if the change is in response to a finding).
- Reference the changelog entry in your commit message.
The changelog is the audit trail. Don't skip it.