History

veilor-org e83483a077 v0.5.30: broad error suppression + manual bootloader + virtio log capture Three-layer fix for the persistent anaconda transaction failure that killed v0.5.28 (gen_grub_cfgstub) and v0.5.29 (aggregate dnf5 error). ## Layer 1: broad error suppression in transaction_progress.py dnf5 under RPM 6.0 + cmdline anaconda emits a final aggregate `error("transaction process has ended with errors..")` at end of transaction whenever its internal failure counter > 0, regardless of whether we suppressed individual script_error events. Reproduced twice. The narrow patch in v0.5.29 suppressed per-package errors but the aggregate still raised PayloadInstallationError and aborted the install before the bootloader phase ran. v0.5.30 patch turns the `elif token == 'error':` branch in process_transaction_progress into a log.warning. All four producers (cpio_error, script_error, unpack_error, generic error) now flow through to a warning + continue. Pattern matches both the original anaconda layout AND the v0.5.29 narrow-patched layout, so re-applying on top of either is a no-op. This brings us back to v0.5.28 broad-suppression behaviour. The side effect that bit us in v0.5.28 (silent grub2-efi-x64 scriptlet failure → empty /boot/efi/EFI/fedora/ → gen_grub_cfgstub fails) is addressed by Layer 2 below. ## Layer 2: bootloader install moved out of anaconda The generated install kickstart now has `bootloader --location=none`, which tells anaconda NOT to invoke its own bootloader install code path (and therefore NOT to call gen_grub_cfgstub). All grub work moves into the chroot %post block: 1. `dnf reinstall grub2-efi-x64 grub2-pc grub2-tools shim-x64 efibootmgr` — re-runs scriptlets in the chroot with full PID 1 systemd state, so the systemd-run-style triggers that anaconda's chroot truncates actually execute. 2. `grub2-install --target=x86_64-efi --efi-directory=/boot/efi --bootloader-id=fedora --no-nvram` — populates /boot/efi/EFI/fedora/ 3. `gen_grub_cfgstub /boot/grub2 /boot/efi/EFI/fedora` (or `grub2-mkconfig` fallback) — writes /boot/efi/EFI/fedora/grub.cfg. 4. `efibootmgr -c -d <disk> -p <part> -L "veilor-os" -l \EFI\fedora\shimx64.efi` — registers the NVRAM boot entry pointing at the signed shim. Each step logs to stdout and continues on failure (`set +e` block); diagnostics surface in the install log without aborting the whole %post. ## Layer 3: virtio-serial log capture in run-vm.sh Anaconda 43.x autodetects `/dev/virtio-ports/org.fedoraproject.anaconda.log.0` and streams program/packaging/storage/anaconda logs through it in real time, before any tmpfs / pivot, before networking, surviving kernel panic. Wiring it into run-vm.sh means the host gets a tail-able log file at `test/anaconda-vm-YYYYMMDD-HHMMSS.log` for every VM run. We've lost logs three times in a row to anaconda failures + tmpfs reboots. This breaks the loop. ## Diagnostic story Before this commit: VM aborts → live ISO reboots itself → /tmp/ tmpfs gone → no logs → guess what failed. Three days, two and a half false fixes. After this commit: VM aborts → host has /home/admin/ai-lab/_github/veilor-os/test/anaconda-vm-*.log with the actual scriptlet output, the actual exit codes, the actual file-trigger failures. Future debug becomes evidence-based. Files changed: kickstart/veilor-os.ks — broad error suppression patch overlay/usr/local/bin/veilor-installer — --location=none + manual grub test/run-vm.sh — virtio-serial chardev wiring Verified: bash -n clean, ksvalidator clean.		2026-05-05 11:59:35 +01:00
..
test-runs	v0.5.27: rd.luks.uuid via grubby, GRUB rebrand, fbcon=nodefer, ASCII gum cursor	2026-05-05 01:43:00 +01:00
auto-install-keymap.sh	v0.5.5: autonomous install test harness (#12 )	2026-05-02 22:49:51 +01:00
auto-install.sh	test/auto-install.sh: auto-fetch + reassemble chunked ISO from ci-latest	2026-05-02 22:50:37 +01:00
boot-checklist.md	veilor-os v0.1 scaffold — kickstart + hardening + 3-mode power + DuckSans-ready KDE black theme	2026-04-30 03:43:33 +01:00
METHOD-CHANGELOG.md	v0.5.27: rd.luks.uuid via grubby, GRUB rebrand, fbcon=nodefer, ASCII gum cursor	2026-05-05 01:43:00 +01:00
README.md	v0.5.5: autonomous install test harness (#12 )	2026-05-02 22:49:51 +01:00
run-vm.sh	v0.5.30: broad error suppression + manual bootloader + virtio log capture	2026-05-05 11:59:35 +01:00
TESTING.md	v0.5.27: rd.luks.uuid via grubby, GRUB rebrand, fbcon=nodefer, ASCII gum cursor	2026-05-05 01:43:00 +01:00

README.md

test/

Test harnesses for veilor-os ISO builds.

Files

File	Purpose
`run-vm.sh`	Manual smoke test — boot the latest ISO interactively in QEMU/KVM. SSH key injection via cloud-init seed + monitor sendkey fallback for live-image login.
`auto-install.sh`	Autonomous end-to-end install test. Boots ISO, drives the gum installer via QEMU monitor `sendkey`, waits for anaconda to finish + reboot, SSHs into the installed system, runs validation checklist. Prints PASS/FAIL summary.
`auto-install-keymap.sh`	Sourced helper. Provides `km_send_str`, `km_send_chord`, `km_send_key`, `km_screendump`, `km_wait_socket`, etc. Reusable by other automation.
`boot-checklist.md`	Manual post-install checklist (run on a real spare laptop).

Running the autonomous installer test

./test/auto-install.sh build/out/veilor-os-*.iso

Hardcoded inputs (deterministic — do not edit during a test run):

Disk: first /dev/vda (the only disk in QEMU)
Hostname: veilor (installer hardcoded since v0.5.4)
LUKS passphrase: testpass1234
Admin password: adminpass1234
Locale: en_GB.UTF-8

Expected runtime: 20–30 minutes wall clock (anaconda dominates).

Outputs

/tmp/veilor-auto-install.log — full driver log
/tmp/veilor-auto-install-NN-<step>.png — milestone screenshots
/tmp/veilor-auto-install-final-ssh.txt — final SSH session capture (uname/lsblk/cmdline/failed units)

Exit codes

0 — all validation checks passed
1 — any failure (anaconda crashed, SSH never came up, validation check failed)
2 — preflight failure (missing tool, bad ISO arg, missing OVMF)

Prerequisites

qemu-system-x86_64, qemu-img, socat, ssh, ssh-keygen
edk2-ovmf (OVMF UEFI firmware at /usr/share/edk2/ovmf/OVMF_{CODE,VARS}.fd)
mkisofs or xorriso (for cloud-init seed ISO; harness falls back to TTY1 driving if seed cannot be built or cloud-init does not run on the installed system)
convert from ImageMagick (optional — converts PPM screendumps to PNG; harness keeps PPM if absent)
KVM access (/dev/kvm readable by the user)

What it validates

Post-install on the booted system:

/etc/os-release → NAME=veilor-os
hostnamectl --static → veilor
systemctl is-active → active for sshd fail2ban usbguard tuned auditd firewalld chronyd sddm
getenforce → Enforcing (preferred) or Permissive (acceptable for v0.5.x)
lsblk -f shows crypto_LUKS + btrfs
/etc/crypttab has a LUKS entry
getent passwd admin returns the user
/usr/local/bin/{veilor-power,veilor-doctor,veilor-update} are present and executable
/proc/cmdline contains init_on_alloc=1

Troubleshooting

Stuck at boot banner: ISO didn't autostart veilor-installer on tty1. Check serial.log and auto-install-vm-NN-*.png screenshots. The harness aborts after 5 minutes of identical screen frames.
SSH never up: cloud-init may not have run on the installed system (no cidata mount). The harness falls back to TTY1 driving — typing the LUKS passphrase, logging in as admin, and hand-injecting the SSH key. If both paths fail, validation cannot proceed.
screendump produces unreadable PPM: install ImageMagick (dnf install ImageMagick) so the harness converts to PNG.

README.md Unescape Escape