veilor-os/test/METHOD-CHANGELOG.md

86 lines
3.8 KiB
Markdown
Raw Normal View History

v0.5.27: rd.luks.uuid via grubby, GRUB rebrand, fbcon=nodefer, ASCII gum cursor Critical install bug fix + cosmetic round-up + first formal test procedure document. ## Critical: LUKS unlock on first boot Generated installer kickstart's %post was injecting `rd.luks.uuid=…` into `/etc/default/grub` only. Fedora 43 uses BLS (Boot Loader Specification) entries in `/boot/loader/entries/*.conf`; those are NOT regenerated by `grub2-mkconfig`. Result: the kernel boots without `rd.luks.uuid=`, dracut's cryptsetup-generator never spawns the unlock unit, plymouth has no password to ask for, and dracut-initqueue loops on dev-disk-by-uuid for ~3min before dropping to emergency shell. The fix layers both write paths: - `/etc/default/grub` — keeps the args around for future kernels (kernel-install reads this when adding new entries). - `grubby --update-kernel=ALL --args=...` — rewrites the `options` line of every existing BLS entry so the kernel that boots NEXT actually has the args. Verified by reading `/proc/cmdline` from the dracut emergency shell on a v0.5.26 install; old cmdline had only `root=UUID=… ro rootflags=subvol=root` and was missing the LUKS arg entirely. ## GRUB / branding - `/etc/default/grub` is sed'd to `GRUB_DISTRIBUTOR="veilor-os"` (was already there, kept). - BLS entries' `title` line is rewritten in-place to "veilor-os (<kver>)" for every kernel — `grub2-mkconfig` does not touch BLS titles, so this is the only path. - `/boot/loader/entries/*-0-rescue-*.conf` is removed: the auto-built rescue entry was leaking "Fedora Linux" into the GRUB menu and showing a second boot option that nobody asked for. The rescue kernel image itself is left in /boot. - Hostname defaults to `veilor` (was inheriting the `localhost-live` name anaconda writes when the kickstart's network directive is ignored under cmdline mode). - `/etc/machine-info` adds `PRETTY_HOSTNAME="veilor-os"` so `hostnamectl status` and any consumer reading machine-info see the brand. ## Boot UX - `fbcon=nodefer` added to live-ISO bootloader cmdline. On real laptops with a hardware GPU, the kernel modeset blanks the framebuffer console mid-boot; without `nodefer` the installer banner draws into a frozen framebuffer and the user sees a black screen with a blinking cursor for ~30s. virtio-vga in QEMU doesn't trigger this so it never reproduced in VM. Symptom report on v0.5.26 was the trigger to investigate. ## Installer cosmetics - `GUM_CHOOSE_CURSOR` and `GUM_INPUT_PROMPT` switched from `❯ ` to `> `. The unicode arrow falls back to a fixed-width block on the linux fbcon font and lipgloss then duplicates that block at col +23, producing the "Install Install" double-render and the stray-T artifact in password fields. Plain ASCII renders identically across fbcon, virtio-vga, and X/Wayland gum runs. - `VERSION_ID` bumped 0.5.8 → 0.5.27 in the os-release drop-in. The installer banner reads this at runtime, so the live ISO + installed system both now show "veilor-os 0.5.27". ## Test procedure - `test/TESTING.md` — first canonical test procedure document. Splits VM (cheap iteration, hybrid sendkey + human passwords) from real hardware (mandatory for tag). Documents the standard test passwords (`veilortest1` for both LUKS and admin), the kill-and-relaunch step to skip CD on second boot, and the per-step pass/fail contract. - `test/METHOD-CHANGELOG.md` — append-only audit trail for changes to the procedure. Future releases that alter the test method must add an entry here with the why. - `test/test-runs/_TEMPLATE.md` — per-run report template. Each tagged release should land a filled report alongside it. ## test/run-vm.sh Decoupled QEMU monitor sock setup from auto-inject. Previously `NO_INJECT=1` (used to suppress autotype noise into prompts) also killed the monitor sock, leaving the VM undriveable. Monitor sock is now always exposed; only the inject helper is gated on the pubkey detection.
2026-05-05 01:43:00 +01:00
# veilor-os — Test Method Changelog
Append-only log of changes to `test/TESTING.md`. Each entry: date, the
veilor-os version it first applied to, what changed in the procedure,
and *why*. The why is the load-bearing part — without it this file
becomes a list of opinions.
Entries are newest-first.
---
## 2026-05-06 · v0.5.32 · ISO build path moved to Forgejo
**Change:** Build host for the test ISO has moved off GitHub Actions
onto the Forgejo runner on nullstone. The hybrid VM test procedure in
`TESTING.md` is **unchanged** — the gum installer still drives every
step it can, the operator still types the LUKS + admin passwords
directly into the QEMU window. The only thing different is where the
ISO comes from and how the host log is captured.
**Practical deltas for testers:**
- ISO download: from the Forgejo `ci-latest` rolling release at
<https://git.s8n.ru/veilor-org/veilor-os/releases/tag/ci-latest>.
The tag is force-replaced on each successful `build-iso.yml` run, so
always re-download — don't rely on a cached copy.
- Re-flash to USB / virtio-blk image via Etcher / `dd`**unchanged**.
Same `sha256sum -c` step; same image format.
- virtio-9p host log capture is now **active by default** in
`test/run-vm.sh`. This replaces the broken virtio-serial path
flagged by Agent 6 in the 2026-05-05 wave; Anaconda logs land in the
host-side mount automatically once the VM boots, no manual `tail -f`
on a broken serial console.
- Build host for the record: forgejo-runner on nullstone, runner label
`ubuntu-24.04`, image `catthehacker/ubuntu:act-24.04`. Reproducibility
is unchanged from the GH Actions ubuntu-24.04 base — the act image
matches GHA's runner image to within package versions.
**Why:** GitHub mirror was disabled 2026-05-06 (repo is now
private-by-default on Forgejo); GH Actions builds would just stop
producing artifacts. Moving CI in-house onto nullstone keeps the
test/release loop intact and removes the external dependency for
private-build cycles. Documenting the change here so a future tester
reading TESTING.md doesn't waste time hunting an artifact in a
GitHub run that never happened.
**Files touched in this entry:**
- `test/METHOD-CHANGELOG.md` — this entry.
`test/TESTING.md` itself is **not** edited — the procedure prose still
applies verbatim. Only the build host and the URL where the ISO lives
changed.
---
v0.5.27: rd.luks.uuid via grubby, GRUB rebrand, fbcon=nodefer, ASCII gum cursor Critical install bug fix + cosmetic round-up + first formal test procedure document. ## Critical: LUKS unlock on first boot Generated installer kickstart's %post was injecting `rd.luks.uuid=…` into `/etc/default/grub` only. Fedora 43 uses BLS (Boot Loader Specification) entries in `/boot/loader/entries/*.conf`; those are NOT regenerated by `grub2-mkconfig`. Result: the kernel boots without `rd.luks.uuid=`, dracut's cryptsetup-generator never spawns the unlock unit, plymouth has no password to ask for, and dracut-initqueue loops on dev-disk-by-uuid for ~3min before dropping to emergency shell. The fix layers both write paths: - `/etc/default/grub` — keeps the args around for future kernels (kernel-install reads this when adding new entries). - `grubby --update-kernel=ALL --args=...` — rewrites the `options` line of every existing BLS entry so the kernel that boots NEXT actually has the args. Verified by reading `/proc/cmdline` from the dracut emergency shell on a v0.5.26 install; old cmdline had only `root=UUID=… ro rootflags=subvol=root` and was missing the LUKS arg entirely. ## GRUB / branding - `/etc/default/grub` is sed'd to `GRUB_DISTRIBUTOR="veilor-os"` (was already there, kept). - BLS entries' `title` line is rewritten in-place to "veilor-os (<kver>)" for every kernel — `grub2-mkconfig` does not touch BLS titles, so this is the only path. - `/boot/loader/entries/*-0-rescue-*.conf` is removed: the auto-built rescue entry was leaking "Fedora Linux" into the GRUB menu and showing a second boot option that nobody asked for. The rescue kernel image itself is left in /boot. - Hostname defaults to `veilor` (was inheriting the `localhost-live` name anaconda writes when the kickstart's network directive is ignored under cmdline mode). - `/etc/machine-info` adds `PRETTY_HOSTNAME="veilor-os"` so `hostnamectl status` and any consumer reading machine-info see the brand. ## Boot UX - `fbcon=nodefer` added to live-ISO bootloader cmdline. On real laptops with a hardware GPU, the kernel modeset blanks the framebuffer console mid-boot; without `nodefer` the installer banner draws into a frozen framebuffer and the user sees a black screen with a blinking cursor for ~30s. virtio-vga in QEMU doesn't trigger this so it never reproduced in VM. Symptom report on v0.5.26 was the trigger to investigate. ## Installer cosmetics - `GUM_CHOOSE_CURSOR` and `GUM_INPUT_PROMPT` switched from `❯ ` to `> `. The unicode arrow falls back to a fixed-width block on the linux fbcon font and lipgloss then duplicates that block at col +23, producing the "Install Install" double-render and the stray-T artifact in password fields. Plain ASCII renders identically across fbcon, virtio-vga, and X/Wayland gum runs. - `VERSION_ID` bumped 0.5.8 → 0.5.27 in the os-release drop-in. The installer banner reads this at runtime, so the live ISO + installed system both now show "veilor-os 0.5.27". ## Test procedure - `test/TESTING.md` — first canonical test procedure document. Splits VM (cheap iteration, hybrid sendkey + human passwords) from real hardware (mandatory for tag). Documents the standard test passwords (`veilortest1` for both LUKS and admin), the kill-and-relaunch step to skip CD on second boot, and the per-step pass/fail contract. - `test/METHOD-CHANGELOG.md` — append-only audit trail for changes to the procedure. Future releases that alter the test method must add an entry here with the why. - `test/test-runs/_TEMPLATE.md` — per-run report template. Each tagged release should land a filled report alongside it. ## test/run-vm.sh Decoupled QEMU monitor sock setup from auto-inject. Previously `NO_INJECT=1` (used to suppress autotype noise into prompts) also killed the monitor sock, leaving the VM undriveable. Monitor sock is now always exposed; only the inject helper is gated on the pubkey detection.
2026-05-05 01:43:00 +01:00
## 2026-05-05 · v0.5.27 · TESTING.md created
**Change:** First version of the canonical procedure document.
**Why:** Through v0.2 → v0.5.26 we'd been reproducing the test
procedure ad-hoc each time, which meant test runs were uncomparable
and regressions were caught only by accident. The v0.5.26 → v0.5.27
debugging session surfaced the LUKS-cmdline bug, the GRUB rebrand
gap, the gum-cursor render glitch, and the fbcon KMS issue all in a
single VM run — but only because the test happened to walk every step
in order. Codifying the steps means the next regression is caught the
same way reliably.
The procedure documents the **hybrid VM method** explicitly: Claude
drives every step it can via QEMU monitor `sendkey`, the human types
LUKS + admin passwords directly into the QEMU window because plymouth
ignores synthesised keystrokes. Past trail (14+ failed sendkey
variants) is the source of truth for that limitation; do not re-fight
that battle without first rereading the trail.
The procedure also separates **VM** (cheap iteration, catches install
logic) from **real hardware** (mandatory for tag, catches firmware /
KMS / GPU). Future releases must produce a `test/test-runs/` report
for each before tagging.
**Files added:**
- `test/TESTING.md` — canonical procedure
- `test/METHOD-CHANGELOG.md` — this file
- `test/test-runs/` — per-run reports go here (template lands with
first real run, currently empty)