v0.5.31: kernel-install via /etc/kernel/cmdline + set-e leak + rescue glob
Four-bug fix from the 4-agent verification wave on the v0.5.30 outcome.

Bug 1 (CRITICAL): --location=none made anaconda skip CollectKernelArgumentsTask (installation.py:149-151), so --append= args were never collected and BLS entries were written with an empty cmdline. Fix: drop --location=none and let anaconda run its own bootloader path; the broad transaction_progress patch already silences the gen_grub_cfgstub class of failure.

Bug 2 (CRITICAL): kernel-install reads /etc/kernel/cmdline as its source of truth (per 90-loaderentry.install:84-95). Veilor never wrote that file, so kernel-install fell through to /proc/cmdline (the live ISO's). Fix: write the args through three paths: /etc/kernel/cmdline (Path A, canonical), /etc/default/grub (Path B, legacy), and grubby --update-kernel=ALL (Path C, last-writer guard), plus an explicit kernel-install add per kernel after the Path A write.

Bug 3: the rescue BLS glob *-0-rescue-*.conf required a trailing hyphen, but F43 uses *-0-rescue.conf. Fix: *-0-rescue*.conf, which matches both.

Bug 4: set +e/set -e scope leak in %post. v0.5.30 closed the manual bootloader block with set -e, which re-enabled errexit for the rest of %post even though that code was authored with set +e semantics. Result: any unguarded command failure aborted the LUKS args injection block. Fix: remove the closing set -e.

Files: overlay/usr/local/bin/veilor-installer.
Verified: bash -n clean, ksvalidator clean.
parent b2468542c0
commit a0b0d02bf2
1 changed file with 89 additions and 46 deletions
@ -407,21 +407,27 @@ __SSHKEY_DIRECTIVE__
# - `fbcon=nodefer`: keep linux framebuffer console alive through KMS
# handoff so plymouth LUKS prompt remains visible on real GPUs.
#
# `--location=none`: DO NOT let anaconda install the bootloader. v0.5.30
# moved bootloader install to %post chroot below for two reasons:
# 1. Anaconda's gen_grub_cfgstub script (efi.py:194-201) runs
# against an /boot/efi/EFI/fedora/ tree that grub2-efi-x64's
# posttrans scriptlet may not have populated yet; Fedora 43's
# RPM 6.0 + dnf5 + cmdline-mode anaconda combo is brittle here.
# Reproduced as "gen_grub_cfgstub script failed" twice.
# 2. Running grub2-install + grub2-mkconfig directly in %post lets
# us pick up the env after anaconda finishes the package
# transaction, with all scriptlets' file artifacts settled, and
# gives clearer error messages if anything goes wrong.
# We still install the packages (grub2-efi-x64, shim-x64, efibootmgr)
# via %packages; anaconda just doesn't auto-invoke its bootloader code
# path.
bootloader --location=none --append="lockdown=integrity slab_nomerge init_on_alloc=1 init_on_free=1 randomize_kstack_offset=on vsyscall=none fbcon=nodefer"
# NOTE on --location: v0.5.30 used --location=none to skip anaconda's
# bootloader install (sidestep gen_grub_cfgstub). Side effect: anaconda
# also skipped CollectKernelArgumentsTask (installation.py:149-151), so
# `--append=` args were NEVER COLLECTED. kernel-install then wrote BLS
# entries with empty /etc/kernel/cmdline, falling through to the live
# ISO's /proc/cmdline: no rd.luks.uuid, no fbcon=nodefer, no hardening.
# Result: dracut emergency shell on first boot.
#
# v0.5.31 lets anaconda install the bootloader (default behavior, no
# --location flag). With our broad transaction_progress patch in the
# live ks, anaconda's gen_grub_cfgstub still runs, but if grub2-efi-x64's
# posttrans had a non-fatal scriptlet failure the patch swallows it
# without aborting. The %post chroot below STILL does belt-and-braces
# fixup (dnf reinstall, grub2-install, etc.) in case anaconda's path
# left something incomplete.
#
# Critically v0.5.31 also writes /etc/kernel/cmdline FIRST in %post then
# re-runs kernel-install per kernel. That's the canonical Fedora 43 path
# for landing args in BLS entries: kernel-install reads /etc/kernel/cmdline
# (90-loaderentry.install:84-95) when generating BLS option lines.
bootloader --append="lockdown=integrity slab_nomerge init_on_alloc=1 init_on_free=1 randomize_kstack_offset=on vsyscall=none fbcon=nodefer"

# Disk: zero, LUKS2 (argon2id), btrfs subvolumes (no LVM intermediary).
# Native btrfs-on-LUKS matches Fedora KDE Spin defaults; LVM+btrfs combo
@ -664,7 +670,13 @@ if [ -n "$EFI_DISK" ] && [ -e "$EFI_DISK" ]; then
fi

echo "[INFO] bootloader install: see above for any [WARN] lines"
set -e
# NOTE: deliberately NOT `set -e` here. The block above opened with
# `set +e` and the rest of %post is a sequence of best-effort hardening
# steps that have local `|| true` guards on the operations that may
# legitimately fail. Re-enabling errexit would cause `set -e` to abort
# the whole %post on the first non-guarded command (e.g. a `grep -q`
# returning 1). v0.5.30 had this bug and it silently truncated
# the LUKS args injection.
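The scope leak described in the note above can be reproduced in isolation. This is a minimal sketch, not the installer's code: `run_post` is a hypothetical stand-in for a `%post` body, and the echoed marker is illustrative. It runs the body in a fresh `sh -c` process so that errexit behaves as it would in a standalone script.

```shell
# Hypothetical stand-in for a %post body (not taken from veilor-installer).
# $1 = "leak"  : re-enable errexit after the best-effort block (v0.5.30 shape)
# anything else: leave errexit off (v0.5.31 shape)
run_post() {
    sh -c '
        set +e                        # best-effort block opens
        false                         # a step that may legitimately fail
        [ "$1" = "leak" ] && set -e   # the closing set -e leaks onward
        grep -q "needle" /dev/null    # unguarded command that returns 1
        echo "luks-args-injected"     # the step that must still run
    ' _ "$1"
}
```

With `leak`, the unguarded `grep -q` ends the child shell before the final `echo`; with any other argument the marker is printed.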

# GRUB branding: replace fedora distributor with veilor-os in menu titles.
# Drop rhgb quiet from default cmdline → all kernel/systemd messages
@ -696,45 +708,73 @@ sed -i \
# user lands in emergency shell on first boot.
LUKS_UUID=$(blkid -t TYPE=crypto_LUKS -o value -s UUID 2>/dev/null | head -1)
if [ -n "$LUKS_UUID" ]; then
# Args:
# rd.luks.uuid=luks-XXX: tells dracut to expect a LUKS device,
# triggers cryptsetup-generator.
# rd.luks.options=...=tries=5: five typo retries before giving up
# (default 1; one slip = emergency
# shell after 3min, terrible UX).
# rd.luks.options=...=timeout=0: never time out unlock device wait
# (default 1m30s; slow user typing
# on a long passphrase still works).
# fbcon=nodefer: keep linux framebuffer console alive
# through KMS handoff. Without this on
# real laptops the plymouth LUKS prompt
# draws into a frozen framebuffer and
# the user sees a black screen with a
# blinking cursor. Already in the live
# ISO bootloader cmdline; missing from
# the installed-system bootloader line
# in the generated install ks above
# (also fixed there).
LUKS_ARGS="rd.luks.uuid=luks-${LUKS_UUID} rd.luks.options=luks-${LUKS_UUID}=tries=5,timeout=0 fbcon=nodefer"
LUKS_ARGS="rd.luks.uuid=luks-${LUKS_UUID} rd.luks.options=luks-${LUKS_UUID}=tries=5,timeout=0"
HARDEN_ARGS="lockdown=integrity slab_nomerge init_on_alloc=1 init_on_free=1 randomize_kstack_offset=on vsyscall=none fbcon=nodefer"

# Path 1: persist into /etc/default/grub so future kernels inherit.
# Find the running root UUID (the btrfs filesystem holding the root
# subvol). At this point in %post chroot, `/` is the target root;
# findmnt -o UUID resolves to the btrfs UUID anaconda chose.
ROOT_UUID=$(findmnt -n -o UUID /)
[ -z "$ROOT_UUID" ] && ROOT_UUID=$(blkid -s UUID -o value /dev/mapper/luks-${LUKS_UUID} 2>/dev/null)

# Three write paths, in priority order:
#
# Path A: /etc/kernel/cmdline (the canonical source of truth for
# `kernel-install`). Per /usr/lib/kernel/install.d/90-loaderentry.install
# lines 84-95, kernel-install reads /etc/kernel/cmdline first when
# authoring BLS entries. If we write this BEFORE re-running
# kernel-install, every BLS entry inherits our args. Persists
# across `dnf upgrade kernel`, `dnf reinstall grub2-*`, and any
# other path that re-fires kernel-install hooks.
#
# Path B: /etc/default/grub (legacy GRUB_CMDLINE_LINUX). Read by
# `grub2-mkconfig` for the generated grub.cfg. Belt-and-braces;
# kernel-install ignores this, but grub2-mkconfig respects it.
#
# Path C: grubby --update-kernel=ALL. Direct edit to BLS option
# lines. Acts as the last-writer in case our cmdline write didn't
# trigger a fresh kernel-install pass.
#
# Earlier veilor-os versions only used B+C. v0.5.31 adds Path A as
# the primary, because v0.5.30 testing showed B+C are racy with
# anaconda's own CreateBLSEntriesTask which uses kernel-install
# internally and can rewrite entries from empty /etc/kernel/cmdline,
# producing options lines with no rd.luks.uuid even when grubby
# successfully ran.

# Path A
mkdir -p /etc/kernel
if [ -n "$ROOT_UUID" ]; then
echo "root=UUID=${ROOT_UUID} ro rootflags=subvol=root ${LUKS_ARGS} ${HARDEN_ARGS}" > /etc/kernel/cmdline
echo "[INFO] wrote /etc/kernel/cmdline (canonical kernel-install source)"
else
echo "[WARN] could not determine root UUID; /etc/kernel/cmdline not written"
fi
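The line that Path A writes can be assembled and checked without touching `/etc`. The sketch below is an illustration under assumptions: `build_kernel_cmdline` is a hypothetical helper that does not exist in veilor-installer, while the argument strings are copied from `LUKS_ARGS` and `HARDEN_ARGS` above.

```shell
# Hypothetical helper: print the single line destined for /etc/kernel/cmdline.
# $1 = root filesystem UUID, $2 = LUKS container UUID.
build_kernel_cmdline() {
    local root_uuid=$1 luks_uuid=$2
    local luks_args="rd.luks.uuid=luks-${luks_uuid} rd.luks.options=luks-${luks_uuid}=tries=5,timeout=0"
    local harden_args="lockdown=integrity slab_nomerge init_on_alloc=1 init_on_free=1 randomize_kstack_offset=on vsyscall=none fbcon=nodefer"

    # Refuse to emit a cmdline without a root UUID, as the source does.
    [ -n "$root_uuid" ] || return 1

    printf 'root=UUID=%s ro rootflags=subvol=root %s %s\n' \
        "$root_uuid" "$luks_args" "$harden_args"
}
```

A caller could redirect the output to `/etc/kernel/cmdline` only when the function succeeds, which keeps the "no root UUID, no file" behaviour of the block above.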

# Path B
if ! grep -q "rd.luks.uuid" /etc/default/grub 2>/dev/null; then
sed -i "s|^GRUB_CMDLINE_LINUX=\"|GRUB_CMDLINE_LINUX=\"${LUKS_ARGS} |" /etc/default/grub
fi
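The Path B edit is idempotent because the `grep -q` guard skips the `sed` once the args are present. A minimal sketch of the same pattern against an arbitrary file follows; `inject_grub_args` is a hypothetical name, and the sketch assumes GNU `sed` (for `-i`) and args that contain neither `|` nor `&`, since both are special in this `sed` expression.

```shell
# Hypothetical helper: prepend args to GRUB_CMDLINE_LINUX in the given file,
# unless the file already mentions rd.luks.uuid.
# $1 = path to a grub defaults file, $2 = args to prepend.
inject_grub_args() {
    local file=$1 args=$2
    # Guard: a second run must not duplicate the args.
    grep -q "rd.luks.uuid" "$file" 2>/dev/null && return 0
    sed -i "s|^GRUB_CMDLINE_LINUX=\"|GRUB_CMDLINE_LINUX=\"${args} |" "$file"
}
```

Running it twice on the same file leaves exactly one copy of the args on the `GRUB_CMDLINE_LINUX` line.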

# Path 2: update existing BLS entries so the kernel that boots NEXT
# gets the args. grubby walks /boot/loader/entries/*.conf and edits
# the `options` line in-place.
# Re-run kernel-install for every kernel: picks up new /etc/kernel/cmdline,
# rewrites BLS entries with our args. This is the load-bearing step.
for kver in /lib/modules/*/; do
kver=$(basename "$kver")
[ -f "/lib/modules/$kver/vmlinuz" ] || continue
kernel-install add "$kver" "/lib/modules/$kver/vmlinuz" 2>&1 | tail -3 || \
echo "[WARN] kernel-install add $kver failed"
done
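One detail of the loop above is worth checking: unless `pipefail` is enabled (this excerpt does not show it being set), the exit status of `cmd | tail -3` is the status of `tail`, so the `|| echo "[WARN] ..."` branch does not observe a `kernel-install` failure. The sketch below shows that behaviour and one possible guard. `failing_step` is a hypothetical stand-in for a failing `kernel-install add`; none of these functions exist in the installer.

```shell
# Hypothetical stand-in for a failing `kernel-install add`.
failing_step() { echo "simulated failure output"; return 1; }

# Same shape as the loop: with pipefail off (the default), `||` only sees
# the exit status of `tail`, so the warning never prints.
without_guard() {
    failing_step 2>&1 | tail -3 >/dev/null || echo "WARN"
}

# One possible guard: capture output and status first, then trim the output.
with_guard() {
    if out=$(failing_step 2>&1); then status=0; else status=$?; fi
    printf '%s\n' "$out" | tail -3 >/dev/null
    [ "$status" -eq 0 ] || echo "WARN"
}
```

Whether the installer needs such a guard depends on shell options set elsewhere in `%post`; the later `grep -L` verification would still catch entries that ended up without the LUKS arg.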

# Path C: belt-and-braces grubby update in case kernel-install missed any
grubby --update-kernel=ALL --args="${LUKS_ARGS}" 2>&1 | tail -5 || true

# Verification: every BLS entry MUST carry the LUKS arg now. Empty
# output = success.
# Verification: every BLS entry MUST carry the LUKS arg.
drift=$(grep -L "rd.luks.uuid" /boot/loader/entries/*.conf 2>/dev/null)
if [ -n "$drift" ]; then
echo "[WARN] BLS entries missing rd.luks.uuid: $drift"
echo "[ERR] BLS entries missing rd.luks.uuid after all 3 paths: $drift"
else
echo "[OK] all BLS entries carry rd.luks.uuid"
fi
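The verification step relies on `grep -L`, which prints the names of files that contain no match. A minimal sketch of the same check against an arbitrary directory follows; `entries_missing_luks` is a hypothetical helper, and like the source's `[ -n "$drift" ]` test it keys on the output rather than on grep's exit status.

```shell
# Hypothetical helper: list *.conf files under $1 that lack rd.luks.uuid.
# Prints nothing when every entry carries the arg.
entries_missing_luks() {
    grep -L "rd.luks.uuid" "$1"/*.conf 2>/dev/null || true
}
```

Pointing it at a scratch directory with one complete and one incomplete entry should print only the incomplete file's path.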

echo "[INFO] injected ${LUKS_ARGS} into /etc/default/grub + BLS entries"
fi

# Verify anaconda wrote /etc/crypttab for the LUKS device. anaconda's
@ -786,7 +826,10 @@ grub2-mkconfig -o /boot/grub2/grub.cfg 2>/dev/null || true
# points at it is created by `kernel-install` and shows up in GRUB as a
# second menu item. For a branded distro it's noisy + reveals "Fedora"
# in the menu. The rescue image itself is harmless to keep on disk.
rm -f /boot/loader/entries/*-0-rescue-*.conf 2>/dev/null || true
# Match both `<machine-id>-0-rescue.conf` (current Fedora 43 layout) and
# `<machine-id>-0-rescue-<kver>.conf` (older layout). The earlier glob
# `*-0-rescue-*.conf` required a trailing hyphen and missed the new form.
rm -f /boot/loader/entries/*-0-rescue*.conf 2>/dev/null || true
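The difference between the two globs can be checked without touching `/boot`, because `case` uses the same pattern matching as pathname expansion. In the sketch below, `matches` is a hypothetical helper and any file names passed to it are made-up examples of the two layouts described above.

```shell
# Hypothetical helper: succeed if name $1 matches shell pattern $2.
matches() {
    # The pattern is deliberately unquoted so it is treated as a glob.
    case $1 in
        $2) return 0 ;;
        *)  return 1 ;;
    esac
}

old_glob='*-0-rescue-*.conf'   # v0.5.30: requires a hyphen after "rescue"
new_glob='*-0-rescue*.conf'    # v0.5.31: hyphen and suffix are optional
```

For example, `matches abc123-0-rescue.conf "$old_glob"` fails while the same name matches `$new_glob`; a name with a `-<kver>` suffix matches both.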

# Hostname: default to "veilor" rather than the localhost-live / fedora
# fallback that anaconda writes. User can override post-install with