v0.5.31: kernel-install via /etc/kernel/cmdline + set-e leak + rescue glob
Four-bug fix from a 4-agent verification wave on the v0.5.30 outcome.

Bug 1 (CRITICAL): --location=none made anaconda skip CollectKernelArgumentsTask (installation.py:149-151). The --append= args were never collected, so BLS entries were written with an empty cmdline. Fix: drop --location=none and let anaconda run its bootloader path; the broad transaction_progress patch already silences the gen_grub_cfgstub class of failure.

Bug 2 (CRITICAL): kernel-install reads /etc/kernel/cmdline as its source of truth (per 90-loaderentry.install:84-95). Veilor never wrote that file, so kernel-install fell through to /proc/cmdline (the live ISO's). Fix: add a 3-path write: /etc/kernel/cmdline (Path A, canonical), /etc/default/grub (Path B, legacy), and grubby --update-kernel=ALL (Path C, last-writer guard), plus an explicit kernel-install add per kernel after the Path A write.

Bug 3: the rescue BLS glob *-0-rescue-*.conf required a trailing hyphen; F43 uses *-0-rescue.conf. Fix: *-0-rescue*.conf (matches both).

Bug 4: set +e/set -e scope leak in %post. v0.5.30 closed the manual bootloader block with set -e, which re-enabled errexit for the rest of %post, even though that remainder was authored with set +e semantics. Result: any non-guarded command failure aborted the LUKS args injection block. Fix: remove the closing set -e.

Files: overlay/usr/local/bin/veilor-installer.
Verified: bash -n clean, ksvalidator clean.
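The fallback behind Bug 2 can be modelled in a few lines. This is a minimal sketch, not the code from 90-loaderentry.install: `pick_cmdline` is a hypothetical helper, both paths are parameters so it can run against temp files, and it covers only the two sources named above (the config file, then /proc/cmdline). The sample argument strings are made up.

```shell
#!/usr/bin/env bash
# Hypothetical model of the lookup described in Bug 2: prefer the config
# file, fall back to the running kernel's command line.
pick_cmdline() {
    local conf_file=${1:-/etc/kernel/cmdline}
    local proc_file=${2:-/proc/cmdline}
    if [ -f "$conf_file" ]; then
        # Join lines with spaces: a BLS `options` value is a single line.
        paste -s -d ' ' "$conf_file"
    else
        cat "$proc_file"
    fi
}

demo_dir=$(mktemp -d)
echo "BOOT_IMAGE=/images/pxeboot/vmlinuz rd.live.image" > "$demo_dir/proc_cmdline"

# No config file yet: the live environment's arguments leak through.
pick_cmdline "$demo_dir/cmdline" "$demo_dir/proc_cmdline"

# After a Path A style write, the installed system's arguments win.
echo "root=UUID=1234 ro rd.luks.uuid=luks-abcd" > "$demo_dir/cmdline"
pick_cmdline "$demo_dir/cmdline" "$demo_dir/proc_cmdline"

rm -rf "$demo_dir"
```

The first call prints the stand-in for the live ISO's command line and the second prints the written file's content, which is why the Path A write has to happen before kernel-install is re-run.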
parent e83483a077
commit 2788b95a12
1 changed file with 89 additions and 46 deletions
@@ -407,21 +407,27 @@ __SSHKEY_DIRECTIVE__
 # - `fbcon=nodefer` — keep linux framebuffer console alive through KMS
 # handoff so plymouth LUKS prompt remains visible on real GPUs.
 #
-# `--location=none` — DO NOT let anaconda install the bootloader. v0.5.30
-# moved bootloader install to %post chroot below for two reasons:
-# 1. Anaconda's gen_grub_cfgstub script (efi.py:194-201) runs
-# against an /boot/efi/EFI/fedora/ tree that grub2-efi-x64's
-# posttrans scriptlet may not have populated yet — Fedora 43's
-# RPM 6.0 + dnf5 + cmdline-mode anaconda combo is brittle here.
-# Reproduced as "gen_grub_cfgstub script failed" twice.
-# 2. Running grub2-install + grub2-mkconfig directly in %post lets
-# us pick up the env after anaconda finishes the package
-# transaction, with all scriptlets' file artifacts settled, and
-# gives clearer error messages if anything goes wrong.
-# We still install the packages (grub2-efi-x64, shim-x64, efibootmgr)
-# via %packages — anaconda just doesn't auto-invoke its bootloader code
-# path.
-bootloader --location=none --append="lockdown=integrity slab_nomerge init_on_alloc=1 init_on_free=1 randomize_kstack_offset=on vsyscall=none fbcon=nodefer"
+# NOTE on --location: v0.5.30 used --location=none to skip anaconda's
+# bootloader install (sidestep gen_grub_cfgstub). Side effect: anaconda
+# also skipped CollectKernelArgumentsTask (installation.py:149-151), so
+# `--append=` args were NEVER COLLECTED. kernel-install then wrote BLS
+# entries with empty /etc/kernel/cmdline, falling through to the live
+# ISO's /proc/cmdline — no rd.luks.uuid, no fbcon=nodefer, no hardening.
+# Result: dracut emergency shell on first boot.
+#
+# v0.5.31 lets anaconda install the bootloader (default behavior, no
+# --location flag). With our broad transaction_progress patch in the
+# live ks, anaconda's gen_grub_cfgstub still runs, but if grub2-efi-x64's
+# posttrans had a non-fatal scriptlet failure the patch swallows it
+# without aborting. The %post chroot below STILL does belt-and-braces
+# fixup (dnf reinstall, grub2-install, etc.) in case anaconda's path
+# left something incomplete.
+#
+# Critically v0.5.31 also writes /etc/kernel/cmdline FIRST in %post then
+# re-runs kernel-install per kernel. That's the canonical Fedora 43 path
+# for landing args in BLS entries — kernel-install reads /etc/kernel/cmdline
+# (90-loaderentry.install:84-95) when generating BLS option lines.
+bootloader --append="lockdown=integrity slab_nomerge init_on_alloc=1 init_on_free=1 randomize_kstack_offset=on vsyscall=none fbcon=nodefer"
 
 # Disk: zero, LUKS2 (argon2id), btrfs subvolumes (no LVM intermediary).
 # Native btrfs-on-LUKS matches Fedora KDE Spin defaults; LVM+btrfs combo
@@ -664,7 +670,13 @@ if [ -n "$EFI_DISK" ] && [ -e "$EFI_DISK" ]; then
 fi
 
 echo "[INFO] bootloader install: see above for any [WARN] lines"
-set -e
+# NOTE: deliberately NOT `set -e` here. The block above opened with
+# `set +e` and the rest of %post is a sequence of best-effort hardening
+# steps that have local `|| true` guards on the operations that may
+# legitimately fail. Re-enabling errexit would cause `set -e` to abort
+# the whole %post on the first non-guarded command (e.g. a `grep -q`
+# returning 1). v0.5.30 had this bug and it silently truncated
+# the LUKS args injection.
 
 # GRUB branding: replace fedora distributor with veilor-os in menu titles.
 # Drop rhgb quiet from default cmdline → all kernel/systemd messages
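The errexit leak addressed by this hunk can be reproduced in isolation. The following is a minimal sketch and not code from veilor-installer: the two script strings and the `run_variant` helper are hypothetical, and an unguarded `grep -q` stands in for whatever command happened to fail in the real %post.

```shell
#!/usr/bin/env bash
# Each variant runs in its own `bash -c`, so its shell options cannot
# affect the caller.

# v0.5.30 shape: the block is closed with `set -e`, so the next unguarded
# failing command aborts everything after it.
leaky='
set +e
false                      # tolerated: errexit is off inside the block
set -e                     # the closing line that v0.5.31 removes
echo "no-match" | grep -q "rd.luks.uuid"
echo "LUKS args injected"
'

# v0.5.31 shape: errexit stays off, so the same grep result is harmless.
fixed='
set +e
false
echo "no-match" | grep -q "rd.luks.uuid"
echo "LUKS args injected"
'

run_variant() {
    # Print the variant's output, a `|` separator, then its exit status.
    local out status
    out=$(bash -c "$1" 2>&1)
    status=$?
    printf '%s|%s\n' "$out" "$status"
}

run_variant "$leaky"
run_variant "$fixed"
```

The leaky variant exits with status 1 before reaching the final `echo`, while the fixed variant prints its message and exits 0. Neither prints an error, which matches the "silently truncated" behavior the new comment describes.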
@@ -696,45 +708,73 @@ sed -i \
 # user lands in emergency shell on first boot.
 LUKS_UUID=$(blkid -t TYPE=crypto_LUKS -o value -s UUID 2>/dev/null | head -1)
 if [ -n "$LUKS_UUID" ]; then
-# Args:
-# rd.luks.uuid=luks-XXX — tells dracut to expect a LUKS device,
-# triggers cryptsetup-generator.
-# rd.luks.options=...=tries=5 — five typo retries before giving up
-# (default 1; one slip = emergency
-# shell after 3min, terrible UX).
-# rd.luks.options=...=timeout=0 — never time out unlock device wait
-# (default 1m30s; slow user typing
-# on a long passphrase still works).
-# fbcon=nodefer — keep linux framebuffer console alive
-# through KMS handoff. Without this on
-# real laptops the plymouth LUKS prompt
-# draws into a frozen framebuffer and
-# the user sees a black screen with a
-# blinking cursor. Already in the live
-# ISO bootloader cmdline; missing from
-# the installed-system bootloader line
-# in the generated install ks above
-# (also fixed there).
-LUKS_ARGS="rd.luks.uuid=luks-${LUKS_UUID} rd.luks.options=luks-${LUKS_UUID}=tries=5,timeout=0 fbcon=nodefer"
+LUKS_ARGS="rd.luks.uuid=luks-${LUKS_UUID} rd.luks.options=luks-${LUKS_UUID}=tries=5,timeout=0"
+HARDEN_ARGS="lockdown=integrity slab_nomerge init_on_alloc=1 init_on_free=1 randomize_kstack_offset=on vsyscall=none fbcon=nodefer"
 
-# Path 1: persist into /etc/default/grub so future kernels inherit.
+# Find the running root UUID (the btrfs filesystem holding the root
+# subvol). At this point in %post chroot, `/` is the target root;
+# findmnt -o UUID resolves to the btrfs UUID anaconda chose.
+ROOT_UUID=$(findmnt -n -o UUID /)
+[ -z "$ROOT_UUID" ] && ROOT_UUID=$(blkid -s UUID -o value /dev/mapper/luks-${LUKS_UUID} 2>/dev/null)
+
+# Three write paths, in priority order:
+#
+# Path A: /etc/kernel/cmdline (the canonical source of truth for
+# `kernel-install`). Per /usr/lib/kernel/install.d/90-loaderentry.install
+# lines 84-95, kernel-install reads /etc/kernel/cmdline first when
+# authoring BLS entries. If we write this BEFORE re-running
+# kernel-install, every BLS entry inherits our args. Persists
+# across `dnf upgrade kernel`, `dnf reinstall grub2-*`, and any
+# other path that re-fires kernel-install hooks.
+#
+# Path B: /etc/default/grub (legacy GRUB_CMDLINE_LINUX). Read by
+# `grub2-mkconfig` for the generated grub.cfg. Belt-and-braces;
+# kernel-install ignores this, but grub2-mkconfig respects it.
+#
+# Path C: grubby --update-kernel=ALL. Direct edit to BLS option
+# lines. Acts as the last-writer in case our cmdline write didn't
+# trigger a fresh kernel-install pass.
+#
+# Earlier veilor-os versions only used B+C. v0.5.31 adds Path A as
+# the primary, because v0.5.30 testing showed B+C are racy with
+# anaconda's own CreateBLSEntriesTask which uses kernel-install
+# internally and can rewrite entries from empty /etc/kernel/cmdline,
+# producing options lines with no rd.luks.uuid even when grubby
+# successfully ran.
+
+# Path A
+mkdir -p /etc/kernel
+if [ -n "$ROOT_UUID" ]; then
+echo "root=UUID=${ROOT_UUID} ro rootflags=subvol=root ${LUKS_ARGS} ${HARDEN_ARGS}" > /etc/kernel/cmdline
+echo "[INFO] wrote /etc/kernel/cmdline (canonical kernel-install source)"
+else
+echo "[WARN] could not determine root UUID; /etc/kernel/cmdline not written"
+fi
+
+# Path B
 if ! grep -q "rd.luks.uuid" /etc/default/grub 2>/dev/null; then
 sed -i "s|^GRUB_CMDLINE_LINUX=\"|GRUB_CMDLINE_LINUX=\"${LUKS_ARGS} |" /etc/default/grub
 fi
 
-# Path 2: update existing BLS entries so the kernel that boots NEXT
-# gets the args. grubby walks /boot/loader/entries/*.conf and edits
-# the `options` line in-place.
+# Re-run kernel-install for every kernel — picks up new /etc/kernel/cmdline,
+# rewrites BLS entries with our args. This is the load-bearing step.
+for kver in /lib/modules/*/; do
+kver=$(basename "$kver")
+[ -f "/lib/modules/$kver/vmlinuz" ] || continue
+kernel-install add "$kver" "/lib/modules/$kver/vmlinuz" 2>&1 | tail -3 || \
+echo "[WARN] kernel-install add $kver failed"
+done
+
+# Path C: belt-and-braces grubby update in case kernel-install missed any
 grubby --update-kernel=ALL --args="${LUKS_ARGS}" 2>&1 | tail -5 || true
 
-# Verification: every BLS entry MUST carry the LUKS arg now. Empty
-# output = success.
+# Verification: every BLS entry MUST carry the LUKS arg.
 drift=$(grep -L "rd.luks.uuid" /boot/loader/entries/*.conf 2>/dev/null)
 if [ -n "$drift" ]; then
-echo "[WARN] BLS entries missing rd.luks.uuid: $drift"
+echo "[ERR] BLS entries missing rd.luks.uuid after all 3 paths: $drift"
+else
+echo "[OK] all BLS entries carry rd.luks.uuid"
 fi
-
-echo "[INFO] injected ${LUKS_ARGS} into /etc/default/grub + BLS entries"
 fi
 
 # Verify anaconda wrote /etc/crypttab for the LUKS device. anaconda's
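The verification step at the end of this hunk relies on `grep -L`, which prints the names of files that contain no match. A minimal sketch of that check against throwaway files follows; `find_drift` is a hypothetical helper, the entries directory is a parameter so the check can run outside a real /boot, and the entry contents are made up.

```shell
#!/usr/bin/env bash
# List BLS-style entry files that do NOT mention rd.luks.uuid.
find_drift() {
    local entries_dir=$1
    # -L prints only the files without a match, i.e. the drifted entries.
    grep -L "rd.luks.uuid" "$entries_dir"/*.conf 2>/dev/null
}

demo_dir=$(mktemp -d)
printf 'title Demo\noptions root=UUID=1234 ro rd.luks.uuid=luks-abcd\n' > "$demo_dir/good.conf"
printf 'title Demo\noptions root=UUID=1234 ro\n' > "$demo_dir/bad.conf"

drift=$(find_drift "$demo_dir")
if [ -n "$drift" ]; then
    echo "[ERR] BLS entries missing rd.luks.uuid: $drift"
else
    echo "[OK] all BLS entries carry rd.luks.uuid"
fi
rm -rf "$demo_dir"
```

One caveat of this pattern: if the directory holds no `.conf` files at all, the glob stays literal, grep's error goes to /dev/null, and the empty result reads as success. A stricter check would also confirm that at least one entry exists.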
@@ -786,7 +826,10 @@ grub2-mkconfig -o /boot/grub2/grub.cfg 2>/dev/null || true
 # points at it is created by `kernel-install` and shows up in GRUB as a
 # second menu item. For a branded distro it's noisy + reveals "Fedora"
 # in the menu. The rescue image itself is harmless to keep on disk.
-rm -f /boot/loader/entries/*-0-rescue-*.conf 2>/dev/null || true
+# Match both `<machine-id>-0-rescue.conf` (current Fedora 43 layout) and
+# `<machine-id>-0-rescue-<kver>.conf` (older layout). The earlier glob
+# `*-0-rescue-*.conf` required a trailing hyphen and missed the new form.
+rm -f /boot/loader/entries/*-0-rescue*.conf 2>/dev/null || true
 
 # Hostname: default to "veilor" rather than the localhost-live / fedora
 # fallback that anaconda writes. User can override post-install with
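The difference between the two rescue globs is easy to check on throwaway files. This is a minimal sketch: `count_matches` is a hypothetical helper, and the machine-id and kernel version in the file names are made-up placeholders.

```shell
#!/usr/bin/env bash
# Count how many files in a directory match a glob pattern.
count_matches() {
    # $1: directory, $2: glob pattern (left unquoted below on purpose so
    # the shell expands it).
    local dir=$1 pattern=$2 n=0 f
    for f in "$dir"/$pattern; do
        # An unmatched glob stays literal, so only count existing files.
        [ -e "$f" ] && n=$((n + 1))
    done
    echo "$n"
}

demo_dir=$(mktemp -d)
touch "$demo_dir/0123abcd-0-rescue.conf"              # no suffix after "rescue"
touch "$demo_dir/0123abcd-0-rescue-6.0.0-demo.conf"   # kver suffix after "rescue-"
touch "$demo_dir/0123abcd-6.0.0-demo.conf"            # regular entry, must survive

echo "old glob: $(count_matches "$demo_dir" '*-0-rescue-*.conf')"
echo "new glob: $(count_matches "$demo_dir" '*-0-rescue*.conf')"
rm -rf "$demo_dir"
```

The old pattern matches only the file with the hyphenated suffix, the new one matches both rescue files, and neither touches the regular kernel entry.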