v0.5.28 (final): patch anaconda transaction_progress.py + exclude man-db

This fixes the actual root cause of the man-db transaction failure
that killed three consecutive VM installs (v0.5.26 / v0.5.27 /
v0.5.28). Confirmed via 7-agent research wave:

- Fedora 43 ships RPM 6.0, which changed scriptlet failure
  propagation. Scriptlets that previously emitted "Non-critical
  error" warnings now bubble up as transaction-level errors. dnf5
  issue #2507 documents the change. Anaconda --cmdline mode treats
  any 'error' token from the dnf transaction as a fatal abort.
- man-db's `transfiletriggerin` is the canonical trigger: it runs
  `systemd-run /usr/bin/systemctl start man-db-cache-update` which
  returns non-zero in the anaconda chroot (no PID 1 systemd) and is
  flagged as a transaction-level error under RPM 6.0.
- We previously patched anaconda's transaction_progress.py on the
  BUILD HOST so livecd-creator could finish its own transaction.
  That patch lives only on the host running the build; it never landed
  in the live rootfs the user installs from. Reproduced 3 times:
  install-time anaconda on the live ISO is unpatched, hits the same
  code path, and aborts at exactly "Configuring man-db.x86_64".

Two-layer fix:

1. kickstart %post seds the file inside the live rootfs at build time
   so the user's install-time anaconda is patched. Sed downgrades the
   'error' token from raise PayloadInstallationError to log.warning.

2. Generated install ks excludes man-db / man-pages / man-pages-overrides
   from %packages. Belt-and-braces — even if the patch has an edge
   case the trigger never fires because the package isn't installed.
   Users install man pages post-firstboot.
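
Layer 1 can be tried in isolation with a minimal sketch like the one
below. The sample file is a hypothetical stand-in (the surrounding
`elif` line is an assumption, not copied from anaconda); only the
`raise` line and the replacement text come from the sed expression in
this commit. The sketch writes to a second file instead of using
`sed -i` so it also runs on non-GNU sed.

```shell
#!/usr/bin/env bash
set -u

# Hypothetical stand-in for transaction_progress.py. Only the raise line
# mirrors the real pattern; the elif line is illustrative.
sample="$(mktemp)"
out="$(mktemp)"
cat > "$sample" <<'EOF'
        elif token == 'error':
            raise PayloadInstallationError("An error occurred during the transaction: " + msg)
EOF

# Same substitution as the %post step: replace the raise with a warning log call.
sed 's|raise PayloadInstallationError("An error occurred during the transaction: " + msg)|log.warning("veilor: ignoring non-fatal transaction error: %s", msg)|' "$sample" > "$out"

# Same verification as the %post step: look for the marker string.
if grep -qF 'veilor: ignoring' "$out"; then
    patched=yes
else
    patched=no
fi
echo "patched=$patched"
```

Checking for the marker string rather than trusting sed's exit status
matters because sed exits 0 even when the pattern matches nothing.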

Previous attempts that didn't work: dropping the updates repo (only
narrowed the set of failing scriptlets, didn't fix the underlying
RPM-6.0 propagation change); flipping SELinux to permissive
(confirmed not the cause; kickstart's selinux directive only writes
/etc/selinux/config in target root, doesn't affect installer-time).

Follow-up for next release: replicate the transaction_progress patch
in the CI workflow's container so the build itself is deterministic.
Currently the workflow only goes green by luck.
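
One possible shape for that follow-up, as a sketch only: an idempotent
helper that the CI container could call, and that fails loudly when the
upstream line has changed instead of printing a warning. The function
name `apply_tp_patch` and its return codes are hypothetical and not
part of this commit.

```shell
#!/usr/bin/env bash
set -u

OLD='raise PayloadInstallationError("An error occurred during the transaction: " + msg)'
NEW='log.warning("veilor: ignoring non-fatal transaction error: %s", msg)'

# apply_tp_patch FILE (hypothetical helper)
#   returns 0 if FILE is patched (now or already),
#           1 if FILE does not exist,
#           2 if the expected raise line is absent (upstream changed).
apply_tp_patch() {
    local f="$1" tmp
    [ -f "$f" ] || return 1
    grep -qF "$NEW" "$f" && return 0
    grep -qF "$OLD" "$f" || return 2
    tmp="$(mktemp)"
    # Write through cat so FILE keeps its original mode and ownership.
    sed "s|$OLD|$NEW|" "$f" > "$tmp" && cat "$tmp" > "$f"
    rm -f "$tmp"
    grep -qF "$NEW" "$f"
}

# Demo on a throwaway file: first run patches, second run is a no-op,
# and a file without the expected line is reported as drift.
demo="$(mktemp)"
printf '%s\n' "            $OLD" > "$demo"
apply_tp_patch "$demo"; first=$?
apply_tp_patch "$demo"; second=$?
printf 'unrelated line\n' > "$demo"
apply_tp_patch "$demo"; drifted=$?
echo "first=$first second=$second drifted=$drifted"
```

A CI step could treat return code 2 as a hard failure, which would
surface an anaconda layout change at build time rather than at install
time.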

Files: kickstart/veilor-os.ks (+39 lines), overlay/usr/local/bin/veilor-installer (+11 lines).
Verified: bash -n clean, ksvalidator clean.
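
Layer 2 could also be checked against the generated install ks. The
sketch below assumes the exclusions appear one per line between
`%packages` and the next `%end`; the helper name `ks_excludes` and the
sample kickstart are hypothetical.

```shell
#!/usr/bin/env bash
set -u

# ks_excludes KS_FILE PKG (hypothetical helper): succeeds only if "-PKG"
# is the first field of a line inside the %packages section.
ks_excludes() {
    awk -v want="-$2" '
        /^%packages/          { in_pkgs = 1; next }
        in_pkgs && /^%end/    { in_pkgs = 0 }
        in_pkgs && $1 == want { found = 1 }
        END { exit found ? 0 : 1 }
    ' "$1"
}

# Illustrative kickstart fragment, not the real generated file.
ks="$(mktemp)"
cat > "$ks" <<'EOF'
%packages
zram-generator
-mlocate
-man-db
-man-pages
-man-pages-overrides
%end
EOF

missing=0
for pkg in man-db man-pages man-pages-overrides; do
    ks_excludes "$ks" "$pkg" || missing=$((missing + 1))
done
echo "missing exclusions: $missing"
```
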
veilor-org 2026-05-05 03:46:00 +01:00
parent 5716e37f7d
commit 8ccb7bced0
2 changed files with 50 additions and 0 deletions

@ -274,6 +274,45 @@ zram-size = min(ram, 8192)
compression-algorithm = zstd
EOF
# Patch anaconda's transaction_progress.py inside the live rootfs so that
# when the user clicks "Install" from the live ISO and anaconda runs in
# --cmdline mode, a non-fatal scriptlet warning (RC=5) does not get
# escalated to "An error occurred during the transaction" + abort.
#
# Why this is needed: Fedora 43 ships RPM 6.0, which changed scriptlet
# failure propagation (Fedora wiki Changes/RPM-6.0; dnf5 issue #2507).
# Scriptlets that previously emitted "Non-critical error" warnings now
# bubble up as transaction-level errors. man-db's
# `transfiletriggerin` is the most common trigger — `systemd-run
# /usr/bin/systemctl start man-db-cache-update` returns non-zero in
# the anaconda chroot, RPM-6.0-aware dnf5 reports it as a transaction
# error, and anaconda --cmdline aborts.
#
# We previously patched the same file on the BUILD HOST (build/build-iso.sh)
# so livecd-creator could finish its own transaction. That patch lives
# only on the host running the build — never landed in the live rootfs
# the user installs from. Reproduced in 3 consecutive VM tests
# (v0.5.26 / v0.5.27 / v0.5.28) failing at exactly "Configuring
# man-db.x86_64".
#
# The patch downgrades the 'error' token in the transaction progress
# callback to a warning log line. Confirmed working at build time
# (build/build-iso.sh:47-51).
TP=/usr/lib64/python3.14/site-packages/pyanaconda/modules/payloads/payload/dnf/transaction_progress.py
if [ -f "$TP" ]; then
    cp -a "$TP" "${TP}.veilor-bak"
    sed -i 's|raise PayloadInstallationError("An error occurred during the transaction: " + msg)|log.warning("veilor: ignoring non-fatal transaction error: %s", msg)|' "$TP"
    if grep -q 'veilor: ignoring' "$TP"; then
        echo "[OK] transaction_progress.py patched in live rootfs"
        # Drop the cached .pyc so the patched .py is what runs.
        rm -f /usr/lib64/python3.14/site-packages/pyanaconda/modules/payloads/payload/dnf/__pycache__/transaction_progress.*.pyc 2>/dev/null || true
    else
        echo "[WARN] transaction_progress.py patch did not apply — file format may have changed in this anaconda version"
    fi
else
    echo "[WARN] transaction_progress.py not found at expected path — anaconda may have moved it"
fi
# Enable services
# veilor-firstboot.service NOT enabled on live ISO — it prompts admin pw
# which makes no sense on a live boot. Real installs enable it in their

@ -486,6 +486,17 @@ zram-generator
-open-vm-tools-desktop
-mlocate
# Belt-and-braces with the kickstart/veilor-os.ks transaction_progress
# patch: even with the patch, man-db's transfiletriggerin in the F43
# RPM 6.0 toolchain dispatches a systemd-run that anaconda's chroot
# can race with on exit. Excluding the package entirely guarantees the
# trigger never fires during install. Veilor users who want man pages
# install them post-firstboot via \`dnf install man-db man-pages\` or
# via the v0.6 \`veilor-postinstall\` welcome menu.
-man-db
-man-pages
-man-pages-overrides
%end
# ── Post-install (nochroot): copy overlay + scripts + assets from boot ISO.