veilor-org/veilor-os

Author	SHA1	Message	Date
obsidian-ai	6391b1104b	bluebuild(recipe): reconcile kickstart %post into BlueBuild modules (A2) Walk every action in kickstart/veilor-os.ks %post and map to its v0.7 atomic equivalent: Build-time script additions: - chmod +x /usr/share/veilor-os/scripts/* + /usr/local/bin/veilor-* (BlueBuild type:files sometimes drops perms) - fc-cache -f after Fira Code stamping - os-release brand override (NAME=veilor-os, ID=veilor, ID_LIKE) - brand-leak guard: fail the image build if any onyx/personal data slipped through into shipped state Layered packages: - zram-generator (memory hygiene; replaces dnf install in kickstart) - jq (used by veilor-doctor for `bootc status --json`) - vim-enhanced + tmux + htop (admin essentials, parity with v0.5.x) Systemd unit enables added: - veilor-postinstall.service (first-login TUI; new in A3) - veilor-doctor.timer (weekly drift check; new in A3) Dropped: anaconda transaction_progress.py patch (build-time CI work, not image content); SDDM display-manager symlink (kinoite ships sddm.service already); SELinux module build (secureblue has its own); systemctl set-default multi-user.target (kinoite is graphical.target by design).	2026-05-06 16:50:02 +01:00
obsidian-ai	4d53d76442	docs: v0.7 user-facing docs (INSTALL-V07, STRATEGY pivot, README, CHANGELOG) A4 inline (agent failed on API): - docs/INSTALL-V07.md: 130-line user walkthrough — bootstrap ISO, Anaconda LUKS prompts, ostreecontainer pull, first-login TUI, day- to-day bootc-upgrade / rpm-ostree-install / bootc-rollback. - docs/STRATEGY.md: append PIVOT EXECUTION 2026-05-06 section recording v0.5 ship, v0.6 cancel, v0.7 active. - README.md: rewrite Quick install block for v0.7 path; legacy v0.5.0 block kept below. - CHANGELOG.md: Unreleased entry covering the spike's CI port + atomic CLI port + docs.	2026-05-06 16:48:48 +01:00
obsidian-ai	606806f82f	overlay: atomic CLI tools for v0.7+ (bootc upgrade, postinstall, doctor) A3 inline (agent failed on API). Three CLIs ported / written for the v0.7+ atomic system: veilor-update — rewritten on bootc upgrade (was dnf upgrade --refresh). Pre-checks bootc status, pauses auditd while staging, prints summary and offers reboot. Returns 0/1/2/3 per legacy contract. veilor-postinstall (NEW) — first-login TUI run via veilor-postinstall.service oneshot. Asks once for keyboard, locale, hostname, GPU drivers, package presets (dev/media/homelab), bluetooth, USBGuard snapshot, then invokes veilor-doctor. Writes /var/lib/veilor/postinstall-complete and self-disables on success. veilor-doctor — Updates section rewritten to parse `bootc status --json` (with jq) when available, falls back to dnf history / check-update for legacy v0.5.x kickstart-installed systems. Plus systemd units: - veilor-postinstall.service (oneshot on graphical.target, gated on absence of done-marker, runs on tty1) - veilor-doctor.service + .timer (weekly drift check)	2026-05-06 16:46:59 +01:00
obsidian-ai	61fec5e1a9	ci(bluebuild): port build to Forgejo runner (nullstone label) A1 inline (agent failed on worktree base mismatch). Adapt build-bluebuild.yml to run on the Forgejo self-hosted runner using the same lessons from build-iso.yml debug: - runs-on: nullstone (resolves to veilor-build:43, fedora43+nodejs) - BlueBuild CLI installed in-job from upstream release tarball v0.9.10 - podman/buildah/skopeo/cosign installed via dnf - bluebuild build with podman driver + skopeo inspect + cosign signing - Push primary to Forgejo registry git.s8n.ru/veilor-org/veilor-os - GHCR push gated to github.server_url == 'https://github.com' only - SBOM + attest-build-provenance gated GH-only (Forgejo has no Fulcio) - All third-party actions remain pinned to node20-shipping versions Secrets needed in Forgejo repo settings: - FORGEJO_REGISTRY_TOKEN: PAT with package:write on veilor-org - FORGEJO_REGISTRY_USER: 's8n-ru' (or org member with write scope)	2026-05-06 16:44:52 +01:00
obsidian-ai	5e94a61ea0	docs(ROADMAP): pivot — v0.6 cancelled, v0.7 BlueBuild OCI is mainline Strategy pivot 2026-05-06: v0.5.32 produced a green ISO on Forgejo runner. That's the kickstart-path proof point. Continuing v0.6 kickstart polish is sunk-cost work on tooling retired at v1.0. Pivot: - v0.5.0 is the FINAL kickstart-path release. Tag, freeze, ship. - v0.6 cancelled as a milestone. Original plan kept inline as HISTORICAL reference. - v0.7 promoted to primary active milestone. Absorbs the v0.6 ergonomic CLI tools (veilor-postinstall / veilor-doctor / veilor-update) with bootc upgrade replacing dnf upgrade. - Active branch: v0.7-bluebuild-spike. All future feature work lands there, not on main.	2026-05-06 16:10:03 +01:00
obsidian-ai	d48e59f05b	docs: add PROOF-OF-WORK.md — receipts of work, tooling, and decisions Single document that surfaces the depth of work behind veilor-os: metrics, distros studied, every tool traversed in the build chain, all 35+ failure classes hit and beaten, key engineering decisions and why, what's in the repo beyond the kickstart, and the self-hosted nullstone CI infrastructure built to support it. Receipts not narrative — every claim links back to a file path, commit, error, or config. Useful as portfolio anchor and as a single read-this-first for anyone returning to the project after a gap.	2026-05-06 16:10:03 +01:00
obsidian-ai	ecd374ab1a	ci: gate cosign/sbom/attest steps to github only cosign keyless sign uses Sigstore Fulcio which requires a Fulcio-trusted OIDC issuer. Forgejo runs don't have one, so cosign falls back to the interactive device flow and times out (error obtaining token: expired_token). Same applies to attest-build-provenance and the SBOM action's signed attestation. Skip all three on Forgejo for now; ISO + sha256 are sufficient for v0.5.x test releases. Re-add when we self-host a Sigstore stack or sign with a key-pair instead of keyless.	2026-05-06 16:10:03 +01:00
obsidian-ai	e17c04007d	docs(README): tone down secureblue credit (no code lifted yet) We layer on their OCI image as v0.7 base; we don't redistribute their source. Drop the AGPLv3-attribution prose — that becomes relevant only if/when we ship a verbatim chunk of their config/policy in our repo.	2026-05-06 16:10:03 +01:00
obsidian-ai	97939d76f8	docs(README): add secureblue column + upstream credit section secureblue (AGPLv3) is the upstream hardened atomic Fedora that the v0.7 BlueBuild spike layers on top of. Comparison table now includes secureblue alongside Kicksecure + stock Fedora KDE. New "Credit & relationship to secureblue" section spells out where their work already solves problems we don't need to reinvent (Trivalent, SELinux policy, kernel cmdline, signed OCI), how veilor-os differs (kickstart install path + branding + Forgejo CI), and the AGPLv3 attribution rule for any code we lift verbatim.	2026-05-06 16:10:03 +01:00
obsidian-ai	abaff9d3c3	ci: symlink /work -> GITHUB_WORKSPACE for ks %post SRC probe	2026-05-06 16:10:03 +01:00
obsidian-ai	29a6677d54	ks: drop apparmor-* packages — not in Fedora 43 repos Build error: 'Failed to find package apparmor-parser : No match for argument'. Fedora 43 base and updates do not ship AppArmor packages; the prior comment was incorrect. Defer AppArmor to v0.7 secureblue OCI hybrid (which has its own LSM stack), or land via COPR overlay later.	2026-05-06 16:10:03 +01:00
obsidian-ai	b3572565e2	ci: run build directly in Fedora job container, drop addnab nest forgejo-runner labels nullstone -> fedora:43 image. Switching runs-on: ubuntu-24.04 -> nullstone makes the job container itself the build environment, eliminating the docker-in-docker workspace bind-mount problem (host path != act-container path). Build now runs as root in fedora:43, installs livecd-tools directly via dnf, and writes outputs to $GITHUB_WORKSPACE which is the natural runner workdir on host. No nested docker, no userns juggling, no explicit -v workspace bind needed.	2026-05-06 16:10:03 +01:00
obsidian-ai	9bf063a178	ci: add /work diagnostic before sed-redirect to surface bind/perm issue	2026-05-06 16:10:03 +01:00
obsidian-ai	3f138e7435	ci: repin fedora:43 build container to amd64 digest Prior pin was the arm64 manifest digest (linux/arm64/v8); on x86_64 host it failed with `exec /usr/bin/sh: exec format error`. Pinned to the amd64 manifest entry from the same fat-manifest.	2026-05-06 16:10:03 +01:00
obsidian-ai	7d6054311b	ci: add --userns=host to nested Fedora build container Forgejo runner on nullstone runs against a daemon with userns-remap=default. addnab/docker-run-action launches the Fedora 43 build container with --privileged, which is incompatible with userns-remap unless --userns=host is also set.	2026-05-06 16:10:03 +01:00
obsidian-ai	6b0828d692	ci: pin sbom/cosign/attest actions to node20-safe versions forgejo-runner v6.4.0 ships node20; floating tags @v0/@v3/@v2 now resolve to actions whose runs.using=node24, which the runner cannot exec. Pin to last node20-shipping release of each: - anchore/sbom-action@v0.17.2 - sigstore/cosign-installer@v3.7.0 - actions/attest-build-provenance@v2.2.3	2026-05-06 16:10:03 +01:00
s8n	a59f1f026a	ci: gate softprops release steps + add Forgejo API equivalents The build-iso workflow used softprops/action-gh-release@v2 unconditionally, which only speaks the GitHub Releases REST API. When the workflow runs on the Forgejo runner registered on nullstone, those steps would fail. Add a server_url check so the GH-only path runs only on github.com, and mirror it with a curl-based step that hits the Forgejo /api/v1/releases endpoints. Behaviour: - github.com: identical to before (action-gh-release@v2). - git.s8n.ru: drop+recreate ci-latest release, upload chunked assets via the Forgejo attachments API. Tag-driven "Attach to release" path mirrored the same way. Refs: A1 build-eng task — Forgejo runner adaptation.	2026-05-06 16:10:03 +01:00
veilor-org	beef32a77c	feat(installer): promote eject-media reminder to its own box In v0.5 the "Remove the install media" reminder was a single line inside the green success box, and operators on both onyx and the friend's RTX 4080 rig missed it — rebooted into the live ISO and re-ran the installer thinking the install had silently failed. Promote the reminder to its own loud yellow thick-bordered gum-style box stacked directly below the success/countdown box, with three lines of explanation. Renders for the full 10s of the countdown so it stays in the operator's face the entire window.	2026-05-06 16:10:03 +01:00
veilor-org	0a70eea950	feat(installer): 10s reboot countdown with per-tick redraw v0.5 used `sleep 5` after a static "System will reboot in 5 seconds." box, which left the operator guessing how much time was left to grab the USB stick. The new loop runs 10 → 1, clearing + redrawing the gum-style success box each tick with the remaining-seconds figure, giving the operator a visible window to act. 10 seconds (vs 5) because real hardware operators were missing the window — laptops with the USB on the far side of the dock take 4-5 seconds to physically reach. 10 is comfortable, not annoying.	2026-05-06 16:10:03 +01:00
veilor-org	877ad91096	feat(installer): confirm-twice for LUKS passphrase + admin password A typo in the LUKS passphrase is unrecoverable — the disk is unmountable without it and we don't escrow the key. Re-prompting until the two reads match catches keyboard-layout surprises (the US/UK quote-key position is the most common one) before they brick the install. Admin password gets the same treatment for consistency. Less catastrophic (resettable from a recovery shell) but a mismatch still locks the user out of their fresh install on first boot. Loop bails on cancel/ESC and re-prompts on validate_pw failure.	2026-05-06 16:10:03 +01:00
veilor-org	a3b3d29b38	feat(installer): staged banner reveal at 40ms/line Read banner.txt line by line with a 40ms sleep between each, then clear and redraw the bordered gum-style version. 5-line banner × 40ms = 200ms total reveal — slow enough to land an aesthetic on the first frame, fast enough that the operator never feels it as lag. Pure cosmetic; no functional change to the install flow.	2026-05-06 16:10:03 +01:00
veilor-org	55221a6af2	fix(installer): swap gum input --password for bash read -srp `gum input --password` corrupts the linux fbcon since v0.5.27 — the bubbletea screen-restore writes back the previous menu buffer because the framebuffer terminfo entry lacks `civis/cnorm` cursor-hide sequences, leaving a duplicate "Install" plus a stray "T" rendered on top of the password field. The fix is a single termios echo-off via `read -srp`: no redraw, no glitch, no dependency on gum's TUI layer for the one screen where it broke. Header still rendered through `gum style` so visual parity with the disk picker / confirm box is preserved. Whiptail fallback path unchanged (passwordbox there has always rendered cleanly).	2026-05-06 16:10:03 +01:00
s8n	d76597c57a	sec: AppArmor v0.6 stub — load profiles in complain mode Per docs/research/2026-05-05-agent-wave/04-hardening-tier-2.md (v0.6 scope item 1). Adds: - apparmor-parser apparmor-utils apparmor-profiles to %packages in BOTH kickstart/veilor-os.ks (live ks) and overlay/usr/local/bin/ veilor-installer (generated install ks heredoc). - scripts/40-apparmor.sh — wires aa-complain on every veilor-shipped profile. Idempotent. "loaded, present, nothing breaks". - overlay/etc/apparmor.d/veilor.d/firefox — 1-liner stub (binary confinement marker only; full policy post-v0.6). - overlay/etc/apparmor.d/veilor.d/thunderbird — same pattern. - Wired 40-apparmor.sh into install %post chain after 30-apply-v03-theme.sh. Complain mode means: profiles loaded, kernel logs syscall denials but does NOT enforce. Operator can review audit.log post-install to inform v0.7 policy authoring.	2026-05-06 16:10:03 +01:00
veilor-org	631e7bd040	ci: TODO marker for SHA-pinning third-party actions Note that all `uses:` directives still resolve to mutable major- version tags. SHA-pinning is the Agent 8 audit recommendation but requires per-action web lookups that stalled the previous SRE attempt; tracked separately so this PR can land first.	2026-05-06 16:10:03 +01:00
veilor-org	9158532c9d	ci: pin fedora:43 base image to digest Pin registry.fedoraproject.org/fedora:43 to its current manifest digest so a malicious or accidental tag-rewrite upstream cannot silently change the base layer of every CI build. Digest was captured via `skopeo inspect --raw` on 2026-05-06. Refresh procedure documented inline.	2026-05-06 16:10:03 +01:00
veilor-org	e93ef644e1	ci: add cosign keyless sigs, SBOM, and provenance attestation Sign each ISO chunk with cosign keyless OIDC, generate an SPDX SBOM of the build output, and attach an in-toto build-provenance attestation. Sigs/certs/SBOM are uploaded alongside the ISO parts in the ci-latest rolling prerelease so the test/auto-install.sh path can verify before reassembling. Action versions are major-version tags (@v3, @v0, @v2). SHA-pinning is tracked separately to keep this PR small and avoid the long web lookups that stalled the previous attempt.	2026-05-06 16:10:03 +01:00
obsidian-ai	21f2b4da9a	ci: pin actions to node20-safe tags + runner sock pass-through forgejo-runner v6.4.0 ships a node20 javascript engine. v4.2+ of actions/checkout and v2.0.5+ of softprops/action-gh-release moved to node24, which the runner refuses to exec. Pin both to last node20 release. Pairs with a runner-side config change (separately deployed on nullstone /home/docker/forgejo-runner/conf/config.yaml) that adds `-v /var/run/docker.sock:/var/run/docker.sock` to per-job container options + whitelists the socket via valid_volumes — without that addnab/docker-run-action@v3 inside the catthehacker/ubuntu job container can't reach the docker engine. - actions/checkout v4 -> v4.1.7 - softprops/action-gh-release v2 -> v2.0.4 - addnab/docker-run-action v3 unchanged (composite/docker, no node) - ludeeus/action-shellcheck@master unchanged (docker-based)	2026-05-06 16:10:03 +01:00
s8n	91d5d26473	sec: polish THREAT-MODEL.md for v0.7 public launch Status flipped Draft → Final. In-scope rows now cite specific config files / settings (auditable from clean checkout): - LUKS2 params from kickstart/veilor-os.ks - sysctl knobs file path - USBGuard policy mode + rule type - sshd_config drop-in path + every directive - auditd rule path + watched paths - chrony NTS endpoints - systemd-resolved DoT settings - bootloader kernel args (lockdown, slab_nomerge, init_on_alloc/free, etc.) Out-of-scope rows un-hedged. 'May not always' phrasings removed; each adversary states unambiguously what veilor-os does NOT do.	2026-05-06 16:10:03 +01:00
veilor-org	c7c0a0bcc8	docs: test run report skeleton for v0.5.32 (Forgejo build) First test-runs/ report off the new template. Records the build host (forgejo-runner on nullstone, ubuntu-24.04 / catthehacker:act-24.04), notes that v0.5.32 is the first ISO produced after the GH Actions mirror was disabled, and pre-populates the Findings section with the 7 v0.5.32 blocker fixes from the 2026-05-05 9-agent wave as expected behaviours the tester must verify. Result is left as "pending A1 build" — the operator + A5 fill in per-step pass/fail and hardening output once the actual VM walkthrough runs against the produced ISO. This is intentional: the report is the scaffold; the test is a separate step.	2026-05-06 16:10:03 +01:00
s8n	9011fd2dbf	docs: README de-GH + Forgejo build status	2026-05-06 16:10:03 +01:00
s8n	20b3541d38	docs: METHOD-CHANGELOG 2026-05-06 forgejo entry	2026-05-06 16:10:03 +01:00
veilor-org	1fa45c3749	chore: gitignore auto-install-vm test artifacts Test rig was leaking test/auto-install-vm.{nvram,qcow2} into untracked state. Pattern matches existing veilor-vm.* exclusion.	2026-05-06 16:10:03 +01:00
veilor-org	d9b206e46b	docs: STRATEGY.md — primary git host moved to git.s8n.ru (Forgejo) Self-hosted Forgejo + forgejo-runner on nullstone now primary. GitHub becomes public mirror (Forgejo push-mirrors every commit + every 8h). 0 GH Actions minutes consumed. Runner labels: ubuntu-24.04 — drop-in for existing build-iso.yml workflow nullstone — privileged Fedora 43 (opt-in via runs-on: nullstone) Deploy artifacts: ~/ai-lab/nullstone-server/forgejo/. External TODO (parent operator owns): - router port-forward 222 → nullstone:222 for public SSH push - no-guest@file allowlist update for external web UI access	2026-05-06 16:10:03 +01:00
veilor-org	89949dc8f2	v0.5.32: ship 7 blockers from 9-agent wave Per docs/research/2026-05-05-agent-wave/README.md priority list. All 7 land together to keep iteration cycles useful — partial fixes bury the lookahead findings agents already mapped. ## 1. CRITICAL — suspend/resume wifi death (Agent 9, B2) `veilor-modules-lock.service` runs `kernel.modules_disabled=1` 30s after graphical.target. iwlwifi/iwlmvm/cfg80211 reload on resume from S3/S0ix → with modules locked, resume breaks wifi until reboot. Same architectural class as the LUKS bug — security feature breaks legitimate kernel state transitions. The unit already has `ConditionKernelCommandLine=!module.sig_enforce=1` (self-skip when signed-modules enforcement is on cmdline). Adding `module.sig_enforce=1` to the kernel cmdline retains the security property (no unsigned modules) without runtime lock-down → resume works. Files: kickstart/veilor-os.ks line 61 + overlay/usr/local/bin/veilor-installer generated bootloader directive both gain `module.sig_enforce=1`. ## 2. veilor-firstboot.service WantedBy=graphical.target (Agent 2) Was `WantedBy=multi-user.target` only. Real installs default to graphical.target so the unit never ran on installed systems — admin pw stayed at install-time + chage -d 0 expired, SDDM PAM bounced to chauthtok screen (recoverable but ugly UX). Now `WantedBy=graphical.target multi-user.target`. Live ISO + multi-user installs both resolve via this list. ## 3. USBGuard hash → id-based baseline (Agent 9, A3) Mirrors memory feedback_usbguard_dock.md — onyx had hash+parent-hash rules that broke on dock replug; we shipped no rules.conf so first boot blocks the USB keyboard. Adds overlay/etc/usbguard/rules.conf with HID-class allow rule (`allow with-interface match-all { 03:: }`) — covers every USB keyboard, mouse, gamepad, fingerprint reader, NFC. Survives dock replug + kernel-bump vendor renumeration. Mass-storage stays implicit-block; user explicitly allows post-firstboot via `ujust veilor-usbguard-enroll` (planned v0.6). ## 4. firewalld trusted zone with tailscale0 pre-bound (Agent 9, D1) User uses Tailscale daily (memory: project_tailscale_mesh.md). Default firewalld zone = drop, blocks tailnet traffic on tailscale0. Adds overlay/etc/firewalld/zones/trusted.xml with `<interface name="tailscale0"/>`. After `tailscale up` brings the interface up, NetworkManager dispatcher associates it with the trusted zone automatically — no user intervention. Default zone stays drop. Only the tailscale0 interface gets ACCEPT. ## 5. /etc/skel branding (Agent 7) Was completely empty. Result: per-user KDE config (~/.config/kdeglobals etc.) pre-empty, so the moment user opened System Settings, KDE wrote fresh ~/.config/* and silently shadowed our /etc/xdg/kdedefaults/*. Visual brand evaporated on first click. Seeds: /etc/skel/.config/kdeglobals (copy of assets/kde/veilor-default.kdeglobals) /etc/skel/.config/breezerc (copy of assets/kde/breezerc) /etc/skel/.config/kwinrc (Plasma 6 wayland defaults: opengl, animspeed=0, blur off, click-to-focus) /etc/skel/.config/konsolerc (default profile = Veilor) /etc/skel/.local/share/konsole/Veilor.profile + .colorscheme User who opens System Settings now writes against branded baseline, not against vanilla Breeze. ## 6. KMS modeset args + initramfs keymap (Agents 1 + 9) Real laptop boot has a 5-15s blank between vt switch and SDDM start because simpledrm releases before i915/nvidia-drm/amdgpu claim. Plus non-US users get locked out at LUKS prompt because initramfs ships en-US keymap by default (RHBZ 1405539, RHBZ 1890085). Adds to bootloader cmdline (live + installed): i915.modeset=1 amdgpu.modeset=1 nvidia-drm.modeset=1 rd.vconsole.keymap=us `rd.vconsole.keymap=us` is a placeholder; the v0.6 firstboot keymap picker will rewrite it from /etc/vconsole.conf. Until then, en-US users get correct LUKS keyboard; non-US users still need the v0.6 fix (per Agent 1). ## 7. virtio-9p log capture (Agent 6) The v0.5.30 virtio-serial wiring depends on rsyslog inside the live ISO (anaconda's setupVirtio writes a rsyslog forward rule), which the live ks doesn't install — files were 0-byte across three install runs. test/run-vm.sh now adds a `-virtfs local,...,mount_tag=hostlogs` share pointing at `test/test-runs/<timestamp>/`. veilor-installer runs `_dump_logs_to_host` via EXIT trap that mounts the share at /mnt/hostlogs and rsyncs /tmp/{anaconda,program,storage,packaging,dnf}.log + /var/log/veilor-installer.log + dmesg + journalctl + the generated ks. Runs on success AND failure AND ^C. No-op on real hardware (9p tag absent) — VM-only debug. ## Validate bash -n overlay/usr/local/bin/veilor-installer # OK ksvalidator kickstart/veilor-os.ks # clean ## Out-of-scope for v0.5.32 (deferred to v0.6) Per Agent 1 follow-ups: argon2id retune for slow CPUs, recovery key generation in firstboot, TPM2/FIDO2 unlock helpers. Per Agent 9 follow-ups: Plasma Wayland fallback X11 install, lid-close handling, SELinux relabel progress UX. Per Agent 4: AppArmor stack + nftables preset + audit log shipping CLI. Per Agent 8 (CI hardening): SHA-pin actions + dependabot + SBOM + SLSA L3 attestation — separate workflow-only commit.	2026-05-06 16:10:03 +01:00
s8n	0b568b016b	Merge pull request 'ci(bluebuild): pin actions to node20-safe tags' (#9 ) from feat/runner-fix-node20-pinning into v0.7-bluebuild-spike	2026-05-06 13:54:31 +01:00
obsidian-ai	e50c9a3b43	ci(bluebuild): pin actions to node20-safe tags forgejo-runner v6.4.0 javascript runtime is node20. Pin every javascript action used in the spike branch's workflows to the last release that ships node20. - actions/checkout v4 -> v4.1.7 (3 files) - softprops/action-gh-release v2 -> v2.0.4 (build-iso) - anchore/sbom-action v0 -> v0.17.2 - actions/attest-build-provenance v2 -> v2.2.3 - blue-build/github-action@v1 unchanged (TODO: SHA pin) This is the spike-branch counterpart of the main-branch fix in feat/runner-fix-docker-sock-and-node20.	2026-05-06 13:54:12 +01:00
s8n	9dc2846316	Merge pull request 'ci(bluebuild): pin blue-build/github-action to commit SHA' (#6 ) from feat/a1-bluebuild-pin into v0.7-bluebuild-spike	2026-05-06 13:53:15 +01:00
s8n	f2e36bfead	ci(bluebuild): pin blue-build/github-action to commit SHA Replace @v1 with @24d146df25adc2cf579e918efe2d9bff6adea408 (the commit v1 currently resolves to). Tag pins on third-party actions are mutable — a maintainer or attacker can re-point v1 at a malicious commit and silently change what runs on every push. Trailing comment '# v1' preserves human readability for future bumps. Refs: 9-agent CI hardening wave (agent 8), 2026-05-05.	2026-05-06 10:32:13 +01:00
veilor-org	3c247bc601	v0.7 spike: BlueBuild recipe + ostreecontainer kickstart + cosign workflow Initial scaffold for the v0.7 hybrid path. Spike branch only — does NOT land in main until success criteria pass (see bluebuild/README.md). ## What this commits - bluebuild/recipe.yml — BlueBuild recipe extending ghcr.io/secureblue/securecore-kinoite-hardened-userns:latest with: * veilor branding overlay (overlay/, assets/, scripts/ at /usr/share/veilor-os) * sudo restored (revert secureblue's run0-only) * Xwayland restored (some apps still need it) * mullvad-browser layered alongside Trivalent (default browser kept) * tailscale + yggdrasil packages (mesh stack layers 1 + 2) * tailscaled.service pre-disabled (awaits first-boot prompt) * yggdrasil.service enabled (idle warm-fallback per STRATEGY.md) * veilor-firstboot.service + veilor-modules-lock.service enabled * cosign signing module configured - bluebuild/config/just/60-veilor.just — ujust recipes: * install-reticulum (RetiNet AGPL fork — mesh layer 3) * install-reticulum-rnode (LoRa hardware) * install-thorium (opt-in browser with explicit CVE-lag warning) * veilor-mesh-join (token paste / QR for tailscale onboarding) - bluebuild/README.md — spike doc + smoke-test commands + 5-item success criteria checklist - kickstart/install-ostreecontainer.ks — install kickstart template for the v0.7 path. No %packages block; uses `ostreecontainer --url=ghcr.io/veilor-org/veilor-os:43 --transport=registry` to populate / from the OCI image directly during anaconda's install pass. No first-boot rebase, no transition window. Keeps existing LUKS+btrfs partitioning verbatim. - .github/workflows/build-bluebuild.yml — GH Actions workflow: * Triggered on push to v0.7-bluebuild-spike, weekly cron, dispatch * Uses blue-build/github-action@v1 (TODO: pin to commit SHA per CI hardening agent 8 follow-up) * Builds + cosign-signs (keyless via Sigstore) + pushes to GHCR * Smoke-tests the OCI image (sudo, mullvad-browser, yggdrasil, tailscale all present) * Generates SBOM (SPDX) via anchore/sbom-action * Publishes SLSA build provenance attestation ## What this does NOT change - main branch is untouched. v0.5.x kickstart path keeps shipping. - kickstart/veilor-os.ks (the live-ISO ks) is untouched — the v0.7 hybrid uses the existing live-ISO build path; only the install-time ks (install-ostreecontainer.ks) is new. - overlay/, scripts/, assets/ are untouched on this branch — the recipe pulls them in via `type: files` modules at build time. ## Spike success criteria (reproduced from bluebuild/README.md) - [ ] `bluebuild build recipe.yml` exits 0 - [ ] `bootc container lint` exits 0 on resulting image - [ ] `podman run` smoke-test passes - [ ] CI workflow builds + cosign-signs + pushes to GHCR - [ ] Installer ISO using `ostreecontainer` against this OCI reaches SDDM with admin login on first boot If all 5 land, merge v0.7-bluebuild-spike → main as v0.7.0. ## Reference - docs/STRATEGY.md (full plan) - docs/ROADMAP.md v0.7 (schedule) - docs/THREAT-MODEL.md (publish before v0.7 ship) - secureblue: https://github.com/secureblue/secureblue - BlueBuild: https://blue-build.org - ostreecontainer: https://docs.fedoraproject.org/en-US/bootc/anaconda-install/	2026-05-05 15:30:04 +01:00
veilor-org	7060d9aa6b	docs: refine strategy — ostreecontainer install + mesh stack + browser stack Refines docs/STRATEGY.md per parent-operator handoff (2026-05-05). Locks in five things the original draft didn't cover, and corrects one mistake. ## Refinement: ostreecontainer install path The original draft proposed a two-step install: Anaconda partitions + kickstart, then on first boot a `veilor-firstboot-rebase.service` runs `bootc rebase ghcr.io/veilor/veilor-os:43`. This commit drops that step. Anaconda's `ostreecontainer --url=... --transport=registry` directive populates the root filesystem directly from the OCI image during the install pass. No first-boot rebase, no transition window, no second reboot. Same end state, simpler path. Stay on `ostreecontainer` through v0.8. Do NOT migrate to the new `bootc` kickstart command until v1.0 — it blocks multi-disk and authenticated registries. Do NOT use `bootc-image-builder anaconda-iso` output — deprecated in image-builder v44+. Produce the OCI image and the bootstrap ISO as separate artifacts. This compresses the v0.7 BlueBuild spike from 2 days → 1 day. ## Correction: keep Trivalent as default The original strategy.md treated Trivalent (secureblue's hardened Chromium) as an override-and-remove. That was wrong: Trivalent's COPR tracks upstream M147+ within hours, ships hardened_malloc + JIT-less + Drumbrake WASM. Default browser pick. Mullvad Browser layered alongside for anti-fingerprint. Thorium remains opt-in via `ujust install-thorium` only — its CVE lag is months and contradicts the threat model. Never default. ## Mesh stack baked in Three-layer warm-stack documented in STRATEGY.md: - L3a Tailscale + Headscale (Day 1, daily driver) - L3b Yggdrasil-go (Day 1, idle warm-fallback, AllowedPublicKeys mode) - L3c Reticulum/RetiNet AGPL fork (opt-in via ujust install-reticulum) Threat floor table: ISP-DNS-block (i, Day 1), ISP-Tailscale-block (ii, Phase 2 promote Yggdrasil), internet-down (iii, opt-in RetiNet + RNode). Tier model: tag:admin / tag:infra / tag:guest with failsafe pre-auth key on yubikey + paper + Authentik OIDC group. ## Onboarding Token paste / QR (user picks). Misskey signup mints reusable 24h-TTL pre-auth key. NOT auto-OIDC at first boot. ## Iroh seeding daemon stub (v0.8 / Phase 2) `veilor-seed.service` documented but NOT implemented until Iroh hits 1.0 (current 0.96–0.98 RC, Q1 2026 target slipped). BLAKE3 + iroh-gossip per-service topic. Static media only — DEFER DB replication forever. ## External dependency tracked nullstone Traefik `no-guest@file` ACL is currently 0.0.0.0/0 allow-all (XFF chain breakage 2026-05-03). Must be fixed before veilor-os first-public-ISO ships, otherwise tag:guest provisioning leaks the full vhost surface to every veilor user. Parent operator owns the fix; explicitly out of veilor-os scope. ## Files - docs/STRATEGY.md — full refinement - docs/ROADMAP.md — v0.7 spike entry now reflects ostreecontainer + mesh stack + 1-day spike target - README.md — drops the "v0.2.5 pre-release" badge + status box (out of date), adds bootc/atomic trajectory paragraph ## What did NOT change - v0.5.x main branch is untouched. The ostreecontainer swap belongs in the v0.7 spike branch, NOT v0.5.32. - nullstone Traefik config is untouched. Out of scope. - The kickstart and overlay code is untouched.	2026-05-05 15:15:52 +01:00
veilor-org	50a241a603	docs: STRATEGY.md — hybrid kickstart bootstrap + bootc OCI on secureblue Locks in the strategic decision from 2026-05-05 secureblue research agent: pivot the technical base toward bootc/OCI, but as a layer over secureblue's `securecore-kinoite-hardened-userns` rather than a Containerfile-from-scratch. ## What changed - New: `docs/STRATEGY.md` — full hybrid plan (kickstart bootstrap → first-boot bootc rebase → bootc-only at v1.0). Documents secureblue rationale, our overrides (drop Trivalent, restore sudo + Xwayland), next concrete steps for v0.7 spike (BlueBuild recipe + GH Actions workflow + `veilor-firstboot-rebase` one-shot). - Updated: `docs/ROADMAP.md` v0.7 bootc-spike subsection — supersedes the Agent 3 Containerfile-from-scratch plan with the BlueBuild layering plan. Spike compresses 1 week → 2 days; hardening review inherited from 30 secureblue contributors. ## Why hybrid, not pure pivot - Anaconda's LUKS UX (single passphrase prompt + custom partitioning) is mature; bootc-image-builder's installer is not yet on par. Keep the kickstart as the bootstrap. - bootc upgrade gets us atomic A/B + signed image chain + instant rollback that we can't realistically build alone with our contributor count. - The kickstart work is not lost — it becomes the day-zero installer through v0.7. v1.0 deprecates it entirely once bootc-image-builder installer ISO matures. ## Why secureblue, not Athena (Arch) \| Axis \| secureblue \| Athena OS \| \|---\|---\|---\| \| Maintainers \| 30 \| 8 \| \| MAC enforcing OOB \| SELinux + custom policy \| AppArmor active, profiles mostly unconfined \| \| Atomic / immutable updates \| Yes (bootc/rpm-ostree) \| No (rolling) \| \| Threat model published \| No \| Yes \| \| MS-signed Secure Boot shim \| Yes (Fedora shim) \| Yes (with auto-MOK) \| Athena's only structural advantage is the published threat model. We're already drafting one (Agent 5 of 2026-05-05 wave) — we get that win regardless. secureblue's contributor count + atomic update infrastructure is the leverage. ## Strategic credibility win Publishing `docs/THREAT-MODEL.md` BEFORE the v0.7 launch positions veilor-os ahead of secureblue (no threat model) and Athena (has threat model but smaller contributor base) on the one axis that matters most. ## Open questions documented in STRATEGY.md - secureblue contribution acceptance for upstream patches (USBGuard id-based-rules fix, threat model framework) - Brave vs Mullvad-Browser pick for default browser - bootc rebase first-boot fallback if rebase fails - Fedora 44 transition timing follows secureblue's release tags	2026-05-05 15:05:59 +01:00
veilor-org	4e9782a18a	docs: 9-agent research wave findings — v0.5.32 blocker map Logs the full output of the 9-agent deep-dive run on 2026-05-05 to docs/research/2026-05-05-agent-wave/. Pulls every actionable finding into one indexed location so v0.5.32 planning has a paper trail. Files: docs/research/2026-05-05-agent-wave/README.md — index docs/research/2026-05-05-agent-wave/01-...real-hardware.md — Plymouth + LUKS edge cases docs/research/2026-05-05-agent-wave/02-...firstboot-ux.md — SDDM + first-boot UX docs/research/2026-05-05-agent-wave/03-...spike-plan.md — bootc-image-builder 1-week spike docs/research/2026-05-05-agent-wave/04-...tier-2.md — AppArmor + nftables + audit + homed docs/research/2026-05-05-agent-wave/05-...launch.md — threat model + v0.7 launch checklist docs/research/2026-05-05-agent-wave/06-...log-capture.md — virtio-9p host-share for anaconda logs docs/research/2026-05-05-agent-wave/07-...skel-branding.md — /etc/skel gap audit docs/research/2026-05-05-agent-wave/08-...ci-hardening.md — SHA-pin actions + SBOM + SLSA L3 docs/research/2026-05-05-agent-wave/09-...failure-modes.md — real-hardware pessimistic audit Plus the prior linter-applied: docs/ROADMAP.md — Lessons learned section, v0.5.32 active block, v0.6 promotion of veilor-postinstall + veilor-doctor, v0.7 bootc spike scheduled docs/THREAT-MODEL.md — drafted by Agent 5; in/out scope, comparison matrix, v0.7 launch checklist Top blockers identified for v0.5.32 (cross-cited in README): 1. Suspend/resume wifi death (kernel.modules_disabled=1) 2. veilor-firstboot.service WantedBy=graphical.target 3. kernel-upgrade grub drift 4. USBGuard hash-rules problem (already learned on onyx) 5. firewalld blocks tailscale0 6. /etc/skel/ empty 7. virtio-9p log capture replaces broken virtio-serial path Wave + verifier pattern (per ROADMAP lessons learned #4) validated: 9 parallel agents on distinct topics produced converging blocker list. The same pattern landed v0.5.31 four-bug fix from the prior 4-agent verification wave on v0.5.30 outcome.	2026-05-05 14:52:53 +01:00
veilor-org	2788b95a12	v0.5.31: kernel-install via /etc/kernel/cmdline + set-e leak + rescue glob Four-bug fix from 4-agent verification wave on v0.5.30 outcome. Bug 1 CRITICAL: --location=none made anaconda skip CollectKernelArgumentsTask (installation.py:149-151). --append= args never collected, BLS entries wrote with empty cmdline. Drop --location=none, let anaconda do its bootloader path; broad transaction_progress patch already silences the gen_grub_cfgstub class failure. Bug 2 CRITICAL: kernel-install reads /etc/kernel/cmdline as source of truth (per 90-loaderentry.install:84-95). Veilor never wrote that file so kernel-install fell through to /proc/cmdline (live ISO's). Add 3-path write: /etc/kernel/cmdline (Path A canonical), /etc/default/grub (Path B legacy), grubby --update-kernel=ALL (Path C last-writer guard). Plus explicit kernel-install add per kernel after Path A write. Bug 3: rescue BLS glob -0-rescue-.conf required trailing hyphen; F43 uses -0-rescue.conf. Fix: -0-rescue*.conf (matches both). Bug 4: set +e/set -e scope leak in %post. v0.5.30 closed manual bootloader block with set -e which re-enabled errexit for the rest of %post that was authored with set +e semantics. Result: any non-guarded command failure aborted the LUKS args injection block. Fix: remove the closing set -e. Files: overlay/usr/local/bin/veilor-installer. Verified: bash -n clean, ksvalidator clean.	2026-05-05 13:47:23 +01:00
veilor-org	e83483a077	v0.5.30: broad error suppression + manual bootloader + virtio log capture Three-layer fix for the persistent anaconda transaction failure that killed v0.5.28 (gen_grub_cfgstub) and v0.5.29 (aggregate dnf5 error). ## Layer 1: broad error suppression in transaction_progress.py dnf5 under RPM 6.0 + cmdline anaconda emits a final aggregate `error("transaction process has ended with errors..")` at end of transaction whenever its internal failure counter > 0, regardless of whether we suppressed individual script_error events. Reproduced twice. The narrow patch in v0.5.29 suppressed per-package errors but the aggregate still raised PayloadInstallationError and aborted the install before the bootloader phase ran. v0.5.30 patch turns the `elif token == 'error':` branch in process_transaction_progress into a log.warning. All four producers (cpio_error, script_error, unpack_error, generic error) now flow through to a warning + continue. Pattern matches both the original anaconda layout AND the v0.5.29 narrow-patched layout, so re-applying on top of either is a no-op. This brings us back to v0.5.28 broad-suppression behaviour. The side effect that bit us in v0.5.28 (silent grub2-efi-x64 scriptlet failure → empty /boot/efi/EFI/fedora/ → gen_grub_cfgstub fails) is addressed by Layer 2 below. ## Layer 2: bootloader install moved out of anaconda The generated install kickstart now has `bootloader --location=none`, which tells anaconda NOT to invoke its own bootloader install code path (and therefore NOT to call gen_grub_cfgstub). All grub work moves into the chroot %post block: 1. `dnf reinstall grub2-efi-x64 grub2-pc grub2-tools shim-x64 efibootmgr` — re-runs scriptlets in the chroot with full PID 1 systemd state, so the systemd-run-style triggers that anaconda's chroot truncates actually execute. 2. `grub2-install --target=x86_64-efi --efi-directory=/boot/efi --bootloader-id=fedora --no-nvram` — populates /boot/efi/EFI/fedora/ 3. `gen_grub_cfgstub /boot/grub2 /boot/efi/EFI/fedora` (or `grub2-mkconfig` fallback) — writes /boot/efi/EFI/fedora/grub.cfg. 4. `efibootmgr -c -d <disk> -p <part> -L "veilor-os" -l \EFI\fedora\shimx64.efi` — registers the NVRAM boot entry pointing at the signed shim. Each step logs to stdout and continues on failure (`set +e` block); diagnostics surface in the install log without aborting the whole %post. ## Layer 3: virtio-serial log capture in run-vm.sh Anaconda 43.x autodetects `/dev/virtio-ports/org.fedoraproject.anaconda.log.0` and streams program/packaging/storage/anaconda logs through it in real time, before any tmpfs / pivot, before networking, surviving kernel panic. Wiring it into run-vm.sh means the host gets a tail-able log file at `test/anaconda-vm-YYYYMMDD-HHMMSS.log` for every VM run. We've lost logs three times in a row to anaconda failures + tmpfs reboots. This breaks the loop. ## Diagnostic story Before this commit: VM aborts → live ISO reboots itself → /tmp/ tmpfs gone → no logs → guess what failed. Three days, two and a half false fixes. After this commit: VM aborts → host has /home/admin/ai-lab/_github/veilor-os/test/anaconda-vm-*.log with the actual scriptlet output, the actual exit codes, the actual file-trigger failures. Future debug becomes evidence-based. Files changed: kickstart/veilor-os.ks — broad error suppression patch overlay/usr/local/bin/veilor-installer — --location=none + manual grub test/run-vm.sh — virtio-serial chardev wiring Verified: bash -n clean, ksvalidator clean.	2026-05-05 11:59:35 +01:00
veilor-org	613d35402e	v0.5.29: narrow anaconda patch + LUKS UX + initramfs assertion Five-fix bundle from 7-agent research wave on the v0.5.28-final gen_grub_cfgstub failure. ## 1. Narrow the anaconda transaction_progress patch (CRITICAL) The v0.5.28 patch was too broad. It rewrote `process_transaction_progress` so every 'error' token in the transaction queue became a `log.warning`. That queue carries four distinct error classes: - cpio_error — payload extraction (genuinely fatal) - script_error — RPM 6.0 cmdline-mode scriptlet warning-as-error (the ONE we want to ignore) - unpack_error — payload corruption (genuinely fatal) - error — generic transaction error (genuinely fatal) By swallowing all four we silently masked grub2-efi-x64's posttrans failure mid-install. /boot/efi/EFI/fedora/ ended up incomplete → gen_grub_cfgstub then failed at the bootloader install phase with "gen_grub_cfgstub script failed" because its `set -eu` script couldn't read the missing files. v0.5.29 narrows the patch: override only the `script_error` callback inside transaction_progress.py to log a warning and NOT enqueue 'error'. The consumer (`process_transaction_progress`) reverts to upstream behaviour where cpio_error / unpack_error / error still raise PayloadInstallationError. Real install-fatal events keep aborting; only the F43-RPM-6.0 scriptlet regression is silenced. The patch is applied via `python3 -c` regex rewrite (more robust than nested sed across multi-line method bodies). ## 2. LUKS UX — `tries=5,timeout=0` (FIX) Default cryptsetup-generator unit allows ONE passphrase try with a 1m30s wait. One typo on a long passphrase = wait 1m30s, then the device-wait timer trips, then dracut emergency shell after 3min total. Brutal. Adding `rd.luks.options=luks-XXX=tries=5,timeout=0` gives five typo-friendly retries with no auto-timeout. ## 3. fbcon=nodefer on installed-system cmdline (FIX) Live ISO cmdline already has `fbcon=nodefer` (added in v0.5.27 to fix the real-laptop black-screen-after-dracut). The installed-system bootloader directive in the generated install ks did NOT carry it. Same KMS handoff happens on the installed system on the same hardware. Now both have the flag. ## 4. /etc/crypttab fallback assertion (BELT-BRACES) Anaconda's custom-partitioning code path normally writes /etc/crypttab for `--encrypted` part directives. Edge cases observed in F43+ where it doesn't. Without crypttab, systemd-cryptsetup-generator can still work from kernel cmdline alone, but cleanup paths and second-stage unlock both fall over. Adding a fallback `echo` that writes the canonical line if it's missing post-anaconda. ## 5. Initramfs LUKS module assertion (DEFENSIVE) Force-include `crypt + systemd-cryptsetup + plymouth` modules in initramfs via /etc/dracut.conf.d/10-veilor-luks.conf. dracut autodetects these when it sees an active LUKS mapping, but %post runs before the LUKS state is fully observable from the chroot. Plus we wipe stale initramfs (`rm -f /boot/initramfs-*.img`) before `--regenerate-all` so the regen actually rewrites bytes. Final assertion runs `lsinitrd \| grep -q cryptsetup` and surfaces a [ERR] line in build output if the module didn't make it. ## What this should fix After the man-db fix in v0.5.28-final, install proceeded past "Configuring xxx" cleanly but died at "Installing boot loader" with gen_grub_cfgstub. Root-cause was the over-broad patch from #1 above. After v0.5.29: - Install transaction completes (man-db excluded; non-man-db scriptlet warnings still suppressed; real errors still raise) - gen_grub_cfgstub runs against complete /boot/efi/EFI/fedora/ - Bootloader install completes - Reboot to disk lands at GRUB veilor-os entry - Kernel + initramfs load (cryptsetup confirmed present) - Plymouth LUKS prompt appears with text fallback - User has 5 tries, no timeout - Unlock → btrfs subvol mount → systemd → SDDM Files: kickstart/veilor-os.ks (+45 lines), overlay/usr/local/bin/veilor-installer (+50 lines). Verified: bash -n clean, ksvalidator clean. References: pyanaconda transaction_progress.py:110-136 (4 producers of 'error') pyanaconda bootloader/efi.py:194-201 (gen_grub_cfgstub call site) /usr/bin/gen_grub_cfgstub (set -eu wrapper for grub2-mkconfig stub) Fedora wiki Changes/RPM-6.0 dnf5 issue #2507 (RPM 6.0 scriptlet propagation regression)	2026-05-05 05:12:24 +01:00
veilor-org	fae677fb68	v0.5.28 (final): patch anaconda transaction_progress.py + exclude man-db THE actual root cause of the man-db transaction failure that killed three consecutive VM installs (v0.5.26 / v0.5.27 / v0.5.28). Confirmed via 7-agent research wave: - Fedora 43 ships RPM 6.0, which changed scriptlet failure propagation. Scriptlets that previously emitted "Non-critical error" warnings now bubble up as transaction-level errors. dnf5 issue #2507 documents the change. Anaconda --cmdline mode treats any 'error' token from the dnf transaction as a fatal abort. - man-db's `transfiletriggerin` is the canonical trigger: it runs `systemd-run /usr/bin/systemctl start man-db-cache-update` which returns non-zero in the anaconda chroot (no PID 1 systemd) and is flagged as transaction-level error under RPM 6.0. - We previously patched anaconda's transaction_progress.py on the BUILD HOST so livecd-creator could finish its own transaction. That patch lives only on the host running the build — never landed in the live rootfs the user installs from. Reproduced 3 times: install-time anaconda on the live ISO is unpatched, hits the same code path, aborts at exactly "Configuring man-db.x86_64". Two-layer fix: 1. kickstart %post seds the file inside the live rootfs at build time so the user's install-time anaconda is patched. Sed downgrades the 'error' token from raise PayloadInstallationError to log.warning. 2. Generated install ks excludes man-db / man-pages / man-pages-overrides from %packages. Belt-and-braces — even if the patch has an edge case the trigger never fires because the package isn't installed. Users install man pages post-firstboot. Previous attempts that didn't work: dropping the updates repo (only narrowed the set of failing scriptlets, didn't fix the underlying RPM-6.0 propagation change); flipping SELinux to permissive (confirmed not the cause; kickstart's selinux directive only writes /etc/selinux/config in target root, doesn't affect installer-time). Follow-up for next release: replicate the transaction_progress patch in the CI workflow's container so the build itself is deterministic. Currently the workflow has been greening on luck. Files: kickstart/veilor-os.ks (+25 lines), overlay/usr/local/bin/veilor-installer (+10 lines). Verified: bash -n clean, ksvalidator clean.	2026-05-05 03:46:00 +01:00
veilor-org	931a19ec93	v0.5.28 (cont): drop updates repo from generated install ks Anaconda's transaction died at "Configuring man-db.x86_64" in both v0.5.26 and v0.5.27 VM tests, reliably, days apart, against a freshly populated package cache. Same failure pattern, same package, with nothing in the visible error other than "The transaction process has ended with errors..". Pattern matches the same Fedora `updates` repo issue that the CI build kickstart already worked around by stripping the `updates` line entirely (`.github/workflows/build-iso.yml` line ~109). The installer-generated kickstart was adding the line back and re-introducing the bug for every user install. This commit aligns the install-time ks with the build-time ks: only the base `releases` repo is consumed by anaconda. Users who want updates run `dnf upgrade` post-install (or the v0.6 `veilor-update` wrapper). Trade-off: first-boot package versions are frozen to the Fedora 43 release date instead of including post-release updates. Acceptable — the alternative is "install reliably fails" which makes any freshness conversation moot. Verified locally: `bash -n` passes, ks template still well-formed. End-to-end re-validation goes through the next CI ISO + VM test run.	2026-05-05 02:52:22 +01:00
veilor-org	e848c7ffc3	v0.5.28 (partial): lock locale to en_US, roadmap post-install menu Install-flow change + roadmap update. The roadmap entry is the durable record; the code change is the immediate effect. ## Locale picker removed The "[4/4] Locale" prompt is gone. Locale is hardcoded to en_US.UTF-8 for the install. Two reasons: 1. The picker only offered en_GB and en_US, both of which install identically apart from the langtag string and a couple of date / currency conventions that nobody who's mid-install is thinking about. It's a fake choice that adds a screen. 2. `localectl set-locale` post-install handles every locale on earth in one command. The v0.7 `veilor-postinstall` first-login menu (see roadmap below) will offer a locale + keyboard layout switch with live preview, which is the right place for that decision. Step counters updated [1/4]→[1/3], [2/4]→[2/3], [3/4]→[3/3]. The Locale row stays in the confirm-summary box because users still want to see what they're getting installed. ## Roadmap - New section v0.5.27–v0.5.28 — documents the install-path stabilisation work explicitly so the bridge between "first green ISO" and "looks polished" is not invisible. Calls out the LUKS BLS fix that landed in v0.5.27 + the gum-input replacement scheduled for v0.5.28. - v0.6 — `veilor-doctor` description expanded: this is the post-install audit tool. Every user runs it weekly to see drift from baseline. - v0.6 — new entry `veilor-postinstall`: EndeavourOS-style first-login welcome menu, single TUI screen, asks once. Covers the "I just installed, what do I configure" gap in one explicit step instead of scattered docs.	2026-05-05 02:48:36 +01:00
veilor-org	1881c14ea7	v0.5.27: rd.luks.uuid via grubby, GRUB rebrand, fbcon=nodefer, ASCII gum cursor Critical install bug fix + cosmetic round-up + first formal test procedure document. ## Critical: LUKS unlock on first boot Generated installer kickstart's %post was injecting `rd.luks.uuid=…` into `/etc/default/grub` only. Fedora 43 uses BLS (Boot Loader Specification) entries in `/boot/loader/entries/.conf`; those are NOT regenerated by `grub2-mkconfig`. Result: the kernel boots without `rd.luks.uuid=`, dracut's cryptsetup-generator never spawns the unlock unit, plymouth has no password to ask for, and dracut-initqueue loops on dev-disk-by-uuid for ~3min before dropping to emergency shell. The fix layers both write paths: - `/etc/default/grub` — keeps the args around for future kernels (kernel-install reads this when adding new entries). - `grubby --update-kernel=ALL --args=...` — rewrites the `options` line of every existing BLS entry so the kernel that boots NEXT actually has the args. Verified by reading `/proc/cmdline` from the dracut emergency shell on a v0.5.26 install; old cmdline had only `root=UUID=… ro rootflags=subvol=root` and was missing the LUKS arg entirely. ## GRUB / branding - `/etc/default/grub` is sed'd to `GRUB_DISTRIBUTOR="veilor-os"` (was already there, kept). - BLS entries' `title` line is rewritten in-place to "veilor-os (<kver>)" for every kernel — `grub2-mkconfig` does not touch BLS titles, so this is the only path. - `/boot/loader/entries/-0-rescue-*.conf` is removed: the auto-built rescue entry was leaking "Fedora Linux" into the GRUB menu and showing a second boot option that nobody asked for. The rescue kernel image itself is left in /boot. - Hostname defaults to `veilor` (was inheriting the `localhost-live` name anaconda writes when the kickstart's network directive is ignored under cmdline mode). - `/etc/machine-info` adds `PRETTY_HOSTNAME="veilor-os"` so `hostnamectl status` and any consumer reading machine-info see the brand. ## Boot UX - `fbcon=nodefer` added to live-ISO bootloader cmdline. On real laptops with a hardware GPU, the kernel modeset blanks the framebuffer console mid-boot; without `nodefer` the installer banner draws into a frozen framebuffer and the user sees a black screen with a blinking cursor for ~30s. virtio-vga in QEMU doesn't trigger this so it never reproduced in VM. Symptom report on v0.5.26 was the trigger to investigate. ## Installer cosmetics - `GUM_CHOOSE_CURSOR` and `GUM_INPUT_PROMPT` switched from `❯ ` to `> `. The unicode arrow falls back to a fixed-width block on the linux fbcon font and lipgloss then duplicates that block at col +23, producing the "Install Install" double-render and the stray-T artifact in password fields. Plain ASCII renders identically across fbcon, virtio-vga, and X/Wayland gum runs. - `VERSION_ID` bumped 0.5.8 → 0.5.27 in the os-release drop-in. The installer banner reads this at runtime, so the live ISO + installed system both now show "veilor-os 0.5.27". ## Test procedure - `test/TESTING.md` — first canonical test procedure document. Splits VM (cheap iteration, hybrid sendkey + human passwords) from real hardware (mandatory for tag). Documents the standard test passwords (`veilortest1` for both LUKS and admin), the kill-and-relaunch step to skip CD on second boot, and the per-step pass/fail contract. - `test/METHOD-CHANGELOG.md` — append-only audit trail for changes to the procedure. Future releases that alter the test method must add an entry here with the why. - `test/test-runs/_TEMPLATE.md` — per-run report template. Each tagged release should land a filled report alongside it. ## test/run-vm.sh Decoupled QEMU monitor sock setup from auto-inject. Previously `NO_INJECT=1` (used to suppress autotype noise into prompts) also killed the monitor sock, leaving the VM undriveable. Monitor sock is now always exposed; only the inject helper is gated on the pubkey detection.	2026-05-05 01:43:00 +01:00
veilor-org	c89c73ee84	v0.5.26: log to /run, tolerate tee failure User hit `/usr/local/bin/veilor-installer: line 33: /usr/bin/tee: input/output error` on real-hardware install. Cause: LOG was `/var/log/veilor-installer.log`, which on the live ISO is backed by an overlay over squashfs. A bad sector / flaky USB → tee write fails → process substitution dies → installer aborts before the menu renders. Two changes: 1. Move LOG to /run/veilor-installer.log — pure tmpfs, never touches the live medium. Same path also unaffected by /var fill or overlay weirdness. 2. Wrap the `exec > >(tee -a $LOG) 2>&1` redirect in a writability probe. If the log can't be appended to (tmpfs OOM, fd exhaustion, anything), skip the tee and run the installer without on-disk persistence rather than crashing. Persistence is a nice-to-have for post-mortem debugging; the installer running is the must-have. This inverts the priority correctly.	2026-05-04 14:20:26 +01:00

1 2 3

138 commits