Commit graph

145 commits

Author SHA1 Message Date
obsidian-ai
48ccabe914 ci(bluebuild): bluebuild bin lives at /usr/bin not /usr/local/bin 2026-05-06 17:06:33 +01:00
obsidian-ai
756b03aa5c ci(bluebuild): override CLI container entrypoint to bluebuild binary
Container's default entrypoint is dumb-init, which interpreted 'build'
as a command to exec rather than as a bluebuild subcommand. Pin
--entrypoint /usr/local/bin/bluebuild and pass 'build ...' as args.
2026-05-06 17:03:53 +01:00
obsidian-ai
1e70cc5461 ci(bluebuild): use ghcr.io/blue-build/cli container instead of action
The blue-build/github-action requires docker buildx which podman
doesn't ship. Symlinking podman as docker isn't enough — the action
calls 'docker buildx inspect' / 'docker buildx rm' which podman
doesn't implement. Pull the official BlueBuild CLI container and run
it with --build-driver buildah; works against podman storage with no
docker dependency.
2026-05-06 17:01:22 +01:00
obsidian-ai
9ee2cec20e ci(bluebuild): symlink podman -> docker (action needs docker CLI) 2026-05-06 16:58:50 +01:00
obsidian-ai
8926894ceb ci(bluebuild): chown /etc/sudo* to root before sudo (userns=host fix) 2026-05-06 16:56:36 +01:00
obsidian-ai
6d8164c199 ci(bluebuild): use blue-build/github-action composite (no CLI binary release)
BlueBuild CLI does not ship pre-built binaries on GitHub Releases
(latest tag v0.9.35 has no assets — install path is cargo or their
container image). Drop the curl-tarball install step and use the
official composite action @ pinned SHA — it runs podman + buildah
inside, works on Forgejo runner identically to GH-hosted because
it's bash, not node-bound.
2026-05-06 16:54:04 +01:00
obsidian-ai
bbdafbce94 ci(bluebuild): slim dnf list + install cosign from upstream binary
dnf5 in Fedora 43 strict-fails when 'already installed' packages
appear in -y install. Drop git/curl/tar/sudo (shipped in
veilor-build:43 image already) and use --skip-unavailable. cosign
isn't packaged in F43 — pull v2.4.1 static binary from upstream.
2026-05-06 16:51:17 +01:00
obsidian-ai
6391b1104b bluebuild(recipe): reconcile kickstart %post into BlueBuild modules (A2)
Walk every action in kickstart/veilor-os.ks %post and map to its
v0.7 atomic equivalent:

Build-time script additions:
- chmod +x /usr/share/veilor-os/scripts/* + /usr/local/bin/veilor-*
  (BlueBuild type:files sometimes drops perms)
- fc-cache -f after Fira Code stamping
- os-release brand override (NAME=veilor-os, ID=veilor, ID_LIKE)
- brand-leak guard: fail the image build if any onyx/personal data
  slipped through into shipped state

Layered packages:
- zram-generator (memory hygiene; replaces dnf install in kickstart)
- jq (used by veilor-doctor for `bootc status --json`)
- vim-enhanced + tmux + htop (admin essentials, parity with v0.5.x)

Systemd unit enables added:
- veilor-postinstall.service (first-login TUI; new in A3)
- veilor-doctor.timer (weekly drift check; new in A3)

Dropped: anaconda transaction_progress.py patch (build-time CI work,
not image content); SDDM display-manager symlink (kinoite ships
sddm.service already); SELinux module build (secureblue has its
own); systemctl set-default multi-user.target (kinoite is
graphical.target by design).
2026-05-06 16:50:02 +01:00
obsidian-ai
4d53d76442 docs: v0.7 user-facing docs (INSTALL-V07, STRATEGY pivot, README, CHANGELOG)
A4 inline (agent failed on API):
- docs/INSTALL-V07.md: 130-line user walkthrough — bootstrap ISO,
  Anaconda LUKS prompts, ostreecontainer pull, first-login TUI, day-
  to-day bootc-upgrade / rpm-ostree-install / bootc-rollback.
- docs/STRATEGY.md: append PIVOT EXECUTION 2026-05-06 section
  recording v0.5 ship, v0.6 cancel, v0.7 active.
- README.md: rewrite Quick install block for v0.7 path; legacy v0.5.0
  block kept below.
- CHANGELOG.md: Unreleased entry covering the spike's CI port +
  atomic CLI port + docs.
2026-05-06 16:48:48 +01:00
obsidian-ai
606806f82f overlay: atomic CLI tools for v0.7+ (bootc upgrade, postinstall, doctor)
A3 inline (agent failed on API). Three CLIs ported / written for the
v0.7+ atomic system:

veilor-update — rewritten on bootc upgrade (was dnf upgrade --refresh).
  Pre-checks bootc status, pauses auditd while staging, prints summary
  and offers reboot. Returns 0/1/2/3 per legacy contract.

veilor-postinstall (NEW) — first-login TUI run via
  veilor-postinstall.service oneshot. Asks once for keyboard, locale,
  hostname, GPU drivers, package presets (dev/media/homelab),
  bluetooth, USBGuard snapshot, then invokes veilor-doctor. Writes
  /var/lib/veilor/postinstall-complete and self-disables on success.

veilor-doctor — Updates section rewritten to parse `bootc status
  --json` (with jq) when available, falls back to dnf history /
  check-update for legacy v0.5.x kickstart-installed systems.

Plus systemd units:
  - veilor-postinstall.service (oneshot on graphical.target, gated on
    absence of done-marker, runs on tty1)
  - veilor-doctor.service + .timer (weekly drift check)
2026-05-06 16:46:59 +01:00
obsidian-ai
61fec5e1a9 ci(bluebuild): port build to Forgejo runner (nullstone label)
A1 inline (agent failed on worktree base mismatch). Adapt
build-bluebuild.yml to run on the Forgejo self-hosted runner using
the same lessons from build-iso.yml debug:

- runs-on: nullstone (resolves to veilor-build:43, fedora43+nodejs)
- BlueBuild CLI installed in-job from upstream release tarball v0.9.10
- podman/buildah/skopeo/cosign installed via dnf
- bluebuild build with podman driver + skopeo inspect + cosign signing
- Push primary to Forgejo registry git.s8n.ru/veilor-org/veilor-os
- GHCR push gated to github.server_url == 'https://github.com' only
- SBOM + attest-build-provenance gated GH-only (Forgejo has no Fulcio)
- All third-party actions remain pinned to node20-shipping versions

Secrets needed in Forgejo repo settings:
- FORGEJO_REGISTRY_TOKEN: PAT with package:write on veilor-org
- FORGEJO_REGISTRY_USER: 's8n-ru' (or org member with write scope)
2026-05-06 16:44:52 +01:00
obsidian-ai
5e94a61ea0 docs(ROADMAP): pivot — v0.6 cancelled, v0.7 BlueBuild OCI is mainline
Strategy pivot 2026-05-06: v0.5.32 produced a green ISO on Forgejo
runner. That's the kickstart-path proof point. Continuing v0.6
kickstart polish is sunk-cost work on tooling retired at v1.0.

Pivot:
- v0.5.0 is the FINAL kickstart-path release. Tag, freeze, ship.
- v0.6 cancelled as a milestone. Original plan kept inline as
  HISTORICAL reference.
- v0.7 promoted to primary active milestone. Absorbs the v0.6
  ergonomic CLI tools (veilor-postinstall / veilor-doctor /
  veilor-update) with bootc upgrade replacing dnf upgrade.
- Active branch: v0.7-bluebuild-spike. All future feature work lands
  there, not on main.
2026-05-06 16:10:03 +01:00
obsidian-ai
d48e59f05b docs: add PROOF-OF-WORK.md — receipts of work, tooling, and decisions
Single document that surfaces the depth of work behind veilor-os:
metrics, distros studied, every tool traversed in the build chain,
all 35+ failure classes hit and beaten, key engineering decisions and
why, what's in the repo beyond the kickstart, and the self-hosted
nullstone CI infrastructure built to support it.

Receipts not narrative — every claim links back to a file path,
commit, error, or config. Useful as portfolio anchor and as a single
read-this-first for anyone returning to the project after a gap.
2026-05-06 16:10:03 +01:00
obsidian-ai
ecd374ab1a ci: gate cosign/sbom/attest steps to github only
cosign keyless sign uses Sigstore Fulcio which requires a
Fulcio-trusted OIDC issuer. Forgejo runs don't have one, so cosign
falls back to the interactive device flow and times out
(error obtaining token: expired_token). Same applies to
attest-build-provenance and the SBOM action's signed attestation.

Skip all three on Forgejo for now; ISO + sha256 are sufficient for
v0.5.x test releases. Re-add when we self-host a Sigstore stack or
sign with a key-pair instead of keyless.
2026-05-06 16:10:03 +01:00
obsidian-ai
e17c04007d docs(README): tone down secureblue credit (no code lifted yet)
We layer on their OCI image as v0.7 base; we don't redistribute their
source. Drop the AGPLv3-attribution prose — that becomes relevant only
if/when we ship a verbatim chunk of their config/policy in our repo.
2026-05-06 16:10:03 +01:00
obsidian-ai
97939d76f8 docs(README): add secureblue column + upstream credit section
secureblue (AGPLv3) is the upstream hardened atomic Fedora that the
v0.7 BlueBuild spike layers on top of. Comparison table now includes
secureblue alongside Kicksecure + stock Fedora KDE. New "Credit &
relationship to secureblue" section spells out where their work
already solves problems we don't need to reinvent (Trivalent,
SELinux policy, kernel cmdline, signed OCI), how veilor-os differs
(kickstart install path + branding + Forgejo CI), and the AGPLv3
attribution rule for any code we lift verbatim.
2026-05-06 16:10:03 +01:00
obsidian-ai
abaff9d3c3 ci: symlink /work -> GITHUB_WORKSPACE for ks %post SRC probe 2026-05-06 16:10:03 +01:00
obsidian-ai
29a6677d54 ks: drop apparmor-* packages — not in Fedora 43 repos
Build error: 'Failed to find package apparmor-parser : No match for
argument'. Fedora 43 base and updates do not ship AppArmor packages;
the prior comment was incorrect. Defer AppArmor to v0.7 secureblue OCI
hybrid (which has its own LSM stack), or land via COPR overlay later.
2026-05-06 16:10:03 +01:00
obsidian-ai
b3572565e2 ci: run build directly in Fedora job container, drop addnab nest
forgejo-runner labels nullstone -> fedora:43 image. Switching
runs-on: ubuntu-24.04 -> nullstone makes the job container itself
the build environment, eliminating the docker-in-docker workspace
bind-mount problem (host path != act-container path).

Build now runs as root in fedora:43, installs livecd-tools directly
via dnf, and writes outputs to $GITHUB_WORKSPACE which is the natural
runner workdir on host. No nested docker, no userns juggling, no
explicit -v workspace bind needed.
2026-05-06 16:10:03 +01:00
obsidian-ai
9bf063a178 ci: add /work diagnostic before sed-redirect to surface bind/perm issue 2026-05-06 16:10:03 +01:00
obsidian-ai
3f138e7435 ci: repin fedora:43 build container to amd64 digest
Prior pin was the arm64 manifest digest (linux/arm64/v8); on x86_64
host it failed with `exec /usr/bin/sh: exec format error`. Pinned to
the amd64 manifest entry from the same fat-manifest.
2026-05-06 16:10:03 +01:00
obsidian-ai
7d6054311b ci: add --userns=host to nested Fedora build container
Forgejo runner on nullstone runs against a daemon with
userns-remap=default. addnab/docker-run-action launches the Fedora 43
build container with --privileged, which is incompatible with
userns-remap unless --userns=host is also set.
2026-05-06 16:10:03 +01:00
obsidian-ai
6b0828d692 ci: pin sbom/cosign/attest actions to node20-safe versions
forgejo-runner v6.4.0 ships node20; floating tags @v0/@v3/@v2 now
resolve to actions whose runs.using=node24, which the runner cannot
exec. Pin to last node20-shipping release of each:

- anchore/sbom-action@v0.17.2
- sigstore/cosign-installer@v3.7.0
- actions/attest-build-provenance@v2.2.3
2026-05-06 16:10:03 +01:00
s8n
a59f1f026a ci: gate softprops release steps + add Forgejo API equivalents
The build-iso workflow used softprops/action-gh-release@v2 unconditionally,
which only speaks the GitHub Releases REST API. When the workflow runs on
the Forgejo runner registered on nullstone, those steps would fail.

Add a server_url check so the GH-only path runs only on github.com, and
mirror it with a curl-based step that hits the Forgejo /api/v1/releases
endpoints. Behaviour:
  - github.com: identical to before (action-gh-release@v2).
  - git.s8n.ru: drop+recreate ci-latest release, upload chunked assets
                via the Forgejo attachments API.

Tag-driven "Attach to release" path mirrored the same way.

Refs: A1 build-eng task — Forgejo runner adaptation.
2026-05-06 16:10:03 +01:00
veilor-org
beef32a77c feat(installer): promote eject-media reminder to its own box
In v0.5 the "Remove the install media" reminder was a single line
inside the green success box, and operators on both onyx and the
friend's RTX 4080 rig missed it — rebooted into the live ISO and
re-ran the installer thinking the install had silently failed.

Promote the reminder to its own loud yellow thick-bordered gum-style
box stacked directly below the success/countdown box, with three
lines of explanation. Renders for the full 10s of the countdown so
it stays in the operator's face the entire window.
2026-05-06 16:10:03 +01:00
veilor-org
0a70eea950 feat(installer): 10s reboot countdown with per-tick redraw
v0.5 used `sleep 5` after a static "System will reboot in 5 seconds."
box, which left the operator guessing how much time was left to grab
the USB stick. The new loop runs 10 → 1, clearing + redrawing the
gum-style success box each tick with the remaining-seconds figure,
giving the operator a visible window to act.

10 seconds (vs 5) because real hardware operators were missing the
window — laptops with the USB on the far side of the dock take
4-5 seconds to physically reach. 10 is comfortable, not annoying.
2026-05-06 16:10:03 +01:00
veilor-org
877ad91096 feat(installer): confirm-twice for LUKS passphrase + admin password
A typo in the LUKS passphrase is unrecoverable — the disk is
unmountable without it and we don't escrow the key. Re-prompting
until the two reads match catches keyboard-layout surprises (the
US/UK quote-key position is the most common one) before they brick
the install.

Admin password gets the same treatment for consistency. Less
catastrophic (resettable from a recovery shell) but a mismatch
still locks the user out of their fresh install on first boot.

Loop bails on cancel/ESC and re-prompts on validate_pw failure.
2026-05-06 16:10:03 +01:00
veilor-org
a3b3d29b38 feat(installer): staged banner reveal at 40ms/line
Read banner.txt line by line with a 40ms sleep between each, then
clear and redraw the bordered gum-style version. 5-line banner ×
40ms = 200ms total reveal — slow enough to land an aesthetic on the
first frame, fast enough that the operator never feels it as lag.

Pure cosmetic; no functional change to the install flow.
2026-05-06 16:10:03 +01:00
veilor-org
55221a6af2 fix(installer): swap gum input --password for bash read -srp
`gum input --password` corrupts the linux fbcon since v0.5.27 — the
bubbletea screen-restore writes back the previous menu buffer because
the framebuffer terminfo entry lacks `civis/cnorm` cursor-hide
sequences, leaving a duplicate "Install" plus a stray "T" rendered on
top of the password field. The fix is a single termios echo-off via
`read -srp`: no redraw, no glitch, no dependency on gum's TUI layer
for the one screen where it broke.

Header still rendered through `gum style` so visual parity with the
disk picker / confirm box is preserved. Whiptail fallback path
unchanged (passwordbox there has always rendered cleanly).
2026-05-06 16:10:03 +01:00
s8n
d76597c57a sec: AppArmor v0.6 stub — load profiles in complain mode
Per docs/research/2026-05-05-agent-wave/04-hardening-tier-2.md (v0.6
scope item 1).

Adds:
  - apparmor-parser apparmor-utils apparmor-profiles to %packages in
    BOTH kickstart/veilor-os.ks (live ks) and overlay/usr/local/bin/
    veilor-installer (generated install ks heredoc).
  - scripts/40-apparmor.sh — wires aa-complain on every veilor-shipped
    profile. Idempotent. "loaded, present, nothing breaks".
  - overlay/etc/apparmor.d/veilor.d/firefox — 1-liner stub (binary
    confinement marker only; full policy post-v0.6).
  - overlay/etc/apparmor.d/veilor.d/thunderbird — same pattern.
  - Wired 40-apparmor.sh into install %post chain after
    30-apply-v03-theme.sh.

Complain mode means: profiles loaded, kernel logs syscall denials but
does NOT enforce. Operator can review audit.log post-install to
inform v0.7 policy authoring.
2026-05-06 16:10:03 +01:00
veilor-org
631e7bd040 ci: TODO marker for SHA-pinning third-party actions
Note that all `uses:` directives still resolve to mutable major-
version tags. SHA-pinning is the Agent 8 audit recommendation but
requires per-action web lookups that stalled the previous SRE
attempt; tracked separately so this PR can land first.
2026-05-06 16:10:03 +01:00
veilor-org
9158532c9d ci: pin fedora:43 base image to digest
Pin registry.fedoraproject.org/fedora:43 to its current manifest
digest so a malicious or accidental tag-rewrite upstream cannot
silently change the base layer of every CI build. Digest was
captured via `skopeo inspect --raw` on 2026-05-06. Refresh
procedure documented inline.
2026-05-06 16:10:03 +01:00
veilor-org
e93ef644e1 ci: add cosign keyless sigs, SBOM, and provenance attestation
Sign each ISO chunk with cosign keyless OIDC, generate an SPDX SBOM
of the build output, and attach an in-toto build-provenance
attestation. Sigs/certs/SBOM are uploaded alongside the ISO parts in
the ci-latest rolling prerelease so the test/auto-install.sh path
can verify before reassembling.

Action versions are major-version tags (@v3, @v0, @v2). SHA-pinning
is tracked separately to keep this PR small and avoid the long web
lookups that stalled the previous attempt.
2026-05-06 16:10:03 +01:00
obsidian-ai
21f2b4da9a ci: pin actions to node20-safe tags + runner sock pass-through
forgejo-runner v6.4.0 ships a node20 javascript engine. v4.2+ of
actions/checkout and v2.0.5+ of softprops/action-gh-release moved to
node24, which the runner refuses to exec. Pin both to last node20
release.

Pairs with a runner-side config change (separately deployed on
nullstone /home/docker/forgejo-runner/conf/config.yaml) that adds
`-v /var/run/docker.sock:/var/run/docker.sock` to per-job container
options + whitelists the socket via valid_volumes — without that
addnab/docker-run-action@v3 inside the catthehacker/ubuntu job
container can't reach the docker engine.

- actions/checkout v4 -> v4.1.7
- softprops/action-gh-release v2 -> v2.0.4
- addnab/docker-run-action v3 unchanged (composite/docker, no node)
- ludeeus/action-shellcheck@master unchanged (docker-based)
2026-05-06 16:10:03 +01:00
s8n
91d5d26473 sec: polish THREAT-MODEL.md for v0.7 public launch
Status flipped Draft → Final.

In-scope rows now cite specific config files / settings (auditable
from clean checkout):
  - LUKS2 params from kickstart/veilor-os.ks
  - sysctl knobs file path
  - USBGuard policy mode + rule type
  - sshd_config drop-in path + every directive
  - auditd rule path + watched paths
  - chrony NTS endpoints
  - systemd-resolved DoT settings
  - bootloader kernel args (lockdown, slab_nomerge, init_on_alloc/free, etc.)

Out-of-scope rows un-hedged. 'May not always' phrasings removed; each
adversary states unambiguously what veilor-os does NOT do.
2026-05-06 16:10:03 +01:00
veilor-org
c7c0a0bcc8 docs: test run report skeleton for v0.5.32 (Forgejo build)
First test-runs/ report off the new template. Records the build host
(forgejo-runner on nullstone, ubuntu-24.04 / catthehacker:act-24.04),
notes that v0.5.32 is the first ISO produced after the GH Actions
mirror was disabled, and pre-populates the Findings section with the
7 v0.5.32 blocker fixes from the 2026-05-05 9-agent wave as expected
behaviours the tester must verify.

Result is left as "pending A1 build" — the operator + A5 fill in
per-step pass/fail and hardening output once the actual VM walkthrough
runs against the produced ISO. This is intentional: the report is the
scaffold; the test is a separate step.
2026-05-06 16:10:03 +01:00
s8n
9011fd2dbf docs: README de-GH + Forgejo build status 2026-05-06 16:10:03 +01:00
s8n
20b3541d38 docs: METHOD-CHANGELOG 2026-05-06 forgejo entry 2026-05-06 16:10:03 +01:00
veilor-org
1fa45c3749 chore: gitignore auto-install-vm test artifacts
Test rig was leaking test/auto-install-vm.{nvram,qcow2} into untracked
state. Pattern matches existing veilor-vm.* exclusion.
2026-05-06 16:10:03 +01:00
veilor-org
d9b206e46b docs: STRATEGY.md — primary git host moved to git.s8n.ru (Forgejo)
Self-hosted Forgejo + forgejo-runner on nullstone now primary.
GitHub becomes public mirror (Forgejo push-mirrors every commit
+ every 8h). 0 GH Actions minutes consumed.

Runner labels:
  ubuntu-24.04 — drop-in for existing build-iso.yml workflow
  nullstone    — privileged Fedora 43 (opt-in via runs-on: nullstone)

Deploy artifacts: ~/ai-lab/nullstone-server/forgejo/.

External TODO (parent operator owns):
  - router port-forward 222 → nullstone:222 for public SSH push
  - no-guest@file allowlist update for external web UI access
2026-05-06 16:10:03 +01:00
veilor-org
89949dc8f2 v0.5.32: ship 7 blockers from 9-agent wave
Per docs/research/2026-05-05-agent-wave/README.md priority list.
All 7 land together to keep iteration cycles useful — partial fixes
bury the lookahead findings agents already mapped.

## 1. CRITICAL — suspend/resume wifi death (Agent 9, B2)

`veilor-modules-lock.service` runs `kernel.modules_disabled=1` 30s
after graphical.target. iwlwifi/iwlmvm/cfg80211 reload on resume
from S3/S0ix → with modules locked, resume breaks wifi until
reboot. Same architectural class as the LUKS bug — security feature
breaks legitimate kernel state transitions.

The unit already has `ConditionKernelCommandLine=!module.sig_enforce=1`
(self-skip when signed-modules enforcement is on cmdline). Adding
`module.sig_enforce=1` to the kernel cmdline retains the security
property (no unsigned modules) without runtime lock-down → resume
works.

Files: kickstart/veilor-os.ks line 61 + overlay/usr/local/bin/veilor-installer
generated bootloader directive both gain `module.sig_enforce=1`.

## 2. veilor-firstboot.service WantedBy=graphical.target (Agent 2)

Was `WantedBy=multi-user.target` only. Real installs default to
graphical.target so the unit never ran on installed systems — admin
pw stayed at install-time + chage -d 0 expired, SDDM PAM bounced
to chauthtok screen (recoverable but ugly UX).

Now `WantedBy=graphical.target multi-user.target`. Live ISO +
multi-user installs both resolve via this list.

## 3. USBGuard hash → id-based baseline (Agent 9, A3)

Mirrors memory feedback_usbguard_dock.md — onyx had hash+parent-hash
rules that broke on dock replug; we shipped no rules.conf so first
boot blocks the USB keyboard.

Adds overlay/etc/usbguard/rules.conf with HID-class allow rule
(`allow with-interface match-all { 03:*:* }`) — covers every USB
keyboard, mouse, gamepad, fingerprint reader, NFC. Survives dock
replug + kernel-bump vendor renumeration. Mass-storage stays
implicit-block; user explicitly allows post-firstboot via
`ujust veilor-usbguard-enroll` (planned v0.6).

## 4. firewalld trusted zone with tailscale0 pre-bound (Agent 9, D1)

User uses Tailscale daily (memory: project_tailscale_mesh.md).
Default firewalld zone = drop, blocks tailnet traffic on tailscale0.

Adds overlay/etc/firewalld/zones/trusted.xml with
`<interface name="tailscale0"/>`. After `tailscale up` brings the
interface up, NetworkManager dispatcher associates it with the
trusted zone automatically — no user intervention.

Default zone stays drop. Only the tailscale0 interface gets ACCEPT.

## 5. /etc/skel branding (Agent 7)

Was completely empty. Result: per-user KDE config (~/.config/kdeglobals
etc.) pre-empty, so the moment user opened System Settings, KDE wrote
fresh ~/.config/* and silently shadowed our /etc/xdg/kdedefaults/*.
Visual brand evaporated on first click.

Seeds:
  /etc/skel/.config/kdeglobals    (copy of assets/kde/veilor-default.kdeglobals)
  /etc/skel/.config/breezerc      (copy of assets/kde/breezerc)
  /etc/skel/.config/kwinrc        (Plasma 6 wayland defaults: opengl, animspeed=0,
                                    blur off, click-to-focus)
  /etc/skel/.config/konsolerc     (default profile = Veilor)
  /etc/skel/.local/share/konsole/Veilor.profile + .colorscheme

User who opens System Settings now writes against branded baseline,
not against vanilla Breeze.

## 6. KMS modeset args + initramfs keymap (Agents 1 + 9)

Real laptop boot has a 5-15s blank between vt switch and SDDM start
because simpledrm releases before i915/nvidia-drm/amdgpu claim. Plus
non-US users get locked out at LUKS prompt because initramfs ships
en-US keymap by default (RHBZ 1405539, RHBZ 1890085).

Adds to bootloader cmdline (live + installed):
  i915.modeset=1 amdgpu.modeset=1 nvidia-drm.modeset=1
  rd.vconsole.keymap=us

`rd.vconsole.keymap=us` is a placeholder; the v0.6 firstboot keymap
picker will rewrite it from /etc/vconsole.conf. Until then, en-US
users get correct LUKS keyboard; non-US users still need the v0.6
fix (per Agent 1).

## 7. virtio-9p log capture (Agent 6)

The v0.5.30 virtio-serial wiring depends on rsyslog inside the live
ISO (anaconda's setupVirtio writes a rsyslog forward rule), which
the live ks doesn't install — files were 0-byte across three
install runs.

test/run-vm.sh now adds a `-virtfs local,...,mount_tag=hostlogs`
share pointing at `test/test-runs/<timestamp>/`. veilor-installer
runs `_dump_logs_to_host` via EXIT trap that mounts the share at
/mnt/hostlogs and rsyncs /tmp/{anaconda,program,storage,packaging,dnf}.log
+ /var/log/veilor-installer.log + dmesg + journalctl + the generated
ks. Runs on success AND failure AND ^C.

No-op on real hardware (9p tag absent) — VM-only debug.

## Validate

  bash -n overlay/usr/local/bin/veilor-installer  # OK
  ksvalidator kickstart/veilor-os.ks               # clean

## Out-of-scope for v0.5.32 (deferred to v0.6)

Per Agent 1 follow-ups: argon2id retune for slow CPUs, recovery key
generation in firstboot, TPM2/FIDO2 unlock helpers. Per Agent 9
follow-ups: Plasma Wayland fallback X11 install, lid-close handling,
SELinux relabel progress UX. Per Agent 4: AppArmor stack +
nftables preset + audit log shipping CLI.

Per Agent 8 (CI hardening): SHA-pin actions + dependabot + SBOM +
SLSA L3 attestation — separate workflow-only commit.
2026-05-06 16:10:03 +01:00
s8n
0b568b016b Merge pull request 'ci(bluebuild): pin actions to node20-safe tags' (#9) from feat/runner-fix-node20-pinning into v0.7-bluebuild-spike 2026-05-06 13:54:31 +01:00
obsidian-ai
e50c9a3b43 ci(bluebuild): pin actions to node20-safe tags
forgejo-runner v6.4.0 javascript runtime is node20. Pin every
javascript action used in the spike branch's workflows to the last
release that ships node20.

- actions/checkout v4 -> v4.1.7 (3 files)
- softprops/action-gh-release v2 -> v2.0.4 (build-iso)
- anchore/sbom-action v0 -> v0.17.2
- actions/attest-build-provenance v2 -> v2.2.3
- blue-build/github-action@v1 unchanged (TODO: SHA pin)

This is the spike-branch counterpart of the main-branch fix in
feat/runner-fix-docker-sock-and-node20.
2026-05-06 13:54:12 +01:00
s8n
9dc2846316 Merge pull request 'ci(bluebuild): pin blue-build/github-action to commit SHA' (#6) from feat/a1-bluebuild-pin into v0.7-bluebuild-spike 2026-05-06 13:53:15 +01:00
s8n
f2e36bfead ci(bluebuild): pin blue-build/github-action to commit SHA
Replace @v1 with @24d146df25adc2cf579e918efe2d9bff6adea408 (the commit
v1 currently resolves to). Tag pins on third-party actions are mutable
— a maintainer or attacker can re-point v1 at a malicious commit and
silently change what runs on every push.

Trailing comment '# v1' preserves human readability for future bumps.

Refs: 9-agent CI hardening wave (agent 8), 2026-05-05.
2026-05-06 10:32:13 +01:00
veilor-org
3c247bc601 v0.7 spike: BlueBuild recipe + ostreecontainer kickstart + cosign workflow
Initial scaffold for the v0.7 hybrid path. Spike branch only — does
NOT land in main until success criteria pass (see bluebuild/README.md).

## What this commits

- bluebuild/recipe.yml — BlueBuild recipe extending
  ghcr.io/secureblue/securecore-kinoite-hardened-userns:latest with:
  * veilor branding overlay (overlay/, assets/, scripts/ at /usr/share/veilor-os)
  * sudo restored (revert secureblue's run0-only)
  * Xwayland restored (some apps still need it)
  * mullvad-browser layered alongside Trivalent (default browser kept)
  * tailscale + yggdrasil packages (mesh stack layers 1 + 2)
  * tailscaled.service pre-disabled (awaits first-boot prompt)
  * yggdrasil.service enabled (idle warm-fallback per STRATEGY.md)
  * veilor-firstboot.service + veilor-modules-lock.service enabled
  * cosign signing module configured

- bluebuild/config/just/60-veilor.just — ujust recipes:
  * install-reticulum (RetiNet AGPL fork — mesh layer 3)
  * install-reticulum-rnode (LoRa hardware)
  * install-thorium (opt-in browser with explicit CVE-lag warning)
  * veilor-mesh-join (token paste / QR for tailscale onboarding)

- bluebuild/README.md — spike doc + smoke-test commands + 5-item
  success criteria checklist

- kickstart/install-ostreecontainer.ks — install kickstart template
  for the v0.7 path. No %packages block; uses
  `ostreecontainer --url=ghcr.io/veilor-org/veilor-os:43 --transport=registry`
  to populate / from the OCI image directly during anaconda's install
  pass. No first-boot rebase, no transition window. Keeps existing
  LUKS+btrfs partitioning verbatim.

- .github/workflows/build-bluebuild.yml — GH Actions workflow:
  * Triggered on push to v0.7-bluebuild-spike, weekly cron, dispatch
  * Uses blue-build/github-action@v1 (TODO: pin to commit SHA per
    CI hardening agent 8 follow-up)
  * Builds + cosign-signs (keyless via Sigstore) + pushes to GHCR
  * Smoke-tests the OCI image (sudo, mullvad-browser, yggdrasil,
    tailscale all present)
  * Generates SBOM (SPDX) via anchore/sbom-action
  * Publishes SLSA build provenance attestation

## What this does NOT change

- main branch is untouched. v0.5.x kickstart path keeps shipping.
- kickstart/veilor-os.ks (the live-ISO ks) is untouched — the v0.7
  hybrid uses the existing live-ISO build path; only the install-time
  ks (install-ostreecontainer.ks) is new.
- overlay/, scripts/, assets/ are untouched on this branch — the
  recipe pulls them in via `type: files` modules at build time.

## Spike success criteria (reproduced from bluebuild/README.md)

- [ ] `bluebuild build recipe.yml` exits 0
- [ ] `bootc container lint` exits 0 on resulting image
- [ ] `podman run` smoke-test passes
- [ ] CI workflow builds + cosign-signs + pushes to GHCR
- [ ] Installer ISO using `ostreecontainer` against this OCI reaches
      SDDM with admin login on first boot

If all 5 land, merge v0.7-bluebuild-spike → main as v0.7.0.

## Reference

- docs/STRATEGY.md (full plan)
- docs/ROADMAP.md v0.7 (schedule)
- docs/THREAT-MODEL.md (publish before v0.7 ship)
- secureblue: https://github.com/secureblue/secureblue
- BlueBuild: https://blue-build.org
- ostreecontainer: https://docs.fedoraproject.org/en-US/bootc/anaconda-install/
2026-05-05 15:30:04 +01:00
veilor-org
7060d9aa6b docs: refine strategy — ostreecontainer install + mesh stack + browser stack
Refines docs/STRATEGY.md per parent-operator handoff (2026-05-05).
Locks in five things the original draft didn't cover, and corrects
one mistake.

## Refinement: ostreecontainer install path

The original draft proposed a two-step install: Anaconda partitions
+ kickstart, then on first boot a `veilor-firstboot-rebase.service`
runs `bootc rebase ghcr.io/veilor/veilor-os:43`. This commit drops
that step.

Anaconda's `ostreecontainer --url=... --transport=registry`
directive populates the root filesystem directly from the OCI image
during the install pass. No first-boot rebase, no transition
window, no second reboot. Same end state, simpler path.

Stay on `ostreecontainer` through v0.8. Do NOT migrate to the new
`bootc` kickstart command until v1.0 — it blocks multi-disk and
authenticated registries. Do NOT use `bootc-image-builder
anaconda-iso` output — deprecated in image-builder v44+. Produce
the OCI image and the bootstrap ISO as separate artifacts.

This compresses the v0.7 BlueBuild spike from 2 days → 1 day.

## Correction: keep Trivalent as default

The original strategy.md treated Trivalent (secureblue's hardened
Chromium) as an override-and-remove. That was wrong: Trivalent's
COPR tracks upstream M147+ within hours, ships hardened_malloc +
JIT-less + Drumbrake WASM. Default browser pick.

Mullvad Browser layered alongside for anti-fingerprint. Thorium
remains opt-in via `ujust install-thorium` only — its CVE lag is
months and contradicts the threat model. Never default.

## Mesh stack baked in

Three-layer warm-stack documented in STRATEGY.md:
- L3a Tailscale + Headscale (Day 1, daily driver)
- L3b Yggdrasil-go (Day 1, idle warm-fallback, AllowedPublicKeys mode)
- L3c Reticulum/RetiNet AGPL fork (opt-in via ujust install-reticulum)

Threat floor table: ISP-DNS-block (i, Day 1), ISP-Tailscale-block
(ii, Phase 2 promote Yggdrasil), internet-down (iii, opt-in RetiNet
+ RNode).

Tier model: tag:admin / tag:infra / tag:guest with failsafe pre-auth
key on yubikey + paper + Authentik OIDC group.

## Onboarding

Token paste / QR (user picks). Misskey signup mints reusable
24h-TTL pre-auth key. NOT auto-OIDC at first boot.

## Iroh seeding daemon stub (v0.8 / Phase 2)

`veilor-seed.service` documented but NOT implemented until Iroh hits
1.0 (current 0.96–0.98 RC, Q1 2026 target slipped). BLAKE3 +
iroh-gossip per-service topic. Static media only — DEFER DB
replication forever.

## External dependency tracked

nullstone Traefik `no-guest@file` ACL is currently 0.0.0.0/0
allow-all (XFF chain breakage 2026-05-03). Must be fixed before
veilor-os first-public-ISO ships, otherwise tag:guest provisioning
leaks the full vhost surface to every veilor user. Parent operator
owns the fix; explicitly out of veilor-os scope.

## Files

- docs/STRATEGY.md — full refinement
- docs/ROADMAP.md — v0.7 spike entry now reflects ostreecontainer
  + mesh stack + 1-day spike target
- README.md — drops the "v0.2.5 pre-release" badge + status box
  (out of date), adds bootc/atomic trajectory paragraph

## What did NOT change

- v0.5.x main branch is untouched. The ostreecontainer swap belongs
  in the v0.7 spike branch, NOT v0.5.32.
- nullstone Traefik config is untouched. Out of scope.
- The kickstart and overlay code is untouched.
2026-05-05 15:15:52 +01:00
veilor-org
50a241a603 docs: STRATEGY.md — hybrid kickstart bootstrap + bootc OCI on secureblue
Locks in the strategic decision from 2026-05-05 secureblue research
agent: pivot the technical base toward bootc/OCI, but as a layer over
secureblue's `securecore-kinoite-hardened-userns` rather than a
Containerfile-from-scratch.

## What changed

- New: `docs/STRATEGY.md` — full hybrid plan (kickstart bootstrap →
  first-boot bootc rebase → bootc-only at v1.0). Documents secureblue
  rationale, our overrides (drop Trivalent, restore sudo + Xwayland),
  next concrete steps for v0.7 spike (BlueBuild recipe + GH Actions
  workflow + `veilor-firstboot-rebase` one-shot).

- Updated: `docs/ROADMAP.md` v0.7 bootc-spike subsection — supersedes
  the Agent 3 Containerfile-from-scratch plan with the BlueBuild
  layering plan. Spike compresses 1 week → 2 days; hardening review
  inherited from 30 secureblue contributors.

## Why hybrid, not pure pivot

- Anaconda's LUKS UX (single passphrase prompt + custom
  partitioning) is mature; bootc-image-builder's installer is not yet
  on par. Keep the kickstart as the bootstrap.
- bootc upgrade gets us atomic A/B + signed image chain + instant
  rollback that we can't realistically build alone with our
  contributor count.
- The kickstart work is not lost — it becomes the day-zero installer
  through v0.7. v1.0 deprecates it entirely once bootc-image-builder
  installer ISO matures.

## Why secureblue, not Athena (Arch)

| Axis | secureblue | Athena OS |
|---|---|---|
| Maintainers | 30 | 8 |
| MAC enforcing OOB | SELinux + custom policy | AppArmor active, profiles mostly unconfined |
| Atomic / immutable updates | Yes (bootc/rpm-ostree) | No (rolling) |
| Threat model published | No | Yes |
| MS-signed Secure Boot shim | Yes (Fedora shim) | Yes (with auto-MOK) |

Athena's only structural advantage is the published threat model.
We're already drafting one (Agent 5 of 2026-05-05 wave) — we get
that win regardless. secureblue's contributor count + atomic update
infrastructure is the leverage.

## Strategic credibility win

Publishing `docs/THREAT-MODEL.md` BEFORE the v0.7 launch positions
veilor-os ahead of secureblue (no threat model) and Athena (has
threat model but smaller contributor base) on the one axis that
matters most.

## Open questions documented in STRATEGY.md

- secureblue contribution acceptance for upstream patches (USBGuard
  id-based-rules fix, threat model framework)
- Brave vs Mullvad-Browser pick for default browser
- bootc rebase first-boot fallback if rebase fails
- Fedora 44 transition timing follows secureblue's release tags
2026-05-05 15:05:59 +01:00
veilor-org
4e9782a18a docs: 9-agent research wave findings — v0.5.32 blocker map
Logs the full output of the 9-agent deep-dive run on 2026-05-05 to
docs/research/2026-05-05-agent-wave/. Pulls every actionable finding
into one indexed location so v0.5.32 planning has a paper trail.

Files:
  docs/research/2026-05-05-agent-wave/README.md             — index
  docs/research/2026-05-05-agent-wave/01-...real-hardware.md — Plymouth + LUKS edge cases
  docs/research/2026-05-05-agent-wave/02-...firstboot-ux.md  — SDDM + first-boot UX
  docs/research/2026-05-05-agent-wave/03-...spike-plan.md    — bootc-image-builder 1-week spike
  docs/research/2026-05-05-agent-wave/04-...tier-2.md         — AppArmor + nftables + audit + homed
  docs/research/2026-05-05-agent-wave/05-...launch.md         — threat model + v0.7 launch checklist
  docs/research/2026-05-05-agent-wave/06-...log-capture.md    — virtio-9p host-share for anaconda logs
  docs/research/2026-05-05-agent-wave/07-...skel-branding.md  — /etc/skel gap audit
  docs/research/2026-05-05-agent-wave/08-...ci-hardening.md   — SHA-pin actions + SBOM + SLSA L3
  docs/research/2026-05-05-agent-wave/09-...failure-modes.md  — real-hardware pessimistic audit

Plus the prior linter-applied:
  docs/ROADMAP.md      — Lessons learned section, v0.5.32 active block,
                          v0.6 promotion of veilor-postinstall + veilor-doctor,
                          v0.7 bootc spike scheduled
  docs/THREAT-MODEL.md  — drafted by Agent 5; in/out scope, comparison
                          matrix, v0.7 launch checklist

Top blockers identified for v0.5.32 (cross-cited in README):
  1. Suspend/resume wifi death (kernel.modules_disabled=1)
  2. veilor-firstboot.service WantedBy=graphical.target
  3. kernel-upgrade grub drift
  4. USBGuard hash-rules problem (already learned on onyx)
  5. firewalld blocks tailscale0
  6. /etc/skel/ empty
  7. virtio-9p log capture replaces broken virtio-serial path

Wave + verifier pattern (per ROADMAP lessons learned #4) validated:
9 parallel agents on distinct topics produced converging blocker
list. The same pattern landed v0.5.31 four-bug fix from the prior
4-agent verification wave on v0.5.30 outcome.
2026-05-05 14:52:53 +01:00
veilor-org
2788b95a12 v0.5.31: kernel-install via /etc/kernel/cmdline + set-e leak + rescue glob
Four-bug fix from 4-agent verification wave on v0.5.30 outcome.

Bug 1 CRITICAL: --location=none made anaconda skip CollectKernelArgumentsTask
(installation.py:149-151). --append= args never collected, BLS entries
wrote with empty cmdline. Drop --location=none, let anaconda do its
bootloader path; broad transaction_progress patch already silences the
gen_grub_cfgstub class failure.

Bug 2 CRITICAL: kernel-install reads /etc/kernel/cmdline as source of
truth (per 90-loaderentry.install:84-95). Veilor never wrote that file
so kernel-install fell through to /proc/cmdline (live ISO's). Add
3-path write: /etc/kernel/cmdline (Path A canonical), /etc/default/grub
(Path B legacy), grubby --update-kernel=ALL (Path C last-writer guard).
Plus explicit kernel-install add per kernel after Path A write.

Bug 3: rescue BLS glob *-0-rescue-*.conf required trailing hyphen;
F43 uses *-0-rescue.conf. Fix: *-0-rescue*.conf (matches both).

Bug 4: set +e/set -e scope leak in %post. v0.5.30 closed manual
bootloader block with set -e which re-enabled errexit for the rest of
%post that was authored with set +e semantics. Result: any
non-guarded command failure aborted the LUKS args injection block.
Fix: remove the closing set -e.

Files: overlay/usr/local/bin/veilor-installer.
Verified: bash -n clean, ksvalidator clean.
2026-05-05 13:47:23 +01:00