# Changelog
All notable changes to veilor-os are documented here.
The format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project loosely follows [Semantic Versioning](https://semver.org/spec/v2.0.0.html)
during the pre-1.0 phase.
Each release section records the **bug found** and the **fix applied** so
future maintainers can see why a change exists, not just what it changes.
## [Unreleased]
### v0.7 BlueBuild OCI spike (active — `v0.7-bluebuild-spike`)
CI plumbing landed (~13 fixes) to unblock the first green BlueBuild
run on the self-hosted Forgejo runner. **Build still red** as of
2026-05-08; OCI artifact + installer ISO pending green run.
#### Forgejo runner + build-image plumbing
- Forgejo runner upgraded to **v6.4.0** with `userns-remap=default`.
Buildah needs `--userns=host` to undo the remap inside the job; added
to every `bluebuild build` invocation.
- Custom build image **`veilor-build:43`** (fedora:43 + nodejs +
buildah deps). Replaces the upstream BlueBuild image, which lacked
Forgejo-runner-friendly tooling.
- Workflow now **`runs-on: nullstone`** (single self-hosted runner,
no nested docker).
- Build timeout bumped **60 min → 360 min** to absorb first-time
secureblue base pulls on a cold runner.
#### Signing + registry auth
- **cosign v2.4.1** installed from upstream binary (no Fedora RPM yet
for v2.4.x).
- **GHCR PAT login** added so the BlueBuild step can pull
`ghcr.io/secureblue/kinoite-main-hardened` (anonymous pulls are
rate-limited).
- **cosign keypair signing** — keyless OIDC fails on Forgejo (no
Sigstore Fulcio integration), so we ship a static keypair under
the repo and sign with `cosign sign --key`. Public key checked in
for verification.
#### BlueBuild recipe pivots
- Base image switched to **`ghcr.io/secureblue/kinoite-main-hardened`**
(the actual published image). Prior reference to
`securecore-kinoite-hardened-userns` was a planning-phase guess and
did not exist.
- Module type pivots driven by buildah-privileged + bind-mounted helper
  scripts hitting chmod permission blockers:
  - `type: files` → **`type: copy`** (the files module's chmod step
    failed under bind-mount).
  - `type: script` + `type: systemd` → **`type: containerfile` RUN**
    (single layer, no helper-script bind-mount).
#### Installer ISO — pivoted
- **livemedia-creator → bootc-image-builder.** livemedia-creator does
not support the `ostreecontainer` install method (only
`ostreesetup`/`url`/`nfs`), so the v0.7 path required the swap.
Build pending OCI artifact.
#### Docs
- This CHANGELOG entry.
- ROADMAP refresh — v0.5.0 marked done, v0.7 OCI marked in-flight,
installer-iso pivot recorded, USB install-log persistence default-on
promise documented, v1.0 ship criteria carried over.
### Infra (out-of-tree, recorded for traceability)
- **2026-05-08** — Headscale OIDC 403 fixed by adding
`172.20.0.0/24` (docker proxy bridge gateway) to the
`no-guest@file` Traefik middleware allowlist on nullstone.
Unblocks `tag:guest` provisioning for veilor-os clients.
- **All GitHub remotes removed** from veilor-os local clones, six
worktrees, and sibling projects (auth-limbo, minecraft-launcher,
minecraft-server, infra). GH push-mirrors disabled. Forgejo-only
since 2026-05-05.
### Planned (deferred / parking)
- v0.3 polish — Plymouth black theme, SDDM theme, Konsole profile,
wallpaper SVG. Re-enable `init_on_alloc=1 init_on_free=1` post-install
via `veilor-firstboot` so live boot stays fast but installed system
keeps the memory hygiene.
- USBGuard auto-snapshot on first boot.
- veilor-firstboot UX improvements (cleaner banner, better error paths).
---
## [0.5.0] — 2026-05-06
**Tag:** `v0.5.0`, the **final kickstart-path release**.
The hardened-Fedora-43 kickstart line ships. Future work moves to
the v0.7 BlueBuild OCI spike; the kickstart retires at v1.0.
### Added
- First green Forgejo-CI ISO build (~2.7 GB live ISO, EFI + BIOS
bootable). Released as `ci-latest` artifact at
`git.s8n.ru/veilor-org/veilor-os/releases/tag/ci-latest`.
- **gum TUI installer** wrapping Anaconda — single LUKS prompt,
locale locked to `en_US.UTF-8`, admin-password first-boot flow.
- **LUKS2 argon2id + btrfs subvols** install via Anaconda, written
through `/etc/kernel/cmdline` so BLS entries carry the cmdline
veilor needs.
- **3-mode `veilor-power` CLI** (`save | mid | perf`) with AC/battery
udev auto-switching, lifted into the overlay.
- **KDE black theme** + Fira Code system font, branded
`/etc/os-release`, GRUB rebrand, plymouth detail-text boot.
- Hardening: SELinux enforcing, USBGuard default-block, fail2ban +
auditd, firewalld drop zone, NTS chrony, DNS-over-TLS, locked
root.
- Self-hosted **Forgejo CI** on nullstone replaces the GitHub
Actions build pipeline.
### Fixed (delta from v0.2.5 → v0.5.0 — 35+ failure classes)
The full v0.5.x grind is documented per-release in commit messages
(v0.5.21 → v0.5.32). Headline fixes:
- **`--location=none` skipped `CollectKernelArgumentsTask`.** Anaconda
shipped BLS entries with empty cmdline. Fix: write
`/etc/kernel/cmdline` directly + `/etc/default/grub` + grubby +
explicit `kernel-install add`. (v0.5.31)
- **`transaction_progress.py` install scroll** masked real failures
when patched too broadly. Narrowed the patch to only suppress
`Configuring xxx.x86_64`. (v0.5.28 → v0.5.29)
- **Locale dialog raced anaconda startup.** Lock to en_US.UTF-8,
defer locale choice to `veilor-postinstall` (v0.7 scope). (v0.5.28)
- **`fbcon=nodefer`** + GRUB rebrand + ASCII gum cursor make the
install flow legible on linux fbcon. (v0.5.27)
- **`rd.luks.uuid`** injected via `grubby --update-kernel=ALL` in
chroot `%post` — earlier releases relied on Anaconda which silently
dropped it. (v0.5.23, v0.5.27)
- **9-agent research wave** identified the v0.5.32 blocker map; 7
blockers shipped in one bundle.
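
As a rough illustration of the `/etc/kernel/cmdline` part of the v0.5.31 fix listed above, the sketch below writes the kernel arguments as a single line under a given root. `write_kernel_cmdline`, its root argument, and the placeholder UUID in the usage note are hypothetical names chosen for this example. The `/etc/default/grub`, `grubby`, and `kernel-install add` steps are not shown.

```sh
# Hypothetical helper: write kernel arguments to etc/kernel/cmdline under
# the given root, where kernel-install looks for them when it generates
# BLS entries.
write_kernel_cmdline() {
    root=$1
    shift
    mkdir -p "$root/etc/kernel" || return 1
    # The file is one line of space-separated arguments.
    printf '%s\n' "$*" > "$root/etc/kernel/cmdline"
}
```

Inside a chroot `%post` the call might look like `write_kernel_cmdline / "rd.luks.uuid=luks-<uuid>" lockdown=integrity`, with the real UUID substituted.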
### Notes
- Treat v0.5.0 as the **portfolio anchor** for the kickstart path.
v0.5.32-rc was the last test-run; v0.5.0 was tagged on
2026-05-06 as the freeze point.
- v0.6 was **cancelled** the same day (folded into v0.7). See
`docs/ROADMAP.md` strategy-pivot section.
---
## [0.2.5] — 2026-05-01
**Commit:** `8515bdb`
### Fixed
- **Live boot took 5+ minutes on KVM.** Dracut sat at the parse-livenet
stage for what looked like a hang. Root cause: `init_on_alloc=1`
and `init_on_free=1` zero every memory page on allocation and free.
In a virtualised guest with paravirtual memory, this multiplied the
early-boot cost by ~5x. Removed both flags from the *live* kernel
cmdline.
### Notes
- The two memory-hygiene flags will be re-added on the **installed**
system via `veilor-firstboot` in v0.3. The cost on bare metal is
negligible; the live ISO is the only place the penalty bites.
- Live cmdline retained: `lockdown=integrity slab_nomerge
randomize_kstack_offset=on vsyscall=none`.
---
## [0.2.4] — 2026-05-01
**Commit:** `a23ce63`
### Fixed
- **VM booted but stalled at dracut "parse-livenet" looking for a label
that never matched.** Root cause: an upstream bug in
`livecd-tools`, where `imgcreate/live.py::__get_efi_image_stanza()` writes
the EFI grub stanza as `root=live:LABEL=...` for dracut. Dracut on
live ISOs expects `live:CDLABEL=...` for ISO9660 volume labels;
`LABEL=` matches partition labels which a live ISO doesn't have.
- Patched `live.py` in-place inside the CI build container before
invoking `livecd-creator`. With the patched stanza, the VM booted
cleanly to the SDDM login prompt.
### Changed
- CI workflow now `sed`s the patch into the installed `live.py` and
asserts the patch landed before continuing the build.
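
The patch-then-assert step could look roughly like the sketch below. The function name `patch_live_label` and the exact `sed` expression are assumptions for illustration. The real workflow may match a different string inside `live.py`.

```sh
# Hypothetical helper: rewrite LABEL= to CDLABEL= in the live-root argument
# of the given file, then fail loudly if the rewrite did not take effect.
patch_live_label() {
    target=$1
    # Only touch the dracut live-root argument, not other LABEL= uses.
    sed -i 's/root=live:LABEL=/root=live:CDLABEL=/g' "$target" || return 1
    # Assert 1: no unpatched occurrence is left behind.
    if grep -q 'root=live:LABEL=' "$target"; then
        echo "patch_live_label: unpatched stanza remains in $target" >&2
        return 1
    fi
    # Assert 2: the patched form is actually present, so a silent no-op
    # (for example after an upstream rename) also stops the build.
    if ! grep -q 'root=live:CDLABEL=' "$target"; then
        echo "patch_live_label: expected stanza not found in $target" >&2
        return 1
    fi
}
```

The second check is the one that makes the gate useful: if upstream changes the string, `sed` exits successfully without changing anything, and only the assertion notices.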
### Notes
- Bug also affects `livemedia-creator --make-iso --no-virt` and any
other consumer of `imgcreate.LiveImageCreator`. Worth filing
upstream once we have a clean repro recipe.
---
## [0.2.3] — 2026-05-01
**Commit:** `ef54a24`
### Added
- Manual `useradd admin` invocation in chroot `%post`. `livecd-creator`
does not run an installer phase, so the kickstart `user` directive
is silently ignored. Without this, the booted live system has no
admin account at all, and SDDM falls back to "no users" — login
impossible.
### Fixed
- **`/etc/os-release` was still pointing at stock Fedora.** Even with
the overlay tree successfully copied, `kde-theme-apply.sh` was
resolving `/etc/os-release.d/veilor` from the wrong path (the build
host's repo, not the overlay's installed location).
- Rewired the symlink chain cleanly: `/etc/os-release →
../usr/lib/os-release`, with the override file written to
`/usr/lib/os-release` directly during `%post`.
- Branding now reflects veilor-os in `/etc/os-release`,
`hostnamectl`, and the SDDM session menu.
### Notes
- The `user --name=admin` directive stays in the kickstart for
documentation and for any future `livemedia-creator`-based
installer ISO that *does* honour it.
---
## [0.2.2] — 2026-05-01
**Commit:** `3408841`
### Fixed
- **Overlay was partially copied — boot worked but veilor-power, KDE
theme, custom scripts were all missing.** Found via offline debugfs
inspection of the v0.2.1 rootfs: tuned profiles, sshd hardening,
sudoers entries, and systemd units were present, but
`/usr/share/veilor-os/{assets,scripts}` was empty.
- Root cause: `%post --nochroot` ran with `set -eu`. When the first
`cp` of a non-essential overlay file returned non-zero, the script
aborted, leaving the assets/scripts copy step un-executed. None of
the chroot `%post` scripts could then find what they needed and they
silently no-op'd.
### Changed
- `%post --nochroot` now uses `set +e` around `cp`/`mkdir` so a
partial-permissions error on one tree doesn't kill the whole copy.
- Added `/var/log/veilor-nochroot.log`: every action in
`%post --nochroot` is now logged there with a timestamp, so future
debugging starts with that file rather than offline rootfs inspection.
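
A minimal sketch of the tolerant copy with timestamped logging, assuming one `cp -a` per top-level overlay tree. The names `log`, `copy_overlay`, `src`, and `dest` are illustrative and not taken from the kickstart.

```sh
# Hypothetical sketch of a tolerant overlay copy. The function and variable
# names are illustrative; only the log path comes from the changelog.
LOG=${LOG:-/var/log/veilor-nochroot.log}

log() {
    # Prefix every message with a UTC timestamp.
    printf '%s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$*" >> "$LOG"
}

copy_overlay() {
    src=$1
    dest=$2
    failures=0
    set +e                      # tolerate per-tree failures from here on
    mkdir -p "$dest"
    for tree in "$src"/*; do
        [ -e "$tree" ] || continue
        # One cp per tree, so a failure in one cannot skip the others.
        if cp -a "$tree" "$dest"/ 2>> "$LOG"; then
            log "copied $tree"
        else
            log "FAILED $tree (continuing)"
            failures=$((failures + 1))
        fi
    done
    set -e                      # restore strict mode for the steps that follow
    log "overlay copy finished with $failures failure(s)"
}
```

The function returns success even when a tree fails, so the failure count in the log is the only record of a partial copy. A stricter variant could return non-zero after the loop completes.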
### Notes
- The looser error handling is intentional but bounded — only the
overlay copy uses `set +e`. Hardening scripts that follow run with
strict mode.
---
## [0.2.1] — 2026-05-01
**Commit:** `9c6136f`
### Fixed
- **ISO booted, but it was effectively bare Fedora KDE.** No
hardening, no theme, no `veilor-power`, no `/etc/os-release`
override. Confirmed by mounting v0.2.0 with debugfs:
`/etc/os-release` symlinked to `../usr/lib/os-release` (Fedora's
default), no `/usr/share/veilor-os`, no overlay files anywhere.
- Root cause: `%post --nochroot` hardcoded `/mnt/sysimage` as the
destination. `/mnt/sysimage` is the **livemedia-creator** install
root. We had switched the build pipeline to **livecd-creator**,
which exposes the destination as `$INSTALL_ROOT` — a different path
inside its tmpfs sandbox.
- Switched the copy target to `$INSTALL_ROOT`.
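
One way to make the destination explicit is sketched below. The helper name `resolve_install_root` and the `/mnt/sysimage` fallback are assumptions added for illustration. The fix recorded above only switches the target to `$INSTALL_ROOT`.

```sh
# Hypothetical helper: pick the install root for the overlay copy.
# Prefers livecd-creator's $INSTALL_ROOT. The /mnt/sysimage fallback is an
# assumption for livemedia-creator builds, not something the fix includes.
resolve_install_root() {
    root=${INSTALL_ROOT:-/mnt/sysimage}
    # Refuse to continue if the chosen root does not exist, so a wrong
    # path fails the build instead of silently skipping the overlay.
    if [ ! -d "$root" ]; then
        echo "resolve_install_root: $root is not a directory" >&2
        return 1
    fi
    printf '%s\n' "$root"
}
```

Failing when the directory is missing addresses the v0.2.0 symptom directly: a wrong destination stops the build instead of producing an ISO without the overlay.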
### Notes
- Partial overlay landed in v0.2.1 (tuned, sshd, sddm.conf) — but
`/usr/share/veilor-os/{assets,scripts}` was still missing because
`set -eu` aborted partway through the cp tree. That fix is in v0.2.2.
- Lesson learned: tooling-specific environment variables matter.
`$INSTALL_ROOT` is the portable answer; `/mnt/sysimage` is a
livemedia-creator-only convention.
---
## [0.2.0] — 2026-04-30
**Commit:** `7c4a94d` (tagged release)
### Added
- First green ISO. Reproducible build pipeline lands.
- GitHub Actions workflow `build-iso.yml` produces a UEFI+BIOS-bootable
live ISO from `kickstart/veilor-os.ks`.
- CI: kickstart syntax linting (`ksvalidator`) gate.
- Kickstart based on Fedora 43, KDE Plasma minimal, hardening
packages selected (`fail2ban`, `usbguard`, `tuned`, `audit`,
`firewalld`).
- Overlay tree authored: tuned profiles, sshd hardening, sysctl
drop-in, sudoers, udev rules, KDE theme assets, Fira Code font.
- 3-mode power profiles: `veilor-power save | mid | perf` with
AC/battery udev auto-switching.
### Notes — known limitations of v0.2.0
- **The overlay never actually applied to the installed system.**
The `%post --nochroot` copy step targeted `/mnt/sysimage`
(livemedia-creator's install root) but the build pipeline had moved
to livecd-creator, which uses `$INSTALL_ROOT`. Result: the ISO
*boots* and presents a working KDE Plasma desktop, but it is in
practice **stock Fedora 43 KDE** with no veilor-os hardening,
branding, theme, or power scripts applied.
- v0.2.0 is best understood as a **build-pipeline milestone** — the
ISO format, EFI/BIOS bootability, partitioning, and squashfs build
all work end-to-end. The userspace customisation layer was wired
but not delivering. Treat v0.2.0 as proof-of-build, not as a
feature-complete release.
- See **v0.2.5** for the first feature-complete ISO that actually
ships veilor-os hardening and branding into the running system.
### Build pipeline path to green
For posterity, the issues resolved between v0.1 (scaffold) and v0.2.0
(first green ISO):
- pcre2 / selinux-policy version skew on stock Fedora 43 base —
worked around with a pinned `fix-repo` for the local build only;
CI uses `dnf upgrade --refresh` to sidestep entirely.
- KDE Plasma hard-deps (cups, geoclue2, ModemManager, PackageKit) —
kept at the package level, masked at the daemon level.
- `%post --nochroot` source path — multi-path detection added so the
overlay can be sourced from `/work` (CI) or `/run/install/repo`
(virt) or kickstart-relative (no-virt).
- `livemedia-creator --make-iso --no-virt` produced a squashfs but
no EFI/BOOT image. Switched to `livecd-creator` (`livecd-tools`)
which is purpose-built for live ISOs and handles EFI grafting.
- Tmpdir on `/tmp` exhausted the GitHub Actions tmpfs cap (16 GB
vs ~30 GB working set). Moved to `/var/lmc` on the runner's host
ext4.
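
The multi-path source detection mentioned in the list above can be sketched as a first-match search. `find_overlay_source` and the `overlay/` marker directory are assumptions for illustration. The candidate paths are passed in by the caller.

```sh
# Hypothetical helper: print the first candidate directory that contains an
# overlay/ tree. The overlay/ marker is an assumption about the repo layout.
find_overlay_source() {
    for candidate in "$@"; do
        if [ -d "$candidate/overlay" ]; then
            printf '%s\n' "$candidate/overlay"
            return 0
        fi
    done
    # No candidate matched: report every path that was tried.
    echo "find_overlay_source: no overlay tree in: $*" >&2
    return 1
}
```

A call such as `find_overlay_source /work /run/install/repo "$ks_dir"` would try the CI path first, then the virt path, then a kickstart-relative directory. `ks_dir` here is a placeholder for however the kickstart location is determined.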
---
## [0.1.0] — 2026-04-29
**Commit:** `1822005`
### Added
- Initial repo scaffold: `kickstart/`, `build/`, `overlay/`, `scripts/`,
`assets/`, `docs/`, `test/`.
- Kickstart skeleton (Fedora 43 KDE base, single-prompt LUKS install,
hardened bootloader cmdline, locked root, blank-password admin with
`chage -d 0` to force first-boot reset).
- Hardening scripts ported and rebranded from operator's reference
system: base hardening, kernel hardening, custom SELinux policy
module `veilor-systemd`.
- KDE theme: BreezeBlackPure base + grey accent (`#686B6F`).
- Fira Code chosen as system font (Fedora `fira-code-fonts`,
SIL OFL 1.1).
- Test harness: VM runner (`test/run-vm.sh`) with QEMU + OVMF for
fast iteration, with `SECBOOT=1` and `FRESH=1` modes.
- Documentation: `BUILD.md`, `INSTALL.md`, `HARDENING.md`,
`POWER.md`, `boot-checklist.md`.
### Notes
- v0.1 was scaffold-only — no green ISO yet. Build pipeline iterated
through ~22 distinct toolchain issues before producing v0.2.0.
- All `onyx` references stripped from shipped artifacts; comments
refer to "reference system" only.