backup: phase 1 + phase 2 scripts; daily script repaired and deployed

Repairs the orphaned synapse-signing-key block at scripts/backup.sh
lines 119-122 that was exiting the script under set -e before the
Minecraft block could run, leaving 5 of the last 7 days without a
world backup and zero usable snapshots after 7-day retention.

Phase 1 (deployed today to /opt/docker/backup.sh on nullstone):
- Repaired script — orphan block removed, MC arm wrapped so failures
  in one tar don't kill the run
- tar exit code 1 ("file changed as we read it") now treated as
  success on the live MC world; spark profiler tmp file noise
  silenced via --ignore-failed-read --warning=no-file-changed
- Plugin DBs (homestead, AuthMe, CoreProtect, LuckPerms) and configs
  now backed up alongside the world
- Sentinel /opt/backups/.last-success stamped only when the world
  arm succeeds — gives outside monitors a single mtime to alert on
- Manually verified end-to-end: 12G world tarball, 492M plugins,
  279M dbs, 14 config files, sentinel updated. Pre-fix script saved
  at /opt/docker/backup.sh.bak-20260507-pre-phase1.

Phase 2 (scripts in repo, deployment pending operator sudo):
- scripts/restic-backup-playerdata.sh — Class A 5-min restic snapshots
  of playerdata/, stats/, advancements/, plugin DBs, LuckPerms;
  rcon save-all flush before snapshot; tag-scoped retention
- scripts/restic-init.sh — one-time bootstrap (root-only) for
  /etc/mc-backup.{env,pw} + repo init at /home/user/restic/
- scripts/systemd/mc-backup-playerdata.{service,timer} — 5-min timer
  with hardening (ProtectSystem=strict, ReadOnlyPaths, etc)
- docs/RUNBOOK-BACKUP-RESTORE.md updated with both phases'
  deployment steps and the operator-action checklist

Off-host mirror to onyx (Phase 4) and class B/C/D world snapshots
(Phase 3) are still TODO — see BACKUP-STRATEGY.md §11 phase plan.
s8n 2026-05-07 18:29:30 +01:00
parent 96702116ee
commit 4c16cebb2b
6 changed files with 603 additions and 60 deletions


@ -2,7 +2,7 @@
Strategy doc: [`../BACKUP-STRATEGY.md`](../BACKUP-STRATEGY.md). This runbook is the **operator-facing** procedure for the three scenarios that come up in practice. Keep it short, copy-paste-able, and reachable from the player support workflow.
> **Status (2026-05-07):** This runbook is written **ahead** of the implementation it describes. The `mc-backup-frequent` timer and onyx mirror are NOT yet deployed. The "What if no snapshot exists yet?" section at the bottom covers today's reality.
> **Status (2026-05-07):** Phase 1 (the daily `/opt/docker/backup.sh` MC world tarball) is **deployed and verified** — see "Phase 1 deployment" section near the bottom. Phase 2 (`mc-backup-playerdata.timer`, 5-min cadence) and the onyx off-host mirror are NOT yet deployed; deployment steps in "Phase 2 deployment" below. Until Phase 2 lands, the daily 02:00 tarball is the only safety net (RPO up to 24h).
---
@ -142,11 +142,80 @@ Until phases 1-4 of `BACKUP-STRATEGY.md` are deployed, the only recovery resou
---
## Phase 1 deployment — DONE 2026-05-07
The daily fallback (`/opt/docker/backup.sh`) was repaired and redeployed. It now backs up MC world (~12 G compressed), plugins (~490 M), plugin DBs (~280 M), and configs nightly at 02:00, prunes after 7 days, and writes a sentinel `/opt/backups/.last-success` on success.
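Because the world is archived while the server is live, the repaired script accepts tar exit status 1 ("file changed as we read it") and treats only 2 or higher as failure. A minimal sketch of that check follows; `tar_ok` mirrors the helper in `backup.sh`, while `archive_dir` and its arguments are hypothetical names used only for illustration.

```bash
#!/usr/bin/env bash
# Sketch only: archive_dir is a hypothetical wrapper, not part of backup.sh.

# GNU tar exit codes: 0 = clean, 1 = some files changed while being read,
# 2 = fatal error.
tar_ok() { [ "$1" -le 1 ]; }

# archive_dir OUT SRC: tar the contents of SRC into OUT, tolerating status 1.
archive_dir() {
  local out=$1 src=$2 rc=0
  # "|| rc=$?" captures the status without tripping `set -e` in a caller.
  tar -czf "$out" -C "$src" . >/dev/null 2>&1 || rc=$?
  tar_ok "$rc"
}

# Usage (paths are placeholders):
#   if archive_dir /tmp/plugins.tar.gz /opt/docker/minecraft/plugins; then
#     echo "OK"
#   else
#     echo "FAILED"
#   fi
```

Capturing the status into a variable, rather than chaining with `&&`, is what lets the caller tell "changed during read" apart from a fatal error.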
External monitor (cron on onyx) — the simplest dead-man's switch until ntfy lands:
```bash
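# Add to onyx crontab, e.g. every 30 min. A crontab entry must stay on ONE
# line (cron has no backslash continuation). mail runs only when the check
# fails, which includes the case where nullstone is unreachable over ssh.
*/30 * * * * ssh user@192.168.0.100 'find /opt/backups/.last-success -mmin -1500 | grep -q .' || echo "ALERT: nullstone MC backup sentinel stale (>25h) or host unreachable" | mail -s "MC backup stale" you@example.com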
```
(swap `mail` for `notify-send`, `ntfy publish`, etc once those are wired)
A copy of the pre-fix script is preserved at `/opt/docker/backup.sh.bak-20260507-pre-phase1` for forensic reference.
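The same freshness test can also be run by hand on nullstone, or reused by a later alert hook. The sketch below is illustrative only: the helper name `sentinel_is_fresh` and its threshold argument are assumptions, and it relies on `find -mmin` in the same way as the cron line.

```bash
#!/usr/bin/env bash
# Sketch only: sentinel_is_fresh is a hypothetical helper, not shipped in the repo.

# sentinel_is_fresh FILE MAX_MIN
# Returns 0 if FILE exists and was modified less than MAX_MIN minutes ago.
sentinel_is_fresh() {
  local file=$1 max_min=$2
  [ -e "$file" ] || return 1
  # find prints the path only when the mtime is newer than MAX_MIN minutes.
  [ -n "$(find "$file" -maxdepth 0 -mmin "-${max_min}" 2>/dev/null)" ]
}

# Usage (1500 min = 25 h, matching the daily 02:00 run plus slack):
#   sentinel_is_fresh /opt/backups/.last-success 1500 \
#     || echo "ALERT: nullstone MC backup sentinel stale (>25h)"
```

A missing sentinel is reported as stale, so a host that has never completed a world backup alerts in the same way as one that stopped.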
---
## Phase 2 deployment — restic playerdata snapshots every 5 min
Implementation is in this repo:
- `scripts/restic-backup-playerdata.sh` — the per-run script
- `scripts/restic-init.sh` — one-time bootstrap (must run as root)
- `scripts/systemd/mc-backup-playerdata.{service,timer}` — 5-min cadence
- Strategy + retention + threat model in `BACKUP-STRATEGY.md`
**Deployment status (2026-05-07): NOT YET DEPLOYED — operator action required.** `restic` is not on nullstone; installing it needs sudo, and `user`'s sudo is password-locked. Operator runs:
```bash
# On nullstone, as root (sudo -i or via console)
apt-get update && apt-get install -y restic mcrcon
cd /opt/docker
git -C /home/user/repos/minecraft-server pull \
|| git clone ssh://git@192.168.0.100:222/s8n/minecraft-server.git /home/user/repos/minecraft-server
cd /home/user/repos/minecraft-server
# 1) Bootstrap repos + env file
sudo bash scripts/restic-init.sh
# 2) Install systemd units + run script
sudo install -m 644 scripts/systemd/mc-backup-playerdata.service /etc/systemd/system/
sudo install -m 644 scripts/systemd/mc-backup-playerdata.timer /etc/systemd/system/
sudo install -m 755 scripts/restic-backup-playerdata.sh /usr/local/bin/
# 3) Enable + start
sudo systemctl daemon-reload
sudo systemctl enable --now mc-backup-playerdata.timer
# 4) Verify
systemctl list-timers mc-backup-playerdata.timer
journalctl -u mc-backup-playerdata.service -n 50 --no-pager
ls -la /home/user/restic/mc-frequent/
restic -r /home/user/restic/mc-frequent --password-file /etc/mc-backup.pw snapshots
```
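If nullstone has been up for more than 2 min when the timer is enabled, `OnBootSec=2min` has already elapsed and the first run fires right away; otherwise it fires 2 min after boot. Later runs follow at the 5-min cadence (`OnUnitActiveSec=5min`).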
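Once the timer has run, the Phase 2 script also writes a `key=value` sentinel (by default `/var/lib/mc-backup/last-success-frequent`, with `last_success`, `class`, and `repo` fields). A small sketch for reading one field from it; `sentinel_field` is a hypothetical helper, not something shipped in the repo.

```bash
#!/usr/bin/env bash
# Sketch only: sentinel_field is a hypothetical helper for key=value sentinels.

# sentinel_field FILE KEY
# Prints the value of the first "KEY=value" line; returns non-zero if KEY
# is absent or FILE cannot be read.
sentinel_field() {
  local file=$1 key=$2
  awk -F= -v k="$key" '
    $1 == k { sub(/^[^=]*=/, ""); print; found = 1; exit }
    END     { exit !found }
  ' "$file"
}

# Usage:
#   sentinel_field /var/lib/mc-backup/last-success-frequent last_success
```

Stripping only up to the first `=` keeps values that themselves contain `=` intact.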
### Off-host mirror to onyx (Phase 4 — separate)
After Phase 2 is running cleanly for ~24h, provision `mc-backup` user on onyx with chrooted SFTP, then add a nightly `restic copy` job from nullstone. See `BACKUP-STRATEGY.md` §6 for the SFTP chroot config and §11 phase plan.
Until then, the local nullstone repo is single-host — survives operator error and bad config edits, **not** disk failure. The Phase 1 daily tarball in `/opt/backups/` is the only redundancy until §6 lands.
---
## TODO — open items (links into BACKUP-STRATEGY.md §11)
- [ ] Phase 1: fix `/opt/docker/backup.sh` orphan-line bug (F-backup-1).
- [ ] Phase 2: deploy `mc-backup-frequent.timer` (Class A, 5-min playerdata).
- [ ] Phase 3: deploy `mc-backup-world.timer` (Class B/C/D, hourly).
- [x] Phase 1: fix `/opt/docker/backup.sh` orphan-line bug (F-backup-1). **Done 2026-05-07.**
- [ ] Phase 2: deploy `mc-backup-playerdata.timer` (Class A, 5-min). Scripts in repo; **blocked on operator running `apt install restic` + `restic-init.sh` with sudo**.
- [ ] Phase 3: deploy `mc-backup-world.timer` (Class B/C/D, hourly). Script not yet drafted; will mirror playerdata script.
- [ ] Phase 4: provision `mc-backup` user on onyx + `restic copy` job.
- [ ] Phase 5: schedule monthly drill calendar entry, run first drill.
- [ ] Phase 6: ntfy / Matrix alert wiring (depends on ntfy deployment).
@ -154,3 +223,4 @@ Until phases 1-4 of `BACKUP-STRATEGY.md` are deployed, the only recovery resou
- [ ] Verify `usercache.json` on this host: confirm UUID lookup workflow above resolves to the right `.dat`.
- [ ] Decide: `mcrcon` package vs lightweight Python `mcrcon` lib.
- [ ] Document compensation policy for unrecoverable losses (operator discretion right now).
- [ ] Drop dead `matrix-postgres` + `mongodb` + `synapse-*` blocks from `/opt/docker/backup.sh` once retirement is complete (currently they no-op-skip — minor noise in log only).


@ -1,16 +1,38 @@
#!/usr/bin/env bash
# /opt/docker/backup.sh
# Backs up all Docker service databases and named volumes to /opt/backups/
# Run as root via cron. Keeps 7 daily backups.
#
# Daily backup of all Docker service databases, named volumes, and the
# Minecraft world to /opt/backups/. Runs as root via cron at 02:00 with
# 7-day retention.
#
# Phase 1 of BACKUP-STRATEGY.md ("stop the bleeding") — repairs the
# orphaned synapse-signing-key block that was killing the script under
# `set -e` before the Minecraft section ran. Also adds structured
# logging and a sentinel `.last-success` file so silent failures are
# detectable from outside the script.
#
# A separate Phase 2 (restic playerdata snapshots every 5 min) is
# delivered by scripts/restic-backup-playerdata.sh + the systemd unit
# pair under scripts/systemd/. This file remains the safety net.
set -euo pipefail
umask 077
BACKUP_DIR="/opt/backups"
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
BACKUP_PATH="${BACKUP_DIR}/${TIMESTAMP}"
LOG="${BACKUP_DIR}/backup.log"
SENTINEL="${BACKUP_DIR}/.last-success"
KEEP_DAYS=7
log() { echo "[$(date '+%Y-%m-%d %H:%M:%S')] $*" | tee -a "$LOG"; }
# Track whether each backup arm succeeded so we can honour the
# sentinel contract: only stamp .last-success if the *world* (the
# critical T1 case) was captured. Other arms can fail without
# blocking the sentinel — they have their own logged FAILED lines.
MC_WORLD_OK=0
log() {
printf '[%s] %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*" | tee -a "$LOG"
}
mkdir -p "$BACKUP_PATH"
log "=== Backup started: ${TIMESTAMP} ==="
@ -18,10 +40,12 @@ log "=== Backup started: ${TIMESTAMP} ==="
# ── Matrix PostgreSQL ──────────────────────────────────────────────
log "Dumping Matrix PostgreSQL..."
if docker ps --format '{{.Names}}' | grep -q '^matrix-postgres$'; then
docker exec matrix-postgres pg_dump -U synapse synapse \
| gzip > "${BACKUP_PATH}/matrix-postgres-${TIMESTAMP}.sql.gz" \
&& log " Matrix Postgres: OK ($(du -sh "${BACKUP_PATH}/matrix-postgres-${TIMESTAMP}.sql.gz" | cut -f1))" \
|| log " Matrix Postgres: FAILED"
if docker exec matrix-postgres pg_dump -U synapse synapse \
| gzip > "${BACKUP_PATH}/matrix-postgres-${TIMESTAMP}.sql.gz"; then
log " Matrix Postgres: OK ($(du -sh "${BACKUP_PATH}/matrix-postgres-${TIMESTAMP}.sql.gz" | cut -f1))"
else
log " Matrix Postgres: FAILED"
fi
else
log " matrix-postgres not running — skipping"
fi
@ -29,14 +53,16 @@ fi
# ── Rocket.Chat MongoDB ────────────────────────────────────────────
log "Dumping Rocket.Chat MongoDB..."
if docker ps --format '{{.Names}}' | grep -q '^mongodb$'; then
docker exec mongodb mongodump \
if docker exec mongodb mongodump \
-u admin -p CHANGE_ME_MONGO_ADMIN_PASSWORD \
--authenticationDatabase admin \
--db rocketchat \
--archive \
| gzip > "${BACKUP_PATH}/rocketchat-mongo-${TIMESTAMP}.archive.gz" \
&& log " MongoDB: OK ($(du -sh "${BACKUP_PATH}/rocketchat-mongo-${TIMESTAMP}.archive.gz" | cut -f1))" \
|| log " MongoDB: FAILED"
| gzip > "${BACKUP_PATH}/rocketchat-mongo-${TIMESTAMP}.archive.gz"; then
log " MongoDB: OK ($(du -sh "${BACKUP_PATH}/rocketchat-mongo-${TIMESTAMP}.archive.gz" | cut -f1))"
else
log " MongoDB: FAILED"
fi
else
log " mongodb not running — skipping"
fi
@ -46,13 +72,15 @@ log "Backing up Docker volumes..."
for VOLUME in synapse-media rocketchat-uploads; do
if docker volume ls --format '{{.Name}}' | grep -q "^matrix_${VOLUME}\|^rocketchat_${VOLUME}\|^${VOLUME}$"; then
ACTUAL_VOL=$(docker volume ls --format '{{.Name}}' | grep "${VOLUME}" | head -1)
docker run --rm \
if docker run --rm \
-v "${ACTUAL_VOL}:/volume:ro" \
-v "${BACKUP_PATH}:/backup" \
alpine \
tar czf "/backup/${VOLUME}-${TIMESTAMP}.tar.gz" -C /volume . \
&& log " Volume ${VOLUME}: OK" \
|| log " Volume ${VOLUME}: FAILED"
tar czf "/backup/${VOLUME}-${TIMESTAMP}.tar.gz" -C /volume . ; then
log " Volume ${VOLUME}: OK"
else
log " Volume ${VOLUME}: FAILED"
fi
else
log " Volume ${VOLUME}: not found — skipping"
fi
@ -60,7 +88,7 @@ done
# ── Config files (bind mounts) ─────────────────────────────────────
log "Backing up config directories..."
tar czf "${BACKUP_PATH}/configs-${TIMESTAMP}.tar.gz" \
if tar czf "${BACKUP_PATH}/configs-${TIMESTAMP}.tar.gz" \
/opt/docker/traefik/traefik.yml \
/opt/docker/traefik/config/ \
/opt/docker/matrix/docker-compose.yml \
@ -68,57 +96,151 @@ tar czf "${BACKUP_PATH}/configs-${TIMESTAMP}.tar.gz" \
/opt/docker/matrix/synapse-config/homeserver.yaml \
/opt/docker/matrix/synapse-config/matrix.example.com.log.config \
/opt/docker/rocketchat/docker-compose.yml \
2>/dev/null && log " Configs: OK" || log " Configs: partial (some files missing)"
2>/dev/null; then
log " Configs: OK"
else
log " Configs: partial (some files missing)"
fi
# IMPORTANT: signing key is sensitive — back up separately with tight perms
# Synapse signing key — sensitive, copy out separately with tight perms.
if [ -f /opt/docker/matrix/synapse-config/matrix.example.com.signing.key ]; then
cp /opt/docker/matrix/synapse-config/matrix.example.com.signing.key \
"${BACKUP_PATH}/synapse-signing-key-${TIMESTAMP}.key"
chmod 600 "${BACKUP_PATH}/synapse-signing-key-${TIMESTAMP}.key"
log " Synapse signing key: backed up (600)"
fi
# ── Minecraft server ───────────────────────────────────────────────
# This is the block that was missing from the deployed copy and
# corrupted by an orphaned synapse-signing-key fragment in the repo
# copy. It lives in a function (mc_backup) that is called from an
# `if` condition, where bash suspends `set -e`, so a failure here does
# NOT exit the whole script and the prune step and sentinel logic
# still run.
log "Backing up Minecraft server..."
if docker ps --format '{{.Names}}' | grep -q '^minecraft-mc$'; then
# Server is running - create consistent world snapshot
docker exec minecraft-mc bash -c \
"cd /data && tar czf /tmp/mc-world-backup-${TIMESTAMP}.tar.gz world/ world_nether/ world_the_end/ 2>/dev/null" && \
docker cp minecraft-mc:/tmp/mc-world-backup-${TIMESTAMP}.tar.gz "${BACKUP_PATH}/" && \
docker exec minecraft-mc rm -f /tmp/mc-world-backup-${TIMESTAMP}.tar.gz && \
log " Minecraft world: OK ($(du -sh "${BACKUP_PATH}/mc-world-backup-${TIMESTAMP}.tar.gz" | cut -f1))" \
|| log " Minecraft world: FAILED"
# Backup configs and plugins
tar czf "${BACKUP_PATH}/minecraft-configs-${TIMESTAMP}.tar.gz" \
/opt/docker/minecraft/server.properties \
/opt/docker/minecraft/purpur.yml \
/opt/docker/minecraft/spigot.yml \
/opt/docker/minecraft/paper-*.yml \
/opt/docker/minecraft/bukkit.yml \
/opt/docker/minecraft/ops.json \
/opt/docker/minecraft/banned-*.json \
/opt/docker/minecraft/eula.txt \
2>/dev/null && \
log " Minecraft configs: OK" \
|| log " Minecraft configs: partial (expected)"
else
# Server is stopped - backup everything directly
tar czf "${BACKUP_PATH}/minecraft-full-backup-${TIMESTAMP}.tar.gz" \
/opt/docker/minecraft/world/ \
/opt/docker/minecraft/world_nether/ \
/opt/docker/minecraft/world_the_end/ \
/opt/docker/minecraft/plugins/ \
/opt/docker/minecraft/server.properties \
/opt/docker/minecraft/purpur.yml \
/opt/docker/minecraft/spigot.yml \
2>/dev/null && \
log " Minecraft (full, offline): OK ($(du -sh "${BACKUP_PATH}/minecraft-full-backup-${TIMESTAMP}.tar.gz" | cut -f1))" \
|| log " Minecraft (offline): partial"
fi
# tar exit codes: 0 = clean, 1 = "some files differed/changed during read"
# (NORMAL on a live MC server — chunks save while we read), 2 = fatal.
# Treat 0 and 1 as success, 2+ as failure.
tar_ok() { local rc=$1; [ "$rc" -le 1 ]; }
"${BACKUP_PATH}/synapse-signing-key-${TIMESTAMP}.key"
chmod 600 "${BACKUP_PATH}/synapse-signing-key-${TIMESTAMP}.key"
log " Synapse signing key: backed up (600)"
mc_backup() {
if docker ps --format '{{.Names}}' | grep -q '^minecraft-mc$'; then
# Server running: flush via rcon if mcrcon is installed, then tar
# inside the container. The copy is near-point-in-time, not atomic:
# chunks can still save while tar reads (hence the tar_ok check).
if command -v mcrcon >/dev/null 2>&1; then
mcrcon -H 127.0.0.1 -P 25575 \
-p "${MC_RCON_PASSWORD:-*redacted*}" \
-w 1 "save-all flush" >/dev/null 2>&1 || true
fi
# World tar — runs inside the container. We ignore tar exit 1
# ("file changed as we read it") because that's expected on a
# live server and the resulting archive is still usable.
local tar_rc=0
docker exec minecraft-mc bash -c \
"cd /data && tar czf /tmp/mc-world-backup-${TIMESTAMP}.tar.gz world/ world_nether/ world_the_end/" \
>/dev/null 2>&1 || tar_rc=$?
if tar_ok "$tar_rc" \
&& docker cp "minecraft-mc:/tmp/mc-world-backup-${TIMESTAMP}.tar.gz" "${BACKUP_PATH}/" >/dev/null 2>&1 \
&& docker exec minecraft-mc rm -f "/tmp/mc-world-backup-${TIMESTAMP}.tar.gz" >/dev/null 2>&1; then
local sz
sz=$(du -sh "${BACKUP_PATH}/mc-world-backup-${TIMESTAMP}.tar.gz" | cut -f1)
if [ "$tar_rc" -eq 1 ]; then
log " Minecraft world: OK (${sz}) [tar exit 1 — files changed during read, expected on live server]"
else
log " Minecraft world: OK (${sz})"
fi
MC_WORLD_OK=1
else
log " Minecraft world: FAILED (tar_rc=${tar_rc})"
# Best-effort cleanup of any half-written file inside the container.
docker exec minecraft-mc rm -f "/tmp/mc-world-backup-${TIMESTAMP}.tar.gz" >/dev/null 2>&1 || true
fi
# Plugins (jars + on-disk config) — small, do this regardless
# of world result so we always have plugin state on hand.
# `--ignore-failed-read` stops unreadable files from failing the
# archive (spark profiler tmp files: running JFR files are briefly
# mode 600); `--warning=no-file-changed` silences the "file changed
# as we read it" noise that the live CoreProtect db triggers.
local prc=0
tar --ignore-failed-read --warning=no-file-changed \
-czf "${BACKUP_PATH}/minecraft-plugins-${TIMESTAMP}.tar.gz" \
-C /opt/docker/minecraft plugins/ >/dev/null 2>&1 || prc=$?
if tar_ok "$prc"; then
log " Minecraft plugins: OK ($(du -sh "${BACKUP_PATH}/minecraft-plugins-${TIMESTAMP}.tar.gz" | cut -f1))"
else
log " Minecraft plugins: FAILED (rc=${prc})"
fi
# Plugin DBs — copied (not dumped, all SQLite/file-based) into
# a tagged tarball so restore is straightforward.
local drc=0
tar --ignore-failed-read --warning=no-file-changed \
-czf "${BACKUP_PATH}/minecraft-dbs-${TIMESTAMP}.tar.gz" \
-C /opt/docker/minecraft \
homestead_data.db \
plugins/AuthMe/authme.db \
plugins/CoreProtect/database.db \
plugins/LuckPerms/ \
>/dev/null 2>&1 || drc=$?
if tar_ok "$drc"; then
log " Minecraft DBs: OK ($(du -sh "${BACKUP_PATH}/minecraft-dbs-${TIMESTAMP}.tar.gz" | cut -f1))"
else
log " Minecraft DBs: partial (rc=${drc} — some files may be missing)"
fi
# Server-side configs and access lists. Some of these files are
# optional (eg whitelist.json absent when whitelisting is off).
# tar reports rc=2 for missing files, so we prefilter the list.
local cfg_files=()
for f in server.properties purpur.yml spigot.yml bukkit.yml \
commands.yml help.yml permissions.yml \
ops.json whitelist.json banned-players.json banned-ips.json \
usercache.json eula.txt docker-compose.yml; do
[ -e "/opt/docker/minecraft/$f" ] && cfg_files+=("$f")
done
local crc=0
tar czf "${BACKUP_PATH}/minecraft-configs-${TIMESTAMP}.tar.gz" \
-C /opt/docker/minecraft "${cfg_files[@]}" \
>/dev/null 2>&1 || crc=$?
if tar_ok "$crc"; then
log " Minecraft configs: OK (${#cfg_files[@]} files)"
else
log " Minecraft configs: FAILED (rc=${crc})"
fi
else
# Server stopped — back up everything from disk directly.
local frc=0
tar czf "${BACKUP_PATH}/minecraft-full-backup-${TIMESTAMP}.tar.gz" \
-C /opt/docker/minecraft \
world/ \
world_nether/ \
world_the_end/ \
plugins/ \
homestead_data.db \
server.properties \
purpur.yml \
spigot.yml \
bukkit.yml \
ops.json \
whitelist.json \
banned-players.json \
banned-ips.json \
usercache.json \
docker-compose.yml \
>/dev/null 2>&1 || frc=$?
if tar_ok "$frc"; then
log " Minecraft (full, offline): OK ($(du -sh "${BACKUP_PATH}/minecraft-full-backup-${TIMESTAMP}.tar.gz" | cut -f1))"
MC_WORLD_OK=1
else
log " Minecraft (offline): partial (rc=${frc})"
fi
fi
}
# Run MC arm — never let it kill the rest of the script.
if ! mc_backup; then
log " Minecraft arm exited non-zero — see lines above"
fi
# ── Prune old backups ──────────────────────────────────────────────
@ -128,3 +250,19 @@ find "$BACKUP_DIR" -maxdepth 1 -name "*.log" -mtime +30 -delete 2>/dev/null || t
BACKUP_SIZE=$(du -sh "$BACKUP_PATH" | cut -f1)
log "=== Backup complete: ${BACKUP_PATH} (${BACKUP_SIZE}) ==="
# ── Sentinel ───────────────────────────────────────────────────────
# Touch the sentinel only if the world (T1 case) was captured. An
# external monitor (cron on onyx, or ntfy/healthchecks once wired)
# can alert on `find /opt/backups/.last-success -mmin +1500` to catch
# silent failures within 25h of a missed daily run.
if [ "$MC_WORLD_OK" -eq 1 ]; then
{
printf 'last_success=%s\n' "$(date -Iseconds)"
printf 'backup_path=%s\n' "$BACKUP_PATH"
printf 'backup_size=%s\n' "$BACKUP_SIZE"
} > "$SENTINEL"
log "Sentinel updated: ${SENTINEL}"
else
log "WARNING: world backup did NOT succeed — sentinel NOT updated"
fi


@ -0,0 +1,135 @@
#!/usr/bin/env bash
# /usr/local/bin/restic-backup-playerdata.sh
#
# Class A backup per docs/BACKUP-STRATEGY.md — every 5 minutes, snapshot
# playerdata + stats + advancements + plugin DBs + LuckPerms config.
# Skips the heavy region/ files (those are Class B, hourly).
#
# Driven by mc-backup-playerdata.timer (5 min cadence).
#
# Pre-req: restic installed; one-time bootstrap performed by
# scripts/restic-init.sh which creates the local repo and writes
# /etc/mc-backup.env + /etc/mc-backup.pw.
#
# Status (2026-05-07): scripts shipped to repo; deployment to nullstone
# is BLOCKED on operator running `apt install restic` + scripts/restic-init.sh
# under sudo. See docs/RUNBOOK-BACKUP-RESTORE.md "Phase 2 deployment".
set -euo pipefail
umask 077
ENV_FILE="${MC_BACKUP_ENV_FILE:-/etc/mc-backup.env}"
if [ ! -r "$ENV_FILE" ]; then
echo "FATAL: env file $ENV_FILE not readable — run scripts/restic-init.sh first" >&2
exit 2
fi
# shellcheck disable=SC1090
. "$ENV_FILE"
: "${RESTIC_REPOSITORY_FREQUENT:?RESTIC_REPOSITORY_FREQUENT not set in $ENV_FILE}"
: "${RESTIC_PASSWORD_FILE:?RESTIC_PASSWORD_FILE not set in $ENV_FILE}"
: "${MC_DATA:?MC_DATA not set in $ENV_FILE}"
export RESTIC_REPOSITORY="$RESTIC_REPOSITORY_FREQUENT"
export RESTIC_PASSWORD_FILE
LOG="${MC_BACKUP_LOG:-/var/log/mc-backup.log}"
SENTINEL="${MC_BACKUP_FREQUENT_SENTINEL:-/var/lib/mc-backup/last-success-frequent}"
RCON_HOST="${MC_RCON_HOST:-127.0.0.1}"
RCON_PORT="${MC_RCON_PORT:-25575}"
RCON_PASS="${MC_RCON_PASSWORD:-}"
mkdir -p "$(dirname "$SENTINEL")"
log() {
printf '[%s] [frequent] %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*" \
| tee -a "$LOG"
}
on_err() {
local rc=$?
log "ERROR rc=${rc} at line ${BASH_LINENO[0]}"
if [ -n "${ALERT_URL:-}" ]; then
curl -fsS -m 5 -d "mc-backup-frequent FAILED rc=${rc}" "$ALERT_URL" \
>/dev/null 2>&1 || true
fi
exit "$rc"
}
trap on_err ERR
log "=== run start (host=$(hostname)) ==="
# 1. Best-effort: ask the server to flush before snapshotting.
# Don't fail the backup if rcon is down or unreachable — we'd rather
# have a slightly-stale snapshot than no snapshot.
if [ -n "$RCON_PASS" ] && command -v mcrcon >/dev/null 2>&1; then
if mcrcon -H "$RCON_HOST" -P "$RCON_PORT" -p "$RCON_PASS" -w 1 \
"save-all flush" >/dev/null 2>&1; then
log "rcon save-all flush: ok"
else
log "rcon save-all flush: failed (continuing)"
fi
else
log "rcon: skipped (no mcrcon or no password)"
fi
# 2. Build the include list. Several paths are optional, so the loop
# below keeps only those that exist on disk; restic is never handed a
# missing path.
INCLUDES=(
"${MC_DATA}/world/playerdata"
"${MC_DATA}/world/stats"
"${MC_DATA}/world/advancements"
"${MC_DATA}/world/level.dat"
"${MC_DATA}/world_nether/level.dat"
"${MC_DATA}/world_the_end/level.dat"
"${MC_DATA}/homestead_data.db"
"${MC_DATA}/plugins/AuthMe"
"${MC_DATA}/plugins/CoreProtect/database.db"
"${MC_DATA}/plugins/LuckPerms"
)
EXISTING=()
for p in "${INCLUDES[@]}"; do
if [ -e "$p" ]; then
EXISTING+=("$p")
fi
done
if [ ${#EXISTING[@]} -eq 0 ]; then
log "no source paths exist — aborting"
exit 3
fi
# 3. Snapshot. Tagged so retention policy can target this class only.
log "snapshotting ${#EXISTING[@]} path(s)"
restic backup \
--tag playerdata \
--tag auto-5min \
--host "$(hostname)" \
--exclude='*.lock' \
--exclude='*.tmp' \
"${EXISTING[@]}" \
>> "$LOG" 2>&1
# 4. Light retention — only on this repo, only on this tag.
restic forget \
--tag auto-5min \
--keep-last 24 \
--keep-hourly 24 \
--keep-daily 7 \
--prune \
--quiet \
>> "$LOG" 2>&1 || log "forget+prune returned non-zero (continuing)"
# 5. Sentinel for external monitor.
{
printf 'last_success=%s\n' "$(date -Iseconds)"
printf 'class=A\n'
printf 'repo=%s\n' "$RESTIC_REPOSITORY"
} > "$SENTINEL"
# 6. Heartbeat (no-op if HEARTBEAT_URL unset).
if [ -n "${HEARTBEAT_URL:-}" ]; then
curl -fsS -m 5 "$HEARTBEAT_URL" >/dev/null 2>&1 || true
fi
log "=== run ok ==="

scripts/restic-init.sh Normal file

@ -0,0 +1,156 @@
#!/usr/bin/env bash
# scripts/restic-init.sh
#
# One-time bootstrap for the Phase 2 restic backup chain. Run this on
# nullstone as root (sudo) AFTER `apt install restic mcrcon`.
#
# What it does:
# 1. Generates /etc/mc-backup.pw (40-byte random restic password) if absent.
# 2. Writes /etc/mc-backup.env (consumed by restic-backup-playerdata.sh).
# 3. Initialises the local restic repo at /home/user/restic/mc-frequent.
# 4. Takes a baseline snapshot so the timer's first run is fast.
# 5. Optionally adds an SFTP-mirror block once onyx is provisioned.
#
# Idempotent — re-running is safe; existing files are preserved.
#
# Cross-ref: docs/BACKUP-STRATEGY.md §8.2, docs/RUNBOOK-BACKUP-RESTORE.md.
set -euo pipefail
umask 077
if [ "$(id -u)" -ne 0 ]; then
echo "FATAL: must run as root (sudo)." >&2
exit 2
fi
if ! command -v restic >/dev/null 2>&1; then
echo "FATAL: restic not installed. Run: apt install restic mcrcon" >&2
exit 3
fi
# Resolve target user — restic repo lives under their home so /opt
# disk pressure doesn't matter. nullstone: 142G free on /home.
TARGET_USER="${TARGET_USER:-user}"
if ! id "$TARGET_USER" >/dev/null 2>&1; then
echo "FATAL: user '$TARGET_USER' not found" >&2
exit 4
fi
TARGET_HOME=$(getent passwd "$TARGET_USER" | cut -d: -f6)
PW_FILE="/etc/mc-backup.pw"
ENV_FILE="/etc/mc-backup.env"
REPO_FREQUENT="${TARGET_HOME}/restic/mc-frequent"
REPO_WORLD="${TARGET_HOME}/restic/mc-world"
LOG_DIR="/var/log"
SENTINEL_DIR="/var/lib/mc-backup"
# 1. Password file
if [ ! -e "$PW_FILE" ]; then
head -c 40 /dev/urandom | base64 > "$PW_FILE"
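# Group-readable by the target user: restic runs as $TARGET_USER (runuser
# below, User= in the systemd unit) and must be able to read this file.
chown root:"$(id -gn "$TARGET_USER")" "$PW_FILE"
chmod 640 "$PW_FILE"
echo "Generated $PW_FILE (40 bytes random, mode 640)."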
else
echo "$PW_FILE already exists — keeping."
fi
# 2. Env file (only created if missing; user can edit afterwards).
if [ ! -e "$ENV_FILE" ]; then
cat > "$ENV_FILE" <<EOF
# /etc/mc-backup.env — consumed by restic-backup-playerdata.sh and the
# Class B/C/D world script (TBD). Edit as needed.
RESTIC_REPOSITORY_FREQUENT=$REPO_FREQUENT
RESTIC_REPOSITORY_WORLD=$REPO_WORLD
RESTIC_PASSWORD_FILE=$PW_FILE
MC_DATA=/opt/docker/minecraft
MC_BACKUP_LOG=$LOG_DIR/mc-backup.log
MC_BACKUP_FREQUENT_SENTINEL=$SENTINEL_DIR/last-success-frequent
MC_BACKUP_WORLD_SENTINEL=$SENTINEL_DIR/last-success-world
# RCON — used to flush MC saves before snapshotting. Pull from the live
# compose file (services.mc.environment.RCON_PASSWORD).
MC_RCON_HOST=127.0.0.1
MC_RCON_PORT=25575
MC_RCON_PASSWORD=*redacted*
# Off-host mirror destination (onyx via Tailscale). Empty = skip mirror.
TS_OFFHOST_USER=mc-backup
TS_OFFHOST_HOST=100.64.0.1
TS_OFFHOST_PATH=/backups/nullstone-mc-restic
# Alerting — fill in once ntfy.s8n.ru is up. Leave blank for now.
HEARTBEAT_URL=
ALERT_URL=
EOF
chown root:"$(id -gn "$TARGET_USER")" "$ENV_FILE"
chmod 640 "$ENV_FILE"
echo "Wrote $ENV_FILE (mode 640, group=$(id -gn "$TARGET_USER"))."
else
echo "$ENV_FILE already exists — keeping."
fi
# 3. Log + sentinel dirs (writable by target user).
mkdir -p "$SENTINEL_DIR"
chown "$TARGET_USER":"$(id -gn "$TARGET_USER")" "$SENTINEL_DIR"
chmod 750 "$SENTINEL_DIR"
touch "$LOG_DIR/mc-backup.log"
chown "$TARGET_USER":adm "$LOG_DIR/mc-backup.log" 2>/dev/null \
|| chown "$TARGET_USER":"$(id -gn "$TARGET_USER")" "$LOG_DIR/mc-backup.log"
chmod 640 "$LOG_DIR/mc-backup.log"
# 4. Repo init (idempotent — restic init exits non-zero if repo exists).
init_repo() {
local repo=$1
install -d -o "$TARGET_USER" -g "$(id -gn "$TARGET_USER")" -m 700 \
"$(dirname "$repo")" "$repo"
if RESTIC_PASSWORD_FILE="$PW_FILE" RESTIC_REPOSITORY="$repo" \
runuser -u "$TARGET_USER" -- restic snapshots >/dev/null 2>&1; then
echo "Repo $repo: already initialised."
else
RESTIC_PASSWORD_FILE="$PW_FILE" RESTIC_REPOSITORY="$repo" \
runuser -u "$TARGET_USER" -- restic init
echo "Repo $repo: initialised."
fi
}
init_repo "$REPO_FREQUENT"
init_repo "$REPO_WORLD"
# 5. Baseline snapshot of the frequent repo so the first timer run is fast.
echo "Taking baseline snapshot into $REPO_FREQUENT ..."
runuser -u "$TARGET_USER" -- env \
RESTIC_PASSWORD_FILE="$PW_FILE" \
RESTIC_REPOSITORY="$REPO_FREQUENT" \
restic backup \
--tag playerdata --tag baseline --host "$(hostname)" \
--exclude='*.lock' --exclude='*.tmp' \
/opt/docker/minecraft/world/playerdata \
/opt/docker/minecraft/world/stats \
/opt/docker/minecraft/world/advancements \
/opt/docker/minecraft/homestead_data.db \
/opt/docker/minecraft/plugins/AuthMe \
/opt/docker/minecraft/plugins/CoreProtect/database.db \
/opt/docker/minecraft/plugins/LuckPerms \
|| echo "Baseline snapshot returned non-zero — review output above."
cat <<'NEXT'
---------------------------------------------------------------
restic-init.sh complete.
Next steps:
1. Install systemd units:
install -m644 scripts/systemd/mc-backup-playerdata.service \
/etc/systemd/system/
install -m644 scripts/systemd/mc-backup-playerdata.timer \
/etc/systemd/system/
install -m755 scripts/restic-backup-playerdata.sh \
/usr/local/bin/
2. systemctl daemon-reload
3. systemctl enable --now mc-backup-playerdata.timer
4. Tail: journalctl -u mc-backup-playerdata.service -f
Onyx (off-host mirror) provisioning is a separate step — see
docs/RUNBOOK-BACKUP-RESTORE.md "Phase 2 deployment".
---------------------------------------------------------------
NEXT


@ -0,0 +1,29 @@
[Unit]
Description=Minecraft frequent backup (Class A — playerdata + DBs, every 5 min)
Documentation=https://git.s8n.ru/s8n/minecraft-server/src/branch/main/BACKUP-STRATEGY.md
After=docker.service
Wants=docker.service
[Service]
Type=oneshot
User=user
Group=user
EnvironmentFile=/etc/mc-backup.env
ExecStart=/usr/local/bin/restic-backup-playerdata.sh
Nice=10
IOSchedulingClass=best-effort
IOSchedulingPriority=7
# Hardening — restic only needs read on /opt/docker/minecraft and
# write under TARGET_HOME/restic + /var/lib/mc-backup + /var/log.
ProtectSystem=strict
ProtectHome=read-only
ReadOnlyPaths=/opt/docker/minecraft
ReadWritePaths=/home/user/restic /var/lib/mc-backup /var/log
PrivateTmp=true
NoNewPrivileges=true
ProtectKernelTunables=true
ProtectKernelModules=true
ProtectControlGroups=true
RestrictSUIDSGID=true
LockPersonality=true


@ -0,0 +1,15 @@
[Unit]
Description=Run mc-backup-playerdata every 5 minutes
Documentation=https://git.s8n.ru/s8n/minecraft-server/src/branch/main/BACKUP-STRATEGY.md
[Timer]
# Stagger after boot so MC and Docker have a chance to settle.
OnBootSec=2min
# 5-minute cadence per BACKUP-STRATEGY.md §2 RPO target for Class A.
OnUnitActiveSec=5min
AccuracySec=30s
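# Persistent= only affects OnCalendar= timers, so it is a no-op with the
# monotonic triggers above; after a reboot OnBootSec= restarts the cycle.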
Persistent=true
[Install]
WantedBy=timers.target