backup: phase 1 + phase 2 scripts; daily script repaired and deployed
Repairs the orphaned synapse-signing-key block at scripts/backup.sh
lines 119-122 that was exiting the script under set -e before the
Minecraft block could run, leaving 5 of the last 7 days without a
world backup and zero usable snapshots after 7-day retention.
Phase 1 (deployed today to /opt/docker/backup.sh on nullstone):
- Repaired script — orphan block removed, MC arm wrapped so failures
in one tar don't kill the run
- tar exit code 1 ("file changed as we read it") now treated as
success on the live MC world; spark profiler tmp file noise
silenced via --ignore-failed-read --warning=no-file-changed
- Plugin DBs (homestead, AuthMe, CoreProtect, LuckPerms) and configs
now backed up alongside the world
- Sentinel /opt/backups/.last-success stamped only when the world
arm succeeds — gives outside monitors a single mtime to alert on
- Manually verified end-to-end: 12G world tarball, 492M plugins,
279M dbs, 14 config files, sentinel updated. Pre-fix script saved
at /opt/docker/backup.sh.bak-20260507-pre-phase1.
Phase 2 (scripts in repo, deployment pending operator sudo):
- scripts/restic-backup-playerdata.sh — Class A 5-min restic snapshots
of playerdata/, stats/, advancements/, plugin DBs, LuckPerms;
rcon save-all flush before snapshot; tag-scoped retention
- scripts/restic-init.sh — one-time bootstrap (root-only) for
/etc/mc-backup.{env,pw} + repo init at /home/user/restic/
- scripts/systemd/mc-backup-playerdata.{service,timer} — 5-min timer
with hardening (ProtectSystem=strict, ReadOnlyPaths, etc)
- docs/RUNBOOK-BACKUP-RESTORE.md updated with both phases'
deployment steps and the operator-action checklist
Off-host mirror to onyx (Phase 4) and class B/C/D world snapshots
(Phase 3) are still TODO — see BACKUP-STRATEGY.md §11 phase plan.
parent 96702116ee
commit 4c16cebb2b
6 changed files with 603 additions and 60 deletions

docs/RUNBOOK-BACKUP-RESTORE.md

@@ -2,7 +2,7 @@

Strategy doc: [`../BACKUP-STRATEGY.md`](../BACKUP-STRATEGY.md). This runbook is the **operator-facing** procedure for the three scenarios that come up in practice. Keep it short, copy-paste-able, and reachable from the player support workflow.

> **Status (2026-05-07):** This runbook is written **ahead** of the implementation it describes. The `mc-backup-frequent` timer and onyx mirror are NOT yet deployed. The "What if no snapshot exists yet?" section at the bottom covers today's reality.

> **Status (2026-05-07):** Phase 1 (the daily `/opt/docker/backup.sh` MC world tarball) is **deployed and verified** — see "Phase 1 deployment" section near the bottom. Phase 2 (`mc-backup-playerdata.timer`, 5-min cadence) and the onyx off-host mirror are NOT yet deployed; deployment steps in "Phase 2 deployment" below. Until Phase 2 lands, the daily 02:00 tarball is the only safety net (RPO up to 24h).

---

@@ -142,11 +142,80 @@ Until phases 1–4 of `BACKUP-STRATEGY.md` are deployed, the only recovery resou

---

## Phase 1 deployment — DONE 2026-05-07

The daily fallback (`/opt/docker/backup.sh`) was repaired and redeployed. It now backs up MC world (~12 G compressed), plugins (~490 M), plugin DBs (~280 M), and configs nightly at 02:00, prunes after 7 days, and writes a sentinel `/opt/backups/.last-success` on success.
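
For a quick manual spot-check of that sentinel, a small helper along these lines works. This is a sketch, not part of the deployed script — `sentinel_status` and its default threshold are illustrative; the only facts it relies on are that `backup.sh` rewrites the sentinel only on world-arm success, so the file's mtime is the last-known-good time:

```bash
#!/usr/bin/env bash
# Sketch only (not in the repo): report the age of a backup.sh-style sentinel.
# backup.sh only rewrites the sentinel when the world arm succeeds, so the
# file's mtime is the last-known-good backup time an operator can check.
sentinel_status() {
  local sentinel=$1 max_age_min=${2:-1500}   # 1500 min = 25h grace
  [ -e "$sentinel" ] || { echo "MISSING"; return 2; }
  local age_min=$(( ( $(date +%s) - $(stat -c %Y "$sentinel") ) / 60 ))
  if [ "$age_min" -gt "$max_age_min" ]; then
    echo "STALE (${age_min} min)"
    return 1
  fi
  echo "OK (${age_min} min)"
}

# Demo against a freshly written fake sentinel:
demo=$(mktemp)
printf 'last_success=%s\n' "$(date -Iseconds)" > "$demo"
sentinel_status "$demo"   # prints "OK (0 min)"
rm -f "$demo"
```
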

External monitor (cron on onyx) — the simplest dead-man's switch until ntfy lands:

```bash
# Add to onyx crontab, e.g. every 30 min — mails only when the check fails
*/30 * * * * ssh user@192.168.0.100 \
    'find /opt/backups/.last-success -mmin -1500 | grep -q .' \
    || echo "ALERT: nullstone MC backup sentinel stale (>25h)" \
    | mail -s "MC backup stale" you@example.com
```

(swap `mail` for `notify-send`, `ntfy publish`, etc. once those are wired)

A copy of the pre-fix script is preserved at `/opt/docker/backup.sh.bak-20260507-pre-phase1` for forensic reference.

---

## Phase 2 deployment — restic playerdata snapshots every 5 min

Implementation is in this repo:

- `scripts/restic-backup-playerdata.sh` — the per-run script
- `scripts/restic-init.sh` — one-time bootstrap (must run as root)
- `scripts/systemd/mc-backup-playerdata.{service,timer}` — 5-min cadence
- Strategy + retention + threat model in `BACKUP-STRATEGY.md`

**Deployment status (2026-05-07): NOT YET DEPLOYED — operator action required.** `restic` is not on nullstone; installing it needs sudo, and `user`'s sudo is password-locked. Operator runs:

```bash
# On nullstone, as root (sudo -i or via console)
apt-get update && apt-get install -y restic mcrcon

git -C /home/user/repos/minecraft-server pull \
  || git clone ssh://git@192.168.0.100:222/s8n/minecraft-server.git /home/user/repos/minecraft-server
cd /home/user/repos/minecraft-server

# 1) Bootstrap repos + env file
sudo bash scripts/restic-init.sh

# 2) Install systemd units + run script
sudo install -m 644 scripts/systemd/mc-backup-playerdata.service /etc/systemd/system/
sudo install -m 644 scripts/systemd/mc-backup-playerdata.timer /etc/systemd/system/
sudo install -m 755 scripts/restic-backup-playerdata.sh /usr/local/bin/

# 3) Enable + start
sudo systemctl daemon-reload
sudo systemctl enable --now mc-backup-playerdata.timer

# 4) Verify
systemctl list-timers mc-backup-playerdata.timer
journalctl -u mc-backup-playerdata.service -n 50 --no-pager
ls -la /home/user/restic/mc-frequent/
restic -r /home/user/restic/mc-frequent --password-file /etc/mc-backup.pw snapshots
```

The first run should appear within ~7 min (`OnBootSec=2min` + 5-min cadence).

### Off-host mirror to onyx (Phase 4 — separate)

After Phase 2 is running cleanly for ~24h, provision `mc-backup` user on onyx with chrooted SFTP, then add a nightly `restic copy` job from nullstone. See `BACKUP-STRATEGY.md` §6 for the SFTP chroot config and §11 phase plan.
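
When that lands, the nightly job can be a single cron entry. A sketch only, nothing here is deployed: the repo and password-file paths come from `restic-init.sh`, the sftp destination from the `TS_OFFHOST_*` values in `/etc/mc-backup.env`, and `--from-repo` / `--from-password-file` are the flags newer restic (≥ 0.14) uses for cross-repo copy — whether both repos share one password file is an assumption:

```shell
# /etc/cron.d/mc-backup-mirror — Phase 4 sketch, NOT yet deployed.
# Nightly: copy any new local snapshots to the chrooted sftp repo on onyx.
# (cron.d lines cannot wrap with backslashes, so this stays on one line)
30 3 * * * user restic -r sftp:mc-backup@100.64.0.1:/backups/nullstone-mc-restic --password-file /etc/mc-backup.pw copy --from-repo /home/user/restic/mc-frequent --from-password-file /etc/mc-backup.pw >> /var/log/mc-backup.log 2>&1
```
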

Until then, the local nullstone repo is single-host — survives operator error and bad config edits, **not** disk failure. The Phase 1 daily tarball in `/opt/backups/` is the only redundancy until §6 lands.

---

## TODO — open items (links into BACKUP-STRATEGY.md §11)

- [ ] Phase 1: fix `/opt/docker/backup.sh` orphan-line bug (F-backup-1).
- [ ] Phase 2: deploy `mc-backup-frequent.timer` (Class A, 5-min playerdata).
- [ ] Phase 3: deploy `mc-backup-world.timer` (Class B/C/D, hourly).
- [x] Phase 1: fix `/opt/docker/backup.sh` orphan-line bug (F-backup-1). **Done 2026-05-07.**
- [ ] Phase 2: deploy `mc-backup-playerdata.timer` (Class A, 5-min). Scripts in repo; **blocked on operator running `apt install restic` + `restic-init.sh` with sudo**.
- [ ] Phase 3: deploy `mc-backup-world.timer` (Class B/C/D, hourly). Script not yet drafted; will mirror playerdata script.
- [ ] Phase 4: provision `mc-backup` user on onyx + `restic copy` job.
- [ ] Phase 5: schedule monthly drill calendar entry, run first drill.
- [ ] Phase 6: ntfy / Matrix alert wiring (depends on ntfy deployment).

@@ -154,3 +223,4 @@ Until phases 1–4 of `BACKUP-STRATEGY.md` are deployed, the only recovery resou
- [ ] Verify `usercache.json` on this host: confirm UUID lookup workflow above resolves to the right `.dat`.
- [ ] Decide: `mcrcon` package vs lightweight Python `mcrcon` lib.
- [ ] Document compensation policy for unrecoverable losses (operator discretion right now).
- [ ] Drop dead `matrix-postgres` + `mongodb` + `synapse-*` blocks from `/opt/docker/backup.sh` once retirement is complete (currently they no-op-skip — minor noise in log only).

scripts/backup.sh

@@ -1,16 +1,38 @@

#!/usr/bin/env bash
# /opt/docker/backup.sh
# Backs up all Docker service databases and named volumes to /opt/backups/
# Run as root via cron. Keeps 7 daily backups.
#
# Daily backup of all Docker service databases, named volumes, and the
# Minecraft world to /opt/backups/. Runs as root via cron at 02:00 with
# 7-day retention.
#
# Phase 1 of BACKUP-STRATEGY.md ("stop the bleeding") — repairs the
# orphaned synapse-signing-key block that was killing the script under
# `set -e` before the Minecraft section ran. Also adds structured
# logging and a sentinel `.last-success` file so silent failures are
# detectable from outside the script.
#
# A separate Phase 2 (restic playerdata snapshots every 5 min) is
# delivered by scripts/restic-backup-playerdata.sh + the systemd unit
# pair under scripts/systemd/. This file remains the safety net.
set -euo pipefail
umask 077

BACKUP_DIR="/opt/backups"
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
BACKUP_PATH="${BACKUP_DIR}/${TIMESTAMP}"
LOG="${BACKUP_DIR}/backup.log"
SENTINEL="${BACKUP_DIR}/.last-success"
KEEP_DAYS=7

log() { echo "[$(date '+%Y-%m-%d %H:%M:%S')] $*" | tee -a "$LOG"; }
# Track whether each backup arm succeeded so we can honour the
# sentinel contract: only stamp .last-success if the *world* (the
# critical T1 case) was captured. Other arms can fail without
# blocking the sentinel — they have their own logged FAILED lines.
MC_WORLD_OK=0

log() {
  printf '[%s] %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*" | tee -a "$LOG"
}

mkdir -p "$BACKUP_PATH"
log "=== Backup started: ${TIMESTAMP} ==="

@@ -18,10 +40,12 @@ log "=== Backup started: ${TIMESTAMP} ==="

# ── Matrix PostgreSQL ──────────────────────────────────────────────
log "Dumping Matrix PostgreSQL..."
if docker ps --format '{{.Names}}' | grep -q '^matrix-postgres$'; then
  docker exec matrix-postgres pg_dump -U synapse synapse \
    | gzip > "${BACKUP_PATH}/matrix-postgres-${TIMESTAMP}.sql.gz" \
    && log " Matrix Postgres: OK ($(du -sh "${BACKUP_PATH}/matrix-postgres-${TIMESTAMP}.sql.gz" | cut -f1))" \
    || log " Matrix Postgres: FAILED"
  if docker exec matrix-postgres pg_dump -U synapse synapse \
    | gzip > "${BACKUP_PATH}/matrix-postgres-${TIMESTAMP}.sql.gz"; then
    log " Matrix Postgres: OK ($(du -sh "${BACKUP_PATH}/matrix-postgres-${TIMESTAMP}.sql.gz" | cut -f1))"
  else
    log " Matrix Postgres: FAILED"
  fi
else
  log " matrix-postgres not running — skipping"
fi

@@ -29,14 +53,16 @@ fi

# ── Rocket.Chat MongoDB ────────────────────────────────────────────
log "Dumping Rocket.Chat MongoDB..."
if docker ps --format '{{.Names}}' | grep -q '^mongodb$'; then
  docker exec mongodb mongodump \
  if docker exec mongodb mongodump \
    -u admin -p CHANGE_ME_MONGO_ADMIN_PASSWORD \
    --authenticationDatabase admin \
    --db rocketchat \
    --archive \
    | gzip > "${BACKUP_PATH}/rocketchat-mongo-${TIMESTAMP}.archive.gz" \
    && log " MongoDB: OK ($(du -sh "${BACKUP_PATH}/rocketchat-mongo-${TIMESTAMP}.archive.gz" | cut -f1))" \
    || log " MongoDB: FAILED"
    | gzip > "${BACKUP_PATH}/rocketchat-mongo-${TIMESTAMP}.archive.gz"; then
    log " MongoDB: OK ($(du -sh "${BACKUP_PATH}/rocketchat-mongo-${TIMESTAMP}.archive.gz" | cut -f1))"
  else
    log " MongoDB: FAILED"
  fi
else
  log " mongodb not running — skipping"
fi

@@ -46,13 +72,15 @@ log "Backing up Docker volumes..."

for VOLUME in synapse-media rocketchat-uploads; do
  if docker volume ls --format '{{.Name}}' | grep -q "^matrix_${VOLUME}\|^rocketchat_${VOLUME}\|^${VOLUME}$"; then
    ACTUAL_VOL=$(docker volume ls --format '{{.Name}}' | grep "${VOLUME}" | head -1)
    docker run --rm \
    if docker run --rm \
      -v "${ACTUAL_VOL}:/volume:ro" \
      -v "${BACKUP_PATH}:/backup" \
      alpine \
      tar czf "/backup/${VOLUME}-${TIMESTAMP}.tar.gz" -C /volume . \
      && log " Volume ${VOLUME}: OK" \
      || log " Volume ${VOLUME}: FAILED"
      tar czf "/backup/${VOLUME}-${TIMESTAMP}.tar.gz" -C /volume . ; then
      log " Volume ${VOLUME}: OK"
    else
      log " Volume ${VOLUME}: FAILED"
    fi
  else
    log " Volume ${VOLUME}: not found — skipping"
  fi

@@ -60,7 +88,7 @@ done

# ── Config files (bind mounts) ─────────────────────────────────────
log "Backing up config directories..."
tar czf "${BACKUP_PATH}/configs-${TIMESTAMP}.tar.gz" \
if tar czf "${BACKUP_PATH}/configs-${TIMESTAMP}.tar.gz" \
  /opt/docker/traefik/traefik.yml \
  /opt/docker/traefik/config/ \
  /opt/docker/matrix/docker-compose.yml \

@@ -68,57 +96,151 @@ tar czf "${BACKUP_PATH}/configs-${TIMESTAMP}.tar.gz" \
  /opt/docker/matrix/synapse-config/homeserver.yaml \
  /opt/docker/matrix/synapse-config/matrix.example.com.log.config \
  /opt/docker/rocketchat/docker-compose.yml \
  2>/dev/null && log " Configs: OK" || log " Configs: partial (some files missing)"
  2>/dev/null; then
  log " Configs: OK"
else
  log " Configs: partial (some files missing)"
fi

# IMPORTANT: signing key is sensitive — back up separately with tight perms
# Synapse signing key — sensitive, copy out separately with tight perms.
if [ -f /opt/docker/matrix/synapse-config/matrix.example.com.signing.key ]; then
  cp /opt/docker/matrix/synapse-config/matrix.example.com.signing.key \
    "${BACKUP_PATH}/synapse-signing-key-${TIMESTAMP}.key"
  chmod 600 "${BACKUP_PATH}/synapse-signing-key-${TIMESTAMP}.key"
  log " Synapse signing key: backed up (600)"
fi

# ── Minecraft server ───────────────────────────────────────────────
# This is the block that was missing from the deployed copy and
# corrupted by an orphaned synapse-signing-key fragment in the repo
# copy. Wrapped in a function called from an `if` guard so a failure
# here does NOT exit the whole script under `set -e` — we want the
# prune step and sentinel logic to still run.
log "Backing up Minecraft server..."
if docker ps --format '{{.Names}}' | grep -q '^minecraft-mc$'; then
  # Server is running - create consistent world snapshot
  docker exec minecraft-mc bash -c \
    "cd /data && tar czf /tmp/mc-world-backup-${TIMESTAMP}.tar.gz world/ world_nether/ world_the_end/ 2>/dev/null" && \
  docker cp minecraft-mc:/tmp/mc-world-backup-${TIMESTAMP}.tar.gz "${BACKUP_PATH}/" && \
  docker exec minecraft-mc rm -f /tmp/mc-world-backup-${TIMESTAMP}.tar.gz && \
  log " Minecraft world: OK ($(du -sh "${BACKUP_PATH}/mc-world-backup-${TIMESTAMP}.tar.gz" | cut -f1))" \
  || log " Minecraft world: FAILED"

  # Backup configs and plugins
  tar czf "${BACKUP_PATH}/minecraft-configs-${TIMESTAMP}.tar.gz" \
    /opt/docker/minecraft/server.properties \
    /opt/docker/minecraft/purpur.yml \
    /opt/docker/minecraft/spigot.yml \
    /opt/docker/minecraft/paper-*.yml \
    /opt/docker/minecraft/bukkit.yml \
    /opt/docker/minecraft/ops.json \
    /opt/docker/minecraft/banned-*.json \
    /opt/docker/minecraft/eula.txt \
    2>/dev/null && \
    log " Minecraft configs: OK" \
    || log " Minecraft configs: partial (expected)"
else
  # Server is stopped - backup everything directly
  tar czf "${BACKUP_PATH}/minecraft-full-backup-${TIMESTAMP}.tar.gz" \
    /opt/docker/minecraft/world/ \
    /opt/docker/minecraft/world_nether/ \
    /opt/docker/minecraft/world_the_end/ \
    /opt/docker/minecraft/plugins/ \
    /opt/docker/minecraft/server.properties \
    /opt/docker/minecraft/purpur.yml \
    /opt/docker/minecraft/spigot.yml \
    2>/dev/null && \
    log " Minecraft (full, offline): OK ($(du -sh "${BACKUP_PATH}/minecraft-full-backup-${TIMESTAMP}.tar.gz" | cut -f1))" \
    || log " Minecraft (offline): partial"
fi
# tar exit codes: 0 = clean, 1 = "some files differed/changed during read"
# (NORMAL on a live MC server — chunks save while we read), 2 = fatal.
# Treat 0 and 1 as success, 2+ as failure.
tar_ok() { local rc=$1; [ "$rc" -le 1 ]; }

  "${BACKUP_PATH}/synapse-signing-key-${TIMESTAMP}.key"
  chmod 600 "${BACKUP_PATH}/synapse-signing-key-${TIMESTAMP}.key"
  log " Synapse signing key: backed up (600)"
mc_backup() {
  if docker ps --format '{{.Names}}' | grep -q '^minecraft-mc$'; then
    # Server running — flush via rcon if mcrcon installed, then
    # tar inside the container so we get a consistent point-in-time.
    if command -v mcrcon >/dev/null 2>&1; then
      mcrcon -H 127.0.0.1 -P 25575 \
        -p "${MC_RCON_PASSWORD:-*redacted*}" \
        -w 1 "save-all flush" >/dev/null 2>&1 || true
    fi

    # World tar — runs inside the container. We ignore tar exit 1
    # ("file changed as we read it") because that's expected on a
    # live server and the resulting archive is still usable.
    local tar_rc=0
    docker exec minecraft-mc bash -c \
      "cd /data && tar czf /tmp/mc-world-backup-${TIMESTAMP}.tar.gz world/ world_nether/ world_the_end/" \
      >/dev/null 2>&1 || tar_rc=$?
    if tar_ok "$tar_rc" \
      && docker cp "minecraft-mc:/tmp/mc-world-backup-${TIMESTAMP}.tar.gz" "${BACKUP_PATH}/" >/dev/null 2>&1 \
      && docker exec minecraft-mc rm -f "/tmp/mc-world-backup-${TIMESTAMP}.tar.gz" >/dev/null 2>&1; then
      local sz
      sz=$(du -sh "${BACKUP_PATH}/mc-world-backup-${TIMESTAMP}.tar.gz" | cut -f1)
      if [ "$tar_rc" -eq 1 ]; then
        log " Minecraft world: OK (${sz}) [tar exit 1 — files changed during read, expected on live server]"
      else
        log " Minecraft world: OK (${sz})"
      fi
      MC_WORLD_OK=1
    else
      log " Minecraft world: FAILED (tar_rc=${tar_rc})"
      # Best-effort cleanup of any half-written file inside the container.
      docker exec minecraft-mc rm -f "/tmp/mc-world-backup-${TIMESTAMP}.tar.gz" >/dev/null 2>&1 || true
    fi

    # Plugins (jars + on-disk config) — small, do this regardless
    # of world result so we always have plugin state on hand.
    # `--ignore-failed-read` suppresses spark profiler tmp file noise
    # (spark's in-flight JFR tmp files are briefly mode 600);
    # `--warning=no-file-changed` silences CoreProtect db noise in the log.
    local prc=0
    tar --ignore-failed-read --warning=no-file-changed \
      -czf "${BACKUP_PATH}/minecraft-plugins-${TIMESTAMP}.tar.gz" \
      -C /opt/docker/minecraft plugins/ >/dev/null 2>&1 || prc=$?
    if tar_ok "$prc"; then
      log " Minecraft plugins: OK ($(du -sh "${BACKUP_PATH}/minecraft-plugins-${TIMESTAMP}.tar.gz" | cut -f1))"
    else
      log " Minecraft plugins: FAILED (rc=${prc})"
    fi

    # Plugin DBs — copied (not dumped, all SQLite/file-based) into
    # a tagged tarball so restore is straightforward.
    local drc=0
    tar --ignore-failed-read --warning=no-file-changed \
      -czf "${BACKUP_PATH}/minecraft-dbs-${TIMESTAMP}.tar.gz" \
      -C /opt/docker/minecraft \
      homestead_data.db \
      plugins/AuthMe/authme.db \
      plugins/CoreProtect/database.db \
      plugins/LuckPerms/ \
      >/dev/null 2>&1 || drc=$?
    if tar_ok "$drc"; then
      log " Minecraft DBs: OK ($(du -sh "${BACKUP_PATH}/minecraft-dbs-${TIMESTAMP}.tar.gz" | cut -f1))"
    else
      log " Minecraft DBs: partial (rc=${drc} — some files may be missing)"
    fi

    # Server-side configs and access lists. Some of these files are
    # optional (e.g. whitelist.json absent when whitelisting is off).
    # tar reports rc=2 for missing files, so we prefilter the list.
    local cfg_files=()
    for f in server.properties purpur.yml spigot.yml bukkit.yml \
        commands.yml help.yml permissions.yml \
        ops.json whitelist.json banned-players.json banned-ips.json \
        usercache.json eula.txt docker-compose.yml; do
      [ -e "/opt/docker/minecraft/$f" ] && cfg_files+=("$f")
    done
    local crc=0
    tar czf "${BACKUP_PATH}/minecraft-configs-${TIMESTAMP}.tar.gz" \
      -C /opt/docker/minecraft "${cfg_files[@]}" \
      >/dev/null 2>&1 || crc=$?
    if tar_ok "$crc"; then
      log " Minecraft configs: OK (${#cfg_files[@]} files)"
    else
      log " Minecraft configs: FAILED (rc=${crc})"
    fi
  else
    # Server stopped — back up everything from disk directly.
    local frc=0
    tar czf "${BACKUP_PATH}/minecraft-full-backup-${TIMESTAMP}.tar.gz" \
      -C /opt/docker/minecraft \
      world/ \
      world_nether/ \
      world_the_end/ \
      plugins/ \
      homestead_data.db \
      server.properties \
      purpur.yml \
      spigot.yml \
      bukkit.yml \
      ops.json \
      whitelist.json \
      banned-players.json \
      banned-ips.json \
      usercache.json \
      docker-compose.yml \
      >/dev/null 2>&1 || frc=$?
    if tar_ok "$frc"; then
      log " Minecraft (full, offline): OK ($(du -sh "${BACKUP_PATH}/minecraft-full-backup-${TIMESTAMP}.tar.gz" | cut -f1))"
      MC_WORLD_OK=1
    else
      log " Minecraft (offline): partial (rc=${frc})"
    fi
  fi
}

# Run MC arm — never let it kill the rest of the script.
if ! mc_backup; then
  log " Minecraft arm exited non-zero — see lines above"
fi

# ── Prune old backups ──────────────────────────────────────────────

@@ -128,3 +250,19 @@ find "$BACKUP_DIR" -maxdepth 1 -name "*.log" -mtime +30 -delete 2>/dev/null || t

BACKUP_SIZE=$(du -sh "$BACKUP_PATH" | cut -f1)
log "=== Backup complete: ${BACKUP_PATH} (${BACKUP_SIZE}) ==="

# ── Sentinel ───────────────────────────────────────────────────────
# Touch the sentinel only if the world (T1 case) was captured. An
# external monitor (cron on onyx, or ntfy/healthchecks once wired)
# can alert on `find /opt/backups/.last-success -mmin +1500` to catch
# silent failures within 25h of a missed daily run.
if [ "$MC_WORLD_OK" -eq 1 ]; then
  {
    printf 'last_success=%s\n' "$(date -Iseconds)"
    printf 'backup_path=%s\n' "$BACKUP_PATH"
    printf 'backup_size=%s\n' "$BACKUP_SIZE"
  } > "$SENTINEL"
  log "Sentinel updated: ${SENTINEL}"
else
  log "WARNING: world backup did NOT succeed — sentinel NOT updated"
fi


135 scripts/restic-backup-playerdata.sh Normal file

@@ -0,0 +1,135 @@

#!/usr/bin/env bash
# /usr/local/bin/restic-backup-playerdata.sh
#
# Class A backup per docs/BACKUP-STRATEGY.md — every 5 minutes, snapshot
# playerdata + stats + advancements + plugin DBs + LuckPerms config.
# Skips the heavy region/ files (those are Class B, hourly).
#
# Driven by mc-backup-playerdata.timer (5 min cadence).
#
# Pre-req: restic installed; one-time bootstrap performed by
# scripts/restic-init.sh which creates the local repo and writes
# /etc/mc-backup.env + /etc/mc-backup.pw.
#
# Status (2026-05-07): scripts shipped to repo; deployment to nullstone
# is BLOCKED on operator running `apt install restic` + scripts/restic-init.sh
# under sudo. See docs/RUNBOOK-BACKUP-RESTORE.md "Phase 2 deployment".
set -euo pipefail
umask 077

ENV_FILE="${MC_BACKUP_ENV_FILE:-/etc/mc-backup.env}"
if [ ! -r "$ENV_FILE" ]; then
  echo "FATAL: env file $ENV_FILE not readable — run scripts/restic-init.sh first" >&2
  exit 2
fi
# shellcheck disable=SC1090
. "$ENV_FILE"

: "${RESTIC_REPOSITORY_FREQUENT:?RESTIC_REPOSITORY_FREQUENT not set in $ENV_FILE}"
: "${RESTIC_PASSWORD_FILE:?RESTIC_PASSWORD_FILE not set in $ENV_FILE}"
: "${MC_DATA:?MC_DATA not set in $ENV_FILE}"

export RESTIC_REPOSITORY="$RESTIC_REPOSITORY_FREQUENT"
export RESTIC_PASSWORD_FILE

LOG="${MC_BACKUP_LOG:-/var/log/mc-backup.log}"
SENTINEL="${MC_BACKUP_FREQUENT_SENTINEL:-/var/lib/mc-backup/last-success-frequent}"
RCON_HOST="${MC_RCON_HOST:-127.0.0.1}"
RCON_PORT="${MC_RCON_PORT:-25575}"
RCON_PASS="${MC_RCON_PASSWORD:-}"

mkdir -p "$(dirname "$SENTINEL")"

log() {
  printf '[%s] [frequent] %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*" \
    | tee -a "$LOG"
}

on_err() {
  local rc=$?
  log "ERROR rc=${rc} at line ${BASH_LINENO[0]}"
  if [ -n "${ALERT_URL:-}" ]; then
    curl -fsS -m 5 -d "mc-backup-frequent FAILED rc=${rc}" "$ALERT_URL" \
      >/dev/null 2>&1 || true
  fi
  exit "$rc"
}
trap on_err ERR

log "=== run start (host=$(hostname)) ==="

# 1. Best-effort: ask the server to flush before snapshotting.
# Don't fail the backup if rcon is down or unreachable — we'd rather
# have a slightly-stale snapshot than no snapshot.
if [ -n "$RCON_PASS" ] && command -v mcrcon >/dev/null 2>&1; then
  if mcrcon -H "$RCON_HOST" -P "$RCON_PORT" -p "$RCON_PASS" -w 1 \
      "save-all flush" >/dev/null 2>&1; then
    log "rcon save-all flush: ok"
  else
    log "rcon save-all flush: failed (continuing)"
  fi
else
  log "rcon: skipped (no mcrcon or no password)"
fi

# 2. Build the include list, keeping only paths that exist on disk.
# Several of these are optional, and restic complains (and in newer
# versions exits non-zero) about missing source paths, so prefilter
# rather than listing them blindly.
INCLUDES=(
  "${MC_DATA}/world/playerdata"
  "${MC_DATA}/world/stats"
  "${MC_DATA}/world/advancements"
  "${MC_DATA}/world/level.dat"
  "${MC_DATA}/world_nether/level.dat"
  "${MC_DATA}/world_the_end/level.dat"
  "${MC_DATA}/homestead_data.db"
  "${MC_DATA}/plugins/AuthMe"
  "${MC_DATA}/plugins/CoreProtect/database.db"
  "${MC_DATA}/plugins/LuckPerms"
)

EXISTING=()
for p in "${INCLUDES[@]}"; do
  if [ -e "$p" ]; then
    EXISTING+=("$p")
  fi
done

if [ ${#EXISTING[@]} -eq 0 ]; then
  log "no source paths exist — aborting"
  exit 3
fi

# 3. Snapshot. Tagged so retention policy can target this class only.
log "snapshotting ${#EXISTING[@]} path(s)"
restic backup \
  --tag playerdata \
  --tag auto-5min \
  --host "$(hostname)" \
  --exclude='*.lock' \
  --exclude='*.tmp' \
  "${EXISTING[@]}" \
  >> "$LOG" 2>&1

# 4. Light retention — only on this repo, only on this tag.
restic forget \
  --tag auto-5min \
  --keep-last 24 \
  --keep-hourly 24 \
  --keep-daily 7 \
  --prune \
  --quiet \
  >> "$LOG" 2>&1 || log "forget+prune returned non-zero (continuing)"

# 5. Sentinel for external monitor.
{
  printf 'last_success=%s\n' "$(date -Iseconds)"
  printf 'class=A\n'
  printf 'repo=%s\n' "$RESTIC_REPOSITORY"
} > "$SENTINEL"

# 6. Heartbeat (no-op if HEARTBEAT_URL unset).
if [ -n "${HEARTBEAT_URL:-}" ]; then
  curl -fsS -m 5 "$HEARTBEAT_URL" >/dev/null 2>&1 || true
fi

log "=== run ok ==="

156 scripts/restic-init.sh Normal file

@@ -0,0 +1,156 @@

#!/usr/bin/env bash
# scripts/restic-init.sh
#
# One-time bootstrap for the Phase 2 restic backup chain. Run this on
# nullstone as root (sudo) AFTER `apt install restic mcrcon`.
#
# What it does:
#   1. Generates /etc/mc-backup.pw (40-byte random restic password) if absent.
#   2. Writes /etc/mc-backup.env (consumed by restic-backup-playerdata.sh).
#   3. Initialises the local restic repo at /home/user/restic/mc-frequent.
#   4. Takes a baseline snapshot so the timer's first run is fast.
#   5. Optionally adds an SFTP-mirror block once onyx is provisioned.
#
# Idempotent — re-running is safe; existing files are preserved.
#
# Cross-ref: docs/BACKUP-STRATEGY.md §8.2, docs/RUNBOOK-BACKUP-RESTORE.md.
set -euo pipefail
umask 077

if [ "$(id -u)" -ne 0 ]; then
  echo "FATAL: must run as root (sudo)." >&2
  exit 2
fi

if ! command -v restic >/dev/null 2>&1; then
  echo "FATAL: restic not installed. Run: apt install restic mcrcon" >&2
  exit 3
fi

# Resolve target user — restic repo lives under their home so /opt
# disk pressure doesn't matter. nullstone: 142G free on /home.
TARGET_USER="${TARGET_USER:-user}"
if ! id "$TARGET_USER" >/dev/null 2>&1; then
  echo "FATAL: user '$TARGET_USER' not found" >&2
  exit 4
fi
TARGET_HOME=$(getent passwd "$TARGET_USER" | cut -d: -f6)

PW_FILE="/etc/mc-backup.pw"
ENV_FILE="/etc/mc-backup.env"
REPO_FREQUENT="${TARGET_HOME}/restic/mc-frequent"
REPO_WORLD="${TARGET_HOME}/restic/mc-world"
LOG_DIR="/var/log"
SENTINEL_DIR="/var/lib/mc-backup"

# 1. Password file
if [ ! -e "$PW_FILE" ]; then
  head -c 40 /dev/urandom | base64 > "$PW_FILE"
  chown root:root "$PW_FILE"
  chmod 600 "$PW_FILE"
  echo "Generated $PW_FILE (40 bytes random)."
else
  echo "$PW_FILE already exists — keeping."
fi

# 2. Env file (only created if missing; user can edit afterwards).
if [ ! -e "$ENV_FILE" ]; then
  cat > "$ENV_FILE" <<EOF
# /etc/mc-backup.env — consumed by restic-backup-playerdata.sh and the
# Class B/C/D world script (TBD). Edit as needed.

RESTIC_REPOSITORY_FREQUENT=$REPO_FREQUENT
RESTIC_REPOSITORY_WORLD=$REPO_WORLD
RESTIC_PASSWORD_FILE=$PW_FILE
MC_DATA=/opt/docker/minecraft
MC_BACKUP_LOG=$LOG_DIR/mc-backup.log
MC_BACKUP_FREQUENT_SENTINEL=$SENTINEL_DIR/last-success-frequent
MC_BACKUP_WORLD_SENTINEL=$SENTINEL_DIR/last-success-world

# RCON — used to flush MC saves before snapshotting. Pull from the live
# compose file (services.mc.environment.RCON_PASSWORD).
MC_RCON_HOST=127.0.0.1
MC_RCON_PORT=25575
MC_RCON_PASSWORD=*redacted*

# Off-host mirror destination (onyx via Tailscale). Empty = skip mirror.
TS_OFFHOST_USER=mc-backup
TS_OFFHOST_HOST=100.64.0.1
TS_OFFHOST_PATH=/backups/nullstone-mc-restic

# Alerting — fill in once ntfy.s8n.ru is up. Leave blank for now.
HEARTBEAT_URL=
ALERT_URL=
EOF
  chown root:"$(id -gn "$TARGET_USER")" "$ENV_FILE"
  chmod 640 "$ENV_FILE"
  echo "Wrote $ENV_FILE (mode 640, group=$(id -gn "$TARGET_USER"))."
else
  echo "$ENV_FILE already exists — keeping."
fi

# 3. Log + sentinel dirs (writable by target user).
mkdir -p "$SENTINEL_DIR"
chown "$TARGET_USER":"$(id -gn "$TARGET_USER")" "$SENTINEL_DIR"
chmod 750 "$SENTINEL_DIR"
touch "$LOG_DIR/mc-backup.log"
chown "$TARGET_USER":adm "$LOG_DIR/mc-backup.log" 2>/dev/null \
  || chown "$TARGET_USER":"$(id -gn "$TARGET_USER")" "$LOG_DIR/mc-backup.log"
chmod 640 "$LOG_DIR/mc-backup.log"

# 4. Repo init. `restic init` exits non-zero if the repo already
# exists, so probe with `restic snapshots` first to stay idempotent.
init_repo() {
  local repo=$1
  install -d -o "$TARGET_USER" -g "$(id -gn "$TARGET_USER")" -m 700 \
    "$(dirname "$repo")" "$repo"
  if RESTIC_PASSWORD_FILE="$PW_FILE" RESTIC_REPOSITORY="$repo" \
      runuser -u "$TARGET_USER" -- restic snapshots >/dev/null 2>&1; then
    echo "Repo $repo: already initialised."
  else
    RESTIC_PASSWORD_FILE="$PW_FILE" RESTIC_REPOSITORY="$repo" \
      runuser -u "$TARGET_USER" -- restic init
    echo "Repo $repo: initialised."
  fi
}
init_repo "$REPO_FREQUENT"
init_repo "$REPO_WORLD"

# 5. Baseline snapshot of the frequent repo so the first timer run is fast.
echo "Taking baseline snapshot into $REPO_FREQUENT ..."
runuser -u "$TARGET_USER" -- env \
  RESTIC_PASSWORD_FILE="$PW_FILE" \
  RESTIC_REPOSITORY="$REPO_FREQUENT" \
  restic backup \
    --tag playerdata --tag baseline --host "$(hostname)" \
    --exclude='*.lock' --exclude='*.tmp' \
    /opt/docker/minecraft/world/playerdata \
    /opt/docker/minecraft/world/stats \
    /opt/docker/minecraft/world/advancements \
    /opt/docker/minecraft/homestead_data.db \
    /opt/docker/minecraft/plugins/AuthMe \
    /opt/docker/minecraft/plugins/CoreProtect/database.db \
    /opt/docker/minecraft/plugins/LuckPerms \
  || echo "Baseline snapshot returned non-zero — review output above."

cat <<'NEXT'

---------------------------------------------------------------
restic-init.sh complete.

Next steps:
  1. Install systemd units:
       install -m644 scripts/systemd/mc-backup-playerdata.service \
         /etc/systemd/system/
       install -m644 scripts/systemd/mc-backup-playerdata.timer \
         /etc/systemd/system/
       install -m755 scripts/restic-backup-playerdata.sh \
         /usr/local/bin/

  2. systemctl daemon-reload
|
||||
3. systemctl enable --now mc-backup-playerdata.timer
|
||||
4. Tail: journalctl -u mc-backup-playerdata.service -f
|
||||
|
||||
Onyx (off-host mirror) provisioning is a separate step — see
|
||||
docs/RUNBOOK-BACKUP-RESTORE.md "Phase 2 deployment".
|
||||
---------------------------------------------------------------
|
||||
NEXT
|
||||
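The sentinel files this bootstrap sets up give an outside monitor a single mtime per backup class to check. A minimal sketch of such a check, assuming the sentinel lives under /var/lib/mc-backup (inferred from the service unit's ReadWritePaths) and picking a 15-minute staleness threshold, i.e. about three missed 5-minute runs; the helper name and threshold are illustrative, not part of restic-init.sh:

```shell
# sentinel_age_ok FILE MAX_AGE_SECONDS
# Returns 0 if FILE was modified within MAX_AGE_SECONDS,
# 1 if it is stale, 2 if it is missing or unreadable.
# Hypothetical monitor helper, not shipped by the scripts above.
sentinel_age_ok() {
  file=$1
  max=$2
  mtime=$(stat -c %Y "$file" 2>/dev/null) || return 2
  now=$(date +%s)
  [ $(( now - mtime )) -le "$max" ]
}

# Example cron/monitor usage (path assumed):
# sentinel_age_ok /var/lib/mc-backup/last-success-frequent 900 \
#   || echo "Class A backups stale — check mc-backup-playerdata.service"
```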
scripts/systemd/mc-backup-playerdata.service (new file, 29 lines)
@@ -0,0 +1,29 @@
[Unit]
Description=Minecraft frequent backup (Class A — playerdata + DBs, every 5 min)
Documentation=https://git.s8n.ru/s8n/minecraft-server/src/branch/main/BACKUP-STRATEGY.md
After=docker.service
Wants=docker.service

[Service]
Type=oneshot
User=user
Group=user
EnvironmentFile=/etc/mc-backup.env
ExecStart=/usr/local/bin/restic-backup-playerdata.sh
Nice=10
IOSchedulingClass=best-effort
IOSchedulingPriority=7

# Hardening — restic only needs read on /opt/docker/minecraft and
# write under TARGET_HOME/restic + /var/lib/mc-backup + /var/log.
ProtectSystem=strict
ProtectHome=read-only
ReadOnlyPaths=/opt/docker/minecraft
ReadWritePaths=/home/user/restic /var/lib/mc-backup /var/log
PrivateTmp=true
NoNewPrivileges=true
ProtectKernelTunables=true
ProtectKernelModules=true
ProtectControlGroups=true
RestrictSUIDSGID=true
LockPersonality=true
scripts/systemd/mc-backup-playerdata.timer (new file, 15 lines)
@@ -0,0 +1,15 @@
[Unit]
Description=Run mc-backup-playerdata every 5 minutes
Documentation=https://git.s8n.ru/s8n/minecraft-server/src/branch/main/BACKUP-STRATEGY.md

[Timer]
# Stagger after boot so MC and Docker have a chance to settle.
OnBootSec=2min
# 5-minute cadence per BACKUP-STRATEGY.md §2 RPO target for Class A.
OnUnitActiveSec=5min
AccuracySec=30s
# Catch up after suspend / downtime.
Persistent=true

[Install]
WantedBy=timers.target
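The timer settings bound how old the newest Class A snapshot can be when a failure hits. A back-of-envelope sketch using the values from the unit above; the 60-second cap on a restic run is an assumption, not something the units enforce:

```shell
# Worst case = one full cadence interval, plus the timer's coalescing
# window, plus the time a snapshot takes to finish once started.
CADENCE=300   # OnUnitActiveSec=5min
ACCURACY=30   # AccuracySec=30s
RUN_MAX=60    # assumed upper bound on one restic backup run
echo "worst-case Class A data loss: $(( CADENCE + ACCURACY + RUN_MAX ))s"
```

If real runs exceed that assumed bound, journalctl timings from the service are the place to recalibrate it.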