# AuthLimbo v2 — Architecture

Status: **Design draft** (no code). Drafted 2026-05-07 by the auth-limbo
v2 design pass after the YOU500 / second-player void-death incidents.
Audience: operator (P) and future contributors.

Companion docs:
- [`AUDIT-2026-05-07.md`](../AUDIT-2026-05-07.md) — root-cause forensic.
- [`ROADMAP.md`](../ROADMAP.md) — v1.x tracking (F1-F7).
- [`V2-ROADMAP.md`](V2-ROADMAP.md) — milestones M0-M5 for v2.

---

## 1. Why v2

v1 is a single-jar Paper plugin glued onto AuthMe. It works *most* of
the time, but its core failure modes are now well-understood and can't
be patched away inside the v1 design:

| v1 limitation | v2 must address |
|---------------|------------------|
| Player object exists on the main server *before* auth — coords/inventory technically restorable from RAM by buggy plugins, world chunk activity is observable. | Strong isolation: limbo is the only state the player can touch pre-auth. |
| Restore relies on AuthMe firing `LoginEvent`. AuthMe's own broken teleport runs in the same window — F4 pre-empts it but the design still races. | Authoritative state machine that doesn't trust AuthMe's teleport at all. |
| Inventory loss on transit-death depends on F1 + F5 holding. There is no inventory-of-record outside live game state. | Snapshot-on-pre-login + snapshot-restore is a first-class subsystem, not a defensive add-on. |
| No metrics, no audit log, no admin alerting. Bugs only surface when a player loses gear. | Built-in observability: Prometheus + JSON Lines audit + Discord webhook. |
| No queue / login-throttle. If 50 bots connect at once, AuthMe stalls. | Bounded concurrency with transparent FIFO and trust tiers (NOT pay tiers). |

v2 is a clean break (`v2.0.0`), not a v1 patch. v1 keeps receiving F3,
F5, F6, F7 backports for as long as racked.ru still runs the old jar.

---

## 2. Stack decision — **Paper-only**, with a Velocity-ready seam

**Recommendation: Paper-only single-server plugin for v2.0.0.**
Velocity-mode is a v2.x deferrable behind a feature flag.

### Reasoning

racked.ru today is one Purpur 1.21.11 server in the `minecraft-mc` itzg
container on nullstone. There is no Velocity / BungeeCord, no second
backend, no Forced Hosts, no proxy network. Adding Velocity to ship a
gatekeeper plugin would mean:

- standing up a new container, opening a new public port (or keeping
  25565 on the proxy and 25566 internal),
- migrating the 12+ existing Paper plugins through the velocity-paper
  bridge contract for chat / commands / placeholders,
- new TLS / RCON / proxy-protocol surface to harden,
- breaking changes to AuthMe's data flow (proxy-side login flow vs
  paper-side `AuthMeAsyncPreLoginEvent`),
- one more thing for the operator to babysit.

The privacy property the operator cares about — *no other player sees
pre-auth coords / inventory* — is achievable on Paper-only via a
strictly isolated limbo world + audience scoping (see §4). Velocity adds
*stronger* isolation (player never reaches the backend at all) but the
incremental privacy gain is small for a 0-10 player community, and the
operational cost is large.

### When Velocity becomes worth it

Codify trip-wires up front so the decision isn't dragged out:

1. racked.ru splits into ≥2 backends (e.g. `survival` + `creative`) —
   you need a proxy anyway.
2. cobblestone server comes online and shares an account/auth pool.
3. Botting attempts cross 100 connections / minute and `connection-throttle` +
   `firewalld rate-limit` are no longer enough. Velocity + a queue
   plugin (Ajax / VelocityQueue) become operationally cheaper than
   chasing botnets at the application layer.

Until any of those, Paper-only is the right answer.

### The Velocity-ready seam

The v2 internal API is split into two layers so the proxy migration is
mechanical:

```
+-------------------------------+       +-------------------------------+
|  Gatekeeper (proxy or paper)  |       |  Restore (paper only)         |
|  - accept connection          |       |  - read snapshot              |
|  - check ban / rate limit     |       |  - chunk preload              |
|  - hold in limbo / queue      |       |  - authoritative TP           |
|  - hand off on auth-success   |       |  - publish metrics            |
+--------------+----------------+       +-------------------------------+
               |  hand-off event (UUID, target Location, source IP)
               v
```

In v2.0 both layers live in the Paper plugin and the hand-off is just a
local method call. In a future "v2-velo" both layers split: gatekeeper
runs as a Velocity plugin, restore stays on Paper, hand-off becomes a
plugin-message channel. No code outside those two layers needs to
change.

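The seam can be sketched as one small interface that the gatekeeper talks to. This is only an illustration under assumptions: the names `HandOff`, `HandOffSink`, `LocalHandOffSink` and `Gatekeeper` are hypothetical, and the target location is flattened to a world name plus coordinates so the sketch runs without the Paper API. Only the payload fields (UUID, target location, source IP) come from the design above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

// Hypothetical payload: the design only fixes the fields, not the type.
record HandOff(UUID uuid, String world, double x, double y, double z, String sourceIp) {}

// The seam: the gatekeeper only ever sees this interface.
interface HandOffSink {
    void accept(HandOff handOff);
}

// v2.0 wiring: both layers share one plugin, so delivery is a direct call.
// A future proxy build could swap in a sink that writes to a plugin-message channel.
final class LocalHandOffSink implements HandOffSink {
    private final List<HandOff> delivered = new ArrayList<>();

    @Override
    public void accept(HandOff handOff) {
        // A real plugin would invoke the restore layer here; the sketch just records it.
        delivered.add(handOff);
    }

    List<HandOff> delivered() {
        return List.copyOf(delivered);
    }
}

final class Gatekeeper {
    private final HandOffSink sink;

    Gatekeeper(HandOffSink sink) {
        this.sink = sink;
    }

    // Called once authentication has succeeded for this player.
    void onAuthSuccess(UUID uuid, String world, double x, double y, double z, String ip) {
        sink.accept(new HandOff(uuid, world, x, y, z, ip));
    }
}
```

Because `Gatekeeper` depends only on `HandOffSink`, replacing the local sink with a remote one would not touch gatekeeper or restore logic, which is the property the seam is meant to guarantee.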

---

## 3. Queue model — login-throttle + transparent trust tiers, NO 2b2t-style sale

**For 0-10 player normal load: queue depth is always 0 and players
never see "queued" UI. The queue exists for crisis scenarios (bot
flood, restart drain, AuthMe DB stall) and to define explicit policy
even if it's rarely hit.**

### Policy

| Tier | Definition | Effect |
|------|-----------|--------|
| `staff` | Player has `authlimbo.queue.priority.staff` permission (LP-managed). | Always passes. Bypasses queue entirely. |
| `returning` | Player is in AuthMe DB AND has logged in within last 30 days. | Default tier for everyone who isn't new. Normal FIFO ordering by connect-time. |
| `new` | Player is NOT in AuthMe DB OR last seen >30 days ago. | Same FIFO as `returning` BUT with a per-IP 1/minute throttle. Stops bot-floods. |
| `flagged` | Player IP matches a Pi-hole/CrowdSec/abuse-DB block. | Rejected at gatekeeper, never enters the queue. |

Hard rules — written into `V2-ARCHITECTURE.md` so they outlive any one
operator's mood:

1. **No paid priority. Ever.** No "priority queue pass", no
   "supporter rank skip", no Patreon tier. The 2b2t community
   collapsed under that grift; we don't repeat it.
2. **No hidden veteran tier.** Every tier is documented in this file
   and in `/authlimbo queue policy` in-game. If a player can't see why
   they're in tier X, the tier is illegitimate.
3. **No in-game bidding / griefing for queue spots.** Queue position
   is purely connect-time + tier; no player action affects it.
4. **Ops-staff bypass is logged.** Every staff bypass writes a JSON Lines
   audit row.

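The tier table can be read as a single pure function. The sketch below is one possible encoding, not a committed API: `Tier` and `TierPolicy` are hypothetical names, and checking `flagged` before `staff` is an assumption, since the table does not say which wins when both apply.

```java
import java.time.Duration;
import java.time.Instant;

enum Tier { STAFF, RETURNING, NEW, FLAGGED }

final class TierPolicy {
    // "Logged in within last 30 days" from the policy table.
    static final Duration RETURNING_WINDOW = Duration.ofDays(30);

    // lastSeen is null when the player has no row in the AuthMe DB.
    static Tier classify(boolean ipFlagged, boolean hasStaffPermission,
                         Instant lastSeen, Instant now) {
        if (ipFlagged) {
            return Tier.FLAGGED;      // assumption: a block-list hit outranks everything
        }
        if (hasStaffPermission) {
            return Tier.STAFF;        // caller is expected to write the audit row
        }
        if (lastSeen == null) {
            return Tier.NEW;          // not in the AuthMe DB
        }
        boolean recent = Duration.between(lastSeen, now).compareTo(RETURNING_WINDOW) <= 0;
        return recent ? Tier.RETURNING : Tier.NEW;
    }
}
```

Keeping the decision free of side effects means the same function can back both the gatekeeper and the `/authlimbo queue policy` explanation shown to players.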

### Capacity

- `gatekeeper.max-concurrent-auth: 5` — at most 5 players in the
  pre-auth limbo at once. Defaults sized for racked.ru. AuthMe DB
  reads + chunk pins per concurrent player are roughly free, but bound
  it anyway.
- `gatekeeper.max-queue-depth: 50` — beyond 50 waiting, new
  connections get a "server is starting up, try again in 30s" kick.
  Better UX than a 5-minute black-screen wait.
- `gatekeeper.queue-timeout-seconds: 120` — anyone in the queue >2
  minutes gets the same kick + a Discord webhook fires.

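A minimal sketch of how the first two limits could interact, assuming a plain FIFO with no tiers. `AdmissionControl` and `Admission` are hypothetical names; the staff bypass, the per-IP throttle and the queue timeout are deliberately left out to keep the example short.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

enum Admission { ADMITTED, QUEUED, KICKED }

final class AdmissionControl {
    private final int maxConcurrentAuth;   // gatekeeper.max-concurrent-auth
    private final int maxQueueDepth;       // gatekeeper.max-queue-depth
    private final Set<UUID> inLimbo = new HashSet<>();
    private final Deque<UUID> queue = new ArrayDeque<>();

    AdmissionControl(int maxConcurrentAuth, int maxQueueDepth) {
        this.maxConcurrentAuth = maxConcurrentAuth;
        this.maxQueueDepth = maxQueueDepth;
    }

    synchronized Admission onConnect(UUID uuid) {
        // Admit directly only if there is room and nobody is already waiting.
        if (inLimbo.size() < maxConcurrentAuth && queue.isEmpty()) {
            inLimbo.add(uuid);
            return Admission.ADMITTED;
        }
        if (queue.size() >= maxQueueDepth) {
            return Admission.KICKED;
        }
        queue.addLast(uuid);
        return Admission.QUEUED;
    }

    // Call when a player leaves (auth success, quit, timeout).
    // Returns the queued player promoted into limbo, or null if none was promoted.
    synchronized UUID onExit(UUID uuid) {
        if (!inLimbo.remove(uuid)) {
            queue.remove(uuid);            // the player was still waiting; no slot freed
            return null;
        }
        UUID next = queue.pollFirst();
        if (next != null) {
            inLimbo.add(next);
        }
        return next;
    }

    synchronized int queueDepth() {
        return queue.size();
    }
}
```

With the defaults above this would be constructed as `new AdmissionControl(5, 50)`; under normal load every call to `onConnect` returns `ADMITTED` and the queue stays empty.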

### What queue UX looks like

In limbo, a `BossBar` (Adventure API) shows tier + position:

```
[returning]  Queue position: 3 / 7   ETA: ~15s
```

When position == 0 and AuthMe accepts, the bar disappears. There's no
hidden state. `/queue` in-chat re-displays the same info.

---

## 4. Privacy isolation

This is the original feature; v2 must not regress it.

### Limbo world

- Separate Bukkit world `auth_limbo`, `Environment.THE_END`,
  `VoidGenerator`. Same as v1.
- `keepSpawnInMemory=true`. Game-rules: no daylight, no weather, no
  mobs, no fire-tick, no PvP, `doImmediateRespawn=true`,
  `keepInventory=true` (defence-in-depth — limbo never *should* see a
  death event but if it does, no item drops happen).
- Per-player view-distance forced to 2 in limbo via Paper's
  `Player#setViewDistance`. They see 5x5 chunks, all empty.
- Limbo platform: 5x5 of `BARRIER` blocks at y=127, single block of
  `BARRIER` ceiling at y=129 to prevent flying out. y=0..126 and
  y=130+ are pure void.

### Adventure-API audience scoping

`PlayerChatEvent` listener at `EventPriority.HIGHEST` (on current Paper
the Adventure-based equivalent is `AsyncChatEvent`; the legacy
`PlayerChatEvent` is deprecated):

- If sender is in main worlds, recipient list is filtered: anyone
  whose `World#getName().equals("auth_limbo")` is dropped. Pre-auth
  players never see overworld chat.
- If sender is in limbo (would normally not chat — AuthMe blocks it
  — but defence in depth), recipient list is set to *only* the
  sender. They cannot leak messages to the main world.
- `PlayerJoinEvent` join messages are suppressed for
  `auth_limbo`-spawn joins. Main world only sees a join announcement
  *after* the authoritative restore TP succeeds (M2 §"join-message
  shifting" below).

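The two chat rules above reduce to a recipient filter that can be tested without a server. The following is a sketch under assumptions: `ChatScope` is a hypothetical helper, and players are modelled as a name-to-world map rather than Bukkit `Player` objects.

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

final class ChatScope {
    static final String LIMBO_WORLD = "auth_limbo";

    // worldByPlayer maps every online player's name to the name of their current world.
    static Set<String> recipients(String sender, Map<String, String> worldByPlayer) {
        Set<String> out = new LinkedHashSet<>();
        if (LIMBO_WORLD.equals(worldByPlayer.get(sender))) {
            out.add(sender);                 // limbo sender: echo to self only
            return out;
        }
        for (Map.Entry<String, String> entry : worldByPlayer.entrySet()) {
            if (!LIMBO_WORLD.equals(entry.getValue())) {
                out.add(entry.getKey());     // main-world sender: drop limbo recipients
            }
        }
        return out;
    }
}
```

In a real listener the returned set would be applied to the event's recipient or viewer collection; that wiring depends on which chat event the plugin ends up using.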

### Tablist scoping

Hook `PaperPlayerListEntryEvent` (or fall back to
`PlayerJoinEvent` + `Player#hidePlayer`):

- Limbo players are hidden from main-world tablist.
- Main-world players are hidden from limbo tablist.
- Limbo players cannot see each other (each limbo player sees only
  themselves).

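The three tablist rules collapse into one visibility predicate. A hedged sketch, with `TablistScope` as a hypothetical name and worlds passed as plain strings:

```java
final class TablistScope {
    static final String LIMBO_WORLD = "auth_limbo";

    // True if `viewer` may see `target` in their tablist.
    static boolean canSee(String viewer, String viewerWorld, String target, String targetWorld) {
        if (viewer.equals(target)) {
            return true;                              // every player sees themselves
        }
        boolean viewerInLimbo = LIMBO_WORLD.equals(viewerWorld);
        boolean targetInLimbo = LIMBO_WORLD.equals(targetWorld);
        return !viewerInLimbo && !targetInLimbo;      // only main-world pairs see each other
    }
}
```

Whichever hook is used, the plugin would evaluate this predicate for each viewer/target pair and hide or show the entry accordingly.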

### What main world observers can detect

After scoping:

- They cannot see the player's name in tablist pre-auth.
- They cannot see chat from the player.
- They cannot see the player's world or coordinates (AuthMe blocks
  movement output anyway, but we don't rely on it).
- They CAN see the connection event in server logs (operator-only).
- They can see "PLAYER joined the game" only AFTER restore succeeds
  — join message is shifted to fire on restore-success, not on
  initial connect.

This matches the v1 privacy posture and tightens the join-message
leak.

---

## 5. Login flow — explicit state machine

```
[CONNECT] ---throttle ok---> [GATE]
                                |
 failed throttle / ban          |
    v                           v
[REJECTED]                 [SNAPSHOT]  <-- read AuthMe DB,
                                |          dump current invent + xp + loc
                                v          to plugins/AuthLimbo/snapshots/<uuid>.nbt
                             [LIMBO]
                                |
                        AuthMe /login ok
                                |
                                v
                            [PRELOAD]  <-- 3x3 chunk pin around target
                                |
                                v
                            [RESTORE]  <-- teleportAsync, retry up to 3
                                |
                          +-----+-----+
                          |           |
                       success     fail x3
                          |           |
                          v           v
                       [LIVE]    [SPECTATOR-AT-LIMBO + admin alert]
```

Each transition has:

1. **Trigger event** (e.g. `LoginEvent` MONITOR).
2. **Pre-conditions** (e.g. UUID in `pendingTransit`).
3. **Side-effects** (e.g. metric counter, audit-log row).
4. **Failure handler** (next state on error).

States persist in `plugins/AuthLimbo/state/<uuid>.json` so a plugin
crash mid-flow can resume on rejoin. State file is deleted on
[LIVE] entry.

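One way to make the diagram enforceable is a transition table that rejects anything not drawn as an arrow. This is a sketch of that idea only: `LoginState` and `LoginStateMachine` are hypothetical names, the edge set is one reading of the diagram, and persistence, triggers and side-effects are omitted.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

enum LoginState {
    CONNECT, GATE, REJECTED, SNAPSHOT, LIMBO, PRELOAD, RESTORE, LIVE, SPECTATOR_AT_LIMBO
}

final class LoginStateMachine {
    // Allowed edges, one entry per arrow in the diagram.
    private static final Map<LoginState, Set<LoginState>> ALLOWED = new EnumMap<>(LoginState.class);
    static {
        ALLOWED.put(LoginState.CONNECT, EnumSet.of(LoginState.GATE, LoginState.REJECTED));
        ALLOWED.put(LoginState.GATE, EnumSet.of(LoginState.SNAPSHOT));
        ALLOWED.put(LoginState.SNAPSHOT, EnumSet.of(LoginState.LIMBO));
        ALLOWED.put(LoginState.LIMBO, EnumSet.of(LoginState.PRELOAD));
        ALLOWED.put(LoginState.PRELOAD, EnumSet.of(LoginState.RESTORE));
        ALLOWED.put(LoginState.RESTORE, EnumSet.of(LoginState.LIVE, LoginState.SPECTATOR_AT_LIMBO));
    }

    private LoginState state = LoginState.CONNECT;

    LoginState state() {
        return state;
    }

    // Moves to `next`, or throws if the diagram has no such arrow from the current state.
    void transition(LoginState next) {
        Set<LoginState> allowed = ALLOWED.getOrDefault(state, EnumSet.noneOf(LoginState.class));
        if (!allowed.contains(next)) {
            throw new IllegalStateException(state + " -> " + next + " is not a valid transition");
        }
        state = next;
    }
}
```

Terminal states (`REJECTED`, `LIVE`, `SPECTATOR_AT_LIMBO`) have no outgoing entries, so any further transition from them fails loudly instead of silently corrupting the flow.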

### Snapshot subsystem

**This is the operator-bug-survives-everything layer.**

- On `AuthMeAsyncPreLoginEvent` (player just connected, NOT yet
  auth'd): if a player file `world/playerdata/<uuid>.dat` exists,
  read it and shadow-copy to `plugins/AuthLimbo/snapshots/<uuid>.nbt`
  with timestamp. SHA-256 of file content is logged.
- `/authlimbo restore <player>` can roll back any restore by
  feeding the snapshot through nbtlib (same as the void-death recovery
  protocol from `feedback_mc_tp_safety.md`).
- Snapshots retained 7 days, then GC'd. Configurable.
- On `PlayerDeathEvent` while UUID in `pendingTransit`:
  `keepInventory=true`, `event.getDrops().clear()`, log SEVERE,
  trigger Discord webhook, schedule restore-from-snapshot on respawn.

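The shadow-copy step needs nothing beyond the JDK. The sketch below assumes a hypothetical `SnapshotStore` helper and takes both directories as parameters; it copies the file and returns the SHA-256 hex digest for logging. Timestamping and retention are not shown.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;
import java.util.Optional;
import java.util.UUID;

final class SnapshotStore {
    // Copies <playerDataDir>/<uuid>.dat to <snapshotDir>/<uuid>.nbt.
    // Returns the hex SHA-256 of the copy, or empty if the player has no data file yet.
    static Optional<String> shadowCopy(Path playerDataDir, Path snapshotDir, UUID uuid) {
        Path source = playerDataDir.resolve(uuid + ".dat");
        if (!Files.isRegularFile(source)) {
            return Optional.empty();
        }
        try {
            Files.createDirectories(snapshotDir);
            Path target = snapshotDir.resolve(uuid + ".nbt");
            Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
            // Hash the copy, not the source, so the logged digest describes what was stored.
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(Files.readAllBytes(target));
            return Optional.of(HexFormat.of().formatHex(digest));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 unavailable", e);
        }
    }
}
```

Since the pre-login event is asynchronous, file I/O of this kind would stay off the main thread; the caller would log the returned digest alongside the UUID.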

### Restore step (replaces v1's `doTeleport` + 10-tick delay)

1. Read saved location from AuthMe DB (cached from pre-login —
   single in-memory hashmap keyed by UUID, evicted on transit clear).
2. Compute 3x3 chunk grid centred on saved location.
3. `addPluginChunkTicket` on all 9 chunks.
4. `CompletableFuture.allOf(getChunkAtAsyncUrgently x9)` — wait for
   all 9 to actually be loaded, not just the centre one (closes the
   "loaded but neighbour unloaded" race).
5. `teleportAsync(saved, PLUGIN)`. If `false`: F2 retry loop (already
   in v1.1.0, carries over).
6. On success: 5-tick delay, then verify
   `player.getLocation().distance(saved) < 2.0`. If not, treat as a
   silent failure → retry.
7. Release tickets 5s post-success.
8. Mark transition [LIVE], publish `authlimbo_login_success_total`
   metric, write audit-log row, send delayed join-message to main
   world, clear snapshot.

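Steps 2 and 6 are plain arithmetic and can be checked in isolation. A sketch assuming hypothetical helpers `ChunkPos` and `RestoreMath`; chunk coordinates are the block coordinate floored and divided by 16, which matters for negative positions.

```java
import java.util.ArrayList;
import java.util.List;

record ChunkPos(int x, int z) {}

final class RestoreMath {
    // Block coordinate -> chunk coordinate. floorDiv keeps negative coordinates correct,
    // e.g. block -1 lies in chunk -1, not chunk 0.
    static int toChunk(double blockCoord) {
        return Math.floorDiv((int) Math.floor(blockCoord), 16);
    }

    // The 9 chunks of the 3x3 grid centred on the chunk containing (x, z).
    static List<ChunkPos> grid3x3(double x, double z) {
        int cx = toChunk(x);
        int cz = toChunk(z);
        List<ChunkPos> out = new ArrayList<>(9);
        for (int dx = -1; dx <= 1; dx++) {
            for (int dz = -1; dz <= 1; dz++) {
                out.add(new ChunkPos(cx + dx, cz + dz));
            }
        }
        return out;
    }

    // Post-teleport check from step 6: Euclidean distance to the saved spot under 2 blocks.
    static boolean landed(double[] actual, double[] saved) {
        double dx = actual[0] - saved[0];
        double dy = actual[1] - saved[1];
        double dz = actual[2] - saved[2];
        return Math.sqrt(dx * dx + dy * dy + dz * dz) < 2.0;
    }
}
```

The plugin would request a ticket and an async load for each `ChunkPos` in the grid, then run the `landed` check five ticks after the teleport future completes.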

### F8 — drop the SPECTATOR pre-TP trick

v1 considered "set GameMode.SPECTATOR before TP, revert after". v2
does NOT do this — spectator mode has its own client-side render races
on chunk-load and silently swallows damage events that the F1 guard
*needs to see*. Instead: invariant-driven recovery (snapshot + retry +
admin alert) is the safety net. SPECTATOR is the final fallback after
3 failed retries (F6 in v1, kept for v2).

---

## 6. Anti-drama checklist (2b2t lessons)

Codified up-front so future "monetisation" pressure is rejected by
reference, not by argument.

- [x] No pay-to-skip. Tier list above is the entire policy.
- [x] No hidden tier or undocumented bypass (staff bypass is logged).
- [x] No queue spot trading / selling.
- [x] No "queue position visible to others" — your position is only
  visible to you. No social pressure surface.
- [x] Queue is purely FIFO + tier; no algorithm tweaks, no "lottery".
- [x] AGPL-3.0 means anyone can fork and self-host an alt
  gatekeeper if they distrust ours. Operator-friendly.
- [x] Audit log is local-file JSON Lines, not phoned home, not
  centralised. Operator-readable, no hidden telemetry.

---

## 7. Operational surface

### Metrics (Prometheus)

Exposed via embedded HTTP server bound to `127.0.0.1:9091` (loopback
only — Prometheus on nullstone scrapes via localhost):

| Metric | Type | Labels |
|--------|------|--------|
| `authlimbo_connections_total` | counter | `tier`, `outcome={accepted, queued, rejected}` |
| `authlimbo_queue_depth` | gauge | — |
| `authlimbo_login_success_total` | counter | `tier` |
| `authlimbo_login_fail_total` | counter | `reason={timeout, authme_db, tp_failed_3x, ...}` |
| `authlimbo_void_damage_blocked_total` | counter | — |
| `authlimbo_snapshot_restored_total` | counter | — |
| `authlimbo_restore_duration_seconds` | histogram | `tier` |

Trip-wire alerts (configured server-side, in
`prometheus/alerts.yml`, not in the plugin):

- `authlimbo_login_fail_total{reason="tp_failed_3x"}` rate > 0 for 5m.
- `authlimbo_void_damage_blocked_total` rate > 0 for 1m.
- `authlimbo_queue_depth` > 10 for 5m.

### Discord webhooks

Plugin-side webhook fires on:

- Snapshot restored (gear was about to be lost).
- 3x retry give-up (manual `/authlimbo tp` needed).
- Queue depth > config threshold.
- AuthMe DB unreachable.
- Plugin reload / crash.

Webhook URL is in config, redacted from `/authlimbo dump`.

### Audit log

`plugins/AuthLimbo/audit.log` — JSON Lines, one row per state
transition. Fields: `ts`, `uuid`, `name`, `ip`, `tier`, `state`,
`prev_state`, `extra` (free-form JSON). Logrotate-compatible; rotates
at 100MB, keeps 7 files.

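A row in that format could be produced without a JSON library, as long as string fields are escaped and each row stays on one line. The sketch below assumes a hypothetical `AuditRow` helper; `extra` is passed through as already-serialised JSON, which the caller is trusted to have produced correctly.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class AuditRow {
    // Minimal JSON string escaping: quotes, backslashes and control characters.
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"' -> sb.append("\\\"");
                case '\\' -> sb.append("\\\\");
                case '\n' -> sb.append("\\n");
                case '\r' -> sb.append("\\r");
                case '\t' -> sb.append("\\t");
                default -> {
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
                }
            }
        }
        return sb.append('"').toString();
    }

    // Builds one audit line (no trailing newline). Field order follows the list above.
    static String format(String ts, String uuid, String name, String ip,
                         String tier, String state, String prevState, String extraJson) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("ts", ts);
        fields.put("uuid", uuid);
        fields.put("name", name);
        fields.put("ip", ip);
        fields.put("tier", tier);
        fields.put("state", state);
        fields.put("prev_state", prevState);
        StringBuilder sb = new StringBuilder("{");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            sb.append(quote(e.getKey())).append(':').append(quote(e.getValue())).append(',');
        }
        return sb.append("\"extra\":").append(extraJson).append('}').toString();
    }
}
```

Escaping newlines inside values is what keeps the one-row-per-line property intact, which in turn is what makes the file safe to rotate and to process with line-oriented tools.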

### Reload-without-restart

`/authlimbo reload`:

- Re-reads `config.yml`.
- Drains in-flight transits to completion (no new joins accepted
  during drain, max 30s wait).
- Re-binds metrics HTTP server if port changed.
- Re-creates limbo world if name/spawn changed.
- Discord webhook fires "reload completed in Xs".

---

## 8. Failure modes & recovery

| Failure | Detection | Recovery |
|---------|-----------|----------|
| Plugin crashes mid-restore | On startup, scan `state/*.json` files older than 30s. | For each: if player offline, leave snapshot; if online, treat as new transit, force re-restore from saved AuthMe loc. |
| Snapshot file corrupt / unreadable | NBT parse exception. | Fall back to AuthMe DB saved-loc; log SEVERE; webhook. Player may lose newest items but not entire inventory. |
| World save corrupts | Paper `World#getChunkAtAsync` throws. | After 3 retries: kick player with "server experiencing storage issue, try again in 5min"; webhook. |
| AuthMe DB unreachable | JDBC `getConnection` throws / read times out > 5s. | **Fail closed.** Reject connection at gatekeeper with kick: "auth service degraded". Log + webhook. Do NOT let player onto main world without auth. |
| Server `/stop` mid-login window | Paper shutdown hook. | `clearTransit` for all UUIDs, force-save snapshots, kick all limbo players with "server restarting, your gear is safe". |
| Race: AuthMe LoginEvent fires twice (HaHaWTH bug) | UUID already in `pendingTransit` and not in `RESTORE` state. | Idempotent — restore handler is a no-op if UUID is past [PRELOAD]. Log INFO. |
| Player disconnects in [LIMBO] | `PlayerQuitEvent`. | Clear pendingTransit + retry counter. Snapshot retained 7d. State file kept until snapshot GC. |

`fail-open` is never the right choice for an auth gatekeeper. Every
failure mode resolves to either: keep player in limbo, or kick them.
Never advance them to main-world unauth'd.

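The fail-closed rule for the AuthMe DB row can be sketched as a wrapper in which every failure path returns a kick decision. `FailClosedGate` and `GateDecision` are hypothetical names, and `dbProbe` stands in for whatever read the gatekeeper performs; it is an assumption that the probe can be run on a background thread.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

enum GateDecision { PROCEED, KICK_AUTH_DEGRADED }

final class FailClosedGate {
    // Runs the probe with a deadline. Only a clean, timely completion lets the player proceed.
    static GateDecision decide(Runnable dbProbe, long timeoutMillis) {
        try {
            CompletableFuture.runAsync(dbProbe).get(timeoutMillis, TimeUnit.MILLISECONDS);
            return GateDecision.PROCEED;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();      // preserve the interrupt for the caller
            return GateDecision.KICK_AUTH_DEGRADED;
        } catch (ExecutionException | TimeoutException e) {
            // The probe threw or overran the deadline: never fall through to entry.
            return GateDecision.KICK_AUTH_DEGRADED;
        }
    }
}
```

Note that a timed-out probe keeps running in the background in this sketch; a production version would also want to cancel or bound that work. With the threshold from the table the call would be `decide(probe, 5000)`.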

---

## 9. Migration from v1

In-place upgrade path (`v1.1.x` → `v2.0.0`):

1. Stop server.
2. Drop new jar in `plugins/`. v2 jar is not v1-compatible — old
   `AuthLimbo-1.x.jar` must be removed.
3. v2 detects `plugins/AuthLimbo/config.yml` from v1 and rewrites it
   to v2 schema, leaving a `config.v1.bak` backup.
4. v2 detects `auth_limbo` world dir on disk and re-uses it (no
   recreation, no data loss).
5. AuthMe DB schema unchanged — v2 still treats `authme.db` as
   read-only authoritative.
6. New: `plugins/AuthLimbo/snapshots/` and
   `plugins/AuthLimbo/state/` directories created, owned by the same
   uid as the itzg container's runtime user.
7. Start server. v2 startup logs walk through migration steps.

There is no DB migration. No mandatory player action. Permissions
node names change (`authlimbo.admin` is now
`authlimbo.command.admin`, etc.) — operator must update LP groups
(noted in CHANGELOG).

---

## 10. Test plan

### Unit (JUnit 5 + Mockito)

- `LimboWorldManager` — barrier-platform construction is idempotent.
- `AuthMeDatabase.getQuitLocation` — returns `Location` for present row,
  null for absent, null for malformed row.
- `Snapshot.serialize` / `deserialize` round-trip.
- State-machine: every transition rejects from invalid prev-state.

### Integration (Paper test-server harness)

- Stand up Paper 1.21.x in CI (Forgejo Actions runner on nullstone).
- Mock AuthMe via a stub plugin that fires `AuthMeAsyncPreLoginEvent`
  and `LoginEvent` programmatically.
- Test scenarios: §5.1-5.6 from `AUDIT-2026-05-07.md` plus
  v2-specific: queue overflow, snapshot-restore on death,
  reload-without-restart, fail-closed on AuthMe DB down.

### Stress (Bot flood)

- 1000 fake connections in 60s using mineflayer or
  [`MCBotsPro`](https://github.com/Sammy1Am/MCBotsPro). Verify:
  - queue-depth bounded (gatekeeper kicks beyond max-queue-depth);
  - no `pendingTransit` leak (size returns to 0 after);
  - metrics counters consistent with audit log.

### Chaos

- Kill plugin (`/plugman unload AuthLimbo`) mid-restore, verify
  state recovery on rejoin.
- `iptables -A OUTPUT -d <authme-db-host> -j DROP` and verify
  fail-closed.
- `kill -9` itzg container during transit, verify next-startup
  walks `state/*.json` and recovers.

---

## 11. Versioning + release

- v2.0.0 = breaking redesign (this doc), AGPL-3.0 retained.
- v2.1.0 = polish (BossBar UX, /queue command, more metrics).
- v2.2.0 = Velocity-mode behind feature flag.
- v1.x = receives F3, F5, F6, F7 backports until racked.ru cuts over
  to v2; then archived.

Coordinate naming: when the codename migration completes
(onyx→obsidian, nullstone→bedrock per
`gravel-laptop-build/ROADMAP.md`), the racked.ru server moves to
bedrock. v2.0.0 must run on both naming worlds without config drift.

---

## 12. Open questions

- BossBar UI — does the operator want it visible to limbo players, or
  silent? Default proposed: visible.
- Snapshot retention — 7 days is the proposed default. Storage cost
  is ~1 KB/snapshot for vanilla inventories, up to ~50 KB for
  shulker-stuffed players. 100 active players → ~5 MB max.
- Webhook destination — same Discord channel as `s8n-ru` server-status
  alerts, or a new channel? Default proposed: same channel, prefixed
  `[AuthLimbo]`.
- v2.2 Velocity migration — needs a separate design pass once
  cobblestone or a second backend is real.

Sign-off pending operator review.