Compare commits

fix/F1-F2-F4-void-death-guard..main

No commits in common. "fix/F1-F2-F4-void-death-guard" and "main" have entirely different histories.

7 changed files with 13 additions and 1372 deletions

View file

@@ -4,65 +4,6 @@ All notable changes to AuthLimbo are documented here.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and the project follows [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
## [1.1.0] - 2026-05-07
Data-loss bug fix release. Triggered by the YOU500 incident on
`racked.ru` at 2026-05-07 17:13:39 UTC — full inventory void-death during
AuthMe's post-login teleport. See `AUDIT-2026-05-07.md` for the full
forensic trace and `ROADMAP.md` for tracking.
### Added
- **F1 — VOID-damage guard during post-login restore.** New
`EntityDamageEvent` listener at `EventPriority.HIGHEST,
ignoreCancelled=true`. While a player UUID sits in `pendingTransit`
(post-`LoginEvent`, pre-restore-success), `DamageCause.VOID` events are
cancelled, the player is healed to full, and sync-teleported back to
limbo spawn. Console gets a WARN with the player name + intended TP
target. This single guard would have saved YOU500's inventory.
- **F2 — Recovery on `teleportAsync` future == false.** The previous
log-and-abandon branch in `LoginListener.doTeleport` is replaced with a
retry loop. On any failure (false future, exceptional future, or
chunk-load throw): synchronously snap the player back to limbo spawn,
increment a per-UUID retry counter, schedule another `doTeleport` after
30 ticks (~1.5 s). After 3 failures: leave the player at limbo spawn
in `GameMode.SPECTATOR`, log SEVERE with full saved coords + retry
count, and alert console with manual-intervention instructions
(`/authlimbo tp <player>`). Players stay in `pendingTransit` across
retries so F1 keeps protecting them.
- **F4 — Pre-empt AuthMe's broken teleport.** New `LoginEvent` listener
at `EventPriority.LOWEST` (runs BEFORE AuthMe-ReReloaded's own internal
post-login teleport). Action: synchronously TP the player back to
limbo spawn and add their UUID to `pendingTransit`. AuthMe's
subsequent teleport then operates against an irrelevant location, and
our existing MONITOR handler still wins last with the authoritative
restore. Net effect: closes the void-death window even when the saved
chunk is far out and slow to load.
### Internal
- New `Set<UUID> pendingTransit` (ConcurrentHashMap.newKeySet) and
`Map<UUID, Integer> retryCounts` on `LoginListener`. Both are
watchdog-timed-out after 5 s so we never leak entries on edge cases.
- Constants: `MAX_RETRIES=3`, `RETRY_DELAY_TICKS=30`,
`PENDING_TIMEOUT_TICKS=100`.
- No new Maven dependencies. No new public API.
### Privacy
- Limbo-on-join invariant unchanged. F4 actually *strengthens* it by
guaranteeing the limbo position is reasserted at LOGIN-LOWEST.
### Test plan (reproduces YOU500 in dev Paper 1.21.x + AuthMe-ReReloaded fork b49)
- Set saved coord far out (e.g. X=10000, Z=10000) in `authme.db` for a
test account so the chunk is unloaded at login. Restart server. Login.
Expect: F4 sync-TPs to limbo spawn first; F2 retries on false future;
F1 catches any VOID damage during transit; player ends up at saved
coords with full inventory.
- Set saved Y above world build limit (e.g. 5000). Login. Expect: F2
recovery branch retries up to 3 times, then drops the player into
spectator at limbo spawn with admin alert.
- Trigger a synthetic VOID damage during transit (debug command).
Expect: F1 cancels the damage, snaps player back to limbo spawn at
full health, restore continues.
## [1.0.0] - 2026-04-30
Initial public release.

View file

@@ -1,75 +0,0 @@
# Research: 2b2t Queue / Login Gatekeeper
Read-only reference for AuthLimbo v2 design. Last updated 2026-05-07.
## TL;DR
- **Architecture**: BungeeCord-style proxy plus a separate "queue server" (a stripped-down Minecraft instance acting as a holding world); the main Paper server is gated behind it.
- **Drain model**: Slow FIFO with a small reserved pool for paid priority — pacing is what protects main from join-flood crashes more than any explicit packet shaper.
- **Drama**: Almost every controversy (paid priority, veteran-queue removal, prio-strip ban waves) is policy-layer, not technical. Avoid the policies; copy the architecture.
## 1. Architecture
- Two-tier: **Velocity/Bungee proxy** -> **queue server** (limbo holding JVM) -> **main Paper server**. Queue is its own process, not a plugin on main.
- Public clones use the same shape: `PistonQueue` (Bungee+Velocity, v4.0.0 Apr 2026, most production-grade), `AnarchyQueue` (Velocity, pairs with `QueueServerPlugin` on the limbo instance), `LeeesBungeeQueue` (archived 2025-04-28, 1.12.2 cap).
- Queue state is **in-memory** on the proxy; clones don't persist across restart. Disconnect = back of line.
## 2. Queue Mechanics
- Pure FIFO inside each tier. Tiers historically: priority -> veteran -> regular. Today: priority -> regular.
- Slot allocation: ~200 reserved slots for priority on ~1000-cap main; regular advances only when a non-reserved slot frees.
- Drain rate is wall-clock, not packet-throttled — 1000-deep regular queue = 6-12h.
- ETA = naive `position * avg_drain`. Wrong because priority steals slots from above; ETA can go *up*.
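The slot mechanics above reduce to a few lines. A toy Java model for reference; `CAP`, `RESERVED`, and every name here are illustrative assumptions, not 2b2t's actual code:
```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.UUID;

/** Toy model of the two-tier reserved-slot drain sketched in this section. */
final class QueueDrain {
    static final int CAP = 1000;     // assumed main-server cap
    static final int RESERVED = 200; // slots only priority may fill

    final Deque<UUID> priority = new ArrayDeque<>();
    final Deque<UUID> regular = new ArrayDeque<>();

    /** Called whenever a slot frees on main; returns the next player to admit. */
    UUID nextToAdmit(int onlineNow) {
        if (onlineNow >= CAP) return null;
        if (!priority.isEmpty()) return priority.poll(); // priority may use any slot
        // Regular advances only into non-reserved slots; this is why a regular
        // queue can exist even when main isn't "full".
        if (onlineNow < CAP - RESERVED && !regular.isEmpty()) return regular.poll();
        return null;
    }

    /** The naive ETA, wrong exactly as noted above: it ignores priority
     *  joins that will steal slots before you reach the front. */
    long naiveEtaSeconds(int position, double avgDrainPerSec) {
        return avgDrainPerSec <= 0 ? Long.MAX_VALUE : (long) (position / avgDrainPerSec);
    }
}
```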
## 3. AFK + Reconnect
- 2016 queue: reconnect every ~30s, drove hacked-client adoption. Replaced within a year by limbo-queue with auto-updating position.
- Main: 15-min idle disconnect. Queue: long-lived TCP; drop = position lost. `2bored2wait` (archived) proxies queue locally for headless waiting.
## 4. Priority Queue
- Separate FIFO + reserved slot pool. Tier check = permission/uuid lookup on join.
- Pricing: $19.99/mo originally, now $29/mo via 2b2t.shop.
- TheCampingRusher held add/remove power on priority + veteran lists; Torogadude incident.
- Reserved-slot design means a queue can exist even when main isn't full — structurally pay-to-skip.
## 5. Chunk-Load / Crash Mitigation
- Queue server runs near-empty world; no chunk gen, minimal ticks, absorbs thousands of idle TCP sessions cheaply.
- Pacing the drain protects main's chunk pipeline; no explicit login-packet shaper beyond letting `PlayerJoinEvent` finish before pulling next.
- **Nocom (Jul 2018 - Jul 2021)**: an un-rate-limited `CPacketPlayerDigging` flood against the queue server starved keepalives, forced mass disconnects, skipped the queue. Hausemaster: 500 pkt/s late-2019; factor-14 May 2020; factor-8 next day; factor-2 Jul 2021; full patch 2021-07-15. Leijurv's Monte Carlo particle-filter tracker (2020-2021) kept working at 2 checks/s.
## 6. Veteran Tier
- Whitelist: `joined_before = 2016-06-01`, offline lookup against historical login data.
- Removed **2017-12-04** explicitly to "increase incentive to buy priority". Trust burned.
## 7. Bot Ecosystem
- Mineflayer / headless clients sit in queue 24/7 — indistinguishable from a human leaving client running.
- Detection: behavior only (instant logout on join, scripted movement). "Good" bot = afk-for-owner; "exploit" bot = multi-account prio-skip or queue-bypass client.
- For AuthLimbo v2: AFK bots in pre-auth limbo cost ~nothing. Gate at promote-to-main, not join-limbo.
## 8. Failure Modes
- Nocom-era queue crashes dropped 1000+ waiting players.
- "Ghost queue" — players queued but TCP dead — caused by keepalive starvation, fixed by rate limits.
- Recovery: full restart loses all positions. No persisted state.
## 9. Public Clones — Survey
- **PistonQueue** — Bungee+Velocity, reserved slots, shadow-ban, pre-queue auth, active.
- **AnarchyQueue** — Velocity, minimal, needs `QueueServerPlugin` companion.
- **LeeesBungeeQueue** — archived 2025.
- **Shirodo-Queue**, **eslym/bungee-queue** — toy reimplementations.
- Common mistakes: in-memory only, no priority-abuse audit log, no rate-limit on queue's own packet handlers (re-creates Nocom-class risk).
## 10. Drama Timeline
- **2016-06** Rusher influx; queue introduced.
- **2016-2017** Rusher holds add/remove power on priority + veteran lists.
- **2017-12-04** Veteran queue removed. Mass quits.
- **2018-07 / 2021-07** Nocom queue-bypass exploit + tracking.
- **2022-04** ~40 prio-stripped + banned over a doxxing chain.
- **2022-12-07** 500+ accounts prio-banned cumulatively; `2builders12rules` discord forms to track strips.
## Drama-Avoidance Principles for AuthLimbo v2
1. **No paid priority. Ever.** FIFO only; no money-tied reserved slots.
2. **No hidden-criteria veteran tier.** If seniority exists, rule is public, automated, irrevocable.
3. **No staff add/remove of queue position.** Admin commands log to append-only audit; no silent privilege.
4. **Persist queue state.** Position survives proxy restart (sqlite/redis).
5. **Rate-limit every packet handler in limbo.** Nocom is the canonical lesson.
6. **Honest ETA or no ETA.** Position only, or confidence interval — no fake countdowns.
7. **Privacy-first limbo (AuthLimbo thesis):** new joiners isolated from main-world coords/inventory until AuthMe login completes.
8. **Bots welcome in limbo, gated at promote.** Don't fight Mineflayer pre-auth.
9. **Open source the gatekeeper.** Hausemaster's plugin is closed; opacity amplifies drama.
10. **Document idle/disconnect rules in-game.** No silent kicks.

View file

@@ -1,165 +0,0 @@
# RESEARCH — Limbo / Queue / Auth Plugin Survey
Read-only research feeding **AuthLimbo v2**. 2026-05-07.
---
## 1. TL;DR
**Top-3 STEAL** (vendor / shade / depend):
1. **Elytrium LimboAPI** (AGPL-3.0, Velocity) — virtual fake-server
primitives at the Velocity packet layer. License-compatible, exactly
the abstraction we need for "hold pre-login on the proxy, never let
the player touch the Paper world".
2. **Elytrium LimboAuth** (AGPL-3.0, Velocity) — production auth flow
built on LimboAPI. AuthMe-import path, BCrypt+TOTP, weak-password
list. We can fork or depend; AGPL == AGPL.
3. **PistonQueue** (Apache-2.0, Bungee+Velocity+Bukkit) — closest
open-source 2b2t-style queue, actively maintained, permissive
license (we can shade safely into AGPL).
**Top-3 PATTERN** (read & re-implement):
1. **AnarchyQueue (zeroBzeroT)** — clean Velocity/Paper split, separate
queue-server, position-update cadence; small enough to read
end-to-end.
2. **LeeesVelocityQueue** — minimal MIT priority/bypass model; good
reference for *non-paid* trust-tier permissions.
3. **LimboFilter** — anti-bot CAPTCHA + packet-prep tricks; pattern
only since AGPL fork would entangle us further.
**Stack decision:** **Velocity + Paper, both required.** Pre-auth
holding belongs at the proxy (LimboAPI virtual server) — Paper-only
can't truly hide the world. Paper plugin keeps the post-auth
chunk-preload + void-guard from current AuthLimbo. See §3.
---
## 2. Per-plugin detail
| Plugin | License | Stack | Last release | Status | Rating |
|---|---|---|---|---|---|
| Elytrium LimboAPI | AGPL-3.0 | Velocity | 1.1.26 (2024-09) | Active, slowing | STEAL |
| Elytrium LimboAuth | AGPL-3.0 | Velocity (LimboAPI) | 1.1.14 (2024-06) | Active | STEAL |
| Elytrium LimboFilter | AGPL-3.0 | Velocity (LimboAPI) | 1.1.18 (2024-06) | Active | PATTERN |
| PistonQueue (AlexProgrammerDE) | Apache-2.0 | Velocity+Bungee+Bukkit | 4.0.0 (2026-04) | Very active | STEAL |
| AnarchyQueue (zeroBzeroT) | custom permissive (no-warranty) | Velocity | 3.0.13 (2025-10) | Active | PATTERN |
| LeeesVelocityQueue | MIT | Velocity | 1.0.1 (2025-07) | Light, alive | PATTERN |
| ajQueue | GPL-3.0-only | Velocity+Bungee+Paper | active 2.x | Active | PATTERN (GPL → AGPL is one-way OK) |
| McMackety/velocity-queue | GPL-3.0 | Velocity (Kotlin) | 1.1.2 (2021-06) | **Archived** | SKIP |
| Shirodo-Queue | MIT | Bungee | none | Hobby | SKIP |
| ProjectPersistence/queue | n/a | mixed | n/a | **404** | SKIP |
| NanoLimbo (Nan1t) | GPL-3.0 | standalone+proxy fwd | 1.12.0 (2026-04) | Active | PATTERN (no auth/queue, but reference impl) |
| NanoLimboPlugin (bivashy) | GPL-3.0 | Velocity+Bungee | 1.8.1 (2024-06) | Maintenance | PATTERN |
| AuthMe-Reloaded | GPL-3.0 | Spigot/Paper/Folia/Bungee/Velocity | 5.7.0 (2026-04) | Active | KEEP (current dep, not a v2 base) |
| kennytv/Maintenance | GPL-3.0 | Paper/Bungee/Velocity/Sponge | active | Active | PATTERN (motd + whitelist gate UX) |
| EaglerProxy | n/a | JS shim | active | Off-target | SKIP — not our threat model |
| TitanProxy | closed-source | n/a | n/a | n/a | SKIP |
Notes:
- **NanoLimbo ≠ NanoLimboPlugin.** Former is a standalone Netty
server; latter wraps it as a proxy plugin. Neither does auth.
- **ProjectPersistence/queue** URL 404'd; treat as dead.
- **McMackety/velocity-queue** archived 2021-08; Kotlin code is
readable but do not depend.
---
## 3. Recommended architecture for AuthLimbo v2
```
client ──► Velocity proxy ──► [LimboAPI virtual server: auth + queue]
                                   │
                                   ▼  (only after auth+queue cleared)
                              Paper backend ──► [auth-limbo Paper plugin:
                                                 chunk-preload, void-guard,
                                                 inventory snapshot]
```
### Velocity side (new module `auth-limbo-velocity`)
- **Depend:** `com.velocitypowered:velocity-api:3.4.x`
- **Depend (compileOnly+shade):** `net.elytrium:limboapi:1.1.26`
(AGPL — fine, we are AGPL).
- **Vendor / fork:** parts of `LimboAuth` for the auth state-machine
(BCrypt verify against AuthMe schema, TOTP, weak-password list). Do
not pull the H2/MySQL stack — read AuthMe's existing SQLite directly
to keep one source of truth.
- **Queue logic:** port PistonQueue's `QueueListener` + position
ticker (Apache-2.0 → AGPL is a clean re-license). Strip its paid
tiers; replace with permission-based trust tiers
(`authlimbo.priority.trusted`, `.regular`, no `.donor`).
- **Anti-bot:** PATTERN from LimboFilter — client-brand check + join
rate-limit; skip the CAPTCHA for now (UX cost too high for a
small server).
### Paper side (existing `auth-limbo` plugin, becomes `auth-limbo-paper`)
- Keep current chunk-preload + void-world generator.
- Land ROADMAP F1 (void-damage guard), F2 (TP retry), F3 (3×3
preload), F5 (inventory snapshot) — these are *post-auth* defences
and remain Paper-side.
- Drop responsibility for "hide world pre-auth" — Velocity holds it
now.
### Shared
- Plugin-message channel `authlimbo:handshake` carries `{uuid,
trust-tier, reconnect-token}` Velocity → Paper so the Paper side
knows the player already passed auth+queue and skips its own login
gate.
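A minimal Paper-side receiver for that channel, as a sketch. The payload layout (three UTF strings: uuid, trust-tier, reconnect-token) and the class name are assumptions; the actual wire format is still a design decision:
```java
import com.google.common.io.ByteArrayDataInput;
import com.google.common.io.ByteStreams;
import org.bukkit.entity.Player;
import org.bukkit.plugin.java.JavaPlugin;
import org.bukkit.plugin.messaging.PluginMessageListener;

public final class HandshakeListener implements PluginMessageListener {

    public static void register(JavaPlugin plugin) {
        plugin.getServer().getMessenger().registerIncomingPluginChannel(
                plugin, "authlimbo:handshake", new HandshakeListener());
    }

    @Override
    public void onPluginMessageReceived(String channel, Player carrier, byte[] message) {
        if (!"authlimbo:handshake".equals(channel)) return;
        ByteArrayDataInput in = ByteStreams.newDataInput(message);
        String uuid = in.readUTF();           // who passed auth+queue on the proxy
        String trustTier = in.readUTF();      // staff / returning / new
        String reconnectToken = in.readUTF(); // verification is out of scope here
        // Mark this UUID as pre-authed so the Paper-side login gate is skipped.
    }
}
```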
### Maven coords
`net.elytrium:limboapi-api:1.1.26` (AGPL, compileOnly),
`com.velocitypowered:velocity-api:3.4.0` (MIT),
`AlexProgrammerDE/PistonQueue:4.0.0` (Apache-2.0, study),
`io.papermc.paper:paper-api:1.21.11-R0.1` (GPL-3.0, compileOnly).
---
## 4. License compatibility matrix
Outbound: AuthLimbo v2 = **AGPL-3.0**. Inbound combinations:
| Source license | Compatible direction | Action |
|---|---|---|
| AGPL-3.0 (LimboAPI/Auth/Filter) | bidirectional | depend or shade freely |
| GPL-3.0 (NanoLimbo, ajQueue, AuthMe, Maintenance) | one-way (GPL → AGPL ok) | depend; cannot upstream patches without coordination |
| Apache-2.0 (PistonQueue) | one-way (permissive → AGPL) | shade or copy with NOTICE |
| MIT (LeeesVelocityQueue, Shirodo) | one-way | shade or copy with attribution |
| Custom no-warranty (AnarchyQueue) | unclear | **read code, do not vendor**; re-implement |
| Closed (TitanProxy, EaglerProxy logic) | n/a | skip |
AGPL §13 invariant: if we ship a network service modified from
LimboAuth, source must be offered. Forgejo `git.s8n.ru` already
satisfies this for our fleet.
---
## 5. Risks
1. **Elytrium upstream slowdown** — last release mid-2024. Pin to
tag, plan soft-fork at git.s8n.ru for 1.21.11+ protocol fixes.
2. **AGPL §13** — modified network deploys need source-link. Footer
+ `/authlimbo source` covers it.
3. **PistonQueue size** — selective copy beats shading whole jar.
4. **AnarchyQueue licence ambiguity** — no-warranty header not OSI;
read-only.
5. **Velocity↔Paper handshake** is a new failure mode; need
integration test before deploy.
6. **No CAPTCHA** = bot-flood exposure. Acceptable for small private
server; revisit if we open up.
7. **Reconnect token storage** (SQLite vs in-memory) still pending.
---
## 6. Sources
Elytrium/{LimboAPI,LimboAuth,LimboFilter}, Nan1t/NanoLimbo,
bivashy/NanoLimboPlugin, AlexProgrammerDE/PistonQueue,
zeroBzeroT/AnarchyQueue, XeraPlugins/LeeesVelocityQueue,
McMackety/velocity-queue (archived), ShirodoBurak/Shirodo-Queue,
AuthMe/AuthMeReloaded, kennytv/Maintenance, modrinth/ajqueue.

View file

@@ -1,484 +0,0 @@
# AuthLimbo v2 — Architecture
Status: **Design draft** (no code). Drafted 2026-05-07 by the auth-limbo
v2 design pass after the YOU500 / second-player void-death incidents.
Audience: operator (P) and future contributors.
Companion docs:
- [`AUDIT-2026-05-07.md`](../AUDIT-2026-05-07.md) — root-cause forensic.
- [`ROADMAP.md`](../ROADMAP.md) — v1.x tracking (F1-F7).
- [`V2-ROADMAP.md`](V2-ROADMAP.md) — milestones M0-M5 for v2.
---
## 1. Why v2
v1 is a single-jar Paper plugin glued onto AuthMe. It works *most* of
the time, but its core failure modes are now well-understood and can't
be patched away inside the v1 design:
| v1 limitation | v2 must address |
|---------------|------------------|
| Player object exists on the main server *before* auth — coords/inventory technically restorable from RAM by buggy plugins, world chunk activity is observable. | Strong isolation: limbo is the only state the player can touch pre-auth. |
| Restore relies on AuthMe firing `LoginEvent`. AuthMe's own broken teleport runs in the same window — F4 pre-empts it but the design still races. | Authoritative state machine that doesn't trust AuthMe's teleport at all. |
| Inventory loss on transit-death depends on F1 + F5 holding. There is no inventory-of-record outside live game state. | Snapshot-on-pre-login + snapshot-restore is a first-class subsystem, not a defensive add-on. |
| No metrics, no audit log, no admin alerting. Bugs only surface when a player loses gear. | Built-in observability: Prometheus + JSON-Lines audit + Discord webhook. |
| No queue / login-throttle. If 50 bots connect at once, AuthMe stalls. | Bounded concurrency with transparent FIFO and trust tiers (NOT pay tiers). |
v2 is a clean break (`v2.0.0`), not a v1 patch. v1 stays receiving F3,
F5, F6, F7 backports for as long as racked.ru still runs the old jar.
---
## 2. Stack decision — **Paper-only**, with a Velocity-ready seam
**Recommendation: Paper-only single-server plugin for v2.0.0.**
Velocity-mode is a v2.x deferrable behind a feature flag.
### Reasoning
racked.ru today is one Purpur 1.21.11 server in `minecraft-mc` itzg
container on nullstone. There is no Velocity / BungeeCord, no second
backend, no Forced Hosts, no proxy network. Adding Velocity to ship a
gatekeeper plugin would mean:
- standing up a new container, opening a new public port (or keeping
25565 on the proxy and 25566 internal),
- migrating the 12+ existing Paper plugins through the velocity-paper
bridge contract for chat / commands / placeholders,
- new TLS / RCON / proxy-protocol surface to harden,
- breaking changes to AuthMe's data flow (proxy-side login flow vs
paper-side `AuthMeAsyncPreLoginEvent`),
- one more thing for the operator to babysit.
The privacy property the operator cares about — *no other player sees
pre-auth coords / inventory* — is achievable on Paper-only via a
strictly isolated limbo world + audience scoping (see §4). Velocity adds
*stronger* isolation (player never reaches the backend at all) but the
incremental privacy gain is small for a 0-10 player community, and the
operational cost is large.
### When Velocity becomes worth it
Codify trip-wires up front so the decision isn't dragged out:
1. racked.ru splits into ≥2 backends (e.g. `survival` + `creative`) —
you need a proxy anyway.
2. cobblestone server comes online and shares an account/auth pool.
3. Botting attempts cross 100 connections / minute and `connection-throttle` +
`firewalld rate-limit` are no longer enough. Velocity + a queue
plugin (ajQueue / VelocityQueue) become operationally cheaper than
chasing botnets at the application layer.
Until any of those, Paper-only is the right answer.
### The Velocity-ready seam
v2 internal API is split into two layers so the proxy migration is
mechanical:
```
+-------------------------------+ +-------------------------------+
| Gatekeeper (proxy or paper) | | Restore (paper only) |
| - accept connection | | - read snapshot |
| - check ban / rate limit | | - chunk preload |
| - hold in limbo / queue | | - authoritative TP |
| - hand off on auth-success | | - publish metrics |
+--------------+----------------+ +-------------------------------+
               | hand-off event (UUID, target Location, source IP)
               v
```
In v2.0 both layers live in the Paper plugin and the hand-off is just a
local method call. In a future "v2-velo" both layers split: gatekeeper
runs as a Velocity plugin, restore stays on Paper, hand-off becomes a
plugin-message channel. No code outside those two layers needs to
change.
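The seam as plain Java, a sketch with illustrative names; the accept/queue/reject decision set mirrors the gatekeeper design in the companion roadmap:
```java
import java.net.InetAddress;
import java.util.UUID;
import org.bukkit.Location;

/** Gatekeeper half: proxy- or paper-hosted. */
interface Gatekeeper {
    enum Decision { ACCEPT, QUEUE, REJECT }
    Decision accept(UUID uuid, InetAddress sourceIp);
}

/** Restore half: always Paper-hosted. In v2.0 handOff() is a direct
 *  method call; in v2-velo it becomes a plugin-message send carrying
 *  the same three fields. */
interface Restore {
    void handOff(UUID uuid, Location target, InetAddress sourceIp);
}
```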
---
## 3. Queue model — login-throttle + transparent trust tiers, NO 2b2t-style sale
**For 0-10 player normal load: queue depth is always 0 and players
never see "queued" UI. The queue exists for crisis scenarios (bot
flood, restart drain, AuthMe DB stall) and to define explicit policy
even if it's rarely hit.**
### Policy
| Tier | Definition | Effect |
|------|-----------|--------|
| `staff` | Player has `authlimbo.queue.priority.staff` permission (LP-managed). | Always passes. Bypasses queue entirely. |
| `returning` | Player is in AuthMe DB AND has logged in within last 30 days. | Default tier for everyone who isn't new. Normal FIFO ordering by connect-time. |
| `new` | Player is NOT in AuthMe DB OR last seen >30 days ago. | Same FIFO as `returning` BUT with a per-IP 1/minute throttle. Stops bot-floods. |
| `flagged` | Player IP matches a Pi-hole/CrowdSec/abuse-DB block. | Rejected at gatekeeper, never enters the queue. |
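The table above as code, with the data sources stubbed behind assumed interfaces (LuckPerms node check, AuthMe last-seen, IP blocklist). Precedence between `flagged` and `staff` is not specified by the table, so checking `flagged` first is an assumption in the fail-closed spirit of §8:
```java
import java.time.Duration;
import java.time.Instant;
import java.util.UUID;

final class TierResolver {
    enum Tier { STAFF, RETURNING, NEW, FLAGGED }

    interface Sources {
        boolean hasStaffPermission(UUID uuid); // authlimbo.queue.priority.staff via LP
        Instant lastSeenOrNull(UUID uuid);     // null if not in AuthMe DB
        boolean ipBlocked(String ip);          // Pi-hole / CrowdSec / abuse-DB
    }

    private static final Duration RETURNING_WINDOW = Duration.ofDays(30);

    Tier resolve(UUID uuid, String ip, Sources src) {
        if (src.ipBlocked(ip)) return Tier.FLAGGED;          // rejected pre-queue
        if (src.hasStaffPermission(uuid)) return Tier.STAFF; // bypasses queue
        Instant last = src.lastSeenOrNull(uuid);
        boolean returning = last != null
                && Duration.between(last, Instant.now()).compareTo(RETURNING_WINDOW) <= 0;
        return returning ? Tier.RETURNING : Tier.NEW;        // NEW gets per-IP throttle
    }
}
```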
Hard rules — written into `V2-ARCHITECTURE.md` so they outlive any one
operator's mood:
1. **No paid priority. Ever.** No "priority queue pass", no
"supporter rank skip", no Patreon tier. The 2b2t community
collapsed under that grift; we don't repeat it.
2. **No hidden veteran tier.** Every tier is documented in this file
and in `/authlimbo queue policy` in-game. If a player can't see why
they're in tier X, the tier is illegitimate.
3. **No in-game bidding / griefing for queue spots.** Queue position
is purely connect-time + tier; no player action affects it.
4. **Ops-staff bypass is logged.** Every staff bypass writes a JSON-L
audit row.
### Capacity
- `gatekeeper.max-concurrent-auth: 5` — at most 5 players in the
pre-auth limbo at once. Defaults sized for racked.ru. AuthMe DB
reads + chunk pins per concurrent player are roughly free, but bound
it anyway.
- `gatekeeper.max-queue-depth: 50` — beyond 50 waiting, new
connections get a "server is starting up, try again in 30s" kick.
Better UX than a 5-minute black-screen wait.
- `gatekeeper.queue-timeout-seconds: 120` — anyone in the queue >2
minutes gets the same kick + a Discord webhook fires.
### What queue UX looks like
In limbo, a `BossBar` (Adventure API) shows tier + position:
```
[returning] Queue position: 3 / 7 ETA: ~15s
```
When position == 0 and AuthMe accepts, the bar disappears. There's no
hidden state. `/queue` in-chat re-displays the same info.
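A sketch of that bar with Paper's bundled Adventure API; class and method names here are illustrative:
```java
import net.kyori.adventure.bossbar.BossBar;
import net.kyori.adventure.text.Component;
import org.bukkit.entity.Player;

final class QueueBar {
    private final BossBar bar = BossBar.bossBar(
            Component.text("connecting..."), 1.0f,
            BossBar.Color.WHITE, BossBar.Overlay.PROGRESS);

    void show(Player p) { p.showBossBar(bar); }
    void hide(Player p) { p.hideBossBar(bar); } // position == 0 and AuthMe accepted

    /** Called once per second by the queue ticker. */
    void update(String tier, int position, int total, long etaSeconds) {
        bar.name(Component.text("[" + tier + "] Queue position: " + position
                + " / " + total + "   ETA: ~" + etaSeconds + "s"));
        bar.progress(total == 0 ? 1.0f : 1.0f - (float) position / total);
    }
}
```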
---
## 4. Privacy isolation
This is the original feature; v2 must not regress it.
### Limbo world
- Separate Bukkit world `auth_limbo`, `Environment.THE_END`,
`VoidGenerator`. Same as v1.
- `keepSpawnInMemory=true`. Game-rules: no daylight, no weather, no
mobs, no fire-tick, no PvP, `doImmediateRespawn=true`,
`keepInventory=true` (defence-in-depth — limbo never *should* see a
death event but if it does, no item drops happen).
- Per-player view-distance forced to 2 in limbo via Paper's
`Player#setViewDistance`. They see 5x5 chunks, all empty.
- Limbo platform: 5x5 of `BARRIER` blocks at y=127, single block of
`BARRIER` ceiling at y=129 to prevent flying out. y=0..126 and
y=130+ are pure void.
### Adventure-API audience scoping
`PlayerChatEvent` listener at `EventPriority.HIGHEST`:
- If sender is in main worlds, recipient list is filtered: anyone
whose `World#getName().equals("auth_limbo")` is dropped. Pre-auth
players never see overworld chat.
- If sender is in limbo (would normally not chat — AuthMe blocks it
— but defence in depth), recipient list is set to *only* the
sender. They cannot leak messages to the main world.
- `PlayerJoinEvent` join messages are suppressed for
`auth_limbo`-spawn joins. Main world only sees a join announcement
*after* the authoritative restore TP succeeds (M2 §"join-message
shifting" below).
### Tablist scoping
Hook `PaperPlayerListEntryEvent` (or fall back to
`PlayerJoinEvent` + `Player#hidePlayer`):
- Limbo players are hidden from main-world tablist.
- Main-world players are hidden from limbo tablist.
- Limbo players cannot see each other (each limbo player sees only
themselves).
### What main world observers can detect
After scoping:
- They cannot see the player's name in tablist pre-auth.
- They cannot see chat from the player.
- They cannot see the player's world or coordinates (AuthMe blocks
movement output anyway, but we don't rely on it).
- They CAN see the connection event in server logs (operator-only).
- They can see "PLAYER joined the game" only AFTER restore succeeds
— join message is shifted to fire on restore-success, not on
initial connect.
This matches the v1 privacy posture and tightens the join-message
leak.
---
## 5. Login flow — explicit state machine
```
[CONNECT] ---throttle ok---> [GATE]
    |                          |
    | failed throttle / ban    |
    v                          v
[REJECTED]                [SNAPSHOT]  <-- read AuthMe DB, dump current
                               |          inventory + xp + loc to
                               |          plugins/AuthLimbo/snapshots/<uuid>.nbt
                               v
                            [LIMBO]
                               |
                        AuthMe /login ok
                               |
                               v
                           [PRELOAD]  <-- 3x3 chunk pin around target
                               |
                               v
                           [RESTORE]  <-- teleportAsync, retry up to 3
                               |
                         +-----+-----+
                         |           |
                      success     fail x3
                         |           |
                         v           v
                      [LIVE]   [SPECTATOR-AT-LIMBO + admin alert]
```
Each transition has:
1. **Trigger event** (e.g. `LoginEvent` MONITOR).
2. **Pre-conditions** (e.g. UUID in `pendingTransit`).
3. **Side-effects** (e.g. metric counter, audit-log row).
4. **Failure handler** (next state on error).
States persist in `plugins/AuthLimbo/state/<uuid>.json` so a plugin
crash mid-flow can resume on rejoin. State file is deleted on
[LIVE] entry.
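The state machine as data, a sketch. The transition table is one plausible reading of the diagram (RESTORE looping to itself on retry; REJECTED, LIVE, and SPECTATOR_FAIL terminal), not an authoritative spec:
```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

enum LoginState {
    CONNECT, GATE, REJECTED, SNAPSHOT, LIMBO, PRELOAD, RESTORE, LIVE, SPECTATOR_FAIL;

    private static final Map<LoginState, Set<LoginState>> NEXT = new EnumMap<>(Map.of(
            CONNECT,  EnumSet.of(GATE, REJECTED),
            GATE,     EnumSet.of(SNAPSHOT, REJECTED),
            SNAPSHOT, EnumSet.of(LIMBO),
            LIMBO,    EnumSet.of(PRELOAD),
            PRELOAD,  EnumSet.of(RESTORE),
            RESTORE,  EnumSet.of(RESTORE, LIVE, SPECTATOR_FAIL))); // retry loops back

    /** M0's "every transition rejects from an invalid prev-state" reduces to this. */
    boolean canMoveTo(LoginState to) {
        return NEXT.getOrDefault(this, EnumSet.noneOf(LoginState.class)).contains(to);
    }
}
```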
### Snapshot subsystem
**This is the operator-bug-survives-everything layer.**
- On `AuthMeAsyncPreLoginEvent` (player just connected, NOT yet
auth'd): if a player file `world/playerdata/<uuid>.dat` exists,
read it and shadow-copy to `plugins/AuthLimbo/snapshots/<uuid>.nbt`
with timestamp. SHA-256 of file content is logged.
- `/authlimbo restore <player>` can roll back any restore by
feeding the snapshot through nbtlib (same as the void-death recovery
protocol from `feedback_mc_tp_safety.md`).
- Snapshots retained 7 days, then GC'd. Configurable.
- On `PlayerDeathEvent` while UUID in `pendingTransit`:
`keepInventory=true`, `event.getDrops().clear()`, log SEVERE,
trigger Discord webhook, schedule restore-from-snapshot on respawn.
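A minimal sketch of the shadow-copy step with the SHA-256 log line; filenames follow the `<uuid>-<timestamp>.nbt` pattern from the roadmap, and error handling is reduced to thrown exceptions:
```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;
import java.util.logging.Logger;

final class SnapshotWriter {
    /** Copies world/playerdata/<uuid>.dat into the snapshot dir, logging its hash. */
    static Path snapshot(Path playerData, Path snapshotDir, String uuid, Logger log)
            throws IOException, NoSuchAlgorithmException {
        byte[] content = Files.readAllBytes(playerData);
        String sha256 = HexFormat.of().formatHex(
                MessageDigest.getInstance("SHA-256").digest(content));
        Files.createDirectories(snapshotDir);
        Path out = snapshotDir.resolve(uuid + "-" + System.currentTimeMillis() + ".nbt");
        Files.write(out, content); // write the bytes we hashed, not a second read
        log.info("[AuthLimbo] snapshot " + out.getFileName() + " sha256=" + sha256);
        return out;
    }
}
```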
### Restore step (replaces v1's `doTeleport` + 10-tick delay)
1. Read saved location from AuthMe DB (cached from pre-login —
single in-memory hashmap keyed by UUID, evicted on transit clear).
2. Compute 3x3 chunk grid centred on saved location.
3. `addPluginChunkTicket` on all 9 chunks.
4. `CompletableFuture.allOf(getChunkAtAsyncUrgently x9)` — wait for
all 9 to actually be loaded, not just the centre one (closes the
"loaded but neighbour unloaded" race).
5. `teleportAsync(saved, PLUGIN)`. If `false`: F2 retry loop (already
in v1.1.0, carries over).
6. On success: 5-tick delay, then verify
`player.getLocation().distance(saved) < 2.0`. If not, treat as a
silent failure → retry.
7. Release tickets 5s post-success.
8. Mark transition [LIVE], publish `auth_login_success_total`
metric, write audit-log row, send delayed join-message to main
world, clear snapshot.
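Steps 2-6 condensed into a sketch against Paper's chunk-ticket and async-teleport APIs. `onFailure` stands in for the F2 retry loop; ticket release (step 7) and the [LIVE] bookkeeping (step 8) are omitted:
```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import org.bukkit.Bukkit;
import org.bukkit.Location;
import org.bukkit.World;
import org.bukkit.entity.Player;
import org.bukkit.event.player.PlayerTeleportEvent;
import org.bukkit.plugin.Plugin;

final class RestoreStep {
    static void restore(Plugin plugin, Player player, Location saved, Runnable onFailure) {
        World world = saved.getWorld();
        int cx = saved.getBlockX() >> 4, cz = saved.getBlockZ() >> 4;
        List<CompletableFuture<?>> loads = new ArrayList<>();
        for (int dx = -1; dx <= 1; dx++)           // 3x3 grid centred on target
            for (int dz = -1; dz <= 1; dz++) {
                world.addPluginChunkTicket(cx + dx, cz + dz, plugin);
                loads.add(world.getChunkAtAsyncUrgently(cx + dx, cz + dz));
            }
        CompletableFuture.allOf(loads.toArray(CompletableFuture[]::new))
            .thenCompose(v -> player.teleportAsync(saved,
                    PlayerTeleportEvent.TeleportCause.PLUGIN))
            .thenAccept(ok -> {
                if (!ok) { onFailure.run(); return; }
                // Step 6: verify 5 ticks later; a true future can still be a silent fail.
                Bukkit.getScheduler().runTaskLater(plugin, () -> {
                    if (player.getLocation().distance(saved) >= 2.0) onFailure.run();
                }, 5L);
            })
            .exceptionally(ex -> { onFailure.run(); return null; });
    }
}
```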
### F8 — drop the SPECTATOR pre-TP trick
v1 considered "set GameMode.SPECTATOR before TP, revert after". v2
does NOT do this — spectator mode has its own client-side render races
on chunk-load and silently swallows damage events that the F1 guard
*needs to see*. Instead: invariant-driven recovery (snapshot + retry +
admin alert) is the safety net. SPECTATOR is the final fallback after
3 failed retries (F6 in v1, kept for v2).
---
## 6. Anti-drama checklist (2b2t lessons)
Codified up-front so future "monetisation" pressure is rejected by
reference, not by argument.
- [x] No pay-to-skip. Tier list above is the entire policy.
- [x] No hidden tier or undocumented bypass (staff bypass is logged).
- [x] No queue spot trading / selling.
- [x] No "queue position visible to others" — your position is only
visible to you. No social pressure surface.
- [x] Queue is purely FIFO + tier; no algorithm tweaks, no "lottery".
- [x] AGPL-3.0 means anyone can fork and self-host an alt
gatekeeper if they distrust ours. Operator-friendly.
- [x] Audit log is local-file JSON-L, not phoned home, not
centralised. Operator-readable, no hidden telemetry.
---
## 7. Operational surface
### Metrics (Prometheus)
Exposed via embedded HTTP server bound to `127.0.0.1:9091` (loopback
only — Prometheus on nullstone scrapes via localhost):
| Metric | Type | Labels |
|--------|------|--------|
| `authlimbo_connections_total` | counter | `tier`, `outcome={accepted, queued, rejected}` |
| `authlimbo_queue_depth` | gauge | — |
| `authlimbo_login_success_total` | counter | `tier` |
| `authlimbo_login_fail_total` | counter | `reason={timeout, authme_db, tp_failed_3x, ...}` |
| `authlimbo_void_damage_blocked_total` | counter | — |
| `authlimbo_snapshot_restored_total` | counter | — |
| `authlimbo_restore_duration_seconds` | histogram | `tier` |
Trip-wire alerts (configured server-side, in
`prometheus/alerts.yml`, not in the plugin):
- `authlimbo_login_fail_total{reason="tp_failed_3x"}` rate > 0 for 5m.
- `authlimbo_void_damage_blocked_total` rate > 0 for 1m.
- `authlimbo_queue_depth` > 10 for 5m.
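A loopback-only `/metrics` endpoint needs nothing beyond the JDK's built-in HTTP server. A sketch exposing one counter (plain Prometheus text format here; M4's acceptance mentions OpenMetrics, which differs slightly):
```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.LongAdder;

final class MetricsServer {
    final LongAdder voidBlocked = new LongAdder(); // authlimbo_void_damage_blocked_total

    void start() throws IOException {
        // Bind to loopback only; Prometheus on the same host scrapes localhost:9091.
        HttpServer http = HttpServer.create(new InetSocketAddress("127.0.0.1", 9091), 0);
        http.createContext("/metrics", exchange -> {
            String body = "# TYPE authlimbo_void_damage_blocked_total counter\n"
                    + "authlimbo_void_damage_blocked_total " + voidBlocked.sum() + "\n";
            byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().add("Content-Type",
                    "text/plain; version=0.0.4; charset=utf-8");
            exchange.sendResponseHeaders(200, bytes.length);
            try (OutputStream out = exchange.getResponseBody()) { out.write(bytes); }
        });
        http.start();
    }
}
```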
### Discord webhooks
Plugin-side webhook fires on:
- Snapshot restored (gear was about to be lost).
- 3x retry give-up (manual `/authlimbo tp` needed).
- Queue depth > config threshold.
- AuthMe DB unreachable.
- Plugin reload / crash.
Webhook URL is in config, redacted from `/authlimbo dump`.
### Audit log
`plugins/AuthLimbo/audit.log` — JSON Lines, one row per state
transition. Fields: `ts`, `uuid`, `name`, `ip`, `tier`, `state`,
`prev_state`, `extra` (free-form JSON). Logrotate-compatible; rotates
at 100MB, keeps 7 files.
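A minimal append-only writer for that row shape. JSON is assembled by hand to stay dependency-free; a real implementation would escape field values or use a JSON library, and rotation stays with logrotate as noted:
```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.time.Instant;

final class AuditLog {
    private final Path file;
    AuditLog(Path file) { this.file = file; }

    /** One JSON-Lines row per state transition, appended atomically enough for a log. */
    synchronized void write(String uuid, String name, String ip, String tier,
                            String state, String prevState, String extraJson)
            throws IOException {
        String row = String.format(
                "{\"ts\":\"%s\",\"uuid\":\"%s\",\"name\":\"%s\",\"ip\":\"%s\","
              + "\"tier\":\"%s\",\"state\":\"%s\",\"prev_state\":\"%s\",\"extra\":%s}%n",
                Instant.now(), uuid, name, ip, tier, state, prevState, extraJson);
        Files.writeString(file, row, StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }
}
```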
### Reload-without-restart
`/authlimbo reload`:
- Re-reads `config.yml`.
- Drains in-flight transits to completion (no new joins accepted
during drain, max 30s wait).
- Re-binds metrics HTTP server if port changed.
- Re-creates limbo world if name/spawn changed.
- Discord webhook fires "reload completed in Xs".
---
## 8. Failure modes & recovery
| Failure | Detection | Recovery |
|---------|-----------|----------|
| Plugin crashes mid-restore | On startup, scan `state/*.json` files older than 30s. | For each: if player offline, leave snapshot; if online, treat as new transit, force re-restore from saved AuthMe loc. |
| Snapshot file corrupt / unreadable | NBT parse exception. | Fall back to AuthMe DB saved-loc; log SEVERE; webhook. Player may lose newest items but not entire inventory. |
| World save corrupts | Paper World#getChunkAtAsync throws. | After 3 retries: kick player with "server experiencing storage issue, try again in 5min"; webhook. |
| AuthMe DB unreachable | JDBC `getConnection` throws / read times out > 5s. | **Fail closed.** Reject connection at gatekeeper with kick: "auth service degraded". Log + webhook. Do NOT let player onto main world without auth. |
| Server `/stop` mid-login window | Paper shutdown hook. | `clearTransit` for all UUIDs, force-save snapshots, kick all limbo players with "server restarting, your gear is safe". |
| Race: AuthMe LoginEvent fires twice (HaHaWTH bug) | UUID already in `pendingTransit` and not in `RESTORE` state. | Idempotent — restore handler is a no-op if UUID is past [PRELOAD]. Log INFO. |
| Player disconnects in [LIMBO] | `PlayerQuitEvent`. | Clear pendingTransit + retry counter. Snapshot retained 7d. State file kept until snapshot GC. |
`fail-open` is never the right choice for an auth gatekeeper. Every
failure mode resolves to either: keep player in limbo, or kick them.
Never advance them to main-world unauth'd.
---
## 9. Migration from v1
In-place upgrade path (`v1.1.x` → `v2.0.0`):
1. Stop server.
2. Drop new jar in `plugins/`. v2 jar is not v1-compatible — old
`AuthLimbo-1.x.jar` must be removed.
3. v2 detects `plugins/AuthLimbo/config.yml` from v1 and rewrites it
to v2 schema, leaving a `config.v1.bak` backup.
4. v2 detects `auth_limbo` world dir on disk and re-uses it (no
recreation, no data loss).
5. AuthMe DB schema unchanged — v2 still treats `authme.db` as
read-only authoritative.
6. New: `plugins/AuthLimbo/snapshots/` and
`plugins/AuthLimbo/state/` directories created, owned by the same
uid as the itzg container's runtime user.
7. Start server. v2 startup logs walk through migration steps.
There is no DB migration. No mandatory player action. Permissions
node names change (`authlimbo.admin` is now
`authlimbo.command.admin`, etc.) — operator must update LP groups
(noted in CHANGELOG).
---
## 10. Test plan
### Unit (JUnit 5 + Mockito)
- `LimboWorldManager` — barrier-platform construction is idempotent.
- `AuthMeDatabase.getQuitLocation` — returns `Location` for present row,
null for absent, null for malformed row.
- `Snapshot.serialize` / `deserialize` round-trip.
- State-machine: every transition rejects from invalid prev-state.
### Integration (Paper test-server harness)
- Stand up Paper 1.21.x in CI (Forgejo Actions runner on nullstone).
- Mock AuthMe via a stub plugin that fires `AuthMeAsyncPreLoginEvent`
and `LoginEvent` programmatically.
- Test scenarios: §5.1-5.6 from `AUDIT-2026-05-07.md` plus
v2-specific: queue overflow, snapshot-restore on death,
reload-without-restart, fail-closed on AuthMe DB down.
### Stress (Bot flood)
- 1000 fake connections in 60s using mineflayer or
[`MCBotsPro`](https://github.com/Sammy1Am/MCBotsPro). Verify:
- queue-depth bounded (gatekeeper kicks beyond max-queue-depth);
- no `pendingTransit` leak (size returns to 0 after);
- metrics counters consistent with audit log.
### Chaos
- Kill plugin (`/plugman unload AuthLimbo`) mid-restore, verify
state recovery on rejoin.
- `iptables -A OUTPUT -d <authme-db-host> -j DROP` and verify
fail-closed.
- `kill -9` itzg container during transit, verify next-startup
walks `state/*.json` and recovers.
---
## 11. Versioning + release
- v2.0.0 = breaking redesign (this doc), AGPL-3.0 retained.
- v2.1.0 = polish (BossBar UX, /queue command, more metrics).
- v2.2.0 = Velocity-mode behind feature flag.
- v1.x = receives F3, F5, F6, F7 backports until racked.ru cuts over
to v2; then archived.
Coordinate naming: when the codename migration completes
(onyx→obsidian, nullstone→bedrock per
`gravel-laptop-build/ROADMAP.md`), the racked.ru server moves to
bedrock. v2.0.0 must run on both naming worlds without config drift.
---
## 12. Open questions
- BossBar UI — does the operator want it visible to limbo players, or
silent? Default proposed: visible.
- Snapshot retention — 7 days is the proposed default. Storage cost
is ~1 KB/snapshot for vanilla inventories, up to ~50 KB for
shulker-stuffed players. 100 active players → ~5 MB max.
- Webhook destination — same Discord channel as `s8n-ru` server-status
alerts, or a new channel? Default proposed: same channel, prefixed
`[AuthLimbo]`.
- v2.2 Velocity migration — needs a separate design pass once
cobblestone or a second backend is real.
Sign-off pending operator review.

View file

@@ -1,309 +0,0 @@
# AuthLimbo v2 — Roadmap (M0-M5)
Companion to [`V2-ARCHITECTURE.md`](V2-ARCHITECTURE.md). Tracks the
v2.0.0 implementation as ordered milestones with explicit acceptance
criteria, dependencies, and parking lots for non-blocking work.
Status legend: `OPEN`, `WIP`, `BLOCKED`, `DONE`.
Owner: Claude Code agents under operator review.
Branching: every milestone lands on a feature branch
`v2/M{N}-<slug>` and merges into `v2-main` after acceptance. `v2-main`
becomes `main` at v2.0.0 release.
Pre-requisite: v1.1.0 (F1 + F2 + F4) is on `main` and tagged.
v2 work begins on a fresh `v2-main` branch.
---
## M0 · Foundations · OPEN
**Goal:** Land the v2 skeleton so all later milestones plug into a
shared backbone. No behaviour changes for end-users.
### Deliverables
- New maven module `core` for the gatekeeper/restore split (Velocity-ready
seam). Existing `ru.authlimbo` package becomes `ru.authlimbo.paper`.
- `State` enum + `StateMachine` class (`CONNECT → GATE → SNAPSHOT
→ LIMBO → PRELOAD → RESTORE → LIVE | REJECTED | SPECTATOR_FAIL`)
with persistence to `plugins/AuthLimbo/state/<uuid>.json`.
- `AuditLog` writer (JSON-Lines append-only, logrotate-compatible).
- `MetricsRegistry` skeleton (counters, gauges, histograms — no HTTP
server yet, just in-memory accounting).
- Config-v2 schema + automatic v1→v2 migration with backup.
- Build: maven multi-module, sqlite-jdbc still shaded, Adventure API
brought in via Paper API (no extra shade).
### Acceptance
1. Plugin loads on Paper 1.21.11 with v1 config; v1→v2 migration runs
exactly once and writes `config.v1.bak`.
2. `/authlimbo state <player>` shows current state for any in-flight
player.
3. `audit.log` is created and rotates at 100MB (verified by manual
100MB-noise injection).
4. All v1.1.0 behaviour is preserved (F1, F2, F4 still work
end-to-end on a stub-AuthMe test server).
5. Unit tests for state-machine transition validity pass in CI.
### Dependencies
None. M0 is the foundation.
---
## M1 · Snapshot subsystem · OPEN
**Goal:** Make inventory loss impossible regardless of any chunk /
teleport / damage bug downstream.
### Deliverables
- On `AuthMeAsyncPreLoginEvent`: copy `world/playerdata/<uuid>.dat`
to `plugins/AuthLimbo/snapshots/<uuid>-<timestamp>.nbt`, log
SHA-256.
- On `PlayerDeathEvent` while UUID is in `pendingTransit`:
`keepInventory=true`, drops cleared, SEVERE logged, Discord webhook
fired, schedule restore-from-snapshot on respawn.
- New command `/authlimbo restore <player> [--snapshot=<file>]` that
rolls back to a snapshot (uses bundled nbtlib equivalent or an
embedded reader).
- Snapshot retention GC: 7-day default, configurable, runs hourly.
- Metric: `authlimbo_snapshot_restored_total`.
### Acceptance
1. Forced-void-death during transit (test-harness `/limbo void <player>`):
player respawns with full inventory + xp.
2. Snapshot files appear in `snapshots/`, SHA-256 logged on creation
and on read-back.
3. GC removes >7-day snapshots; verified by setting retention=10s in
test config.
4. `/authlimbo restore <player>` after a successful login restores
the pre-login inventory and sends an audit-log entry.
### Dependencies
M0 (audit log + state machine).
---
## M2 · Privacy-isolation hardening · OPEN
**Goal:** Tighten the limbo-world isolation surface — no leaks of
chat, tablist, or join messages between limbo and main world. Make
the privacy invariant testable.
### Deliverables
- `PlayerChatEvent` listener (HIGHEST): drop limbo-world recipients
from main-world chat; main-world recipients from limbo chat.
- Tablist scoping via `Player#hidePlayer(plugin, target)`:
- limbo players hidden from main-world tablist;
- main-world players hidden from limbo tablist;
- limbo players hidden from each other.
- Join-message shifting: suppress vanilla join message on initial
connect; fire delayed join message at state-machine [LIVE]
transition.
- Per-player view-distance forced to 2 in limbo
(`Player#setViewDistance(2)` on limbo entry, restore on exit).
- Limbo BARRIER ceiling at y=129 added to `LimboWorldManager`.
### Acceptance
1. With two test accounts (`alice` in main world, `bob` connecting
to limbo): `alice` does not see `bob` in tablist before `bob`
completes login. After login, `alice` sees `bob`'s join message
exactly once.
2. `bob` in limbo cannot see chat from `alice`. Verified via
integration test.
3. `bob` cannot fly out of limbo via creative/elytra (server starts
bob in survival; barrier ceiling prevents y>129).
4. Privacy invariant test (`PrivacyInvariantTest`) covers all six
scope boundaries (chat in/out, tablist in/out, join-msg before/after).
### Dependencies
M0.
---
## M3 · Restore reliability (3x3 preload + chunk-ready verification) · OPEN
**Goal:** Make the restore-teleport bullet-proof against the
"loaded-but-neighbour-unloaded" race that v1's F3 was designed for,
plus the silent-failure case where `teleportAsync` returns true but
the player is still at the old position.
### Deliverables
- 3x3 chunk preload around target (`addPluginChunkTicket` x9 +
`CompletableFuture.allOf(getChunkAtAsyncUrgently x9)`).
- Post-TP verification: 5 ticks after `teleportAsync` returns true,
check `player.getLocation().distance(saved) < 2.0`. If not, treat
as silent fail and retry.
- F2-style retry loop already from v1.1 carried over with v2 metrics
+ audit log integration.
- Drop the SPECTATOR pre-TP trick (v1's F8 redesign): rely on the
snapshot + damage-guard layers instead.
- Metric: `authlimbo_restore_duration_seconds` histogram.
### Acceptance
1. AUDIT-2026-05-07 §5.1 (unloaded-chunk void) reproduces no
void-death and no inventory loss. Player lands at saved coords.
2. AUDIT-2026-05-07 §5.2 (invalid Y) escalates to
`SPECTATOR_FAIL` after 3 retries with audit-log + webhook.
3. New scenario: target at chunk-section boundary
(e.g. (16, 70, 16)) — 3x3 preload makes this work first try.
4. Histogram p99 restore duration < 2.5s under normal load (no bot
flood).
### Dependencies
M0, M1 (snapshot is the safety net while M3 retry-loops).
---
## M4 · Gatekeeper + queue + observability · OPEN
**Goal:** Bring the queue, trust tiers, metrics endpoint, and
Discord webhook online. After M4 the operator has full visibility
without needing to grep logs.
### Deliverables
- Gatekeeper interface (`Gatekeeper.accept(connection) → Decision`)
with Paper-side implementation. Decision: `accept`, `queue`,
`reject`.
- Trust-tier resolver: reads LP permissions for `staff`,
AuthMe-DB last-seen for `returning` vs `new`, IP-block list for
`flagged`. Cacheable.
- Bounded queue with FIFO ordering by connect-time + tier priority.
Configurable `max-concurrent-auth`, `max-queue-depth`,
`queue-timeout-seconds`.
- BossBar UI in limbo: shows tier + position + ETA. Updates every
second.
- `/queue` command in-chat re-displays state.
- Prometheus HTTP server bound to `127.0.0.1:9091` (loopback only).
- Discord webhook config + plumbing for the alert categories from
ARCHITECTURE §7.
- `/authlimbo queue policy` command — prints the tier policy
in-game so players can self-verify they're not in a hidden tier.
### Acceptance
1. Stress test: 1000 simulated connections in 60s.
`authlimbo_queue_depth` peaks at `max-queue-depth`, never higher.
No `pendingTransit` leak (returns to 0 within 30s of flood end).
2. Staff bypass: a player with `authlimbo.queue.priority.staff`
skips even a full queue. Audit log records the bypass.
3. Pi-hole-style IP blocklist drops a connection at gatekeeper —
never enters limbo. `authlimbo_connections_total{outcome="rejected"}`
increments.
4. Prometheus scrape of `localhost:9091/metrics` returns OpenMetrics
format with all metrics from ARCHITECTURE §7.
5. `/authlimbo queue policy` output matches ARCHITECTURE §3 tier table
verbatim (rendered from a single source-of-truth string).
### Dependencies
M0 (state machine + audit log), M3 (so legitimate logins still
flow correctly through the new gatekeeper layer).
---
## M5 · Hardening, drama-avoidance lock-in, release · OPEN
**Goal:** Lock in the anti-drama policy so it can't drift. Ship v2.0.0.
### Deliverables
- Anti-drama policy constants in code (not config) — paid-tier and
hidden-tier escape hatches do not exist as configurable knobs.
Adding one would require a code change + AGPL fork.
- Reload-without-restart (`/authlimbo reload`) with in-flight transit
drain (max 30s wait).
- Fail-closed implementation for AuthMe DB unreachable case (kick
with operator-friendly message + webhook).
- Server-shutdown drain hook: clear transit, save snapshots, kick
limbo players with "server restarting" message.
- Chaos-test suite: kill-plugin-mid-login, kill-container, AuthMe-DB
network-drop. All recoverable.
- Documentation: `V2-ARCHITECTURE.md` (this milestone's companion),
`V2-RELEASE.md` migration guide for operators, updated
`compatibility.md` and `installation.md`.
- Tag v2.0.0, push to git.s8n.ru/s8n/auth-limbo, GitHub
push-mirror, attach jar to release.
### Acceptance
1. Plugin reload during a live transit completes the in-flight
restore correctly, no inventory loss.
2. Killing the plugin (`/plugman unload`) during [LIMBO] state and
restarting the server: rejoining player is restored from state +
snapshot.
3. AuthMe DB hard-down: connection rejected at gatekeeper, never
reaches main world. Operator gets webhook within 30s.
4. CHANGELOG documents every breaking change, every renamed
permission node, every config schema change.
5. v2.0.0 jar runs end-to-end on the racked.ru staging container
(parallel to v1 prod) for 7 days with zero void-deaths and zero
inventory losses.
### Dependencies
M0-M4. M5 is the gate to release.
---
## Parked / non-blocking
These items are **not** in the v2.0.0 critical path. Tracked here so
they aren't lost.
- `P-VELO` · Velocity-mode behind feature flag (target: v2.2.0).
Requires a real second backend or proxy mesh first.
- `P-COBBLE` · cobblestone-server interop. Wait for cobblestone
intake to land in `_github/infra/`.
- `P-PLUGIN-MSG` · Plugin-message channel between paper-side and
proxy-side gatekeepers (prep for `P-VELO`).
- `P-WEB-UI` · Read-only web dashboard for queue + metrics. Defer
until operator asks.
- `P-CROWDSEC` · Pluggable IP-blocklist source (CrowdSec API). v2.0.0
uses static config + Pi-hole hosts file.
- `P-MOJANG-BAN-CHECK` · Honor Mojang's name-changed-but-banned
blocklist. Niche, defer.
---
## Cross-cutting acceptance: privacy invariant
Every milestone must preserve the v1 privacy invariant: *no
main-world player can observe any pre-auth player's coordinates,
inventory, or chat*.
A dedicated `PrivacyInvariantTest` (introduced in M2) runs on every
PR and must pass for merge. The test enumerates the six scope
boundaries from ARCHITECTURE §4 and asserts no leak in either
direction.
If a milestone would relax any boundary, it MUST be flagged in the PR
description and reviewed against `feedback_audit_then_plan.md`
(audit-then-fix workflow).
---
## Release plan
| Tag | Contents | Target |
|-----|----------|--------|
| v2.0.0-rc1 | M0 + M1 + M2 + M3 | end of week 1 |
| v2.0.0-rc2 | + M4 | end of week 2 |
| v2.0.0 | + M5, 7-day staging soak | end of week 3 |
| v2.1.0 | parked items as operator pulls them in | opportunistic |
All releases tagged on `git.s8n.ru/s8n/auth-limbo` first; GitHub is
push-mirror per `feedback_my_git_is_forgejo.md`.
Operator handles end-of-session push.

View file

@@ -6,7 +6,7 @@
<groupId>ru.authlimbo</groupId>
<artifactId>AuthLimbo</artifactId>
<version>1.1.0</version>
<version>1.0.0</version>
<packaging>jar</packaging>
<name>AuthLimbo</name>

View file

@@ -20,45 +20,29 @@ import fr.xephi.authme.events.AuthMeAsyncPreLoginEvent;
import fr.xephi.authme.events.LoginEvent;
import org.bukkit.Bukkit;
import org.bukkit.Chunk;
import org.bukkit.GameMode;
import org.bukkit.Location;
import org.bukkit.World;
import org.bukkit.entity.Player;
import org.bukkit.event.EventHandler;
import org.bukkit.event.EventPriority;
import org.bukkit.event.Listener;
import org.bukkit.event.entity.EntityDamageEvent;
import org.bukkit.event.entity.EntityDamageEvent.DamageCause;
import org.bukkit.event.player.PlayerTeleportEvent;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
/**
* Listens for AuthMe's two relevant events plus a damage guard:
* Listens for AuthMe's two relevant events:
*
* 1. {@link AuthMeAsyncPreLoginEvent} fired before AuthMe authenticates
* the password. We pin the destination chunk via a plugin chunk-ticket
* so it's fully loaded by the time the actual teleport runs.
*
* 2. {@link LoginEvent} at {@link EventPriority#LOWEST} (F4) fired BEFORE
* AuthMe-ReReloaded's own internal post-login teleport. We immediately
* sync-TP the player back to limbo spawn so AuthMe's broken teleport
* operates against an irrelevant location. The MONITOR handler then
* runs the authoritative restore.
*
* 3. {@link LoginEvent} at {@link EventPriority#MONITOR} fires LAST in
* the chain, schedules the authoritative teleport to the saved
* quit-location after a configurable tick delay.
*
* 4. {@link EntityDamageEvent} at {@link EventPriority#HIGHEST} (F1)
* while a player is in {@code pendingTransit}, cancels {@code VOID}
* damage and snaps them back to limbo spawn at full health. This is
* the YOU500-incident fix: even if AuthMe's broken teleport drops the
* player into an unloaded section, the void death is intercepted.
* 2. {@link LoginEvent} fired AFTER AuthMe successfully authenticates
* and runs its own (often broken) post-login teleport. We listen at
* MONITOR priority so we are LAST in the chain, then fire an
* authoritative teleport that overrides whatever AuthMe / Paper safety
* checks did to the player's location.
*
* Threading:
* - AuthMeAsyncPreLoginEvent fires async (AuthMe worker thread).
@@ -78,32 +62,12 @@ import java.util.concurrent.ConcurrentHashMap;
*/
public final class LoginListener implements Listener {
/** Hard cap on teleportAsync retries before falling back to spectator at limbo spawn (F2). */
private static final int MAX_RETRIES = 3;
/** Tick delay between a failed teleport attempt and the next retry (F2). 1500 ms. */
private static final long RETRY_DELAY_TICKS = 30L;
/** Safety timeout before pendingTransit entries expire even if no callback fires (F1). 5 s. */
private static final long PENDING_TIMEOUT_TICKS = 20L * 5L;
private final AuthLimbo plugin;
private final AuthMeDatabase db;
/** Tracks active plugin-chunk-tickets so we don't double-add or fail to release. */
private final Set<String> activeTickets = new HashSet<>();
/**
* F1: UUIDs of players currently in the post-login restore window.
* While present, the player is protected from VOID damage by
* {@link #onEntityDamage}. Removed on successful TP, on final retry
* give-up, or by the {@link #PENDING_TIMEOUT_TICKS} watchdog.
*/
private final Set<UUID> pendingTransit = ConcurrentHashMap.newKeySet();
/** F2 — per-UUID retry counter for failed teleportAsync attempts. */
private final Map<UUID, Integer> retryCounts = new ConcurrentHashMap<>();
public LoginListener(AuthLimbo plugin, AuthMeDatabase db) {
this.plugin = plugin;
this.db = db;
@@ -144,50 +108,6 @@ public final class LoginListener implements Listener {
});
}
/* ---------------- F4: pre-empt AuthMe's broken teleport ---------------- */
/**
* Runs at LOWEST priority BEFORE AuthMe-ReReloaded's own LoginEvent
* handler runs its broken teleport. We immediately sync-TP the player
* back to limbo spawn so AuthMe's subsequent teleport runs against an
* irrelevant location. Combined with F1, this closes the void-death
* window even when the saved chunk is far out and slow to load.
*
* Adds the player to {@code pendingTransit} here too: F1 must protect
* the player across the entire LOWEST→MONITOR window.
*/
@EventHandler(priority = EventPriority.LOWEST)
public void onLoginPreEmpt(LoginEvent event) {
Player player = event.getPlayer();
if (player == null) return;
final UUID id = player.getUniqueId();
// Mark in-transit BEFORE AuthMe's teleport runs so F1 catches any
// void damage from that teleport.
pendingTransit.add(id);
scheduleTransitTimeout(id);
Location limbo = plugin.limbo().spawn();
if (limbo == null) {
plugin.getLogger().warning("[AuthLimbo] limbo spawn is null at LOWEST pre-empt for "
+ player.getName() + " — cannot pre-empt AuthMe teleport.");
return;
}
// Synchronous teleport so we land BEFORE AuthMe's own MONITOR/HIGH
// handler fires its own teleport. teleportAsync would be racy here.
try {
player.teleport(limbo, PlayerTeleportEvent.TeleportCause.PLUGIN);
if (plugin.debug()) {
plugin.getLogger().info("[AuthLimbo][debug] Pre-empted AuthMe TP for "
+ player.getName() + " — pinned at limbo spawn.");
}
} catch (Throwable t) {
plugin.getLogger().warning("[AuthLimbo] pre-empt teleport failed for "
+ player.getName() + ": " + t.getMessage());
}
}
/* ---------------- Post-login: authoritative teleport ---------------- */
@EventHandler(priority = EventPriority.MONITOR)
@@ -195,93 +115,31 @@ public final class LoginListener implements Listener {
Player player = event.getPlayer();
if (player == null) return;
final String name = player.getName();
final UUID id = player.getUniqueId();
final Location saved = db.getQuitLocation(name);
if (saved == null) {
plugin.getLogger().info("[AuthLimbo] No saved location for "
+ name + " — leaving where AuthMe put them.");
// No restore needed; clear transit guard.
pendingTransit.remove(id);
retryCounts.remove(id);
return;
}
// Defensive: pre-empt handler should have added this already, but ensure.
pendingTransit.add(id);
retryCounts.putIfAbsent(id, 0);
long delay = Math.max(0, plugin.getConfig().getLong("authme.teleport-delay-ticks", 10L));
Bukkit.getScheduler().runTaskLater(plugin, () -> doTeleport(player, name, saved), delay);
}
/* ---------------- F1: void-damage guard during transit ---------------- */
/**
* F1: while a player is mid-restore, intercept VOID damage and snap
* them back to limbo spawn at full health. This is the primary fix for
* the YOU500 incident: even if AuthMe's own broken teleport drops the
* player into an unloaded section that triggers "left the confines of
* this world", we cancel the damage and recover the player.
*
* HIGHEST + ignoreCancelled=true so we run last among damage handlers
* and don't double-process events already cancelled by other plugins.
*/
@EventHandler(priority = EventPriority.HIGHEST, ignoreCancelled = true)
public void onEntityDamage(EntityDamageEvent event) {
if (!(event.getEntity() instanceof Player player)) return;
final UUID id = player.getUniqueId();
if (!pendingTransit.contains(id)) return;
if (event.getCause() != DamageCause.VOID) return;
event.setCancelled(true);
Location limbo = plugin.limbo().spawn();
plugin.getLogger().warning(String.format(
"[AuthLimbo] VOID damage intercepted for %s during post-login restore"
+ " (intended TP target: %s) — relocating to limbo spawn at %s.",
player.getName(),
describeIntendedTarget(player),
limbo == null ? "<null>" : describe(limbo)));
Bukkit.getConsoleSender().sendMessage(
"[AuthLimbo] WARN: void-damage guard saved " + player.getName()
+ " from inventory loss during AuthMe restore.");
// Heal first so the next-tick snap doesn't carry residual damage state.
try {
player.setHealth(20.0);
player.setFireTicks(0);
player.setFallDistance(0f);
} catch (Throwable t) {
// best-effort
}
if (limbo != null) {
try {
player.teleport(limbo, PlayerTeleportEvent.TeleportCause.PLUGIN);
} catch (Throwable t) {
plugin.getLogger().warning("[AuthLimbo] limbo recovery teleport failed for "
+ player.getName() + ": " + t.getMessage());
}
}
}
/* ---------------- Core teleport with chunk-prep ---------------- */
private void doTeleport(Player player, String name, Location saved) {
if (!player.isOnline()) {
plugin.getLogger().info("[AuthLimbo] " + name
+ " went offline before restore — skipping.");
clearTransit(player.getUniqueId());
return;
}
World world = saved.getWorld();
if (world == null) {
plugin.getLogger().warning("[AuthLimbo] Saved world for "
+ name + " is no longer loaded.");
clearTransit(player.getUniqueId());
return;
}
@@ -311,118 +169,28 @@ public final class LoginListener implements Listener {
if (plugin.debug()) {
plugin.getLogger().info("[AuthLimbo][debug] Teleport ok for " + name);
}
// F1/F2: restore complete; clear transit guard.
clearTransit(player.getUniqueId());
} else {
handleTeleportFailure(player, name, saved,
"teleportAsync returned false");
plugin.getLogger().warning("[AuthLimbo] teleportAsync returned false for "
+ name + " — Paper may have rejected the location.");
}
// Release the ticket 5s later; this gives the client time to
// download the chunk before we let it unload.
scheduleTicketRelease(world, cx, cz, key);
})
.exceptionally(ex -> {
handleTeleportFailure(player, name, saved,
"teleportAsync threw: " + ex.getMessage());
plugin.getLogger().warning("[AuthLimbo] teleportAsync threw for "
+ name + ": " + ex.getMessage());
scheduleTicketRelease(world, cx, cz, key);
return null;
});
}).exceptionally(ex -> {
handleTeleportFailure(player, name, saved,
"getChunkAtAsyncUrgently threw: " + ex.getMessage());
plugin.getLogger().warning("[AuthLimbo] getChunkAtAsyncUrgently threw for "
+ name + ": " + ex.getMessage());
scheduleTicketRelease(world, cx, cz, key);
return null;
});
}
/* ---------------- F2: failure recovery + retry ---------------- */
/**
* F2: on any failed teleport (false future, exceptional future, or
* chunk-load throw), do not abandon the player.
*
* Steps:
* 1. Synchronously TP back to limbo spawn so they aren't sitting in
* an unloaded chunk that may void-kill them next tick.
* 2. Increment retry counter. Up to {@link #MAX_RETRIES} attempts:
* schedule another doTeleport after {@link #RETRY_DELAY_TICKS}.
* 3. After MAX_RETRIES failures: leave player at limbo spawn in
* spectator gamemode, log ERROR with full coords + retry count,
* alert console for manual `/authlimbo tp` intervention.
*
* The player remains in {@code pendingTransit} across all retries so
* F1 still protects them from any void damage during the retry window.
*/
private void handleTeleportFailure(Player player, String name, Location saved, String reason) {
final UUID id = player.getUniqueId();
final int attempt = retryCounts.merge(id, 1, Integer::sum);
plugin.getLogger().warning(String.format(
"[AuthLimbo] Restore attempt %d/%d failed for %s — %s. Recovering to limbo spawn.",
attempt, MAX_RETRIES, name, reason));
// Step 1: snap back to limbo spawn synchronously so they don't
// continue free-falling in an unloaded section.
Location limbo = plugin.limbo().spawn();
if (limbo != null && player.isOnline()) {
// Run on the main thread; teleportAsync future callbacks may not be on it.
Bukkit.getScheduler().runTask(plugin, () -> {
try {
player.teleport(limbo, PlayerTeleportEvent.TeleportCause.PLUGIN);
player.setHealth(20.0);
player.setFireTicks(0);
player.setFallDistance(0f);
} catch (Throwable t) {
plugin.getLogger().warning("[AuthLimbo] sync recovery TP failed for "
+ name + ": " + t.getMessage());
}
});
}
if (attempt >= MAX_RETRIES) {
// Step 3: give up gracefully.
plugin.getLogger().severe(String.format(
"[AuthLimbo] Failed to restore %s after %d retries — manual intervention needed."
+ " Saved coords: %s(%.1f, %.1f, %.1f). Last reason: %s.",
name, MAX_RETRIES,
saved.getWorld() == null ? "<null>" : saved.getWorld().getName(),
saved.getX(), saved.getY(), saved.getZ(), reason));
Bukkit.getConsoleSender().sendMessage(
"[AuthLimbo] ERROR: failed to restore " + name + " after "
+ MAX_RETRIES + " retries — manual intervention needed."
+ " Run `/authlimbo tp " + name + "` after investigating.");
Bukkit.getScheduler().runTask(plugin, () -> {
if (player.isOnline()) {
try {
player.setGameMode(GameMode.SPECTATOR);
player.sendMessage("§c[AuthLimbo] Could not restore your saved location."
+ " Staff have been notified — please ping an admin for"
+ " a manual /authlimbo tp.");
} catch (Throwable t) {
// best-effort
}
}
});
clearTransit(id);
return;
}
// Step 2: schedule a retry. Player stays in pendingTransit so F1
// continues to protect them.
Bukkit.getScheduler().runTaskLater(plugin, () -> {
if (!player.isOnline()) {
clearTransit(id);
return;
}
doTeleport(player, name, saved);
}, RETRY_DELAY_TICKS);
}
/* ---------------- Helpers ---------------- */
private void scheduleTicketRelease(World world, int cx, int cz, String key) {
if (!activeTickets.contains(key)) return;
Bukkit.getScheduler().runTaskLater(plugin, () -> {
@@ -438,39 +206,4 @@ public final class LoginListener implements Listener {
}
}, 20L * 5L);
}
/**
* Removes a player from the transit set + retry counter. After this,
* F1's void-damage guard no longer protects them.
*/
private void clearTransit(UUID id) {
pendingTransit.remove(id);
retryCounts.remove(id);
}
/**
* Watchdog: if no callback fires within {@link #PENDING_TIMEOUT_TICKS},
* remove the UUID from pendingTransit so we don't leak entries.
*/
private void scheduleTransitTimeout(UUID id) {
Bukkit.getScheduler().runTaskLater(plugin, () -> {
if (pendingTransit.remove(id)) {
retryCounts.remove(id);
if (plugin.debug()) {
plugin.getLogger().info("[AuthLimbo][debug] pendingTransit timeout for " + id);
}
}
}, PENDING_TIMEOUT_TICKS);
}
private static String describeIntendedTarget(Player player) {
Location loc = player.getLocation();
return describe(loc);
}
private static String describe(Location loc) {
if (loc == null || loc.getWorld() == null) return "<unknown>";
return String.format("%s(%.1f, %.1f, %.1f)",
loc.getWorld().getName(), loc.getX(), loc.getY(), loc.getZ());
}
}