Runbook

Migration and Recovery

Rebuild or move a Monad node without improvising the cutover.

This is the conservative migration checklist Proofline uses for Monad full node recovery and server-to-server moves. The runbook is intentionally repeatable and verification-heavy.

Full Node Recovery · Minimal Disruption

1. Prepare the destination server

  • Provision the new host and install the operating system cleanly.
  • Verify storage layout before installing node services.
  • Confirm SSH and console access before touching production state.
  • Keep a second disk or dedicated device ready for TrieDB when possible.
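
The storage check in step 1 can be scripted. The following is a minimal Python sketch, assuming a Linux host with util-linux lsblk available. The function names are illustrative and are not part of any Monad tooling. It lists whole disks that carry no partitions and no mountpoint, which makes them candidates for a dedicated TrieDB device.

```python
import json
import subprocess


def unused_disks(lsblk: dict) -> list[str]:
    """Return names of whole disks with no partitions and no mountpoint."""
    free = []
    for dev in lsblk.get("blockdevices", []):
        if dev.get("type") != "disk":
            continue  # skip loop devices, ROMs, and similar
        if dev.get("children") or dev.get("mountpoint"):
            continue  # disk is partitioned or mounted directly
        free.append(dev["name"])
    return free


def read_lsblk() -> dict:
    """Run lsblk and parse its JSON output."""
    out = subprocess.run(
        ["lsblk", "--json", "-o", "NAME,SIZE,TYPE,MOUNTPOINT"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)


if __name__ == "__main__":
    print("candidate dedicated devices:", unused_disks(read_lsblk()))
```

A disk that holds an unmounted filesystem directly would also show up here, so treat the output as a shortlist to inspect by hand, not as permission to format anything.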

2. Recreate the node baseline

  • Install required packages and the Monad software stack.
  • Create the non-privileged node user and expected file layout.
  • Recreate TrieDB device mapping and validate it before startup.
  • Restore node configuration and environment files from backup, then review them for host-specific values before startup.
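
One way to make the configuration restore verifiable is to compare checksums between the backup copy and what landed on the new host. This is a sketch only: it assumes the backup and the restored tree are both reachable as local directories, and the directory arguments are placeholders supplied by the operator.

```python
import hashlib
import sys
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large files do not load into memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def diff_manifests(expected: dict[str, str], actual: dict[str, str]) -> dict[str, list[str]]:
    """Report files that are missing, unexpected, or changed."""
    return {
        "missing": sorted(set(expected) - set(actual)),
        "unexpected": sorted(set(actual) - set(expected)),
        "changed": sorted(
            k for k in expected.keys() & actual.keys() if expected[k] != actual[k]
        ),
    }


if __name__ == "__main__":
    # Usage: python verify_config.py <backup_dir> <restored_dir>
    backup, restored = Path(sys.argv[1]), Path(sys.argv[2])
    result = diff_manifests(build_manifest(backup), build_manifest(restored))
    print(result)
    sys.exit(1 if any(result.values()) else 0)
```

A non-empty "changed" list is not automatically wrong on a server move, since some values are expected to differ per host. The point is that every difference is seen and accepted deliberately.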

3. Restore identity safely

  • Recover encrypted SECP and BLS keystores from off-host backups.
  • Restore keystore passwords only from a secure external store.
  • Verify file ownership and permissions before starting services.
  • Do not improvise with identity files: restore them exactly as backed up, and do not rename critical artifacts.
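
The ownership and permission check in step 3 is easy to automate. The sketch below assumes a POSIX host and treats any group or other access bit on a keystore or password file as a problem. The node user name and the file paths are passed in by the operator; nothing here reflects a layout mandated by Monad.

```python
import os
import stat
import sys
from pathlib import Path


def permission_problems(path: Path, owner_uid: int) -> list[str]:
    """Return human-readable problems for one secret file (empty list = ok)."""
    try:
        st = path.lstat()  # lstat so a symlink is reported, not followed
    except FileNotFoundError:
        return [f"{path}: missing"]

    problems = []
    if not stat.S_ISREG(st.st_mode):
        problems.append(f"{path}: not a regular file")
    if st.st_uid != owner_uid:
        problems.append(f"{path}: owned by uid {st.st_uid}, expected {owner_uid}")
    mode = stat.S_IMODE(st.st_mode)
    if mode & 0o077:
        problems.append(f"{path}: mode {mode:04o} grants group/other access")
    return problems


if __name__ == "__main__":
    import pwd

    # Usage: python check_keys.py <node_user> <file> [<file> ...]
    uid = pwd.getpwnam(sys.argv[1]).pw_uid
    issues = [p for f in sys.argv[2:] for p in permission_problems(Path(f), uid)]
    print("\n".join(issues) or "ok")
    sys.exit(1 if issues else 0)
```

Running this before the first service start, and again after any manual fix, keeps the identity restore auditable.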

4. Restore state

  • Use the most recent recommended snapshot workflow for the target network.
  • Rebuild execution and ledger state using the official restore path.
  • Validate TrieDB mount and expected disk mapping before final startup.
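
The device-mapping validation above can be expressed as a small check that a path resolves to a real block device, and optionally to the specific device you expect. This is a sketch under assumptions: the path used in the usage line is a placeholder, and the actual TrieDB device path depends on how the mapping was recreated in step 2.

```python
import os
import stat
import sys
from typing import Optional


def device_problems(path: str, expected_realpath: Optional[str] = None) -> list[str]:
    """Check that path exists, resolves to a block device, and matches expectations."""
    if not os.path.exists(path):
        return [f"{path}: does not exist"]

    real = os.path.realpath(path)  # follow symlinks to the underlying device node
    problems = []
    if not stat.S_ISBLK(os.stat(real).st_mode):
        problems.append(f"{path}: resolves to {real}, which is not a block device")
    if expected_realpath is not None and real != expected_realpath:
        problems.append(f"{path}: resolves to {real}, expected {expected_realpath}")
    return problems


if __name__ == "__main__":
    # Usage (placeholder paths): python check_device.py /dev/<triedb-link> /dev/<disk>
    link = sys.argv[1]
    expected = sys.argv[2] if len(sys.argv) > 2 else None
    issues = device_problems(link, expected)
    print("\n".join(issues) or "ok")
    sys.exit(1 if issues else 0)
```

Passing the expected device explicitly guards against the case where the link exists but points at the wrong disk after a rebuild.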

5. Start in the right order

  • Run one-time storage preparation steps first.
  • Start telemetry and observability components next, so the first boot is observable.
  • Start monad-bft, monad-execution, and monad-rpc only after configuration and state are ready.
  • Avoid changing multiple variables at once during first boot.
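
An ordered start can be enforced by a small wrapper that starts one unit at a time and stops at the first unit that fails to become active. This sketch assumes the services run as systemd units named as in the step above. The order within the list simply follows the order in which this runbook names them and should be adjusted to your deployment. Telemetry unit names are deployment-specific and are left for the operator to prepend.

```python
import subprocess
import time
from typing import Callable, Sequence


def systemctl(*args: str) -> int:
    """Run systemctl with the given arguments and return its exit code."""
    return subprocess.run(["systemctl", *args], capture_output=True, text=True).returncode


def start_in_order(
    units: Sequence[str],
    ctl: Callable[..., int] = systemctl,
    settle_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> list[str]:
    """Start units one by one; raise on the first that does not stay active."""
    started = []
    for unit in units:
        if ctl("start", unit) != 0:
            raise RuntimeError(f"failed to start {unit}")
        sleep(settle_seconds)  # give the unit time to crash if it is going to
        if ctl("is-active", "--quiet", unit) != 0:
            raise RuntimeError(f"{unit} is not active after start")
        started.append(unit)
    return started


if __name__ == "__main__":
    # Prepend your telemetry/observability units here (names vary by deployment).
    UNITS = ["monad-bft", "monad-execution", "monad-rpc"]
    print("started:", start_in_order(UNITS))
```

Stopping at the first failure keeps the number of moving parts small on first boot, which matches the advice not to change several variables at once.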

6. Validate before cutover

  • Check that all services are active and that their start timestamps show a single clean start, with no restart loops.
  • Confirm RPC responds and is not reporting unexpected syncing.
  • Verify required TCP and UDP listeners for Monad ports.
  • Confirm Prometheus targets, custom health metrics, and alert rules are loaded.
  • Only then switch public references and retire the old server.
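
Part of the validation in step 6 can be captured in a short pre-cutover script. The sketch below assumes the node's RPC exposes the standard Ethereum JSON-RPC method eth_syncing, which returns false once the node is not syncing. The RPC URL and the port list are placeholders, not values taken from this runbook. It covers TCP only: a UDP listener cannot be confirmed by a connect attempt and needs a separate check on the host.

```python
import json
import socket
import urllib.request


def rpc_call(url: str, method: str, params=None, timeout: float = 5.0) -> dict:
    """Send one JSON-RPC 2.0 request and return the decoded response."""
    body = json.dumps(
        {"jsonrpc": "2.0", "id": 1, "method": method, "params": params or []}
    ).encode()
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


def is_synced(response: dict) -> bool:
    """True only when eth_syncing returned the literal false."""
    if "error" in response:
        raise RuntimeError(f"RPC error: {response['error']}")
    return response["result"] is False


def tcp_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    RPC_URL = "http://127.0.0.1:8080"  # placeholder: use your node's RPC address
    TCP_PORTS = [8080]                 # placeholder: list your required TCP ports
    print("synced:", is_synced(rpc_call(RPC_URL, "eth_syncing")))
    for port in TCP_PORTS:
        print(f"tcp/{port} listening:", tcp_listening("127.0.0.1", port))
```

Keeping the decision logic in is_synced separate from the network call makes it straightforward to reuse the same check from a health exporter or an alert rule test.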