Operations Runbook
Day-2 runbook for monitoring, triage, and queue recovery.
Daily Checks
- API health endpoint responds at /health.
- Worker process is connected and consuming refkit-ingestion.
- Dead-letter queue volume is stable.
- No sustained claim payout failures.
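
The first and third checks can be scripted. The following is a minimal sketch, not the project's tooling: the function names, the 200-status convention, and the idea of sampling DLQ depth at intervals are all assumptions made for illustration.

```python
from urllib.request import urlopen


def api_is_healthy(base_url: str, timeout_s: float = 5.0) -> bool:
    """Return True if GET <base_url>/health answers with HTTP 200.

    Assumes a healthy API replies 200; adjust if the endpoint differs.
    """
    url = f"{base_url.rstrip('/')}/health"
    try:
        with urlopen(url, timeout=timeout_s) as resp:
            return resp.status == 200
    except OSError:
        # URLError, HTTPError (4xx/5xx), and timeouts are all OSError subclasses.
        return False


def dlq_is_stable(samples: list[int], max_growth: int = 0) -> bool:
    """Return True if DLQ depth grew by at most max_growth across the samples.

    `samples` is a hypothetical series of DLQ depths, oldest first.
    """
    if len(samples) < 2:
        return True  # not enough data to call it unstable
    return samples[-1] - samples[0] <= max_growth
```

A daily job could call `api_is_healthy` with the deployment's base URL and feed `dlq_is_stable` with depths collected over the previous day; how those depths are collected depends on the queue library in use.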
Queue Triage
When ingestion errors spike:
- Inspect worker logs for failed job names and error payloads.
- Verify Redis connectivity and queue availability.
- Confirm that the API and worker schemas are both compatible with the deployed Prisma migrations.
- Replay or manually process DLQ jobs only after the root cause is fixed.
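
The last rule, replaying only once the root cause is fixed, can be enforced in a replay helper. This is a minimal sketch under stated assumptions: `replay_dlq`, the dict-shaped jobs, and the `handler` callback are hypothetical and do not reflect the actual queue client API.

```python
from typing import Callable, Iterable


def replay_dlq(
    jobs: Iterable[dict],
    handler: Callable[[dict], None],
    root_cause_fixed: bool,
) -> list[dict]:
    """Re-run dead-lettered jobs and return the ones that still fail."""
    if not root_cause_fixed:
        # Replaying before the fix would only push the same jobs back to the DLQ.
        raise RuntimeError("refusing to replay: root cause not confirmed fixed")

    still_failing = []
    for job in jobs:
        try:
            handler(job)
        except Exception as exc:
            # Keep the error next to the job so it can be inspected manually.
            still_failing.append({**job, "error": str(exc)})
    return still_failing
```

Jobs returned by `replay_dlq` are candidates for manual processing; a non-empty result after a supposed fix suggests a second, unrelated failure cause.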
Funding and Payout Triage
- Funding verification failures usually indicate a missing or failed transaction, or a wrong target address.
- Claim failures may indicate wallet validity issues, chain RPC issues, or contract call failures.
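
The funding causes above can be checked in a fixed order during triage. The sketch below is illustrative only: `diagnose_funding`, its parameters, and the `"success"` status value are assumptions, not fields from the real service.

```python
from typing import Optional


def diagnose_funding(
    tx_status: Optional[str],
    to_address: Optional[str],
    expected_address: str,
) -> str:
    """Map an observed funding transaction to the most likely failure cause.

    tx_status is None when no transaction was found on chain.
    """
    if tx_status is None:
        return "missing transaction"
    if tx_status != "success":  # assumed status label
        return "failed transaction"
    # Exact comparison; normalise address casing first if the chain allows it.
    if to_address != expected_address:
        return "wrong target address"
    return "ok"
```

Checking for a missing transaction first avoids reporting an address mismatch when there is nothing on chain to compare against.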