On-call Playbook

Common incidents and how to respond.

"Calls aren't appearing for company X"

  1. Check the background job dashboard: is the Five9 sync running and succeeding?
  2. Check whether the company's Five9 campaign name in the portal exactly matches the Five9 configuration.
  3. Check Five9 directly — are calls actually landing there?
  4. If sync is failing systemwide, check Sentry for errors and escalate to engineering.
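
For step 2, an exact-match comparison is easy to get wrong by eye, because trailing spaces and case differences are hard to see. The following is a minimal sketch, not part of the portal: `diagnose_campaign_name` is a hypothetical helper, and it assumes you have copied both names out by hand.

```python
import unicodedata


def diagnose_campaign_name(portal_name: str, five9_name: str) -> str:
    """Explain why two campaign names do or do not match exactly."""
    if portal_name == five9_name:
        return "exact match"
    if portal_name.strip() == five9_name.strip():
        return "differs only by leading/trailing whitespace"
    if portal_name.strip().casefold() == five9_name.strip().casefold():
        return "differs only by letter case"

    def squash(name: str) -> str:
        # Normalize look-alike characters (e.g. non-breaking spaces)
        # and collapse runs of whitespace into single spaces.
        return " ".join(unicodedata.normalize("NFKC", name).split()).casefold()

    if squash(portal_name) == squash(five9_name):
        return "differs only by internal spacing or look-alike characters"
    return "different names"


if __name__ == "__main__":
    print(diagnose_campaign_name("Acme Inbound ", "Acme Inbound"))
```

Any result other than "exact match" means the portal value should be corrected to match Five9 character for character.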

"Recording audio won't play"

  1. Check the S3 audio move job in the background job dashboard.
  2. Wait 15 minutes (one cycle) and try again.
  3. If the audio is still missing, Five9 may not have delivered it. Check the Five9 SFTP server (engineering has access).

"An invoice is wrong"

  1. Open the invoice — check the line items.
  2. Cross-reference with the period's call data in the Calls view.
  3. If the call count looks right but the rate is wrong, it's a Stripe configuration issue — fix in Stripe.
  4. If the call count is wrong, dig into Calls and look for missing or duplicate records.
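
For step 4, the comparison can be sketched as a set difference plus a count. This assumes you can obtain a list of call identifiers from the Calls view and a matching list from Five9 for the same billing period. How you export them is not covered here, and `audit_call_records` is a hypothetical helper name.

```python
from collections import Counter


def audit_call_records(portal_ids, source_ids):
    """Compare call IDs billed in the portal against the source system.

    Returns IDs that appear more than once in the portal (duplicates)
    and IDs present in the source but absent from the portal (missing).
    """
    counts = Counter(portal_ids)
    duplicates = sorted(call_id for call_id, n in counts.items() if n > 1)
    missing = sorted(set(source_ids) - set(portal_ids))
    return {"duplicates": duplicates, "missing": missing}


if __name__ == "__main__":
    portal = ["c-1", "c-2", "c-2", "c-4"]
    source = ["c-1", "c-2", "c-3", "c-4"]
    print(audit_call_records(portal, source))
```

Duplicates inflate the invoice and missing records deflate it, so either list being non-empty explains a wrong call count.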

"A spike alert seems wrong"

  1. Open the call-spike detail.
  2. Look at the actual call list during the window.
  3. If the calls are clearly noise (one source flooding), document and acknowledge.
  4. If the detector is consistently wrong for a particular company, escalate to engineering — the model may need tuning for that company's pattern.
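
One way to make "one source flooding" in step 3 concrete is to measure what share of the window's calls came from the single most frequent caller. This is a rough sketch under the assumption that you have the caller numbers for the window as a plain list. `top_source_share` is a hypothetical helper, and what share counts as noise is a judgment call for your team.

```python
from collections import Counter


def top_source_share(caller_numbers):
    """Return (most frequent caller, fraction of all calls it accounts for)."""
    if not caller_numbers:
        return None, 0.0
    source, count = Counter(caller_numbers).most_common(1)[0]
    return source, count / len(caller_numbers)


if __name__ == "__main__":
    window = ["+15550001", "+15550001", "+15550001", "+15550002"]
    print(top_source_share(window))
```

A share close to 1.0 suggests a single flooding source. A low share means the spike is spread across many callers and deserves a closer look before you acknowledge it as noise.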

"A user can't log in"

  1. Check whether their account is active (not discarded).
  2. Check whether their email is confirmed.
  3. Check whether they're behind a 2FA wall they can't pass — disable 2FA after verifying identity (see Security).
  4. Resend invitation if their original was never accepted.
  5. If still broken, escalate to engineering.

"Stripe webhook events look stuck"

Engineering territory — escalate. Note the time and the affected entity in your handoff.

"All recordings dashboard widgets are blank"

  1. Check whether the Five9 sync job is running (Sidekiq dashboard).
  2. Check whether Five9 itself is up.
  3. Wait 15 minutes for the next cycle.
  4. If still blank for multiple companies, escalate.

"An integrator's API calls are returning 401"

  1. Check whether their OAuth token has expired (Doorkeeper tracks issuance and expiration).
  2. Have them re-run the OAuth authorization flow to get a fresh token.
  3. If the token is valid but calls still return 401, check whether the user behind the token is still active.
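
The expiry check in step 1 comes down to simple date arithmetic. The sketch below assumes the token record exposes an issuance time, a lifetime in seconds, and an optional revocation time, here named `created_at`, `expires_in`, and `revoked_at`. Confirm those names against your own token table before relying on them. `token_status` is a hypothetical helper, not a Doorkeeper API.

```python
from datetime import datetime, timedelta, timezone


def token_status(created_at, expires_in, revoked_at=None, now=None):
    """Classify a token as 'revoked', 'expired', or 'valid'.

    created_at / revoked_at: timezone-aware datetimes.
    expires_in: lifetime in seconds, or None if the token never expires.
    """
    now = now or datetime.now(timezone.utc)
    if revoked_at is not None and revoked_at <= now:
        return "revoked"
    if expires_in is None:
        return "valid"  # assumption: no lifetime means no expiry
    if created_at + timedelta(seconds=expires_in) <= now:
        return "expired"
    return "valid"


if __name__ == "__main__":
    issued = datetime.now(timezone.utc) - timedelta(hours=3)
    print(token_status(issued, expires_in=7200))
```

If the result is "valid" and the integrator still sees 401, move on to step 3 and check the user behind the token.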

"The portal is slow"

  1. Check Sentry for performance issues.
  2. Check the Sidekiq dashboard — is one queue backed up?
  3. Check whether the Hetzner box is under load (engineering has direct access).
  4. Escalate if not resolved in 15 minutes.

"A customer says they were charged twice"

  1. Open Stripe directly — are there really two successful charges?
  2. If yes, refund the duplicate in Stripe.
  3. The portal will reflect the refund on the next sync.
  4. Document the cause — usually a webhook retry or a duplicate Stripe customer.
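
When a customer has many charges, step 1 can be easier with a small filter over exported charge data. The sketch below assumes each charge is a dictionary with the fields `id`, `customer`, `amount`, `currency`, `status`, `created` (Unix timestamp), and `refunded`. The 10-minute window is an arbitrary illustration and `find_duplicate_charges` is a hypothetical helper.

```python
from collections import defaultdict


def find_duplicate_charges(charges, window_seconds=600):
    """Return (earlier_id, later_id) pairs of likely duplicate charges.

    Two charges are flagged when both succeeded, neither is refunded,
    they share customer, amount, and currency, and they were created
    within window_seconds of each other.
    """
    groups = defaultdict(list)
    for charge in charges:
        if charge["status"] == "succeeded" and not charge.get("refunded", False):
            key = (charge["customer"], charge["amount"], charge["currency"])
            groups[key].append(charge)

    pairs = []
    for group in groups.values():
        group.sort(key=lambda c: c["created"])
        for earlier, later in zip(group, group[1:]):
            if later["created"] - earlier["created"] <= window_seconds:
                pairs.append((earlier["id"], later["id"]))
    return pairs
```

Because this groups by customer ID, it will not catch the second cause named in step 4, a duplicate Stripe customer. For that case you would need to group by something shared across both customer records, such as the billing email.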

Closing notes

Admin access is powerful. With it come a few habits worth keeping:

  • Document significant actions. When you impersonate, deactivate users, or void invoices, leave a note somewhere your team can find it.
  • Prefer fixing root causes over workarounds. A one-time manual fix is fine; if you find yourself applying the same fix a third time, escalate to engineering.
  • Watch the audit log when something feels off. It's the easiest way to tell whether a problem is human (someone changed something) or systemic (something broke on its own).
  • Keep 2FA on. Your account has the keys to the platform.
