Smartcar Outage Runbook
When Smartcar or an OEM connected-services backend degrades, rental sessions can fail at the most sensitive steps: unlock, lock, and telemetry-dependent billing. This runbook defines how to maintain safety and reduce customer impact during those incidents.
Outage Signals
Treat an issue as a potential outage when you observe one or more of the following:
- Sudden spike in lock/unlock failures across multiple vehicles or brands
- Telemetry timestamps not advancing for a broad fleet segment
- High timeout rates on command requests
- Concurrent customer reports across different locations
If failures are isolated to one vehicle, follow normal troubleshooting first.
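The distinction between an isolated vehicle fault and a broad outage can be sketched as a simple check over recent command results. This is a minimal illustration, not part of any Smartcar SDK: `CommandResult`, `looks_like_outage`, and both thresholds are hypothetical names and values to be tuned against your own baseline.

```python
from dataclasses import dataclass


@dataclass
class CommandResult:
    """One recent lock/unlock attempt (hypothetical record shape)."""
    vehicle_id: str
    brand: str
    ok: bool


def looks_like_outage(results, min_vehicles=3, failure_rate_threshold=0.5):
    """Return True when failures are both frequent and spread across vehicles.

    Both thresholds are assumed example values, not recommendations.
    """
    if not results:
        return False
    failed = [r for r in results if not r.ok]
    failure_rate = len(failed) / len(results)
    failed_vehicles = {r.vehicle_id for r in failed}
    # A high failure rate on a single vehicle is treated as a vehicle problem.
    return (
        failure_rate >= failure_rate_threshold
        and len(failed_vehicles) >= min_vehicles
    )
```

Requiring a minimum number of distinct failing vehicles keeps one misbehaving car from triggering an incident, which matches the guidance to troubleshoot isolated failures normally first.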
Severity Levels
| Severity | Criteria | Default Policy |
|---|---|---|
| SEV-1 | Widespread inability to unlock/lock; active returns blocked | Freeze new session starts, prioritize safe session closure |
| SEV-2 | Partial brand/region degradation; intermittent failures | Restrict affected cohorts, monitor every 15 minutes |
| SEV-3 | Elevated error rate but acceptable fallback success | Continue operations with alerting and manual readiness |
First 15 Minutes
Declare Incident
Open an internal incident record with timestamp, observed symptoms, and affected brands/regions.
Scope Impact
Identify how many active sessions and pending check-ins depend on affected vehicles.
Set Temporary Policy
Choose a mode: full hold on new starts, partial hold by brand/region, or monitored continue.
Notify Support and Ops
Broadcast response instructions to support, operator teams, and on-call responders.
Start Customer Messaging
Send short in-app and email notices to impacted customers.
Fallback Operation Modes
Mode A: Safe-Return Priority (SEV-1)
- Pause new rental starts on affected vehicles.
- Allow active rentals to proceed to return with operator assistance.
- Prioritize lock confirmation during check-out.
- If lock verification is unavailable, require manual safety verification before finalizing.
Mode B: Limited Continue (SEV-2)
- Block only impacted brands/regions.
- Allow unaffected fleet segments to operate normally.
- Add proactive warnings on affected check-in flows.
Mode C: Monitor and Retry (SEV-3)
- Keep operations running.
- Lengthen retry windows and raise operator alerting.
- Prepare to escalate if failure rates worsen.
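The three modes above can be expressed as a small gate on new session starts. The sketch below assumes vehicles are grouped into cohort keys such as `"brand-a:us-west"`; the `Mode` enum, `MODE_FOR_SEVERITY`, and `may_start_session` are hypothetical names used only for illustration.

```python
from enum import Enum


class Mode(Enum):
    SAFE_RETURN = "A"       # SEV-1
    LIMITED_CONTINUE = "B"  # SEV-2
    MONITOR_RETRY = "C"     # SEV-3


# Default mode per severity level, following the severity table.
MODE_FOR_SEVERITY = {
    "SEV-1": Mode.SAFE_RETURN,
    "SEV-2": Mode.LIMITED_CONTINUE,
    "SEV-3": Mode.MONITOR_RETRY,
}


def may_start_session(mode, vehicle_cohort, affected_cohorts):
    """Decide whether a new rental may start on a vehicle in the given cohort.

    Cohort keys (for example "brand:region") are an assumed convention.
    """
    if mode is Mode.MONITOR_RETRY:
        # Mode C keeps operations running everywhere.
        return True
    # Modes A and B both pause new starts on affected vehicles only.
    return vehicle_cohort not in affected_cohorts
```

In this sketch Modes A and B gate starts the same way; the difference between them lies in how active rentals are handled, which stays an operator-assisted process.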
Command Failure Handling
| Step | Failure | Fallback |
|---|---|---|
| Check-in unlock | Command times out/fails | Retry, operator-assisted unlock, or reassign vehicle |
| In-trip lock/unlock | Intermittent failure | Retry with user guidance and operator escalation path |
| Check-out lock | Cannot confirm lock state | Manual verification protocol before force-close |
| Telemetry read | Stale odometer/fuel | Use manual photo evidence and post-incident reconciliation |
If a customer hits repeated failures, route them to support immediately rather than continuing blind retries.
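One way to avoid blind retries is to cap the number of attempts and hand off to support once the cap is reached. The following is a sketch under assumptions: `send_command` and `escalate` are hypothetical callables supplied by your own integration, and the attempt count and delays are example values.

```python
import time


def run_with_fallback(send_command, escalate, max_attempts=3,
                      base_delay_s=2.0, sleep=time.sleep):
    """Try a command a bounded number of times, then escalate to support.

    send_command: callable returning True on success (may raise TimeoutError).
    escalate: callable that routes the session to support or an operator.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            if send_command():
                return "ok"
        except TimeoutError:
            pass  # Treat a timeout like any other failed attempt.
        if attempt < max_attempts:
            # Exponential backoff between attempts: 2s, 4s, ...
            sleep(base_delay_s * 2 ** (attempt - 1))
    escalate()
    return "escalated"
```

Passing `sleep` as a parameter keeps the function easy to test without real delays.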
Customer Communication Templates
Active Incident Banner
"Some connected-vehicle commands are currently delayed. We are actively working on a fix and support is available if your rental is affected."
Check-In Impacted
"Your vehicle connection is temporarily unavailable. We are retrying now and can help reassign your booking if needed."
Check-Out Impacted
"Return confirmation is delayed due to a temporary connectivity issue. Keep the vehicle secured and follow in-app instructions while we finalize your session."
Operator Checklist During Outage
- Keep a queue of sessions needing manual assistance.
- Record all manual interventions with reason and timestamps.
- Capture supporting photos for any force-end or delayed-finalization flow.
- Do not charge disputed overages until telemetry or evidence is reconciled.
- Confirm vehicle physical security for every manually closed check-out.
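The checklist items on recording interventions, capturing photos, and confirming physical security could be captured in a single record per manual action. This is only a sketch; `ManualIntervention`, its fields, and `ready_to_close` are hypothetical and would need to match your own ticketing or session tooling.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ManualIntervention:
    """One manual action taken during an outage (hypothetical schema)."""
    session_id: str
    reason: str
    operator: str
    photo_refs: list = field(default_factory=list)
    physically_secured: bool = False
    # Timestamp is recorded in UTC when the record is created.
    at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


def ready_to_close(record):
    """A manual check-out closes only with a reason, photos, and a security check."""
    return (
        bool(record.reason.strip())
        and bool(record.photo_refs)
        and record.physically_secured
    )
```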
Recovery Phase
When upstream systems recover:
- Confirm command success and telemetry freshness return to baseline.
- Re-enable paused session-start policies in stages.
- Reconcile sessions that were force-closed or delayed.
- Review potential billing corrections (late fees, surcharges, and extensions caused by the outage).
- Send closure communication to impacted customers.
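Staged re-enablement can be sketched as a ramp that advances only while command success stays near baseline and steps back otherwise. The stage percentages, the tolerance, and the function name below are assumptions for illustration, not prescribed values.

```python
# Hypothetical rollout stages: percent of paused cohorts re-enabled.
STAGES = (0, 25, 50, 100)


def next_enabled_percent(current, success_rate, baseline_success_rate,
                         tolerance=0.02):
    """Return the next rollout stage given the current command success rate.

    `current` must be one of STAGES; any other value raises ValueError.
    """
    idx = STAGES.index(current)
    healthy = success_rate >= baseline_success_rate - tolerance
    if not healthy:
        # Step back one stage if success has dropped below baseline.
        return STAGES[max(idx - 1, 0)]
    return STAGES[min(idx + 1, len(STAGES) - 1)]
```

Evaluating this once per monitoring interval gives a gradual ramp that reverses on its own if the upstream recovery turns out to be partial.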
Post-Incident Review
Capture these points within 24 hours:
- Start/end timestamps and total impact window
- Affected vehicles, sessions, and customers
- Number of manual interventions and unresolved cases
- Revenue or support impact
- What changed in monitoring, policy, or tooling after the incident
Use this review to tighten future response time and reduce repeat disruption.
Billing Integrity
During outages, avoid automatic penalties that rely on uncertain timestamps or stale telemetry. Reconcile first, then bill.
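The "reconcile first, then bill" rule can be sketched as a guard in front of any automatic penalty. The function name, return values, and the 10-minute staleness limit are assumptions; the incident window would come from your incident record.

```python
from datetime import datetime, timedelta, timezone


def penalty_decision(session_end, last_telemetry_at, incident_start,
                     incident_end, max_staleness=timedelta(minutes=10)):
    """Return "bill" only when timestamps and telemetry can be trusted.

    All arguments are timezone-aware datetimes; max_staleness is an
    assumed example threshold.
    """
    in_incident_window = incident_start <= session_end <= incident_end
    stale = session_end - last_telemetry_at > max_staleness
    if in_incident_window or stale:
        # Hold automatic late fees or overages until evidence is reconciled.
        return "hold_for_reconciliation"
    return "bill"
```

Sessions that return `"hold_for_reconciliation"` would then join the post-recovery reconciliation queue instead of being charged automatically.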
Need Help?
For incident response assistance, contact support@levyelectric.com and include your incident start time, affected subaccount, and representative session IDs.