On 19 August 2025, Learnosity experienced degraded performance in our analytics stack, affecting session data availability in US-East-1 for a subset of customers. The impact was limited to the Reports API and Data API; no other stacks were affected, and no data was lost.
Monitoring detected a rapid increase in unprocessed and retried messages, along with elevated lock contention in the sessions database. The root cause was traced to a customer implementation issue that generated an extraordinarily high number of submissions. These drove excessive retries, which magnified the actual traffic volume.
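To illustrate the amplification effect, the sketch below estimates expected delivery attempts per message under a simple capped-retry policy. The failure probability and retry cap are illustrative assumptions, not measured values from this incident.

```python
def expected_attempts(p_fail: float, max_retries: int) -> float:
    """Expected delivery attempts per message when each attempt fails
    independently with probability p_fail and is retried up to max_retries times."""
    # Attempt k occurs only if all k previous attempts failed.
    return sum(p_fail ** k for k in range(max_retries + 1))

# Illustrative numbers only: heavy lock contention pushes the failure rate up,
# turning each submission into several queue deliveries.
print(expected_attempts(p_fail=0.1, max_retries=5))  # ~1.11 attempts in normal conditions
print(expected_attempts(p_fail=0.9, max_retries=5))  # ~4.69 attempts under contention
```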
The use of time-ordered v7 UUIDs for session IDs, normally handled without issue, became problematic under this contention. Uniqueness checks on each session ID required more resources and triggered a succession of temporary deadlocks. These deadlocks would usually self-resolve, but the amplified traffic prevented recovery, turning a minor issue into a sustained queue blockage.
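For context on the two ID formats, the sketch below shows a minimal UUIDv7 generator following the RFC 9562 layout (a 48-bit millisecond timestamp prefix followed by random bits) alongside the standard library's fully random UUIDv4. This is illustrative only and is not Learnosity's implementation.

```python
import os
import time
import uuid

def uuid7() -> uuid.UUID:
    """Minimal RFC 9562 UUIDv7: 48-bit Unix ms timestamp, then version,
    variant, and 74 random bits."""
    ts_ms = int(time.time() * 1000) & ((1 << 48) - 1)
    rand_a = int.from_bytes(os.urandom(2), "big") & 0xFFF            # 12 random bits
    rand_b = int.from_bytes(os.urandom(8), "big") & ((1 << 62) - 1)  # 62 random bits
    value = (ts_ms << 80) | (0x7 << 76) | (rand_a << 64) | (0b10 << 62) | rand_b
    return uuid.UUID(int=value)

# v7 IDs generated close together share a timestamp prefix, so concurrent
# uniqueness checks land on neighbouring index entries; v4 IDs are fully random.
print(uuid7(), uuid7(), sep="\n")
print(uuid.uuid4())
```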
Once the issue was identified, Learnosity moved the customer to a dedicated, isolated sync queue, preventing cross‑tenant impact while we investigated. We applied targeted rate limits to the isolated service to protect the database and drained the backlog. Where safe, long‑running queries were terminated to free locks and allow forward progress.
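As a rough sketch of the kind of protective rate limit applied to the isolated service, the token bucket below throttles writes to a fixed sustained rate with a bounded burst. The rate, capacity, and consumer loop are placeholders, not the limits or code actually used.

```python
import time

class TokenBucket:
    """Simple token-bucket limiter: bursts up to `capacity`, sustained
    throughput of `rate` operations per second."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def acquire(self) -> None:
        """Block until one token is available, then consume it."""
        while True:
            now = time.monotonic()
            self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= 1:
                self.tokens -= 1
                return
            time.sleep((1 - self.tokens) / self.rate)

# Placeholder limits: at most 50 session writes/second with bursts of 100.
limiter = TokenBucket(rate=50, capacity=100)
# for message in isolated_queue:   # hypothetical consumer loop
#     limiter.acquire()
#     persist_session(message)     # hypothetical write path
```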
To support faster diagnosis, Learnosity enabled detailed deadlock logging and expanded metrics around message retries, abandonment, and per‑session activity. Learnosity also worked with the customer to adjust implementation settings, reducing combined saves and submits by two orders of magnitude. Session IDs were also switched to v4 UUIDs, which simplified uniqueness checks and prevented further deadlocks.
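The session‑ID change itself is straightforward; a sketch of the swap, together with the kind of retry and abandonment counters that were expanded, is shown below. The counter names and the metrics store are hypothetical stand-ins.

```python
import uuid
from collections import Counter

def new_session_id() -> str:
    # Random v4 session IDs: no shared timestamp prefix, so concurrent
    # uniqueness checks no longer cluster on the same index region.
    return str(uuid.uuid4())

# Hypothetical stand-in for the expanded queue metrics.
metrics = Counter()

def record_delivery(retry_count: int, abandoned: bool) -> None:
    metrics["messages.delivered"] += 1
    metrics["messages.retries"] += retry_count
    if abandoned:
        metrics["messages.abandoned"] += 1
```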
Immediately after these changes took effect, queues began to recover rapidly and normal processing resumed. Most sessions for the subset of affected customers saw short delays, while the most significantly delayed session took ~6 hours before final persistence. Throughout the incident, we identified no data loss.
To prevent recurrence, we are: