Issue affecting availability of session information in US-East-1

Incident Report for Learnosity

Postmortem

Affected Systems and Regions

On March 3, 2025, we experienced a temporary service slowdown impacting the availability of session information in the US-East-1 region. The issue began at 13:19 UTC and was resolved by 15:15 UTC. The Data API and Reports API experienced delays and intermittent failures when returning session results. The Authoring and Assessment APIs were unaffected, and no data was lost.

Investigation

We identified that a recently introduced, inefficient database query was placing unnecessary strain on the database when handling extremely large data sets. This delayed responses from the Data API sessions endpoints, which in turn caused delays or unavailability in select reports served through the Reports API. Additionally, some timeouts and safeguards that should have triggered earlier did not activate as expected.

Resolution

To quickly restore performance:

  • We optimized the affected queries, improving efficiency and reducing database load.
  • We scaled up the impacted systems to immediately process the backlog.
  • We fine-tuned system timeouts and connection limits to prevent similar issues in the future.

Following these fixes, all services resumed normal operations.

Prevention

To prevent future occurrences:

  • We have permanently updated our database handling methods to ensure more efficient query execution.
  • We are enhancing automated monitoring to detect and respond to database slowdowns before they impact customers.
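As an illustration of the kind of check such monitoring might perform, the sketch below tracks recent query latencies in a rolling window and flags a slowdown when the 95th percentile exceeds a threshold. It is a minimal example under stated assumptions, not a description of Learnosity's actual monitoring: the `LatencyMonitor` class, the window size, and the 500 ms threshold are all hypothetical.

```python
from collections import deque


class LatencyMonitor:
    """Hypothetical rolling-window monitor for database query latency."""

    def __init__(self, window_size=100, p95_threshold_ms=500.0):
        # Keep only the most recent `window_size` samples.
        self.samples = deque(maxlen=window_size)
        self.p95_threshold_ms = p95_threshold_ms

    def record(self, latency_ms):
        self.samples.append(latency_ms)

    def p95(self):
        if not self.samples:
            return 0.0
        ordered = sorted(self.samples)
        # Nearest-rank percentile, computed with integer arithmetic:
        # rank = ceil(0.95 * n), converted to a zero-based index below.
        rank = (95 * len(ordered) + 99) // 100
        return ordered[rank - 1]

    def is_degraded(self):
        # True when the slowest 5% of recent queries exceed the threshold.
        return self.p95() > self.p95_threshold_ms


monitor = LatencyMonitor(window_size=20, p95_threshold_ms=500.0)
for _ in range(20):
    monitor.record(100.0)          # healthy baseline
healthy = monitor.is_degraded()

monitor.record(2000.0)             # a single slow query is tolerated
one_slow = monitor.is_degraded()

monitor.record(2000.0)             # a second slow query pushes p95 over the threshold
two_slow = monitor.is_degraded()
```

Using a percentile rather than an average keeps a single outlier from raising an alert while still reacting quickly once slow queries become a pattern.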

We appreciate your patience and remain committed to delivering a seamless experience.

Posted Mar 14, 2025 - 12:35 EDT

Resolved

As of 15:45 UTC, after a further 30 minutes of monitoring, we are resolving this issue. All services remain fully operational.

Learnosity Support and Systems Engineering teams will follow up with a postmortem once we have completed root cause analysis and finalised any next steps or preventative measures required.

Please reach out if you have any questions or concerns.
Posted Mar 03, 2025 - 10:54 EST

Monitoring

As of 15:15 UTC, all services have been restored. All queued messages have been processed and session information should now be available. Users of the Learnosity Firehose service will still see slight delays as messages are sent out for recently processed events.

Learnosity Support and Systems Engineering teams are continuing to actively investigate the issue, and will follow on with an update and resolution as soon as possible.
Posted Mar 03, 2025 - 10:20 EST

Update

As of 15:00 UTC, we are continuing to investigate the issue. The Data API and Reports API are still impacted.

Assessment and Authoring APIs remain unaffected. Assessment submissions are being queued and will be processed as the system works through the backlog. Note that if the Data API or Reports API is used in a preliminary/synchronous assessment delivery workflow, this will likely have a knock-on effect when initializing assessments. Initializing any assessment API directly is unaffected at present.
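For integrators whose delivery workflow calls the Data API or Reports API before initializing an assessment, one way to limit this knock-on effect is to bound the preliminary call with a deadline and continue without its result if it fails or runs long. The following is a minimal sketch, not Learnosity SDK code: `call_with_deadline`, `fetch_session_summary`, and `initialize_assessment` are hypothetical placeholders for the integrator's own functions, and the timeout value is illustrative.

```python
import concurrent.futures
import time


def call_with_deadline(func, timeout_s, fallback):
    """Run func() in a worker thread; return fallback on error or timeout."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(func)
    try:
        return future.result(timeout=timeout_s)
    except Exception:
        # Covers both the timeout and any error raised by func itself.
        return fallback
    finally:
        # Do not block on a slow call; the worker finishes in the background.
        pool.shutdown(wait=False)


def fetch_session_summary():
    # Hypothetical stand-in for a slow preliminary Data API request.
    time.sleep(0.3)
    return {"status": "ok"}


def initialize_assessment(summary):
    # Hypothetical stand-in: proceeds with or without the preliminary data.
    return {"initialized": True, "has_summary": summary is not None}


summary = call_with_deadline(fetch_session_summary, timeout_s=0.05, fallback=None)
result = initialize_assessment(summary)
```

Whether it is acceptable to proceed without the preliminary data depends on the workflow; where that data is required, showing the user a retry prompt may be a better fallback than continuing.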

Learnosity Support and Systems Engineering teams are continuing to actively investigate the issue, and will follow on with an update and resolution as soon as possible.
Posted Mar 03, 2025 - 10:06 EST

Update

As of 14:00 UTC, we are still investigating an issue affecting the availability of session information in US-East-1. Both Data API and Reports API are experiencing extended delays and intermittent failures in returning session results. Authoring and Assessment APIs appear to be unaffected at this time and all session submissions are being successfully queued.

Learnosity Support and Systems Engineering teams are continuing to actively investigate the issue, and will follow on with an update and resolution as soon as possible.
Posted Mar 03, 2025 - 09:00 EST

Investigating

As of 13:45 UTC, we are experiencing delays in the availability of session information in US-East-1.

Learnosity Support and Systems Engineering teams are actively investigating the issue, and will follow on with an update and resolution as soon as possible.
Posted Mar 03, 2025 - 08:48 EST
This incident affected: AMER || Analytics (Loading and rendering of reports, Availability of session information).