Investigating possible issue with saving student responses in US-East-1
Incident Report for Learnosity
Postmortem

On November 1, 2022, one of the four databases that persist student responses became unable to write an ID that is used only to order questions/responses in review and report modes. Analytics features were potentially impacted for at most 25% of the sessions submitted during the 2 hour and 9 minute incident. Because only a subset of those sessions used the contributing features, an estimated 1% to 3% of sessions were actually affected. No data was lost; only the order in which questions were displayed was temporarily changed.

Immediately after discovery, we began testing remediation options. The safest and quickest way to address the issue was to reset the impacted ID, as it played no other role in processing the data. In theory, the reset could not affect a typical assessment, because questions and responses are captured within milliseconds of initialization. However, a very small number of sessions using deferred loading features (such as sections and, in some cases, lazy loading or dynamically adding items via public method) were impacted, and only if the ID was reset after an initial batch of items was loaded and before a following batch was loaded. In this case, the reset caused the sequential IDs in the later batch to be set to values lower than those in the earlier batch. (Using sections as an example, this caused section two to appear before section one in a report.)
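
The ordering effect described above can be illustrated with a minimal sketch. It is not Learnosity's implementation: the `SequenceCounter` class, the starting values, and the item labels are all hypothetical, chosen only to show how a reset between two batches reverses their order when sorting by ID.

```python
class SequenceCounter:
    """Hypothetical stand-in for the sequential ordering ID on one shard."""

    def __init__(self, start=1):
        self._next = start

    def next_id(self):
        value = self._next
        self._next += 1
        return value

    def reset(self, start=1):
        # Models the remediation step: the ID restarts from a low value.
        self._next = start


def load_batch(counter, session, labels):
    """Capture one batch of items, assigning each the next sequential ID."""
    for label in labels:
        session.append((counter.next_id(), label))


def simulate(reset_between_batches):
    """Return item labels in the order a report sorted by ID would show them."""
    counter = SequenceCounter(start=1000)  # assumed value before the reset
    session = []
    load_batch(counter, session, ["s1-q1", "s1-q2"])  # section one
    if reset_between_batches:
        counter.reset(start=1)
    load_batch(counter, session, ["s2-q1", "s2-q2"])  # section two, loaded later
    return [label for _, label in sorted(session)]


print(simulate(reset_between_batches=False))
print(simulate(reset_between_batches=True))
```

Without the reset, section one sorts ahead of section two. With the reset falling between the two batches, the later batch receives the lower IDs and sorts first, which matches the section example above.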

To summarize, this issue was relevant only when all of the following applied:

  • customer was among a subset of customers using the US-East-1 region
  • sessions were among the 25% or less captured to the single affected shard
  • sessions were initialized for the first time during the incident window
  • sessions contained more than one section or deferred loading mechanism
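
The four conditions above combine as a logical AND. As a rough sketch only, they could be expressed as a single check. The `Session` fields, the shard number, and the window timestamps below are hypothetical placeholders, not values from Learnosity's systems.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Session:
    # All field names are assumptions made for illustration.
    region: str
    shard: int
    first_initialized: datetime
    load_batches: int  # sections or deferred loads; 1 means everything loaded at once


def could_be_affected(session, affected_shard, window_start, window_end):
    """True only when every condition in the list above holds."""
    return (
        session.region == "US-East-1"
        and session.shard == affected_shard
        and window_start <= session.first_initialized <= window_end
        and session.load_batches > 1
    )


# Placeholder window and shard, for illustration only.
start = datetime(2022, 11, 1, 17, 0)
end = datetime(2022, 11, 1, 19, 0)
candidate = Session("US-East-1", 2, datetime(2022, 11, 1, 18, 0), load_batches=2)
print(could_be_affected(candidate, affected_shard=2, window_start=start, window_end=end))
```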

Long-term adjustments were also made to prevent this issue from recurring. In addition to the immediate fix of resetting the ID, the review/reporting logic was improved so that it no longer requires the ID to order questions/responses. As a result, all affected sessions have since displayed correctly, without any need for rescoring or other alterations to the data.
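
The report does not say which key the improved logic orders by. Purely as an illustration, the sketch below assumes each stored response also carries its batch index and its position within that batch (both hypothetical fields). It shows why changing the sort key corrects the display without modifying stored data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Response:
    seq_id: int       # sequential ID, reset mid-session in this example
    batch_index: int  # assumed field: which batch (e.g. section) the item came from
    position: int     # assumed field: position of the item within its batch
    label: str


def order_by_id(responses):
    """Previous behaviour in this sketch: order by the sequential ID."""
    return [r.label for r in sorted(responses, key=lambda r: r.seq_id)]


def order_by_position(responses):
    """Alternative ordering that never reads the sequential ID."""
    return [r.label for r in sorted(responses, key=lambda r: (r.batch_index, r.position))]


# An affected session: the ID was reset between section one and section two.
affected = [
    Response(seq_id=1000, batch_index=0, position=0, label="s1-q1"),
    Response(seq_id=1001, batch_index=0, position=1, label="s1-q2"),
    Response(seq_id=1, batch_index=1, position=0, label="s2-q1"),
    Response(seq_id=2, batch_index=1, position=1, label="s2-q2"),
]

print(order_by_id(affected))
print(order_by_position(affected))
```

The same stored records produce the intended order once the sort no longer depends on the reset ID, which is consistent with affected sessions displaying correctly without rescoring.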

Posted Feb 14, 2023 - 11:32 EST

Resolved
As of 18:59 UTC, we have corrected the issue affecting the capture and persistence of some student responses. A single database was affected and our sharding process limited impact to 25% or less of active sessions only, for a subset of customers, in the US-East-1 region.

Learnosity Support and Systems Engineering teams will continue to monitor the situation for a further period to be sure errors no longer occur. We will further follow up with a post mortem once we have completed root cause analysis and finalised any next steps or preventative measures required.

Please reach out if you have any questions or concerns.
Posted Nov 01, 2022 - 15:21 EDT
Identified
As of 17:55 UTC, we have identified an issue with saving student responses in active sessions only, affecting a single shard in region US-East-1. We have elevated this to Partial outage for affected customers and are working to correct the issue. We are returning the 'Availability of existing session information' status to Operational and will continue to monitor results.

Learnosity Support and Systems Engineering teams are continuing to actively investigate the issue, and will follow on with an update and resolution as soon as possible.
Posted Nov 01, 2022 - 14:00 EDT
Update
As of 17:30 UTC, we've detected an increase in errors in persisting student responses for some customers in the US-East-1 region. This currently appears to be affecting a minority of active sessions only. Authoring and Analytics remain unaffected.

Learnosity Support and Systems Engineering teams are continuing to actively investigate the issue, and will follow on with an update and resolution as soon as possible.
Posted Nov 01, 2022 - 13:37 EDT
Investigating
As of 17:10 UTC, we are investigating a possible issue with saving student responses in the US-East-1 region.

Learnosity Support and Systems Engineering teams are actively investigating the issue, and will follow on with an update and resolution as soon as possible.
Posted Nov 01, 2022 - 13:26 EDT
This incident affected: AMER || Analytics (Availability of session information) and AMER || Assessment (Saving of student responses).