Backend API

Back-end degraded temporarily

Postmortem

This post-mortem examines an incident that occurred on 20/06/2023 in the integration between a Single Sign-On (SSO) system and Zivver. The incident was primarily caused by the metadata URL being unavailable and the server behind it not responding, which disrupted user access and degraded platform performance.

Root Cause:

  1. Unavailability of Metadata URL: The integration relied on Zivver fetching essential information (metadata) about the Identity Provider from a designated URL. However, this metadata URL became inaccessible, preventing Zivver from retrieving the necessary data. Consequently, the normal authentication and authorization processes were disrupted, causing difficulties for users attempting to access Zivver’s services.
  2. Lack of Server Response: The Zivver platform is designed to wait for a response from the server that provides the metadata, but no response arrived within the expected timeframe. Each affected request was held while the platform waited, so pending requests accumulated. The resulting backlog of authentication requests overwhelmed the platform, leading to delays and potential timeouts.
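
The accumulation described in point 2 can be illustrated with a small calculation. The sketch below applies Little's law (average requests in flight = arrival rate × time each request waits). The function name and all numbers are hypothetical and chosen only for illustration; they are not measurements from this incident.

```python
def pending_requests(arrivals_per_second: float, wait_seconds: float) -> float:
    """Average number of requests in flight, by Little's law (L = lambda * W)."""
    return arrivals_per_second * wait_seconds


# Hypothetical figures, for illustration only (not taken from the incident).
healthy = pending_requests(arrivals_per_second=50, wait_seconds=0.2)
stalled = pending_requests(arrivals_per_second=50, wait_seconds=30)

print(f"in flight when the server answers quickly: {healthy:.0f}")
print(f"in flight when every request waits 30 s:   {stalled:.0f}")
```

With the same arrival rate, a longer wait per request multiplies the number of requests held open at once, which is the kind of backlog described above.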

Impact:
The incident had several notable impacts on the integration and its users:

  • Difficulties in Access: Users faced challenges when attempting to access Zivver’s services due to the unavailability of the metadata URL. This disrupted their workflow and hindered their ability to utilize the platform effectively.
  • Platform Overload: The accumulation of pending authentication requests, coupled with the lack of server response, resulted in an overload on the Zivver platform. This led to delays in processing requests and potentially caused timeouts, affecting the overall performance and responsiveness of the system.

Solution:

  • Zivver immediately made changes to the platform to ensure this issue cannot happen again. The platform no longer keeps trying to load SSO metadata that is unavailable, and unavailable metadata no longer affects the rest of the platform.
  • In addition, Zivver has started reviewing other parts of the platform to ensure that similar unlikely situations cannot occur elsewhere.
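
As a rough illustration of the kind of safeguard described in the first point, the sketch below bounds each metadata fetch with a short timeout and stops retrying a URL that failed recently. It is not Zivver's actual implementation: the class `MetadataCache`, the helper `http_fetch`, and the parameters `timeout_s` and `retry_after_s` are all hypothetical names and values introduced for this example.

```python
import time
import urllib.request
from typing import Callable, Optional


def http_fetch(url: str, timeout_s: float) -> str:
    """Fetch a document over HTTP, giving up after timeout_s seconds."""
    with urllib.request.urlopen(url, timeout=timeout_s) as resp:
        return resp.read().decode("utf-8")


class MetadataCache:
    """Hypothetical cache that fails fast for metadata URLs that recently failed."""

    def __init__(self,
                 fetch: Callable[[str, float], str] = http_fetch,
                 timeout_s: float = 2.0,        # assumed value
                 retry_after_s: float = 60.0,   # assumed value
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch
        self._timeout_s = timeout_s
        self._retry_after_s = retry_after_s
        self._clock = clock
        self._last_good = {}   # url -> last successfully fetched metadata
        self._failed_at = {}   # url -> time of the most recent failure

    def get(self, url: str) -> Optional[str]:
        now = self._clock()
        failed_at = self._failed_at.get(url)
        if failed_at is not None and now - failed_at < self._retry_after_s:
            # The URL failed recently: do not block this request on another fetch.
            return self._last_good.get(url)
        try:
            metadata = self._fetch(url, self._timeout_s)
        except Exception:
            # Remember the failure and fall back to the last good copy, if any.
            self._failed_at[url] = now
            return self._last_good.get(url)
        self._failed_at.pop(url, None)
        self._last_good[url] = metadata
        return metadata
```

In this sketch only the first request after a failure pays the timeout; later requests within `retry_after_s` return immediately, so an unreachable metadata server cannot hold a growing number of requests open. A production version would also reuse successful fetches for a freshness window rather than fetching on every call.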

Resolved

Dear customer,

We have found and resolved the root cause. After monitoring closely for an hour, we are confident the issue is mitigated.

A post-mortem will be shared later this week.

Thank you for your patience,

Zivver Support

Monitoring

Dear customer,

We have found and mitigated the root cause of the issue. Your users should be able to log in to Zivver once more.
If you are still experiencing issues, please contact support@zivver.com.

A post-mortem will be shared later this week.

Thank you for your patience.

Updated

Dear customer,

Our apologies, the sending of the previous update was delayed.

Current status: The path to a solution is clear, and we are working on mitigating the cause of the issue.

Problem Identified

Dear Customer,

We have identified the cause of the service degradation. We are working on a mitigation and will implement it as soon as possible.

Next update in 30 minutes.

Assessed

Dear customer,

At the moment we are experiencing an issue with our back-end which might prevent you from logging in to Zivver. We are working on identifying and mitigating the underlying issue.

Next update at 10:00 CEST