V2 API & APP - Unavailability

Incident Report for Yousign

Postmortem

Incident Impact and Resolution Report

Dear Yousign Users,

On 2023-11-30, an unexpected service disruption affected every customer using Yousign V2 via the API or Web Application. The outage lasted approximately 20 minutes and was followed by about one hour of degraded V2 service, mainly in the form of delayed signature status changes.

The disruptions commenced at 14:36 CET, initially manifesting as increased response times and occasional timeouts. The situation gradually escalated to more frequent timeouts and HTTP 500 errors, reaching its peak within minutes, resulting in a global V2 outage until 14:55 CET.

From 14:55 CET to 15:40 CET, the V2 solution worked through all pending requests, but signature status updates were delayed. As a result, the displayed status was slow to change from processing to signed.

The impacted components during this incident were:

Yousign V2 - API V2: https://api.yousign.com
Yousign V2 - APP V2: https://webapp.yousign.com

We want to assure you that despite the disruption, no information was lost, and all received signatures were properly processed.

Root Cause Analysis

Our Yousign databases are safeguarded by various mechanisms designed to minimize downtime in case of an issue. One such mechanism replicates databases from our main provider instances to our replicated instances, for safeguarding and internal usage purposes. However, this replication process is highly dependent on disk capacity and resource usage.

On 2023-11-15, an alarm alerted us to low available disk space on one of these resources. Regrettably, the alert was acknowledged but not acted upon by the responsible team. No other alerts were raised until the incident occurred.

Correction

To mitigate the issue, we had to halt the faulty replication process for our V2 database and resize the disk to alleviate pressure on the main database process. With sufficient space and the replication mechanism stopped, the V2 solution became responsive enough to handle new requests, ultimately ending the global outage.

Permanent Fix

We have identified the root cause and will implement the following actions in the coming days to ensure a permanent fix:

Rework alerts to implement a "rearm" mechanism, re-raising unresolved issues.
Modify alert behavior to include additional thresholds for disk usage.
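
To illustrate the two actions listed above, here is a minimal Python sketch of a disk usage alert with several thresholds and a "rearm" timer. It is not Yousign's actual alerting code: the threshold values, the 24-hour rearm interval, and the names `THRESHOLDS`, `REARM_SECONDS`, `severity_for`, and `DiskAlert` are all assumptions chosen for the example. The idea is that an alert is raised again whenever the severity changes or the rearm interval elapses while the condition still holds, so an acknowledged but unresolved alert cannot stay silent indefinitely.

```python
import shutil
import time
from dataclasses import dataclass
from typing import Optional

# Hypothetical thresholds (fraction of disk used), highest first.
THRESHOLDS = [(0.95, "critical"), (0.85, "warning"), (0.70, "notice")]
# Hypothetical rearm interval: re-raise an unresolved alert after 24 hours.
REARM_SECONDS = 24 * 3600


def severity_for(used_fraction: float) -> Optional[str]:
    """Return the highest severity whose threshold is met, or None."""
    for limit, name in THRESHOLDS:
        if used_fraction >= limit:
            return name
    return None


@dataclass
class DiskAlert:
    last_raised_at: Optional[float] = None
    last_severity: Optional[str] = None

    def evaluate(self, used_fraction: float, now: float) -> Optional[str]:
        """Return a severity to raise (or re-raise) now, or None to stay quiet."""
        severity = severity_for(used_fraction)
        if severity is None:
            # Condition cleared: reset so a future breach raises immediately.
            self.last_raised_at = None
            self.last_severity = None
            return None
        # Raise when the severity changes (escalation or de-escalation).
        changed = severity != self.last_severity
        # Rearm: the issue is still unresolved after the rearm interval.
        rearmed = (
            self.last_raised_at is not None
            and now - self.last_raised_at >= REARM_SECONDS
        )
        if changed or rearmed:
            self.last_raised_at = now
            self.last_severity = severity
            return severity
        return None


if __name__ == "__main__":
    # Usage example: check the root filesystem of the local machine once.
    usage = shutil.disk_usage("/")
    alert = DiskAlert()
    print(alert.evaluate(usage.used / usage.total, time.time()))
```

In this sketch, acknowledging an alert has no effect on the timer: only a return below the lowest threshold resets the state, which is one possible way to ensure unresolved issues are re-raised.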

We sincerely apologize for any inconvenience this incident may have caused and appreciate your understanding as we work to enhance our systems for a more resilient and reliable service.

Thank you for your continued trust in Yousign.

Posted Dec 05, 2023 - 09:44 CET

Resolved

Incident resolved.
APP & API v2 in nominal state.
Posted Nov 30, 2023 - 15:59 CET

Update

APP & API v3 are no longer impacted.
APP & API v2 signature delay is back to normal; however, response times are still slightly slower than usual. Investigation in progress.
Posted Nov 30, 2023 - 15:54 CET

Update

Delays in APP & API v2 are impacting signature time on APP & API v3.
Posted Nov 30, 2023 - 15:23 CET

Monitoring

Issue identified and fixed; the APP & API are slowly recovering.
Posted Nov 30, 2023 - 15:07 CET

Identified

Yousign V2 API & APP is currently unavailable (https://api.yousign.com & https://webapp.yousign.com). Our engineers are working to resolve this as fast as possible.

We will update you again shortly here.

If you have urgent questions, please contact our support team by sending an email to support@yousign.com.
Posted Nov 30, 2023 - 14:42 CET
This incident affected: Yousign V2 (API V2 - https://api.yousign.com, APP V2 - https://webapp.yousign.com) and Yousign V3 (APP V3 - https://yousign.app, API V3 - https://api.yousign.app).