| # How the healthcheck plugin and multi-site keep your sleep at night |
| |
| GerritHub.io had an outage on the 3rd of November at 15:20 GMT, which |
| was caused by a [critical Gerrit issue](https://bugs.chromium.org/p/gerrit/issues/detail?id=16384) |
| discovered that day. |
| |
| The issue was deep in the core of Gerrit: it involved loading the |
| account external-ids and affected pretty much anything that required |
| authenticated traffic. However, none of the GerritHub.io users |
| noticed any issues, delays, slowdowns, or reduced functionality. |
| |
| In this talk, Tony and Luca will describe what happened and how |
| the GerritForge Team detected, analyzed, and mitigated the problem, |
| avoiding a global outage. |
| |
| The lessons from this story can help other Gerrit admins |
| set up operating practices around metrics, high availability, |
| and service resilience with Gerrit that are useful for |
| preventing sleepless nights and managing outages. |
| |
| *[Luca Milanesio, GerritForge](../speakers.md#lmilanesio)* |
| *[Antonio Barone, GerritForge](../speakers.md#abarone)* |