Automated builds produce a noisy signal. Due to non-deterministic (a.k.a. ‘flaky’) behaviour, they can fail when they should have passed. Nevertheless, it is not uncommon for software organizations to incorporate automated builds into the code review process. In such cases, a failing automated build can block integration of a change set. To prevent invalid failures from unfairly blocking integration, organizations may allow developers to request a ‘recheck’, which repeats the build without updating the change set. While practical, an unconstrained recheck command may waste time and resources if it is not applied judiciously.
In this talk, I will describe our research on the use of the recheck command in 66,932 code reviews from the OpenStack community. We quantitatively analyze (i) how often build failures are rechecked; (ii) the extent to which invoking recheck changes build failure outcomes; and (iii) how much waste is generated by invoking recheck. We observe that (i) 55% of code reviews invoke the recheck command after a failing build is reported; (ii) invoking the recheck command changes the outcome of a failing build in only 42% of cases; and (iii) invoking the recheck command increases review waiting time by an average of 2,200% and equates to 187.4 compute years of waste, enough compute time to rival the age of the oldest living land animal on Earth.
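To make the three measurements concrete, the following is a minimal sketch of how such quantities could be computed from per-review records. It is an illustration only, not the analysis pipeline used in the study: the `Review` record, its field names, and the toy data are all hypothetical, and the choice of denominators (reviews with a failing build for the recheck rate, rechecked reviews for the outcome-change rate) is an assumption made here for the sake of the example.

```python
from dataclasses import dataclass

# Average Julian year in seconds; used to express compute time in years.
SECONDS_PER_YEAR = 365.25 * 24 * 3600


@dataclass
class Review:
    """Hypothetical per-review record; field names are assumptions."""
    had_failing_build: bool       # at least one failing build was reported
    rechecked: bool               # a recheck was requested after a failure
    outcome_changed: bool         # the rechecked build no longer failed
    rerun_compute_seconds: float  # compute time spent on repeated builds


def recheck_rate(reviews):
    """Share of reviews with a failing build that invoked recheck."""
    failing = [r for r in reviews if r.had_failing_build]
    if not failing:
        return 0.0
    return sum(r.rechecked for r in failing) / len(failing)


def outcome_change_rate(reviews):
    """Share of rechecked reviews whose failing outcome changed."""
    rechecked = [r for r in reviews if r.rechecked]
    if not rechecked:
        return 0.0
    return sum(r.outcome_changed for r in rechecked) / len(rechecked)


def percent_increase(baseline, observed):
    """Relative increase of `observed` over `baseline`, in percent."""
    return (observed - baseline) / baseline * 100


def compute_years(reviews):
    """Total compute time spent on rechecks, expressed in years."""
    total = sum(r.rerun_compute_seconds for r in reviews if r.rechecked)
    return total / SECONDS_PER_YEAR


# Toy data, invented purely for illustration.
reviews = [
    Review(True, True, True, SECONDS_PER_YEAR / 2),
    Review(True, True, False, SECONDS_PER_YEAR / 2),
    Review(True, False, False, 0.0),
    Review(True, False, False, 0.0),
    Review(False, False, False, 0.0),
]

print(recheck_rate(reviews))
print(outcome_change_rate(reviews))
print(compute_years(reviews))
print(percent_increase(1.0, 23.0))
```

Note that a percentage increase is measured relative to the baseline, so under this definition an increase of 2,200% corresponds to a waiting time 23 times the baseline, not 22 times.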