Destination#pushWasCanceled: minimize time spent in critical section

When cancelling a push to a replication destination, don't notify
listeners about the not-attempted push inside the critical section that
holds the stateLock. Notify them immediately after leaving the critical
section instead.

In a high-availability setup with 2 primaries we observed that
cancelling a replication push blocked >90 other threads. Those threads
were updating refs and tried to create new replication tasks via
synchronous events. The cancellation itself was stuck on visibility
checks done in EventBroker#fireEvent, which is triggered by
Destination#pushWasCanceled. These visibility checks were slow because
the affected repository is huge (30GiB) and the primaries share
repositories over NFS.

Moving the call to PushOne#notifyNotAttempted outside the critical
section should reduce the impact of this critical section on other
requests updating refs concurrently.

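The pattern can be illustrated with a minimal, self-contained sketch.
It is not the plugin's code: CancelSketch, schedule, pendingCount and
notifiedUnderLock are hypothetical names, and URIs and refs are
simplified to plain strings. It only shows the idea of capturing state
under the lock and invoking a potentially slow listener after the lock
has been released.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Consumer;

/** Hypothetical stand-in for Destination; names are illustrative only. */
public class CancelSketch {
  private final Object stateLock = new Object();
  private final Map<String, Set<String>> pending = new HashMap<>();
  private final Consumer<Set<String>> listener;

  // Set to true if the listener was ever invoked while stateLock was held.
  boolean notifiedUnderLock = false;

  public CancelSketch(Consumer<Set<String>> listener) {
    this.listener = listener;
  }

  /** Registers a pending push for the given URI (assumed helper). */
  void schedule(String uri, Set<String> refs) {
    synchronized (stateLock) {
      pending.put(uri, refs);
    }
  }

  void pushWasCanceled(String uri) {
    Set<String> notAttemptedRefs;
    synchronized (stateLock) {
      // Only cheap bookkeeping happens while the lock is held.
      Set<String> refs = pending.remove(uri);
      notAttemptedRefs = refs == null ? Set.of() : refs;
    }
    // The lock is released here, so a slow listener cannot block
    // other threads that need stateLock.
    notifiedUnderLock |= Thread.holdsLock(stateLock);
    listener.accept(notAttemptedRefs);
  }

  int pendingCount() {
    synchronized (stateLock) {
      return pending.size();
    }
  }

  public static void main(String[] args) {
    List<Set<String>> seen = new ArrayList<>();
    CancelSketch d = new CancelSketch(seen::add);
    d.schedule("ssh://replica/repo.git", Set.of("refs/heads/master"));
    d.pushWasCanceled("ssh://replica/repo.git");
    System.out.println("notified refs: " + seen);
    System.out.println("notified under lock: " + d.notifiedUnderLock);
  }
}
```

The trade-off of this approach is that listeners may observe the
notification slightly after the state change became visible to other
threads, which is acceptable here since the refs are captured while the
lock is still held.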
Change-Id: I085700c3f4cad95ef62521527ac4b920a59c76c2
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
diff --git a/src/main/java/com/googlesource/gerrit/plugins/replication/Destination.java b/src/main/java/com/googlesource/gerrit/plugins/replication/Destination.java
index 98170ae..5316e35 100644
--- a/src/main/java/com/googlesource/gerrit/plugins/replication/Destination.java
+++ b/src/main/java/com/googlesource/gerrit/plugins/replication/Destination.java
@@ -71,6 +71,7 @@
import java.io.IOException;
import java.net.URISyntaxException;
import java.util.Collection;
+import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
@@ -465,11 +466,13 @@
}
void pushWasCanceled(PushOne pushOp) {
+ Set<ImmutableSet<String>> notAttemptedRefs = Collections.emptySet();
synchronized (stateLock) {
URIish uri = pushOp.getURI();
pending.remove(uri);
- pushOp.notifyNotAttempted(pushOp.getRefs());
+ notAttemptedRefs = pushOp.getRefs();
}
+ pushOp.notifyNotAttempted(notAttemptedRefs);
}
void scheduleDeleteProject(URIish uri, Project.NameKey project, ProjectDeletionState state) {