Destination#pushWasCanceled: minimize time spent in critical section

When cancelling a push to a replication destination, don't notify
listeners about the not-attempted push inside the critical section where
the stateLock is held; do this immediately after the critical section
instead.

We observed in a high-availability setup with 2 primaries that
cancelling a replication push blocked more than 90 other threads. These
threads were updating refs, which in turn tried to create new
replication tasks via synchronous events. Cancelling the push was stuck
on visibility checks done in EventBroker#fireEvent, triggered by
Destination#pushWasCanceled. These visibility checks were slow since the
affected repository is huge (30GiB) and we use NFS for sharing
repositories between primaries.

Moving the call to PushOne#notifyNotAttempted outside the critical
section should reduce the impact of this critical section on other
requests updating refs concurrently.

Change-Id: I085700c3f4cad95ef62521527ac4b920a59c76c2
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>

diff --git a/src/main/java/com/googlesource/gerrit/plugins/replication/Destination.java b/src/main/java/com/googlesource/gerrit/plugins/replication/Destination.java
index 98170ae..5316e35 100644
--- a/src/main/java/com/googlesource/gerrit/plugins/replication/Destination.java
+++ b/src/main/java/com/googlesource/gerrit/plugins/replication/Destination.java
@@ -71,6 +71,7 @@
 import java.io.IOException;
 import java.net.URISyntaxException;
 import java.util.Collection;
+import java.util.Collections;
 import java.util.HashMap;
 import java.util.HashSet;
 import java.util.List;
@@ -465,11 +466,13 @@
   }
 
   void pushWasCanceled(PushOne pushOp) {
+    Set<ImmutableSet<String>> notAttemptedRefs = Collections.emptySet();
     synchronized (stateLock) {
       URIish uri = pushOp.getURI();
       pending.remove(uri);
-      pushOp.notifyNotAttempted(pushOp.getRefs());
+      notAttemptedRefs = pushOp.getRefs();
     }
+    pushOp.notifyNotAttempted(notAttemptedRefs);
   }
 
   void scheduleDeleteProject(URIish uri, Project.NameKey project, ProjectDeletionState state) {