Retry replications that failed due to "failed to lock" errors

If two or more replication PushOps (to the same Git repository and
ref) are scheduled at approximately the same time (and end up on
different replication threads), there is a high probability that
the last push to complete will fail with a remote "failed to lock"
error.

This is due to the first replication operation updating the very
same remote ref during the execution of the second replication push
operation.
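
The race can be illustrated with a deliberately simplified model. It
is a sketch only: the class and its methods are hypothetical, and a
single boolean flag stands in for whatever locking the remote
actually performs on the ref.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical model of one remote ref; not part of the plugin or of git.
class RefLockRaceSketch {
    // Assumption: the remote guards each ref with an exclusive lock.
    private final AtomicBoolean locked = new AtomicBoolean(false);

    // Succeeds only if nobody else currently holds the lock.
    boolean tryLock() {
        return locked.compareAndSet(false, true);
    }

    void unlock() {
        locked.set(false);
    }

    public static void main(String[] args) {
        RefLockRaceSketch ref = new RefLockRaceSketch();
        boolean first = ref.tryLock();   // first push starts updating the ref
        boolean second = ref.tryLock();  // second push arrives mid-update
        ref.unlock();                    // first push completes
        boolean retry = ref.tryLock();   // a later retry of the second push
        System.out.println("first=" + first + " second=" + second + " retry=" + retry);
        // prints "first=true second=false retry=true"
    }
}
```

In this model the second attempt fails only because of timing, and
the same attempt succeeds once the first update has finished.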

Since this is not recognized as a transient issue, the failed push
is not retried, so the last ref update never happens and the commit
is not reachable from the Gerrit slave.

In a very active system such as ours, this happens often enough to
be a huge issue for both users and CI systems.

This fix treats these "failed to lock" failures as transient errors
and schedules the failed replication operation for a retry.
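
As a rough sketch of the idea, and not the code in this change, the
decision could look like the following. The class, the enum and the
method names are hypothetical, and matching on the literal text
"failed to lock" in the remote's message is an assumption made for
illustration.

```java
import java.util.Locale;

// Hypothetical helper; names and matching rule are illustrative only.
class LockFailureRetrySketch {
    enum Action { DONE, RETRY, FAIL }

    // Assumption: a lock failure is recognizable from the remote's message text.
    static boolean isLockFailure(String remoteMessage) {
        return remoteMessage != null
            && remoteMessage.toLowerCase(Locale.ROOT).contains("failed to lock");
    }

    // Decide what to do with one ref update after the push returned.
    static Action decide(boolean rejected, String remoteMessage) {
        if (!rejected) {
            return Action.DONE;
        }
        // Lock failures are treated as transient; other rejections are not.
        return isLockFailure(remoteMessage) ? Action.RETRY : Action.FAIL;
    }

    public static void main(String[] args) {
        System.out.println(decide(false, null));
        System.out.println(decide(true, "failed to lock"));
        System.out.println(decide(true, "non-fast-forward"));
    }
}
```

Only rejections that look like lock failures are rescheduled here, so
other rejected updates are still reported as failures.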

Change-Id: Ic51d8407e38eea257b0a2e5fb529cf23b5875ca8
3 files changed
tree: 752477e228a661b48f47ca0ed986925a68c7d29e