Allow to specify a seed that should be used to shuffle code owners

When code owners of a path are listed, code owners that have the same
score are returned in a random order. This is done to fairly distribute
the review work among the code owners.

However the random order is a problem for the grouping of files with the
same code owners that is done in the Gerrit frontend. The frontend
queries 5 code owners for each file and then groups files that have the
exact same code owners. If files have more than 5 code owners with the
same score it is random which of them are returned to the frontend. This
means for files that have the same code owners, the frontend may get a
different sets of random code owners, and hence the grouping does not
work. As a result of this the frontend shows more groups to the user and
users need to pick more code owners as reviewers. If there are many
files and code owners, this can easily lead to situations where the user
is presented many more files groups than would be necessary.

To fix this we now allow the caller of the list code owners REST
endpoints to provide a seed that should be used to shuffle the code
owners that have the same score. By generating a seed and using this
same seed for getting code owners suggestions for all files in a change
a stable sort order of code owners with the same score can be achieved
for all the files. This means the same random users will be suggested
for all files that have the same code owners, and hence the grouping in
the frontend will work much better.

This change only adds the support for setting a seed in the requests in
the backend. A follow-up change still needs to adapt the frontend to
generate a seed and set it on the requests. This should only be done
after this backend change has been rolled out.

Using the same seed for different requests only makes the sort order
stable if the same seed and the same limit is used. Using the same limit
is important because using the same seed for sorting only results in the
same order if the input list has exactly the same elements. If the input
list as more or fewer elements due to a different limit, the order will
be different. This limitation is fine for us, since the frontend will
execute all requests with the same limit.

The sort order is not guaranteed to be stable if new accounts are
created in between the requests, or if the account visibility is
changed. However we expect this to happen rarely, and if it happens it's
not a big issue as it only leads to less good grouping of code owners in
the code owner suggestion.

Signed-off-by: Edwin Kempin <ekempin@google.com>
Change-Id: Id59a4d92e6101a9bf860e931d919a5cdf24045ef
diff --git a/java/com/google/gerrit/plugins/codeowners/api/CodeOwners.java b/java/com/google/gerrit/plugins/codeowners/api/CodeOwners.java
index 5d1182c..1cf6e9e 100644
--- a/java/com/google/gerrit/plugins/codeowners/api/CodeOwners.java
+++ b/java/com/google/gerrit/plugins/codeowners/api/CodeOwners.java
@@ -47,6 +47,7 @@
     private Set<ListAccountsOption> options = EnumSet.noneOf(ListAccountsOption.class);
     private Integer limit;
     private String revision;
+    private Long seed;
 
     /**
      * Lists the code owners for the given path.
@@ -93,6 +94,16 @@
     }
 
     /**
+     * Sets the seed that should be used to shuffle code owners that have the same score.
+     *
+     * @param seed seed that should be used to shuffle code owners that have the same score
+     */
+    public QueryRequest withSeed(long seed) {
+      this.seed = seed;
+      return this;
+    }
+
+    /**
      * Sets the branch revision from which the code owner configs should be read.
      *
      * <p>Not supported for querying code owners for a path in a change.
@@ -114,6 +125,11 @@
       return Optional.ofNullable(limit);
     }
 
+    /** Returns the seed that should be used to shuffle code owners that have the same score. */
+    public Optional<Long> getSeed() {
+      return Optional.ofNullable(seed);
+    }
+
     /** Returns the branch revision from which the code owner configs should be read. */
     public Optional<String> getRevision() {
       return Optional.ofNullable(revision);
diff --git a/java/com/google/gerrit/plugins/codeowners/api/CodeOwnersInBranchImpl.java b/java/com/google/gerrit/plugins/codeowners/api/CodeOwnersInBranchImpl.java
index d5cb276..27dc958 100644
--- a/java/com/google/gerrit/plugins/codeowners/api/CodeOwnersInBranchImpl.java
+++ b/java/com/google/gerrit/plugins/codeowners/api/CodeOwnersInBranchImpl.java
@@ -56,6 +56,7 @@
           GetCodeOwnersForPathInBranch getCodeOwners = getCodeOwnersProvider.get();
           getOptions().forEach(getCodeOwners::addOption);
           getLimit().ifPresent(getCodeOwners::setLimit);
+          getSeed().ifPresent(getCodeOwners::setSeed);
           getRevision().ifPresent(getCodeOwners::setRevision);
           CodeOwnersInBranchCollection.PathResource pathInBranchResource =
               codeOwnersInBranchCollection.parse(
diff --git a/java/com/google/gerrit/plugins/codeowners/api/CodeOwnersInChangeImpl.java b/java/com/google/gerrit/plugins/codeowners/api/CodeOwnersInChangeImpl.java
index 4c8cfdc..e5a25a3 100644
--- a/java/com/google/gerrit/plugins/codeowners/api/CodeOwnersInChangeImpl.java
+++ b/java/com/google/gerrit/plugins/codeowners/api/CodeOwnersInChangeImpl.java
@@ -61,6 +61,7 @@
           GetCodeOwnersForPathInChange getCodeOwners = getCodeOwnersProvider.get();
           getOptions().forEach(getCodeOwners::addOption);
           getLimit().ifPresent(getCodeOwners::setLimit);
+          getSeed().ifPresent(getCodeOwners::setSeed);
           CodeOwnersInChangeCollection.PathResource pathInChangeResource =
               codeOwnersInChangeCollection.parse(
                   revisionResource, IdString.fromDecoded(path.toString()));
diff --git a/java/com/google/gerrit/plugins/codeowners/restapi/AbstractGetCodeOwnersForPath.java b/java/com/google/gerrit/plugins/codeowners/restapi/AbstractGetCodeOwnersForPath.java
index a5f7b5d..c77cb1d 100644
--- a/java/com/google/gerrit/plugins/codeowners/restapi/AbstractGetCodeOwnersForPath.java
+++ b/java/com/google/gerrit/plugins/codeowners/restapi/AbstractGetCodeOwnersForPath.java
@@ -55,6 +55,8 @@
 import java.util.EnumSet;
 import java.util.HashSet;
 import java.util.List;
+import java.util.Optional;
+import java.util.Random;
 import java.util.Set;
 import java.util.stream.Stream;
 import org.kohsuke.args4j.Option;
@@ -81,6 +83,7 @@
   private final Set<String> hexOptions;
 
   private int limit = DEFAULT_LIMIT;
+  private Optional<Long> seed = Optional.empty();
 
   @Option(
       name = "-o",
@@ -106,6 +109,13 @@
     this.limit = limit;
   }
 
+  @Option(
+      name = "--seed",
+      usage = "seed that should be used to shuffle code owners that have the same score")
+  public void setSeed(long seed) {
+    this.seed = Optional.of(seed);
+  }
+
   protected AbstractGetCodeOwnersForPath(
       AccountVisibility accountVisibility,
       Accounts accounts,
@@ -308,7 +318,9 @@
 
   private ImmutableList<CodeOwner> sortAndLimit(
       CodeOwnerScoring distanceScoring, ImmutableSet<CodeOwner> codeOwners) {
-    return sortCodeOwners(distanceScoring, codeOwners).limit(limit).collect(toImmutableList());
+    return sortCodeOwners(seed, distanceScoring, codeOwners)
+        .limit(limit)
+        .collect(toImmutableList());
   }
 
   /**
@@ -318,24 +330,27 @@
    *
    * <p>The order of code owners with the same distance score is random.
    *
+   * @param seed seed that should be used to randomize the order
    * @param distanceScoring the distance scorings for the code owners
    * @param codeOwners the code owners that should be sorted
    * @return the sorted code owners
    */
   private static Stream<CodeOwner> sortCodeOwners(
-      CodeOwnerScoring distanceScoring, ImmutableSet<CodeOwner> codeOwners) {
-    return randomizeOrder(codeOwners).sorted(distanceScoring.comparingByScoring());
+      Optional<Long> seed, CodeOwnerScoring distanceScoring, ImmutableSet<CodeOwner> codeOwners) {
+    return randomizeOrder(seed, codeOwners).sorted(distanceScoring.comparingByScoring());
   }
 
   /**
    * Returns the entries from the given set in a random order.
    *
+   * @param seed seed that should be used to randomize the order
    * @param set the set for which the entries should be returned in a random order
    * @return the entries from the given set in a random order
    */
-  private static <T> Stream<T> randomizeOrder(Set<T> set) {
+  private static <T> Stream<T> randomizeOrder(Optional<Long> seed, Set<T> set) {
     List<T> randomlyOrderedCodeOwners = new ArrayList<>(set);
-    Collections.shuffle(randomlyOrderedCodeOwners);
+    Collections.shuffle(
+        randomlyOrderedCodeOwners, seed.isPresent() ? new Random(seed.get()) : new Random());
     return randomlyOrderedCodeOwners.stream();
   }
 
@@ -406,6 +421,6 @@
    * <p>No visibility check is performed.
    */
   private Stream<Account.Id> getRandomUsers(int limit) throws IOException {
-    return randomizeOrder(accounts.allIds()).limit(limit);
+    return randomizeOrder(seed, accounts.allIds()).limit(limit);
   }
 }
diff --git a/javatests/com/google/gerrit/plugins/codeowners/acceptance/api/AbstractGetCodeOwnersForPathIT.java b/javatests/com/google/gerrit/plugins/codeowners/acceptance/api/AbstractGetCodeOwnersForPathIT.java
index 8d86d7e..4dbac76 100644
--- a/javatests/com/google/gerrit/plugins/codeowners/acceptance/api/AbstractGetCodeOwnersForPathIT.java
+++ b/javatests/com/google/gerrit/plugins/codeowners/acceptance/api/AbstractGetCodeOwnersForPathIT.java
@@ -50,6 +50,7 @@
 import com.google.gerrit.plugins.codeowners.restapi.GetCodeOwnersForPathInBranch;
 import com.google.inject.Inject;
 import java.util.List;
+import java.util.Random;
 import org.junit.Before;
 import org.junit.Test;
 
@@ -1013,4 +1014,73 @@
         .comparingElementsUsing(hasAccountId())
         .containsExactly(admin.id(), user.id(), user2.id());
   }
+
+  @Test
+  public void getCodeOwnersProvidingASeedMakesSortOrderStableAcrocssRequests() throws Exception {
+    TestAccount user2 = accountCreator.user2();
+
+    // create some code owner configs
+    codeOwnerConfigOperations
+        .newCodeOwnerConfig()
+        .project(project)
+        .branch("master")
+        .folderPath("/")
+        .addCodeOwnerEmail(admin.email())
+        .addCodeOwnerEmail(user.email())
+        .addCodeOwnerEmail(user2.email())
+        .create();
+
+    long seed = (new Random()).nextLong();
+
+    List<CodeOwnerInfo> codeOwnerInfos =
+        queryCodeOwners(getCodeOwnersApi().query().withSeed(seed), "/foo/bar/baz.md");
+    // all code owners have the same score, hence their order is random
+    assertThatList(codeOwnerInfos)
+        .comparingElementsUsing(hasAccountId())
+        .containsExactly(admin.id(), user.id(), user2.id());
+
+    // Check that the order for further requests that use the same seed is the same.
+    List<Account.Id> expectedAccountIds =
+        codeOwnerInfos.stream().map(info -> Account.id(info.account._accountId)).collect(toList());
+    for (int i = 0; i < 10; i++) {
+      assertThatList(queryCodeOwners(getCodeOwnersApi().query().withSeed(seed), "/foo/bar/baz.md"))
+          .comparingElementsUsing(hasAccountId())
+          .containsExactlyElementsIn(expectedAccountIds)
+          .inOrder();
+    }
+  }
+
+  @Test
+  public void getCodeOwnersProvidingASeedMakesSortOrderStableAcrossRequests_allUsersAreCodeOwners()
+      throws Exception {
+    TestAccount user2 = accountCreator.user2();
+
+    // create some code owner configs
+    codeOwnerConfigOperations
+        .newCodeOwnerConfig()
+        .project(project)
+        .branch("master")
+        .folderPath("/")
+        .addCodeOwnerEmail("*")
+        .create();
+
+    long seed = (new Random()).nextLong();
+
+    List<CodeOwnerInfo> codeOwnerInfos =
+        queryCodeOwners(getCodeOwnersApi().query().withSeed(seed), "/foo/bar/baz.md");
+    // all code owners have the same score, hence their order is random
+    assertThatList(codeOwnerInfos)
+        .comparingElementsUsing(hasAccountId())
+        .containsExactly(admin.id(), user.id(), user2.id());
+
+    // Check that the order for further requests that use the same seed is the same.
+    List<Account.Id> expectedAccountIds =
+        codeOwnerInfos.stream().map(info -> Account.id(info.account._accountId)).collect(toList());
+    for (int i = 0; i < 10; i++) {
+      assertThatList(queryCodeOwners(getCodeOwnersApi().query().withSeed(seed), "/foo/bar/baz.md"))
+          .comparingElementsUsing(hasAccountId())
+          .containsExactlyElementsIn(expectedAccountIds)
+          .inOrder();
+    }
+  }
 }
diff --git a/javatests/com/google/gerrit/plugins/codeowners/acceptance/restapi/AbstractGetCodeOwnersForPathRestIT.java b/javatests/com/google/gerrit/plugins/codeowners/acceptance/restapi/AbstractGetCodeOwnersForPathRestIT.java
index af2997e..dcd0cbb 100644
--- a/javatests/com/google/gerrit/plugins/codeowners/acceptance/restapi/AbstractGetCodeOwnersForPathRestIT.java
+++ b/javatests/com/google/gerrit/plugins/codeowners/acceptance/restapi/AbstractGetCodeOwnersForPathRestIT.java
@@ -172,4 +172,11 @@
     r.assertBadRequest();
     assertThat(r.getEntityContent()).contains("\"invalid\" is not a valid value for \"--limit\"");
   }
+
+  @Test
+  public void cannotGetCodeOwnersWithInvalidSeed() throws Exception {
+    RestResponse r = adminRestSession.get(getUrl(TEST_PATH, "seed=invalid"));
+    r.assertBadRequest();
+    assertThat(r.getEntityContent()).contains("\"invalid\" is not a valid value for \"--seed\"");
+  }
 }
diff --git a/resources/Documentation/rest-api.md b/resources/Documentation/rest-api.md
index e22d4b8..a1059b3 100644
--- a/resources/Documentation/rest-api.md
+++ b/resources/Documentation/rest-api.md
@@ -353,12 +353,13 @@
 
 The following request parameters can be specified:
 
-| Field Name  |          | Description |
-| ----------- | -------- | ----------- |
-| `o`         | optional | [Account option](../../../Documentation/rest-api-accounts.html#query-options) that controls which fields in the returned accounts should be populated. Can be specified multiple times. If not given, only the `_account_id` field for the account ID is populated.
-| `O`         | optional | [Account option](../../../Documentation/rest-api-accounts.html#query-options) in hex. For the explanation see `o` parameter.
+| Field Name   |          | Description |
+| ------------ | -------- | ----------- |
+| `o`          | optional | [Account option](../../../Documentation/rest-api-accounts.html#query-options) that controls which fields in the returned accounts should be populated. Can be specified multiple times. If not given, only the `_account_id` field for the account ID is populated.
+| `O`          | optional | [Account option](../../../Documentation/rest-api-accounts.html#query-options) in hex. For the explanation see `o` parameter.
 | `limit`\|`n` | optional | Limit defining how many code owners should be returned at most. By default 10.
-| `revision` | optional | Revision from which the code owner configs should be read as commit SHA1. Can be used to read historic code owners from this branch, but imports from other branches or repositories as well as default and global code owners from `refs/meta/config` are still read from the current revisions. If not specified the code owner configs are read from the HEAD revision of the branch. Not supported for getting code owners for a path in a change.
+| `seed`       | optional | Seed, as a long value, that should be used to shuffle code owners that have the same score. Can be used to make the sort order stable across several requests, e.g. to get the same set of random code owners for different file paths that have the same code owners. Important: the sort order is only stable if the requests use the same seed **and** the same limit. In addition, the sort order is not guaranteed to be stable if new accounts are created in between the requests, or if the account visibility is changed.
+| `revision`   | optional | Revision from which the code owner configs should be read as commit SHA1. Can be used to read historic code owners from this branch, but imports from other branches or repositories as well as default and global code owners from `refs/meta/config` are still read from the current revisions. If not specified the code owner configs are read from the HEAD revision of the branch. Not supported for getting code owners for a path in a change.
 
 As a response a list of [CodeOwnerInfo](#code-owner-info) entities is returned.
 The returned code owners are sorted by an internal score that expresses how good