Git file diff: fix when JGit returns multiple entries for same file path

In some cases, JGit returns two {ADDED, DELETED} diff entries for the
same file path instead of a single {MODIFIED} entry. This happens, e.g.,
when a file is modified between patchsets with a change in file mode
(for example, converting a symlink to a regular file). In the old diff
cache world, we used to handle this case outside of the cache (see [1])
by combining the two entries into a single entry with change type =
REWRITE.

The implementation in the new diff cache did not handle this case: in
GitFileDiffCacheImpl, we collected the diff entries into a map keyed by
file path, which resulted in a duplicate key exception in the toMap
collector whenever two entries shared the same path.
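
The failure mode described above can be reproduced in isolation. The
following is a minimal sketch, not the GitFileDiffCacheImpl code: the
Entry record and the byPath helper are hypothetical stand-ins, and the
sketch assumes Java 16 or later for records. It only shows that the
two-argument Collectors.toMap throws IllegalStateException when two
elements map to the same key.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class DuplicateKeyDemo {
  /** Hypothetical stand-in for a diff entry: a file path plus a change type. */
  record Entry(String path, String changeType) {}

  /**
   * Collects entries into a map keyed by path. The two-argument toMap has no
   * merge function, so it throws IllegalStateException on a duplicate key.
   */
  static Map<String, Entry> byPath(List<Entry> entries) {
    return entries.stream().collect(Collectors.toMap(Entry::path, e -> e));
  }

  public static void main(String[] args) {
    // Two entries for the same path, as in the symlink-to-regular-file case.
    List<Entry> entries =
        List.of(new Entry("link", "DELETED"), new Entry("link", "ADDED"));
    try {
      byPath(entries);
    } catch (IllegalStateException e) {
      System.out.println("toMap rejected the duplicate path: " + e.getMessage());
    }
  }
}
```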

In this change, we fix this issue as follows:
  * We keep the git_modified_files cache (the one that computes the
  modified files between two git trees) as is, allowing it to return two
  {ADDED, DELETED} ModifiedFile entries for the same file.

  * We update the modified_files cache (which wraps the
  git_modified_files cache) so that it merges ADDED/DELETED entries for
  the same file into a single REWRITE entry.

  * We update the git_file_diff cache (the one that computes the
  detailed git diff for a single file) so that it merges ADDED/DELETED
  entries for the same file into a single REWRITE entry. The logic
  is added in this cache and not in file_diff cache (the one that wraps
  the git_file_diff cache and computes the edits due to rebase) since
  the git_file_diff is requested with one key per file, so we expect it
  to return a single result value for that key.
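
The merging step described in the bullets above can be sketched as
follows. This is an illustration under simplifying assumptions, not the
code in this change: ModifiedFile here is a hypothetical record with a
single path field (the real entity may distinguish old and new paths),
the ChangeType enum lists only the values needed for the example, and
mergeRewritten is an invented helper name. Records require Java 16 or
later.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class RewriteMergeSketch {
  /** Simplified change types; only the values used in this sketch. */
  enum ChangeType { ADDED, DELETED, MODIFIED, REWRITE }

  /** Hypothetical, simplified modified-file entry with a single path. */
  record ModifiedFile(String path, ChangeType changeType) {}

  /** Replaces each {ADDED, DELETED} pair for the same path with one REWRITE entry. */
  static List<ModifiedFile> mergeRewritten(List<ModifiedFile> files) {
    // Group by path, preserving the order in which paths first appear.
    Map<String, List<ModifiedFile>> byPath = new LinkedHashMap<>();
    for (ModifiedFile f : files) {
      byPath.computeIfAbsent(f.path(), p -> new ArrayList<>()).add(f);
    }
    List<ModifiedFile> result = new ArrayList<>();
    for (Map.Entry<String, List<ModifiedFile>> e : byPath.entrySet()) {
      List<ModifiedFile> group = e.getValue();
      if (isAddedDeletedPair(group)) {
        result.add(new ModifiedFile(e.getKey(), ChangeType.REWRITE));
      } else {
        // Any other grouping is passed through unchanged.
        result.addAll(group);
      }
    }
    return result;
  }

  /** True if the group is exactly one ADDED and one DELETED entry, in either order. */
  static boolean isAddedDeletedPair(List<ModifiedFile> group) {
    if (group.size() != 2) {
      return false;
    }
    ChangeType a = group.get(0).changeType();
    ChangeType b = group.get(1).changeType();
    return (a == ChangeType.ADDED && b == ChangeType.DELETED)
        || (a == ChangeType.DELETED && b == ChangeType.ADDED);
  }

  public static void main(String[] args) {
    List<ModifiedFile> merged =
        mergeRewritten(
            List.of(
                new ModifiedFile("link", ChangeType.DELETED),
                new ModifiedFile("link", ChangeType.ADDED),
                new ModifiedFile("README.md", ChangeType.MODIFIED)));
    merged.forEach(System.out::println);
  }
}
```

Because the merged list has at most one entry per path in the
{ADDED, DELETED} case, a later collection into a map keyed by path no
longer hits the duplicate key failure for that case.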

We unignore a test that covers this case (for new diff cache) in
RevisionDiffIT to assert correctness.

We increase the version of the modified_files cache since its logic was
changed. The git_file_diff cache does not require a version increase:
this issue caused an exception in that cache, so the affected entries
were never cached.

[1] https://gerrit.googlesource.com/gerrit/+/refs/heads/master/java/com/google/gerrit/server/change/FileInfoJsonOldImpl.java#112

Change-Id: If2cab046051d96a2ce669ba573b36e44e3ec64e7
5 files changed
tree: 1189eae4b04ae17bb167a443a79f3ccc82e0260f
README.md

Gerrit Code Review

Gerrit is a code review and project management tool for Git based projects.


Objective

Gerrit makes reviews easier by showing changes in a side-by-side display, and allowing inline comments to be added by any reviewer.

Gerrit simplifies Git based project maintainership by permitting any authorized user to submit changes to the master Git repository, rather than requiring all approved changes to be merged in by hand by the project maintainer.

Documentation

For information about how to install and use Gerrit, refer to the documentation.

Source

Our canonical Git repository is located on googlesource.com. There is a mirror of the repository on GitHub.

Reporting bugs

Please report bugs on the issue tracker.

Contribute

Gerrit is the work of hundreds of contributors. We appreciate your help!

Please read the contribution guidelines.

Note that we do not accept Pull Requests via the GitHub mirror.

Getting in contact

The Developer Mailing list is repo-discuss on Google Groups.

License

Gerrit is provided under the Apache License 2.0.

Build

Install Bazel and run the following:

    git clone --recurse-submodules https://gerrit.googlesource.com/gerrit
    cd gerrit && bazel build release

Install binary packages (Deb/Rpm)

Instructions for configuring the GerritForge/BinTray repositories are available separately.

On Debian/Ubuntu run:

    apt-get update && apt-get install gerrit=<version>-<release>

NOTE: release is a counter that starts with 1 and indicates the number of packages that have been released with the same version of the software.

On CentOS/RedHat run:

    yum clean all && yum install gerrit-<version>[-<release>]

On Fedora run:

    dnf clean all && dnf install gerrit-<version>[-<release>]

Use pre-built Gerrit images on Docker

Docker images of Gerrit are available on DockerHub.

To run a CentOS 8 based Gerrit image:

    docker run -p 8080:8080 gerritcodereview/gerrit[:version]-centos8

To run an Ubuntu 20.04 based Gerrit image:

    docker run -p 8080:8080 gerritcodereview/gerrit[:version]-ubuntu20

NOTE: release is optional. If the release number is omitted, the most recently released package of that version is installed.