Add in-memory change data cache by project

This cache stores only the data needed to speed up ref advertisements.
The index is used to populate the cache when it is available; otherwise
it falls back to scanning NoteDb (generally on replicas). The cache is
most useful for 'git upload-pack' against projects with large change
counts on replicas.

The following times were seen when running 'git ls-remote' on a project
with around 380K changes stored on NFS:

With ES index (primary):
 Before this change: (cold 1m3s) 12-17s (with large enough "changes" cache)
 After this change:  (cold 1m3s) 12-17s (with large enough "changes" cache)
No Index (replica):
 Before this change: (cold ~4m)   ~4m
 After this change:  (cold 2m25s) 12-17s

The numbers above do not show an improvement in "best" times for
primaries using ES because they were measured with the existing
"changes" cache enabled. Since the existing "changes" cache is disabled
by default, most sites will also see an improvement with
'git upload-pack' on primaries after this change, and additionally with
'git receive-pack'. The warm times after this change are also more
achievable and more likely to stay fast, even when other projects are
accessed in between; before this change, the warm times are harder to
achieve and maintain.

The existing "changes" cache can also be used to help cache changes by
project; however, it is documented as not being appropriate for
multi-primary setups since it will be out of date if alterations are
made to changes outside of the current server. The existing cache is
also not usable on replicas, where there is no index. It is noteworthy
that the existing cache provides value beyond just its ability to cache
results, since loading change data from the index (even with ES) is
faster than loading change data via a NoteDb scan.

Similar to the existing cache, the new cache is able to take advantage
of the index when it is available on primaries; however, it is actually
multi-primary friendly because it stores the meta-ref id for each
change, and out-of-date changes are detected on each cache fetch, so
the cache is incrementally kept up to date. The new cache is also
usable on replicas by falling back to a NoteDb scan when the index is
not available; here too it is able to stay up to date incrementally
after the first scan.
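
The incremental refresh described above can be sketched roughly as
follows. This is only an illustration, not the code in this change:
ProjectChangeCache, CachedChange and Loader are hypothetical names, the
meta-ref ids are plain strings, and the loader stands in for either an
index query or a NoteDb read.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical per-project cache; a sketch, not Gerrit's actual class. */
class ProjectChangeCache {
  /** Minimal cached data: the meta-ref id it was loaded at and the destination branch. */
  static final class CachedChange {
    final String metaId;
    final String branch;

    CachedChange(String metaId, String branch) {
      this.metaId = metaId;
      this.branch = branch;
    }
  }

  /** Hypothetical loader standing in for an index query or a NoteDb read. */
  interface Loader {
    String loadBranch(int changeNumber);
  }

  private final Map<Integer, CachedChange> byChange = new HashMap<>();
  private int reloads;

  /**
   * Brings the cache in line with the current meta refs (change number to
   * meta-ref id) and returns a snapshot of the refreshed entries.
   */
  Map<Integer, CachedChange> refresh(Map<Integer, String> currentMetaIds, Loader loader) {
    // Forget changes whose meta ref no longer exists.
    byChange.keySet().retainAll(currentMetaIds.keySet());
    for (Map.Entry<Integer, String> e : currentMetaIds.entrySet()) {
      CachedChange cached = byChange.get(e.getKey());
      // Reload only entries that are missing or whose meta-ref id has moved.
      if (cached == null || !cached.metaId.equals(e.getValue())) {
        String branch = loader.loadBranch(e.getKey());
        byChange.put(e.getKey(), new CachedChange(e.getValue(), branch));
        reloads++;
      }
    }
    return new HashMap<>(byChange);
  }

  /** Number of loader calls made so far; exposed only to illustrate the behavior. */
  int reloads() {
    return reloads;
  }
}
```

In this sketch, a second refresh against unchanged meta refs performs no
loads, and an update made elsewhere (for example on another primary)
costs one reload for the affected change only.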

The new cache is much more space efficient than the existing "changes"
cache as it takes advantage of the fact that determining visibility for
public changes requires access only to a change's destination branch and
that many changes are generally destined for each branch. So although
the existing cache is fairly compact, this new cache stores even fewer
change data fields. This likely makes it possible to keep all the
projects on a server cached in memory, even on servers with large
projects and change counts. It may make sense to eventually remove the
existing "changes" cache.
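
To illustrate why the destination branch alone can be enough for public
changes, here is a minimal sketch, assuming a hypothetical
BranchVisibility helper and a caller-supplied canReadBranch predicate
(neither is part of this change). It makes one visibility decision per
branch and reuses it for every change destined for that branch; private
changes are not covered.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Predicate;

/** Hypothetical helper; a sketch of per-branch visibility, not Gerrit's actual code. */
class BranchVisibility {
  /**
   * Returns the change numbers whose destination branch is readable.
   * The predicate is evaluated at most once per distinct branch.
   */
  static Set<Integer> visibleChanges(
      Map<Integer, String> branchByChange, Predicate<String> canReadBranch) {
    Map<String, Boolean> decided = new HashMap<>();
    Set<Integer> visible = new TreeSet<>();
    for (Map.Entry<Integer, String> e : branchByChange.entrySet()) {
      // Many changes share a branch, so the cached decision is reused.
      if (decided.computeIfAbsent(e.getValue(), canReadBranch::test)) {
        visible.add(e.getKey());
      }
    }
    return visible;
  }
}
```

With 380K changes spread over a handful of branches, a scheme like this
would need only a handful of permission checks per advertisement.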

Release-Notes: Add cache to speed up ref advertisements (particularly on replicas) and receive-pack
Forward-Compatible: checked
Change-Id: Ib9c77ee03d9012cdeb6ad05eed3492c3009e4334
14 files changed
tree: 7aeb70ce2056a0bd62d2d7fea8bb80e3a57672fb
  1. .settings/
  2. .ts-out/
  3. antlr3/
  4. contrib/
  5. Documentation/
  6. e2e-tests/
  7. java/
  8. javatests/
  9. lib/
  10. modules/
  11. plugins/
  12. polygerrit-ui/
  13. prolog/
  14. prologtests/
  15. proto/
  16. resources/
  17. tools/
  18. webapp/
  19. .bazelignore
  20. .bazelproject
  21. .bazelrc
  22. .bazelversion
  23. .editorconfig
  24. .git-blame-ignore-revs
  25. .gitignore
  26. .gitmodules
  27. .gitreview
  28. .mailmap
  29. .pydevproject
  30. .zuul.yaml
  31. BUILD
  32. COPYING
  33. INSTALL
  34. Jenkinsfile
  35. package.json
  36. README.md
  37. SUBMITTING_PATCHES
  38. version.bzl
  39. WORKSPACE
  40. yarn.lock
README.md

Gerrit Code Review

Gerrit is a code review and project management tool for Git based projects.

Objective

Gerrit makes reviews easier by showing changes in a side-by-side display, and allowing inline comments to be added by any reviewer.

Gerrit simplifies Git based project maintainership by permitting any authorized user to submit changes to the master Git repository, rather than requiring all approved changes to be merged in by hand by the project maintainer.

Documentation

For information about how to install and use Gerrit, refer to the documentation.

Source

Our canonical Git repository is located on googlesource.com. There is a mirror of the repository on GitHub.

Reporting bugs

Please report bugs on the issue tracker.

Contribute

Gerrit is the work of hundreds of contributors. We appreciate your help!

Please read the contribution guidelines.

Note that we do not accept Pull Requests via the GitHub mirror.

Getting in contact

The Developer Mailing list is repo-discuss on Google Groups.

License

Gerrit is provided under the Apache License 2.0.

Build

Install Bazel and run the following:

    git clone --recurse-submodules https://gerrit.googlesource.com/gerrit
    cd gerrit && bazel build release

Install binary packages (Deb/Rpm)

Instructions for configuring the GerritForge/BinTray repositories are available here.

On Debian/Ubuntu run:

    apt-get update && apt-get install gerrit=<version>-<release>

NOTE: release is a counter that starts with 1 and indicates the number of packages that have been released with the same version of the software.

On CentOS/RedHat run:

    yum clean all && yum install gerrit-<version>[-<release>]

On Fedora run:

    dnf clean all && dnf install gerrit-<version>[-<release>]

Use pre-built Gerrit images on Docker

Docker images of Gerrit are available on DockerHub.

To run a CentOS 8 based Gerrit image:

    docker run -p 8080:8080 gerritcodereview/gerrit[:version]-centos8

To run an Ubuntu 20.04 based Gerrit image:

    docker run -p 8080:8080 gerritcodereview/gerrit[:version]-ubuntu20

NOTE: release is optional. The last released package of the version is installed if the release number is omitted.