Migrate change index to use dimensional numeric types

Lucene 6.x deprecated IntField and replaced it with IntPoint, which
uses a different storage backend [1]. Instead of continuing to
represent numeric data using a structure specifically designed and
tuned for text, the Bkd implementation introduced the first flexible
tree structure designed specifically for indexing discrete numeric
points [1]. While the new data types are mostly drop-in replacements
for the old IntField and LongField types, they cannot be used for
document id fields.

The previous migration from Lucene 5 to Lucene 6 switched to the
deprecated LegacyIntField type. In the next major release, Lucene 7,
this class and its friends were removed from the Lucene distribution
and moved, for one release, to the Apache Solr library. So in theory
we could add the Apache Solr dependency to Gerrit and continue to use
the old, deprecated and now removed data types for one more major
Lucene release.

We prefer a forward migration strategy and switch to a string field
type as the document id for the account, change and group indexes.
Only the change index is handled in this commit. The other indexes are
handled in follow-up changes.

To support online migration, the legacy numeric field types are still
used in the old index schema version, while the new dimensional point
field types are used in the new schema version. The old integer
document id field type is replaced with a string id in the new change
index schema. Therefore, different code paths must decide, depending
on the index schema version currently in use, whether to use the
legacy numeric field types or the new dimensional point field types.
To support this logic, a new attribute is added to the index schema
class: useLegacyNumericFields.
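
The schema-dependent decision described above can be sketched roughly
as follows. This is a minimal illustration, not the code in this
commit: LegacyNumericFieldSketch, SchemaSketch, FieldKind and the
version numbers are hypothetical stand-ins, and only the attribute
name useLegacyNumericFields is taken from the change itself.

```java
/** Hypothetical sketch; only the name useLegacyNumericFields comes from this change. */
public class LegacyNumericFieldSketch {

  /** Illustrative field kinds, standing in for the concrete Lucene field classes. */
  enum FieldKind {
    LEGACY_NUMERIC, // deprecated numeric field types, old schema versions
    DIMENSIONAL_POINT, // IntPoint/LongPoint style types, new schema versions
    STRING // string type, used for the document id in new schema versions
  }

  /** Stand-in for the index schema class, reduced to what the sketch needs. */
  static final class SchemaSketch {
    final int version; // assumed version number, for display only
    final boolean useLegacyNumericFields;

    SchemaSketch(int version, boolean useLegacyNumericFields) {
      this.version = version;
      this.useLegacyNumericFields = useLegacyNumericFields;
    }
  }

  /** Kind used for ordinary (non-id) numeric fields under the given schema. */
  static FieldKind numericFieldKind(SchemaSketch schema) {
    return schema.useLegacyNumericFields
        ? FieldKind.LEGACY_NUMERIC
        : FieldKind.DIMENSIONAL_POINT;
  }

  /** Kind used for the document id field under the given schema. */
  static FieldKind documentIdKind(SchemaSketch schema) {
    return schema.useLegacyNumericFields ? FieldKind.LEGACY_NUMERIC : FieldKind.STRING;
  }

  public static void main(String[] args) {
    SchemaSketch oldSchema = new SchemaSketch(1, true);
    SchemaSketch newSchema = new SchemaSketch(2, false);
    for (SchemaSketch s : new SchemaSketch[] {oldSchema, newSchema}) {
      System.out.println(
          "schema " + s.version
              + ": id=" + documentIdKind(s)
              + ", numeric=" + numericFieldKind(s));
    }
  }
}
```

Keeping the decision in two small helpers keyed on one schema flag
means the legacy branch can be deleted in one place once old schema
versions are no longer supported.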

While this approach temporarily complicates the code, the extra logic
can be removed once the next Gerrit version is released. Until then,
the deprecated type classes are still used.

Non-id fields are replaced with the new IntPoint and LongPoint fields,
so that we do not use any deprecated or removed Lucene features and
can easily upgrade to the next major Lucene release without relying
on a third-party dependency (Apache Solr).

One side effect of this change is that ChangeQueryBuilder in
AbandonUtil must be obtained through a Guice provider. The reason is
that the index collection must be accessed to retrieve the schema
instance and read its useLegacyNumericFields attribute. AbandonUtil is
bound in singleton scope, but the index collection is only populated
once the multiversion index module has started. When support for
legacy numeric fields is removed in a later Gerrit release, this
change can be reverted.
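
The need for a provider can be illustrated with a small sketch. It is
a simplified model rather than Gerrit's code: java.util.function.Supplier
stands in for a Guice Provider, and ProviderSketch,
IndexCollectionSketch, QueryBuilderSketch and AbandonUtilSketch are
hypothetical names. It only shows that a singleton constructed before
the index module starts has to defer the schema lookup until first use.

```java
import java.util.function.Supplier;

/** Hypothetical model; Supplier is used here in place of a Guice Provider. */
public class ProviderSketch {

  /** Stand-in for the index collection: the schema flag is unknown until start(). */
  static final class IndexCollectionSketch {
    private Boolean useLegacyNumericFields; // null until the index module has started

    void start(boolean useLegacyNumericFields) {
      this.useLegacyNumericFields = useLegacyNumericFields;
    }

    boolean useLegacyNumericFields() {
      if (useLegacyNumericFields == null) {
        throw new IllegalStateException("index module not started");
      }
      return useLegacyNumericFields;
    }
  }

  /** Stand-in for a query builder that reads the schema flag when it is constructed. */
  static final class QueryBuilderSketch {
    final boolean legacy;

    QueryBuilderSketch(IndexCollectionSketch indexes) {
      this.legacy = indexes.useLegacyNumericFields();
    }
  }

  /** Stand-in for a singleton that keeps a provider instead of a builder instance. */
  static final class AbandonUtilSketch {
    private final Supplier<QueryBuilderSketch> builderProvider;

    AbandonUtilSketch(Supplier<QueryBuilderSketch> builderProvider) {
      this.builderProvider = builderProvider;
    }

    boolean usesLegacyFields() {
      // The builder is created here, on use, not when the singleton is constructed.
      return builderProvider.get().legacy;
    }
  }

  public static void main(String[] args) {
    IndexCollectionSketch indexes = new IndexCollectionSketch();
    // Constructing the singleton is safe even though the index is not started yet.
    AbandonUtilSketch util = new AbandonUtilSketch(() -> new QueryBuilderSketch(indexes));
    indexes.start(false);
    System.out.println("uses legacy fields: " + util.usesLegacyFields());
  }
}
```

Injecting the builder directly would correspond to calling
new QueryBuilderSketch(indexes) before start(), which fails in this
model because the schema flag is not yet known.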

[1] https://users.cs.duke.edu/~pankaj/publications/papers/bkd-sstd.pdf

Bug: Issue 11643
Change-Id: Icbc80d8a775a6ffea97e99717b24d3e8cacaee14
21 files changed
tree: a1bf1040a95fe5a5dadf62ce3d8c8070588bc1ab
  1. .settings/
  2. antlr3/
  3. contrib/
  4. Documentation/
  5. e2e-tests/
  6. java/
  7. javatests/
  8. lib/
  9. modules/
  10. plugins/
  11. polygerrit-ui/
  12. prolog/
  13. prologtests/
  14. proto/
  15. resources/
  16. tools/
  17. webapp/
  18. .bazelignore
  19. .bazelproject
  20. .bazelrc
  21. .bazelversion
  22. .editorconfig
  23. .git-blame-ignore-revs
  24. .gitignore
  25. .gitmodules
  26. .gitreview
  27. .mailmap
  28. .pydevproject
  29. BUILD
  30. COPYING
  31. INSTALL
  32. package.json
  33. README.md
  34. SUBMITTING_PATCHES
  35. version.bzl
  36. WORKSPACE
README.md

Gerrit Code Review

Gerrit is a code review and project management tool for Git-based projects.

Objective

Gerrit makes reviews easier by showing changes in a side-by-side display, and allowing inline comments to be added by any reviewer.

Gerrit simplifies Git-based project maintainership by permitting any authorized user to submit changes to the master Git repository, rather than requiring all approved changes to be merged in by hand by the project maintainer.

Documentation

For information about how to install and use Gerrit, refer to the documentation.

Source

Our canonical Git repository is located on googlesource.com. There is a mirror of the repository on GitHub.

Reporting bugs

Please report bugs on the issue tracker.

Contribute

Gerrit is the work of hundreds of contributors. We appreciate your help!

Please read the contribution guidelines.

Note that we do not accept Pull Requests via the GitHub mirror.

Getting in contact

The Developer Mailing list is repo-discuss on Google Groups.

License

Gerrit is provided under the Apache License 2.0.

Build

Install Bazel and run the following:

    git clone --recurse-submodules https://gerrit.googlesource.com/gerrit
    cd gerrit && bazel build release

Install binary packages (Deb/Rpm)

The instructions on how to configure the GerritForge/BinTray repositories are here

On Debian/Ubuntu run:

    apt-get update && apt-get install gerrit=<version>-<release>

NOTE: release is a counter that starts with 1 and indicates the number of packages that have been released with the same version of the software.

On CentOS/RedHat run:

    yum clean all && yum install gerrit-<version>[-<release>]

On Fedora run:

    dnf clean all && dnf install gerrit-<version>[-<release>]

Use pre-built Gerrit images on Docker

Docker images of Gerrit are available on DockerHub

To run a CentOS 7 based Gerrit image:

    docker run -p 8080:8080 gerritforge/gerrit-centos7[:version]

To run an Ubuntu 15.04 based Gerrit image:

    docker run -p 8080:8080 gerritforge/gerrit-ubuntu15.04[:version]

NOTE: for the yum and dnf commands above, release is optional. The last released package of the version is installed if the release number is omitted.