Add an analyzer with tokenizer:keyword to prefix fields

The default Elasticsearch analyzer drops square brackets when
analyzing a query. A keyword tokenizer[1] emits the query text
unchanged, without dropping any characters. A keyword tokenizer
also creates a single term for the given text, which makes
'match_phrase_prefix'[2] searches work as intended by Gerrit.
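
As a rough illustration of the kind of index configuration this
describes, the sketch below builds an Elasticsearch request body
with a custom analyzer that uses the keyword tokenizer, plus a
'match_phrase_prefix' query against a field using it. The analyzer
name 'verbatim_analyzer', the field name 'hashtag_prefix' and the
helper function names are hypothetical and chosen only for this
example; they are not taken from the change itself.

```python
import json


def build_index_body(field_name, analyzer_name="verbatim_analyzer"):
    """Return index settings and mappings for one keyword-tokenized text field.

    Both names are hypothetical placeholders for this sketch.
    """
    return {
        "settings": {
            "analysis": {
                "analyzer": {
                    # A custom analyzer whose only component is the
                    # keyword tokenizer: the whole input becomes one term.
                    analyzer_name: {"type": "custom", "tokenizer": "keyword"}
                }
            }
        },
        "mappings": {
            "properties": {
                field_name: {"type": "text", "analyzer": analyzer_name}
            }
        },
    }


def build_prefix_query(field_name, text):
    """Return a match_phrase_prefix query body for the given field."""
    return {"query": {"match_phrase_prefix": {field_name: {"query": text}}}}


if __name__ == "__main__":
    print(json.dumps(build_index_body("hashtag_prefix"), indent=2))
    print(json.dumps(build_prefix_query("hashtag_prefix", "[area]"), indent=2))
```

Because the analyzer has no token filters, the query text and the
indexed text are compared character for character, brackets included.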

For example, consider change C1 with hashtag '[area] subsystem'
and change C2 with hashtag 'area subsystem'. A Gerrit query [3]
returns only C1 with Lucene, but both C1 and C2 with Elasticsearch
(without this change).
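
The C1/C2 example above can be reproduced with a small pure-Python
simulation. This is only a sketch: 'standard_like_terms' is a crude
stand-in for a word-splitting analyzer (it is not Elasticsearch's
actual tokenization algorithm), and 'phrase_prefix_match' is a
simplified model of phrase-prefix matching. All function names are
made up for this illustration.

```python
import re


def standard_like_terms(text):
    """Crude word-splitting analyzer: lowercase, keep runs of letters/digits.

    Punctuation such as '[' and ']' is discarded, which is the effect
    the commit message attributes to the default analyzer.
    """
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_terms(text):
    """Keyword-style tokenization: the whole text is a single term."""
    return [text]


def phrase_prefix_match(query_terms, doc_terms):
    """Simplified phrase-prefix match.

    All query terms except the last must match consecutive document
    terms exactly; the last query term only needs to be a prefix.
    """
    n = len(query_terms)
    if n == 0:
        return False
    for start in range(len(doc_terms) - n + 1):
        window = doc_terms[start:start + n]
        if window[:-1] == query_terms[:-1] and window[-1].startswith(query_terms[-1]):
            return True
    return False


def search(analyze, query, docs):
    """Return the sorted names of documents matching the query."""
    query_terms = analyze(query)
    return sorted(
        name for name, text in docs.items()
        if phrase_prefix_match(query_terms, analyze(text))
    )


hashtags = {"C1": "[area] subsystem", "C2": "area subsystem"}

print(search(standard_like_terms, "[area]", hashtags))  # ['C1', 'C2']
print(search(keyword_terms, "[area]", hashtags))        # ['C1']
```

With word splitting, both hashtags reduce to the terms 'area' and
'subsystem', so the bracketed query cannot tell them apart. With a
single verbatim term, only the hashtag that really starts with
'[area]' matches.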

This aligns the behaviour of the 'prefixhashtag' and 'prefixtopic'
operators on Elasticsearch with their behaviour on Lucene.

[1] https://www.elastic.co/guide/en/elasticsearch/reference/7.17/analysis-keyword-tokenizer.html
[2] https://www.elastic.co/guide/en/elasticsearch/reference/7.17/query-dsl-match-query-phrase-prefix.html
[3] prefixhashtag:"[area]"

Change-Id: Icf62611af9e8323f98d4cb21d619bf5bc3d73177
3 files changed
README.md

Index backend for Gerrit, based on ElasticSearch

Indexing backend libModule for Gerrit Code Review based on ElasticSearch.

This module was originally part of Gerrit core and then extracted into a separate component from v3.5.0-rc3 as part of Change-Id: Ib7b5167ce.

Note that the ElasticSearch source code is no longer Apache 2.0-licensed for versions 7.11 and newer. See the ElasticSearch 2021 license change for more information.

How to build

This libModule is built like a Gerrit in-tree plugin, using Bazelisk. See the build instructions for more details.

Setup

See the setup instructions for how to install the index-elasticsearch module.

For further information and supported options, refer to the config documentation.

Integration test

This libModule runs tests like a Gerrit in-tree plugin, using Bazelisk. See the test instructions for more details.