Paginate no-limit queries

Currently, no-limit queries do a single index search with a very
large limit (roughly the size of the change index). With the
Elasticsearch index backend this is problematic: Elasticsearch's REST
client fails with an error like [1] if too many changes have to be
returned, because the client has a default limit of 100MB for the
content it can process. This effectively means that no-limit queries
with the ES index backend are likely non-functional for most sites,
as the 100MB limit is reached with as few as ~50k changes.

Raising this default limit is not recommended, as it can overload
both the ES data nodes and the client. Instead, this change updates
no-limit queries to paginate rather than doing a single index search
with a large limit. It is recommended to set an appropriate value
for 'index.maxPageSize' to avoid the error [1], especially when
'index.pageSizeMultiplier' is set to a value greater than 1.
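
The interaction between pagination, 'index.pageSizeMultiplier' and
'index.maxPageSize' can be illustrated with a minimal sketch. It is
not the code in this change: the class, the method names and the
in-memory "index" are hypothetical, and it assumes that each page size
is the previous one times the multiplier, capped at the maximum page
size.

```java
import java.util.ArrayList;
import java.util.List;

public class NoLimitPaginationSketch {
  /** Hypothetical stand-in for one index search: up to pageSize items from offset start. */
  static List<Integer> searchPage(List<Integer> index, int start, int pageSize) {
    int end = Math.min(index.size(), start + pageSize);
    return new ArrayList<>(index.subList(start, end));
  }

  /**
   * Reads the whole index page by page instead of with one huge limit.
   * The page size grows by 'multiplier' after each page and never exceeds
   * 'maxPageSize' (assumed semantics, see the lead-in above).
   * Every requested page size is recorded in pageSizesUsed.
   */
  static List<Integer> readAll(
      List<Integer> index,
      int firstPageSize,
      int multiplier,
      int maxPageSize,
      List<Integer> pageSizesUsed) {
    if (firstPageSize < 1 || multiplier < 1 || maxPageSize < 1) {
      throw new IllegalArgumentException("sizes and multiplier must be >= 1");
    }
    List<Integer> results = new ArrayList<>();
    int pageSize = Math.min(firstPageSize, maxPageSize);
    int start = 0;
    while (true) {
      List<Integer> page = searchPage(index, start, pageSize);
      pageSizesUsed.add(pageSize);
      results.addAll(page);
      if (page.size() < pageSize) {
        break; // a short page means the index is exhausted
      }
      start += page.size();
      // Use long arithmetic so the multiplication cannot overflow an int.
      pageSize = (int) Math.min((long) pageSize * multiplier, (long) maxPageSize);
    }
    return results;
  }
}
```

In this sketch the cap is what keeps any single response bounded: with
a multiplier greater than 1 and no effective maximum, later pages grow
back towards the size of a single large-limit search.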

Here are some stats from a Lucene-based site with ~20k docs in the
open changes index and ~4M in the closed changes index (~1M abandoned
and ~3M merged).

                                  status:open   status:abandoned
                                  no-limit      no-limit

without this change               7.6s          436s

with change
  paginationType=OFFSET           7.9s          2622s
  pageSizeMultiplier=1

with change
  paginationType=SEARCH_AFTER     7.8s          480s
  pageSizeMultiplier=1

with change
  paginationType=OFFSET           7.4s          417s
  pageSizeMultiplier=10

with change
  paginationType=SEARCH_AFTER     7.7s          418s
  pageSizeMultiplier=10

If 'index.pageSizeMultiplier' is set to 1 (the default), it is raised
to 10 for no-limit queries, as this improves performance and also
prevents no-limit queries from severely degrading when the pagination
type is OFFSET.
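
The multiplier adjustment described above amounts to a small rule,
sketched here under stated assumptions. The class and method names are
hypothetical and the real change may structure this differently; only
the values 1 and 10 come from this commit message.

```java
public class EffectiveMultiplierSketch {
  // Values taken from the commit message: 1 is the default multiplier,
  // 10 is what no-limit queries use when the default is in effect.
  static final int DEFAULT_MULTIPLIER = 1;
  static final int NO_LIMIT_MULTIPLIER = 10;

  /**
   * Returns the multiplier to use for a query. A configured value other
   * than the default is left untouched, and so are queries with a limit.
   */
  static int effectivePageSizeMultiplier(int configured, boolean noLimit) {
    if (noLimit && configured == DEFAULT_MULTIPLIER) {
      return NO_LIMIT_MULTIPLIER;
    }
    return configured;
  }
}
```

A site that explicitly configures a multiplier greater than 1 keeps
its own value in this sketch.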
[1] entity content is too long [..] for the configured buffer limit [104857600]
Release-Notes: no-limit queries are now usable with Elasticsearch index backend
Change-Id: Ifb1f6f5411140c430f2520fb252e688b67d5333c
1 file changed