| = Request Tracing |
| |
| [[on-demand]] |
| == On-demand Request Tracing |
| |
| Gerrit supports on-demand tracing of single requests that results in |
| additional logs with debug information that are written to the |
| `error_log`. The logs that correspond to a traced request are |
| associated with a unique trace ID. This trace ID is returned with the |
| response and can be used by an administrator to find the matching log |
| entries. |
| |
| How tracing is enabled and how the trace ID is returned depends on the |
| request type: |
| |
| * REST API: For REST calls tracing can be enabled by setting the |
| `trace` request parameter or the `X-Gerrit-Trace` header, the trace |
| ID is returned as `X-Gerrit-Trace` header. More information about |
| this can be found in the link:rest-api.html#tracing[Request Tracing] |
| section of the link:rest-api.html[REST API documentation]. |
| * SSH API: For SSH calls tracing can be enabled by setting the |
| `--trace` option. More information about this can be found in |
| the link:cmd-index.html#trace[Trace] section of the |
| link:cmd-index.html[SSH command documentation]. |
| * Git: For Git pushes tracing can be enabled by setting the |
| `trace` push option, the trace ID is returned in the command output. |
| More information about this can be found in |
| the link:user-upload.html#trace[Trace] section of the |
| link:user-upload.html[upload documentation]. Tracing for Git requests |
| other than Git push is not supported. |
| |
| When request tracing is enabled it is possible to provide an ID that |
| should be used as trace ID. If a trace ID is not provided a trace ID is |
| automatically generated. The trace ID must be provided to the support |
| team (administrator of the server) so that they can find the trace. |
| |
| When doing traces it is recommended to specify the ID of the issue |
| that is being investigated as trace ID so that the traces of the issue |
| can be found more easily. When the issue ID is used as trace ID there |
| is no need to find the generated trace ID and report it in the issue. |
| |
| Since tracing consumes additional server resources tracing should only |
| be enabled for single requests if there is a concrete need for |
| debugging. In particular bots should never enable tracing for all their |
| requests by default. |
| |
| [[auto-retry]] |
| == Automatic Request Tracing |
| |
| Gerrit can be link:config-gerrit.html#retry.retryWithTraceOnFailure[ |
| configured] to automatically retry requests on non-recoverable failures |
| with tracing enabled. This allows to automatically captures traces of |
| these failures for further analysis by the Gerrit administrators. |
| |
| The auto-retry on failure behaves the same way as if the calling user |
| would retry the failed operation with tracing enabled. |
| |
| It is expected that the auto-retry fails with the same exception that |
| triggered the auto-retry, however this is not guaranteed: |
| |
| * Not all Gerrit operations are fully atomic and it can happen that |
| some parts of the operation have been successfully performed before |
| the failure happened. In this case the auto-retry may fail with a |
| different exception. |
| * Some exceptions may mistakenly be considered as non-recoverable and |
| the auto-retry actually succeeds. |
| |
| [[auto-retry-succeeded]] |
| If an auto-retry succeeds you may consider filing this as |
| link:https://bugs.chromium.org/p/gerrit/issues/entry?template=GoogleSource+Issue[ |
| Gerrit issue] so that the Gerrit developers can fix this and treat this |
| exception as recoverable. |
| |
| The trace IDs for auto-retries are generated and start with |
| `retry-on-failure-`. |
| |
| The best way to search for auto-retries in logs is to do a grep by |
| `AutoRetry`. For each auto-retry that happened this should match 1 or 2 |
| log entries: |
| |
| * one `ERROR` log entry with the exception that triggered the |
| auto-retry |
| * one `FINE` log entry with the exception that happened on auto-retry |
| (if this log entry is not present the operation succeeded on |
| auto-retry) |
| |
| To inspect single auto-retry occurrences in detail you can do a |
| link:#find-trace[grep by the trace ID]. The trace ID is part of the log |
| entries which have been found by the previous grep (watch out for |
| something like: `retry-on-failure-1534166888910-3985dfba`). |
| |
| [TIP] |
| Auto-retrying on failures is only supported by some of the REST |
| endpoints (change REST endpoints that perform updates). |
| |
| [[auto-retry-metrics]] |
| === Metrics |
| |
| If auto-retry is link:config-gerrit.html#retry.retryWithTraceOnFailure[ |
| enabled] the following metrics are reported: |
| |
| * `action/auto_retry_count`: Number of automatic retries with tracing |
| * `action/failures_on_auto_retry_count`: Number of failures on auto retry |
| |
| By comparing the values of these counters one can see how often the |
| auto-retry succeeds. As explained link:#auto-retry-succeeded[above] if |
| auto-retries succeed that's an issue with Gerrit that you may want to |
| report. |
| |
| [[find-trace]] |
| == Find log entries for a trace ID |
| |
| If tracing is enabled all log messages that correspond to the traced |
| request have a `TRACE_ID` tag set, e.g.: |
| |
| ---- |
| [2018-08-13 15:28:08,913] [HTTP-76] TRACE com.google.gerrit.httpd.restapi.RestApiServlet : Received REST request: GET /a/accounts/self (parameters: [trace]) [CONTEXT forced=true TRACE_ID="1534166888910-3985dfba" ] |
| [2018-08-13 15:28:08,914] [HTTP-76] TRACE com.google.gerrit.httpd.restapi.RestApiServlet : Calling user: admin [CONTEXT forced=true TRACE_ID="1534166888910-3985dfba" ] |
| [2018-08-13 15:28:08,942] [HTTP-76] TRACE com.google.gerrit.httpd.restapi.RestApiServlet : REST call succeeded: 200 [CONTEXT forced=true TRACE_ID="1534166888910-3985dfba" ] |
| ---- |
| |
| By doing a grep with the trace ID over the error log the log entries |
| that correspond to the request can be found. |
| |
| [TIP] |
| Usually only server administrators have access to the logs. |
| |
| == Which information is captured in a trace? |
| |
| * request details |
| ** REST API: request URL, request parameter names, calling user, |
| response code, response body on errors |
| ** SSH API: parameter names |
| ** Git API: push options, magic branch parameter names |
| * cache misses, cache evictions |
| * reads from NoteDb, writes to NoteDb |
| * reads of meta data files, writes of meta data files |
| * index queries (with parameters and matches) |
| * reindex events |
| * permission checks (e.g. which rule is responsible for a deny) |
| * timer metrics |
| * all other logs |