Marian Harbach | ebeb154 | 2019-12-13 10:42:46 +0100 | [diff] [blame] | 1 | :linkattrs: |
Edwin Kempin | 0ade5aa | 2018-08-10 08:49:32 +0200 | [diff] [blame] | 2 | = Request Tracing |
| 3 | |
Edwin Kempin | 4a14316 | 2019-08-28 15:45:21 +0200 | [diff] [blame] | 4 | [[on-demand]] |
| 5 | == On-demand Request Tracing |
| 6 | |
Edwin Kempin | 0ade5aa | 2018-08-10 08:49:32 +0200 | [diff] [blame] | 7 | Gerrit supports on-demand tracing of single requests that results in |
| 8 | additional logs with debug information that are written to the |
| 9 | `error_log`. The logs that correspond to a traced request are |
| 10 | associated with a unique trace ID. This trace ID is returned with the |
| 11 | response and can be used by an administrator to find the matching log |
| 12 | entries. |
| 13 | |
| 14 | How tracing is enabled and how the trace ID is returned depends on the |
| 15 | request type: |
| 16 | |
| 17 | * REST API: For REST calls tracing can be enabled by setting the |
Edwin Kempin | 99ccd92 | 2018-09-10 10:06:53 +0200 | [diff] [blame] | 18 | `trace` request parameter or the `X-Gerrit-Trace` header, the trace |
| 19 | ID is returned as `X-Gerrit-Trace` header. More information about |
| 20 | this can be found in the link:rest-api.html#tracing[Request Tracing] |
| 21 | section of the link:rest-api.html[REST API documentation]. |
Edwin Kempin | 0ade5aa | 2018-08-10 08:49:32 +0200 | [diff] [blame] | 22 | * SSH API: For SSH calls tracing can be enabled by setting the |
| 23 | `--trace` option. More information about this can be found in |
| 24 | the link:cmd-index.html#trace[Trace] section of the |
| 25 | link:cmd-index.html[SSH command documentation]. |
Edwin Kempin | c9a076d | 2018-12-13 09:03:37 +0100 | [diff] [blame] | 26 | * Git Push (requires usage of git protocol v2): For Git pushes tracing |
| 27 | can be enabled by setting the `trace` push option, the trace ID is |
| 28 | returned in the command output. More information about this can be |
| 29 | found in the link:user-upload.html#trace[Trace] section of the |
| 30 | link:user-upload.html[upload documentation]. |
| 31 | * Git Clone/Fetch/Ls-Remote (requires usage of git protocol v2): For |
| 32 | Git clone/fetch/ls-remote tracing can be enabled by setting the |
| 33 | `trace` server option. Use '-o trace=<TRACE-ID>' for `git fetch` and |
| 34 | `git ls-remote`, and '--server-option trace=<TRACE-ID>' for |
| 35 | `git clone`. If the `trace` server option is set without a value |
| 36 | (without trace ID) a trace ID is generated but the generated trace ID |
| 37 | is not returned to the client (hence a trace ID should always be |
| 38 | set). |
Edwin Kempin | 0ade5aa | 2018-08-10 08:49:32 +0200 | [diff] [blame] | 39 | |
Edwin Kempin | da82378 | 2018-08-29 22:52:54 +0200 | [diff] [blame] | 40 | When request tracing is enabled it is possible to provide an ID that |
| 41 | should be used as trace ID. If a trace ID is not provided a trace ID is |
| 42 | automatically generated. The trace ID must be provided to the support |
Edwin Kempin | f6de774 | 2019-08-28 15:39:56 +0200 | [diff] [blame] | 43 | team (administrator of the server) so that they can find the trace. |
Edwin Kempin | da82378 | 2018-08-29 22:52:54 +0200 | [diff] [blame] | 44 | |
| 45 | When doing traces it is recommended to specify the ID of the issue |
| 46 | that is being investigated as trace ID so that the traces of the issue |
| 47 | can be found more easily. When the issue ID is used as trace ID there |
| 48 | is no need to find the generated trace ID and report it in the issue. |
| 49 | |
Edwin Kempin | c9ec778 | 2018-08-29 08:45:51 +0200 | [diff] [blame] | 50 | Since tracing consumes additional server resources tracing should only |
| 51 | be enabled for single requests if there is a concrete need for |
| 52 | debugging. In particular bots should never enable tracing for all their |
| 53 | requests by default. |
| 54 | |
Edwin Kempin | b492633 | 2019-08-28 17:07:17 +0200 | [diff] [blame] | 55 | [[auto-retry]] |
| 56 | == Automatic Request Tracing |
| 57 | |
| 58 | Gerrit can be link:config-gerrit.html#retry.retryWithTraceOnFailure[ |
| 59 | configured] to automatically retry requests on non-recoverable failures |
| 60 | with tracing enabled. This allows to automatically captures traces of |
| 61 | these failures for further analysis by the Gerrit administrators. |
| 62 | |
| 63 | The auto-retry on failure behaves the same way as if the calling user |
| 64 | would retry the failed operation with tracing enabled. |
| 65 | |
| 66 | It is expected that the auto-retry fails with the same exception that |
| 67 | triggered the auto-retry, however this is not guaranteed: |
| 68 | |
| 69 | * Not all Gerrit operations are fully atomic and it can happen that |
| 70 | some parts of the operation have been successfully performed before |
| 71 | the failure happened. In this case the auto-retry may fail with a |
| 72 | different exception. |
| 73 | * Some exceptions may mistakenly be considered as non-recoverable and |
| 74 | the auto-retry actually succeeds. |
| 75 | |
| 76 | [[auto-retry-succeeded]] |
| 77 | If an auto-retry succeeds you may consider filing this as |
Edwin Kempin | 135f002 | 2023-06-20 12:30:21 +0000 | [diff] [blame] | 78 | link:https://issues.gerritcodereview.com/issues/new?component=1371020[ |
Marian Harbach | 3425337 | 2019-12-10 18:01:31 +0100 | [diff] [blame] | 79 | Gerrit issue,role=external,window=_blank] so that the Gerrit developers can fix this and treat this |
Edwin Kempin | b492633 | 2019-08-28 17:07:17 +0200 | [diff] [blame] | 80 | exception as recoverable. |
| 81 | |
| 82 | The trace IDs for auto-retries are generated and start with |
Edwin Kempin | 8e8c9d1 | 2019-09-24 08:52:35 +0200 | [diff] [blame] | 83 | `retry-on-failure-`. For REST requests they are returned to the client |
| 84 | as `X-Gerrit-Trace` header. |
Edwin Kempin | b492633 | 2019-08-28 17:07:17 +0200 | [diff] [blame] | 85 | |
| 86 | The best way to search for auto-retries in logs is to do a grep by |
| 87 | `AutoRetry`. For each auto-retry that happened this should match 1 or 2 |
| 88 | log entries: |
| 89 | |
Edwin Kempin | 6a81e44 | 2019-12-18 08:41:21 +0100 | [diff] [blame] | 90 | * one `FINE` log entry with the exception that triggered the auto-retry |
Edwin Kempin | b492633 | 2019-08-28 17:07:17 +0200 | [diff] [blame] | 91 | * one `FINE` log entry with the exception that happened on auto-retry |
| 92 | (if this log entry is not present the operation succeeded on |
| 93 | auto-retry) |
| 94 | |
| 95 | To inspect single auto-retry occurrences in detail you can do a |
| 96 | link:#find-trace[grep by the trace ID]. The trace ID is part of the log |
| 97 | entries which have been found by the previous grep (watch out for |
| 98 | something like: `retry-on-failure-1534166888910-3985dfba`). |
| 99 | |
| 100 | [TIP] |
| 101 | Auto-retrying on failures is only supported by some of the REST |
| 102 | endpoints (change REST endpoints that perform updates). |
| 103 | |
| 104 | [[auto-retry-metrics]] |
| 105 | === Metrics |
| 106 | |
| 107 | If auto-retry is link:config-gerrit.html#retry.retryWithTraceOnFailure[ |
| 108 | enabled] the following metrics are reported: |
| 109 | |
| 110 | * `action/auto_retry_count`: Number of automatic retries with tracing |
| 111 | * `action/failures_on_auto_retry_count`: Number of failures on auto retry |
| 112 | |
| 113 | By comparing the values of these counters one can see how often the |
| 114 | auto-retry succeeds. As explained link:#auto-retry-succeeded[above] if |
| 115 | auto-retries succeed that's an issue with Gerrit that you may want to |
| 116 | report. |
| 117 | |
| 118 | [[find-trace]] |
Edwin Kempin | 0ade5aa | 2018-08-10 08:49:32 +0200 | [diff] [blame] | 119 | == Find log entries for a trace ID |
| 120 | |
| 121 | If tracing is enabled all log messages that correspond to the traced |
| 122 | request have a `TRACE_ID` tag set, e.g.: |
| 123 | |
| 124 | ---- |
| 125 | [2018-08-13 15:28:08,913] [HTTP-76] TRACE com.google.gerrit.httpd.restapi.RestApiServlet : Received REST request: GET /a/accounts/self (parameters: [trace]) [CONTEXT forced=true TRACE_ID="1534166888910-3985dfba" ] |
| 126 | [2018-08-13 15:28:08,914] [HTTP-76] TRACE com.google.gerrit.httpd.restapi.RestApiServlet : Calling user: admin [CONTEXT forced=true TRACE_ID="1534166888910-3985dfba" ] |
| 127 | [2018-08-13 15:28:08,942] [HTTP-76] TRACE com.google.gerrit.httpd.restapi.RestApiServlet : REST call succeeded: 200 [CONTEXT forced=true TRACE_ID="1534166888910-3985dfba" ] |
| 128 | ---- |
| 129 | |
| 130 | By doing a grep with the trace ID over the error log the log entries |
| 131 | that correspond to the request can be found. |
Edwin Kempin | ce92339b | 2018-08-29 09:56:16 +0200 | [diff] [blame] | 132 | |
Edwin Kempin | f6de774 | 2019-08-28 15:39:56 +0200 | [diff] [blame] | 133 | [TIP] |
| 134 | Usually only server administrators have access to the logs. |
| 135 | |
Edwin Kempin | ce92339b | 2018-08-29 09:56:16 +0200 | [diff] [blame] | 136 | == Which information is captured in a trace? |
| 137 | |
| 138 | * request details |
| 139 | ** REST API: request URL, request parameter names, calling user, |
| 140 | response code, response body on errors |
| 141 | ** SSH API: parameter names |
| 142 | ** Git API: push options, magic branch parameter names |
| 143 | * cache misses, cache evictions |
| 144 | * reads from NoteDb, writes to NoteDb |
| 145 | * reads of meta data files, writes of meta data files |
| 146 | * index queries (with parameters and matches) |
| 147 | * reindex events |
| 148 | * permission checks (e.g. which rule is responsible for a deny) |
Edwin Kempin | eb17f3b | 2018-08-29 10:31:33 +0200 | [diff] [blame] | 149 | * timer metrics |
Edwin Kempin | ce92339b | 2018-08-29 09:56:16 +0200 | [diff] [blame] | 150 | * all other logs |
Edwin Kempin | 48a2316 | 2021-09-01 12:48:49 +0200 | [diff] [blame] | 151 | |
| 152 | GERRIT |
| 153 | ------ |
| 154 | Part of link:index.html[Gerrit Code Review] |
| 155 | |
| 156 | SEARCHBOX |
| 157 | --------- |