:linkattrs:
= Gerrit Code Review - NoteDb Backend

NoteDb is the storage backend for code review metadata. It is based on
Git, so code reviews are stored together with the code under review.
NoteDb replaced the traditional SQL backend for change, account and group
metadata that was used in the 2.x series.

.Advantages
- *Simplicity*: All data is stored in one location in the site directory, rather
  than being split between the site directory and a possibly external database
  server.
- *Consistency*: Replication and backups can use a snapshot of the Git
  repository refs, which will include both the branch and patch set refs, and
  the change metadata that points to them.
- *Auditability*: Rather than storing mutable rows in a database, modifications
  to changes are stored as a sequence of Git commits, automatically preserving
  history of the metadata. +
  There are no strict guarantees, and meta refs may be rewritten, but the
  default assumption is that all operations are logged.
- *Extensibility*: Plugin developers can add new fields to metadata without the
  core database schema having to know about them.
- *New features*: Enables simple federation between Gerrit servers, as well as
  offline code review and interoperation with other tools.

For an example NoteDb change, poke around at this one:
----
  git fetch https://gerrit.googlesource.com/gerrit refs/changes/70/98070/meta \
      && git log -p FETCH_HEAD
----

== Current Status

NoteDb is the only database format supported by Gerrit 3.0+. The
change data migration tools are only included in Gerrit 2.16; they are
not available in 3.0, so any upgrade from Gerrit 2.x to 3.x must go through
2.16 to effect the NoteDb upgrade.

== Format

Each review ("change") in Gerrit is numbered. The different revisions
("patchsets") of a change 12345 are stored under
----
refs/changes/45/12345/${PATCHSET_NUMBER}
----

The revisions are stored as commits to the main project, i.e. if you
fetch this ref, you can check out the proposed change.

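The ref name can be derived from the change number. A minimal sketch, assuming
the two-digit directory is the last two digits of the change number
(zero-padded), as the `45/12345` and `70/98070` examples in this document
suggest; `change_ref` is a hypothetical helper, not part of Gerrit:

```shell
# change_ref CHANGE_NUMBER SUFFIX
# Prints refs/changes/<shard>/<change number>/<suffix>.
# Assumption: the shard directory is the change number modulo 100,
# zero-padded to two digits.
change_ref() {
  change="$1"
  suffix="$2"
  shard=$(printf '%02d' "$((change % 100))")
  printf 'refs/changes/%s/%s/%s\n' "$shard" "$change" "$suffix"
}

# Ref name of patch set 1 of change 12345.
change_ref 12345 1
```

The printed name can be passed to `git fetch` to retrieve that patch set.
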
A change 12345 has its review metadata under
----
refs/changes/45/12345/meta
----
The metadata is a notes branch. The commit messages on the branch hold
modifications to global data of the change (votes, global comments). The inline
comments are in a
link:https://eclipse.gerrithub.io/plugins/gitiles/eclipse-jgit/jgit/\+/refs/heads/master/org.eclipse.jgit/src/org/eclipse/jgit/notes/NoteMap.java[NoteMap],
where the key is the commit SHA-1 of the patchset
that the comment refers to, and the value is JSON data. The format of the
JSON is defined by the
link:https://gerrit.googlesource.com/gerrit/\+/master/java/com/google/gerrit/server/notedb/RevisionNoteData.java[RevisionNoteData]
class, which contains
link:https://gerrit.googlesource.com/gerrit/\+/master/java/com/google/gerrit/entities/Comment.java[Comment] entities.

For example:
----
{
  "key": {
    "uuid": "c7be1334_47885e36",
    "filename":
      "java/com/google/gerrit/server/restapi/project/CommitsCollection.java",
    "patchSetId": 7
  },
  "lineNbr": 158,
  "author": {
    "id": 1026112
  },
  "writtenOn": "2019-11-06T09:00:50Z",
  "side": 1,
  "message": "nit: factor this out in a variable, use toImmutableList as collector",
  "range": {
    "startLine": 156,
    "startChar": 32,
    "endLine": 158,
    "endChar": 66
  },
  "revId": "071c601d6ee1a2a9f520415fd9efef8e00f9cf60",
  "serverId": "173816e5-2b9a-37c3-8a2e-48639d4f1153",
  "unresolved": true
}
----

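Because the inline comments are stored as notes in the tree of the meta commit,
the note keys can be listed with plain Git. A sketch, assuming every
40-character hexadecimal path in that tree (after removing fan-out slashes) is
a note key; `list_note_keys` is a hypothetical helper:

```shell
# list_note_keys COMMIT
# Prints the patch set commit SHA-1s that have a note on the given
# meta commit. Notes may be stored under fanned-out paths such as
# "07/1c60...", so slashes are removed before filtering.
list_note_keys() {
  git ls-tree -r --name-only "$1" | tr -d '/' | grep -E '^[0-9a-f]{40}$'
}

# Usage, after fetching a meta ref into FETCH_HEAD:
#   list_note_keys FETCH_HEAD
```

The JSON for one key can then be printed with `git show FETCH_HEAD:<path>`,
using the path exactly as `git ls-tree` reports it.
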
Automated systems may post "robot comments" instead of normal
comments. Robot comments extend the comment format shown above and are
defined in the
link:https://gerrit.googlesource.com/gerrit/\+/master/java/com/google/gerrit/entities/RobotComment.java[RobotComment]
class.

[[migration]]
== Migration

Migrating change metadata can take a long time for large sites, so
administrators choose whether to do the migration offline or online, depending
on their available resources and tolerance for downtime.

Only change metadata requires manual steps to migrate it from ReviewDb; account
and group data is migrated automatically by `gerrit.war init`.

[[online-migration]]
=== Online

Note that online migration is only available in 2.x. To do the online migration
from 2.14.x or 2.15.x to 3.0, it is necessary to first upgrade to 2.16.x.

To start the online migration, set the `noteDb.changes.autoMigrate` option in
`gerrit.config` and restart Gerrit:
----
[noteDb "changes"]
  autoMigrate = true
----

Alternatively, pass the `--migrate-to-note-db` flag to
`gerrit.war daemon`:
----
  java -jar gerrit.war daemon -d /path/to/site --migrate-to-note-db
----

Both ways of starting the online migration are equivalent. Once started, it is
safe to restart the server at any time; the migration will pick up where it left
off. Migration progress will be reported to the Gerrit logs.

*Advantages*

* No downtime required.

*Disadvantages*

* Only available in 2.x; not available in Gerrit 3.0.
* Much slower than offline; uses only a single thread, to leave resources
  available for serving traffic.
* Performance may be degraded, particularly of updates; data needs to be written
  to both ReviewDb and NoteDb while the migration is in progress.

[[offline-migration]]
=== Offline

To run the offline migration, run the `migrate-to-note-db` program:
----
  java -jar gerrit.war migrate-to-note-db -d /path/to/site
----

Once started, it is safe to cancel and restart the migration process, or to
switch to the online process.

[NOTE]
Migration requires a heap size comparable to running a Gerrit server. If you
normally run `gerrit.war daemon` with an `-Xmx` flag, pass that to the migration
tool as well.

[NOTE]
By appending `--reindex false` to the above command, you can skip the
lengthy, implicit reindexing step of the migration. This is useful if you plan
to perform further Gerrit upgrades while the server is offline and have to
reindex later anyway (e.g. a follow-up upgrade to Gerrit 3.2 or newer, which
requires reindexing changes).

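Both notes can be combined in a small wrapper. A sketch, assuming `gerrit.war`
is in the current directory; the function name and the heap size in the usage
line are placeholders, not recommendations:

```shell
# run_offline_migration SITE_PATH HEAP_SIZE
# Runs the offline migration with an explicit maximum heap size and
# skips the implicit reindex.
run_offline_migration() {
  site="$1"
  heap="$2"
  java "-Xmx${heap}" -jar gerrit.war migrate-to-note-db -d "$site" \
    --reindex false
}

# Usage (8g is only an example value; use the heap size of your server):
#   run_offline_migration /path/to/site 8g
```

The skipped reindex still has to happen later, for example as part of the
follow-up upgrade.
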
*Advantages*

* Much faster than online; can use all available CPUs, since no live traffic
  needs to be served.
* No degraded performance of live servers due to writing data to 2 locations.

*Disadvantages*

* Available in Gerrit 2.15 and 2.16 only.
* May require substantial downtime; takes about twice as long as an
  link:pgm-reindex.html[offline reindex]. (In fact, one of the migration steps is a
  full reindex, so it can't possibly take less time.)

[[trial-migration]]
==== Trial mode

The migration tool also supports "trial mode", where changes are
migrated to NoteDb and read from NoteDb at runtime, but their primary storage
location is still ReviewDb, and data is kept in sync between the two locations.

To run the migration in trial mode, add `--trial` to `migrate-to-note-db` or
`daemon`:
----
  java -jar gerrit.war migrate-to-note-db --trial -d /path/to/site
  # OR
  java -jar gerrit.war daemon -d /path/to/site --migrate-to-note-db --trial
----

Or, set `noteDb.changes.trial=true` in `gerrit.config`.

There are several use cases for trial mode:

* Help test early releases of the migration tool for bugs with lower risk.
* Try out new NoteDb-only features like
  link:rest-api-changes.html#get-hashtags[hashtags] without running the full
  migration.

To continue with the full migration after running the trial migration, use
either the online or offline migration steps as normal. To revert to
ReviewDb-only, remove `noteDb.changes.read` and `noteDb.changes.write` from
`notedb.config` and restart Gerrit.

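The revert can be scripted with `git config`, which reads and writes the same
file format. A sketch, assuming `notedb.config` sits next to `gerrit.config` in
the `etc` directory of the site; `revert_trial` is a hypothetical helper:

```shell
# revert_trial SITE_PATH
# Removes the two options that trial mode relies on from notedb.config.
# Assumption: the file is SITE_PATH/etc/notedb.config.
revert_trial() {
  cfg="$1/etc/notedb.config"
  git config -f "$cfg" --unset noteDb.changes.read
  git config -f "$cfg" --unset noteDb.changes.write
}

# Usage (restart Gerrit afterwards):
#   revert_trial /path/to/site
```
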
== Configuration

The migration process works by setting a configuration option in `notedb.config`
for each step in the process, then performing the corresponding data migration.

Config options are read from `notedb.config` first, falling back to
`gerrit.config`. If editing config manually, you may edit either file, but the
migration process itself only touches `notedb.config`. This means if your
`gerrit.config` is managed with Puppet or a similar tool, it can overwrite
`gerrit.config` without affecting the migration process. You should not manage
`notedb.config` with Puppet, but you may copy values back into `gerrit.config`
and delete `notedb.config` at some later point after completing the migration.

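The lookup order can be mimicked from a shell when inspecting a site. A sketch,
assuming both files live in the `etc` directory of the site; `notedb_option` is
a hypothetical helper and only approximates how Gerrit itself resolves options:

```shell
# notedb_option SITE_PATH KEY
# Prints the value of KEY from notedb.config, or from gerrit.config
# when notedb.config does not define it. Fails if neither file does.
notedb_option() {
  site="$1"
  key="$2"
  git config -f "$site/etc/notedb.config" --get "$key" 2>/dev/null ||
    git config -f "$site/etc/gerrit.config" --get "$key"
}

# Usage:
#   notedb_option /path/to/site noteDb.changes.read
```
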
In general, users should not set the options described below manually; this
section serves primarily as a reference.

- `noteDb.changes.write=true`: During a ReviewDb write, the state of the change
  in NoteDb is written to the `note_db_state` field in the `Change` entity.
  After the ReviewDb write, this state is written into NoteDb, resulting in
  effectively double the time for write operations. NoteDb write errors are
  silently dropped, and no attempt is made to read from ReviewDb or correct
  errors (without additional configuration, below).
- `noteDb.changes.read=true`: Change data is written
  to and read from NoteDb, but ReviewDb is still the source of truth. During
  reads, first read the change from ReviewDb, and compare its `note_db_state`
  with what is in NoteDb. If it doesn't match, immediately "auto-rebuild" the
  change, copying data from ReviewDb to NoteDb and returning the result.
- `noteDb.changes.primaryStorage=NOTE_DB`: New changes are written only to
  NoteDb, but changes whose primary storage is ReviewDb are still supported.
  Continues to read from ReviewDb first as in the previous stage, but if the
  change is not in ReviewDb, falls back to reading from NoteDb. +
  Migration of existing changes is described in the link:#migration[Migration]
  section above. +
  Due to an implementation detail, writes to Changes or related tables still
  result in write calls to the database layer, but they are inside a transaction
  that is always rolled back.
- `noteDb.changes.disableReviewDb=true`: All access to Changes or related tables
  is disabled; reads return no results, and writes are no-ops. Assumes the state
  of all changes in NoteDb is accurate, and so is only safe once all changes are
  NoteDb primary. Otherwise, reading changes only from NoteDb might result in
  inaccurate results, and writing to NoteDb would compound the problem.

== NoteDb to ReviewDb rollback

When rolling back from NoteDb to ReviewDb, all the meta refs and the
sequence ref need to be removed.
The link:https://gerrit.googlesource.com/gerrit/+/refs/heads/master/contrib/remove-notedb-refs.sh[remove-notedb-refs.sh,role=external,window=_blank]
script automates this process.
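
To give an idea of what such a cleanup involves, the sketch below only selects
the per-change meta refs from a list of ref names; it deletes nothing. It is not
the `remove-notedb-refs.sh` script, whose actual behavior may differ, and the
pattern is an assumption based on the `refs/changes/45/12345/meta` layout
described in this document:

```shell
# notedb_meta_refs
# Reads ref names on stdin and prints those that look like
# per-change NoteDb meta refs (refs/changes/NN/NNNNN/meta).
notedb_meta_refs() {
  grep -E '^refs/changes/[0-9]{2}/[0-9]+/meta$'
}

# Usage: list candidate refs in the current repository for review.
#   git for-each-ref --format='%(refname)' | notedb_meta_refs
```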