|  | = Gerrit Code Review - NoteDb Backend | 
|  |  | 
|  | NoteDb is the next generation of Gerrit storage backend, which replaces the | 
|  | traditional SQL backend for change and account metadata with storing data in the | 
|  | same repository as code changes. | 
|  |  | 
|  | .Advantages | 
|  | - *Simplicity*: All data is stored in one location in the site directory, rather | 
|  | than being split between the site directory and a possibly external database | 
|  | server. | 
|  | - *Consistency*: Replication and backups can use a snapshot of the Git | 
|  | repository refs, which will include both the branch and patch set refs, and | 
|  | the change metadata that points to them. | 
|  | - *Auditability*: Rather than storing mutable rows in a database, modifications | 
|  | to changes are stored as a sequence of Git commits, automatically preserving | 
|  | history of the metadata. + | 
|  | There are no strict guarantees, and meta refs may be rewritten, but the | 
|  | default assumption is that all operations are logged. | 
|  | - *Extensibility*: Plugin developers can add new fields to metadata without the | 
|  | core database schema having to know about them. | 
|  | - *New features*: Enables simple federation between Gerrit servers, as well as | 
|  | offline code review and interoperation with other tools. | 
|  |  | 
|  | == Current Status | 
|  |  | 
|  | - Storing change metadata is fully implemented in master, and is live on the | 
|  | servers behind `googlesource.com`. In other words, if you use | 
|  | link:https://gerrit-review.googlesource.com/[gerrit-review], you're already | 
|  | using NoteDb. + | 
|  | - Storing some account data, e.g. user preferences, is implemented in releases | 
|  | back to 2.13. | 
|  | - Storing the rest of account data is a work in progress. | 
|  | - Storing group data is a work in progress. | 
|  |  | 
|  | To use a configuration similar to the current state of `googlesource.com`, paste | 
|  | the following config snippet in your `gerrit.config`: | 
|  |  | 
|  | ---- | 
|  | [noteDb "changes"] | 
|  | write = true | 
|  | read = true | 
|  | primaryStorage = NOTE_DB | 
|  | disableReviewDb = true | 
|  | ---- | 
|  |  | 
|  | This is the configuration that will be produced if you enable experimental | 
|  | NoteDb support on a new site with `init`. | 
|  |  | 
|  | `googlesource.com` additionally uses `fuseUpdates = true`, because its repo | 
|  | backend supports atomic multi-ref transactions. Native JGit doesn't, so setting | 
|  | this option on a dev server would fail. | 
|  |  | 
|  | For an example NoteDb change, poke around at this one: | 
|  | ---- | 
|  | git fetch https://gerrit.googlesource.com/gerrit refs/changes/70/98070/meta \ | 
|  | && git log -p FETCH_HEAD | 
|  | ---- | 
|  |  | 
|  | == Configuration | 
|  |  | 
|  | Account and group data is migrated to NoteDb automatically using the normal | 
|  | schema upgrade process during updates. The remainder of this section details the | 
|  | configuration options that control migration of the change data, which is mostly | 
|  | but not fully implemented. | 
|  |  | 
|  | Change migration state is configured in `gerrit.config` with options like | 
|  | `noteDb.changes.*`. These options are undocumented outside of this file, and the | 
|  | general approach has been to add one new option for each phase of the migration. | 
|  | Assume that each config option in the following list requires all of the | 
|  | previous options, unless otherwise noted. | 
|  |  | 
|  | - `noteDb.changes.write=true`: During a ReviewDb write, the state of the change | 
|  | in NoteDb is written to the `note_db_state` field in the `Change` entity. | 
|  | After the ReviewDb write, this state is written into NoteDb, resulting in | 
|  | effectively double the time for write operations. NoteDb write errors are | 
|  | dropped on the floor, and no attempt is made to read from ReviewDb or correct | 
|  | errors (without additional configuration, below). + | 
|  | This state allows for a rolling update in a multi-master setting, where some | 
|  | servers can start reading from NoteDb, but older servers are still reading | 
|  | only from ReviewDb. | 
|  | - `noteDb.changes.read=true`: Change data is written | 
|  | to and read from NoteDb, but ReviewDb is still the source of truth. During | 
|  | reads, first read the change from ReviewDb, and compare its `note_db_state` | 
|  | with what is in NoteDb. If it doesn't match, immediately "auto-rebuild" the | 
|  | change, copying data from ReviewDb to NoteDb and returning the result. | 
|  | - `noteDb.changes.primaryStorage=NOTE_DB`: New changes are written only to | 
|  | NoteDb, but changes whose primary storage is ReviewDb are still supported. | 
|  | Continues to read from ReviewDb first as in the previous stage, but if the | 
|  | change is not in ReviewDb, falls back to reading from NoteDb. + | 
|  | Migration of existing changes is described in the link:#migration[Migration] | 
|  | section below. + | 
|  | Due to an implementation detail, writes to Changes or related tables still | 
|  | result in write calls to the database layer, but they are inside a transaction | 
|  | that is always rolled back. | 
|  | - `noteDb.changes.disableReviewDb=true`: All access to Changes or related tables | 
|  | is disabled; reads return no results, and writes are no-ops. Assumes the state | 
|  | of all changes in NoteDb is accurate, and so is only safe once all changes are | 
|  | NoteDb primary. Otherwise, reading changes only from NoteDb might result in | 
|  | inaccurate results, and writing to NoteDb would compound the problem. + | 
|  | Thus it is up to an admin of a previously-ReviewDb site to ensure | 
|  | MigratePrimaryStorage has been run for all changes. Note that the current | 
|  | implementation of the `rebuild-note-db` program does not do this. + | 
|  | In this phase, it would be possible to delete the Changes tables out from | 
|  | under a running server with no effect. | 
|  | - `noteDb.changes.fuseUpdates=true`: Code and meta updates within a single | 
|  | repository are fused into a single atomic `BatchRefUpdate`, so they either | 
|  | all succeed or all fail. This mode is only possible on a backend that | 
|  | supports atomic ref updates, which notably excludes the default file repository | 
|  | backend. | 
|  |  | 
|  | [[migration]] | 
|  | == Migration | 
|  |  | 
|  | Once configuration options are set, migration to NoteDb is primarily | 
|  | accomplished by running the `rebuild-note-db` program. Currently, this program | 
|  | bulk copies ReviewDb data into NoteDb, but leaves primary storage of these | 
|  | changes in ReviewDb, so the site is runnable with | 
|  | `noteDb.changes.{write,read}=true`, but ReviewDb is still required. | 
|  |  | 
|  | Eventually, `rebuild-note-db` will set primary storage to NoteDb for all | 
|  | changes by default, so a site will be able to stop using ReviewDb for changes | 
|  | immediately after a successful run. | 
|  |  | 
|  | There is code in `PrimaryStorageMigrator.java` to migrate individual changes | 
|  | from NoteDb primary to ReviewDb primary. This code is not intended to be used | 
|  | except in the event of a critical bug in NoteDb primary changes in production. | 
|  | It will likely never be used by `rebuild-note-db`, and in fact it's not | 
|  | recommended to run `rebuild-note-db` until the code is stable enough that the | 
|  | reverse migration won't be necessary. | 
|  |  | 
|  | === Zero-Downtime Multi-Master Migration | 
|  |  | 
|  | Single-master Gerrit sites can use `rebuild-note-db` on an offline site to | 
|  | rebuild NoteDb, but this doesn't work in a zero-downtime environment like | 
|  | googlesource.com. | 
|  |  | 
|  | Here, the migration process looks like: | 
|  |  | 
|  | - Turn on `noteDb.changes.write=true` to start writing to NoteDb. | 
|  | - Run a parallel link:https://research.google.com/pubs/pub35650.html[FlumeJava] | 
|  | pipeline to write NoteDb data for all changes, and update all `note_db_state` | 
|  | fields. (Sorry, this implementation is entirely closed-source.) | 
|  | - Turn on `noteDb.changes.read=true` to start reading from NoteDb. | 
|  | - Turn on `noteDb.changes.primaryStorage=NOTE_DB` to start writing new changes | 
|  | to NoteDb only. | 
|  | - Run a Flume to migrate all existing changes to NoteDb primary. (Also | 
|  | closed-source, but basically just a wrapper around `PrimaryStorageMigrator`.) | 
|  | - Turn off access to ReviewDb changes tables. |