:linkattrs:
= Gerrit Code Review - Accounts

== Overview

Gerrit accounts are stored in link:note-db.html[NoteDb].

The account data consists of a sequence number (account ID), account
properties (full name, display name, preferred email, registration
date, status, inactive flag), preferences (general, diff and edit
preferences), project watches, SSH keys, external IDs, starred changes
and reviewed flags.

Most account data is stored in a special link:#all-users[All-Users]
repository, which has one branch per user. Within the user branch there
are Git config files for the link:#account-properties[
account properties], the link:#preferences[account preferences] and the
link:#project-watches[project watches]. In addition there is an
`authorized_keys` file for the link:#ssh-keys[SSH keys] that follows
the standard OpenSSH file format.

The account data in the user branch is versioned and the Git history of
this branch serves as an audit log.

The link:#external-ids[external IDs] are stored as Git Notes inside the
`All-Users` repository in the `refs/meta/external-ids` notes branch.
Storing all external IDs in a notes branch ensures that each external
ID is only used once.

The link:#starred-changes[starred changes] are represented as
independent refs in the `All-Users` repository. They are not stored in
the user branch, since this data doesn't need versioning.

The link:#reviewed-flags[reviewed flags] are not stored in Git, but are
persisted in a database table. This is because there is a high volume
of reviewed flags and storing them in Git would be inefficient.

Since accessing the account data in Git is not fast enough for account
queries, e.g. when suggesting reviewers, Gerrit has a
link:#account-index[secondary index for accounts].

[[all-users]]
== `All-Users` repository

The `All-Users` repository is a special repository that only contains
user-specific information. It contains one branch per user. The user
branch is formatted as `refs/users/CD/ABCD`, where `CD/ABCD` is the
link:access-control.html#sharded-user-id[sharded account ID], e.g. the
user branch for account `1000856` is `refs/users/56/1000856`. The
account IDs in the user refs are sharded so that there is a good
distribution of the Git data in the storage system.
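
The sharding rule can be illustrated with a minimal Python sketch. The
function name `user_ref` is hypothetical, and the zero-padding of the
shard for account IDs below 10 is an assumption that the text above
does not spell out:

```python
def user_ref(account_id: int) -> str:
    """Build the user branch name for an account ID.

    The shard is taken from the last two digits of the account ID.
    Zero-padding for IDs below 10 is an assumption of this sketch.
    """
    if account_id < 0:
        raise ValueError("account ID must be non-negative")
    shard = account_id % 100
    return f"refs/users/{shard:02d}/{account_id}"


# The example account from the text above.
print(user_ref(1000856))  # refs/users/56/1000856
```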

A user branch must exist for each account, as it represents the
account. The files in the user branch are all optional. This means
having a user branch with a tree that is completely empty is also a
valid account definition.

Updates to the user branch are done through the
link:rest-api-accounts.html[Gerrit REST API], but users can also
manually fetch their user branch and push changes back to Gerrit. On
push the user data is evaluated and invalid user data is rejected.

To hide the implementation detail of the sharded account ID in the ref
name, Gerrit offers a magic `refs/users/self` ref that is automatically
resolved to the user branch of the calling user. Users can then use
this ref to fetch from and push to their own user branch. E.g. if user
`1000856` pushes to `refs/users/self`, the branch
`refs/users/56/1000856` is updated. In Gerrit `self` is an established
term to refer to the calling user (e.g. in change queries). This is why
the magic ref for a user's own branch is called `refs/users/self`.

A user branch should only be readable and writeable by the user to whom
the account belongs. To assign permissions on the user branches the
normal branch permission system is used. In the permission system the
user branches are specified as `refs/users/${shardeduserid}`. The
`${shardeduserid}` variable is resolved to the sharded account ID. This
variable is used to assign default access rights on all user branches
that apply only to the owning user. The following permissions are set
by default when a Gerrit site is newly installed or upgraded to a
version which supports user branches:

.All-Users project.config
----
[access "refs/users/${shardeduserid}"]
  exclusiveGroupPermissions = read push submit
  read = group Registered Users
  push = group Registered Users
  label-Code-Review = -2..+2 group Registered Users
  submit = group Registered Users
----

The user branch contains several files with account data which are
described link:#account-data-in-user-branch[below].

In addition to the user branches the `All-Users` repository also
contains a branch for the link:#external-ids[external IDs] and special
refs for the link:#starred-changes[starred changes].

The next available value of the link:#account-sequence[account
sequence] is also stored in the `All-Users` repository.

[[account-index]]
== Account Index

There are several situations in which Gerrit needs to query accounts,
e.g.:

* For sending email notifications to project watchers.
* For reviewer suggestions.

Accessing the account data in Git is not fast enough for account
queries, since it requires accessing all user branches and parsing
all files in each of them. To overcome this Gerrit has a secondary
index for accounts. The account index is based on either
link:config-gerrit.html#index.type[Lucene or Elasticsearch].

link:user-search-accounts.html[Generic account queries] are supported
via the link:rest-api-accounts.html#query-account[Query Account] REST
endpoint.
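
As a rough illustration of querying the index from a client, the
following Python sketch builds a Query Account URL and decodes a
response body. The `q` and `n` parameters and the `)]}'` prefix that
Gerrit puts in front of JSON responses come from the REST API
documentation, not from this page. The host name and the helper names
are hypothetical:

```python
import json
from urllib.parse import urlencode

# Gerrit prefixes JSON responses with this line to prevent XSSI attacks.
XSSI_PREFIX = ")]}'"


def account_query_url(base_url: str, query: str, limit: int = 10) -> str:
    """Build the URL for the Query Account endpoint."""
    params = urlencode({"q": query, "n": limit})
    return f"{base_url.rstrip('/')}/accounts/?{params}"


def parse_gerrit_json(body: str):
    """Strip the XSSI prefix, if present, and decode the JSON payload."""
    if body.startswith(XSSI_PREFIX):
        body = body[len(XSSI_PREFIX):]
    return json.loads(body)


# gerrit.example.com is a placeholder host.
print(account_query_url("https://gerrit.example.com", "name:John", 2))
```

The sketch leaves out authentication and the HTTP request itself, which
depend on how the Gerrit site is set up.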

Accounts are automatically reindexed on any update. The
link:rest-api-accounts.html#index-account[Index Account] REST endpoint
allows an account to be reindexed manually. In addition the
link:pgm-reindex.html[reindex] program can be used to reindex all
accounts offline.

[[account-data-in-user-branch]]
== Account Data in User Branch

A user branch contains several Git config files with the account data:

* `account.config`:
+
Stores the link:#account-properties[account properties].

* `preferences.config`:
+
Stores the link:#preferences[user preferences] of the account.

* `watch.config`:
+
Stores the link:#project-watches[project watches] of the account.

In addition it contains an
link:https://en.wikibooks.org/wiki/OpenSSH/Client_Configuration_Files#.7E.2F.ssh.2Fauthorized_keys[
authorized_keys,role=external,window=_blank] file with the link:#ssh-keys[SSH keys] of the account.

[[account-properties]]
=== Account Properties

The account properties are stored in the user branch in the
`account.config` file:

----
[account]
  fullName = John Doe
  displayName = John
  preferredEmail = john.doe@example.com
  status = OOO
  active = false
----

For active accounts the `active` parameter can be omitted.

The registration date is not contained in the `account.config` file but
is derived from the timestamp of the first commit on the user branch.

When users update their account properties by pushing to the user
branch, it is verified that the preferred email exists in the external
IDs.

Users are not allowed to flip the active value themselves; only
administrators and users with the
link:access-control.html#capability_modifyAccount[Modify Account]
global capability are allowed to change it.

Since all data in the `account.config` file is optional the
`account.config` file may be absent from some user branches.

[[preferences]]
=== Preferences

The account preferences are stored in the user branch in the
`preferences.config` file. There are separate sections for
link:intro-user.html#preferences[general],
link:user-review-ui.html#diff-preferences[diff] and edit preferences:

----
[diff]
  hideTopMenu = true
[edit]
  lineLength = 80
----

The parameter names match the names that are used in the preferences REST API:

* link:rest-api-accounts.html#preferences-info[General Preferences]
* link:rest-api-accounts.html#diff-preferences-info[Diff Preferences]
* link:rest-api-accounts.html#edit-preferences-info[Edit Preferences]

If the value for a preference is the same as the default value for this
preference, it can be omitted in the `preferences.config` file.

Defaults for preferences that apply for all accounts can be configured
in the `refs/users/default` branch in the `All-Users` repository.

[[project-watches]]
=== Project Watches

Users can configure watches on projects to receive email notifications
for changes of that project.

A watch configuration consists of the project name and an optional
filter query. If a filter query is specified, email notifications will
be sent only for changes of that project that match this query.

In addition, each watch configuration can contain a list of
notification types that determine for which events email notifications
should be sent. E.g. a user can configure that email notifications
should only be sent if a new patch set is uploaded and when the change
gets submitted, but not on other events.

Project watches are stored in a `watch.config` file in the user branch:

----
[project "foo"]
  notify = * [ALL_COMMENTS]
  notify = branch:master [ALL_COMMENTS, NEW_PATCHSETS]
  notify = branch:master owner:self [SUBMITTED_CHANGES]
----

The `watch.config` file has one project section for all project watches
of a project. The project name is used as subsection name and the
filters with the notification types, that decide for which events email
notifications should be sent, are represented as `notify` values in the
subsection. A `notify` value is formatted as
"<filter> [<comma-separated-list-of-notification-types>]". The
supported notification types are described in the
link:user-notify.html#notify.name.type[Email Notifications documentation].

For a change event, a notification will be sent if any `notify` value
of the corresponding project has both a filter that matches the change
and a notification type that matches the event.
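
The `notify` format and the matching rule can be sketched in Python as
follows. This is an illustration only, not Gerrit's implementation:
the names `Notify`, `parse_notify` and `should_notify` are
hypothetical, and the evaluation of the filter query against a change
is left to a caller-supplied function:

```python
import re
from typing import Callable, Iterable, List, NamedTuple


class Notify(NamedTuple):
    filter: str
    types: List[str]


# "<filter> [<comma-separated-list-of-notification-types>]"
_NOTIFY_RE = re.compile(r"^(?P<filter>.*?)\s*\[(?P<types>[^\]]*)\]\s*$")


def parse_notify(value: str) -> Notify:
    """Split a notify value into its filter and its notification types."""
    match = _NOTIFY_RE.match(value.strip())
    if match is None:
        raise ValueError(f"malformed notify value: {value!r}")
    types = [t.strip() for t in match.group("types").split(",") if t.strip()]
    return Notify(match.group("filter"), types)


def should_notify(
    notify_values: Iterable[str],
    event_type: str,
    filter_matches: Callable[[str], bool],
) -> bool:
    """True if any value has a matching type and a matching filter."""
    for value in notify_values:
        notify = parse_notify(value)
        if event_type in notify.types and filter_matches(notify.filter):
            return True
    return False


values = [
    "* [ALL_COMMENTS]",
    "branch:master [ALL_COMMENTS, NEW_PATCHSETS]",
    "branch:master owner:self [SUBMITTED_CHANGES]",
]
# Pretend that every filter matches the change under consideration.
print(should_notify(values, "NEW_PATCHSETS", lambda query: True))
```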

In order to send email notifications on change events, Gerrit needs to
find all accounts that watch the corresponding project. To make this
lookup fast the secondary account index is used. The account index
contains a repeated field that stores the projects that are being
watched by an account. After the accounts that watch the project have
been retrieved from the index, the complete watch configuration is
available from the account cache and Gerrit can check if any watch
matches the change and the event.

[[ssh-keys]]
=== SSH Keys

SSH keys are stored in the user branch in an `authorized_keys` file,
which is the
link:https://en.wikibooks.org/wiki/OpenSSH/Client_Configuration_Files#.7E.2F.ssh.2Fauthorized_keys[
standard OpenSSH file format,role=external,window=_blank] for storing SSH keys:

----
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAAAgQCgug5VyMXQGnem2H1KVC4/HcRcD4zzBqSuJBRWVonSSoz3RoAZ7bWXCVVGwchtXwUURD689wFYdiPecOrWOUgeeyRq754YWRhU+W28vf8IZixgjCmiBhaL2gt3wff6pP+NXJpTSA4aeWE5DfNK5tZlxlSxqkKOS8JRSUeNQov5Tw== john.doe@example.com
# DELETED
# INVALID ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAAAgQDm5yP7FmEoqzQRDyskX+9+N0q9GrvZeh5RG52EUpE4ms/Ujm3ewV1LoGzc/lYKJAIbdcZQNJ9+06EfWZaIRA3oOwAPe1eCnX+aLr8E6Tw2gDMQOGc5e9HfyXpC2pDvzauoZNYqLALOG3y/1xjo7IH8GYRS2B7zO/Mf9DdCcCKSfw== john.doe@example.com
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAAAgQCaS7RHEcZ/zjl9hkWkqnm29RNr2OQ/TZ5jk2qBVMH3BgzPsTsEs+7ag9tfD8OCj+vOcwm626mQBZoR2e3niHa/9gnHBHFtOrGfzKbpRjTWtiOZbB9HF+rqMVD+Dawo/oicX/dDg7VAgOFSPothe6RMhbgWf84UcK5aQd5eP5y+tQ== john.doe@example.com
----

When the SSH API is used, Gerrit needs an efficient way to look up SSH
keys by username. Since the username can be easily resolved to an
account ID (via the account cache), accessing the SSH keys in the user
branch is fast.

To identify SSH keys in the REST API Gerrit uses
link:rest-api-accounts.html#ssh-key-id[sequence numbers per account].
This is why the order of the keys in the `authorized_keys` file is
used to determine the sequence numbers of the keys (the sequence
numbers start at 1).

To keep the sequence numbers intact when a key is deleted, a
'# DELETED' line is inserted at the position where the key was deleted.

Invalid keys are marked with the prefix '# INVALID'.
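
The numbering scheme can be made concrete with a small Python sketch
that assigns sequence numbers while reading an `authorized_keys` file.
The names are hypothetical. Skipping blank lines and other comment
lines without consuming a sequence number is an assumption of this
sketch, not something stated above:

```python
from typing import Dict, NamedTuple

DELETED_MARKER = "# DELETED"
INVALID_PREFIX = "# INVALID"


class SshKey(NamedTuple):
    seq: int
    key: str
    valid: bool


def parse_authorized_keys(text: str) -> Dict[int, SshKey]:
    """Map sequence numbers (starting at 1) to the keys still present."""
    keys: Dict[int, SshKey] = {}
    seq = 0
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # assumption: blank lines carry no sequence number
        if line == DELETED_MARKER:
            seq += 1  # the deleted key keeps its slot
        elif line.startswith(INVALID_PREFIX):
            seq += 1
            keys[seq] = SshKey(seq, line[len(INVALID_PREFIX):].strip(), False)
        elif line.startswith("#"):
            continue  # assumption: other comments carry no sequence number
        else:
            seq += 1
            keys[seq] = SshKey(seq, line, True)
    return keys


# Shortened placeholder keys, same layout as the example file above.
sample = """\
ssh-rsa AAAA1 john.doe@example.com
# DELETED
# INVALID ssh-rsa AAAA3 john.doe@example.com
ssh-rsa AAAA4 john.doe@example.com
"""
for key in parse_authorized_keys(sample).values():
    print(key.seq, "valid" if key.valid else "invalid")
```

With the layout of the example file, the remaining keys keep the
sequence numbers 1, 3 and 4, because the `# DELETED` line still
occupies slot 2.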

[[external-ids]]
== External IDs

External IDs are used to link identities, such as the username and email
addresses, and external identities such as an LDAP account or an OAuth
identity, to an account in Gerrit.

External IDs are stored as Git Notes in the `All-Users` repository. The
name of the notes branch is `refs/meta/external-ids`.

The SHA-1 of the external ID key is used as the note key, for example the note key
for the external ID `username:jdoe` is `e0b751ae90ef039f320e097d7d212f490e933706`.
This ensures that an external ID is used only once (e.g. an external ID can
never be assigned to multiple accounts at a point in time).

By default, the SHA-1 sum is computed preserving the case of the external ID. If
`auth.userNameCaseInsensitive` is set to `true`, the SHA-1 sum of external IDs
in the `gerrit:` and `username:` schemes is computed from the all-lowercase
external ID. This enables case-insensitive username handling. The case of the
external ID is however preserved by using the original capitalization in the
note content.

The following commands show how to find the SHA-1 of an external ID:

----
$ echo -n 'gerrit:jdoe' | shasum
7c2a55657d911109dbc930836e7a770fb946e8ef  -

$ echo -n 'username:jdoe' | shasum
e0b751ae90ef039f320e097d7d212f490e933706  -
----
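
The same computation, including the lowercasing that applies when
`auth.userNameCaseInsensitive` is enabled, could look like this in
Python. The function name `note_key` and its boolean parameter are
hypothetical, and the sketch follows the description above rather than
Gerrit's source code:

```python
import hashlib

# Schemes whose IDs are lowercased before hashing when
# auth.userNameCaseInsensitive is set to true.
CASE_INSENSITIVE_SCHEMES = ("gerrit:", "username:")


def note_key(external_id: str, case_insensitive_usernames: bool = False) -> str:
    """Return the SHA-1 note key for an external ID key."""
    key = external_id
    if case_insensitive_usernames and key.startswith(CASE_INSENSITIVE_SCHEMES):
        key = key.lower()
    return hashlib.sha1(key.encode("utf-8")).hexdigest()


# Compare the result with the shasum output for the same input.
print(note_key("username:jdoe"))
# With case-insensitive usernames, both spellings map to the same note key.
print(note_key("username:JDoe", True) == note_key("username:jdoe", True))
```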

[IMPORTANT]
If the external ID key is changed manually you must adapt the note key
to the new SHA-1, otherwise the external ID becomes inconsistent and is
ignored by Gerrit.

The note content is a Git config file:

----
[externalId "username:jdoe"]
  accountId = 1003407
  email = jdoe@example.com
  password = bcrypt:4:LCbmSBDivK/hhGVQMfkDpA==:XcWn0pKYSVU/UJgOvhidkEtmqCp6oKB7
----

Once the SHA-1 of an external ID is known the following command can be used to
show the content of the note:

----
$ echo -n 'username:jdoe' | shasum
e0b751ae90ef039f320e097d7d212f490e933706  -

$ git show refs/meta/external-ids:e0/b751ae90ef039f320e097d7d212f490e933706
[externalId "username:jdoe"]
  accountId = 1003407
  email = jdoe@example.com
  password = bcrypt:4:LCbmSBDivK/hhGVQMfkDpA==:XcWn0pKYSVU/UJgOvhidkEtmqCp6oKB7
----

The config file has one `externalId` section. The external ID key, which
consists of scheme and ID in the format '<scheme>:<id>', is used as
subsection name.

The `accountId` field is mandatory. The `email` and `password` fields
are optional.

Note that Git will automatically nest these notes at varying levels. If
`refs/meta/external-ids:e0/b751ae90ef039f320e097d7d212f490e933706` is not
found then check
`refs/meta/external-ids:e0/b7/51ae90ef039f320e097d7d212f490e933706` and
so on.
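
Because the nesting level is not known in advance, a script can simply
try the possible paths in turn. The following Python sketch lists
candidate paths for a note key. The function name and the default
maximum depth are hypothetical choices for illustration:

```python
from typing import List


def candidate_note_paths(sha1: str, max_fanout: int = 2) -> List[str]:
    """List tree paths for a note key, from flat to max_fanout levels deep."""
    paths = []
    for fanout in range(max_fanout + 1):
        # Each nesting level splits off two more hex characters.
        parts = [sha1[2 * i:2 * i + 2] for i in range(fanout)]
        parts.append(sha1[2 * fanout:])
        paths.append("/".join(parts))
    return paths


# Note key of the external ID username:jdoe.
for path in candidate_note_paths("e0b751ae90ef039f320e097d7d212f490e933706"):
    print(f"refs/meta/external-ids:{path}")
```

Each printed value can be passed to `git show` until one of them
resolves.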

The external IDs are maintained by Gerrit. This means users are not
allowed to manually edit their external IDs. Only users with the
link:access-control.html#capability_accessDatabase[Access Database]
global capability can push updates to the `refs/meta/external-ids`
branch. However Gerrit rejects pushes if:

* any external ID config file cannot be parsed
* a note key does not match the SHA-1 of the external ID key in the
  note content
* external IDs for non-existing accounts are contained
* invalid emails are contained
* any email is not unique (the same email is assigned to multiple
  accounts)
* hashed passwords of external IDs with scheme `username` cannot be
  decoded

Users can edit some external IDs via the user settings page or the
REST API. Note that email addresses cannot be deleted if they are
associated with the external ID of the user's login credentials, for
example the email address associated with an OpenID or OAuth external
ID. Users who wish to remove these email addresses from Gerrit must
first update the external authentication record in that system and
then log in to Gerrit, at which point Gerrit updates the external ID
record with the new email address.

[[starred-changes]]
== Starred Changes

link:dev-stars.html[Starred changes] allow users to mark changes as
favorites and receive email notifications for them.

Each starred change is a tuple of an account ID, a change ID and a
label.

To keep track of a change that is starred by an account, Gerrit creates
a `refs/starred-changes/YY/XXXX/ZZZZZZZ` ref in the `All-Users`
repository, where `YY/XXXX` is the sharded numeric change ID and
`ZZZZZZZ` is the account ID.

A starred-changes ref points to a blob that contains the list of labels
that the account set on the change. The label list is stored as UTF-8
text with one label per line.

Since JGit has explicit optimizations for looking up refs by prefix
when the prefix ends with '/', this ref format is optimized to find
starred changes by change ID. Finding starred changes by change ID is
e.g. needed when a change is updated so that all users that have
the link:dev-stars.html#default-star[default star] on the change can be
notified by email.
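
The ref layout can be sketched in Python as follows. The helper names
are hypothetical, and the sketch assumes that change IDs are sharded
by their last two digits in the same way as account IDs:

```python
from typing import List


def shard(numeric_id: int) -> str:
    """Last two digits of the ID, zero-padded (assumed sharding rule)."""
    return f"{numeric_id % 100:02d}"


def starred_changes_prefix(change_id: int) -> str:
    """Ref prefix under which all stars of one change are found."""
    return f"refs/starred-changes/{shard(change_id)}/{change_id}/"


def starred_changes_ref(change_id: int, account_id: int) -> str:
    """Ref that holds the star labels of one account on one change."""
    return f"{starred_changes_prefix(change_id)}{account_id}"


def parse_labels(blob: bytes) -> List[str]:
    """Decode the label blob: UTF-8 text with one label per line."""
    return [line for line in blob.decode("utf-8").splitlines() if line]


print(starred_changes_ref(1234, 1000856))
print(parse_labels(b"star\nignore\n"))
```

Because the prefix ends with `/`, listing all refs below
`starred_changes_prefix(change_id)` yields one ref per account that
starred the change.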

Gerrit also needs an efficient way to find all changes that were
starred by an account, e.g. to provide results for the
link:user-search.html#is-starred[is:starred] query operator. With the
ref format as described above the lookup of starred changes by account
ID is expensive, as this requires a scan of the full
`refs/starred-changes/*` namespace. To overcome this the users that
have starred a change are stored in the change index together with the
star labels.

[[reviewed-flags]]
== Reviewed Flags

When reviewing a patch set in the Gerrit UI, the reviewer can mark
files in the patch set as reviewed. These markers are called ‘Reviewed
Flags’ and are private to the user. A reviewed flag is a tuple of patch
set ID, file and account ID.

Each user can have many thousands of reviewed flags and over time the
number can grow without bounds.

The large number of reviewed flags makes storage in Git unsuitable
because each update requires opening the repository and committing a
change, which is a high overhead for flipping a bit. Therefore the
reviewed flags are stored in a database table. By default they are
stored in a local H2 database, but there is an extension point that
allows alternate implementations for storing the reviewed flags to be
plugged in. To replace the storage for reviewed flags a plugin needs to
implement the link:dev-plugins.html#account-patch-review-store[
AccountPatchReviewStore] interface. E.g. to support a cluster setup with
multiple primary servers handling write operations, where reviewed flags should
be replicated between the primary nodes, one could implement a store for the
reviewed flags that is based on MySQL with replication.

[[account-sequence]]
== Account Sequence

The next available account sequence number is stored as UTF-8 text in a
blob pointed to by the `refs/sequences/accounts` ref in the `All-Users`
repository.

Multiple processes share the same sequence by incrementing the counter
using normal Git ref updates. To amortize the cost of these ref
updates, processes increment the counter by a larger number and hand
out numbers from that range in memory until they run out. The size of
the account ID batch that each process retrieves at once is controlled
by the link:config-gerrit.html#notedb.accounts.sequenceBatchSize[
notedb.accounts.sequenceBatchSize] parameter in the `gerrit.config`
file.
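
The batching scheme can be modelled with a short Python sketch. It is
a simplified model, not Gerrit's implementation: the class `RefStore`
is an in-memory stand-in for the `refs/sequences/accounts` ref, with a
compare-and-set operation playing the role of an atomic ref update,
and all names are hypothetical:

```python
import threading


class RefStore:
    """In-memory stand-in for the blob behind refs/sequences/accounts."""

    def __init__(self, next_value: int) -> None:
        self._value = next_value
        self._lock = threading.Lock()

    def read(self) -> int:
        with self._lock:
            return self._value

    def compare_and_set(self, expected: int, new: int) -> bool:
        """Update the value only if nobody else changed it in between."""
        with self._lock:
            if self._value != expected:
                return False
            self._value = new
            return True


class BatchedSequence:
    """Hands out IDs from a batch reserved in the shared store.

    One instance models one process; it is not itself thread-safe.
    """

    def __init__(self, store: RefStore, batch_size: int) -> None:
        self._store = store
        self._batch_size = batch_size
        self._next = 0
        self._limit = 0  # exclusive upper bound; the first batch is empty

    def _acquire_batch(self) -> None:
        while True:  # retry if another process won the ref update
            start = self._store.read()
            if self._store.compare_and_set(start, start + self._batch_size):
                self._next, self._limit = start, start + self._batch_size
                return

    def next(self) -> int:
        if self._next >= self._limit:
            self._acquire_batch()
        value = self._next
        self._next += 1
        return value


store = RefStore(1000000)
process_a = BatchedSequence(store, batch_size=3)
process_b = BatchedSequence(store, batch_size=3)
print(process_a.next(), process_b.next(), process_a.next())
```

Each process touches the shared store only once per batch, which is
the amortization described above. The trade-off is that IDs are not
handed out in strictly increasing order across processes, and numbers
left in a batch when a process stops are never used.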

[[replication]]
== Replication

To replicate account data the following refs from the `All-Users`
repository must be replicated:

* `refs/users/*` (user branches)
* `refs/meta/external-ids` (external IDs)
* `refs/starred-changes/*` (star labels)
* `refs/sequences/accounts` (account sequence numbers, not needed for Gerrit
  replicas)

GERRIT
------
Part of link:index.html[Gerrit Code Review]

SEARCHBOX
---------