= Gerrit Code Review - Accounts

== Overview

Starting with version 2.15, Gerrit accounts are fully stored in
link:note-db.html[NoteDb].

The account data consists of a sequence number (account ID), account
properties (full name, preferred email, registration date, status,
inactive flag), preferences (general, diff and edit preferences),
project watches, SSH keys, external IDs, starred changes and reviewed
flags.

Most account data is stored in a special link:#all-users[All-Users]
repository, which has one branch per user. Within the user branch there
are Git config files for the link:#account-properties[
account properties], the link:#preferences[account preferences] and the
link:#project-watches[project watches]. In addition there is an
`authorized_keys` file for the link:#ssh-keys[SSH keys] that follows
the standard OpenSSH file format.

The account data in the user branch is versioned and the Git history of
this branch serves as an audit log.

The link:#external-ids[external IDs] are stored as Git Notes inside the
`All-Users` repository in the `refs/meta/external-ids` notes branch.
Storing all external IDs in a notes branch ensures that each external
ID is only used once.

The link:#starred-changes[starred changes] are represented as
independent refs in the `All-Users` repository. They are not stored in
the user branch, since this data doesn't need versioning.

The link:#reviewed-flags[reviewed flags] are not stored in Git, but are
persisted in a database table. This is because there is a high volume
of reviewed flags and storing them in Git would be inefficient.

Since accessing the account data in Git is not fast enough for account
queries, e.g. when suggesting reviewers, Gerrit has a
link:#account-index[secondary index for accounts].

[[all-users]]
== `All-Users` repository

The `All-Users` repository is a special repository that only contains
user-specific information. It contains one branch per user. The user
branch is formatted as `refs/users/CD/ABCD`, where `CD/ABCD` is the
link:access-control.html#sharded-user-id[sharded account ID], e.g. the
user branch for account `1000856` is `refs/users/56/1000856`. The
account IDs in the user refs are sharded so that there is a good
distribution of the Git data in the storage system.
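
As a minimal sketch, the ref name for an account can be derived as shown
below. The helper name `user_ref` is hypothetical, and the sketch assumes
that the shard is the last two digits of the account ID, zero-padded for
IDs whose remainder is below 10 (this is inferred from the example above,
not spelled out in this document).

```python
def user_ref(account_id: int) -> str:
    """Return the user branch name for a numeric account ID.

    Assumption: the shard is the account ID modulo 100, always written
    with two digits.
    """
    if account_id < 0:
        raise ValueError("account ID must be non-negative")
    shard = account_id % 100
    return f"refs/users/{shard:02d}/{account_id}"


if __name__ == "__main__":
    print(user_ref(1000856))
```

Keeping the full account ID as the last path segment means the ref stays
readable even though the shard prefix is redundant.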

A user branch must exist for each account, as it represents the
account. The files in the user branch are all optional. This means
having a user branch with a tree that is completely empty is also a
valid account definition.

Updates to the user branch are done through the
link:rest-api-accounts.html[Gerrit REST API], but users can also
manually fetch their user branch and push changes back to Gerrit. On
push the user data is evaluated and invalid user data is rejected.

To hide the implementation detail of the sharded account ID in the ref
name Gerrit offers a magic `refs/users/self` ref that is automatically
resolved to the user branch of the calling user. The user can then use
this ref to fetch from and push to their own user branch. E.g. if user
`1000856` pushes to `refs/users/self`, the branch
`refs/users/56/1000856` is updated. In Gerrit `self` is an established
term to refer to the calling user (e.g. in change queries). This is why
the magic ref for the user's own branch is called `refs/users/self`.

A user branch should only be readable and writable by the user to whom
the account belongs. To assign permissions on the user branches the
normal branch permission system is used. In the permission system the
user branches are specified as `refs/users/${shardeduserid}`. The
`${shardeduserid}` variable is resolved to the sharded account ID. This
variable is used to assign default access rights on all user branches
that apply only to the owning user. The following permissions are set
by default when a Gerrit site is newly installed or upgraded to a
version which supports user branches:

.All-Users project.config
----
[access "refs/users/${shardeduserid}"]
  exclusiveGroupPermissions = read push submit
  read = group Registered Users
  push = group Registered Users
  label-Code-Review = -2..+2 group Registered Users
  submit = group Registered Users
----

The user branch contains several files with account data which are
described link:#account-data-in-user-branch[below].

In addition to the user branches the `All-Users` repository also
contains a branch for the link:#external-ids[external IDs] and special
refs for the link:#starred-changes[starred changes].

Also the next available value of the link:#account-sequence[account
sequence] is stored in the `All-Users` repository.

[[account-index]]
== Account Index

There are several situations in which Gerrit needs to query accounts,
e.g.:

* For sending email notifications to project watchers.
* For reviewer suggestions.

Accessing the account data in Git is not fast enough for account
queries, since it requires accessing all user branches and parsing
all files in each of them. To overcome this Gerrit has a secondary
index for accounts. The account index is based on either
link:config-gerrit.html#index.type[Lucene or Elasticsearch].

Via the link:rest-api-accounts.html#query-account[Query Account] REST
endpoint link:user-search-accounts.html[generic account queries] are
supported.

Accounts are automatically reindexed on any update. The
link:rest-api-accounts.html#index-account[Index Account] REST endpoint
allows an account to be reindexed manually. In addition the
link:pgm-reindex.html[reindex] program can be used to reindex all
accounts offline.

[[account-data-in-user-branch]]
== Account Data in User Branch

A user branch contains several Git config files with the account data:

* `account.config`:
+
Stores the link:#account-properties[account properties].

* `preferences.config`:
+
Stores the link:#preferences[user preferences] of the account.

* `watch.config`:
+
Stores the link:#project-watches[project watches] of the account.

In addition it contains an
link:https://en.wikibooks.org/wiki/OpenSSH/Client_Configuration_Files#.7E.2F.ssh.2Fauthorized_keys[
authorized_keys] file with the link:#ssh-keys[SSH keys] of the account.

[[account-properties]]
=== Account Properties

The account properties are stored in the user branch in the
`account.config` file:

----
[account]
  fullName = John Doe
  preferredEmail = john.doe@example.com
  status = OOO
  active = false
----

For active accounts the `active` parameter can be omitted.

The registration date is not contained in the `account.config` file but
is derived from the timestamp of the first commit on the user branch.

When users update their account properties by pushing to the user
branch, it is verified that the preferred email exists in the external
IDs.

Users are not allowed to flip the active value themselves; only
administrators and users with the
link:access-control.html#capability_modifyAccount[Modify Account]
global capability are allowed to change it.

Since all data in the `account.config` file is optional the
`account.config` file may be absent from some user branches.

[[preferences]]
=== Preferences

The account preferences are stored in the user branch in the
`preferences.config` file. There are separate sections for
link:intro-user.html#preferences[general],
link:user-review-ui.html#diff-preferences[diff] and edit preferences:

----
[diff]
  hideTopMenu = true
[edit]
  lineLength = 80
----

The parameter names match the names that are used in the preferences REST API:

* link:rest-api-accounts.html#preferences-info[General Preferences]
* link:rest-api-accounts.html#diff-preferences-info[Diff Preferences]
* link:rest-api-accounts.html#edit-preferences-info[Edit Preferences]

If the value for a preference is the same as the default value for this
preference, it can be omitted in the `preferences.config` file.

Defaults for preferences that apply to all accounts can be configured
in the `refs/users/default` branch in the `All-Users` repository.

[[project-watches]]
=== Project Watches

Users can configure watches on projects to receive email notifications
for changes of that project.

A watch configuration consists of the project name and an optional
filter query. If a filter query is specified, email notifications will
be sent only for changes of that project that match this query.

In addition, each watch configuration can contain a list of
notification types that determine for which events email notifications
should be sent. E.g. a user can configure that email notifications
should only be sent if a new patch set is uploaded and when the change
gets submitted, but not on other events.

Project watches are stored in a `watch.config` file in the user branch:

----
[project "foo"]
  notify = * [ALL_COMMENTS]
  notify = branch:master [ALL_COMMENTS, NEW_PATCHSETS]
  notify = branch:master owner:self [SUBMITTED_CHANGES]
----

The `watch.config` file has one project section for all project watches
of a project. The project name is used as subsection name. The filters,
together with the notification types that decide for which events email
notifications should be sent, are represented as `notify` values in the
subsection. A `notify` value is formatted as
"<filter> [<comma-separated-list-of-notification-types>]". The
supported notification types are described in the
link:user-notify.html#notify.name.type[Email Notifications documentation].

For a change event, a notification will be sent if any `notify` value
of the corresponding project has both a filter that matches the change
and a notification type that matches the event.
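
The `notify` format and the matching rule can be illustrated with a
small sketch. This is not Gerrit's implementation: the names `Notify`,
`parse_notify` and `should_notify` are hypothetical, and evaluating the
filter query is left to a caller-supplied function because query
evaluation is outside the scope of this example.

```python
import re
from typing import Callable, Iterable, NamedTuple


class Notify(NamedTuple):
    filter: str        # the filter query exactly as written in watch.config
    types: frozenset   # notification type names listed in the brackets


# "<filter> [<comma-separated-list-of-notification-types>]"
_NOTIFY_RE = re.compile(r"^(?P<filter>.*?)\s*\[(?P<types>[^\]]*)\]\s*$")


def parse_notify(value: str) -> Notify:
    """Split one notify value into its filter and notification types."""
    match = _NOTIFY_RE.match(value.strip())
    if match is None:
        raise ValueError(f"malformed notify value: {value!r}")
    types = frozenset(
        t.strip() for t in match.group("types").split(",") if t.strip()
    )
    return Notify(match.group("filter"), types)


def should_notify(
    values: Iterable[str],
    event_type: str,
    filter_matches: Callable[[str], bool],
) -> bool:
    """True if any notify value matches both the change and the event."""
    return any(
        event_type in n.types and filter_matches(n.filter)
        for n in map(parse_notify, values)
    )
```

Note that both conditions must hold for the same `notify` value; a
matching filter on one line and a matching type on another line are not
enough.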

In order to send email notifications on change events, Gerrit needs to
find all accounts that watch the corresponding project. To make this
lookup fast the secondary account index is used. The account index
contains a repeated field that stores the projects that are being
watched by an account. After the accounts that watch the project have
been retrieved from the index, the complete watch configuration is
available from the account cache and Gerrit can check if any watch
matches the change and the event.

[[ssh-keys]]
=== SSH Keys

SSH keys are stored in the user branch in an `authorized_keys` file,
which is the
link:https://en.wikibooks.org/wiki/OpenSSH/Client_Configuration_Files#.7E.2F.ssh.2Fauthorized_keys[
standard OpenSSH file format] for storing SSH keys:

----
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAAAgQCgug5VyMXQGnem2H1KVC4/HcRcD4zzBqSuJBRWVonSSoz3RoAZ7bWXCVVGwchtXwUURD689wFYdiPecOrWOUgeeyRq754YWRhU+W28vf8IZixgjCmiBhaL2gt3wff6pP+NXJpTSA4aeWE5DfNK5tZlxlSxqkKOS8JRSUeNQov5Tw== john.doe@example.com
# DELETED
# INVALID ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAAAgQDm5yP7FmEoqzQRDyskX+9+N0q9GrvZeh5RG52EUpE4ms/Ujm3ewV1LoGzc/lYKJAIbdcZQNJ9+06EfWZaIRA3oOwAPe1eCnX+aLr8E6Tw2gDMQOGc5e9HfyXpC2pDvzauoZNYqLALOG3y/1xjo7IH8GYRS2B7zO/Mf9DdCcCKSfw== john.doe@example.com
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAAAgQCaS7RHEcZ/zjl9hkWkqnm29RNr2OQ/TZ5jk2qBVMH3BgzPsTsEs+7ag9tfD8OCj+vOcwm626mQBZoR2e3niHa/9gnHBHFtOrGfzKbpRjTWtiOZbB9HF+rqMVD+Dawo/oicX/dDg7VAgOFSPothe6RMhbgWf84UcK5aQd5eP5y+tQ== john.doe@example.com
----

When the SSH API is used, Gerrit needs an efficient way to look up SSH
keys by username. Since the username can be easily resolved to an
account ID (via the account cache), accessing the SSH keys in the user
branch is fast.

To identify SSH keys in the REST API Gerrit uses
link:rest-api-accounts.html#ssh-key-id[sequence numbers per account].
The sequence number of a key is determined by its position in the
`authorized_keys` file (the sequence numbers start at 1).

To keep the sequence numbers intact when a key is deleted, a
'# DELETED' line is inserted at the position where the key was deleted.

Invalid keys are marked with the prefix '# INVALID'.
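
The numbering rules above can be sketched as follows. The names
`SshKey` and `parse_authorized_keys` are hypothetical, and the handling
of blank lines and of other comment lines (skipped without consuming a
sequence number) is an assumption of this sketch rather than something
this document specifies.

```python
from typing import List, NamedTuple, Optional

DELETED = "# DELETED"
INVALID_PREFIX = "# INVALID"


class SshKey(NamedTuple):
    seq: int             # 1-based sequence number within the account
    key: Optional[str]   # None for the placeholder of a deleted key
    valid: bool


def parse_authorized_keys(text: str) -> List[SshKey]:
    """Assign sequence numbers to the entries of an authorized_keys file."""
    keys: List[SshKey] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        seq = len(keys) + 1
        if line == DELETED:
            # The placeholder keeps later sequence numbers stable.
            keys.append(SshKey(seq, None, False))
        elif line.startswith(INVALID_PREFIX):
            keys.append(SshKey(seq, line[len(INVALID_PREFIX):].strip(), False))
        elif line.startswith("#"):
            continue  # assumption: other comments are ignored
        else:
            keys.append(SshKey(seq, line, True))
    return keys
```

Applied to the example file above, the two valid keys would get the
sequence numbers 1 and 4, while 2 is the deleted slot and 3 is the
invalid key.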

[[external-ids]]
== External IDs

External IDs are used to link identities, such as the username and email
addresses, and external identities such as an LDAP account or an OAuth
identity, to an account in Gerrit.

External IDs are stored as Git Notes in the `All-Users` repository. The
name of the notes branch is `refs/meta/external-ids`.

The note key is the SHA-1 of the external ID key, for example the note key
for the external ID `username:jdoe` is `e0b751ae90ef039f320e097d7d212f490e933706`.
This ensures that an external ID is used only once (e.g. an external ID can
never be assigned to multiple accounts at a point in time).

[IMPORTANT]
If the external ID key is changed manually you must adapt the note key
to the new SHA-1, otherwise the external ID becomes inconsistent and is
ignored by Gerrit.
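
A minimal sketch of how a note key could be computed and checked is
shown below. The function names are hypothetical, and the sketch assumes
that the hash is taken over the UTF-8 bytes of the external ID key in
its '<scheme>:<id>' form.

```python
import hashlib


def note_key(external_id_key: str) -> str:
    """Return the hex SHA-1 of an external ID key such as 'username:jdoe'.

    Assumption: the key is hashed as UTF-8 encoded text.
    """
    return hashlib.sha1(external_id_key.encode("utf-8")).hexdigest()


def is_consistent(note_name: str, external_id_key: str) -> bool:
    """Check that a note name matches the external ID key in its content."""
    return note_name.lower() == note_key(external_id_key)


if __name__ == "__main__":
    print(note_key("username:jdoe"))
```

A check like `is_consistent` is useful after editing an external ID by
hand, since a mismatch between note name and key causes the external ID
to be ignored.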

The note content is a Git config file:

----
[externalId "username:jdoe"]
  accountId = 1003407
  email = jdoe@example.com
  password = bcrypt:4:LCbmSBDivK/hhGVQMfkDpA==:XcWn0pKYSVU/UJgOvhidkEtmqCp6oKB7
----

The config file has one `externalId` section. The external ID key, which
consists of scheme and ID in the format '<scheme>:<id>', is used as
subsection name.

The `accountId` field is mandatory. The `email` and `password` fields
are optional.

The external IDs are maintained by Gerrit. This means users are not
allowed to manually edit their external IDs. Only users with the
link:access-control.html#capability_accessDatabase[Access Database]
global capability can push updates to the `refs/meta/external-ids`
branch. However Gerrit rejects pushes if:

* any external ID config file cannot be parsed
* a note key does not match the SHA-1 of the external ID key in the
  note content
* external IDs for non-existing accounts are contained
* invalid emails are contained
* any email is not unique (the same email is assigned to multiple
  accounts)
* hashed passwords of external IDs with scheme `username` cannot be
  decoded

[[starred-changes]]
== Starred Changes

link:dev-stars.html[Starred changes] allow users to mark changes as
favorites and receive email notifications for them.

Each starred change is a tuple of an account ID, a change ID and a
label.

To keep track of a change that is starred by an account, Gerrit creates
a `refs/starred-changes/YY/XXXX/ZZZZZZZ` ref in the `All-Users`
repository, where `YY/XXXX` is the sharded numeric change ID and
`ZZZZZZZ` is the account ID.

A starred-changes ref points to a blob that contains the list of labels
that the account set on the change. The label list is stored as UTF-8
text with one label per line.
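
The ref layout and the blob format can be sketched like this. All
function names are hypothetical. The sketch assumes that change IDs are
sharded by their last two digits, like account IDs, and it sorts and
de-duplicates labels when writing, which is a choice of this example and
not a documented requirement.

```python
from typing import Iterable, List


def starred_ref(change_id: int, account_id: int) -> str:
    """Build the ref that records the stars of one account on one change."""
    shard = change_id % 100  # assumption: same sharding as for account IDs
    return f"refs/starred-changes/{shard:02d}/{change_id}/{account_id}"


def starred_prefix(change_id: int) -> str:
    """Ref prefix covering every account that starred the given change."""
    shard = change_id % 100
    return f"refs/starred-changes/{shard:02d}/{change_id}/"


def encode_labels(labels: Iterable[str]) -> bytes:
    """Serialize labels as UTF-8 text with one label per line."""
    return "\n".join(sorted(set(labels))).encode("utf-8")


def decode_labels(blob: bytes) -> List[str]:
    """Read the labels back from the blob content."""
    return [line for line in blob.decode("utf-8").splitlines() if line]
```

Because the account ID is the last path segment, all stars on a change
share the prefix returned by `starred_prefix`.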

Since JGit has explicit optimizations for looking up refs by prefix
when the prefix ends with '/', this ref format is optimized to find
starred changes by change ID. Finding starred changes by change ID is
e.g. needed when a change is updated so that all users that have
the link:dev-stars.html#default-star[default star] on the change can be
notified by email.

Gerrit also needs an efficient way to find all changes that were
starred by an account, e.g. to provide results for the
link:user-search.html#is-starred[is:starred] query operator. With the
ref format as described above the lookup of starred changes by account
ID is expensive, as this requires a scan of the full
`refs/starred-changes/*` namespace. To overcome this the users that
have starred a change are stored in the change index together with the
star labels.

[[reviewed-flags]]
== Reviewed Flags

When reviewing a patch set in the Gerrit UI, the reviewer can mark
files in the patch set as reviewed. These markers are called ‘Reviewed
Flags’ and are private to the user. A reviewed flag is a tuple of patch
set ID, file and account ID.

Each user can have many thousands of reviewed flags and over time the
number can grow without bounds.

The large number of reviewed flags makes storage in Git unsuitable
because each update requires opening the repository and committing a
change, which is a high overhead for flipping a bit. Therefore the
reviewed flags are stored in a database table. By default they are
stored in a local H2 database, but there is an extension point that
allows plugging in alternative implementations for storing the reviewed
flags. To replace the storage for reviewed flags a plugin needs to
implement the link:dev-plugins.html#account-patch-review-store[
AccountPatchReviewStore] interface. E.g. to support a multi-master
setup where reviewed flags should be replicated between the master
nodes one could implement a store for the reviewed flags that is
based on MySQL with replication.

[[account-sequence]]
== Account Sequence

The next available account sequence number is stored as UTF-8 text in a
blob pointed to by the `refs/sequences/accounts` ref in the `All-Users`
repository.

Multiple processes share the same sequence by incrementing the counter
using normal Git ref updates. To amortize the cost of these ref
updates, processes increment the counter by a larger number and hand
out numbers from that range in memory until they run out. The size of
the account ID batch that each process retrieves at once is controlled
by the link:config-gerrit.html#notedb.accounts.sequenceBatchSize[
notedb.accounts.sequenceBatchSize] parameter in the `gerrit.config`
file.
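
The batching scheme can be illustrated with the following sketch. It
does not touch Git: `RefStore` is a hypothetical in-memory stand-in for
the `refs/sequences/accounts` ref, with a compare-and-set operation
playing the role of an atomic ref update, and `BatchedSequence` is a
hypothetical name for the per-process allocator.

```python
import threading


class RefStore:
    """In-memory stand-in for the ref that holds the next sequence value."""

    def __init__(self, next_value: int) -> None:
        self._value = next_value
        self._lock = threading.Lock()

    def read(self) -> int:
        with self._lock:
            return self._value

    def compare_and_set(self, expected: int, new: int) -> bool:
        """Update the value only if nobody changed it in the meantime."""
        with self._lock:
            if self._value != expected:
                return False
            self._value = new
            return True


class BatchedSequence:
    """Hands out IDs from a locally reserved range of batch_size numbers."""

    def __init__(self, store: RefStore, batch_size: int) -> None:
        if batch_size < 1:
            raise ValueError("batch_size must be at least 1")
        self._store = store
        self._batch_size = batch_size
        self._next = 0
        self._limit = 0  # exclusive upper bound; the range starts out empty
        self._lock = threading.Lock()

    def next(self) -> int:
        with self._lock:
            if self._next >= self._limit:
                self._acquire_batch()
            value = self._next
            self._next += 1
            return value

    def _acquire_batch(self) -> None:
        # Retry until the shared counter was advanced without a conflict.
        while True:
            start = self._store.read()
            if self._store.compare_and_set(start, start + self._batch_size):
                self._next, self._limit = start, start + self._batch_size
                return
```

With a batch size of 3, two allocators sharing one store reserve
disjoint ranges, so the shared counter is updated once per three IDs
instead of once per ID. IDs left over in a reserved range are simply
skipped if a process stops before using them.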

[[replication]]
== Replication

To replicate account data the following refs from the `All-Users`
repository must be replicated:

* `refs/users/*` (user branches)
* `refs/meta/external-ids` (external IDs)
* `refs/starred-changes/*` (star labels)
* `refs/sequences/accounts` (account sequence numbers, not needed for Gerrit
  slaves)

GERRIT
------
Part of link:index.html[Gerrit Code Review]

SEARCHBOX
---------