= Gerrit Code Review - Accounts

== Overview

Starting from 2.15, Gerrit accounts are fully stored in
link:dev-note-db.html[NoteDb].

The account data consists of a sequence number (account ID), account
properties (full name, preferred email, registration date, status,
inactive flag), preferences (general, diff and edit preferences),
project watches, SSH keys, external IDs, starred changes and reviewed
flags.

Most account data is stored in a special link:#all-users[All-Users]
repository, which has one branch per user. Within the user branch there
are Git config files for the link:#account-properties[
account properties], the link:#preferences[account preferences] and the
link:#project-watches[project watches]. In addition there is an
`authorized_keys` file for the link:#ssh-keys[SSH keys] that follows
the standard OpenSSH file format.

The account data in the user branch is versioned and the Git history of
this branch serves as an audit log.

The link:#external-ids[external IDs] are stored as Git Notes inside the
`All-Users` repository in the `refs/meta/external-ids` notes branch.
Storing all external IDs in a notes branch ensures that each external
ID is only used once.

The link:#starred-changes[starred changes] are represented as
independent refs in the `All-Users` repository. They are not stored in
the user branch, since this data doesn't need versioning.

The link:#reviewed-flags[reviewed flags] are not stored in Git, but are
persisted in a database table. This is because there is a high volume
of reviewed flags and storing them in Git would be inefficient.

Since accessing the account data in Git is not fast enough for account
queries, e.g. when suggesting reviewers, Gerrit has a
link:#account-index[secondary index for accounts].

[[all-users]]
== `All-Users` repository

The `All-Users` repository is a special repository that only contains
user-specific information. It contains one branch per user. The user
branch is formatted as `refs/users/CD/ABCD`, where `CD/ABCD` is the
link:access-control.html#sharded-user-id[sharded account ID], e.g. the
user branch for account `1000856` is `refs/users/56/1000856`. The
account IDs in the user refs are sharded so that there is a good
distribution of the Git data in the storage system.
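
The mapping from an account ID to its user branch can be sketched as
follows. This is an illustrative Python sketch, not Gerrit's
implementation: the function name `user_ref` is hypothetical, and the
zero-padding of single-digit shards is an assumption based on the
two-character `CD` shard shown above.

```python
def user_ref(account_id: int) -> str:
    """Return the user branch name for an account ID (illustrative sketch)."""
    if account_id < 0:
        raise ValueError("account ID must be non-negative")
    # The shard is formed from the last two digits of the account ID.
    # Assumption: shards below 10 are zero-padded to two characters.
    shard = account_id % 100
    return f"refs/users/{shard:02d}/{account_id}"


print(user_ref(1000856))  # refs/users/56/1000856
```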

A user branch must exist for each account, as it represents the
account. The files in the user branch are all optional. This means
having a user branch with a tree that is completely empty is also a
valid account definition.

Updates to the user branch are done through the
link:rest-api-accounts.html[Gerrit REST API], but users can also
manually fetch their user branch and push changes back to Gerrit. On
push the user data is evaluated and invalid user data is rejected.

To hide the implementation detail of the sharded account ID in the ref
name Gerrit offers a magic `refs/users/self` ref that is automatically
resolved to the user branch of the calling user. Users can then use
this ref to fetch from and push to their own user branch. E.g. if user
`1000856` pushes to `refs/users/self`, the branch
`refs/users/56/1000856` is updated. In Gerrit `self` is an established
term to refer to the calling user (e.g. in change queries). This is why
the magic ref for a user's own branch is called `refs/users/self`.

A user branch should only be readable and writeable by the user to whom
the account belongs. To assign permissions on the user branches the
normal branch permission system is used. In the permission system the
user branches are specified as `refs/users/${shardeduserid}`. The
`${shardeduserid}` variable is resolved to the sharded account ID. This
variable is used to assign default access rights on all user branches
that apply only to the owning user. The following permissions are set
by default when a Gerrit site is newly installed or upgraded to a
version which supports user branches:

.All-Users project.config
----
[access "refs/users/${shardeduserid}"]
  exclusiveGroupPermissions = read push submit
  read = group Registered Users
  push = group Registered Users
  label-Code-Review = -2..+2 group Registered Users
  submit = group Registered Users
----

The user branch contains several files with account data which are
described link:#account-data-in-user-branch[below].

In addition to the user branches the `All-Users` repository also
contains a branch for the link:#external-ids[external IDs] and special
refs for the link:#starred-changes[starred changes].

Also the next available value of the link:#account-sequence[account
sequence] is stored in the `All-Users` repository.

[[account-index]]
== Account Index

There are several situations in which Gerrit needs to query accounts,
e.g.:

* For sending email notifications to project watchers.
* For reviewer suggestions.

Accessing the account data in Git is not fast enough for account
queries, since it requires accessing all user branches and parsing
all files in each of them. To overcome this Gerrit has a secondary
index for accounts. The account index is based on either
link:config-gerrit.html#index.type[Lucene or Elasticsearch].

Via the link:rest-api-accounts.html#query-account[Query Account] REST
endpoint link:user-search-accounts.html[generic account queries] are
supported.

Accounts are automatically reindexed on any update. The
link:rest-api-accounts.html#index-account[Index Account] REST endpoint
allows an account to be reindexed manually. In addition the
link:pgm-reindex.html[reindex] program can be used to reindex all
accounts offline.

[[account-data-in-user-branch]]
== Account Data in User Branch

A user branch contains several Git config files with the account data:

* `account.config`:
+
Stores the link:#account-properties[account properties].

* `preferences.config`:
+
Stores the link:#preferences[user preferences] of the account.

* `watch.config`:
+
Stores the link:#project-watches[project watches] of the account.

In addition it contains an
link:https://en.wikibooks.org/wiki/OpenSSH/Client_Configuration_Files#.7E.2F.ssh.2Fauthorized_keys[
authorized_keys] file with the link:#ssh-keys[SSH keys] of the account.

[[account-properties]]
=== Account Properties

The account properties are stored in the user branch in the
`account.config` file:

----
[account]
  fullName = John Doe
  preferredEmail = john.doe@example.com
  status = OOO
  active = false
----

For active accounts the `active` parameter can be omitted.

The registration date is not contained in the `account.config` file but
is derived from the timestamp of the first commit on the user branch.

When users update their account properties by pushing to the user
branch, it is verified that the preferred email exists in the external
IDs.

Users are not allowed to flip the active value themselves; only
administrators and users with the
link:access-control.html#capability_modifyAccount[Modify Account]
global capability are allowed to change it.

Since all data in the `account.config` file is optional the
`account.config` file may be absent from some user branches.
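
The defaulting rules above (every property optional, `active` treated
as true when omitted, the whole file possibly absent) can be sketched
as follows. This is a simplified Python reader, not Gerrit's parser:
the function name and returned dictionary are hypothetical. It handles
only the subset of the Git config syntax shown in the example and
treats key names as case-sensitive.

```python
from typing import Dict

# Assumption: these spellings are accepted as boolean "true" values.
TRUE_VALUES = {"true", "yes", "on", "1"}


def parse_account_config(text: str) -> Dict[str, object]:
    """Read the [account] section of an account.config file (simplified)."""
    props: Dict[str, str] = {}
    in_account = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":
            continue  # skip blank lines and comments
        if line.startswith("["):
            in_account = line == "[account]"
            continue
        if in_account and "=" in line:
            key, value = line.split("=", 1)
            props[key.strip()] = value.strip()
    return {
        "fullName": props.get("fullName"),
        "preferredEmail": props.get("preferredEmail"),
        "status": props.get("status"),
        # An omitted 'active' parameter means the account is active.
        "active": props.get("active", "true").lower() in TRUE_VALUES,
    }


# An absent or empty file still describes a valid, active account.
print(parse_account_config("")["active"])  # True
```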

[[preferences]]
=== Preferences

The account preferences are stored in the user branch in the
`preferences.config` file. There are separate sections for
link:intro-user.html#preferences[general],
link:user-review-ui.html#diff-preferences[diff] and edit preferences:

----
[general]
  showSiteHeader = false
[diff]
  hideTopMenu = true
[edit]
  lineLength = 80
----

The parameter names match the names that are used in the preferences REST API:

* link:rest-api-accounts.html#preferences-info[General Preferences]
* link:rest-api-accounts.html#diff-preferences-info[Diff Preferences]
* link:rest-api-accounts.html#edit-preferences-info[Edit Preferences]

If the value for a preference is the same as the default value for this
preference, it can be omitted in the `preferences.config` file.
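
Omitting default values when writing a section can be sketched as
follows. This is an illustrative Python sketch, not Gerrit's code: the
function name `render_section` is hypothetical and the default values
in the usage example are made up for illustration; they are not
Gerrit's actual defaults.

```python
from typing import Dict


def render_section(section: str, values: Dict[str, object],
                   defaults: Dict[str, object]) -> str:
    """Render one preferences section, leaving out values equal to the default."""
    lines = [f"[{section}]"]
    for key, value in values.items():
        if key in defaults and defaults[key] == value:
            continue  # same as the default: can be omitted
        # Booleans are written in lowercase, as in the example above.
        rendered = str(value).lower() if isinstance(value, bool) else str(value)
        lines.append(f"  {key} = {rendered}")
    # If every value equals its default, no section is written at all.
    return "\n".join(lines) if len(lines) > 1 else ""


# Hypothetical defaults, for illustration only.
edit_defaults = {"lineLength": 100, "tabSize": 8}
print(render_section("edit", {"lineLength": 80, "tabSize": 8}, edit_defaults))
```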

Defaults for general and diff preferences that apply for all accounts
can be configured in the `refs/users/default` branch in the `All-Users`
repository.

[[project-watches]]
=== Project Watches

Users can configure watches on projects to receive email notifications
for changes of that project.

A watch configuration consists of the project name and an optional
filter query. If a filter query is specified, email notifications will
be sent only for changes of that project that match this query.

In addition, each watch configuration can contain a list of
notification types that determine for which events email notifications
should be sent. E.g. a user can configure that email notifications
should only be sent if a new patch set is uploaded and when the change
gets submitted, but not on other events.

Project watches are stored in a `watch.config` file in the user branch:

----
[project "foo"]
  notify = * [ALL_COMMENTS]
  notify = branch:master [ALL_COMMENTS, NEW_PATCHSETS]
  notify = branch:master owner:self [SUBMITTED_CHANGES]
----

The `watch.config` file has one project section for all project watches
of a project. The project name is used as subsection name and the
filters with the notification types, that decide for which events email
notifications should be sent, are represented as `notify` values in the
subsection. A `notify` value is formatted as
"<filter> [<comma-separated-list-of-notification-types>]". The
supported notification types are described in the
link:user-notify.html#notify.name.type[Email Notifications documentation].

For a change event, a notification will be sent if any `notify` value
of the corresponding project has both a filter that matches the change
and a notification type that matches the event.
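
The `notify` value format and the matching rule can be sketched as
follows. This is an illustrative Python sketch, not Gerrit's parser:
`parse_notify` and `should_notify` are hypothetical names. Evaluating a
filter query against a change is out of scope here, so it is delegated
to a caller-supplied `filter_matches` predicate, which is also an
assumption of this sketch.

```python
import re
from typing import Callable, FrozenSet, Iterable, Tuple

# "<filter> [<comma-separated-list-of-notification-types>]"
NOTIFY_RE = re.compile(r"^(?P<filter>.*?)\s*\[(?P<types>[^\]]*)\]\s*$")


def parse_notify(value: str) -> Tuple[str, FrozenSet[str]]:
    """Split a notify value into its filter and its notification types."""
    match = NOTIFY_RE.match(value.strip())
    if match is None:
        raise ValueError(f"malformed notify value: {value!r}")
    types = frozenset(
        t.strip() for t in match.group("types").split(",") if t.strip()
    )
    return match.group("filter"), types


def should_notify(notify_values: Iterable[str], event_type: str,
                  filter_matches: Callable[[str], bool]) -> bool:
    """True if any notify value matches both the change and the event type."""
    for value in notify_values:
        watch_filter, types = parse_notify(value)
        if event_type in types and filter_matches(watch_filter):
            return True
    return False


print(parse_notify("branch:master owner:self [SUBMITTED_CHANGES]")[0])
```

Both conditions must hold for the same `notify` value: a filter that
matches in one value and a notification type listed in another do not
combine.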

In order to send email notifications on change events, Gerrit needs to
find all accounts that watch the corresponding project. To make this
lookup fast the secondary account index is used. The account index
contains a repeated field that stores the projects that are being
watched by an account. After the accounts that watch the project have
been retrieved from the index, the complete watch configuration is
available from the account cache and Gerrit can check if any watch
matches the change and the event.

[[ssh-keys]]
=== SSH Keys

SSH keys are stored in the user branch in an `authorized_keys` file,
which is the
link:https://en.wikibooks.org/wiki/OpenSSH/Client_Configuration_Files#.7E.2F.ssh.2Fauthorized_keys[
standard OpenSSH file format] for storing SSH keys:

----
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAAAgQCgug5VyMXQGnem2H1KVC4/HcRcD4zzBqSuJBRWVonSSoz3RoAZ7bWXCVVGwchtXwUURD689wFYdiPecOrWOUgeeyRq754YWRhU+W28vf8IZixgjCmiBhaL2gt3wff6pP+NXJpTSA4aeWE5DfNK5tZlxlSxqkKOS8JRSUeNQov5Tw== john.doe@example.com
# DELETED
# INVALID ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAAAgQDm5yP7FmEoqzQRDyskX+9+N0q9GrvZeh5RG52EUpE4ms/Ujm3ewV1LoGzc/lYKJAIbdcZQNJ9+06EfWZaIRA3oOwAPe1eCnX+aLr8E6Tw2gDMQOGc5e9HfyXpC2pDvzauoZNYqLALOG3y/1xjo7IH8GYRS2B7zO/Mf9DdCcCKSfw== john.doe@example.com
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAAAgQCaS7RHEcZ/zjl9hkWkqnm29RNr2OQ/TZ5jk2qBVMH3BgzPsTsEs+7ag9tfD8OCj+vOcwm626mQBZoR2e3niHa/9gnHBHFtOrGfzKbpRjTWtiOZbB9HF+rqMVD+Dawo/oicX/dDg7VAgOFSPothe6RMhbgWf84UcK5aQd5eP5y+tQ== john.doe@example.com
----

When the SSH API is used, Gerrit needs an efficient way to look up SSH
keys by username. Since the username can be easily resolved to an
account ID (via the account cache), accessing the SSH keys in the user
branch is fast.

To identify SSH keys in the REST API Gerrit uses
link:rest-api-accounts.html#ssh-key-id[sequence numbers per account].
This is why the order of the keys in the `authorized_keys` file
determines the sequence numbers of the keys (the sequence numbers
start at 1).

To keep the sequence numbers intact when a key is deleted, a
'# DELETED' line is inserted at the position where the key was deleted.

Invalid keys are marked with the prefix '# INVALID'.
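
How sequence numbers follow from the file order can be sketched as
follows. This is an illustrative Python sketch, not Gerrit's parser:
the `SshKey` type and the function name are hypothetical, and skipping
other comment lines without consuming a sequence number is an
assumption.

```python
from typing import List, NamedTuple


class SshKey(NamedTuple):
    seq: int      # sequence number, starting at 1
    key: str      # the key line without any '# INVALID' prefix
    valid: bool


DELETED_LINE = "# DELETED"
INVALID_PREFIX = "# INVALID"


def parse_authorized_keys(text: str) -> List[SshKey]:
    """Assign sequence numbers to keys based on their position in the file."""
    keys: List[SshKey] = []
    seq = 1
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == DELETED_LINE:
            seq += 1  # the deleted key keeps its slot, so later numbers stay stable
        elif line.startswith(INVALID_PREFIX):
            keys.append(SshKey(seq, line[len(INVALID_PREFIX):].strip(), False))
            seq += 1
        elif line.startswith("#"):
            continue  # assumption: other comments do not consume a number
        else:
            keys.append(SshKey(seq, line, True))
            seq += 1
    return keys
```

Applied to a file laid out like the example above (key, deleted slot,
invalid key, key), this yields the sequence numbers 1, 3 and 4, with
the key at position 3 flagged as invalid.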

[[external-ids]]
== External IDs

External IDs are used to link external identities, such as an LDAP
account or an OAuth identity, to an account in Gerrit.

External IDs are stored as Git Notes in the `All-Users` repository. The
name of the notes branch is `refs/meta/external-ids`.

The SHA-1 of the external ID key is used as the note key. This ensures
that an external ID is used only once (e.g. an external ID can never be
assigned to multiple accounts at a point in time).
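
Deriving a note key from an external ID key can be sketched as
follows. This is an illustrative Python sketch: the function name
`note_key` is hypothetical, and hashing the UTF-8 bytes of the
'<scheme>:<id>' string is an assumption about the exact input to the
hash.

```python
import hashlib


def note_key(scheme: str, ext_id: str) -> str:
    """Return the hex SHA-1 of the external ID key '<scheme>:<id>'."""
    key = f"{scheme}:{ext_id}"
    # Assumption: the key string is hashed as UTF-8 bytes.
    return hashlib.sha1(key.encode("utf-8")).hexdigest()


# Two notes for the same external ID key would share one note key,
# which is why an external ID cannot be assigned to two accounts.
print(note_key("username", "jdoe") == note_key("username", "jdoe"))  # True
```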

The note content is a Git config file:

----
[externalId "username:jdoe"]
  accountId = 1003407
  email = jdoe@example.com
  password = bcrypt:4:LCbmSBDivK/hhGVQMfkDpA==:XcWn0pKYSVU/UJgOvhidkEtmqCp6oKB7
----

The config file has one `externalId` section. The external ID key which
consists of scheme and ID in the format '<scheme>:<id>' is used as
subsection name.

The `accountId` field is mandatory, the `email` and `password` fields
are optional.

The external IDs are maintained by Gerrit; this means users are not
allowed to manually edit their external IDs. Only users with the
link:access-control.html#capability_accessDatabase[Access Database]
global capability can push updates to the `refs/meta/external-ids`
branch. However Gerrit rejects pushes if:

* any external ID config file cannot be parsed
* a note key does not match the SHA-1 of the external ID key in the
  note content
* external IDs for non-existing accounts are contained
* invalid emails are contained
* any email is not unique (the same email is assigned to multiple
  accounts)
* hashed passwords of external IDs with scheme `username` cannot be
  decoded

[[starred-changes]]
== Starred Changes

link:dev-stars.html[Starred changes] allow users to mark changes as
favorites and receive email notifications for them.

Each starred change is a tuple of an account ID, a change ID and a
label.

To keep track of a change that is starred by an account, Gerrit creates
a `refs/starred-changes/YY/XXXX/ZZZZZZZ` ref in the `All-Users`
repository, where `YY/XXXX` is the sharded numeric change ID and
`ZZZZZZZ` is the account ID.

A starred-changes ref points to a blob that contains the list of labels
that the account set on the change. The label list is stored as UTF-8
text with one label per line.

Since JGit has explicit optimizations for looking up refs by prefix
when the prefix ends with '/', this ref format is optimized to find
starred changes by change ID. Finding starred changes by change ID is
e.g. needed when a change is updated so that all users that have
the link:dev-stars.html#default-star[default star] on the change can be
notified by email.
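
The ref layout and the lookup by change ID can be sketched as follows.
This is an illustrative Python sketch: all function names are
hypothetical, the shard is assumed to be the zero-padded last two
digits of the change number (analogous to the sharded account ID), and
a plain list of ref names stands in for the ref database.

```python
from typing import Iterable, List


def starred_change_prefix(change_number: int) -> str:
    """Prefix shared by all starred-changes refs of one change (ends with '/')."""
    shard = change_number % 100  # assumption: last two digits, zero-padded
    return f"refs/starred-changes/{shard:02d}/{change_number}/"


def starred_change_ref(change_number: int, account_id: int) -> str:
    """Ref that records the stars of one account on one change."""
    return f"{starred_change_prefix(change_number)}{account_id}"


def accounts_starring(all_refs: Iterable[str], change_number: int) -> List[int]:
    """Find the accounts that starred a change by scanning for the ref prefix."""
    prefix = starred_change_prefix(change_number)
    return sorted(int(ref[len(prefix):]) for ref in all_refs
                  if ref.startswith(prefix))


print(starred_change_ref(1234, 1000856))  # refs/starred-changes/34/1234/1000856
```

Because the account ID is the last path segment, the reverse lookup
(all changes starred by one account) has no usable prefix.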

Gerrit also needs an efficient way to find all changes that were
starred by an account, e.g. to provide results for the
link:user-search.html#is-starred[is:starred] query operator. With the
ref format as described above the lookup of starred changes by account
ID is expensive, as this requires a scan of the full
`refs/starred-changes/*` namespace. To overcome this the users that
have starred a change are stored in the change index together with the
star labels.

[[reviewed-flags]]
== Reviewed Flags

When reviewing a patch set in the Gerrit UI, the reviewer can mark
files in the patch set as reviewed. These markers are called ‘Reviewed
Flags’ and are private to the user. A reviewed flag is a tuple of patch
set ID, file and account ID.

Each user can have many thousands of reviewed flags and over time the
number can grow without bounds.

The high number of reviewed flags makes storage in Git unsuitable
because each update requires opening the repository and committing a
change, which is a high overhead for flipping a bit. Therefore the
reviewed flags are stored in a database table. By default they are
stored in a local H2 database, but there is an extension point that
allows alternate implementations for storing the reviewed flags to be
plugged in. To replace the storage for reviewed flags a plugin needs to
implement the link:dev-plugins.html#account-patch-review-store[
AccountPatchReviewStore] interface. E.g. to support a multi-master
setup where reviewed flags should be replicated between the master
nodes one could implement a store for the reviewed flags that is
based on MySQL with replication.

[[account-sequence]]
== Account Sequence

The next available account sequence number is stored as UTF-8 text in a
blob pointed to by the `refs/sequences/accounts` ref in the `All-Users`
repository.

Multiple processes share the same sequence by incrementing the counter
using normal Git ref updates. To amortize the cost of these ref
updates, processes increment the counter by a larger number and hand
out numbers from that range in memory until they run out. The size of
the account ID batch that each process retrieves at once is controlled
by the link:config-gerrit.html#notedb.accounts.sequenceBatchSize[
notedb.accounts.sequenceBatchSize] parameter in the `gerrit.config`
file.
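
The batching scheme can be sketched as follows. This is an
illustrative, single-threaded Python sketch and not Gerrit's
implementation: `SequenceRef` is a hypothetical in-memory stand-in for
the `refs/sequences/accounts` ref, and its `compare_and_set` method
models an atomic ref update that fails if another process has moved
the ref in the meantime.

```python
class SequenceRef:
    """In-memory stand-in for the ref that holds the next available number."""

    def __init__(self, next_value: int = 1) -> None:
        self.next_value = next_value

    def compare_and_set(self, expected: int, new: int) -> bool:
        # Models an atomic ref update: succeeds only if nobody else moved the ref.
        if self.next_value != expected:
            return False
        self.next_value = new
        return True


class BatchedSequence:
    """Hands out numbers from a locally reserved range (one per process)."""

    def __init__(self, ref: SequenceRef, batch_size: int) -> None:
        self._ref = ref
        self._batch_size = batch_size
        self._next = 0
        self._limit = 0  # exclusive end of the reserved range; empty at start

    def next(self) -> int:
        if self._next >= self._limit:
            self._acquire_batch()
        value = self._next
        self._next += 1
        return value

    def _acquire_batch(self) -> None:
        # Reserve [start, start + batch_size) with one ref update; retry on conflict.
        while True:
            start = self._ref.next_value
            if self._ref.compare_and_set(start, start + self._batch_size):
                self._next, self._limit = start, start + self._batch_size
                return


ref = SequenceRef(1000000)
first, second = BatchedSequence(ref, 3), BatchedSequence(ref, 3)
print(first.next(), second.next(), first.next())  # 1000000 1000003 1000001
```

With a batch size of 3, each simulated process touches the shared ref
once per three account IDs. Numbers left over in a reserved range when
a process stops are simply never handed out in this sketch.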

GERRIT
------
Part of link:index.html[Gerrit Code Review]

SEARCHBOX
---------