Import wiki pages as markdown docs Change-Id: Ie20fad897c6bce3e02963dd1870b0585d4ea3b59
diff --git a/docs/Background.md b/docs/Background.md new file mode 100644 index 0000000..30e50cc --- /dev/null +++ b/docs/Background.md
@@ -0,0 +1,47 @@ +Google developed [Mondrian](http://video.google.com/videoplay?docid=-8502904076440714866), a Perforce-based code review tool to facilitate peer review of changes prior to submission to the central code repository. Mondrian is not open source, as it is tied to the use of [Perforce](http://www.perforce.com/) and to many Google-only services, such as [Bigtable](http://labs.google.com/papers/bigtable.html). Google employees have often described how useful Mondrian and its peer-review process are to their day-to-day work. + +Guido van Rossum open sourced portions of Mondrian within [Rietveld](http://code.google.com/p/rietveld/), a similar code review tool running on Google App Engine, but for use with Subversion rather than Perforce. Rietveld is in common use by many open source projects, facilitating their peer reviews much as Mondrian does for Google employees. Unlike Mondrian and the Google Perforce triggers, Rietveld is strictly advisory and does not enforce peer review prior to submission. + +Git is a distributed version control system, wherein each repository is assumed to be owned/maintained by a single user. There are no inherent security controls built into Git, so the ability to read from or write to a repository is controlled entirely by the host's filesystem access controls. When multiple maintainers collaborate on a single shared repository, a high degree of trust is required, as any collaborator with write access can alter the repository. + +[Gitosis](http://eagain.net/gitweb/?p=gitosis.git;a=blob;f=README.rst;hb=HEAD) provides tools to secure centralized Git repositories, permitting multiple maintainers to manage the same project at once, by restricting access to a secure network protocol, much like Perforce secures a repository by only permitting access over its network port.
+ +The [Android Open Source Project](http://source.android.com/) (AOSP) was founded by Google with the open source release of the Android operating system. AOSP selected Git as its primary version control tool. As many of its engineers have a background working with Mondrian at Google, there is a strong desire to have the same (or better) feature set available for Git and AOSP. + +Gerrit Code Review started as a simple set of patches to Rietveld, and was originally built to serve AOSP. This quickly turned into a fork as we added access control features that Guido van Rossum did not want complicating the Rietveld code base. As the functionality and code were becoming drastically different, a different name was needed. Gerrit calls back to the original namesake of Rietveld, [Gerrit Rietveld](http://en.wikipedia.org/wiki/Gerrit_Rietveld), a Dutch architect. + +Gerrit2 is a complete rewrite of the Gerrit fork, completely changing the implementation from Python on Google App Engine to Java on a J2EE servlet container and a SQL database.
diff --git a/docs/MultiMaster.md b/docs/MultiMaster.md new file mode 100644 index 0000000..239d0e3 --- /dev/null +++ b/docs/MultiMaster.md
@@ -0,0 +1,81 @@ +We spent some time at the May 2012 Hackathon outlining an incremental approach to making open source Gerrit clusterable (the version running for android-review and gerrit-review is already clustered, but uses much of Google's proprietary technology, such as GFS and BigTable). Several incremental steps were outlined for moving Gerrit in that direction. + +# Shared Git Repo - Shared DB + +This is the simplest case for Gerrit multi-master, so it is likely the first step needed by most other ideas: support a very simple master/master installation of Gerrit where both (or all, if more than 2) masters share a common filesystem backend (likely a high end NFS server) and a common DB. + +Four issues have been identified which need to be resolved before this is possible: + +1. Cache coherency +1. Submit conflict resolution +1. Mirror/Slave Replication +1. User sessions + +A naive approach to #1 is to simply use really short cache times, but this sort of defeats the purpose of caching. To solve this properly, some sort of eviction protocol will need to be developed for masters to inform their peers of a needed eviction (a plugin is up for [review](https://gerrit-review.googlesource.com/#/c/37460/1) which does this using UDP). + +Issue #2 could be solved by manually designating a submit master, later upgrading to some sort of voting mechanism among peer masters to choose one; these would be incremental approaches. The issue is that each server runs a plugin queue and can therefore attempt to merge changes to the same branches at the same time, resulting in "failed to lock" errors which leave "failed to merge" messages on changes. If the same change makes it into multiple queues, might it also cause issues by being merged twice? If a peer goes down, might its queue be the only holder of certain changes, which would then be missed until a restart of some server?
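The cache eviction protocol described for #1 can be sketched as a tiny peer-notification scheme. This is only an illustrative stand-in: the message format, port, and peer list below are made up, not the UDP plugin's actual wire format.

```python
import socket

# Hypothetical peer list; in a real deployment this would come from config.
PEERS = [("gerrit-master-2.example.com", 20483)]


def encode_eviction(cache: str, key: str) -> bytes:
    # One datagram per eviction: "<cache-name> <key>".
    return f"{cache} {key}".encode("utf-8")


def decode_eviction(datagram: bytes):
    # Split on the first space; the key may itself contain spaces.
    cache, _, key = datagram.decode("utf-8").partition(" ")
    return cache, key


def broadcast_eviction(cache: str, key: str) -> None:
    # Best-effort notification of every peer master. A lost datagram only
    # means a peer serves a slightly stale cache entry for a while, so UDP
    # is an acceptable transport here.
    msg = encode_eviction(cache, key)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        for host, port in PEERS:
            s.sendto(msg, (host, port))
```

The design accepts loss: correctness falls back to whatever the cache expiry time is, so the protocol does not need acknowledgements.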
+ +Issue #3 can be solved similarly to #2: select a replication master. The replication master is responsible for rerunning full replication on startup and any time a master goes down (since that master may have been replicating something at the time). Otherwise, masters can replicate as they normally would (as a single master) as they cause ref updates. There is currently a bug where "failed to lock" replication attempts are not retried; this should also be fixed, since such failures will likely be even more prevalent in multi-master setups. + +A simple ssh connection between peers was deemed sufficient in most cases to accomplish both #1 and #2. However, a single form of communication is not ideal, since it prevents the cluster from distinguishing between a downed node and a network split. Without being able to distinguish these, the cluster cannot dynamically adapt when communication with a peer is down. A cluster should likely have a backdoor communication channel to help detect inter-node network failures; since the DB and the repos are both shared in this scenario, either could easily be used for the backdoor channel (CAF is using the repos: All-Projects:refs/meta/masters/node). + +Spearce has a solution to #4: a [thread](https://groups.google.com/d/msg/repo-discuss/ZIIuBaCz9Jc/ZTQGpuy_Y1MJ) discusses what is required for this setup to work well. + +# Multi Site Masters with Separate Backends + +The main additional problem with separate backends is resolving ref updates in a globally safe way. In Google's implementation, this is solved by placing the refs in BigTable. ZooKeeper seemed like a good free/open source alternative, since it is Java based and under the Apache license. The other piece to solve is moving object data across sites; it was suggested that ZooKeeper would likely be involved in coordinating this, but details were not really discussed. + +A plugin for a ZooKeeper ref-db is up for [review](https://gerrit-review.googlesource.com/#/c/37460/1).
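The "globally safe ref update" requirement above reduces to an atomic compare-and-swap on each ref's current SHA-1. A toy in-memory stand-in for such a shared ref-db (the class and method names are illustrative; the real ZooKeeper-backed plugin's API will differ):

```python
import threading
from typing import Optional


class GlobalRefDb:
    """Toy stand-in for a shared ref-db: a ref advances only via CAS."""

    def __init__(self):
        self._refs = {}
        self._lock = threading.Lock()

    def compare_and_put(self, ref: str, expected: Optional[str], new: str) -> bool:
        # Succeeds only if the ref still points where the caller last saw
        # it; a losing master must re-read the ref and retry (or report a
        # conflict), never overwrite blindly.
        with self._lock:
            if self._refs.get(ref) != expected:
                return False
            self._refs[ref] = new
            return True

    def get(self, ref: str) -> Optional[str]:
        with self._lock:
            return self._refs.get(ref)
```

Two masters racing to update `refs/heads/master` both read the old value; only the first CAS wins, so the sites can never diverge on what a ref points to. ZooKeeper provides the same semantics via versioned znode writes.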
+ +# Distributed FS + +Finally, it was felt that once multiple sites were conquered, a distributed filesystem may eventually be needed to scale the git repos effectively; Hadoop DFS was proposed for this.
diff --git a/docs/Notedb.md b/docs/Notedb.md new file mode 100644 index 0000000..493d74d --- /dev/null +++ b/docs/Notedb.md
@@ -0,0 +1,90 @@ +### What is notedb? + +Notedb is the successor to ReviewDb: a replacement database backend for Gerrit. The goal is to read code review metadata from the same set of repositories that store the data. This allows for improved atomicity, consistency of replication, and the creation of new features like federated review and offline review. + +This document describes the state of the migration (not necessarily completely up to date), the tasks that remain, and notes on some of the challenges we've encountered along the way. This document is **not** a full design document for notedb; if you're that curious, bug dborowitz and he will help you out. + +Finally, this document is for core developers. If you are a casual user of Gerrit looking for documentation, you've come to the wrong place. + +## Root Tables + +While ReviewDb has a lot of tables, there are relatively few "root" tables, that is, tables whose primary key's `getParentKey()` method returns null: + +* **Change**: subtables ChangeMessage, PatchSet, PatchSetApproval, PatchSetAncestor, PatchLineComment, TrackingId +* **Account**: subtables AccountExternalId, AccountProjectWatch, AccountSshKey, AccountDiffPreference, StarredChange +* **AccountGroup**: subtables AccountGroupMember + +TODO(dborowitz): Document other minor tables, audits, etc. + +For each root entity in each of these tables, there is one DAG describing all modifications that have been applied to the entity over time. Entity DAGs are stored in a repository corresponding to their function: + +* **Change**: stored in `refs/changes/YZ/XYZ/meta` in the destination repository +* **Account**: stored in `refs/accounts/YZ/XYZ/meta` (TBD) in `All-Users` +* **AccountGroup**: stored in `TBD` in `All-Projects` + +Most of this document focuses on Change entities, partly because they are the most complex, but also because that's where most effort to date has been focused.
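Assuming the `YZ` shard in `refs/changes/YZ/XYZ/meta` is the change number's last two digits (the same sharding Gerrit already uses for patch set refs; an assumption, since the text above only shows the placeholder pattern), the meta ref for a change can be computed as:

```python
def change_meta_ref(change_id: int) -> str:
    # Shard by the last two digits, zero-padded, so no single directory
    # under refs/changes/ ends up holding a ref for every change.
    return f"refs/changes/{change_id % 100:02d}/{change_id}/meta"
```

For example, change 1234 lands under shard `34`, while change 7 lands under shard `07`.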
+ +## Changes: What's Done + +Current progress, along with some possibly-interesting implementation notes. + +* ChangeMessages: Stored in the commit message body. Currently the subject of the commit message contains a machine-generated-but-readable summary like "Updated patch set 3"; we might decide to eliminate this and just use the ChangeMessage. +* PatchSetApprovals: Stored as a footer "Label: Label-Name=Foo". Instead of storing an implicit 0 review for reviewers, include them explicitly in a separate Reviewer footer. Freeze labels at submit time along with the full submit rule evaluator results using "Submitted-with". +* PatchLineComments: The only thing thus far actually stored in a note, on the commit to which it applies. Drafts are stored in a per-user ref in All-Users. +* TrackingId: Killed this long ago; we use the secondary index instead. (Just reminded myself I need to rip out the gwtorm entity.) +* Change: Started storing the obvious fields: owner, project, branch, topic. +* PatchSet: Storing IDs and created-on timestamps right now, but not yet reading from them. + +## Changes: What Needs to Be Done + +* PatchSetAncestors: Replace with a persistent cache, which should probably be rebuilt in RebuildNotedb. +* PatchSet: Draft field. (Someday I think we should replace Draft with WIP, but I digress.) +* Change: Kill more fields. Actually implement reading from changes; see the challenges section. +* Some sort of batch parser. If we get 100 changes back from a search, sequentially reading the DAG for each of them might take a while. +* Benchmark and optimize the heck out of the parsers. Let's say a target is 1ms per change DAG. Relatedly, we may also need to disable (eager) parsing of certain fields, if we can prove with benchmarks that they are problematic. (We already do this for PatchLineComments, to avoid having to read anything but commits in the common case.)
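The footer-based storage above (repeated "Label:" and "Reviewer:" footers in the final paragraph of a commit message) can be parsed with a small sketch like this; the footer names follow the text above, but the exact syntax of the real parser may differ:

```python
def parse_footers(message: str) -> dict:
    """Collect repeated footers from the last paragraph of a commit message."""
    footers = {}
    # Footers live in the final blank-line-separated paragraph.
    last_paragraph = message.rstrip("\n").split("\n\n")[-1]
    for line in last_paragraph.split("\n"):
        name, sep, value = line.partition(": ")
        if sep:
            # A footer may repeat (e.g. one Label per approval), so keep
            # every occurrence in order.
            footers.setdefault(name, []).append(value)
    return footers
```

So a notedb commit whose message ends with `Label: Code-Review=+2` and `Reviewer: Gerrit User 1000001` yields one list per footer name, which maps naturally onto the PatchSetApproval rows being replaced.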
+ +## JGit Changes + +* Teach JGit to pack notedb refs in a separate pack file from everything else. We don't want to hurt locality within the main pack by interleaving metadata commits. +* Teach JGit to better handle many small, separate commit graphs in the packer. Ordered chronologically, notedb metadata will be spread across a large number of separate DAGs. We will get better kernel buffer cache locality by clustering all commits in each disconnected DAG together. (But this may also hurt batch parsing; see above.) + +## Challenges + +TODO
diff --git a/docs/OutstandingTopics.md b/docs/OutstandingTopics.md new file mode 100644 index 0000000..215cfee --- /dev/null +++ b/docs/OutstandingTopics.md
@@ -0,0 +1,72 @@ +# Overview + +Following a [discussion on the mailing list](https://groups.google.com/forum/#!topic/repo-discuss/qKz7AtZDlC4) we decided to create a summary page containing the list of outstanding topics that are currently under review and deserve particular attention because of their nature. + +The purpose of this page is to keep track of them and prevent the risk of them being forgotten in the Gerrit changes backlog. The topics listed here are either related to the Gerrit architecture or to fixes for severe bugs that need particular attention and time to be reviewed and merged. + +Currently outstanding topics are: + +1. [Top-menu-loading](OutstandingTopics#Top_menu_loading.md) +1. [auth-backends-HttpAuthProtocol](OutstandingTopics#Pluggable_authentication_backend.md) +1. [secure-store](OutstandingTopics#Secure_Store.md) +1. [angular-gerrit-integration](OutstandingTopics#Angular_Gerrit.md) + +-------------------------------------------------------------------------------- + +# Top-menu loading + +Refactor the top-menu loading mechanism in order to enrich the current REST API to fetch its entire content from the backend. Currently it is driven by a mix of GWT and the REST API, with part of the logic in GWT and the rest in the REST API. + +The inability to control the top-menu from a REST API prevents plugins (or other alternative Gerrit GUIs) from rendering the Gerrit header. With this change it will be potentially possible to box the Gerrit top-menu into a different L&F. + +* Gerrit changes: [changes](https://gerrit-review.googlesource.com/#/q/status:open+project:gerrit+branch:master+topic:top-menus) +* Owner: [Luca](https://gerrit-review.googlesource.com/#/q/owner:%22Luca+Milanesio%22+status:open) +* Status: review started (+1) +* Issues: Need extra reviewers with +2 permissions to finalise the change.
+ +# Pluggable authentication backend + +Replace the current Gerrit authentication infrastructure, mainly based on a mega switch/case over the list of supported protocols/methods, with a new plugin-based authentication backend. + +These changes would allow Gerrit to be more extensible, avoiding further growth of the mega switch/case all over the code, and support the ability to load user plugins for other authentication systems, in the same way that Gerrit groups were refactored years ago. + +* Gerrit changes: [changes](https://gerrit-review.googlesource.com/#/q/status:open+project:gerrit+branch:master+topic:auth-backends-HttpAuthProtocol) +* Owner: [Dariusz](https://gerrit-review.googlesource.com/#/q/owner:%22Dariusz+%25C5%2581uksza%22+status:open) +* Status: review started, partially merged +* Issues: after part of it was merged, the review is now stuck. Needs Shawn's attention, as the first attempt to merge it broke the Gerrit authentication. + +# Angular Gerrit + +Dariusz presented at the Gerrit User a prototype for leveraging the REST API through an AngularJS UX. It has been published to GitHub at https://github.com/dluksza/angular-gerrit. + +In order to use the Gerrit-Angular integration, a set of changes in the plugin infrastructure are needed and have been uploaded for review. + +* Gerrit changes: [changes](https://gerrit-review.googlesource.com/#/q/status:open+topic:angular-gerrit-integration) +* Owner: [Dariusz](https://gerrit-review.googlesource.com/#/q/owner:%22Dariusz+%25C5%2581uksza%22+status:open) +* Status: changes submitted, topic created. +* Issues: None at the moment.
diff --git a/docs/ReviewKungFu.md b/docs/ReviewKungFu.md new file mode 100644 index 0000000..b080ee4 --- /dev/null +++ b/docs/ReviewKungFu.md
@@ -0,0 +1,49 @@ +# Review Kung Fu + +As you progress on your path of Review Kung Fu, your skill may take you to new levels. Your objective is to spot potential problems and help others find solutions. As you do, others will take note of your skills; they will seek your knowledge and wisdom: your Kung Fu. They will look to you to help grow their own skills, to see further. This will be a sign that your efforts and training are paying off. You, in turn, will learn from them, and your Review Kung Fu will continue to reach new, higher levels. + +Together, producers and reviewers share a symbiotic relationship. This relationship fosters high quality results. When the product of this symbiosis is shared with others, the Review Cuckoo rejoices. In his glory he spreads his blessing and praise on those who have most helped others produce well. You worked hard for this blessing and you proudly wear his praise; not because of your arrogance, but because Diffy wants those seeking to improve their Kung Fu to easily recognize those whose skill may help them deliver. + +# Diffy's Badges + +### Toes + +You are dipping your toes in the Review waters. You are not sure if you want to take the plunge, but you are excited about the possibilities ahead. + +### Heel + +You have recognized issues needing to be addressed and guided your peers towards an improved product. In doing so you have had to dig your heels in a few times, but it was worth it; a better path was taken. + +### Feather + +You have transcended beyond the bare mechanics of reviewing. Where others read, you meditate. Where others comment on code, you add your yin to its yang. You are no longer a reviewer, but an ethereal wisp that dwells within Gerrit. + +### Tail + +### Beak + +### Bandana + +### The Eye + +You have seen beyond what others have seen. You can spot a needle in a haystack. Your vision has guided countless who wavered. Very little escapes your eye.
It +sometimes seems like a glimpse of your eye may help find a way beyond what +appears to others as noisy chaos. Your Review Kung Fu is unchallenged.
diff --git a/docs/RoadMap.md b/docs/RoadMap.md new file mode 100644 index 0000000..0998c25 --- /dev/null +++ b/docs/RoadMap.md
@@ -0,0 +1,44 @@ +# Introduction + +Many ideas are suggested, and discussions which lead to decisions about the future direction and design of Gerrit happen in many forums: on IRC, the ML, at the hackathons, and in the issue tracker. It can be hard for an outsider to get a feel for where Gerrit might be headed in the near, medium, and long term. Of course, since this is an open source project, code speaks loudly. However, there are times when the maintainers feel that certain paths are not the way forward, and the desired alternative may have already been proposed as the way that Gerrit should move. It can be helpful to developers to get an idea about these decisions before embarking on developing a feature. Naturally, there are also times when people just want to get a feel for what might be coming down the pike. So we will attempt to illustrate some of these decisions here. + +# Architecture + +* The REST API is viewed as the long-term stable approach for RPCs with the Gerrit server. At this point new UI elements and new ssh commands should be developed against it. If a new service is created, it should extend the current REST API or implement new pieces. + +* The hooks will eventually be moved to plugins. + +* The database will eventually be removed from Gerrit. Authoritative data will mostly be pushed into the project repositories. An indexing service such as Lucene will be used to provide fast access to data. While this is the long-term plan, some pieces have already been moved out of the DB and into the repos, for example project configuration and ACLs. Newer features are expected to take a similar approach when possible. + +* New user preferences should be placed in a (yet to be born) repo named All-Users. Each user's preferences will live in a gitconfig-style file under a reference named refs/users/xxx/accountid, where xxx is accountId mod 1000. Since users shouldn't be able to access other users' refs, this structure can be hidden from them, and they should be able to access their own refs via refs/heads/master. + +* Groups should probably gain a similar All-Groups repo. The membership file could live there. See the [gitgroups plugin](https://gerrit-review.googlesource.com/35780). + +* Authentication will eventually be moved entirely to plugins. Much work has already been done on this.
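The All-Users ref layout sketched above can be expressed as a one-liner; this follows the text's "xxx is accountId mod 1000" description literally (the eventual implementation may shard differently, e.g. with zero-padding):

```python
def user_pref_ref(account_id: int) -> str:
    # Shard by accountId mod 1000, as described above, so the top level
    # of refs/users/ never holds one entry per account.
    return f"refs/users/{account_id % 1000}/{account_id}"
```

For example, account 1000001 would keep its gitconfig-style preferences file on `refs/users/1/1000001`, while the server presents that ref to the account's owner as `refs/heads/master`.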
diff --git a/docs/Scaling.md b/docs/Scaling.md new file mode 100644 index 0000000..0ad9bbb --- /dev/null +++ b/docs/Scaling.md
@@ -0,0 +1,305 @@ +# Scaling Dimensions + +As you scale, you will always run into some issues. Depending on your specific setup, those issues may be very different from what other people are running into. Some of the specific dimensions which people may or may not need to scale on are: + +* Number of Changes: Several sites have close to 1M changes. +* Number of Projects: Some Gerrit installs have 20K projects; many have at least a few thousand. +* Number of Refs: Several sites have projects with over 100K refs; at least one has close to 500K refs. +* Number of Users: Many installs serve 1K users; some serve at least 3K. +* Number of Groups: Some Gerrit installs use 6K groups (most issues have been worked out here). +* Single Repository Sizes: Some Gerrit projects are 40G aggressively packed; these can often cause issues with email templates taking lots of CPU. +* Total Repository Sizes: Gerrit can handle at least 1TB of repository data easily. +* Large Files: Gerrit may have difficulty with some very large files (what size can it handle easily?). +* Number of Slaves: Several installations have over 10 slaves per master; at least one has 25. +* Number of Continents with Slaves: Slaves for a single master have been distributed across at least 3 continents successfully. +* Number of Receive-Packs: Some Gerrit masters are handling 20K receive-packs (pushes) per day. + +-------------------------------------------------------------------------------- + +# Servers + +## Master + +The first step to scaling is to scale your master server. Some easy, but pricey, ways to scale your master are: + +* Adding cores; some of the larger installations use 48-core machines. +* Adding RAM; most of the larger installations have over 100GB, and at least one has 1TB. +* Ensuring fast disk IO; SSDs help prevent serious performance degradation when repos are not well repacked (seeks can be crippling here).
+* Network; I suspect that most large installs use 10Gb Ethernet. + +## Mirrors/Slaves + +Once you have a decent master, it is probably worth adding either some git mirrors (if you do not need ACLs on your repos) or Gerrit slaves to help offload many of your read-only requests. Mirrors/slaves can also help reduce LAN and WAN traffic if you place them nearer to your users/build hosts. This can be particularly useful for remote sites. Some of the larger installations have at least 25 of these. + +### Shared Storage and Replication Entries For Slaves + +A common practice is to use site-local shared storage (NFS...) on remote slaves when there is more than one slave at the remote site. One major advantage of this is that it reduces the data required to be pushed during replication to that site. This requires consolidating the replication events to those slaves in order to avoid duplicate pushes to the same storage. This consolidation means that the master replication file will only have one entry for each set of slaves on the same storage. While a single slave could be set up as the sole replication receiver, added availability and scaling are being reliably achieved by using a load balancer on the receiving end to distribute each incoming push to a different slave (since the back-end storage is the same, they all will still see every update). + +### DB Slaves + +DB slaves are being used on remote sites so that remote slaves do not have to traverse the WAN to talk to the master DB. Both PostgreSQL and MySQL are being used successfully for this. This can be particularly helpful in reducing WAN traffic related to high ref counts when doing repo syncs (the [change cache](https://gerrit-review.googlesource.com/#/c/35220) was also designed to help with this.) + +## Multi-Master + +The Gerrit MultiMaster plug-in describes how to set up a single-site multi-master with a shared storage back-end for git repository data.
However, there are currently no known sites using the open source MM technology yet. The Google-hosted gerrit-review site currently runs in multi-site multi-master mode, but it relies on proprietary technology to do so. + +-------------------------------------------------------------------------------- + +# Jetty + +The default built-in web container which Gerrit uses is Jetty. Some installations have had serious "Failed to dispatch" errors which lead to 100% CPU and filled-up logs, requiring a server reboot to recover. This can be triggered by long-running RPCs backing up, causing the HTTP queue to be used. One way to work around this issue is to set http.maxqueued = 0. Alternatively, you can use [Tomcat](https://gerrit-review.googlesource.com/#/c/35010/6/Documentation/install-tomcat.txt) instead of Jetty. + +-------------------------------------------------------------------------------- + +# Repo Syncs + +With beefier servers, many people have [seen](http://groups.google.com/group/repo-discuss/browse_thread/thread/c8f003f2247d7157/ad6915f5558df8f5?lnk=gst&q=repo+sync+error#ad6915f5558df8f5) control master issues with ssh. Setting GIT_SSH will cause repo to avoid using its control master, and thus avoid triggering these errors: + +``` +export GIT_SSH=$(which ssh) +``` + +Is this related to resetting the key after a certain amount of data? + +-------------------------------------------------------------------------------- + +# Java Heap and GC + +Operations on git repositories can consume lots of memory. If you consume more memory than your Java heap, your server may either run out of memory and fail, or simply thrash forever while the Java GC runs. Large fetches such as clones tend to be the largest RAM consumers on a Gerrit system.
Since the total potential memory load is generally proportional to the total number of SSH threads and replication threads combined, it is a good idea to configure your heap size and thread counts together to form a safe combination. One way to do that is to first determine your maximum memory usage per thread. Once you have determined the per-thread usage, you can tune your server so that your total thread count, multiplied by your maximum memory usage per thread, does not exceed your heap size. + +One way to figure out your maximum memory usage per thread is to find your maximum git clone tipping point. Your tipping point is the maximum number of clones that you can perform in parallel without causing your server to run out of memory. To do this, you must first tune your server so that your ssh threads are set to a higher-than-safe value: at least as high as the number of parallel clones you are going to attempt. When ready, test with higher and higher clone counts until the server tips, then deduce the point right before it tips. It helps to use your "worst" repository for this. Once you have found the tipping point, you can calculate the approximate per-thread memory usage by dividing your heap size by your clone count. If you find that you still have long Java GC pauses, you may want to further reduce your thread counts. + +Your mileage may vary when tweaking your JVM GC parameters. You may find that increasing the size of the young generation helps drastically reduce the amount of GC thrashing your server performs. + +-------------------------------------------------------------------------------- + +# Replication + +There are many scalability issues which can plague replication. Most are related to high ref counts, which are not specifically covered here, so you will likely need to first be familiar with the "High Ref Counts" section to make replication run smoothly.
+ +## JSch + +JSch has threading issues which seem to serialize replication, even across worker groups. This has led some teams to perform replication without using ssh (JSch is the ssh implementation used inside JGit). To do this, you may set up a "write only" git daemon on your slaves, with its port only open to your Gerrit master, and replicate via git daemon without authentication or encryption. This is particularly useful if you have sites which replicate at very different speeds. + +## Failed To Lock + +With older versions of the replication plug-in, your replication can start running into contention and failing with "failed to lock" errors in your logs. This can happen when 2 separate threads attempt to replicate the same project/branch combination (the plug-in no longer allows this). This problem can resurface even with the newer plug-in if you run a multi-master setup, since nothing currently prevents two different masters running the replication plug-in for the same instance from pushing the same ref at the same time. + +There are other scenarios besides replication contention that can cause "failed to lock" errors. Fortunately, the current version of the replication plug-in can be configured to retry these failed pushes. + +-------------------------------------------------------------------------------- + +# High Ref Counts + +High ref counts can have impacts in many places in the git/JGit/Gerrit stacks. There are many ongoing fixes and tweaks to alleviate many of these problems, but some of them still remain. Some can be "unofficially" worked around. + +## git daemon mirrors + +Versions of git prior to 1.7.11 will use an [excessive amount of CPU](http://marc.info/?l=git&m=133310001303068&w=2) when receiving pushes on sites with high ref counts. Upgrading git there can drastically reduce your replication time in these cases.
+ +## git + +Suggest to your users that they use the latest git possible; many of the older versions (which are still the defaults on many distros) have severe problems with high ref counts. Particularly [bad](http://marc.info/?l=git&m=131552810309538&w=2) versions are between 1.7.4.1 and 1.7.7. Git 1.8.1 seems to have some speed-ups in fetches of high ref counts compared to even 1.7.8. + +## JGit + +JGit still has a [performance problem](http://groups.google.com/group/repo-discuss/browse_thread/thread/d0914922dc565516) with high ref counts. Two different patches, the [bucket queue](https://git.eclipse.org/r/#/c/24295/) patch and the [integer priority queue](https://git.eclipse.org/r/5491) patch, have been proposed; if applied, either will drastically reduce upload and replication times in Gerrit for repos with many (> 60K?) patch sets. These very-high-performance patches make JGit extremely fast. + +## Tags + +If you have android repositories, you likely use around 400-600 of them. Cross-project tagging can be [problematic](http://marc.info/?l=git&m=133772533609422&w=2). There are no good solutions yet to this problem.
One hacky way to deal with this is to change Gerrit to never run the change-list
query for non-logged-in users. However, this is not a viable solution for public
sites (although public sites likely do not have 1M changes which are invisible
to non-logged-in users). It may make sense to make this a configuration option
in Gerrit at some point if the traversal cannot be sped up.

--------------------------------------------------------------------------------

# Disk Space / File Cleanup

Installations which do not have enough spare disk space for their repos can
easily run into problems. Be aware that git repos contain highly compressed
data, and that at times this data may need to be uncompressed. It is easy to
underestimate the temporary space needs of repositories because git is so good
at compressing data; however, minor changes can cause repositories to "explode",
so it is good to plan for this and leave enough free space that it never becomes
an issue. This is particularly important for those using SSDs, where it is
tempting to skimp on space.

## Git GC Repo Explosions

Under certain conditions git gc can cause a repo explosion (jgit gc does not
suffer from this problem because it puts unreachable objects in a packfile),
primarily when unreferenced objects are removed from pack files and placed as
loose objects in the filesystem. Eventually git gc should prune these, but until
that happens serious problems can occur.

Some of the situations which can create many unreferenced objects:

* A user uploads a change to the wrong repository and it gets rejected by
  Gerrit
* Tags are [deleted](http://marc.info/?l=git&m=131829057610072&w=2) from the
  linux repo

## Git GC

Running gc regularly is important, particularly on sites with heavy uploads.
Older versions of jgit do not have built-in gc and require using git gc; setting
up a crontab is probably a good idea in these cases.
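
A minimal crontab sketch for this (the repository path, schedule, and priority
here are assumptions; adjust them for your installation):

```
# Nightly, low-priority gc across every bare repo under the Gerrit site path.
0 3 * * * find /var/gerrit/git -type d -name '*.git' -prune \
    -exec nice -n 19 git --git-dir='{}' gc --quiet ';'
```

The `-prune` stops find from descending into the repositories themselves, so
each `*.git` directory is visited exactly once.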
If you do run gc too often, however, excessive pack file churn can also be a
problem. A potential [solution](https://gerrit-review.googlesource.com/#/c/35215/)
for packfile churn:

> nice -n 19 ionice -c 3 gitexproll.sh -r 5 ...

Pack file churn can lead to several issues: RAM utilization, disk utilization,
and excessive WAN utilization for filesystem mirroring scripts (such as rsync).

## Keep and Noz files

Currently, Gerrit may leave behind some temporary files in your git repos when
it shuts down (particularly if the shutdown is ungraceful). There are temporary
files which begin with "noz"; these can consume disk space if left uncleaned.
There can also be leftover .keep files in the objects/pack directories; these
don't themselves take space, but they prevent git gc from repacking the
packfiles they are associated with, which can lead to poor disk space
utilization and performance issues.

## ~/.gerritcodereview

The temporary unjarred war files in here can build up. (These have been moved to
review\_site/tmp in Gerrit 2.5+.)

--------------------------------------------------------------------------------

# hooks

Servers with lots of RAM are susceptible to slow forks which can delay each hook
invocation quite a bit. When java uses over 10G of memory, it may add at least a
second to each hook invocation. Using java 7 seems to avoid this problem and
makes hooks blazingly fast again.
diff --git a/docs/ShowCases.md b/docs/ShowCases.md new file mode 100644 index 0000000..d3bc57a --- /dev/null +++ b/docs/ShowCases.md
@@ -0,0 +1,34 @@
# Gerrit Installations in the Wild

## Open Source Projects

#### Android

* [Where it all began](https://android-review.googlesource.com)
* [Cyanogen](http://review.cyanogenmod.org)
* [AOKP](http://gerrit.aokp.co)

#### Other

* [Gerrit](http://gerrit-review.googlesource.com)
* [ChromiumOS](http://chromium-review.googlesource.com)
* [Couchbase](http://review.couchbase.org)
* [Eclipse](https://git.eclipse.org/r/)
* [GerritHub](http://gerrithub.io)
* [GWT](https://gwt-review.googlesource.com/)
* [Kitware](http://review.source.kitware.com)
* [LibreOffice](https://gerrit.libreoffice.org)
* [OpenAFS](http://gerrit.openafs.org)
* [openstack](https://review.openstack.org)
* [oVirt](http://gerrit.ovirt.org)
* [Qt](https://codereview.qt-project.org)
* [RockBox](http://gerrit.rockbox.org)
* [SciLab](https://codereview.scilab.org)
* [STOQ](http://gerrit.async.com.br)
* [Typo3](https://review.typo3.org)
* [Wikimedia](https://gerrit.wikimedia.org)
* [Gluster](http://review.gluster.org)
* [Tuleap](https://gerrit.tuleap.net)
* [Go (programming language)](https://go-review.googlesource.com)
* [OpenSwitch](https://review.openswitch.net)
* [Vaadin](https://dev.vaadin.com)
diff --git a/docs/SqlMergeUserAccounts.md b/docs/SqlMergeUserAccounts.md new file mode 100644 index 0000000..7513711 --- /dev/null +++ b/docs/SqlMergeUserAccounts.md
@@ -0,0 +1,245 @@
# Introduction

Sometimes users wind up with two accounts on a Gerrit server. This is especially
common with OpenID installations, when a user forgets which OpenID provider he
had used and opens yet another account with a different OpenID identity... but
the site administrator knows they are the same person.

Unfortunately this has happened often enough on review.source.android.com that
I've developed a set of PostgreSQL scripts to handle merging the accounts.

The first script, load\_merge.sql, creates a temporary table called "links"
which contains a mapping of source account\_id to destination account\_id. This
mapping tries to map the most recently created account for a user to the oldest
account for the same user, by comparing email addresses and registration dates.
Administrators can (and probably should) edit this temporary table before
running the second script. The second script, merge\_accounts.sql, performs the
merge by updating all records in the database in a transaction, but does not
commit it at the end. This allows the administrator to double-check any records
by query before committing the merge result for good.
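
A typical run might look like the following session sketch. The database name
"reviewdb" and the example account\_id are assumptions; note that both scripts
must run in the *same* psql session, since they share temp tables and the open
transaction:

```
$ psql reviewdb
reviewdb=# \i load_merge.sql
-- review the printed mapping; optionally fix it up, e.g.:
reviewdb=# DELETE FROM links WHERE from_id = 1042;  -- hypothetical bad match
reviewdb=# \i merge_accounts.sql
-- double check the results by query, then make the merge permanent:
reviewdb=# COMMIT;
```

Issuing ROLLBACK (or simply quitting psql) instead of COMMIT abandons the merge.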

# load\_merge.sql

```
CREATE TEMP TABLE links
(from_id INT NOT NULL
,to_id INT NOT NULL);

DELETE FROM links;

INSERT INTO links (from_id, to_id)
SELECT
  f.account_id
 ,t.account_id
FROM
  accounts f
 ,accounts t
WHERE
  f.preferred_email IS NOT NULL
  AND t.preferred_email IS NOT NULL
  AND f.account_id <> t.account_id
  AND f.preferred_email = t.preferred_email
  AND f.registered_on > t.registered_on
  AND NOT EXISTS (SELECT 1 FROM links l
                  WHERE l.from_id = f.account_id
                    AND l.to_id = t.account_id);

INSERT INTO links (from_id, to_id)
SELECT DISTINCT
  f.account_id
 ,t.account_id
FROM
  account_external_ids e_t
 ,account_external_ids e_f
 ,accounts f
 ,accounts t
WHERE
  e_t.external_id = 'Google Account ' || e_f.email_address
  AND e_f.account_id <> e_t.account_id
  AND e_f.account_id = f.account_id
  AND e_t.account_id = t.account_id
  AND f.registered_on > t.registered_on
  AND NOT EXISTS (SELECT 1 FROM links l
                  WHERE l.from_id = f.account_id
                    AND l.to_id = t.account_id);

SELECT
  l.from_id
 ,l.to_id
 ,f.registered_on
 ,t.registered_on
 ,t.preferred_email
FROM
  links l
 ,accounts f
 ,accounts t
WHERE
  f.account_id = l.from_id
  AND t.account_id = l.to_id
ORDER BY t.preferred_email;
```

# merge\_accounts.sql

```
DROP TABLE to_del;
CREATE TEMP TABLE to_del (old_id INT);

CREATE TEMP TABLE tmp_ids
(email_address VARCHAR(255)
,account_id INT NOT NULL
,from_account_id INT NOT NULL
,external_id VARCHAR(255) NOT NULL
);

BEGIN TRANSACTION;

DELETE FROM tmp_ids;
INSERT INTO tmp_ids
(account_id
,from_account_id
,email_address
,external_id)
SELECT
  l.to_id
 ,l.from_id
 ,e.email_address
 ,e.external_id
FROM links l, account_external_ids e
WHERE e.account_id = l.from_id
  AND NOT EXISTS (SELECT 1 FROM account_external_ids q
                  WHERE q.account_id = l.to_id
                    AND q.external_id = e.external_id);

DELETE FROM account_external_ids
WHERE EXISTS (SELECT 1 FROM tmp_ids t
              WHERE account_external_ids.external_id = t.external_id
                AND account_external_ids.account_id = t.from_account_id);

INSERT INTO account_external_ids
(account_id
,email_address
,external_id)
SELECT
  account_id
 ,email_address
 ,external_id
FROM tmp_ids;

INSERT INTO account_ssh_keys
(ssh_public_key
,valid
,account_id
,seq)
SELECT
  k.ssh_public_key
 ,k.valid
 ,l.to_id
 ,100 + k.seq
FROM links l, account_ssh_keys k
WHERE k.account_id = l.from_id
  AND NOT EXISTS (SELECT 1 FROM account_ssh_keys p
                  WHERE p.account_id = l.to_id
                    AND p.ssh_public_key = k.ssh_public_key);

INSERT INTO starred_changes
(account_id, change_id)
SELECT l.to_id, s.change_id
FROM links l, starred_changes s
WHERE l.from_id IS NOT NULL
  AND l.to_id IS NOT NULL
  AND s.account_id = l.from_id
  AND NOT EXISTS (SELECT 1 FROM starred_changes e
                  WHERE e.account_id = l.to_id
                    AND e.change_id = s.change_id);

INSERT INTO account_project_watches
(account_id, project_name)
SELECT l.to_id, s.project_name
FROM links l, account_project_watches s
WHERE l.from_id IS NOT NULL
  AND l.to_id IS NOT NULL
  AND s.account_id = l.from_id
  AND NOT EXISTS (SELECT 1 FROM account_project_watches e
                  WHERE e.account_id = l.to_id
                    AND e.project_name = s.project_name);

INSERT INTO account_group_members
(account_id, group_id)
SELECT l.to_id, s.group_id
FROM links l, account_group_members s
WHERE l.from_id IS NOT NULL
  AND l.to_id IS NOT NULL
  AND s.account_id = l.from_id
  AND NOT EXISTS (SELECT 1 FROM account_group_members e
                  WHERE e.account_id = l.to_id
                    AND e.group_id = s.group_id);

UPDATE changes
SET owner_account_id = (SELECT l.to_id
                        FROM links l
                        WHERE l.from_id = owner_account_id)
WHERE EXISTS (SELECT 1 FROM links l
              WHERE l.to_id IS NOT NULL
                AND l.from_id IS NOT NULL
                AND l.from_id = owner_account_id);

UPDATE patch_sets
SET uploader_account_id = (SELECT l.to_id
                           FROM links l
                           WHERE l.from_id = uploader_account_id)
WHERE EXISTS (SELECT 1 FROM links l
              WHERE l.to_id IS NOT NULL
                AND l.from_id IS NOT NULL
                AND l.from_id = uploader_account_id);

UPDATE patch_set_approvals
SET account_id = (SELECT l.to_id
                  FROM links l
                  WHERE l.from_id = account_id)
WHERE EXISTS (SELECT 1 FROM links l
              WHERE l.to_id IS NOT NULL
                AND l.from_id IS NOT NULL
                AND l.from_id = account_id)
  AND NOT EXISTS (SELECT 1 FROM patch_set_approvals e, links l
                  WHERE e.change_id = patch_set_approvals.change_id
                    AND e.patch_set_id = patch_set_approvals.patch_set_id
                    AND e.account_id = l.to_id
                    AND e.category_id = patch_set_approvals.category_id
                    AND l.from_id = patch_set_approvals.account_id);

UPDATE change_messages
SET author_id = (SELECT l.to_id
                 FROM links l
                 WHERE l.from_id = author_id)
WHERE EXISTS (SELECT 1 FROM links l
              WHERE l.to_id IS NOT NULL
                AND l.from_id IS NOT NULL
                AND l.from_id = author_id);

UPDATE patch_comments
SET author_id = (SELECT l.to_id
                 FROM links l
                 WHERE l.from_id = author_id)
WHERE EXISTS (SELECT 1 FROM links l
              WHERE l.to_id IS NOT NULL
                AND l.from_id IS NOT NULL
                AND l.from_id = author_id);


-- Destroy the from account
--
INSERT INTO to_del
SELECT from_id FROM links
WHERE to_id IS NOT NULL
  AND from_id IS NOT NULL;

DELETE FROM account_agreements WHERE account_id IN (SELECT old_id FROM to_del);
DELETE FROM account_external_ids WHERE account_id IN (SELECT old_id FROM to_del);
DELETE FROM account_group_members WHERE account_id IN (SELECT old_id FROM to_del);
DELETE FROM account_project_watches WHERE account_id IN (SELECT old_id FROM to_del);
DELETE FROM account_ssh_keys WHERE account_id IN (SELECT old_id FROM to_del);
DELETE FROM accounts WHERE account_id IN (SELECT old_id FROM to_del);
DELETE FROM starred_changes WHERE account_id IN (SELECT old_id FROM to_del);
DELETE FROM patch_set_approvals WHERE account_id IN (SELECT old_id FROM to_del);
```
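
Before the final COMMIT, a couple of sanity-check queries can confirm nothing
still points at the merged-away accounts. These are a sketch, run in the same
session (links and to\_del are temp tables and vanish with it); both should
return zero rows:

```
SELECT * FROM account_external_ids
WHERE account_id IN (SELECT old_id FROM to_del);

SELECT * FROM changes
WHERE owner_account_id IN (SELECT old_id FROM to_del);
```

If either returns rows, ROLLBACK and inspect the links table before retrying.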