commit | 00fc15ac0073b86270e7c0f40d386f95dfe31e86 | |
---|---|---|
author | Hugo Arès <hugo.ares@ericsson.com> | Mon Nov 27 13:59:19 2017 -0500 |
committer | Hugo Arès <hugares@gmail.com> | Thu Feb 22 18:53:47 2018 +0000 |
tree | b6fe2e84e780a83542c4fa35749b127ec2a81549 | |
parent | 0f2faafdda92c1fe70f18cb112772d160b27aea2 | |
Reduce chance of deadlock in account cache

This deadlock is not the typical deadlock where 2 threads each locked a resource and each one is waiting to lock a second resource already locked by the other thread.

The thread owning the account cache lock is parked, which tells us that the lock was not released. I could not determine the exact sequence of events leading to this deadlock, which makes it really hard to report or fix the problem.

While investigating, I realized that there are quite a few reported issues in Guava that could be major for Gerrit:

Our deadlock happening in account cache
https://bugs.chromium.org/p/gerrit/issues/detail?id=7645

Other deadlock
https://github.com/google/guava/issues/2976
https://github.com/google/guava/issues/2863

Performance
https://github.com/google/guava/issues/2063

Race condition
https://github.com/google/guava/issues/2971

Because I could not reproduce the deadlock in a dev environment or in a unit test, which makes it almost impossible to fix, I considered other options such as replacing Guava with something else.

The maintainer of the Caffeine[1] cache claims that Caffeine is a high performance[2], near optimal caching library designed and implemented based on the experience of designing Guava's cache and ConcurrentLinkedHashMap.

I also ran some benchmarks that spawn a lot of threads reading and writing values from the caches. I ran those benchmarks on both Guava and Caffeine, and Guava always took at least twice as long as Caffeine to complete all operations.

Migrating to Caffeine is almost a drop-in replacement. Caffeine's interfaces are very similar to Guava's cache interfaces, and there is an adapter to migrate to Caffeine and keep using Guava's interfaces.

After migrating to Caffeine, we saw that the deadlock occurrence was reduced from once a day to once every 2 weeks on our production server.

The maintainer of Caffeine, Ben Manes, pointed us to the possible cause[3] of this issue: a bug[4] in the kernel, and its fix[5]. Our kernel version is supposed to have the fix, but we will try another OS and kernel version.

Replacing Guava caches with Caffeine brings 2 things: it reduces the chance of hitting the deadlock, which is most likely caused by a kernel bug, and it improves cache performance.

[1]https://github.com/ben-manes/caffeine
[2]https://github.com/ben-manes/caffeine/wiki/Benchmarks
[3]https://groups.google.com/forum/#!topic/mechanical-sympathy/QbmpZxp6C64
[4]https://github.com/torvalds/linux/commit/b0c29f79ecea0b6fbcefc999e70f2843ae8306db
[5]https://github.com/torvalds/linux/commit/76835b0ebf8a7fe85beb03c75121419a7dec52f0

Bug: Issue 7645
Change-Id: I8d2b17a94d0e9daf9fa9cdda563316c5768b29ae
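The commit message mentions benchmarks that spawn many threads reading and writing cache values, but does not include them. The following is only a minimal sketch of what such a harness could look like, not the author's actual benchmark. Every name in it (`CacheBenchmarkSketch`, `SimpleCache`, `mapBackedCache`, `run`) is hypothetical, the roughly 90% read / 10% write mix is an assumption, and a `ConcurrentHashMap` stands in for the cache so the example needs nothing beyond the JDK. To compare libraries, one would presumably implement `SimpleCache` once on top of a Guava cache and once on top of a Caffeine cache and time both.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;

public class CacheBenchmarkSketch {

  /** Hypothetical minimal cache abstraction so implementations can be swapped. */
  interface SimpleCache {
    Integer get(Integer key);
    void put(Integer key, Integer value);
  }

  /** Stand-in backed by ConcurrentHashMap (assumption: replace with a real cache). */
  static SimpleCache mapBackedCache() {
    Map<Integer, Integer> map = new ConcurrentHashMap<>();
    return new SimpleCache() {
      @Override public Integer get(Integer key) { return map.get(key); }
      @Override public void put(Integer key, Integer value) { map.put(key, value); }
    };
  }

  /** Runs the given number of workers doing mixed reads/writes; returns elapsed nanoseconds. */
  static long run(SimpleCache cache, int threads, int opsPerThread, int keySpace) {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    CountDownLatch start = new CountDownLatch(1);
    CountDownLatch done = new CountDownLatch(threads);
    for (int t = 0; t < threads; t++) {
      pool.execute(() -> {
        try {
          start.await(); // hold workers back until the timer has started
          ThreadLocalRandom rnd = ThreadLocalRandom.current();
          for (int i = 0; i < opsPerThread; i++) {
            int key = rnd.nextInt(keySpace);
            if (rnd.nextInt(10) == 0) {
              cache.put(key, i); // roughly 10% writes (assumed mix)
            } else {
              cache.get(key);    // roughly 90% reads (assumed mix)
            }
          }
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        } finally {
          done.countDown();
        }
      });
    }
    long begin = System.nanoTime();
    start.countDown();
    try {
      done.await(); // wait until every worker has finished
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("benchmark interrupted", e);
    } finally {
      pool.shutdown();
    }
    return System.nanoTime() - begin;
  }

  public static void main(String[] args) {
    long nanos = run(mapBackedCache(), 64, 100_000, 1_000);
    System.out.println("elapsed ms: " + TimeUnit.NANOSECONDS.toMillis(nanos));
  }
}
```

A single run like this says little on its own. A fair comparison would repeat the measurement several times after a warm-up phase, or use a dedicated harness such as JMH.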
Gerrit is a code review and project management tool for Git based projects.
Gerrit makes reviews easier by showing changes in a side-by-side display, and allowing inline comments to be added by any reviewer.
Gerrit simplifies Git based project maintainership by permitting any authorized user to submit changes to the master Git repository, rather than requiring all approved changes to be merged in by hand by the project maintainer.
For information about how to install and use Gerrit, refer to the documentation.
Our canonical Git repository is located on googlesource.com. There is a mirror of the repository on GitHub.
Please report bugs on the issue tracker.
Gerrit is the work of hundreds of contributors. We appreciate your help!
Please read the contribution guidelines.
Note that we do not accept Pull Requests via the GitHub mirror.
The IRC channel on freenode is #gerrit. An archive is available at: echelog.com.
The Developer Mailing list is repo-discuss on Google Groups.
Gerrit is provided under the Apache License 2.0.
Install Bazel and run the following:
git clone --recursive https://gerrit.googlesource.com/gerrit
cd gerrit && bazel build release
Instructions for configuring the GerritForge/BinTray repositories are here
On Debian/Ubuntu run:
apt-get update && apt-get install gerrit=<version>-<release>
NOTE: release is a counter that starts with 1 and indicates the number of packages that have been released with the same version of the software.
On CentOS/RedHat run:
yum clean all && yum install gerrit-<version>[-<release>]
On Fedora run:
dnf clean all && dnf install gerrit-<version>[-<release>]
Docker images of Gerrit are available on DockerHub
To run a CentOS 7 based Gerrit image:
docker run -p 8080:8080 gerritforge/gerrit-centos7[:version]
To run an Ubuntu 15.04 based Gerrit image:
docker run -p 8080:8080 gerritforge/gerrit-ubuntu15.04[:version]
NOTE: release is optional where it is shown in brackets. If the release number is omitted, the last released package of that version is installed.