Introduce a PriorityQueue sorting RevCommits by commit timestamp

The DateRevQueue uses a tailor-made algorithm to keep
RevCommits sorted in reverse commit timestamp order. That algorithm has
O(n*n/2) complexity, which caused Git fetch times to explode to
tens of seconds.

The standard Java PriorityQueue provides O(n*log(n)) complexity
and scales much better as the number of RevCommits grows.

Introduce DateRevPriorityQueue, a new implementation of the DateRevQueue
based on PriorityQueue.
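
The ordering idea can be sketched with plain JDK classes. This is only an illustration, not the code of this change: the `Commit` class is a hypothetical stand-in for RevCommit that carries just an id and a commit time, and `drainNewestFirst` is an illustrative helper name.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class DateOrderedQueueSketch {
    /** Hypothetical stand-in for RevCommit: an id plus a commit time in seconds. */
    static final class Commit {
        final String id;
        final int commitTime;

        Commit(String id, int commitTime) {
            this.id = id;
            this.commitTime = commitTime;
        }
    }

    /** Orders commits by commit time, newest first. */
    static final Comparator<Commit> NEWEST_FIRST =
            Comparator.comparingInt((Commit c) -> c.commitTime).reversed();

    /** Adds all commits to a heap-backed queue and returns their ids newest first. */
    static List<String> drainNewestFirst(List<Commit> commits) {
        PriorityQueue<Commit> queue = new PriorityQueue<>(NEWEST_FIRST);
        queue.addAll(commits); // each insertion costs O(log n)
        List<String> order = new ArrayList<>();
        Commit next;
        while ((next = queue.poll()) != null) { // each removal costs O(log n)
            order.add(next.id);
        }
        return order;
    }

    public static void main(String[] args) {
        List<Commit> commits = List.of(
                new Commit("a", 100),
                new Commit("b", 300),
                new Commit("c", 200));
        System.out.println(drainNewestFirst(commits)); // prints [b, c, a]
    }
}
```

Because the heap only restores order along one path per insertion or removal, no step has to scan the commits already queued.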

Enable the new DateRevPriorityQueue implementation by setting
the system property REVWALK_USE_PRIORITY_QUEUE=true. By default, the old
DateRevQueue implementation is used.
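
A minimal sketch of how such a switch could be evaluated, assuming the flag is read as a plain JVM system property. The class and method names are hypothetical, and the way JGit itself reads the property is not shown in this message.

```java
public class QueueSelectionSketch {
    /** Property name taken from the commit message. */
    static final String USE_PRIORITY_QUEUE_PROPERTY = "REVWALK_USE_PRIORITY_QUEUE";

    /** Returns true only when the property is set to "true" (case-insensitive). */
    static boolean usePriorityQueue() {
        return Boolean.getBoolean(USE_PRIORITY_QUEUE_PROPERTY);
    }

    /** Names the implementation that would be chosen; the old one is the default. */
    static String selectedImplementation() {
        return usePriorityQueue() ? "DateRevPriorityQueue" : "DateRevQueue";
    }

    public static void main(String[] args) {
        System.out.println(selectedImplementation());
    }
}
```

Under that assumption, starting the JVM with `-DREVWALK_USE_PRIORITY_QUEUE=true` selects the new implementation, while omitting the flag keeps the old one.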

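Benchmark results (average time per operation, lower is better). The old
implementation is faster up to 500 commits. From 1000 commits on the
priority queue is faster, by a factor of about 75 at 500000 commits: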
```
(numCommits)  (usePriorityQueue)  Mode  Cnt   Score   Error  Units
           5                true  avgt   10    39.4 ±   6.1  ns/op
           5               false  avgt   10    14.1 ±   2.2  ns/op
          10                true  avgt   10    29.7 ±   3.5  ns/op
          10               false  avgt   10    13.2 ±   2.0  ns/op
          50                true  avgt   10    50.4 ±   5.3  ns/op
          50               false  avgt   10    18.6 ±   0.2  ns/op
         100                true  avgt   10    58.3 ±   5.0  ns/op
         100               false  avgt   10    20.5 ±   0.8  ns/op
         500                true  avgt   10    51.7 ±   2.6  ns/op
         500               false  avgt   10    43.3 ±   0.5  ns/op
        1000                true  avgt   10    49.2 ±   2.4  ns/op
        1000               false  avgt   10    62.7 ±   0.2  ns/op
        5000                true  avgt   10    48.8 ±   1.5  ns/op
        5000               false  avgt   10   228.3 ±   0.5  ns/op
       10000                true  avgt   10    44.2 ±   0.9  ns/op
       10000               false  avgt   10   377.6 ±   2.7  ns/op
       50000                true  avgt   10    50.3 ±   1.6  ns/op
       50000               false  avgt   10   637.0 ± 111.8  ns/op
      100000                true  avgt   10    61.8 ±   4.4  ns/op
      100000               false  avgt   10   965.1 ± 268.0  ns/op
      500000                true  avgt   10   127.2 ±   7.9  ns/op
      500000               false  avgt   10  9610.2 ± 184.8  ns/op
```

Memory allocation results (the priority queue implementation allocates
about 10% less):
```
Number of commits loaded: 850 000
Custom implementation: 378 245 120 Bytes
Priority queue implementation: 340 495 616 Bytes
```

Bug: 580137
Change-Id: I8b33df6e9ee88933098ecc81ce32bdb189715041
Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com>
10 files changed
tree: ce76680b90ddd2fa117bff7974c3009d4e118719
  1. .mvn/
  2. .settings/
  3. Documentation/
  4. lib/
  5. org.eclipse.jgit/
  6. org.eclipse.jgit.ant/
  7. org.eclipse.jgit.ant.test/
  8. org.eclipse.jgit.archive/
  9. org.eclipse.jgit.benchmarks/
  10. org.eclipse.jgit.coverage/
  11. org.eclipse.jgit.gpg.bc/
  12. org.eclipse.jgit.gpg.bc.test/
  13. org.eclipse.jgit.http.apache/
  14. org.eclipse.jgit.http.server/
  15. org.eclipse.jgit.http.test/
  16. org.eclipse.jgit.junit/
  17. org.eclipse.jgit.junit.http/
  18. org.eclipse.jgit.junit.ssh/
  19. org.eclipse.jgit.lfs/
  20. org.eclipse.jgit.lfs.server/
  21. org.eclipse.jgit.lfs.server.test/
  22. org.eclipse.jgit.lfs.test/
  23. org.eclipse.jgit.packaging/
  24. org.eclipse.jgit.pgm/
  25. org.eclipse.jgit.pgm.test/
  26. org.eclipse.jgit.ssh.apache/
  27. org.eclipse.jgit.ssh.apache.agent/
  28. org.eclipse.jgit.ssh.apache.test/
  29. org.eclipse.jgit.ssh.jsch/
  30. org.eclipse.jgit.ssh.jsch.test/
  31. org.eclipse.jgit.test/
  32. org.eclipse.jgit.ui/
  33. tools/
  34. .bazelrc
  35. .bazelversion
  36. .gitattributes
  37. .gitignore
  38. .mailmap
  39. BUILD
  40. CODE_OF_CONDUCT.md
  41. CONTRIBUTING.md
  42. LICENSE
  43. pom.xml
  44. README.md
  45. SECURITY.md
  46. WORKSPACE
README.md

Java Git

An implementation of the Git version control system in pure Java.

This project is licensed under the EDL (Eclipse Distribution License).

JGit can be imported straight into Eclipse and built and tested from there. It can be built from the command line using Maven or Bazel. The CI builds use Maven and run on Jenkins.

  • org.eclipse.jgit

    A pure Java library capable of being run standalone, with no additional support libraries. It provides classes to read and write a Git repository and operate on a working directory.

    All portions of JGit are covered by the EDL. Absolutely no GPL, LGPL or EPL contributions are accepted within this package.

  • org.eclipse.jgit.ant

    Ant tasks based on JGit.

  • org.eclipse.jgit.archive

    Support for exporting to various archive formats (zip etc).

  • org.eclipse.jgit.http.apache

    Apache httpclient support.

  • org.eclipse.jgit.http.server

    Server for the smart and dumb Git HTTP protocol.

  • org.eclipse.jgit.lfs

    Support for LFS (Large File Storage).

  • org.eclipse.jgit.lfs.server

    Basic LFS server support.

  • org.eclipse.jgit.packaging

    Production of Eclipse features and p2 repository for JGit. See the JGit Wiki on why and how to use this module.

  • org.eclipse.jgit.pgm

    Command-line interface Git commands implemented using JGit (“pgm” stands for program).

  • org.eclipse.jgit.ssh.apache

    Client support for the SSH protocol based on Apache Mina sshd.

  • org.eclipse.jgit.ssh.apache.agent

    Optional support for SSH agents for org.eclipse.jgit.ssh.apache.

  • org.eclipse.jgit.ui

    Simple UI for displaying git log.

Tests

  • org.eclipse.jgit.junit, org.eclipse.jgit.junit.http, org.eclipse.jgit.junit.ssh: Helpers for unit testing
  • org.eclipse.jgit.ant.test: Unit tests for org.eclipse.jgit.ant
  • org.eclipse.jgit.http.test: Unit tests for org.eclipse.jgit.http.server
  • org.eclipse.jgit.lfs.server.test: Unit tests for org.eclipse.jgit.lfs.server
  • org.eclipse.jgit.lfs.test: Unit tests for org.eclipse.jgit.lfs
  • org.eclipse.jgit.pgm.test: Unit tests for org.eclipse.jgit.pgm
  • org.eclipse.jgit.ssh.apache.test: Unit tests for org.eclipse.jgit.ssh.apache
  • org.eclipse.jgit.test: Unit tests for org.eclipse.jgit

Warnings/Caveats

  • Native symbolic links are supported, provided the file system supports them. For Windows you must use a non-administrator account and have the SeCreateSymbolicLinkPrivilege.

  • Only the timestamp of the index is used by JGit if the index is dirty.

  • JGit 6.0 and newer requires at least Java 11. Older versions require at least Java 1.8.

  • CRLF conversion is performed depending on the core.autocrlf setting. However, Git for Windows by default stores that setting during installation in the “system wide” configuration file. If Git is not installed, use the global or repository configuration for the core.autocrlf setting.

  • The system wide configuration file is located relative to where C Git is installed. Make sure Git can be found via the PATH environment variable. When installing Git for Windows check the “Run Git from the Windows Command Prompt” option. There are other options like Eclipse settings that can be used for pointing out where C Git is installed. Modifying PATH is the recommended option if C Git is installed.

  • We try to use the same notation of $HOME as C Git does. On Windows this is often not the same value as the user.home system property.

Features

  • org.eclipse.jgit

    • Read loose and packed commits, trees, blobs, including deltafied objects.

    • Read objects from shared repositories.

    • Write loose commits, trees, blobs.

    • Write blobs from local files or Java InputStreams.

    • Read blobs as Java InputStreams.

    • Copy trees to local directory, or local directory to a tree.

    • Lazily loads objects as necessary.

    • Read and write .git/config files.

    • Create a new repository.

    • Read and write refs, including walking through symrefs.

    • Read, update and write the Git index.

    • Checkout in dirty working directory if trivial.

    • Walk the history from a given set of commits looking for commits introducing changes in files under a specified path.

    • Object transport

      Fetch via ssh, git, http, Amazon S3 and bundles. Push via ssh, git, http, and Amazon S3. JGit does not yet deltify the pushed packs so they may be a lot larger than C Git packs.

    • Garbage collection

    • Merge

    • Rebase

    • And much more

  • org.eclipse.jgit.pgm

    • Assorted set of command line utilities. Mostly for ad-hoc testing of jgit log, glog, fetch etc.
  • org.eclipse.jgit.ant

    • Ant tasks
  • org.eclipse.jgit.archive

    • Support for Zip/Tar and other formats
  • org.eclipse.jgit.http.apache, org.eclipse.jgit.http.server

    • HTTP client and server support

Missing Features

There are some missing features:

  • signing push
  • shallow and partial cloning
  • support for remote helpers
  • support for credential helpers
  • support for multiple working trees (git-worktree)
  • using external diff tools
  • support for HTTPS client certificates
  • SHA-256 object IDs
  • git protocol V2 (client side): packfile-uris
  • multi-pack index
  • split index

Support

Post questions, comments or discussions to the jgit-dev@eclipse.org mailing list. You need to be subscribed to post. File bugs and enhancement requests in Bugzilla.

Contributing

See the EGit Contributor Guide.

About Git

More information about Git, its repository format, and the canonical C based implementation can be obtained from the Git website.