SHA-1: collision detection support

Update SHA1 class to include a Java port of sha1dc[1]'s ubc_check,
which can detect the attack pattern used by the SHAttered[2] authors.

Given the SHAttered example files, which share the same SHA-1, this
modified implementation can identify that there is a risk of collision
given only one file of the pair:

  $ jgit ...
  [main] WARN org.eclipse.jgit.util.sha1.SHA1 - SHA-1 collision 38762cf7f55934b34d179ae6a4c80cadccbb7f0a

When JGit detects a probable collision, the SHA1 class now logs a
warning that reports the object's SHA-1 hash, and then throws a
Sha1CollisionException to the caller.

According to the paper[3] by Marc Stevens, the probability of a false
positive identification of a collision is about 14 * 2^(-160), low
enough that any detected collision is very likely a real collision.
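
To give a feel for the size of that bound, the arithmetic can be
sketched as follows. The class and method names are illustrative only
and are not part of JGit; the figure works out to roughly 9.6e-48, or
about 2^(-156.2).

```java
// Illustrative only: evaluates the 14 * 2^(-160) bound quoted above.
final class FalsePositiveBound {
    /** Returns 14 * 2^-160 as a double. */
    static double falsePositiveProbability() {
        // Math.scalb(x, n) computes x * 2^n; the result is well within double range.
        return Math.scalb(14.0, -160);
    }

    public static void main(String[] args) {
        double p = falsePositiveProbability();
        System.out.printf("false positive probability ~ %.3e%n", p);
        System.out.printf("as a power of two ~ 2^%.2f%n", Math.log(p) / Math.log(2));
    }
}
```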

git-core[4] may adopt sha1dc before the system migrates to an entirely
new hash function.  This commit enables JGit to remain compatible with
that move to sha1dc, and helps protect users by warning if attacks
similar to SHAttered are identified.

With detection off, throughput declined by about 8%.  Current results:

  MessageDigest        238.41 MiB/s
  MessageDigest        244.52 MiB/s
  MessageDigest        244.06 MiB/s
  MessageDigest        242.58 MiB/s

  SHA1                 216.77 MiB/s (was ~240.83 MiB/s)
  SHA1                 220.98 MiB/s
  SHA1                 221.76 MiB/s
  SHA1                 221.34 MiB/s

This decline in throughput is attributed to the step loop unrolling in
compress(), which was necessary to easily fit the UbcCheck logic into
the hash function.  Using helper functions s1-s4 reduces the code
explosion, providing acceptable throughput.
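
The idea of per-round step helpers can be sketched as below. This is
not the JGit code: only the names s1-s4 come from this message, while
the bodies are the standard SHA-1 round functions and constants. The
class Sha1Sketch and its other methods are hypothetical, the steps are
left in a loop rather than unrolled, and no UbcCheck logic is included.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch: plain SHA-1 with one helper per group of 20 steps.
final class Sha1Sketch {
    static int s1(int a, int b, int c, int d, int w) { // steps 0..19 (choose)
        return Integer.rotateLeft(a, 5) + ((b & c) | (~b & d)) + 0x5A827999 + w;
    }
    static int s2(int a, int b, int c, int d, int w) { // steps 20..39 (parity)
        return Integer.rotateLeft(a, 5) + (b ^ c ^ d) + 0x6ED9EBA1 + w;
    }
    static int s3(int a, int b, int c, int d, int w) { // steps 40..59 (majority)
        return Integer.rotateLeft(a, 5) + ((b & c) | (b & d) | (c & d)) + 0x8F1BBCDC + w;
    }
    static int s4(int a, int b, int c, int d, int w) { // steps 60..79 (parity)
        return Integer.rotateLeft(a, 5) + (b ^ c ^ d) + 0xCA62C1D6 + w;
    }

    /** Processes one 64-byte block starting at off, updating the state h. */
    static void compress(int[] h, byte[] block, int off) {
        int[] w = new int[80];
        for (int t = 0; t < 16; t++) { // load 16 big-endian words
            int i = off + 4 * t;
            w[t] = (block[i] & 0xff) << 24 | (block[i + 1] & 0xff) << 16
                    | (block[i + 2] & 0xff) << 8 | (block[i + 3] & 0xff);
        }
        for (int t = 16; t < 80; t++) { // message expansion
            w[t] = Integer.rotateLeft(w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16], 1);
        }
        int a = h[0], b = h[1], c = h[2], d = h[3], e = h[4];
        for (int t = 0; t < 80; t++) {
            int s = t < 20 ? s1(a, b, c, d, w[t]) : t < 40 ? s2(a, b, c, d, w[t])
                    : t < 60 ? s3(a, b, c, d, w[t]) : s4(a, b, c, d, w[t]);
            int tmp = e + s;
            e = d; d = c; c = Integer.rotateLeft(b, 30); b = a; a = tmp;
        }
        h[0] += a; h[1] += b; h[2] += c; h[3] += d; h[4] += e;
    }

    /** Pads the message, runs compress over every block, returns 20 bytes. */
    static byte[] digest(byte[] msg) {
        int padded = ((msg.length + 8) / 64 + 1) * 64; // room for 0x80 and 64-bit length
        byte[] buf = Arrays.copyOf(msg, padded);
        buf[msg.length] = (byte) 0x80;
        long bits = (long) msg.length * 8;
        for (int i = 0; i < 8; i++) {
            buf[padded - 1 - i] = (byte) (bits >>> (8 * i));
        }
        int[] h = {0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0};
        for (int off = 0; off < padded; off += 64) {
            compress(h, buf, off);
        }
        byte[] out = new byte[20];
        for (int i = 0; i < 5; i++) {
            out[4 * i] = (byte) (h[i] >>> 24);
            out[4 * i + 1] = (byte) (h[i] >>> 16);
            out[4 * i + 2] = (byte) (h[i] >>> 8);
            out[4 * i + 3] = (byte) h[i];
        }
        return out;
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte x : bytes) {
            sb.append(String.format("%02x", x & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(hex(digest("abc".getBytes(StandardCharsets.UTF_8))));
    }
}
```

Keeping each round's arithmetic in a small static helper means an
unrolled compress() would repeat one short call per step instead of
the full expression, which is presumably the code-size saving the
paragraph above refers to.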

With detection enabled (default):

  SHA1 detectCollision 180.12 MiB/s
  SHA1 detectCollision 181.59 MiB/s
  SHA1 detectCollision 181.64 MiB/s
  SHA1 detectCollision 182.24 MiB/s

  sha1dc (native C)   ~206.28 MiB/s
  sha1dc (native C)   ~204.47 MiB/s
  sha1dc (native C)   ~203.74 MiB/s

Average time across 100,000 calls to hash 4100 bytes (such as a commit
or tree) for the various algorithms available to JGit also shows SHA1
is slower than MessageDigest, but by an acceptable margin:

  MessageDigest        17 usec
  SHA1                 18 usec
  SHA1 detectCollision 22 usec
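
A measurement of this kind could be sketched roughly as follows for
the MessageDigest case. The harness is an assumption, not the
benchmark behind the numbers above: the HashTiming class and its
method are hypothetical, and there is no JIT warm-up phase, so
absolute results will differ.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical timing harness; not the benchmark used for the figures above.
final class HashTiming {
    /** Average microseconds per call when hashing size bytes, over the given number of calls. */
    static double averageMicros(int calls, int size) {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance("SHA-1");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
        byte[] buf = new byte[size];
        for (int i = 0; i < size; i++) {
            buf[i] = (byte) i; // deterministic filler data
        }
        long start = System.nanoTime();
        for (int i = 0; i < calls; i++) {
            md.update(buf);
            md.digest(); // digest() also resets the instance for the next call
        }
        long elapsedNanos = System.nanoTime() - start;
        return elapsedNanos / 1000.0 / calls;
    }

    public static void main(String[] args) {
        System.out.printf("MessageDigest %.1f usec%n", averageMicros(100_000, 4100));
    }
}
```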

Time to index-pack for git.git (217982 objects, 69 MiB) has increased:

  MessageDigest   SHA1 w/ detectCollision
  -------------   -----------------------
         20.12s   25.25s
         19.87s   25.48s
         20.04s   25.26s

    avg  20.01s   25.33s    +26%
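
The averages and the relative increase can be rechecked from the three
runs listed above. The helper class is illustrative only; computed
from the averages, the increase comes out at roughly 26.6%, consistent
with the figure in the table.

```java
// Illustrative only: recomputes the averages and relative slowdown from the table.
final class IndexPackOverhead {
    static double mean(double... xs) {
        double sum = 0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    static double percentIncrease(double base, double now) {
        return (now - base) / base * 100.0;
    }

    public static void main(String[] args) {
        double md = mean(20.12, 19.87, 20.04);   // MessageDigest runs, seconds
        double sha1 = mean(25.25, 25.48, 25.26); // SHA1 w/ detectCollision runs, seconds
        System.out.printf("avg %.2fs vs %.2fs, +%.1f%%%n", md, sha1, percentIncrease(md, sha1));
    }
}
```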

Being implemented in Java with these additional safety checks is
clearly a penalty, but throughput is still acceptable given the
increased security against object name collisions.

[1] https://github.com/cr-marcstevens/sha1collisiondetection
[2] https://shattered.it/
[3] https://marc-stevens.nl/research/papers/C13-S.pdf
[4] https://public-inbox.org/git/20170223230621.43anex65ndoqbgnf@sigill.intra.peff.net/

Change-Id: I9fe4c6d8fc5e5a661af72cd3246c9e67b1b9fee6
11 files changed
tree: a2c7da9b4d104e4fcfc910f66d8452772d150c17
README.md

Java Git

An implementation of the Git version control system in pure Java.

This package is licensed under the EDL (Eclipse Distribution License).

JGit can be imported straight into Eclipse, built and tested from there, but the automated builds use Maven.

  • org.eclipse.jgit

    A pure Java library capable of being run standalone, with no additional support libraries. It provides classes to read and write a Git repository and operate on a working directory.

    All portions of JGit are covered by the EDL. Absolutely no GPL, LGPL or EPL contributions are accepted within this package.

  • org.eclipse.jgit.ant

    Ant tasks based on JGit.

  • org.eclipse.jgit.archive

    Support for exporting to various archive formats (zip etc).

  • org.eclipse.jgit.http.apache

    Apache httpclient support

  • org.eclipse.jgit.http.server

    Server for the smart and dumb Git HTTP protocol.

  • org.eclipse.jgit.pgm

    Command-line interface Git commands implemented using JGit (“pgm” stands for program).

  • org.eclipse.jgit.packaging

    Production of Eclipse features and p2 repository for JGit. See the JGit Wiki on why and how to use this module.

Tests

  • org.eclipse.jgit.junit

    Helpers for unit testing

  • org.eclipse.jgit.test

    Unit tests for org.eclipse.jgit

  • org.eclipse.jgit.ant.test

  • org.eclipse.jgit.pgm.test

  • org.eclipse.jgit.http.test

  • org.eclipse.jgit.junit.test

    No further description needed

Warnings/Caveats

  • Native symbolic links are supported, provided the file system supports them. For Windows you must have Windows Vista/Windows 2008 or newer, use a non-administrator account and have the SeCreateSymbolicLinkPrivilege.

  • Only the timestamp of the index is used by jgit if the index is dirty.

  • JGit requires at least a Java 8 JDK.

  • CRLF conversion is performed depending on the core.autocrlf setting. However, Git for Windows by default stores that setting during installation in the “system wide” configuration file. If Git is not installed, use the global or repository configuration for the core.autocrlf setting.

  • The system wide configuration file is located relative to where C Git is installed. Make sure Git can be found via the PATH environment variable. When installing Git for Windows check the “Run Git from the Windows Command Prompt” option. There are other options like Eclipse settings that can be used for pointing out where C Git is installed. Modifying PATH is the recommended option if C Git is installed.

  • We try to use the same notation of $HOME as C Git does. On Windows this is often not the same value as the user.home system property.

Package Features

  • org.eclipse.jgit/

    • Read loose and packed commits, trees, blobs, including deltafied objects.

    • Read objects from shared repositories

    • Write loose commits, trees, blobs.

    • Write blobs from local files or Java InputStreams.

    • Read blobs as Java InputStreams.

    • Copy trees to local directory, or local directory to a tree.

    • Lazily loads objects as necessary.

    • Read and write .git/config files.

    • Create a new repository.

    • Read and write refs, including walking through symrefs.

    • Read, update and write the Git index.

    • Checkout in dirty working directory if trivial.

    • Walk the history from a given set of commits looking for commits introducing changes in files under a specified path.

    • Object transport: fetch via ssh, git, http, Amazon S3 and bundles; push via ssh, git and Amazon S3. JGit does not yet deltify the pushed packs, so they may be a lot larger than C Git packs.

    • Garbage collection

    • Merge

    • Rebase

    • And much more

  • org.eclipse.jgit.pgm/

    • An assorted set of command line utilities, mostly for ad-hoc testing of jgit: log, glog, fetch, etc.

  • org.eclipse.jgit.ant/

    • Ant tasks

  • org.eclipse.jgit.archive/

    • Support for Zip/Tar and other formats

  • org.eclipse.jgit.http.*/

    • HTTP client and server support

Missing Features

There are some missing features:

  • gitattributes support

Support

Post questions, comments, or patches to the jgit-dev@eclipse.org mailing list. You need to be subscribed to post; see here:

https://dev.eclipse.org/mailman/listinfo/jgit-dev

Contributing

See the EGit Contributor Guide:

http://wiki.eclipse.org/EGit/Contributor_Guide

About Git

More information about Git, its repository format, and the canonical C based implementation can be obtained from the Git website:

http://git-scm.com/