Fix core.autocrlf for non-normalized index

With text=auto or core.autocrlf=true, git does not normalize upon
check-in if the file in the index contains already CR/LFs. The
documentation says: "When text is set to "auto", the path is
marked for automatic end-of-line conversion. If Git decides that
the content is text, its line endings are converted to LF on
checkin. When the file has been committed with CRLF, no conversion
is done."[1]

Implement the last bit as in canonical git: check the blob in the
index for CR/LFs. For very large files, we check only the first 8000
bytes, like RawText.isBinary() and AutoLFInputStream do.

In Auto(CR)LFInputStream, ensure that the buffer is filled as much as
possible for the isBinary() check.

Regarding these content checks, there are a number of inconsistencies:

* Canonical git considers files containing lone CRs as binary.
* RawText checks the first 8000 bytes.
* Auto(CR)LFInputStream checks the first 8096 (not 8192!) bytes.

None of these are changed with this commit. It appears that canonical
git will check the whole blob, not just the first 8k bytes. Also
note: the check for CR/LF text won't work with LFS (neither in JGit
nor in git) since the blob data is not run through the smudge filter.
C.f. [2].

Two tests in AddCommandTest actually tested that normalization was
done even if the file was already committed with CR/LF.These tests
had to be adapted. I find the git documentation unclear about the
case where core.autocrlf=input, but from [3] it looks as if this
non-normalization also applies in this case.

Add new tests in CommitCommandTest testing this for the case where
the index entry is for a merge conflict. In this case, canonical git
uses the "ours" version.[4] Do the same.

[1] https://git-scm.com/docs/gitattributes
[2] https://github.com/git/git/blob/3434569fc/convert.c#L225
[3] https://github.com/git/git/blob/3434569fc/convert.c#L529
[4] https://github.com/git/git/blob/f2b6aa98b/read-cache.c#L3281

Bug: 470643
Change-Id: Ie7310539fbe6c737d78b1dcc29e34735d4616b88
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
7 files changed
tree: 47e943891f755cdfd2c169056d14292949033a8a
  1. .mvn/
  2. Documentation/
  3. lib/
  4. org.eclipse.jgit/
  5. org.eclipse.jgit.ant/
  6. org.eclipse.jgit.ant.test/
  7. org.eclipse.jgit.archive/
  8. org.eclipse.jgit.coverage/
  9. org.eclipse.jgit.http.apache/
  10. org.eclipse.jgit.http.server/
  11. org.eclipse.jgit.http.test/
  12. org.eclipse.jgit.junit/
  13. org.eclipse.jgit.junit.http/
  14. org.eclipse.jgit.junit.ssh/
  15. org.eclipse.jgit.lfs/
  16. org.eclipse.jgit.lfs.server/
  17. org.eclipse.jgit.lfs.server.test/
  18. org.eclipse.jgit.lfs.test/
  19. org.eclipse.jgit.packaging/
  20. org.eclipse.jgit.pgm/
  21. org.eclipse.jgit.pgm.test/
  22. org.eclipse.jgit.ssh.apache/
  23. org.eclipse.jgit.ssh.apache.test/
  24. org.eclipse.jgit.test/
  25. org.eclipse.jgit.ui/
  26. tools/
  27. .gitattributes
  28. .gitignore
  29. .mailmap
  30. BUILD
  31. CONTRIBUTING.md
  32. LICENSE
  33. pom.xml
  34. README.md
  35. WORKSPACE
README.md

Java Git

An implementation of the Git version control system in pure Java.

This project is licensed under the EDL (Eclipse Distribution License).

JGit can be imported straight into Eclipse and built and tested from there. It can be built from the command line using Maven or Bazel. The CI builds use Maven and run on Jenkins.

  • org.eclipse.jgit

    A pure Java library capable of being run standalone, with no additional support libraries. It provides classes to read and write a Git repository and operate on a working directory.

    All portions of JGit are covered by the EDL. Absolutely no GPL, LGPL or EPL contributions are accepted within this package.

  • org.eclipse.jgit.ant

    Ant tasks based on JGit.

  • org.eclipse.jgit.archive

    Support for exporting to various archive formats (zip etc).

  • org.eclipse.jgit.http.apache

    Apache httpclient support.

  • org.eclipse.jgit.http.server

    Server for the smart and dumb Git HTTP protocol.

  • org.eclipse.jgit.lfs

    Support for LFS (Large File Storage).

  • org.eclipse.jgit.lfs.server

    Basic LFS server support.

  • org.eclipse.jgit.packaging

    Production of Eclipse features and p2 repository for JGit. See the JGit Wiki on why and how to use this module.

  • org.eclipse.jgit.pgm

    Command-line interface Git commands implemented using JGit (“pgm” stands for program).

  • org.eclipse.jgit.ssh.apache

    Client support for the ssh protocol based on Apache Mina sshd.

  • org.eclipse.jgit.ui

    Simple UI for displaying git log.

Tests

  • org.eclipse.jgit.junit, org.eclipse.jgit.junit.http, org.eclipse.jgit.junit.ssh: Helpers for unit testing
  • org.eclipse.jgit.ant.test: Unit tests for org.eclipse.jgit.ant
  • org.eclipse.jgit.http.test: Unit tests for org.eclipse.jgit.http.server
  • org.eclipse.jgit.lfs.server.test: Unit tests for org.eclipse.jgit.lfs.server
  • org.eclipse.jgit.lfs.test: Unit tests for org.eclipse.jgit.lfs
  • org.eclipse.jgit.pgm.test: Unit tests for org.eclipse.jgit.pgm
  • org.eclipse.jgit.ssh.apache.test: Unit tests for org.eclipse.jgit.ssh.apache
  • org.eclipse.jgit.test: Unit tests for org.eclipse.jgit

Warnings/Caveats

  • Native symbolic links are supported, provided the file system supports them. For Windows you must use a non-administrator account and have the SeCreateSymbolicLinkPrivilege.

  • Only the timestamp of the index is used by JGit if the index is dirty.

  • JGit requires at least a Java 8 JDK.

  • CRLF conversion is performed depending on the core.autocrlf setting, however Git for Windows by default stores that setting during installation in the “system wide” configuration file. If Git is not installed, use the global or repository configuration for the core.autocrlf setting.

  • The system wide configuration file is located relative to where C Git is installed. Make sure Git can be found via the PATH environment variable. When installing Git for Windows check the “Run Git from the Windows Command Prompt” option. There are other options like Eclipse settings that can be used for pointing out where C Git is installed. Modifying PATH is the recommended option if C Git is installed.

  • We try to use the same notation of $HOME as C Git does. On Windows this is often not the same value as the user.home system property.

Features

  • org.eclipse.jgit

    • Read loose and packed commits, trees, blobs, including deltafied objects.

    • Read objects from shared repositories

    • Write loose commits, trees, blobs.

    • Write blobs from local files or Java InputStreams.

    • Read blobs as Java InputStreams.

    • Copy trees to local directory, or local directory to a tree.

    • Lazily loads objects as necessary.

    • Read and write .git/config files.

    • Create a new repository.

    • Read and write refs, including walking through symrefs.

    • Read, update and write the Git index.

    • Checkout in dirty working directory if trivial.

    • Walk the history from a given set of commits looking for commits introducing changes in files under a specified path.

    • Object transport

      Fetch via ssh, git, http, Amazon S3 and bundles. Push via ssh, git and Amazon S3. JGit does not yet deltify the pushed packs so they may be a lot larger than C Git packs.

    • Garbage collection

    • Merge

    • Rebase

    • And much more

  • org.eclipse.jgit.pgm

    • Assorted set of command line utilities. Mostly for ad-hoc testing of jgit log, glog, fetch etc.
  • org.eclipse.jgit.ant

    • Ant tasks
  • org.eclipse.jgit.archive

    • Support for Zip/Tar and other formats
  • org.eclipse.http

    • HTTP client and server support

Missing Features

There are some missing features:

  • verifying signed commits
  • signing tags
  • signing push

Support

Post questions, comments or discussions to the jgit-dev@eclipse.org mailing list. You need to be subscribed to post. File bugs and enhancement requests in Bugzilla.

Contributing

See the EGit Contributor Guide.

About Git

More information about Git, its repository format, and the canonical C based implementation can be obtained from the Git website.