commit | 7eff37c7b6f4649caf307d75004b337d139f157f | [log] [tgz] |
---|---|---|
author | Alice Kober-Sotzek <aliceks@google.com> | Thu Jul 05 15:58:50 2018 +0200 |
committer | Alice Kober-Sotzek <aliceks@google.com> | Wed Jul 11 17:26:42 2018 +0200 |
tree | 399d2e33f34939d902058e2056cbd65907d5283c | |
parent | 3fe0ac38d09c49e1984e4171262f201e401058cb [diff] |
Fix diff regarding newline at end of file Up to now, the Gerrit backend had been oblivious to newlines at the end of a file when asked for a diff. As users do want to know about added or deleted newlines at the end of a file, the GWT UI had special code which worked around those quirks of the backend. Instead of adding a similar workaround to PolyGerrit, we rather fix the behavior of the backend. Switching Gerrit's backend to consider newlines at the end isn't as simple as it may sound. The reason is that Gerrit's diff code heavily relies on JGit and JGit doesn't consider a newline at the end of a file as starting another line. This means that a file with 4 lines ending with newline characters is considered to be a 4 line file by JGit and also git-core. The idea seems to be that lines aren't *separated* but *terminated*. Probably because of this reason, JGit limits all of its code to that smaller assumed number of lines. For instance, it's not possible to query for line 5 of the above example. Instead of returning an empty string or something similar, JGit throws an exception. Another consequence is that edits returned from a JGit diff computation for an added/deleted newline at the end are 'cut off'. In the example above, deleting the newline in the last line (which is index 3 for 4 lines) results in a {3-4, 3-4} edit whereas deleting the newline in the line with index 1 results in a {1-3, 1-2} edit. I personally would have expected to get {3-5, 3-4} for the deleted newline at the end. Whether it's a bug or a feature of JGit isn't known and is probably difficult to change. As detailed above, all of this might just be the result of consistently following git-core's decision to treat newlines as terminators and not separators. The idea might also have been to map git-core's "No newline at end of file" statements, which git-core does print for various commands including "git diff", to an equivalent in JGit but this feature seems to be missing (or hidden very well) for JGit's diff computation. For Gerrit, we want a more user-friendly and obvious behavior, which also doesn't restrict users regarding their regular actions (e.g. adding comments on added/deleted newline characters at the end of a file). So far, user complaints especially focus on addition/deletion of those characters as the UI presentation in PolyGerrit is confusing in those situations. For a proper fix, we have to assess the whole representation on the UI, though. If we somehow only explicitly indicated added/deleted newlines at the end but didn't also reconsider to differently represent unmodified newlines at the end in general, users might be confused and think that those newlines were missing. For this reason, this change attempts to address the situation as a whole. Several options have been considered: 1) Represent a newline at the end with an empty string (= empty line content) in the diff output, similar to intermediary empty lines. 2) Return lines including their newline characters in the diff output. A missing newline at the end would be represented by a missing newline character in the last line. 3) Add flags to the diff output indicating whether the newline at the end is present/missing for each of the compared files. Option 1 has the benefit that it directly works in PolyGerrit and fixes the reported issue. Even possibly other existing UIs for Gerrit should most likely keep working as they already need to handle empty intermediary lines. Option 2 could possibly break existing UIs. It would definitely require further adjustments in PolyGerrit. In general, it would also require that each UI needs to know how to handle all possible types of line separators. Option 3 is the least risky one as it doesn't affect already existing fields of the diff output. On the other hand, it pushes the burden to the UI. Any UI of Gerrit would need special code for those flags and might get it wrong (e.g. not showing comments on such empty last lines). In addition, the flags feel like a crutch as Gerrit's API should rather directly do the right, least astonishing, and less error-prone thing. JGit uses the flag approach and after playing around with it as a user of that API as part of this change, I personally wouldn't want to force any user of Gerrit's API to cope with this variant. Given this consideration, option 1 was chosen as implementation for this change. Compared to before this change, the returned diff output now behaves differently in the following ways: - Added/Deleted newlines at the end of a file are indicated by an empty string in the modified lines of side b/a and a missing empty string on the other side. - If a file has a newline at the end which isn't modified, an empty string occurs as last entry of the common 'ab' lines. - The number of lines of the complete files indicated in the diff output counts a newline at the end as an additional line. The last mentioned aspect was necessary for everything to work correctly. It introduces a small discrepancy, though. Gerrit also has a REST endpoint which outputs some details about a file, including the number of added/deleted lines. For added/deleted files with newlines at the end, the number of added/deleted lines from that endpoint is one smaller than the number of lines of the complete file when requesting the diff output for those files. Fixing this would require some effort and make our code even more complex. It's questionable whether users will notice the discrepancy and hence we punt on this for now. Adding the empty string to the common lines has the benefit that users can clearly see on the diff screen whether a file as a newline at the end (= an empty line is present) or not (= the last line is filled). This behavior is consistent with other text editors and can be customized further in PolyGerrit if necessary. There's another positive side-effect of this fix: Comments which were added on the last empty line in the GWT UI or through the REST API now show up also in PolyGerrit. Users have complained about this, too. It also means that reviewers can directly comment on added/removed newlines at the end in PolyGerrit. To ensure that the diff REST endpoint is now working as expected regarding missing/present newlines at the end, a lot of new tests were added. When necessary, different whitespace settings were tested. Other tests rely on the default whitespace behavior (which is IGNORE_NONE) as covering all possible combinations would have resulted in an explosion of the number of tests. All in all, our test coverage for the diff REST endpoint has increased another step with this change. Bug: Issue 5486 Change-Id: I1e15767fc139e58cd1b9855e4e9d325027a16e31
Gerrit is a code review and project management tool for Git based projects.
Gerrit makes reviews easier by showing changes in a side-by-side display, and allowing inline comments to be added by any reviewer.
Gerrit simplifies Git based project maintainership by permitting any authorized user to submit changes to the master Git repository, rather than requiring all approved changes to be merged in by hand by the project maintainer.
For information about how to install and use Gerrit, refer to the documentation.
Our canonical Git repository is located on googlesource.com. There is a mirror of the repository on Github.
Please report bugs on the issue tracker.
Gerrit is the work of hundreds of contributors. We appreciate your help!
Please read the contribution guidelines.
Note that we do not accept Pull Requests via the Github mirror.
The IRC channel on freenode is #gerrit. An archive is available at: echelog.com.
The Developer Mailing list is repo-discuss on Google Groups.
Gerrit is provided under the Apache License 2.0.
Install Bazel and run the following:
git clone --recursive https://gerrit.googlesource.com/gerrit cd gerrit && bazel build release
The instruction how to configure GerritForge/BinTray repositories is here
On Debian/Ubuntu run:
apt-get update & apt-get install gerrit=<version>-<release>
NOTE: release is a counter that starts with 1 and indicates the number of packages that have been released with the same version of the software.
On CentOS/RedHat run:
yum clean all && yum install gerrit-<version>[-<release>]
On Fedora run:
dnf clean all && dnf install gerrit-<version>[-<release>]
Docker images of Gerrit are available on DockerHub
To run a CentOS 7 based Gerrit image:
docker run -p 8080:8080 gerritforge/gerrit-centos7[:version]
To run a Ubuntu 15.04 based Gerrit image:
docker run -p 8080:8080 gerritforge/gerrit-ubuntu15.04[:version]
NOTE: release is optional. Last released package of the version is installed if the release number is omitted.