Teach SafeHtml.linkify() to ignore trailing ">"

Because linkify() runs on the HTML-escaped text, a string such as
"<http://foo>" appears to our regex as "&lt;http://foo&gt;".  Since
every character of "&gt;" is a valid URL character, we were pulling
the trailing "&gt;" into the URL, when our intent was to leave it
out.

We now end the URL match at "&lt;" or "&gt;".  The browser decodes
these entities as "<" and ">", which are not considered to be part
of the URL.  Any other use of "&", such as "&amp;" in a query
string, is still accepted.
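
The effect can be reproduced outside of GWT with a minimal sketch.
It assumes java.util.regex behaves like the browser regex engine for
these patterns; the LinkifySketch class, its helpers, and the
OLD_PART and NEW_PART names are hypothetical and not part of this
change.  Only the character classes and the replacement string are
taken from the patch.

```java
class LinkifySketch {
  // Old building block: "&" is accepted anywhere in the URL.
  static final String OLD_PART = "[a-zA-Z0-9$_.+!*',%;:@&=?#/-]";

  // New building block: "&" is accepted only when it does not start
  // the escaped forms "&lt;" or "&gt;" (negative lookahead).
  static final String NEW_PART =
      "(?:[a-zA-Z0-9$_.+!*',%;:@=?#/-]|&(?!lt;|gt;))";

  // Assemble the full URL pattern from a building block, using the
  // same shape as the pattern in linkify().
  static String pattern(String part) {
    return "(https?://"
        + part + "{2,}"
        + "(?:[(]" + part + "*" + "[)])*"
        + part + "*"
        + ")";
  }

  // Stand-in for SafeHtml.replaceAll(): operates on already escaped text.
  static String linkify(String escapedHtml, String part) {
    return escapedHtml.replaceAll(pattern(part), "<a href=\"$1\">$1</a>");
  }

  public static void main(String[] args) {
    String in = "&lt;http://foo&gt;";

    // Old behavior: the trailing "&gt;" is swallowed into the link.
    // prints: &lt;<a href="http://foo&gt;">http://foo&gt;</a>
    System.out.println(linkify(in, OLD_PART));

    // New behavior: the match ends before "&gt;".
    // prints: &lt;<a href="http://foo">http://foo</a>&gt;
    System.out.println(linkify(in, NEW_PART));
  }
}
```

An escaped ampersand inside a query string, such as
"http://foo/?a=1&amp;b=2", is still matched in full by the new
building block, because the lookahead rejects only "lt;" and "gt;".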

Bug: GERRIT-277
Change-Id: Ide9a63c3c998eac6a3ce9f23066668c2e7a9aba6
Signed-off-by: Shawn O. Pearce <sop@google.com>
diff --git a/src/main/java/com/google/gwtexpui/safehtml/client/SafeHtml.java b/src/main/java/com/google/gwtexpui/safehtml/client/SafeHtml.java
index 4c61588..b19ad6c 100644
--- a/src/main/java/com/google/gwtexpui/safehtml/client/SafeHtml.java
+++ b/src/main/java/com/google/gwtexpui/safehtml/client/SafeHtml.java
@@ -70,13 +70,15 @@
 
   /** Convert bare http:// and https:// URLs into &lt;a href&gt; tags. */
   public SafeHtml linkify() {
+    final String part = "(?:" +
+        "[a-zA-Z0-9$_.+!*',%;:@=?#/-]" +
+        "|&(?!lt;|gt;)" +
+        ")";
     return replaceAll(
         "(https?://" +
-          "[a-zA-Z0-9$_.+!*',%;:@&=?#/-]{2,}" +
-          "([(]" +
-          "[a-zA-Z0-9$_.+!*',%;:@&=?#/-]*" +
-          "[)])*" +
-          "[a-zA-Z0-9$_.+!*',%;:@&=?#/-]*" +
+          part + "{2,}" +
+          "(?:[(]" + part + "*" + "[)])*" +
+          part + "*" +
         ")",
         "<a href=\"$1\">$1</a>");
   }