Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add some deprecation optimizations #37597

Merged
merged 5 commits into from
Jan 18, 2019
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -157,14 +157,13 @@ public void deprecatedAndMaybeLog(final String key, final String msg, final Obje
* arbitrary token; here we use the Elasticsearch version and build hash. The warn text must be quoted. The warn-date is an optional
* quoted field that can be in a variety of specified date formats; here we use RFC 1123 format.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only semi-related to your PR but I think the RFC mentioned in this comment is wrong and it should say "RFC 7231 format" instead of "RFC 1123 format".

*/
private static final String WARNING_FORMAT =
private static final String WARNING_PREFIX =
String.format(
Locale.ROOT,
"299 Elasticsearch-%s%s-%s ",
"299 Elasticsearch-%s%s-%s",
Version.CURRENT.toString(),
Build.CURRENT.isSnapshot() ? "-SNAPSHOT" : "",
Build.CURRENT.shortHash()) +
"\"%s\" \"%s\"";
Build.CURRENT.shortHash());

/*
* RFC 7234 section 5.5 specifies that the warn-date is a quoted HTTP-date. HTTP-date is defined in RFC 7234 Appendix B as being from
Expand Down Expand Up @@ -223,7 +222,7 @@ public void deprecatedAndMaybeLog(final String key, final String msg, final Obje
.toFormatter(Locale.getDefault(Locale.Category.FORMAT));
}

private static final ZoneId GMT = ZoneId.of("GMT");
private static final String STARTUP_TIME = RFC_7231_DATE_TIME.format(ZonedDateTime.now(ZoneId.of("GMT")));

/**
* Regular expression to test if a string matches the RFC7234 specification for warning headers. This pattern assumes that the warn code
Expand Down Expand Up @@ -339,7 +338,9 @@ public Void run() {
* @return a warning value formatted according to RFC 7234
*/
public static String formatWarning(final String s) {
return String.format(Locale.ROOT, WARNING_FORMAT, escapeAndEncode(s), RFC_7231_DATE_TIME.format(ZonedDateTime.now(GMT)));
return WARNING_PREFIX + " "
+ "\"" + escapeAndEncode(s) + "\"" + " "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated to your PR: While I am not aware that any RFC specifies a maximum field length for HTTP headers there are practical limits and I wonder whether we should enforce one (in a follow-up) as deprecation messages might be long?

+ "\"" + STARTUP_TIME + "\"";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe rewrite as

return WARNING_PREFIX + " \"" + escapeAndEncode(s) + "\" \"" + STARTUP_TIME + "\"";

Or did you think it is too subtle to miss the required spaces in between? Either way, I don't feel too strongly about it.

Copy link
Member Author

@jasontedor jasontedor Jan 18, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, what is here is clearer IMO, and the compiler will fold the constants anyway (so fold " " and "\"" into " \"", and "\"" and " " and "\"" into "\" \"" so that the number of invocations to StringBuilder#append will be reduced).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fine with me.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also find it a bit odd to always use STARTUP_TIME as the value for warn-date but on the other hand I did not find anything in RFC 7234 that forbids doing that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ultimately I want to remove the warn-date from the header. It's not required by the specification (it's optional). However, removing it requires coordinating with Kibana which parses these warning headers for displaying in Console (elastic/kibana#10470). So here I eliminate the performance issue to show the advantages (otherwise, it does show up in profiling output, ~15%).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with that, it just puzzled me in the beginning.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

}

/**
Expand All @@ -359,7 +360,31 @@ public static String escapeAndEncode(final String s) {
* @return the escaped string
*/
static String escapeBackslashesAndQuotes(final String s) {
return s.replaceAll("([\"\\\\])", "\\\\$1");
/*
* We want a fast path check to avoid creating the string builder and copying characters if needed. So we walk the string looking
* for either of the characters that we need to escape. If we find a character that needs escaping, we start over and
*/
boolean escapingNeeded = false;
for (int i = 0; i < s.length(); i++) {
final char c = s.charAt(i);
if (c == '\\' || c == '"') {
escapingNeeded = true;
break;
}
}

if (escapingNeeded) {
final StringBuilder sb = new StringBuilder();
for (final char c : s.toCharArray()) {
if (c == '\\' || c == '"') {
sb.append("\\");
}
sb.append(c);
}
return sb.toString();
} else {
return s;
}
}

private static BitSet doesNotNeedEncoding;
Expand All @@ -384,7 +409,7 @@ static String escapeBackslashesAndQuotes(final String s) {
for (int i = 0x80; i <= 0xFF; i++) {
doesNotNeedEncoding.set(i);
}
assert !doesNotNeedEncoding.get('%');
assert doesNotNeedEncoding.get('%') == false : doesNotNeedEncoding;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

<3

}

private static final Charset UTF_8 = Charset.forName("UTF-8");
Expand All @@ -396,8 +421,21 @@ static String escapeBackslashesAndQuotes(final String s) {
* @return the encoded string
*/
static String encode(final String s) {
final StringBuilder sb = new StringBuilder(s.length());
// first check if the string needs any encoding; this is the fast path and we want to avoid creating a string builder and copying
boolean encodingNeeded = false;
for (int i = 0; i < s.length(); i++) {
int current = s.charAt(i);
if (doesNotNeedEncoding.get(current) == false) {
encodingNeeded = true;
break;
}
}

if (encodingNeeded == false) {
return s;
}

final StringBuilder sb = new StringBuilder(s.length());
for (int i = 0; i < s.length();) {
int current = s.charAt(i);
/*
Expand All @@ -420,10 +458,9 @@ static String encode(final String s) {
for (int j = 0; j < bytes.length; j++) {
sb.append('%').append(hex(bytes[j] >> 4)).append(hex(bytes[j]));
}
encodingNeeded = true;
}
}
return encodingNeeded ? sb.toString() : s;
return sb.toString();
}

private static char hex(int b) {
Expand Down