Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Decoding errror with content-encoding 'gzip' #751

Closed
skaldarnar opened this issue Mar 21, 2020 · 13 comments · Fixed by #762
Closed

Decoding errror with content-encoding 'gzip' #751

skaldarnar opened this issue Mar 21, 2020 · 13 comments · Fixed by #762
Labels

Comments

@skaldarnar
Copy link
Contributor

Describe the bug

When trying to fetch an organization from public GitHub via getOrganization("...") the response body cannot be parsed correctly and the method throws an IOException.

There seems to be an issue with whitespaces in the (assumed) JSON response, indicated by Illegal character ((CTRL-CHAR, code 31)). Some debugging down to GithubResponse::parseBody shows that

responseInfo.headerField("content-encoding")

resolves to "gzip" for some reason.

Here is the full stacktrace:

org.kohsuke.github.HttpException: Server returned HTTP response code: 200, message: 'null' for URL: https://api.github.com/orgs/MovingBlocks
	at org.kohsuke.github.GitHubClient.interpretApiError(GitHubClient.java:443) ~[github-api-1.108.jar:na]
	at org.kohsuke.github.GitHubClient.sendRequest(GitHubClient.java:366) ~[github-api-1.108.jar:na]
	at org.kohsuke.github.GitHubClient.sendRequest(GitHubClient.java:310) ~[github-api-1.108.jar:na]
	at org.kohsuke.github.Requester.fetch(Requester.java:71) ~[github-api-1.108.jar:na]
	at org.kohsuke.github.GitHub.getOrganization(GitHub.java:489) ~[github-api-1.108.jar:na]
	at org.terasology.launcher.updater.LauncherUpdater.updateAvailable(LauncherUpdater.java:91) ~[main/:na]
	at org.terasology.launcher.LauncherInitTask.checkForLauncherUpdates(LauncherInitTask.java:181) [main/:na]
	at org.terasology.launcher.LauncherInitTask.call(LauncherInitTask.java:81) [main/:na]
	at org.terasology.launcher.LauncherInitTask.call(LauncherInitTask.java:44) [main/:na]
	at javafx.concurrent.Task$TaskCallable.call(Task.java:1423) [jfxrt.jar:na]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_232]
	at java.lang.Thread.run(Thread.java:748) [na:1.8.0_232]
Caused by: com.fasterxml.jackson.core.JsonParseException: Illegal character ((CTRL-CHAR, code 31)): only regular white space (\r, \n, \t) is allowed between tokens
 at [Source: (String)"�      �??Mo?0?????�?n?5�?�C??????P?b?6�??EC�.???}?��7;e7??C?z_?s?G?y?�????]a~?<??I�o??:�??�?>?�v????{?�z??T?^??+?�????;"?Q?X9??$E#?G?*?-s?C4G�^\e?A?^??�D?A�?]M?�?�??jz????????H�^C???????p\|P??gJ??��?L??V8a.?�?6?d?�L????>?l?[??
??�l�?=?
�?|???L&?�pV?D?j,?�K??j?v?�6�?�?Sc??�O?W*?�=g?8Y?_??n(?fHL??�fBQj!i.F^?�#K	4S?P�?�?�-?p?�|??Y?8??????9?T�?"??]q??/?
x?J?�Yn???�v????_????pP???/?(ZD?"?�??$?$??�5??S??�???!?Mnn?M????q?6??�?XP?�  "; line: 1, column: 2]
	at com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:1840) ~[jackson-core-2.10.2.jar:2.10.2]
	at com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:712) ~[jackson-core-2.10.2.jar:2.10.2]
	at com.fasterxml.jackson.core.base.ParserMinimalBase._throwInvalidSpace(ParserMinimalBase.java:690) ~[jackson-core-2.10.2.jar:2.10.2]
	at com.fasterxml.jackson.core.json.ReaderBasedJsonParser._skipWSOrEnd(ReaderBasedJsonParser.java:2373) ~[jackson-core-2.10.2.jar:2.10.2]
	at com.fasterxml.jackson.core.json.ReaderBasedJsonParser.nextToken(ReaderBasedJsonParser.java:672) ~[jackson-core-2.10.2.jar:2.10.2]
	at com.fasterxml.jackson.databind.ObjectReader._initForReading(ObjectReader.java:357) ~[jackson-databind-2.10.2.jar:2.10.2]
	at com.fasterxml.jackson.databind.ObjectReader._bindAndClose(ObjectReader.java:1704) ~[jackson-databind-2.10.2.jar:2.10.2]
	at com.fasterxml.jackson.databind.ObjectReader.readValue(ObjectReader.java:1261) ~[jackson-databind-2.10.2.jar:2.10.2]
	at org.kohsuke.github.GitHubResponse.parseBody(GitHubResponse.java:84) ~[github-api-1.108.jar:na]
	at org.kohsuke.github.Requester.lambda$fetch$1(Requester.java:71) ~[github-api-1.108.jar:na]
	at org.kohsuke.github.GitHubClient.createResponse(GitHubClient.java:404) ~[github-api-1.108.jar:na]
	at org.kohsuke.github.GitHubClient.sendRequest(GitHubClient.java:358) ~[github-api-1.108.jar:na]
	... 10 common frames omitted

To Reproduce

This is the MWE from my code to reproduce this error.

try {
	Github github = GitHub.connectAnonymously();
	GHOrganization movingBlocks = github.getOrganization("MovingBlocks");
} catch (IOException e) {
	e.printStacktrace();
}

Expected behavior

I expect that a response with 200 OK is parsed correctly and does not throw an IOException.

Alternatively, is there a way to set custom headers, e.g., setting "accept-encoding" to ensure the lib can parse the response body?

Desktop (please complete the following information):

  • OS: macOS / Windows 10
  • Java: 1.8.0_232 BellSoft
  • Version 1.108
@bitwiseman
Copy link
Member

@skaldarnar When I run your code above I don't get the error you're seeing.

The library intentionally add gzip as an accepted encoding and then unzips the data if that encoding is returned:

https://github.com/github-api/github-api/blob/a42305dd59f0be426f0c9091748e947f60d76bcd/src/main/java/org/kohsuke/github/GitHubHttpUrlConnectionClient.java#L135

https://github.com/github-api/github-api/blob/a42305dd59f0be426f0c9091748e947f60d76bcd/src/main/java/org/kohsuke/github/GitHubHttpUrlConnectionClient.java#L226-L234

You've provided an excellent issue report, but I'm not sure what is causing the error you're seeing.

@skaldarnar
Copy link
Contributor Author

skaldarnar commented Mar 27, 2020

You've provided an excellent issue report, but I'm not sure what is causing the error you're seeing.

@bitwiseman thank you, and yeah, me neither 😕 I'll give my debugging attempts another try and put some breakpoints at the line you've pointed me to.

Edit: I've just run the example with different Java versions - no difference on macOS:

  • Java 13
    openjdk version "13.0.2" 2020-01-14
    OpenJDK Runtime Environment AdoptOpenJDK (build 13.0.2+8)
    OpenJDK 64-Bit Server VM AdoptOpenJDK (build 13.0.2+8, mixed mode, sharing)
    
  • Java 11
    openjdk version "11.0.2" 2019-01-15
    openJDK Runtime Environment 18.9 (build 11.0.2+9)
    OpenJDK 64-Bit Server VM 18.9 (build 11.0.2+9, mixed mode)
    
  • Java 1.8
    openjdk version "1.8.0_222"
    OpenJDK Runtime Environment (build 1.8.0_222-BellSoft-b11)
    OpenJDK 64-Bit Server VM (build 25.222-b11, mixed mode)
    

@jdrueckert
Copy link

Tried on 64-bit Windows with Java 8

java -version
openjdk version "1.8.0_232"
OpenJDK Runtime Environment (build 1.8.0_232-BellSoft-b10)
OpenJDK 64-Bit Server VM (build 25.232-b10, mixed mode)

Also getting the following:

org.kohsuke.github.HttpException: Server returned HTTP response code: 200, message: 'null' for URL: https://api.github.com/orgs/MovingBlocks
        at org.kohsuke.github.GitHubClient.interpretApiError(GitHubClient.java:443)
        at org.kohsuke.github.GitHubClient.sendRequest(GitHubClient.java:366)
        at org.kohsuke.github.GitHubClient.sendRequest(GitHubClient.java:310)
        at org.kohsuke.github.Requester.fetch(Requester.java:71)
        at org.kohsuke.github.GitHub.getOrganization(GitHub.java:489)
        at de.skaldarnar.github.Main.main(Main.java:16)
Caused by: com.fasterxml.jackson.core.JsonParseException: Illegal character ((CTRL-CHAR, code 31)): only regular white space (\r, \n, \t) is allowed between tokens
 at [Source: (String)"??Mo?0
                            ??????n?5?C??????P?b?6??EC.???}?7;e7??C?z_?s?G?y?????]a~?<??Io??:???>?v????{?z??T?^??+?????;"?Q?X9??$
                                                                                                                                 E#?G?*?-s?C4G
???????P????
            5?K??gLZ?(?????FX$?v
??l?=?                          4YX/}??
?|???L&?pV?D?j,?K??j?v?6??Sc??O?W*?=g?8Y?_??n(?fHL??fBQ
                                                       j!i.F^?#K     4S?P?▒?
                                                                            -?
                                                                              p?|??Y?8??????9?T?"??]q??/?
x?J?n???v????_????pP???/?(ZD?"???$?$??
                                      5??S?????!?Mnn?M????q?6??
                                                               ?XP?"; line: 1, column: 2]
        at com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:1840)
        at com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:712)
        at com.fasterxml.jackson.core.base.ParserMinimalBase._throwInvalidSpace(ParserMinimalBase.java:690)
        at com.fasterxml.jackson.core.json.ReaderBasedJsonParser._skipWSOrEnd(ReaderBasedJsonParser.java:2373)
        at com.fasterxml.jackson.core.json.ReaderBasedJsonParser.nextToken(ReaderBasedJsonParser.java:672)
        at com.fasterxml.jackson.databind.ObjectReader._initForReading(ObjectReader.java:357)
        at com.fasterxml.jackson.databind.ObjectReader._bindAndClose(ObjectReader.java:1704)
        at com.fasterxml.jackson.databind.ObjectReader.readValue(ObjectReader.java:1261)
        at org.kohsuke.github.GitHubResponse.parseBody(GitHubResponse.java:84)
        at org.kohsuke.github.Requester.lambda$fetch$1(Requester.java:71)
        at org.kohsuke.github.GitHubClient.createResponse(GitHubClient.java:404)
        at org.kohsuke.github.GitHubClient.sendRequest(GitHubClient.java:358)
        ... 4 more

@jdrueckert
Copy link

Same when trying on 64-bit Linux (same Java Version and JDK). Could it be related to the JDK? 🤔

@tomakehurst
Copy link

I'm seeing the same issue.

While I'm debugging in wrapStream(...), headerField("Content-Encoding"); returns null, despite the fact the content is clearly gzipped (I can ungzip it in the debug session using a utility class).

It's not clear why this is resolving to null, since if I call the same GitHub API with curl it returns a Content-Encoding:gzip header as expected if I've sent Accept:gzip in the request.

@tomakehurst
Copy link

tomakehurst commented Mar 30, 2020

OK, found the problem: headerField is case-sensitive and GitHub have changed their response headers to be all lower case.

@tomakehurst
Copy link

Switching the implementation of GitHubResponse.headerField() to this seems to fix the problem:

public String headerField(String name) {
    return headers.entrySet()
        .stream()
        .filter(entry -> name.equalsIgnoreCase(entry.getKey()))
        .findFirst()
        .map(entry -> entry.getValue().get(0))
        .orElse(null);
}

I'll raise a PR, but I'm struggling to create a test case at the moment because (the ironies) I can't persuade WireMock not to capitalise the content-encoding header name.

@tomakehurst
Copy link

Would appreciate some input from maintainers about whether lambdas/streams ☝️ are acceptable from a style perspective.

@bitwiseman
Copy link
Member

bitwiseman commented Mar 30, 2020

@tomakehurst

Lambdas and streams are fine.This library only recently moved to Java 8+ so there's probably a number of places where streams and lambdas would be helpful. We're using lambdas in a number of places already.

Note, I'm not loving using it in this case. Multiple calls to headerField() will be costly for no good reason. Given this is a special situation, what about extending HashMap like this:

    public V get(String key) {
        Node<K,V> e = super.get(key);
        if (e == null) {
            e = super.get(key.toLowerCase());
        }
        return e;
    }

Probably have to do the same with containsKey(). But that would protect all consumers not just one method.

@bitwiseman
Copy link
Member

bitwiseman commented Mar 30, 2020

@tomakehurst
Or even better:

protected ResponseInfo(@Nonnull GitHubRequest request,
                       int statusCode,
                       @Nonnull Map<String, List<String>> headers) {
    this.request = request;
    this.statusCode = statusCode;

    // Response header field names must be case-insensitive.
    TreeMap<String, List<String>> caseInsensitiveMap = new TreeMap<>(Comparator.nullsFirst(String.CASE_INSENSITIVE_ORDER));
    caseInsensitiveMap.putAll(headers);

    this.headers = Collections.unmodifiableMap(caseInsensitiveMap);
}

bitwiseman added a commit to bitwiseman/github-api that referenced this issue Mar 31, 2020
GitHub has started changing their headers from capitalized words to all lowercase.
A recent change made the header fields querying case-senstive.

This fixes that mistake and also extends that funcionality to anyone consuming the
headers generated by ResponseInfo.

Fixes hub4j#751
bitwiseman added a commit to bitwiseman/github-api that referenced this issue Mar 31, 2020
GitHub has started changing their headers from capitalized words to all lowercase.
A recent change made the header fields querying case-senstive which broke gzip content detection.
This was not caught by tests because recorded files remain unchanged.
It is also possible that WireMock is auto-capitalizing.

This fixes the case-sensitivity issue and also extends that funcionality to anyone consuming the
headers generated by ResponseInfo.

Fixes hub4j#751
@bitwiseman
Copy link
Member

@tomakehurst
Copy link

The issue is definitely Jetty, but it happens even when GzipHandler isn't present (I created a stub with the header set explicitly and the body pre-gzipped).

I haven't yet managed to figure out why/where it's rewriting that particular header.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants