jib:build Remaining allocation units less than 0 #1512

Closed
mecseid opened this issue Feb 27, 2019 · 10 comments

@mecseid

mecseid commented Feb 27, 2019

Description of the issue:
The jib-maven-plugin failed with this error message:
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal com.google.cloud.tools:jib-maven-plugin:1.0.1:build (default-cli) on project backend: Remaining allocation units less than 0 for 'pull container configuration sha256:0ea51d9f12788eb9ab2fad601aeda6fd2d444fde75b1cfb853994913ed88243d': -1

I use Nexus 3 as a local Docker repository for my images, and as a proxy to Docker Hub.
In front of Nexus, I have Traefik, which configures itself from Docker labels.

The docker labels of the Nexus repository:

  • traefik.docker.network=public
  • traefik.frontend.entryPoints=http,https
  • traefik.app.port=8081
  • traefik.app.frontend.rule=Host:nexus.192.168.100.101.nip.io
  • traefik.registry_pull.port=5055
  • traefik.registry_pull.frontend.rule=Host:nexus.192.168.100.101.nip.io;PathPrefix:/v2;Method: GET
  • traefik.registry_pull.priority=30
  • traefik.registry_push_1.port=5050
  • traefik.registry_push_1.frontend.rule=Host:nexus.192.168.100.101.nip.io; PathPrefix:/v2
  • traefik.registry_push_1.priority=20
  • traefik.registry_push_2.port=5050
  • traefik.registry_push_2.frontend.rule=Host:nexus.192.168.100.101.nip.io; PathPrefix:^/(v1|v2)/[^/]+/?[^/]+/blobs/;Method: POST, PUT, PATCH, DELETE, HEAD
  • traefik.registry_push_2.priority=40

With this configuration, the docker pull nexus.192.168.100.101.nip.io/alpine:latest and others work as expected, but jib cannot pull docker layers.
If I publish ports 5050 and 5055 and configure jib to pull from and push to these ports directly, then jib works as expected.

Environment:
Build from Jenkins Pipeline with Docker Maven container (maven:3-jdk-11-slim)

jib-maven-plugin Configuration (relevant section):

<plugin>
	<groupId>com.google.cloud.tools</groupId>
	<artifactId>jib-maven-plugin</artifactId>
	<version>1.0.1</version>
	<configuration>
		<allowInsecureRegistries>true</allowInsecureRegistries>
		<from>
			<!--suppress UnresolvedMavenProperty -->
			<image>${nexus.location}/adoptopenjdk/openjdk11:alpine</image>
		</from>
		<to>
			<!--suppress UnresolvedMavenProperty -->
			<image>${nexus.location}/mekh/phonebook-backend:${project.version}</image>
		</to>
		<container>
			<ports>
				<port>8080</port>
			</ports>
		</container>
	</configuration>
</plugin>

Additional Information:
jib-maven-plugin:1.0.0 can use HTTP, but jib-maven-plugin:1.0.1 works only over HTTPS.

@chanseokoh
Member

Traefik is completely new to us. I'm curious to check a few things.

What if you 1) remove ;Method: GET from registry_pull; and 2) remove traefik.registry_push_2 altogether so that the config looks like this?

traefik.docker.network=public
traefik.frontend.entryPoints=http,https
traefik.app.port=8081
traefik.app.frontend.rule=Host:nexus.192.168.100.101.nip.io
traefik.registry_pull.port=5055
traefik.registry_pull.frontend.rule=Host:nexus.192.168.100.101.nip.io;PathPrefix:/v2
traefik.registry_pull.priority=30
traefik.registry_push_1.port=5050
traefik.registry_push_1.frontend.rule=Host:nexus.192.168.100.101.nip.io; PathPrefix:/v2
traefik.registry_push_1.priority=20

I'm just trying to get requests through with minimal restrictions. I'd also expect some log on the Traefik side when Jib requests don't make it through to the Nexus registry.

@chanseokoh
Member

Another possibility we've been considering is that Traefik is somehow dropping or altering the Content-Length header value when routing requests/responses.

Could you obtain a network trace and upload the full log?

@chanseokoh
Member

(Don't forget to add -DjibSerialize=true to serialize network calls.)

@mecseid
Author

mecseid commented Feb 27, 2019

#1512 (comment)
I can't delete anything from the configuration, because Nexus can't handle push actions for a group, and Traefik uses the first match based on priority.
More information about that:
Traefik Configuration for Nexus Docker
Nexus Docker Group (grouping handle pull requests only)
Traefik priority

#1512 (comment)
#1512 (comment)
I attached the log output (I copied only the relevant step's output from the Jenkins Pipeline).
I figured out that Jib throws the exception when I enable Traefik compression.
Without compression, Jib v1.0.0 can handle both HTTP and HTTPS connections, but v1.0.1 can handle HTTPS connections only (it only tries HTTPS).

All test cases were run with compression enabled:
jib-1.0.0.http.log
jib-1.0.0.https.log
jib-1.0.1.http.log
jib-1.0.1.https.log

Traefik Compression
So, Jib cannot handle this situation, although Docker can pull/push with compression enabled. Also, Jib v1.0.1 doesn't connect over HTTP even with the -DsendCredentialsOverHttp=true property and the allowInsecureRegistries configuration enabled.

@chanseokoh
Member

chanseokoh commented Feb 28, 2019

@mecseid your log did not capture network traces. With the correct configuration, the log should capture all HTTP requests/responses, as in this example trace:

Feb 26, 2019 10:46:54 AM com.google.api.client.http.HttpRequest execute
CONFIG: -------------- REQUEST  --------------
GET https://myregistry:5000/v2/
Accept: 
Accept-Encoding: gzip
User-Agent: jib 1.0.0 Google-HTTP-Java-Client/1.23.0 (gzip)

Feb 26, 2019 10:46:54 AM com.google.api.client.http.HttpRequest execute
CONFIG: curl -v --compressed -H 'Accept: ' -H 'Accept-Encoding: gzip' -H 'User-Agent: jib 1.0.0 Google-HTTP-Java-Client/1.23.0 (gzip)' -- 'https://myregistry:5000/v2/'
Feb 26, 2019 10:46:54 AM com.google.api.client.http.HttpResponse <init>
CONFIG: -------------- RESPONSE --------------
HTTP/1.1 403 Forbidden
Server: squid
Mime-Version: 1.0
Date: Tue, 26 Feb 2019 09:52:45 GMT
Content-Type: text/html;charset=utf-8
Content-Length: 3414
X-Squid-Error: ERR_ACCESS_DENIED 0
X-Cache: MISS from synnefo-proxy
X-Cache-Lookup: NONE from synnefo-proxy:3128
Connection: keep-alive

Please put the logging configuration in a logging.properties file and set the system property java.util.logging.config.file to point to that file.
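
For reference, the logging setup requested above could look like the following. This is a minimal sketch, assuming the Google HTTP Client logs requests and responses under the com.google.api.client.http logger at CONFIG level, which is what the example trace suggests. The class name HttpTraceLogging is hypothetical; it only loads the same three properties one would put in logging.properties, to show what they do.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.logging.LogManager;
import java.util.logging.Logger;

public class HttpTraceLogging {
  // Assumed contents of logging.properties: print everything the console
  // handler receives, and enable CONFIG-level output for the HTTP client logger.
  static final String PROPERTIES =
      "handlers = java.util.logging.ConsoleHandler\n"
          + "java.util.logging.ConsoleHandler.level = ALL\n"
          + "com.google.api.client.http.level = CONFIG\n";

  // Loads the properties into java.util.logging, the same way the JVM does
  // when -Djava.util.logging.config.file points at a file with this content.
  static Logger configure() {
    try {
      LogManager.getLogManager()
          .readConfiguration(
              new ByteArrayInputStream(PROPERTIES.getBytes(StandardCharsets.UTF_8)));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    // Logger name taken from the example trace (assumption, see lead-in).
    return Logger.getLogger("com.google.api.client.http");
  }

  public static void main(String[] args) {
    Logger logger = configure();
    System.out.println("HTTP client log level: " + logger.getLevel());
  }
}
```

With the three properties saved as a logging.properties file, a run along the lines of `mvn -Djava.util.logging.config.file=path/to/logging.properties -DjibSerialize=true jib:build` should then include the HTTP traffic in its output (the path is a placeholder).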

Anyway, now that you mention this happens only when Traefik compression is turned on, I think it further supports our conjecture that Traefik routing drops or alters Content-Length in responses: I would expect compression to happen on the fly while streaming, so Traefik has no way to know the exact Content-Length in advance. I'd like to confirm this with a network trace.

(As a side note, I see jib-1.0.1-https is able to communicate over HTTPS (albeit insecure HTTP). Why do you want to use HTTP?)

@chanseokoh
Member

chanseokoh commented Feb 28, 2019

the JIB v1.0.0 can handle the HTTP and HTTPS connection, but v1.0.1 can handle the HTTPS connection only (it only tries at HTTPS)

For this, I noticed we've upgraded the Google HTTP Client in v1.0.1, and this seems to be the cause. Filed #1517. This HTTP-failover issue will be fixed separately.

@mecseid
Author

mecseid commented Feb 28, 2019

Sorry for the wrong log files. I created two more:
one with compression (which failed), and another one without it (which succeeded)
jib-1.0.1-with-compression.log
jib-1.0.1-without-compression.log

I don't want to use HTTP connections at all (I was happy that v1.0.1 doesn't accept HTTP connections), but we have a very awful policy about ingress networking and communication (I hope we will get rid of it).

Thanks for #1517.

@chanseokoh
Member

Thanks for the logs. (In hindsight, Traefik is completely invisible as a proxy, of course.) In any case, it is obvious that Jib crashed on inaccurate/incomplete progress usage, and I think this is a likely cause of the issue. #1521 should fix it, and hopefully, your issue will be gone.

@chanseokoh
Member

chanseokoh commented Feb 28, 2019

I think we know why compression is causing this issue. This shouldn't be specific to Traefik. The root cause of the inaccurate/incomplete progress usage is probably #1522.

Anyways, #1521 will make Jib proceed normally with inaccurate/incomplete progress usage, so I expect #1521 will unblock you.
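
To illustrate the kind of failure discussed here, below is a rough sketch and not Jib's actual code. It assumes a progress allocation is sized from a declared length (for example Content-Length) while progress is counted in bytes actually read, so the two can disagree when a proxy compresses responses. ProgressSketch, Allocation, and the strict flag are hypothetical names; the lenient mode only mirrors the described effect of #1521 (proceeding despite inaccurate progress).

```java
public class ProgressSketch {
  /** Hypothetical stand-in for a progress allocation; not Jib's real class. */
  static final class Allocation {
    private final String description;
    private final boolean strict;
    private long remaining;

    Allocation(String description, long units, boolean strict) {
      this.description = description;
      this.remaining = units;
      this.strict = strict;
    }

    /** Records progress. Strict mode fails if more units are used than were allocated. */
    void use(long units) {
      remaining -= units;
      if (remaining < 0) {
        if (strict) {
          throw new IllegalStateException(
              "Remaining allocation units less than 0 for '"
                  + description + "': " + remaining);
        }
        remaining = 0; // lenient mode: tolerate inaccurate accounting and carry on
      }
    }

    long remaining() {
      return remaining;
    }
  }

  /** Allocates declaredLength units, then reports bytesRead units of progress. */
  static long simulate(long declaredLength, long bytesRead, boolean strict) {
    Allocation allocation = new Allocation("pull blob", declaredLength, strict);
    allocation.use(bytesRead);
    return allocation.remaining();
  }

  public static void main(String[] args) {
    // Declared length and bytes read agree: nothing goes wrong.
    System.out.println(simulate(100, 100, true));
    // They disagree: lenient mode proceeds, strict mode throws.
    System.out.println(simulate(100, 250, false));
    try {
      simulate(100, 250, true);
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

The point of the lenient branch is that progress reporting is cosmetic, so a mismatch there need not abort an otherwise successful pull.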

@chanseokoh chanseokoh added this to the v1.1.0 milestone Mar 1, 2019
@chanseokoh chanseokoh removed this from the v1.1.0 milestone Mar 5, 2019
@chanseokoh
Member

@mecseid v1.0.2 has been released; it prevents the crash when compression is enabled. v1.0.2 also fixes the regression where Jib may fail to try HTTP.
