cosa CI fails intermittently in checkout scm step #3372

Closed
dustymabe opened this issue Feb 23, 2023 · 1 comment · Fixed by coreos/fedora-coreos-pipeline#825
Labels: jira (for syncing to jira)

dustymabe commented Feb 23, 2023
I've seen this happen a few times recently, where the checkout scm step fails:

[2023-02-23T03:26:24.785Z] [Pipeline] checkout
[2023-02-23T03:26:24.810Z] The recommended git tool is: git
[2023-02-23T03:26:27.617Z] using credential github-coreosbot-token-username-password
[2023-02-23T03:26:27.715Z] Warning: JENKINS-30600: special launcher org.csanchez.jenkins.plugins.kubernetes.pipeline.ContainerExecDecorator$1@66e5ad67; decorates RemoteLauncher[hudson.remoting.Channel@349ba253:JNLP4-connect connection from 10.130.2.121/10.130.2.121:46332] will be ignored (a typical symptom is the Git executable not being run inside a designated container)
[2023-02-23T03:26:27.718Z] Cloning the remote Git repository
[2023-02-23T03:26:27.718Z] Cloning with configured refspecs honoured and without tags
[2023-02-23T03:26:27.732Z] Cloning repository https://github.com/coreos/coreos-assembler.git
[2023-02-23T03:26:28.415Z]  > git init /home/jenkins/agent/workspace/coreos-assembler_PR-3371 # timeout=10
[2023-02-23T03:26:28.427Z] Fetching upstream changes from https://github.com/coreos/coreos-assembler.git
[2023-02-23T03:26:28.427Z]  > git --version # timeout=10
[2023-02-23T03:26:28.431Z]  > git --version # 'git version 2.27.0'
[2023-02-23T03:26:28.431Z] using GIT_ASKPASS to set credentials GitHub coreosbot token as username/password
[2023-02-23T03:26:28.432Z]  > git fetch --no-tags --force --progress -- https://github.com/coreos/coreos-assembler.git +refs/pull/3371/head:refs/remotes/origin/PR-3371 +refs/heads/main:refs/remotes/origin/main # timeout=10
[2023-02-23T03:26:34.312Z] [Pipeline] }
[2023-02-23T03:26:34.313Z] [Pipeline] // container
[2023-02-23T03:26:34.374Z] [Pipeline] }
[2023-02-23T03:26:34.425Z] [Pipeline] // node
[2023-02-23T03:26:34.437Z] [Pipeline] }
[2023-02-23T03:26:34.440Z] [Pipeline] // podTemplate
[2023-02-23T03:26:34.450Z] [Pipeline] End of Pipeline
[2023-02-23T03:26:34.459Z] java.nio.channels.ClosedChannelException
[2023-02-23T03:26:34.459Z] Also:   hudson.remoting.Channel$CallSiteStackTrace: Remote call to JNLP4-connect connection from 10.130.2.121/10.130.2.121:46332
[2023-02-23T03:26:34.459Z] 		at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1784)
[2023-02-23T03:26:34.459Z] 		at hudson.remoting.Request.call(Request.java:199)
[2023-02-23T03:26:34.459Z] 		at hudson.remoting.Channel.call(Channel.java:999)
[2023-02-23T03:26:34.459Z] 		at org.jenkinsci.plugins.gitclient.RemoteGitImpl$CommandInvocationHandler.execute(RemoteGitImpl.java:143)
[2023-02-23T03:26:34.459Z] 		at jdk.internal.reflect.GeneratedMethodAccessor955.invoke(Unknown Source)
[2023-02-23T03:26:34.459Z] 		at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[2023-02-23T03:26:34.459Z] 		at java.base/java.lang.reflect.Method.invoke(Method.java:566)
[2023-02-23T03:26:34.459Z] 		at org.jenkinsci.plugins.gitclient.RemoteGitImpl$CommandInvocationHandler.invoke(RemoteGitImpl.java:129)
[2023-02-23T03:26:34.459Z] 		at com.sun.proxy.$Proxy101.execute(Unknown Source)
[2023-02-23T03:26:34.459Z] 		at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:1226)
[2023-02-23T03:26:34.459Z] 		at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1308)
[2023-02-23T03:26:34.459Z] 		at org.jenkinsci.plugins.workflow.steps.scm.SCMStep.checkout(SCMStep.java:129)
[2023-02-23T03:26:34.459Z] 		at org.jenkinsci.plugins.workflow.steps.scm.SCMStep$StepExecutionImpl.run(SCMStep.java:97)
[2023-02-23T03:26:34.459Z] 		at org.jenkinsci.plugins.workflow.steps.scm.SCMStep$StepExecutionImpl.run(SCMStep.java:84)
[2023-02-23T03:26:34.459Z] 		at org.jenkinsci.plugins.workflow.steps.SynchronousNonBlockingStepExecution.lambda$start$0(SynchronousNonBlockingStepExecution.java:47)
[2023-02-23T03:26:34.459Z] 		at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
[2023-02-23T03:26:34.459Z] 		at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
[2023-02-23T03:26:34.459Z] Caused: hudson.remoting.RequestAbortedException
[2023-02-23T03:26:34.459Z] 	at hudson.remoting.Request.abort(Request.java:345)
[2023-02-23T03:26:34.459Z] 	at hudson.remoting.Channel.terminate(Channel.java:1080)
[2023-02-23T03:26:34.459Z] 	at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer.onReadClosed(ChannelApplicationLayer.java:240)
[2023-02-23T03:26:34.459Z] 	at org.jenkinsci.remoting.protocol.ApplicationLayer.onRecvClosed(ApplicationLayer.java:221)
[2023-02-23T03:26:34.459Z] 	at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecvClosed(ProtocolStack.java:825)
[2023-02-23T03:26:34.459Z] 	at org.jenkinsci.remoting.protocol.FilterLayer.onRecvClosed(FilterLayer.java:289)
[2023-02-23T03:26:34.459Z] 	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.onRecvClosed(SSLEngineFilterLayer.java:168)
[2023-02-23T03:26:34.459Z] 	at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecvClosed(ProtocolStack.java:825)
[2023-02-23T03:26:34.459Z] 	at org.jenkinsci.remoting.protocol.NetworkLayer.onRecvClosed(NetworkLayer.java:155)
[2023-02-23T03:26:34.459Z] 	at org.jenkinsci.remoting.protocol.impl.NIONetworkLayer.ready(NIONetworkLayer.java:143)
[2023-02-23T03:26:34.459Z] 	at org.jenkinsci.remoting.protocol.IOHub$OnReady.run(IOHub.java:789)
[2023-02-23T03:26:34.459Z] 	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
[2023-02-23T03:26:34.459Z] 	at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:68)
[2023-02-23T03:26:34.459Z] 	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
[2023-02-23T03:26:34.459Z] 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
[2023-02-23T03:26:34.459Z] 	at java.base/java.lang.Thread.run(Thread.java:829)
[2023-02-23T03:26:35.185Z] 
[2023-02-23T03:26:35.185Z] GitHub has been notified of this commit’s build result
[2023-02-23T03:26:35.185Z] 
[2023-02-23T03:26:35.185Z] Finished: FAILURE

This happened in PR-3371#1 and also consistently 3 or 4 times in #3366 (though those CI runs have been garbage collected at this point).

jlebon commented Feb 27, 2023

It seems like restarting Jenkins helps this. I wouldn't be surprised if it came back though.

jlebon added the jira label Feb 27, 2023
jlebon self-assigned this Feb 27, 2023
jlebon added a commit to coreos/fedora-coreos-pipeline that referenced this issue Feb 27, 2023
By default, the jnlp container has a memory request of 256Mi and no
memory limit. In 9eed927 ("manifests/jenkins: set resource limits and
requests for jnlp container"), we added an explicit memory limit so that
we don't inherit the default 10G memory limit from the limitrange in
the RHCOS OpenShift project. We set it to 256Mi since that's what the
default memory *request* was, but it's too little as a hard limit. Most
of the time, the jnlp container inexplicably dies, but at least once, I saw it
fail with:

```
java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached
```

Bump it to 512Mi. And also bump the request since in this project we try
to have requests match what the pod actually needs.

While we're here, add a comment about why we set these values.

Fixes coreos/coreos-assembler#3372.
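
For context, the fix described above amounts to setting an explicit memory request and limit on the jnlp (agent) container so it neither inherits the project LimitRange default nor gets OOM-killed under a too-small cap. Below is a minimal sketch of what such a stanza can look like in a standard Kubernetes pod spec; the exact file layout in fedora-coreos-pipeline may differ, and the request value shown is only illustrative (the commit message says the request was bumped to match what the pod needs, without stating a number) — see coreos/fedora-coreos-pipeline#825 for the actual change.

```yaml
# Sketch only: standard Kubernetes resources stanza for the jnlp agent container.
# Without an explicit limit, the container inherits the project LimitRange default
# (10G in the RHCOS OpenShift project); with a 256Mi limit it ran out of memory.
containers:
  - name: jnlp
    resources:
      requests:
        memory: 512Mi   # illustrative; keep the request in line with actual usage
      limits:
        memory: 512Mi   # hard cap from the commit message; 256Mi proved too small
```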