This repository has been archived by the owner on Jul 23, 2020. It is now read-only.

Lots of config errors on jenkins startup #2608

Closed
jfchevrette opened this issue Mar 15, 2018 · 15 comments · Fixed by fabric8-jenkins/jenkins-openshift-base#20

Comments

@jfchevrette
Contributor

Yesterday an update to OpenShift went in to address the latest CVEs [1]. That update made secret and configmap mounts read-only.

Presumably, on startup jenkins tries to update the files mounted from the configmap before it copies them to /var/lib/jenkins. That doesn't work anymore. The files from the configmap mount should be copied elsewhere and updated there.

[1] kubernetes/kubernetes#60814

OPENSHIFT_JENKINS_JVM_ARCH is set to i686 so using 32 bit Java
mkdir: cannot create directory '/var/lib/jenkins/logs': File exists
No resources found.
No resources found.
Generating kubernetes-plugin configuration (/opt/openshift/configuration/config.xml.tpl) ...
/usr/libexec/s2i/run: line 111: /opt/openshift/configuration/config.xml: Read-only file system
Generating kubernetes-plugin credentials (/var/lib/jenkins/credentials.xml.tpl) ...
/usr/libexec/s2i/run: line 123: /opt/openshift/configuration/credentials.xml: Read-only file system
Copying Jenkins configuration to /var/lib/jenkins ...
rm: cannot remove '/opt/openshift/configuration/config.xml.tpl': Read-only file system
rm: cannot remove '/opt/openshift/configuration/credentials.xml.tpl': Read-only file system
rm: cannot remove '/opt/openshift/configuration/hudson.tasks.Maven.xml': Read-only file system
rm: cannot remove '/opt/openshift/configuration/io.fabric8.jenkins.openshiftsync.GlobalPluginConfiguration.xml': Read-only file system
rm: cannot remove '/opt/openshift/configuration/jenkins.plugins.nodejs.tools.NodeJSInstallation.xml': Read-only file system
rm: cannot remove '/opt/openshift/configuration/org.jenkinsci.main.modules.sshd.SSHD.xml': Read-only file system
rm: cannot remove '/opt/openshift/configuration/org.jenkinsci.plugins.updatebot.GlobalPluginConfiguration.xml': Read-only file system
rm: cannot remove '/opt/openshift/configuration/scriptApproval.xml': Read-only file system
Copying 143 Jenkins plugins to /var/lib/jenkins ...
Creating initial Jenkins 'admin' user ...
sed: can't read /var/lib/jenkins/users/admin/config.xml: No such file or directory
Running from: /usr/lib/jenkins/jenkins.war
webroot: EnvVars.masterEnvVars.get("JENKINS_HOME")
Mar 15, 2018 12:01:05 AM Main deleteWinstoneTempContents
WARNING: Failed to delete the temporary Winstone file /tmp/winstone/jenkins.war
Mar 15, 2018 12:01:05 AM org.eclipse.jetty.util.log.Log initialized
INFO: Logging initialized @1221ms to org.eclipse.jetty.util.log.JavaUtilLog
Mar 15, 2018 12:01:06 AM winstone.Logger logInternal
INFO: Beginning extraction from war file
Mar 15, 2018 12:01:06 AM org.eclipse.jetty.server.handler.ContextHandler setContextPath
WARNING: Empty contextPath
Mar 15, 2018 12:01:06 AM org.eclipse.jetty.server.Server doStart
INFO: jetty-9.4.z-SNAPSHOT
Mar 15, 2018 12:01:07 AM org.eclipse.jetty.webapp.StandardDescriptorProcessor visitServlet
INFO: NO JSP Support for /, did not find org.eclipse.jetty.jsp.JettyJspServlet
Mar 15, 2018 12:01:08 AM org.eclipse.jetty.server.session.DefaultSessionIdManager doStart
INFO: DefaultSessionIdManager workerName=node0
Mar 15, 2018 12:01:08 AM org.eclipse.jetty.server.session.DefaultSessionIdManager doStart
INFO: No SessionScavenger set, using defaults
Mar 15, 2018 12:01:08 AM org.eclipse.jetty.server.session.HouseKeeper startScavenging
INFO: Scavenging every 660000ms
Jenkins home directory: /var/lib/jenkins found at: EnvVars.masterEnvVars.get("JENKINS_HOME")
Mar 15, 2018 12:01:09 AM org.eclipse.jetty.server.handler.ContextHandler doStart
INFO: Started w.@112aa7c{/,file:///var/lib/jenkins/war/,AVAILABLE}{/var/lib/jenkins/war}
Mar 15, 2018 12:01:09 AM org.eclipse.jetty.server.AbstractConnector doStart
INFO: Started ServerConnector@1cb6996{HTTP/1.1,[http/1.1]}{0.0.0.0:8080}
Mar 15, 2018 12:01:09 AM org.eclipse.jetty.server.Server doStart
INFO: Started @5077ms
Mar 15, 2018 12:01:09 AM winstone.Logger logInternal
INFO: Winstone Servlet Engine v4.0 running: controlPort=disabled
Mar 15, 2018 12:01:11 AM jenkins.InitReactorRunner$1 onAttained
INFO: Started initialization
@kbsingh
Collaborator

kbsingh commented Mar 15, 2018

We do need to get this done, but it might not be a SEV1. I think it needs a deeper dive from the build team to scope it.

@chmouel

chmouel commented Mar 15, 2018

Added to our backlog, @pradeepto feel free to prioritize.

@stevengutz
Collaborator

@chmouel Let's assume this is high priority (probably #3 in our current list of issues). It's not really breaking anything but it is polluting log files at a time when SD needs them to be concise to diagnose real issues.

@pradeepto

Copy pasting from MM.

@lordofthejars mentioned :

@sthaha I have been investigating #2608, and my bet is that we can remove the rm without any problem. It is failing now because Kubernetes changed the permissions of the secret mount path to read-only. Why do I think we can remove it? Because the cp from /opt/openshift to JENKINS_HOME always occurs, and it runs with -f, so any change under /opt/.... will be copied to JENKINS_HOME.
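
To make the argument above concrete, here is a minimal sketch of a sync step that never writes to the source mount. The function name `sync_config` and its two arguments are hypothetical and are not taken from the real run script; it only illustrates that `cp -f` replaces stale files in the destination while the read-only source is merely read.

```shell
#!/bin/sh
# Sketch only: sync_config is a hypothetical name, not part of the real run script.
sync_config() {
    src=$1   # configuration mount (may be read-only)
    dst=$2   # writable Jenkins home

    mkdir -p "$dst" || return 1
    # -r copies the whole tree; -f replaces destination files that cannot be
    # opened for writing. The source is only read, so no rm is needed there.
    cp -rf "$src"/. "$dst"/
}
```

Under these assumptions, the copy behaves the same whether or not the mount is writable, which is why dropping the rm should not change what ends up in JENKINS_HOME.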

@sthaha
Collaborator

sthaha commented Mar 20, 2018

@jfchevrette

That update made secrets and configmaps mounts go read-only.

Is there a commit you can point me to that made this change? It will be quite helpful to the future maintainers if I can add the link to the commit in the fix that I am making.

@sthaha
Collaborator

sthaha commented Mar 20, 2018

Status update

I have a PR#20 in progress that fixes this issue with rm -rf, but I think the issue goes beyond that: the same mount point is also used to store the expanded templates (see the run file). I hope to address that in a separate commit as part of the same PR#20.

@sthaha
Collaborator

sthaha commented Mar 20, 2018

@jfchevrette did we onboard any users after this change to the mount point was made? I wonder if the build succeeds or even jenkins starts for them at all since IIUC credentials.xml and config.xml won't be generated based on line 111 and line 123 of run

@sthaha
Collaborator

sthaha commented Mar 20, 2018

Status update

After reading the code a bit more I think I now understand how this is all supposed to work. I am not sure if this has been documented anywhere. I will add the doc to the PR.

Existing implementation

As Jenkins bootstraps, one of the things it does is sync the configuration from the configmap:

  1. expand the templates (*.tpl) in the configuration/ dir; this generates config.xml and credentials.xml
  2. copy the files to the jenkins-home dir (PV)

To ensure that these steps aren't repeated on the next boot, it deletes all files from the configuration dir (rm -rf ${img_config_dir}). This ensures that the configuration is regenerated only when the configmap changes, since the dir won't be empty anymore.
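
The flow described above can be sketched roughly as follows. This is a simplified reconstruction for illustration, not the actual run script: the function name `bootstrap_config`, the `@NAMESPACE@` placeholder, and the `NAMESPACE` variable are all assumptions, and the real script may expand templates differently.

```shell
#!/bin/sh
# Simplified sketch of the bootstrap flow; NOT the real run script.
# Hypothetical names: bootstrap_config, @NAMESPACE@ placeholder, NAMESPACE.
bootstrap_config() {
    img_config_dir=$1   # configuration dir (the configmap mount)
    jenkins_home=$2     # jenkins-home dir (the PV)

    # An empty configuration dir means "already synced": nothing to do.
    if [ -z "$(ls -A "$img_config_dir" 2>/dev/null)" ]; then
        return 0
    fi

    # 1. Expand each template next to its source. This write into the
    #    configuration dir is what fails once the mount is read-only.
    for tpl in "$img_config_dir"/*.tpl; do
        [ -e "$tpl" ] || continue
        sed "s/@NAMESPACE@/${NAMESPACE:-default}/g" "$tpl" > "${tpl%.tpl}" || return 1
    done

    # 2. Copy everything to the Jenkins home.
    cp -rf "$img_config_dir"/. "$jenkins_home"/ || return 1

    # 3. Empty the configuration dir so the next boot skips steps 1 and 2.
    rm -rf "${img_config_dir:?}"/*
}
```

Steps 1 and 3 both write to the configuration dir, so both depend on the mount being writable.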

ReadOnly mount breaks the above

For new users, making configuration read-only breaks not only this optimisation but also the expansion of templates, which is critical, resulting in a missing config.xml and credentials.xml. I think jenkins might autogenerate a config.xml based on defaults, but I am guessing it would be broken and the builds shouldn't run due to missing credentials.

Solution

A solution to this would be to figure out a way to detect if the configmap has been updated or not and I am thinking of ...

  1. adding a flag in the form of a file with some content say timestamp at the time of generation of the configmap
  2. Jenkins diffs its copy of the timestamp (initially empty; stored in jenkins-home) with the one in the configuration dir
  3. if it detects a change, it generates all configurations and updates the timestamp
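
A minimal sketch of the three steps above, assuming the configmap ships a marker file alongside the configuration. The marker file name `.config-timestamp` and the function names `config_changed` and `mark_synced` are hypothetical.

```shell
#!/bin/sh
# Sketch only. Hypothetical names: MARKER, config_changed, mark_synced.
MARKER=.config-timestamp

# Exit 0 ("changed") when the marker in the configuration dir differs from
# the copy in the Jenkins home, or when either file is missing.
config_changed() {
    ! cmp -s "$1/$MARKER" "$2/$MARKER"
}

# Remember the marker that the current configuration was generated from.
mark_synced() {
    cp -f "$1/$MARKER" "$2/$MARKER"
}

# Intended use at boot (the regeneration step itself is left out here):
#   if config_changed "$img_config_dir" "$JENKINS_HOME"; then
#       ...regenerate all configuration into "$JENKINS_HOME"...
#       mark_synced "$img_config_dir" "$JENKINS_HOME"
#   fi
```

The comparison only reads from the configuration dir, so it keeps working on a read-only mount. One consequence of this sketch: if the configmap carries no marker at all, every boot is treated as a change.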

@aslakknutsen @jfchevrette any thoughts ? Is there a simpler/better way to figure out if the configmap has been updated?

@jfchevrette
Contributor Author

@sthaha kubernetes/kubernetes#58720

@aslakknutsen
Collaborator

@sthaha A ConfigMap has a Revision id you could use to detect change.
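
As a rough illustration of this suggestion: Kubernetes objects carry a `metadata.resourceVersion` field that changes whenever the object is updated, which is probably the id meant here. The sketch below assumes `kubectl` is on the PATH and that the pod's service account may read the ConfigMap; the function names and the state file are hypothetical.

```shell
#!/bin/sh
# Sketch only. Hypothetical names: configmap_version, needs_regen,
# record_version. Assumes kubectl is available and allowed to read the ConfigMap.

# Print the ConfigMap's resourceVersion (an opaque string; compare it for
# equality only, never for ordering).
configmap_version() {
    kubectl get configmap "$1" -o jsonpath='{.metadata.resourceVersion}'
}

# Exit 0 when the current version differs from the one stored in state_file.
needs_regen() {
    name=$1
    state_file=$2
    current=$(configmap_version "$name") || return 2
    previous=$(cat "$state_file" 2>/dev/null || true)
    [ "$current" != "$previous" ]
}

# Store the current version, e.g. in a file under the Jenkins home.
record_version() {
    version=$(configmap_version "$1") && printf '%s\n' "$version" > "$2"
}
```

Compared with a timestamp file inside the configmap, this needs API access at boot but requires no change to how the configmap is generated.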

@chmouel

chmouel commented Mar 27, 2018

Working on this as well here #2749

@sthaha
Collaborator

sthaha commented Apr 3, 2018

Reopening as the issue isn't closed. Somehow the issues get automatically closed when the PR is merged to master.

@joshuawilson
Member

@sthaha if the PR description contains "fixes #[issue-number]", GitHub closes the issue when the PR gets merged.

@sthaha
Collaborator

sthaha commented Apr 3, 2018

@joshuawilson Thanks. The PR does fix the issue but I kept it open to indicate that it isn't released to prod. I guess I will track that elsewhere.

@sthaha sthaha closed this as completed Apr 3, 2018
@joshuawilson
Member

@sthaha you should track it in https://github.com/openshiftio/openshift.io/projects/3
Go over about 15 columns to Implementation Review; that is where things go when they are completed but not "done".
