BFS7.7.1 Failing Jenkins Job Issue #57
Comments
Let's focus on addressing these problems; this is top priority. Thanks. |
Update on Above Issue 2 & 3:
|
Analysis on Issue 1:
Problem:
Diagnosis:
Update:
|
What does "random" mean? Besides ciGroup5, did we see any other ciGroup fail randomly? Did ciGroup5 ever pass? Does it pass in a local run? |
@seraphjiang In the job below, CiGroup 9 fails because of an optimization failure. The difference between the failed CiGroup and the passed CiGroup is an optimization issue, where some modules in |
How about running each ciGroup separately? Would the tests always pass then? |
Running the tests separately/sequentially takes a couple of hours to finish the functional tests, compared to running them in parallel, which takes about 30 to 45 minutes.
As for the question "Will running the CiGroups separately always pass the tests?": there isn't enough data yet to show that it will. A couple of Jenkins jobs that run the functional tests sequentially will be started this evening, which should give us a picture of the test results.
Update (on running sequential functional tests): 5 Jenkins jobs were run, and they all failed due to two functional test case failures in the Jenkins job logs:
Error log:
|
I have spent some time since Friday researching this issue.
The cause is a combination of two factors: the optimization process during Kibana startup and the Jenkins pipeline Docker plugin. Please see more details below.
Background knowledge:
Problem Root Cause:
Now we can clearly see that the Jenkins Pipeline is trying to mount the same workspace to multiple parallel containers with read/write permissions at the same time. Each parallel container starts up a Kibana instance and goes through the optimization process, overwriting the same files at the same location on the host. This creates a race condition among all the parallel containers during the Kibana optimization processes. Hence, we see the plugin file not found errors, with unpredictable files each time.
How to solve this problem?
Please let me know if you have any questions. Thanks. |
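To make the shared-mount race described above easier to spot, here is a minimal Python sketch that flags any host path mounted read-write by more than one parallel container. The container names, the host path, and the `find_shared_rw_mounts` helper are hypothetical and only illustrate the situation; they are not taken from the actual Jenkins job.

```python
from collections import defaultdict


def find_shared_rw_mounts(mounts):
    """Return host paths that more than one container mounts read-write.

    `mounts` is a list of (container_name, host_path, mode) tuples,
    where mode is "rw" or "ro".
    """
    writers = defaultdict(list)
    for container, host_path, mode in mounts:
        if mode == "rw":
            writers[host_path].append(container)
    # Only paths with two or more writers can race with each other.
    return {path: names for path, names in writers.items() if len(names) > 1}


# Hypothetical layout: every parallel ciGroup container mounts the same
# Jenkins workspace read-write, as described in the root cause above.
mounts = [
    ("ciGroup1", "/var/jenkins/workspace/kibana", "rw"),
    ("ciGroup2", "/var/jenkins/workspace/kibana", "rw"),
    ("ciGroup3", "/var/jenkins/workspace/kibana", "rw"),
]

conflicts = find_shared_rw_mounts(mounts)
for path, names in conflicts.items():
    print(f"{path} is written by: {', '.join(names)}")
```

Any path reported here is a location where parallel Kibana optimization runs could overwrite each other's files.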
To speed up removing this blocker, I tested three different approaches yesterday and would like to share some test results. Please see the details below.
My conclusion is that the second approach, changing the Jenkins file, is cleaner and more efficient, and requires the fewest changes to the original test cases and environment. We can consider using this approach. However, please take on the challenge of improving it by making it even cleaner. In my test code, I created a directory outside the current Jenkins workspace, which was a quick and dirty way to do a PoC. We should contain everything within the current Jenkins workspace. Therefore, a better approach is to create a directory ${env.WORKSPACE}/.optimize/${currentCiGroup} and mount this directory into each container, then eventually remove these temp directories. Please let me know if you have questions. |
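The per-ciGroup directory idea above can be sketched roughly as follows. This is Python standing in for the Jenkins pipeline logic, not the actual Jenkins file change: the container-side path `/kibana/optimize` is an assumption (the thread does not state where the optimize directory lives inside the container), the helper names are hypothetical, and a temporary directory stands in for `env.WORKSPACE`.

```python
import shutil
import tempfile
from pathlib import Path

# Assumed container-side location of Kibana's optimize directory;
# the thread does not state the real path.
CONTAINER_OPTIMIZE_DIR = "/kibana/optimize"


def optimize_dir(workspace, ci_group):
    """Host directory reserved for one ciGroup's optimizer output."""
    return Path(workspace) / ".optimize" / ci_group


def docker_volume_args(workspace, ci_groups):
    """Create one host directory per ciGroup and build its `-v` argument."""
    args = {}
    for group in ci_groups:
        host_dir = optimize_dir(workspace, group)
        host_dir.mkdir(parents=True, exist_ok=True)
        args[group] = ["-v", f"{host_dir}:{CONTAINER_OPTIMIZE_DIR}:rw"]
    return args


def cleanup(workspace):
    """Remove all temporary optimize directories after the run."""
    shutil.rmtree(Path(workspace) / ".optimize", ignore_errors=True)


workspace = tempfile.mkdtemp()  # stands in for env.WORKSPACE
groups = [f"ciGroup{i}" for i in range(1, 4)]

volume_args = docker_volume_args(workspace, groups)
for group, args in volume_args.items():
    print(group, " ".join(args))

cleanup(workspace)
```

Because each container gets its own host directory, no two parallel optimization runs write to the same location, and everything stays inside the workspace so it can be removed in a single cleanup step.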
With the following second approach applied, the parallel functional tests pass.
Jenkins test jobs were run to confirm stable results. The past 24 test jobs passed: https://jenkins.bfs.sichend.people.aws.dev/job/Kibana/job/bfs7.7.1/ |
Description:
Currently, Jenkins jobs for bfs7.7.1 randomly fail. Below is the identified issue:
Kibana does not support the current Node.js version v8.14.0. Please use Node.js v10.19.0.