Jenkins Multi Master support #373
Comments
I still have a question regarding this: each master will handle a portion of the workflows. Let's say the master for the build workflows goes offline; will another master be able to pick up the workflow, or do we have to wait for the original one to come back online? Thanks. |
@jordarlu Thank you for taking this up. I am wondering if we can just have one more master node added in the existing code with similar settings as the existing one, except for name and labels, and then register it as a new target group under the existing load balancer. We then route the traffic based on url path, e.g., if it is |
Hopefully, once we split Jenkins to process each category of jobs (for example, one Jenkins for 'build', another for 'gradle-check', and another for 'benchmark'), we won't face this master-down issue anymore (if the root cause of the master going down was the workload). But that is a good point about having HA on the master. |
I would suggest keeping both masters mutually exclusive of each other and using them to distribute our jobs based on their functionality. |
Hey! Just wondering, did we research whether having 2 masters will cause split-brain issues? Some time back I read about this on a Jenkins forum. It is worth researching a bit and experimenting with a local setup before we move to implementation. AFAIK Jenkins is not supposed to have multiple masters, but I might be wrong and the technology might have evolved since I last read about it, so please do confirm. |
:) Wonderful! That is also the direction I learned we are moving toward. As an end result, we may end up having https://build.ci.opensearch.org/build/ for 'build', https://build.ci.opensearch.org/benchmark/ for 'benchmark', and https://build.ci.opensearch.org/gradlecheck/ for 'Gradle Check' (just to name a few as examples; we will certainly discuss how we want to categorize them). Thanks for the good suggestion. |
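For illustration only, here is a minimal Python sketch of the path-based routing idea discussed above. The path prefixes come from the example URLs in this thread; the instance names (`jenkins-build`, etc.) and the `route` helper are hypothetical, and a real setup would more likely express this as load balancer listener rules than as application code.

```python
from typing import Optional
from urllib.parse import urlparse

# Path prefixes taken from the example URLs in this thread.
# The target names on the right are hypothetical placeholders.
ROUTES = {
    "/build/": "jenkins-build",
    "/benchmark/": "jenkins-benchmark",
    "/gradlecheck/": "jenkins-gradle-check",
}


def route(url: str) -> Optional[str]:
    """Return the (hypothetical) Jenkins instance that should serve this URL."""
    path = urlparse(url).path
    for prefix, target in ROUTES.items():
        # Match "/build" exactly as well as anything under "/build/".
        if path == prefix.rstrip("/") or path.startswith(prefix):
            return target
    return None  # No category matched; the caller decides on a default.
```

With this mapping, a request under `/build/` goes only to the build instance, so a heavy benchmark run would not compete with it for the same master.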
Understood. Thanks for bringing this up, @gaiksaya, and let me do more research on that. The original idea was to distribute the load across separate Jenkins masters (based on the assumption that the master downtimes last month were caused by the increasing workload) while continuing to use the same access FQDN. But if we can have a way to do HA on the master (without causing the issue you mentioned), that will be even better, I believe. I appreciate the consideration of all possible downsides of HA and the experience sharing. |
Jenkins does not support multi-master with active-active load distribution; I assume they have some load balancing in the enterprise version: https://www.cloudbees.com/capabilities/continuous-integration. However, we have two options here.
I would go for option 2, as it has many advantages, such as Jenkins job-level isolation, easy upgrades, a smaller blast radius, easier management, and more. https://welltempereddeveloper.com/ci/cd/2019/04/08/jenkins-ha-multiple-masters.html |
Thanks for the insight, @prudhvigodithi. Should we explore both options that you mentioned above, as they do not interfere with each other? While we separate the Jenkins masters per category of workload, can we still have 'sort of' HA on each master to prevent a single point of failure? |
Sure Jeff. Once the Jenkins masters are split, we can take that up as a new enhancement to add an active/passive mechanism; it should be easy, as the underlying data store is EFS. |
I am closing this issue as we are moving on to creating multiple Jenkins instances instead of splitting the master node, to hopefully avoid confusion between the two. Let me also create a new issue to track the multiple Jenkins instances feature. |
Is your feature request related to a problem? Please describe
The existing Jenkins CI infrastructure serves as the exclusive system for executing a diverse range of critical tasks, including Gradle checks for Pull Requests (PR), release processes, benchmark tests, and various other functions.
Recently, there were instances of Jenkins performance degradation, possibly due to an escalating workload or other factors, which ultimately resulted in the Jenkins master node going down and caused Jenkins service downtime. Details about the most recent incident and the steps taken to restore Jenkins to functionality can be found at opensearch-project/opensearch-build#4130.
We need a long-term solution that will be capable of handling the growing workload to prevent future instances of Jenkins failure.
Describe the solution you'd like
At a high level, the proposal is to split Jenkins into multiple Jenkins masters, with each master handling one set (category) of workloads and isolated from the other masters and their workloads.
Describe alternatives you've considered
In addition to the proposal mentioned above, we are open to any other proposals and ideas from the community to make Jenkins even better. Please feel free to comment and describe your suggestions.
Additional context
This issue serves as the main issue for implementing Jenkins Multi Master support.
As we progress, we will consistently add and update comments, discussions, designs, and relevant issues and PRs to keep track of all activities.