Create a CI workflow that creates new AMIs using packer #258
Comments
Either way would work and I prefer using Jenkins.
More to consider:
Need 3 secrets to hold the value of these:
There might be another problem: the node that runs Packer is the source and needs to connect to the destination on ports 22/5985. If the workflow runs on an agent node, the connection would be agent -> agent, whereas our existing SG only allows connections from main -> agent. Either we add a new SG to allow agent -> agent connections (which is strongly discouraged for security reasons), or we restrict the AMI/Packer builder workflow to run only on the main node (main -> agent).
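The connectivity constraint described above can be sketched roughly as follows. This is only an illustrative model, not the actual security group configuration: the role names, the `ALLOWED_INGRESS` set, and the `packer_can_connect` helper are hypothetical, and the port mapping assumes 22 is used for SSH and 5985 for WinRM.

```python
# Hypothetical model of the existing SG: only main -> agent traffic is allowed.
ALLOWED_INGRESS = {("main", "agent")}

# Ports mentioned in the discussion (assumed: SSH for Linux/Mac, WinRM for Windows).
PACKER_PORTS = {"ssh": 22, "winrm": 5985}


def packer_can_connect(source_role, allowed=ALLOWED_INGRESS):
    """Return True if a node with source_role may reach a Packer build instance.

    The build instance is assumed to sit behind the agent SG, so the
    destination role is always "agent".
    """
    return (source_role, "agent") in allowed


if __name__ == "__main__":
    for role in ("main", "agent"):
        print(role, "-> agent allowed:", packer_can_connect(role))
```

Under this model, running Packer from an agent node only works if an agent -> agent rule is added to the allowed set, which is the trade-off raised above.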
Adding @gaiksaya @rishabh6788 @prudhvigodithi to the conversation on the above issues ^^. Thanks.
Why are we using Jenkins? GHA can do all of this using roles. All you need to provide is the right VPC and subnet, right?
In our discussion yesterday we were already talking about running it in Jenkins. The average Mac build takes 2+ hours and the average Windows build takes 1+ hour, which causes inconsistency in the build overall. Thanks.
@gaiksaya AMI building is an expensive task: it requires considerable resources and can take a long time to complete end to end. GH runners would run into the same issues we had with the manifest workflow failure, so it is better to use a Jenkins job.
Got it! I forgot about the resources aspect. Even so, all a machine needs is the right credentials, which has nothing to do with whether it is an agent node or the main node. If the agent node or AMI build is provided with the right credentials, we should be good. Running anything on the main node is restricted as a security measure, so we cannot and should not run on the main node.
We already have that in place.
See #258 (comment)
New docker image
We will use the same agentnode SG after making sure the Jenkins agents are all running on a private subnet.
This is completed.
Is your feature request related to a problem? Please describe
Currently the AMIs used by agent nodes are built from a specific base image that may go out of date or need updates as new kernel updates come in.
This happens as often as once per quarter. Even though we run yum update, apt update, etc., we still need to reboot the EC2 instance to apply those updates, which does not fit the Jenkins agent nodes' lifecycle management: if an SSH connection is lost (as happens when we reboot), a new agent is brought up.
Describe the solution you'd like
In order to apply regular updates to the base AMI image we need to build a new AMI.
Using Packer it is a pretty straightforward process. https://github.com/opensearch-project/opensearch-ci/tree/main/packer
Below are 2 possible approaches:
Please keep in mind that this needs to be a blue-green deployment, which is why old AMIs should be deprecated (made private) only after confirming that the new AMIs are working fine. This can be a manual process to start with, but it can be automated via GHA too if we maintain a list somewhere.
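As a rough illustration of the blue-green rule above, here is a minimal sketch that decides which old AMIs are safe to deprecate from a maintained list. Everything in it is an assumption for illustration: the `Ami` record, its fields, the `plan_deprecations` helper, and the image IDs are hypothetical, and the actual step of making an AMI private is not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ami:
    image_id: str           # hypothetical ID, for illustration only
    platform: str           # e.g. "linux", "windows", "macos"
    created: str            # ISO 8601 date, so string comparison orders correctly
    verified: bool = False  # True once agents launched from it are confirmed working


def plan_deprecations(amis):
    """Return IDs of AMIs that can be made private.

    An AMI is only listed if a newer AMI for the same platform has been
    verified, so an unconfirmed replacement never retires a working image.
    """
    to_deprecate = []
    for platform in sorted({a.platform for a in amis}):
        group = sorted(
            (a for a in amis if a.platform == platform),
            key=lambda a: a.created,
        )
        verified = [a for a in group if a.verified]
        if not verified:
            continue  # no confirmed image yet: keep everything as-is
        newest_verified = verified[-1]
        to_deprecate += [
            a.image_id for a in group if a.created < newest_verified.created
        ]
    return to_deprecate


if __name__ == "__main__":
    inventory = [
        Ami("ami-old-linux", "linux", "2023-01-01", verified=True),
        Ami("ami-new-linux", "linux", "2023-04-01", verified=True),
        Ami("ami-old-win", "windows", "2023-01-01", verified=True),
        Ami("ami-new-win", "windows", "2023-04-01", verified=False),
    ]
    print(plan_deprecations(inventory))
```

In this example only the old Linux image is selected, because the new Windows image has not been verified yet. The list itself could live wherever the AMI inventory is maintained.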
Describe alternatives you've considered
Do the entire process manually. However, building AMIs, even with Packer, takes more than half a day for the number of AMIs we have.
Additional context
No response