This repository has been archived by the owner on Jul 30, 2024. It is now read-only.
Better error handling in bootstrap_bastion.sh #1
Labels
enhancement
New feature or request
Comments
rmkraus added a commit that referenced this issue on May 6, 2020
Added better messaging and alerting when there is a failure. Added a success message if no failures. This addresses issue #1
rmkraus added a commit that referenced this issue on May 6, 2020
* Fixed permission issues in container image. The .ssh config file permissions were bad. The Dockerfile now ensures that these will be valid.
* Fixed builder files. Dockerfile - fixed permissions settings for root ssh keys. Makefile - Added publish_dev action.
* Bumped version
* Fixed issues with hypervisor role.
* Made documentation more explicit during the install process.
* Added cluster create command to CLI.
* Removed dedicated check for API to come up. The check was unreliable as the API can return different status codes at different times. The openshift-install wait-for command will check for the API anyway.
* Removed unneeded commands from farosctl script. Context, forget, and run were all relics from another project that are much less useful for faros. Removing them so they do not cause chaos. This addresses issue #2.
* Made the bastion bootstrap script more user friendly. Added better messaging and alerting when there is a failure. Added a success message if no failures. This addresses issue #1
rmkraus added a commit that referenced this issue on May 9, 2020
Debugged cluster install process. Added startup and shutdown commands.
* Fixed permission issues in container image. The .ssh config file permissions were bad. The Dockerfile now ensures that these will be valid.
* Fixed builder files. Dockerfile - fixed permissions settings for root ssh keys. Makefile - Added publish_dev action.
* Bumped version
* Fixed issues with hypervisor role.
* Made documentation more explicit during the install process.
* Added cluster create command to CLI.
* Removed dedicated check for API to come up. The check was unreliable as the API can return different status codes at different times. The openshift-install wait-for command will check for the API anyway.
* Removed unneeded commands from farosctl script. Context, forget, and run were all relics from another project that are much less useful for faros. Removing them so they do not cause chaos. This addresses issue #2.
* Made the bastion bootstrap script more user friendly. Added better messaging and alerting when there is a failure. Added a success message if no failures. This addresses issue #1
* Added startup cli command.
  1) Control plane nodes are powered on
  2) wait for 3 minutes for nodes to come up
  3) approve any pending CSRs
  This addresses #8
* Added shutdown cli command.
  1) The bootstrap certs are restored on app nodes.
  2) The Cert CA is regenerated on the control plane.
  3) The nodes are powered off.
  This addresses #8
* Bumped version.
* Fixed some issues with startup and shutdown scripts. This addresses #8
* Updated README with shutdown and startup docs.
* Fixes to shutdown roles. The daemonset to restore bootstrap certs should be removed when it is done.
* Increased idle timeout on load balancer to 10 minutes. Was 30 seconds, but rsh sessions were getting disconnected far too quickly.
* Defaulted back to dense output.
rmkraus added a commit that referenced this issue on May 19, 2020
* Fixed permission issues in container image. The .ssh config file permissions were bad. The Dockerfile now ensures that these will be valid.
* Fixed builder files. Dockerfile - fixed permissions settings for root ssh keys. Makefile - Added publish_dev action.
* Bumped version
* Fixed issues with hypervisor role.
* Made documentation more explicit during the install process.
* Added cluster create command to CLI.
* Removed dedicated check for API to come up. The check was unreliable as the API can return different status codes at different times. The openshift-install wait-for command will check for the API anyway.
* Removed unneeded commands from farosctl script. Context, forget, and run were all relics from another project that are much less useful for faros. Removing them so they do not cause chaos. This addresses issue #2.
* Made the bastion bootstrap script more user friendly. Added better messaging and alerting when there is a failure. Added a success message if no failures. This addresses issue #1
* Added startup cli command.
  1) Control plane nodes are powered on
  2) wait for 3 minutes for nodes to come up
  3) approve any pending CSRs
  This addresses #8
* Added shutdown cli command.
  1) The bootstrap certs are restored on app nodes.
  2) The Cert CA is regenerated on the control plane.
  3) The nodes are powered off.
  This addresses #8
* Bumped version.
* Fixed some issues with startup and shutdown scripts. This addresses #8
* Updated README with shutdown and startup docs.
* Fixes to shutdown roles. The daemonset to restore bootstrap certs should be removed when it is done.
* Increased idle timeout on load balancer to 10 minutes. Was 30 seconds, but rsh sessions were getting disconnected far too quickly.
* Defaulted back to dense output.
* Changed the startup operator heal timeout to 15 minutes from 5. Seems like more time is occasionally necessary.
* Added poweron and poweroff aliases.
* Fixed verbosity issues with dense callback plugin.
* Added VIP configuration for load balancer. Adding a VIP with keepalived allows the IP address to be moved to the cluster when the cluster is done booting.
* Configured Cluster DNS entries to point to the loadbalancer VIP. Also squashed a bug where wildcard entries were not properly updated if they already existed.
* Added recipe for hosted-loadbalancer installation. This addresses issue #3.
* Added code for deploy cli command. Also removed /app/deploy.sh as its logic was folded into the cli command. This addresses issue #3
* Added KUBECONFIG to default profile.
* Bumped version, renamed cookbook directory.
* Fixed permissions issue with hosted loadbalancer operator.
rmkraus added a commit that referenced this issue on May 25, 2020
* Fixed permission issues in container image. The .ssh config file permissions were bad. The Dockerfile now ensures that these will be valid.
* Fixed builder files. Dockerfile - fixed permissions settings for root ssh keys. Makefile - Added publish_dev action.
* Bumped version
* Fixed issues with hypervisor role.
* Made documentation more explicit during the install process.
* Added cluster create command to CLI.
* Removed dedicated check for API to come up. The check was unreliable as the API can return different status codes at different times. The openshift-install wait-for command will check for the API anyway.
* Removed unneeded commands from farosctl script. Context, forget, and run were all relics from another project that are much less useful for faros. Removing them so they do not cause chaos. This addresses issue #2.
* Made the bastion bootstrap script more user friendly. Added better messaging and alerting when there is a failure. Added a success message if no failures. This addresses issue #1
* Added startup cli command.
  1) Control plane nodes are powered on
  2) wait for 3 minutes for nodes to come up
  3) approve any pending CSRs
  This addresses #8
* Added shutdown cli command.
  1) The bootstrap certs are restored on app nodes.
  2) The Cert CA is regenerated on the control plane.
  3) The nodes are powered off.
  This addresses #8
* Bumped version.
* Fixed some issues with startup and shutdown scripts. This addresses #8
* Updated README with shutdown and startup docs.
* Fixes to shutdown roles. The daemonset to restore bootstrap certs should be removed when it is done.
* Increased idle timeout on load balancer to 10 minutes. Was 30 seconds, but rsh sessions were getting disconnected far too quickly.
* Defaulted back to dense output.
* Changed the startup operator heal timeout to 15 minutes from 5. Seems like more time is occasionally necessary.
* Added poweron and poweroff aliases.
* Fixed verbosity issues with dense callback plugin.
* Added VIP configuration for load balancer. Adding a VIP with keepalived allows the IP address to be moved to the cluster when the cluster is done booting.
* Configured Cluster DNS entries to point to the loadbalancer VIP. Also squashed a bug where wildcard entries were not properly updated if they already existed.
* Added recipe for hosted-loadbalancer installation. This addresses issue #3.
* Added code for deploy cli command. Also removed /app/deploy.sh as its logic was folded into the cli command. This addresses issue #3
* Added KUBECONFIG to default profile.
* Bumped version, renamed cookbook directory.
* Fixed permissions issue with hosted loadbalancer operator.
* Added deploy container-storage command.
  1) Added the logic to the deploy container storage. This addresses issue #4
  2) Rearranged deploy folder to be more flexible
  3) Added ability for ansible to save stats to a yaml file
* Bumped version number.
* Initial pass at cluster version pinning. This addresses #18
* Squashing some bugs.
* Cleaned up my_dense stdout callback plugin.
  1) broke out the post message logic to a new plugin
  2) broke out the save stats logic to a new plugin
  3) fixed the error handling
  4) cleaned up the runtime messages
  This addresses issue #21
* Resolving issue with bootstrap node not being shut down. This addresses issue #22
* Resolved issue with errors shutting down. The K8s API would briefly drop out of service, which caused Ansible to crash. The code is now more tolerant of this. This addresses issue #20
* Fixed issue with status messages not being correct. This addresses issue #21
* Final touches on container storage deployment recipe. This addresses issue #4
* Added verbosity to a failed install because of cert age. This addresses issue #23
If there is an error while running the installation script, the installation should stop and that error should be made visible to the user.
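For illustration, here is a minimal bash sketch of the requested behavior: stop at the first failing step, print the error where the user will see it, and print a success message only if nothing failed. It is not the actual contents of bootstrap_bastion.sh; the function names `run_step` and `run_all_steps` are hypothetical, and each step is assumed to be a command or shell function that returns a non-zero exit code on failure.

```shell
#!/usr/bin/env bash
# Hypothetical sketch only; not taken from bootstrap_bastion.sh.

# run_step DESCRIPTION COMMAND [ARGS...]
# Runs a single step. If the command fails, prints a clearly marked error
# to stderr and returns the command's exit code to the caller.
run_step() {
    local description="$1"
    shift
    echo "==> ${description}"
    local rc=0
    "$@" || rc=$?
    if [ "${rc}" -ne 0 ]; then
        echo "ERROR: step '${description}' failed with exit code ${rc}" >&2
        return "${rc}"
    fi
}

# run_all_steps COMMAND...
# Runs each command (an executable or shell function name) in order and
# stops at the first failure. The success message is printed only when
# every step has returned zero.
run_all_steps() {
    local step
    for step in "$@"; do
        run_step "${step}" "${step}" || return $?
    done
    echo "SUCCESS: all steps completed without errors"
}
```

A script built this way might end with something like `run_all_steps install_packages configure_dns || exit 1`, where `install_packages` and `configure_dns` are placeholder function names. Routing the error to stderr and returning the original exit code keeps the failure visible both to a user at the terminal and to any tool that wraps the script.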