RPC Error #101
Comments
Thanks for your submission @dtonnesen, we'll triage this shortly.
Hi @dtonnesen, can you re-install the CSM installer and set
Thank you. I tried that too; I should have mentioned it, sorry. I can do it again if you think the logs might be different.
Yes, please try again and check if the logs are different. Prior to re-installing you can delete the
Sure, I did that last time as well, but I'll do it again.
Looks to be the same errors: I211117 13:12:51.808172 28 server/init.go:197 ⋮ [n?] 30 awaiting transport: Error while dialing dial tcp: i/o timeout" (retrying)
Currently reviewing the environment.
Closing this question as we've discussed a workaround using the helm charts.
How can the Team help you today?
I am attempting to install CSM on vanilla Kubernetes v1.20.9 running on Ubuntu. I am using the default parameters and modifying only the four parameters in values.yaml, with no certificates.
root@dsib0211:~/csm# more values.yaml
jwtKey: key
cipherKey: "aasdfgafhgshsffadgshsdffgsdggggg"
adminUserName: admin
adminPassword: admin
helm install -n csm-installer --set-string scheme=http --set-string dbSSLEnabled="false" --create-namespace csm-installer dell/csm-installer -f values.yaml
NAME: csm-installer
LAST DEPLOYED: Tue Nov 16 10:15:21 2021
NAMESPACE: csm-installer
STATUS: deployed
REVISION: 1
TEST SUITE: None
The deployment appears to succeed, but the cockroachdb pods never become ready:
NAMESPACE NAME READY STATUS
csm-installer cluster-init-sjtdr 1/1 Running
csm-installer cockroachdb-0 0/1 Running
csm-installer cockroachdb-1 0/1 Running
csm-installer dell-csm-installer-86665ffb7d-dphzh 1/1 Running
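For longer pod listings, a small filter can pick out the pods that are Running but not fully ready, like the two cockroachdb pods above. The following is only a sketch: the function name `not_ready_pods` is hypothetical, and it assumes the input is `kubectl get pods` output with a header row containing NAME and READY columns.

```shell
#!/bin/sh
# Hypothetical helper (not part of CSM or kubectl): print pods whose READY
# column shows fewer ready containers than expected.
# Reads `kubectl get pods` style output, including the header row, on stdin.
not_ready_pods() {
  awk '
    NR == 1 {
      # Locate columns by header name so this works with or without -A.
      for (i = 1; i <= NF; i++) {
        if ($i == "NAME")  name_col = i
        if ($i == "READY") ready_col = i
      }
      next
    }
    name_col && ready_col {
      # READY has the form "ready/total", e.g. "0/1".
      split($ready_col, r, "/")
      if (r[1] + 0 < r[2] + 0) print $name_col, $ready_col
    }
  '
}

# Example usage (assumes kubectl access to the cluster):
#   kubectl get pods -n csm-installer | not_ready_pods
```

Locating the columns from the header avoids hard-coding positions, which shift depending on whether the NAMESPACE column is present.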
The cluster-init logs constantly repeat the error below, which I assume refers to the database:
kubectl logs -f cluster-init-sjtdr -n csm-installer
warning: node not ready to perform cluster initialization: initial connection heartbeat failed: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp: i/o timeout" (retrying)
The cockroachdb logs show a connectivity issue:
kubectl logs -f cockroachdb-0 -n csm-installer
W211116 15:16:03.965422 115 vendor/google.golang.org/grpc/internal/channelz/logging.go:73 ⋮ [-] 26 ‹grpc: addrConn.createTransport failed to connect to {cockroachdb-1.cockroachdb:26257 0 }. Err: connection error: desc = "transport: Error while dialing dial tcp: i/o timeout". Reconnecting...›
W211116 15:16:03.965681 46 server/init.go:374 ⋮ [n?] 27 outgoing join rpc to ‹cockroachdb-1.cockroachdb:26257› unsuccessful: ‹rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp: i/o timeout"›
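When the log is long, it can help to reduce the join failures above to the distinct peer addresses that could not be reached. This is a rough sketch only: the function name `unreachable_peers` is hypothetical, and it assumes the log lines keep the "outgoing join rpc to ‹host:port› unsuccessful" wording shown above.

```shell
#!/bin/sh
# Hypothetical helper (not part of CockroachDB or CSM): list the distinct
# host:port peers named in "outgoing join rpc to ... unsuccessful" log lines.
# Reads log text on stdin.
unreachable_peers() {
  # Skip any non-alphanumeric marker characters (such as the redaction
  # brackets) before the address, then capture host:port.
  sed -nE 's/.*outgoing join rpc to [^A-Za-z0-9]*([A-Za-z0-9.-]+:[0-9]+).*/\1/p' |
    sort -u
}

# Example usage (assumes kubectl access to the cluster):
#   kubectl logs cockroachdb-0 -n csm-installer | unreachable_peers
```

For the log excerpt above, this would narrow the problem to a single peer address, which can then be checked for DNS resolution and network reachability between the pods.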
I have searched thoroughly for troubleshooting information and tried changing values such as the host and API server in values.yaml, but I get the same error. I'm happy to provide additional information if useful. Thanks.