Workspaces websockets not working properly on AWS and EKS Kubernetes clusters using nginx as reverse proxy #19434

Closed
5 of 24 tasks
bmboucher opened this issue Mar 27, 2021 · 13 comments
Labels
area/install Issues related to installation, including offline/air gap and initial setup kind/bug Outline of a bug - must adhere to the bug report template. new&noteworthy For new and/or noteworthy issues that deserve a blog post, new docs, or emphasis in release notes severity/P1 Has a major impact to usage or development of the system. status/release-notes-review-done Issues that have been reviewed by the doc team for the Release Notes wording

@bmboucher

bmboucher commented Mar 27, 2021

Describe the bug

I've been running 7.20.0 for a while. After upgrading to 7.27.2, I can't get the websocket endpoints to connect. The dashboard and the REST endpoints are available (I can successfully test them through Swagger), but the dashboard shows an error on loading and the console shows a number of websocket-related errors:

https://issues.redhat.com/browse/CRW-2082

Che version

  • latest
  • nightly
  • other: 7.27.2

Steps to reproduce

Fails on first loading the dashboard after logging in.

Runtime

  • kubernetes
    Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.1", GitCommit:"7879fc12a63337efff607952a323df90cdc7a335", GitTreeState:"clean", BuildDate:"2020-04-08T17:38:50Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
    Server Version: version.Info{Major:"1", Minor:"19+", GitVersion:"v1.19.6-eks-49a6c0", GitCommit:"49a6c0bf091506e7bafcdb1b142351b69363355a", GitTreeState:"clean", BuildDate:"2020-12-23T22:10:21Z", GoVersion:"go1.15.5", Compiler:"gc", Platform:"linux/amd64"}
  • Openshift (include output of oc version)
  • minikube (include output of minikube version and kubectl version)
  • minishift (include output of minishift version and oc version)
  • docker-desktop + K8S (include output of docker version and kubectl version)
  • other: (please specify)

Screenshots

image

image

Installation method

  • chectl
  • Helm
  • OperatorHub
  • I don't know

Environment

  • my computer
    • Windows
    • Linux
    • macOS
  • Cloud
    • Amazon
    • Azure
    • GCE
    • other (please specify)
  • Dev Sandbox (workspaces.openshift.com)
  • other: please specify

Eclipse Che Logs

che-dev.zip

Release Notes Text

A bug that prevented workspace websockets from working properly on AWS and EKS Kubernetes clusters using nginx as a reverse proxy has been fixed.

@bmboucher bmboucher added the kind/bug Outline of a bug - must adhere to the bug report template. label Mar 27, 2021
@che-bot che-bot added the status/need-triage An issue that needs to be prioritized by the curator responsible for the triage. See https://github. label Mar 27, 2021
@benoitf
Contributor

benoitf commented Mar 27, 2021

looks like a duplicate of #19403

@sleshchenko
Member

WebSocket failure should not lead to such a state.
@bmboucher could you check other failed requests on Network tab of Developer Tools?

@l0rd l0rd added severity/P1 Has a major impact to usage or development of the system. area/dashboard and removed status/need-triage An issue that needs to be prioritized by the curator responsible for the triage. See https://github. labels Mar 29, 2021
@bmboucher
Author

WebSocket failure should not lead to such a state.
@bmboucher could you check other failed requests on Network tab of Developer Tools?

I don't see any other failed requests; when I load the dashboard page I get some 304/302 responses and a couple of 200 responses whenever it hits the "/api" endpoints. Only the websocket connection fails. I've confirmed that our older deployment (7.20.0) on similar infrastructure establishes the ws connection just fine.

image
image

@bmboucher
Author

UPDATE: I reverted to version 7.26.0 and it fixed the "unable to get available storage types" error; I think that one is coming from #19403 as @benoitf indicated. I do have the same issue with websockets, but now I see the expected error notification at the top of the page:
image

@sleshchenko
Member

I've confirmed that our older deployment (7.20.0) on similar infrastructure does establish the ws connection just fine.

@bmboucher that's strange.
The WebSocket is closed right after it's established.
Could you share the websocket frames? Is there anything interesting in the Che Server logs?

@bmboucher
Author

OK, I figured it out - first off, apologies that I was a little inaccurate in my problem description. The networking setup was not identical between the instance that was working and the broken one as I had thought - we had moved to a new nginx deployment and that was either confused about where to route WS traffic or whether it was enabled at all.

The solution was to add the following annotation to our che-ingress:

nginx.org/websocket-services: che-host

We installed from the Helm chart which doesn't contain this annotation so you might want to consider adding it; I'm not sure when NGINX began requiring it or if chectl would have solved the problem.
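
For anyone hitting the same problem, here is a minimal sketch of what the annotated ingress might look like. Only the ingress name (che-ingress), the service name (che-host) and the annotation itself come from this thread; the hostname, port and API version are assumptions for illustration and need to match your own deployment. Note that the nginx.org/ prefix is read by the NGINX Inc. controller (nginx/kubernetes-ingress), not by the community ingress-nginx controller, which uses a different annotation prefix.

```yaml
# Sketch only: everything except che-ingress, che-host and the annotation
# is an assumption and must be adapted to the actual deployment.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: che-ingress
  annotations:
    # Enables WebSocket proxying for the listed service(s) in the
    # NGINX Inc. ingress controller; separate multiple names with commas.
    nginx.org/websocket-services: "che-host"
spec:
  rules:
    - host: che.example.com          # hypothetical hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: che-host
                port:
                  number: 8080       # assumed Che server port
```

If the ingress already exists, adding only the annotation to its metadata should be enough; the rest of the manifest is shown just to give the annotation some context.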

@sleshchenko sleshchenko added area/install Issues related to installation, including offline/air gap and initial setup and removed area/dashboard labels Mar 30, 2021
@tolusha
Contributor

tolusha commented Mar 30, 2021

@bmboucher
Thank you for the investigation.
Some info about the nginx.org/websocket-services annotation:
nginx/kubernetes-ingress#322

@tolusha
Contributor

tolusha commented Jul 16, 2021

Fixed by eclipse-che/che-server#54

@l0rd l0rd added the new&noteworthy For new and/or noteworthy issues that deserve a blog post, new docs, or emphasis in release notes label Jul 20, 2021
@l0rd l0rd changed the title Websockets not connecting after upgrade from 7.20.0 to 7.27.2 (AWS/EKS) Websockets not connecting on AWS and EKS Kubernetes using nginx as reverse proxy Jul 20, 2021
@l0rd l0rd changed the title Websockets not connecting on AWS and EKS Kubernetes using nginx as reverse proxy Workspaces websockets not working properly on AWS and EKS Kubernetes using nginx as reverse proxy Jul 20, 2021
@l0rd l0rd changed the title Workspaces websockets not working properly on AWS and EKS Kubernetes using nginx as reverse proxy Workspaces websockets not working properly on AWS and EKS Kubernetes clusters using nginx as reverse proxy Jul 20, 2021
@l0rd l0rd added the status/release-notes-review-needed Issues that needs to be reviewed by the doc team for the Release Notes wording label Jul 20, 2021
@MichalMaler MichalMaler added status/release-notes-review-done Issues that have been reviewed by the doc team for the Release Notes wording and removed status/release-notes-review-needed Issues that needs to be reviewed by the doc team for the Release Notes wording labels Jul 22, 2021
@kphilly1

Do you know when the next Docker image version of Che (minor/micro) that includes these fixes will be published? It looks close, based on the latest activity here. If not today or tomorrow, is there documentation on how we can build the affected Docker images locally (with code from the 'main' branch)? Thank you very much.

@tolusha
Contributor

tolusha commented Jul 23, 2021

The fix is included in the next version of Eclipse Che.
You can deploy it using chectl installed from the next channel.

@kphilly1

Thanks for your reply @tolusha . So we can pull the 'next' version of the Docker images for the various Che components from quay.io?

https://quay.io/repository/eclipse/che-server?tab=tags

If so, we can pull the 'next' version and try it. Thanks again!

@tolusha
Contributor

tolusha commented Jul 23, 2021

@kphilly1
yes, you can

@kphilly1

Thanks @tolusha !

8 participants