Windows Lens v5.0.0 first connection attempt fails #3226

Closed
Kolossi opened this issue Jun 30, 2021 · 23 comments

@Kolossi

Kolossi commented Jun 30, 2021

Describe the bug
After adding external "sync" configs to Lens for Windows v5.0.0, the first attempt to connect to a cluster always fails with an "Invalid Credentials" error.

To Reproduce
Steps to reproduce the behavior:

  1. Add a cluster external config file as a "Kubeconfig Sync" in the preferences "Kubernetes" tab
  2. Restart Lens
  3. Attempt to connect to the cluster
  4. Error "invalid credentials" is shown
  5. Attempt to connect again and the connection can succeed

Expected behavior
Where the config is valid, the first attempt should succeed without error.

Screenshots

Screenshot, including failed attempt to get debug logs for win version using WSL terminal:

image

Environment (please complete the following information):

  • Lens Version: v5.0.0
  • OS: Windows
  • Installation method : auto upgraded from v5.0.0

Logs:
When you run the application executable from command line you will see some logging output. Please paste them here:

Unable to get debug logs from Lens on Windows; see the later part of the discussion at https://k8slens.slack.com/archives/C1U90NQN8/p1625045050141900 and the screenshot in this post.

Kubeconfig:
Quite often the problems are caused by malformed kubeconfig which the application tries to load. Please share your kubeconfig, remember to remove any secret and sensitive information.

N/A: the connection works on the second attempt.
The first-time failure occurs with multiple clusters and with valid files using both ServiceAccount and Dex/LDAP login.

Additional

I would like to capture and post debug logging, and will do so once advised how to do it in the Windows version of Lens.

@Kolossi added the bug label on Jun 30, 2021

@jeffjones-kbx

I am also getting a failed connection on the initial attempt to connect to a cluster. I hit Reconnect a few times and got further, but received the output in the screenshot below. I am able to view the cluster resources now, though.

image

@Kolossi
Author

Kolossi commented Jul 1, 2021

Additional observation:

When performing the initial failing connect attempt there is only the message "Connecting..." and no message "Authentication proxy started" at any stage.

When clicking to try again there is the message "Reconnecting..." and then after a while "Authentication proxy started" after which the connection then succeeds.

@Nokel81
Collaborator

Nokel81 commented Jul 2, 2021

That is a good hint, thanks.

@jakolehm modified the milestones: 5.0.1, 5.0.3 on Jul 5, 2021

@andyliddle

It can take 2 or 3 attempts to connect, and it's not quick compared to previous versions.

Also, I just get the error "Oops, something went wrong."

@Nokel81 modified the milestones: 5.0.3, 5.1.0 on Jul 12, 2021

@kumargauravsinha

When can we expect a fix for this recurring issue, please? This is so frustrating.

@fatpowaranga

fatpowaranga commented Jul 15, 2021

I assumed this would be resolved before the 5.0 release, since the team noted it during the beta, but it's still a very prevalent issue. Now, when Lens has very high CPU usage (after opening a couple of clusters or so) and I restart Lens, I run into this error for every cluster individually, and it can take 1 to 5 reconnection attempts before it eventually works.

Any thoughts on what could be the cause?

Basically:

  1. Try to connect to cluster after restarting Lens.
  2. Get "Authentication proxy started - Oops, something went wrong" and hit Reconnect
  3. Most of the time it connects here, sometimes it goes into a bit of a loop and then eventually reconnects after a few attempts.

I have noticed that if I then immediately or in the next few minutes try to connect to another cluster, it works more often than not.

@Nokel81 modified the milestones: 5.1.0, 5.1.1 on Jul 15, 2021

@Nokel81
Collaborator

Nokel81 commented Jul 15, 2021

Marking this as a blocker for 5.1.1 as we plan to release 5.1 (with some other bug fixes) today.

@maartengo

I updated just now, and the first connection fails with the following error: Oops, something went wrong. Error: connect ECONNREFUSED 127.0.0.1:80
image

This is on 5.1.0-latest.20210715.1

@ramanNarasimhan77

ramanNarasimhan77 commented Jul 16, 2021

I can also confirm that this is still happening in the latest release build

image

image

@Nokel81
Collaborator

Nokel81 commented Jul 16, 2021

Yes, we are currently investigating why that proxy connection is failing on Windows. Thanks for the additional report.

@Nokel81 modified the milestones: 5.1.1, 5.1.2 on Jul 17, 2021

@m8ram

m8ram commented Jul 19, 2021

Every first connection to a cluster fails. Example for an AWS EKS cluster:
image

This fails in the same way for a Docker Desktop Kubernetes cluster and a GCP K8s cluster.

I have my K8s kubeconfig set up locally, no sync or Lens cloud.

So far, it has always worked after clicking Reconnect.

Is the local web server (127.0.0.1:80) part of Lens?

@Nokel81
Collaborator

Nokel81 commented Jul 19, 2021

Yes, the local web server is part of Lens. We are still looking into this.
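
For context on the `ECONNREFUSED` message reported earlier in this thread: that error code generally means a TCP connection was attempted to an address and port where nothing was listening. The sketch below is a generic Python illustration of that behaviour, not Lens code, and it does not explain why Lens targeted 127.0.0.1:80. The helper name `connect_error_name` is hypothetical.

```python
import socket


def connect_error_name(host: str, port: int, timeout: float = 2.0) -> str:
    """Try a TCP connect; return "connected" or the name of the error raised."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"
    except OSError as exc:
        return type(exc).__name__


# Pick a loopback port that is very likely unused: let the OS assign one,
# note its number, then release it without ever listening on it.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
    probe.bind(("127.0.0.1", 0))
    unused_port = probe.getsockname()[1]

# Nothing is listening here, so the connect is refused.
result_without_listener = connect_error_name("127.0.0.1", unused_port)

# With a listener in place, the same kind of connect succeeds.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
    server.bind(("127.0.0.1", 0))
    server.listen()
    listening_port = server.getsockname()[1]
    result_with_listener = connect_error_name("127.0.0.1", listening_port)

print(result_without_listener, result_with_listener)
```

This matches the pattern several reporters describe, where a connection attempt made before the local proxy is up fails and a later attempt succeeds, although the thread itself does not confirm that this is the root cause.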

@Nokel81 modified the milestones: 5.1.3, 5.2.0 on Jul 22, 2021

@maartengo

maartengo commented Jul 23, 2021

#3458 seems to have fixed the issue 🎉
image

@Nokel81
Collaborator

Nokel81 commented Jul 23, 2021

🎉 Excellent. What about you, @Kolossi?

@ramanNarasimhan77

I also confirm that the latest build has fixed this issue.

image

@Kolossi
Author

Kolossi commented Jul 27, 2021

@Nokel81: Unfortunately, it's not quite there for me on Windows 10, with exactly the same "Help -> About" details as the image in ramanNarasimhan77's previous post.

When I click to connect to the first cluster, it goes through "connecting..." -> 5-second delay -> "Authentication proxy started", but then spins forever (for at least 5 minutes), with no error shown.

If I click a second cluster I again get "connecting..." -> "Authentication proxy started", but then spins forever, no error shown.

Clicking back on the first cluster, it says "connecting..." then connection completes after about a second and overview shown.

Clicking back on the second cluster, it says "connecting..." then connection completes after about a second and overview shown.

Having done this, if I click on a third cluster I still get "connecting..." -> "Authentication proxy started" and it spins forever, no error shown.

Clicking off to first or second cluster then back to third cluster, it says "connecting..." then connection completes after about a second and overview shown.

@HarelM

HarelM commented Aug 8, 2021

I'm also experiencing connection issues with Lens 5. It doesn't get solved for me on the second attempt. Not sure if I need to open a separate issue for this.
I have downgraded to version 4.2.5 and connection is working.
Let me know if I can help in any way - logs etc.
kubectl is working fine without any issues using the same config file.

@Kolossi
Author

Kolossi commented Aug 11, 2021

@Nokel81 Any luck tracking this down?

Mostly it still does as per my previous comment, but just occasionally (like this morning) it connects straight in.

This does make me wonder again (with no evidence other than suspicion) whether #2750 is involved?

@Nokel81
Collaborator

Nokel81 commented Aug 12, 2021

We have #3511, which is slated to be part of the 5.2.0 release. As for #2750, I wonder if that is a race condition... We use the port that is automatically given to us by the OS. However, in 4.2.4 there was a race condition where we would try to get a port and then stop using it before trying to use it again; that has since been removed.
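
To illustrate the two port-allocation patterns described above, here is a minimal Python sketch. It is not Lens code, and the function names `listen_on_os_assigned_port` and `probe_free_port_racy` are hypothetical. The first function binds to port 0 so the OS assigns a free port and keeps that socket open; the second shows the racy shape, where a port is discovered, released, and only later reused.

```python
import socket


def listen_on_os_assigned_port(host: str = "127.0.0.1") -> socket.socket:
    """Bind to port 0 so the OS picks a free port, and keep the socket open."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((host, 0))
    server.listen()
    return server


def probe_free_port_racy(host: str = "127.0.0.1") -> int:
    """Racy pattern: find a free port, release it, and return only its number.

    Between this function returning and the caller binding the port again,
    another process may take it, or a client may connect and be refused.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.bind((host, 0))
        return probe.getsockname()[1]


# Safe pattern: the socket that received the port is the one that serves it.
server = listen_on_os_assigned_port()
port = server.getsockname()[1]

# A client can connect right away because the port was never released.
with socket.create_connection(("127.0.0.1", port), timeout=2) as client:
    connected_to = client.getpeername()[1]

server.close()
print(f"listened on and connected to port {port}")
```

The design point is that the port number is read back from the listening socket itself, so there is no window in which the port is known but not held.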

@m8ram

m8ram commented Sep 10, 2021

I still experienced the issue with 5.1.3 as well.
With 5.2.0, the connections did not throw these errors on first use. I'll monitor in the coming days.

@Kolossi
Author

Kolossi commented Sep 10, 2021

@Nokel81 I agree with @m8ram that initial indications with 5.2.0 are good - thanks!

As it's an issue that sometimes happens and sometimes doesn't, it seems fair to hold on closing for now though - I'll also monitor over the next few days and report back.

@Nokel81
Collaborator

Nokel81 commented Sep 13, 2021

Will close this then. Please open a new issue (referencing this one) if you encounter it again.

@Nokel81 closed this as completed on Sep 13, 2021

@HarelM

HarelM commented Sep 14, 2021

It's now connecting in version 5.2 but causes a blue screen of death in Windows... I'll open a new issue.
