Autoclustering broke in master #1013
Just to be clear, I only see one host in each of the apps.
I did have 0.12 running before this and ran this on each host to upgrade:
After looking at https://github.com/weaveworks/scope/pull/867/files#diff-79e626e243584a7a4f65f233eca99889R59 I would say the problem is #867 broke Instead of using the package-level client I think we should use
Actually, we use a (hardcoded) FQDN (scope.weave.works). @errordeveloper are you by any chance using a non-default Weave domain?
you probably mean
Asked @errordeveloper in person who said he is not using a custom weave domain.
cannot reproduce...
Yep, sorry. @errordeveloper Can you come up with a self-contained repro we can use?
Step 1: create 7 DigitalOcean machines (need to
Step 3: access the app on any of the hosts, make sure there are 7 nodes in the hosts view
Looks like 5 hosts is already enough to break it... will try 4 now.
Can someone look into the implementation details? Maybe the underlying DNS library doesn't handle too many records? We do have 3 records for each host, you know...
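One way to check whether a resolver copes with that many records is to look the name up directly and count what comes back. The following is only a rough sketch, not Scope's actual lookup code: it assumes the Go standard library's `net.LookupHost`, and the `lookupFunc` type and `resolveTargets` helper are hypothetical names introduced here for illustration.

```go
package main

import (
	"fmt"
	"net"
	"sort"
)

// lookupFunc has the same shape as net.LookupHost, so a fake resolver
// can be swapped in when no real DNS is available. (Hypothetical type.)
type lookupFunc func(host string) ([]string, error)

// resolveTargets (hypothetical helper) looks up hostname and returns the
// addresses sorted and de-duplicated, so the count can be compared with
// the number of hosts that are expected to be registered.
func resolveTargets(lookup lookupFunc, hostname string) ([]string, error) {
	addrs, err := lookup(hostname)
	if err != nil {
		return nil, err
	}
	seen := make(map[string]struct{}, len(addrs))
	unique := make([]string, 0, len(addrs))
	for _, a := range addrs {
		if _, ok := seen[a]; ok {
			continue // skip addresses already collected
		}
		seen[a] = struct{}{}
		unique = append(unique, a)
	}
	sort.Strings(unique)
	return unique, nil
}

func main() {
	// scope.weave.works is the hostname mentioned in this thread; whether it
	// resolves depends on the DNS setup of the machine running this.
	targets, err := resolveTargets(net.LookupHost, "scope.weave.works")
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Printf("%d unique addresses: %v\n", len(targets), targets)
}
```

Running this on one of the affected hosts and comparing the count with the number of nodes would show whether records are being dropped or duplicated before the probe ever sees them.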
Looks like it breaks with 5 hosts.
With 0.12 I see this:
Also,
That's already wrong. @errordeveloper has 5 nodes and the probe is reporting to 6 nodes, mixing up public interface IPs (
This appears to be quite likely to do with the upgrade sequence, as shown in #1013 (comment). Once 0.12 is running, here is what the logs look like:
After upgrading to the latest build (1bfbf67), the logs show probes exiting the publish loop:
So, there are two problems:
Also, doing
I've just tried stopping all, downgrading and upgrading again, and have in fact noticed that for a brief period of time after upgrading the probes are publishing and that's visible in the app.
Using df2c21e.
I have 7 nodes, and 24 A records in DNS:
I'm also seeing a bunch of WebSocket errors: