* Dimension "bandwidth exceeded" exhausted on 1 nodes #7405

Closed
aep opened this issue Mar 20, 2020 · 4 comments


@aep
Contributor

aep commented Mar 20, 2020

Nomad version

Nomad v0.10.4 (f750636)

Operating system and Environment details

Arch Linux.

Issue

There appears to be no way to set network_speed, or at least to have it ignored, when the interface is not the public interface (I need to set network_interface, otherwise Nomad announces the wrong address internally to Consul).

It is unclear whether network_speed is being ignored, because nomad node-status --verbose does not show any network resources. For the same reason there is no way to tell why bandwidth is exhausted, since no jobs require any network resources.

Reproduction steps

client {
  network_interface = "wireguard"
  network_speed     = 1000
}

Launch any job without network resource constraints.
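For illustration, a job of this kind might look like the minimal sketch below. It is not the job from the report: the job, group, and task names, the "dc1" datacenter, and the choice of the exec driver running sleep are all assumptions. The only point is that the resources block requests CPU and memory and contains no network block.

```hcl
# Hypothetical example job; names, datacenter, and driver are assumptions.
job "no-network-example" {
  datacenters = ["dc1"]

  group "app" {
    task "sleep" {
      driver = "exec"

      config {
        command = "/bin/sleep"
        args    = ["3600"]
      }

      # Only CPU and memory are requested; there is no network block here.
      resources {
        cpu    = 100
        memory = 64
      }
    }
  }
}
```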

Logs

2020-03-20T23:21:38.091+0100 [DEBUG] client.fingerprint_mgr.network: unable to read link speed: path=/sys/class/net/cluster/speed
2020-03-20T23:21:38.091+0100 [DEBUG] client.fingerprint_mgr.network: setting link speed to user configured speed: mbits=1000
2020-03-20T23:21:38.091+0100 [DEBUG] client.fingerprint_mgr.network: detected interface IP: interface=cluster IP=172.22.1.1

So apparently it is not ignored. I still can't launch anything, even though no jobs require network resources.

@aep
Contributor Author

aep commented Mar 23, 2020

I don't understand why, but rebooting the server (rather than the client where this was failing) made the problem go away.

@aep aep closed this as completed Mar 23, 2020
@notnoop
Contributor

notnoop commented Mar 23, 2020

Hi @aep, thanks for reporting the bug, and sorry for the slowness. This seems very similar to #7232, which I'm investigating this week.

@aep
Contributor Author

aep commented Mar 23, 2020

FWIW, I can't reproduce it. I think this might happen when you start without network_interface and then change it later. Possibly the old value is cached somewhere.

@github-actions

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 11, 2022