
In mirrored network mode the Kafka client is unable to connect to the server properly. #11450

Closed
1 of 2 tasks
JustLookAtNow opened this issue Apr 11, 2024 · 18 comments

@JustLookAtNow

JustLookAtNow commented Apr 11, 2024

Windows Version

10.0.22635.3430

WSL Version

2.2.1.0

Are you using WSL 1 or WSL 2?

  • WSL 2
  • WSL 1

Kernel Version

5.15.150.1

Distro Version

Ubuntu 22.04

Other Software

org.springframework.kafka:spring-kafka:2.9.13
jdk 1.8

Repro Steps

Run a Java program (a Spring Kafka consumer) that connects to a Kafka server from WSL in mirrored networking mode.

Expected Behavior

Successfully connected.

Actual Behavior

The client threw the following exception:

 java.net.BindException: Cannot assign requested address
	at sun.nio.ch.Net.connect0(Native Method) ~[?:1.8.0_402]
	at sun.nio.ch.Net.connect(Net.java:482) ~[?:1.8.0_402]
	at sun.nio.ch.Net.connect(Net.java:474) ~[?:1.8.0_402]
	at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:647) ~[?:1.8.0_402]
	at org.apache.kafka.common.network.Selector.doConnect(Selector.java:277) ~[kafka-clients-3.1.2.jar:?]
	at org.apache.kafka.common.network.Selector.connect(Selector.java:255) ~[kafka-clients-3.1.2.jar:?]
	at org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:990) ~[kafka-clients-3.1.2.jar:?]
	at org.apache.kafka.clients.NetworkClient.ready(NetworkClient.java:301) ~[kafka-clients-3.1.2.jar:?]
	at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.tryConnect(ConsumerNetworkClient.java:575) ~[kafka-clients-3.1.2.jar:?]
	at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$FindCoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:854) ~[kafka-clients-3.1.2.jar:?]
	at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$FindCoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:830) ~[kafka-clients-3.1.2.jar:?]
	at org.apache.kafka.clients.consumer.internals.RequestFuture$1.onSuccess(RequestFuture.java:206) ~[kafka-clients-3.1.2.jar:?]
	at org.apache.kafka.clients.consumer.internals.RequestFuture.fireSuccess(RequestFuture.java:169) ~[kafka-clients-3.1.2.jar:?]
	at org.apache.kafka.clients.consumer.internals.RequestFuture.complete(RequestFuture.java:129) ~[kafka-clients-3.1.2.jar:?]
	at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.fireCompletion(ConsumerNetworkClient.java:602) ~[kafka-clients-3.1.2.jar:?]
	at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.firePendingCompletedRequests(ConsumerNetworkClient.java:412) ~[kafka-clients-3.1.2.jar:?]
	at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:297) ~[kafka-clients-3.1.2.jar:?]
	at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:236) ~[kafka-clients-3.1.2.jar:?]
	at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:215) ~[kafka-clients-3.1.2.jar:?]
	at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorReady(AbstractCoordinator.java:246) ~[kafka-clients-3.1.2.jar:?]
	at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.coordinatorUnknownAndUnready(ConsumerCoordinator.java:460) ~[kafka-clients-3.1.2.jar:?]
	at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:488) ~[kafka-clients-3.1.2.jar:?]
	at org.apache.kafka.clients.consumer.KafkaConsumer.updateAssignmentMetadataIfNeeded(KafkaConsumer.java:1262) ~[kafka-clients-3.1.2.jar:?]
	at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1231) ~[kafka-clients-3.1.2.jar:?]
	at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1211) ~[kafka-clients-3.1.2.jar:?]
	at sun.reflect.GeneratedMethodAccessor385.invoke(Unknown Source) ~[?:?]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_402]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_402]
	at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:344) ~[spring-aop-5.3.27.jar:5.3.27]
	at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:213) ~[spring-aop-5.3.27.jar:5.3.27]
	at com.sun.proxy.$Proxy511.poll(Unknown Source) ~[?:?]
	at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.pollConsumer(KafkaMessageListenerContainer.java:1601) ~[spring-kafka-2.9.13.jar:2.9.13]
	at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.doPoll(KafkaMessageListenerContainer.java:1576) ~[spring-kafka-2.9.13.jar:2.9.13]
	at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.pollAndInvoke(KafkaMessageListenerContainer.java:1377) ~[spring-kafka-2.9.13.jar:2.9.13]
	at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.run(KafkaMessageListenerContainer.java:1291) ~[spring-kafka-2.9.13.jar:2.9.13]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_402]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_402]
	at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_402]

Comparing the network modes, I noticed that the connect0 method involves an isIPv6Available check. That check always returns false in NAT or bridged network modes but returns true in mirrored mode. This could be the cause of the error, especially since my machine and the Kafka server are connected only over an IPv4 network.

Diagnostic Logs

WslLogs-2024-04-11_15-55-32.zip


Diagnostic information
.wslconfig found
Detected appx version: 2.2.1.0
Unexpected format in optional-component.txt: State       : DisabledWithPayloadRemoved

@JustLookAtNow
Author

update:
Initially, connections to the Kafka server worked normally. However, once the connection count reached 299, the error 'Cannot assign requested address' started to occur.

@JustLookAtNow
Author

update 2:
I found the cause of the problem. The local (ephemeral) port range in /proc/sys/net/ipv4/ip_local_port_range is only 60500 to 60800, which leaves about 300 ports for outbound connections. Naturally, Kafka throws an error once the client reaches that limit. However, when I tried to change net.ipv4.ip_local_port_range, I found that after any change no TCP connections could be created at all. So, how can I increase the client connection limit?

@JustLookAtNow
Author


@OneBlue Is there any WSL configuration that can change this parameter?

@dickens7

I had the same problem.
With the firewall setting set to false everything works normally, so I guess the firewall rules are causing it:

[experimental]
firewall=false

@JustLookAtNow
Author


That setting has no effect for me.

@chanpreetdhanjal

Hi. Can you please collect networking logs by following the instructions below?
https://github.com/microsoft/WSL/blob/master/CONTRIBUTING.md#collect-wsl-logs-for-networking-issues

@JustLookAtNow
Author


Please check my previous comments; I have identified the cause of the issue. In mirrored mode, /proc/sys/net/ipv4/ip_local_port_range offers only about 300 ports, from 60500 to 60800. This results in the error "Cannot assign requested address" when attempting to create more than roughly 300 connections to the same IP and port. By contrast, in bridge mode the range is 32768 to 60999, which provides about 28,000 ports. If you still require network logs, please let me know and I will collect them for you.
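To make the arithmetic concrete, here is a minimal Go sketch that reads the range file and counts the ports. It assumes the file holds two whitespace-separated integers and that both ends of the range are usable; `parsePortRange` and `portCount` are illustrative helper names, not part of WSL or any library.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// parsePortRange parses the contents of
// /proc/sys/net/ipv4/ip_local_port_range, e.g. "60500\t60800\n".
func parsePortRange(s string) (low, high int, err error) {
	fields := strings.Fields(s)
	if len(fields) != 2 {
		return 0, 0, fmt.Errorf("expected 2 fields, got %d", len(fields))
	}
	if low, err = strconv.Atoi(fields[0]); err != nil {
		return 0, 0, err
	}
	if high, err = strconv.Atoi(fields[1]); err != nil {
		return 0, 0, err
	}
	return low, high, nil
}

// portCount returns the number of ports in the inclusive range [low, high].
func portCount(low, high int) int {
	if high < low {
		return 0
	}
	return high - low + 1
}

func main() {
	data, err := os.ReadFile("/proc/sys/net/ipv4/ip_local_port_range")
	if err != nil {
		fmt.Println("could not read the port range (not Linux?):", err)
		return
	}
	low, high, err := parsePortRange(string(data))
	if err != nil {
		fmt.Println("unexpected format:", err)
		return
	}
	fmt.Printf("ephemeral ports: %d-%d (%d ports)\n", low, high, portCount(low, high))
}
```

With the two ranges quoted in this thread, `portCount(60500, 60800)` is 301 and `portCount(32768, 60999)` is 28232. Some ports are usually already in use, which would fit the failure appearing at about 299 connections.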

@chanpreetdhanjal

Yes, we will need logs to further assist you. Thanks.

@JustLookAtNow
Author

JustLookAtNow commented May 6, 2024


/emailed-logs
The log file is too big to attach, so I have emailed it to you.

@keith-horton
Member

Hi there. I see 2 bind requests that showed up about 1 minute before the end of the trace: one for ::0 port 62698 and one for ::0 port 6400. Both were successful. Was this run natively on the root, or within a container that something created within Linux?

@JustLookAtNow
Author


Both of these ports are being listened on by a Java program running in the Linux environment. My current issue isn't with listening ports. Rather, as a network client, I'm running out of available client (ephemeral) ports. Currently there seem to be only about 300 available ports in /proc/sys/net/ipv4/ip_local_port_range, which results in a 'Cannot assign requested address' error when I try to establish more than 300 socket connections to the server.

@keith-horton
Member

Oh, thank you for clarifying.
We can definitely make the number of ephemeral ports reserved for the Linux container configurable. I'll work on that right now.

@erSitzt

erSitzt commented Jun 3, 2024

@keith-horton Is there any workaround until this is configurable?
Just setting the port range as on any other Linux system does not help. Or is communication outside the initial range blocked?
When I increase the range, I get I/O timeouts talking to my DNS server:

read udp 192.168.1.xxx:60895->192.168.1.1:53: i/o timeout

with 60895 being outside the default range of

❯ cat /proc/sys/net/ipv4/ip_local_port_range
60500   60800

I'm in mirrored network mode, by the way.

Another note: most of the tools I have problems with are Go applications: kubectl, kapp (Carvel kapp), and medusa (a tool to import and export HashiCorp Vault secrets).
I'm not sure whether those tend to open an excessive number of connections and use up the local port range.

@erSitzt

erSitzt commented Jun 3, 2024

Go applications seem to be prone to this problem because many tools do not set Transport.MaxConnsPerHost when using net/http, and its default of zero means no limit.

@keith-horton
Member

There's no immediate workaround, but we have a fix ready.
Is there a target size for the ephemeral range needed in your scenarios?

Thanks!

@erSitzt

erSitzt commented Jun 3, 2024

@keith-horton It's hard to guess, as I can't really verify the connection count and I'm not sure exactly how the next usable free port is allocated 🤷

If there is a fix coming, it would be nice to:

  • make it configurable
  • increase the default range ( maybe double for starters ? )
  • document it ( if this is related to mirrored network mode ?)

For me this problem occurs in two scenarios:

  • Long-running WSL: the workstation has not been restarted in several days or weeks, often with many VS Code instances running
  • Even directly after booting the workstation and WSL, when working with tools that can issue many requests, such as querying lots of resources in a Kubernetes cluster or exporting a whole tree structure from HashiCorp Vault

The first is not really a problem, but the second always hits the limit, forcing me (and WSL users in general) to switch to a Linux VM for some commands.

@benhillis
Member

Fixed with https://github.com/microsoft/WSL/releases/tag/2.3.11.


7 participants