Properly truncate and randomly load balance large answers #237
Comments
@jdef: I think our answers should be limited to what fits in a DNS UDP datagram without truncation. I don't think we need to limit the number of IPs in server memory, only in DNS answers. We should randomize which IPs are included in the answer. |
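The randomization proposed above could look roughly like the sketch below. It is only an illustration, not the project's code: `randomSubset` and `newRNG` are hypothetical names, and the cap `n` is assumed to come from whatever size budget the response has.

```go
package main

import (
	"fmt"
	"math/rand"
)

// newRNG returns a seeded random source (hypothetical helper).
// A fixed seed makes runs reproducible; a real server would seed from the clock.
func newRNG(seed int64) *rand.Rand {
	return rand.New(rand.NewSource(seed))
}

// randomSubset (hypothetical name) returns at most n addresses chosen
// uniformly at random from ips. It shuffles a copy, so the full list held
// in server memory is left untouched.
func randomSubset(ips []string, n int, rng *rand.Rand) []string {
	out := append([]string(nil), ips...)
	rng.Shuffle(len(out), func(i, j int) { out[i], out[j] = out[j], out[i] })
	if n < 0 {
		n = 0
	}
	if n < len(out) {
		out = out[:n]
	}
	return out
}

func main() {
	// Ten made-up addresses standing in for a much larger cluster.
	ips := make([]string, 0, 10)
	for i := 1; i <= 10; i++ {
		ips = append(ips, fmt.Sprintf("10.0.0.%d", i))
	}
	fmt.Println(randomSubset(ips, 3, newRNG(42)))
}
```

Because each answer is a fresh random sample, repeated queries spread clients across the whole set even though no single answer lists every address.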
@tsenart But this would break, for example, the Prometheus DNS-based SD, which expects that we return all slaves. |
@discordianfish: We need to do this regardless of whether we support EDNS(0), but that should alleviate the problem for Prometheus, as long as it has an EDNS(0)-enabled DNS client. The upper bound of records we can have in the answer section is 65535 due to the |
@tsenart Yes, that's what we should do, but you said above that we should always limit the number so it fits into UDP. Instead, we should return all, even if this means we need to truncate and have clients fall back to TCP. |
@discordianfish: By definition, when you truncate, you can't return all :-) |
Ok, let me rephrase: We should truncate the responses to whatever the client supports (512 bytes without EDNS, whatever is indicated with EDNS) and 65535 records for TCP. |
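A minimal sketch of that rule, working in bytes and assuming fixed-size records. `responseLimit` and `fitRecords` are hypothetical names, not functions from the project. The sketch caps TCP at 65535 bytes, the largest message the two-byte TCP length prefix allows, and it ignores the space an OPT record would take in an EDNS response.

```go
package main

import "fmt"

const (
	minUDPSize = 512   // classic DNS-over-UDP payload limit (RFC 1035)
	maxMsgSize = 65535 // largest message the 2-byte TCP length prefix allows
)

// responseLimit (hypothetical name) returns the response size budget in bytes.
// ednsUDPSize is the payload size the client advertised via EDNS(0);
// 0 means the query carried no OPT record.
func responseLimit(tcp bool, ednsUDPSize uint16) int {
	if tcp {
		return maxMsgSize
	}
	// Advertised sizes below 512 are treated as 512 (RFC 6891).
	if int(ednsUDPSize) < minUDPSize {
		return minUDPSize
	}
	return int(ednsUDPSize)
}

// fitRecords (hypothetical name) reports how many of n fixed-size records fit
// in limit bytes once used bytes are taken by the header and question, and
// whether the TC (truncated) bit should be set because some were left out.
func fitRecords(n, recordSize, used, limit int) (count int, truncated bool) {
	if recordSize <= 0 || limit <= used {
		return 0, n > 0
	}
	count = (limit - used) / recordSize
	if count >= n {
		return n, false
	}
	return count, true
}

func main() {
	// Assumed sizes: 12-byte header plus a 17-byte question for "slave.mesos.",
	// and 16 bytes per A record with a compressed owner name.
	const used, aRecord = 29, 16
	cases := []struct {
		tcp  bool
		edns uint16
	}{{false, 0}, {false, 4096}, {true, 0}}
	for _, c := range cases {
		limit := responseLimit(c.tcp, c.edns)
		count, tc := fitRecords(10000, aRecord, used, limit)
		fmt.Printf("tcp=%v edns=%d limit=%d records=%d truncated=%v\n",
			c.tcp, c.edns, limit, count, tc)
	}
}
```

With these assumed sizes, the byte budget runs out long before the 65535-record ceiling does, so the record count only matters for very small records.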
👍 |
@discordianfish Is this what your PR that you merged today does? |
#330 (merged today) should take care of proper truncation. is there a need On Tue, Nov 17, 2015 at 12:08 PM, Sargun Dhillon
|
@jdef So, digging through the code, it looks like P.S. perhaps |
if we're already shuffling then it sounds like we're all set with this On Tue, Nov 17, 2015 at 3:45 PM, Sargun Dhillon
|
😸 |
wondering if we need some kind of max-record-count that limits the number of IPs returned for A records, or addresses returned for SRVs, in large clusters (10k). for example, the "slave.mesos" name that maps to all slave IPs in the cluster probably doesn't scale very well to large cluster sizes. same goes for 'task.framework.name' for very large numbers of tasks with the same name.
what's the value in getting back 10k addresses or 10k IPs?