feat(core) implement internal dns with loadbalancing, replace dnsmasq #1587
Conversation
Cool stuff 👏
```diff
@@ -111,7 +112,7 @@ _send = function(premature, self, to_send)
   local client = http.new()
   client:set_timeout(self.connection_timeout)

-  local ok, err = client:connect(self.host, self.port)
+  local ok, err = connect(client, self.host, self.port)
```
We could implement this in our `globalpatches` module now, to avoid diverging from the regular tcpsock API.
How do you see that? The `connect` method is a metatable property of the socket userdata. I don't think we should mess with that. Or did you have something different in mind?
Maybe:

```lua
-- globalpatches.lua
local tcp = ngx.socket.tcp
_G.ngx.socket.tcp = function(...)
  local sock = tcp(...)
  local connect = sock.connect
  sock.connect = function(...)
    return dns.connect(sock, ...)
  end
  return sock
end
```
Did you try that? I'd expect `sock` to be a userdata, so the line `sock.connect = ...` would fail.
Might just work:

```
> resty -e "local s = ngx.socket.tcp(); print(type(s)); s.connect = function() end"
table
```
does work, but was rejected by resty-cli
What do you mean 'was rejected by resty-cli'?
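For reference, here is a slightly fuller sketch of the `globalpatches` idea discussed in this thread. It is illustrative only: the module path `dns.client`, the `toip` helper, and its return values are assumptions about the resolver's API, not necessarily what the branch ends up shipping.

```lua
-- globalpatches.lua (illustrative sketch; module path and helper names are assumed)
local dns_client = require "dns.client"      -- assumed resolver module

local old_tcp = ngx.socket.tcp

_G.ngx.socket.tcp = function(...)
  -- cosocket objects are Lua tables wrapping a userdata, so overriding the
  -- `connect` field on the returned object works (see the resty test above)
  local sock = old_tcp(...)
  local old_connect = sock.connect

  sock.connect = function(s, host, port, opts)
    if type(host) == "string" and host:sub(1, 5) == "unix:" then
      return old_connect(s, host, port, opts)  -- don't resolve unix sockets
    end
    -- resolve the hostname (and, for SRV records, possibly the port) first
    local ip, port_or_err = dns_client.toip(host, port)
    if not ip then
      return nil, port_or_err
    end
    return old_connect(s, ip, port_or_err, opts)
  end

  return sock
end
```

Whether such a patch survives resty-cli (the concern raised above) would still need to be verified.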
```
balancer_by_lua_block {
    kong.balancer()
}
keepalive 60;
```
Note for polishing this PR: make this configurable.
Also worth considering allowing users to add custom `upstream` blocks (managed by our template), maybe to allow for different upstream connection pools with different keepalive settings.
> With different keepalive settings.

I think that making it globally configurable in the configuration file would be enough for now.
> loadbalancing; lookups are rotated (round-robin) over the entire dns record

Would be cool if the LB could be customized with custom policies in the future.
If patching the `connect` method in `globalpatches` works, then that would be easy.
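To make the round-robin behaviour quoted above concrete (and to show what a custom policy would have to replace), here is a minimal, purely illustrative sketch; the record layout is invented for the example:

```lua
-- minimal round-robin rotation over the entries of a resolved dns record
-- (the record layout below is invented for illustration)
local record = {
  { address = "10.0.0.1", port = 8000 },
  { address = "10.0.0.2", port = 8000 },
  { address = "10.0.0.3", port = 8000 },
}

local cursor = 0

local function next_entry()
  cursor = cursor % #record + 1   -- rotate over the whole record
  return record[cursor]
end

-- three successive lookups hit the three entries in turn
for _ = 1, 3 do
  local entry = next_entry()
  print(entry.address, entry.port)
end
```

A pluggable policy would essentially swap out `next_entry`, e.g. for weighted or least-connections selection.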
patch the global tcp.connect function to use the internal dns resolver
Currently having this error:
…migration process
…rs an empty table if nothing is specified instead of the previous `nil`. So if table length is 0, drop it and revert to defaults.
```diff
@@ -1,16 +1,12 @@
-#!/usr/bin/env resty
+#!/usr/bin/env resty -c 65535
```
This would override whatever `ulimit` I have already set - do we really need it?
Removed it; the latest `dns.lua` master doesn't use it anymore.
```lua
-- @param target the table with the target details
-- @return balancer if found, or nil if not found, or nil+error on error
local get_balancer = function(target)
  return nil -- TODO: place holder, forces dns use to first fix regression
```
@Tieske said we can merge with this TODO in place, because it's fixed in the other upstream PR.
```lua
local ok, err = balancer_execute(balancer_address)
if not ok then
  ngx.log(ngx.ERR, "failed the initial dns/balancer resolve: ", err)
  return ngx.exit(500)
```
We can use `return responses.send_HTTP_INTERNAL_SERVER_ERROR(err)`.
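A sketch of what that suggestion amounts to, assuming `responses` is Kong's `kong.tools.responses` helper already required in this file:

```lua
local responses = require "kong.tools.responses"

local ok, err = balancer_execute(balancer_address)
if not ok then
  ngx.log(ngx.ERR, "failed the initial dns/balancer resolve: ", err)
  -- use the shared responses helper instead of a raw ngx.exit(500)
  return responses.send_HTTP_INTERNAL_SERVER_ERROR(err)
end
```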
fixed
```lua
for _, row in ipairs(rows) do
  if not row.retries then -- only if retries is not set already
    local _, err = dao.apis:update(row, { id = row.id }, {full = true})
```
Bug: this is not doing anything. It's probably missing a `row.retries = 5` before executing the `update(..)` method.
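The suggested change would look roughly like the following sketch (note that a follow-up further down reports the default gets populated anyway when the full update re-applies the schema defaults):

```lua
for _, row in ipairs(rows) do
  if not row.retries then            -- only if retries is not set already
    row.retries = 5                  -- explicitly set the new default
    local _, err = dao.apis:update(row, { id = row.id }, { full = true })
    if err then
      return err
    end
  end
end
```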
```diff
@@ -163,7 +163,8 @@ return {
   request_path = {type = "string", unique = true, func = check_request_path},
   strip_request_path = {type = "boolean", default = false},
   upstream_url = {type = "url", required = true, func = validate_upstream_url_protocol},
-  preserve_host = {type = "boolean", default = false}
+  preserve_host = {type = "boolean", default = false},
+  retries = {type = "number", default = 5, func = check_retries},
```
Maybe the previous `row.retries = 5` in the Cassandra migration was not set because the expectation was that re-updating the same entity would populate the default values.
Not sure if this really happens, and if it doesn't, should it be fixed? This needs to be tested.
Just checked it and it works as it currently is. Adding an API on 0.9.2, then restarting Kong with this branch (running the migrations), results in this API output:
```
Mashapes-MacBook-Pro-2:kong thijs$ http get localhost:8001/apis
HTTP/1.1 200 OK
Access-Control-Allow-Origin: *
Connection: keep-alive
Content-Type: application/json; charset=utf-8
Date: Wed, 12 Oct 2016 18:10:31 GMT
Server: kong/0.9.2
Transfer-Encoding: chunked

{
  "data": [
    {
      "created_at": 1476295721000,
      "id": "13af5ca7-a617-4e22-92b7-540fe9607e65",
      "name": "tieske",
      "preserve_host": false,
      "request_path": "/",
      "retries": 5,
      "strip_request_path": false,
      "upstream_url": "http://www.thijsschreijer.nl"
    }
  ],
  "total": 1
}
```
Retries default is now set.
^^^ that was with Cassandra configured as the database.
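As an aside, the `retries` field added to the schema above is validated by a `check_retries` function whose body is not shown in this thread. A hypothetical validator could look like the sketch below; the branch's actual implementation may differ:

```lua
-- hypothetical validator for the `retries` schema field
local function check_retries(value)
  if type(value) ~= "number" or value < 0 or math.floor(value) ~= value then
    return false, "retries must be a non-negative integer"
  end
  return true
end
```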
```lua
local ok, err = balancer_execute(addr)
if not ok then
  ngx.log(ngx.ERR, "failed to retry the balancer/resolver: ", err)
  return ngx.exit(500)
```
We should use responses.send_HTTP..
```lua
local ok, err = set_current_peer(addr.ip, addr.port)
if not ok then
  ngx.log(ngx.ERR, "failed to set the current peer: ", err)
  return ngx.exit(500)
```
We should use responses.send_HTTP..
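For context, both quoted snippets run in the balancer phase. Below is a condensed, illustrative sketch of how they could fit together with retry bookkeeping; `ngx.ctx.balancer_address` and its fields are assumptions for the example, `set_more_tries` is the stock `ngx.balancer` call (presumably what the `set_more_retries` todo refers to), and the error handling is kept as `ngx.exit(500)` as in the quoted code:

```lua
local ngx_balancer = require "ngx.balancer"

-- balancer_by_lua phase: pick (or re-pick, on a retry) the peer for this request
local addr = ngx.ctx.balancer_address        -- assumed per-request target state

if not addr.tries then
  addr.tries = 0
  -- allow nginx to retry the request against other peers on failure
  ngx_balancer.set_more_tries(addr.retries or 5)
end
addr.tries = addr.tries + 1

local ok, err = balancer_execute(addr)       -- re-resolve dns / pick the next ip+port
if not ok then
  ngx.log(ngx.ERR, "failed to retry the balancer/resolver: ", err)
  return ngx.exit(500)
end

ok, err = ngx_balancer.set_current_peer(addr.ip, addr.port)
if not ok then
  ngx.log(ngx.ERR, "failed to set the current peer: ", err)
  return ngx.exit(500)
end
```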
We need to add more integration tests that check if
keepalive is now configurable (one global setting)
@Tieske one thing I wanted to bring up here: ELBs in AWS will resolve to an IP address, but that underlying IP can eventually change. I noticed one of the features is that dns resolution is cached in Lua memory, but it should be looked up again every few seconds to mitigate this. Here is a much better post describing the issue and solution:
@pgieniec5tar apparently (from what I read in the post) nginx does not honour the dns ttl. This dns resolver branch will honour the ttl.
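In other words, a cached answer is only reused while its ttl has not expired. A toy sketch of that behaviour (field names such as `expire` are invented for the example and do not reflect the branch's actual cache structure):

```lua
-- toy ttl-honouring cache: re-resolve once the record's ttl has expired
local cache = {}

local function resolve(name, query_fn)
  local entry = cache[name]
  if entry and entry.expire > ngx.now() then
    return entry.record                         -- still fresh, serve from memory
  end
  local record, err = query_fn(name)            -- query the nameservers again
  if not record then
    return nil, err
  end
  cache[name] = {
    record = record,
    expire = ngx.now() + (record[1].ttl or 0),  -- honour the answer's ttl
  }
  return record
end
```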
initial merge commit, untested. Probably needs some fixes still.
@Tieske can you please fix the conflicts? I would like to execute some benchmarks on the latest code too.
reverting some previously breaking changes
…. See comments for caveats.
Summary

Implements an internal dns resolver and removes the dnsmasq dependency.

Features

- `balancer_by_lua` directive
- the `connect` method also resolves ports, so auxiliary systems can now also be configured using SRV records. The effect is that, for example, the LDAP plugin, when connecting to a host named "myHost" and port "123", will be looked up against an SRV record as well. If the SRV record returns 3 hosts with ports, those will be used in round-robin mode (the provided port '123' will be overridden by the dns results). A short illustration follows after the instructions below.

todo

- `dns.lua` and publish it
- `set_more_retries` as discussed here
- `retries`
- make the `keepalive` setting configurable

additional todos, not making it here...

- `setTimeouts()` (optional)

related to

instructions

To use this branch the dns.lua module must be manually installed.
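To make the SRV/port feature described above concrete, here is a hypothetical call site; the host name, ports, and record contents are invented for illustration:

```lua
-- a plugin opens a connection to its configured host and port
local sock = ngx.socket.tcp()
local ok, err = sock:connect("myHost", 123)

-- with the dns-aware connect, if "myHost" has an SRV record pointing at
--   host1:8001, host2:8002, host3:8003
-- then successive connects are spread over those three ip/port pairs in
-- round-robin fashion, and the configured port 123 is overridden by the
-- ports from the SRV answers.
```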