Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add hash on cookie #3472

Merged
merged 1 commit into from
May 30, 2018
Merged

Add hash on cookie #3472

merged 1 commit into from
May 30, 2018

Conversation

jtschulz
Copy link

@jtschulz jtschulz commented May 17, 2018

Summary

Add ability to hash on cookie. Per a discussion on the feature suggestion board with @Tieske , this adds a cookie option for the hash_on feature, which works very similarly to the `header option. The use case for this is to allow load balancing with cookie injection, creating a sticky session for API consumers who all behind the same NAT gateway. Specifically this is useful for websockets.

Full changelog

  • Implement cookie source for hashing algorithm
    • Upstream config can use cookie for hash_on and hash_fallback value
    • Upstream config uses hash_on_cookie or hash_fallback_cookie for the name of the cookie to use
    • Upstream config uses hash_on_cookie_path or hash_fallback_cookie_path to add the path to the Set-Cookie header
  • Add related unit and integration tests

Notes

I'm new to Lua (and, hence, this project), so feel encouraged to tell me everything that's wrong with my code ;) I'm not sure if there's a place for me to add documentation.
I will clean up git history after making any requested changes/ getting final approval.

@@ -748,5 +748,16 @@ return {
CREATE INDEX IF NOT EXISTS ON consumers(custom_id);
CREATE INDEX IF NOT EXISTS ON consumers(username);
]]
},
{
name = "2018-05-17-173100_hash_on_cookie",
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there some automated way contributors are generating these migrations? I just manually created the timestamp/ name and it seemed to work.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no migrations have to be created manual (for now). Also datastore specific. And indeed the name is date+timestamp, just to make it unique (within a migrations file).

@jtschulz jtschulz changed the title WIP - Add hash on cookie Add hash on cookie May 18, 2018
@@ -656,6 +657,9 @@ local create_hash = function(upstream)
if type(identifier) == "table" then
identifier = table_concat(identifier)
end

elseif hash_on == "cookie" then
identifier = ngx.var["cookie_" .. upstream[cookie_field_name]]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here you need to store the cookie in the ngx.ctx table. And if there is no cookie, you need to create it.

additionally, in the header-filter phase you need to add the cookie to the response

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh great, I misunderstood and thought that a plugin would manage the generation or the cookie, but I think this is a lot better. Thanks for the quick feedback!

@jtschulz jtschulz changed the title Add hash on cookie WIP - Add hash on cookie May 18, 2018
@jtschulz
Copy link
Author

@Tieske I think this handles what you were asking for, please take a look. I will add tests of course, but I want to make sure this is on the right path before I go further.

@@ -707,6 +708,17 @@ return {
header[upstream_status_header] = matches[0]
end
end

if ctx.SET_HASH_ON_COOKIE then
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I put this in the before hook, presumably this will give plugins the ability to access/ modify the cookie value if there is some use case for it.

local cookies = get_request_cookies()
table.insert(cookies, cookie_field_name .. "=" .. identifier)
ngx.header["Cookie"] = table_concat(cookies, "; ")
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is all we need here:

elseif hash_on == "cookie" then
  identifier = ngx.var["cookie_" .. upstream[cookie_field_name]] or utils.uuid()

  ctx.SET_HASH_ON_COOKIE = {
    key = upstream[cookie_field_name],
    value = identifier,
    path = upstream[cookie_path_field_name],
  }

We only need to store the retrieved values to insert the cookie on the way out (in the response), no need to add it to the request here.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, that makes sense. Thanks! Although, I think we don't want to set the cookie again if it already exists, correct?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, you're right! good catch


insert(cookies, ctx.SET_HASH_ON_COOKIE)
header['Set-Cookie'] = cookies
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in the header:

local ck = require "resty.cookie"

and here we do something like:

local cookie_hash_data = ctx.SET_HASH_ON_COOKIE
if cookie_hash_data then
  local cookie, ok, err
  cookie, err = ck:new()
  if cookie then
    ok, err = cookie:set(cookie_hash_data)
  end
  if not ok then
    log(ngx.WARN, "failed to set hashing cookie, key=`", cookie_hash_data.key, 
                             "`, value=`", cookie_hash_data.value, 
                             "`, path=`", cookie_hash_data.path,
                             "` : ", err)
  end
end

hash_fallback_cookie = {
-- cookie name, if `hash_fallback == "cookie"`
type = "string",
},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some validation on the contents of the cookie name and path are needed here imo

return false, Errors.schema("Hashing on 'cookie', " ..
"but no cookie name provided")
end

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also relevant here: if the primary is set to "cookie" then the fallback must be "none". Because if the cookie isn't set, it will be created, hence the fallback will never be invoked in that case.

@@ -797,16 +797,16 @@ end
-- a-z, A-Z, 0-9 and '-' are allowed.
-- @param name (string) the header name to verify
-- @return the valid header name, or `nil+error`
_M.validate_header_name = function(name)
_M.validate_http_token = function(name, token_type)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cookie name spec technically allows a handful of other characters, but the same is true of headers. Seems fair to generalize these rules for both. Unless there are any objections?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for the purpose it serves, I don't see any issues with that

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry forget about that, I think we shouldn't change this.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood, and I was on the fence about this, too. Do you prefer me to just duplicate the logic for the cookie name validation? The only difference would be the error message.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, let's do that for now. Let's get it compete functionality/test wise first and then look at some final things like style and stuff like this.

@@ -12,7 +12,7 @@ end

local function check_keys(keys)
for _, key in ipairs(keys) do
local res, err = utils.validate_header_name(key, false)
local res, err = utils.validate_http_token(key, "header")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where did this come from?

type = "string",
default = "/",
},
hash_fallback_cookie_path = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this fallback setting can go. Since there can only be one cookie setting be it primary or fallback.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this not support falling back to a cookie value if the primary hash_on is header or ip? I can't think of any use cases for it, but currently header falling back on ip and vice-versa are supported.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes it should. But with headers we can have 2 different headers, one for primary, one for fallback. With cookies, there is never more than one since the cookie is created when absent.

Hence having hash_on_cookie and hash_on_cookie_path is enough. We do not need hash_fallback_cookie_path (actually in the current state, hash_fallback_cookie is missing).

type = "string",
default = "/",
},
hash_fallback_cookie_path = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes it should. But with headers we can have 2 different headers, one for primary, one for fallback. With cookies, there is never more than one since the cookie is created when absent.

Hence having hash_on_cookie and hash_on_cookie_path is enough. We do not need hash_fallback_cookie_path (actually in the current state, hash_fallback_cookie is missing).

@@ -189,6 +191,20 @@ return {
-- header name, if `hash_fallback == "header"`
type = "string",
},
hash_on_cookie = {
-- cookie name, if `hash_on == "cookie"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be

-- cookie name, if `hash_on` or `hash_fallback` == `"cookie"`

type = "string",
},
hash_on_cookie_path = {
-- cookie name, if `hash_on == "cookie"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be

-- cookie path, if `hash_on` or `hash_fallback` == `"cookie"`

@jtschulz jtschulz force-pushed the add-hash-on-cookie branch 3 times, most recently from ebd2571 to f4899c1 Compare May 23, 2018 21:29
@jtschulz
Copy link
Author

@Tieske Thanks for that excellent feedback! I think the latest changes address your requests, as well as add some more test coverage. PTAL

-- a-z, A-Z, 0-9 and '-' are allowed.
-- @param name (string) the header name to verify
-- @return the valid cookie name, or `nil+error`
_M.validate_cookie_name = function(name)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've considered other possible ways to DRY this up, but I think they would mostly hurt the readability of the validation. It could be generalized to not refer to "cookie" or "header" at all in the validation message, and give the responsibility of the schema that is applying this validation to a field.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the doc comments above still refer to 'header' so need to be updated

end)

describe("hashing on cookie", function()
it("when the cookie is set", function()
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't think of any edge cases that testing when a cookie is set catch that are not caught in the next example (where the cookie is not set). I would favor removing this test as redundant, but defer to your judgement.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed. But there is no test that validates that no cookie will be set if it was already provided, so maybe update this test to do that?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: it is good practice to write the intended result of a test in its description, rather than relying on the reader to infer the proper test results by reading the test case... In this case, a name such as does not reply with Set-Cookie if cookie is already set or something.

Tieske
Tieske previously requested changes May 25, 2018
-- a-z, A-Z, 0-9 and '-' are allowed.
-- @param name (string) the header name to verify
-- @return the valid cookie name, or `nil+error`
_M.validate_cookie_name = function(name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the doc comments above still refer to 'header' so need to be updated

-- If the cookie doesn't exist, create it
if not identifier then
identifier = utils.uuid()
ngx.ctx.SET_HASH_ON_COOKIE = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked the codebase for naming, and I think this should be lower case (only latency tracking is uppercase). Also, "hash_on" might not be most descriptive, so maybe just "balancer_cookie" or "balancer_hash_cookie", to make it clear it belong to the balancer logic.

hash_on = "cookie",
hash_on_cookie = "Foo",
})
assert.are.same(crc32(value), hash)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we also add validation of the ctx variables?, so in this case that ngx.ctx.SET_HASH_ON_COOKIE is NOT created.

And add another test without a cookie set, that shows that ngx.ctx.SET_HASH_ON_COOKIE is properly created?

@@ -246,7 +269,7 @@ describe("Admin API: #" .. kong_config.database, function()
local json = cjson.decode(body)
assert.same({ message = "Header: bad header name 'not a <> valid <> header name', allowed characters are A-Z, a-z, 0-9, '_', and '-'" }, json)

-- Invalid header
-- Invalid fallback header
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

local json = cjson.decode(body)
assert.same({ message = "Cookie name: bad cookie name 'not a <> valid <> cookie name', allowed characters are A-Z, a-z, 0-9, '_', and '-'" }, json)

-- Invalid cookie
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

duplicate of the one above? maybe this is supposed to be the fallback?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, copy/ paste error. I meant to check validating the cookie path on hash fallback.

end)

describe("hashing on cookie", function()
it("when the cookie is set", function()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed. But there is no test that validates that no cookie will be set if it was already provided, so maybe update this test to do that?

@jtschulz
Copy link
Author

@Tieske just addressed the last round of feedback, PTAL :)

@Tieske Tieske dismissed their stale review May 28, 2018 11:42

comments have been adressed

@Tieske Tieske requested a review from hishamhm May 28, 2018 11:43
@Tieske
Copy link
Member

Tieske commented May 28, 2018

@PepperTeasdale looks good to me now. Since I already did a lot of reviews on this, I requested @hishamhm for a final review.

Final thing to do would be to update the docs at https://github.com/Kong/getkong.org

Copy link
Contributor

@hishamhm hishamhm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good! I only left notes on very minor things like comments to correct and suggested one test entry.

(I noticed that the things that don't match the current style guide, e.g. uppercase in error messages, are there to match the legacy code around it — and the Upstream entity is due to be converted to the new DAO anyway — so I think they are fine as is.)

type = "string",
},
hash_on_cookie_path = {
-- cookie name, if `hash_on` or `hash_fallback` == `"cookie"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: cookie namecookie path

assert.are.same(ngx.ctx.balancer_hash_cookie.key, "Foo")
assert.are.same(ngx.ctx.balancer_hash_cookie.path, "/")
end)
end)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps also add a "cookie" entry to the "fallback" section below at https://github.com/Kong/kong/pull/3472/files#diff-354d3b9f422325fb4a8342988b3a500cR539 ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! Didn't see that :)

hash_on = "cookie",
hash_on_cookie = "cookiename",
hash_fallback = "header",
hash_fallback_header = "Cool-Header", --> validate case insensitivity
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stale comment from copy/paste, please remove

@@ -810,4 +810,22 @@ _M.validate_header_name = function(name)
"', allowed characters are A-Z, a-z, 0-9, '_', and '-'"
end

--- Validates a cookie name.
-- Checks characters used in a cookie name to be valid
-- a-z, A-Z, 0-9 and '-' are allowed.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment mismatch: in the implementation below, _ is allowed as well.

The range of valid characters for cookie names seems to be a weird mess, so sticking to a small and safe subset like the one below seems like a good idea.

@jtschulz jtschulz force-pushed the add-hash-on-cookie branch from ba083fe to 847d316 Compare May 29, 2018 14:34
@jtschulz
Copy link
Author

OK @hishamhm thank you for that feedback! This ought to be good now, and I've cleaned up the git history. My only question pertains to updating the docs. I don't see a next branch in the website repo. I did see the release/ce-0.14.0 branch that has a no-merge PR in the project. Should I be branching off that or what? Thanks again!

@@ -562,6 +583,27 @@ describe("Balancer", function()
})
assert.are.same(crc32(table.concat(value)), hash)
end)
describe("cookie", function()
Copy link
Contributor

@hishamhm hishamhm May 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this block is the code same as the block above, but it should exercise "cookie" as the hash_fallback entry instead.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Woops! Fixed

@hishamhm
Copy link
Contributor

My only question pertains to updating the docs. I don't see a next branch in the website repo. I did see the release/ce-0.14.0 branch that has a no-merge PR in the project. Should I be branching off that or what? Thanks again!

Yes, that would work! The branch has a no-merge mark because at this point it contains only the files that have changed from the 0.13 directory (feel free to copy over new files if you need to edit those). By the time we merge this PR, we'll copy over the other 0.13 docs as they will be in their future state (this way we minimize the need to keep track of changes made to the 0.13 docs in the 0.14 branch).

@jtschulz jtschulz force-pushed the add-hash-on-cookie branch from 847d316 to 614aadf Compare May 29, 2018 15:15
@jtschulz jtschulz changed the title WIP - Add hash on cookie Add hash on cookie May 29, 2018
Copy link
Member

@thibaultcha thibaultcha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job @PepperTeasdale, thanks for the patch, looking good so far, but we'll need some polishing before merging. @Tieske @hishamhm any thoughts about including the cookie parameters as values in the balancer structure instead?

@@ -707,6 +709,20 @@ return {
header[upstream_status_header] = matches[0]
end
end

if cookie_hash_data then
local cookie, ok, err
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no benefits here in separating the declaration and initialization of those variables. Let's keep things more concise and declare/initialize them on the same line (by simply moving the local keyword to the below lines).

ok, err = cookie:set(cookie_hash_data)

if not ok then
log(ngx.WARN, "failed to set hashing cookie, key=`", cookie_hash_data.key,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None of the logs produced by Kong use or should use backticks, instead, they use single quotes '. Let's update this to keep it consistent.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additionally, I am not sure logging the value (just a UUID) is very insightful here. It mostly feels like noise.

A more concise error message which could be more readable and clearer from a user's POV would also mention the failure is related to "load balancing". Something similar to:

log(ngx.WARN, "failed to set the cookie for hash-based load balancing: ",
              err, " (key=", key, ", path=", path, ")")

Not the format and conciseness allows us to drop quotes altogether and still remain redable imho.

@@ -690,6 +691,7 @@ return {
before = function(ctx)
local var = ngx.var
local header = ngx.header
local cookie_hash_data = ctx.balancer_hash_cookie
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like there is no point in retrieving this value outside of the ctx.KONG_PROXIED branch. It would also be more readable if we retrieved it just above the if cookie_hash_data branch created below.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additionally, this gives the impression that it belongs to the ngx.ctx.balancer_address structure, in which we already store all balancer-related values for decision making and introspection purposes (its name should be revisited already). Doing so also allows us to reduce usage of the top-level ctx table (which is sensitive to naming conflicts), and self-documentation by adding this field to the ones documented with its initialization code. cc @Tieske @hishamhm

-- If the cookie doesn't exist, create it
if not identifier then
identifier = utils.uuid()
ngx.ctx.balancer_hash_cookie = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ctx is already locally cached above in this function, so better use it instead of invoking ngx.ctx again here.

end)

describe("hashing on cookie", function()
it("when the cookie is set", function()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: it is good practice to write the intended result of a test in its description, rather than relying on the reader to infer the proper test results by reading the test case... In this case, a name such as does not reply with Set-Cookie if cookie is already set or something.

@jtschulz
Copy link
Author

@thibaultcha Thanks for the good CR. I'll await your discussion on storing balancer cookie data the balancer structure. What would be a more descriptive name than balancer_address if so? balancer_(config|data) perhaps?

@hishamhm
Copy link
Contributor

I'd just go with balancer_data

@thibaultcha
Copy link
Member

@PepperTeasdale Choosing the proper name, if we were to rename this structure, would belong to another discussion, and should not be included in the scope of this PR. We'll take care of it later :)

@Tieske
Copy link
Member

Tieske commented May 30, 2018

any thoughts about including the cookie parameters as values in the balancer structure instead?

Either way is fine. That structure needs to be renamed (and maybe torn apart?), so imo that is part of/follow up of the SDK. Until that refactor, in which we can take this one along, it doesn't really matter whether it lives inside a badly named structure, or lives outside with a proper name.

@jtschulz
Copy link
Author

FWIW, from my perspective as someone new to this repo, I would find it a little confusing to find this data in the balancer_address. Thanks everyone for all the really awesome feedback and guidance :)

This allows better distribution for sticky sessions when many consumers
may be behind the same NAT gateway (e.g. a call center). When the
request comes in, we check for the configured cookie name. If not
present in the request, a random guid is created and used for balancing
and stored in `ngx.ctx` so the cookie can be added in the response
header. Because the cookie is created if it doesn't already exist, the
`hash_fallback` option is not available when cookies are the primary
identifier.
@jtschulz jtschulz force-pushed the add-hash-on-cookie branch from b21bc78 to 4e9581c Compare May 30, 2018 14:53

-- TODO: This should be added the `ngx.ctx.balancer_address`
-- structure, where other balancer-related values are stored,
-- when that is renamed/ refactored.
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you ok with this comment to address the need to refactor the balancer structure? cc: @thibaultcha @Tieske

@hishamhm hishamhm merged commit acb56d1 into Kong:next May 30, 2018
@hishamhm
Copy link
Contributor

@PepperTeasdale Merged, thank you! I'll open a different PR addressing the naming concern.

@thibaultcha
Copy link
Member

@PepperTeasdale Thanks for the patch. By the way, you are now eligible to receive your very own Contributor T-shirt (details in the link). Enjoy! :)

@jtschulz
Copy link
Author

Thank you! A+ contributor experience. I look forward contributing more to this and other kong-related projects :)

@jtschulz jtschulz deleted the add-hash-on-cookie branch May 30, 2018 18:34
thibaultcha pushed a commit to Kong/docs.konghq.com that referenced this pull request Jun 28, 2018
ukiahsmith pushed a commit to Kong/docs.konghq.com that referenced this pull request Jul 2, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants