Kong upstream passive health check uses the HTTP code set by a plugin instead of the one returned by the target server #10281

shanexguo · 2023-02-10T21:42:06Z

Is there an existing issue for this?

I have searched the existing issues

Kong version (`$ kong version`)

3.0.0.0

Current Behavior

We noticed that when our custom Kong plugin sets the HTTP status code to 502 via its exit hook (`function exit(status, body, headers)`), the upstream's passive health check counts that response as an "unhealthy" HTTP status. When this occurs a few times in a short period, as in a load test, Kong reports `{"message":"failure to get a peer from the ring-balancer"}` in the error log, which means the upstream has effectively been marked as "down".

Expected Behavior

We expect the upstream's passive unhealthy health check on HTTP status codes to mark targets as "down" (unreachable or not proxiable) based only on the HTTP status code returned by the target. It should not be affected by the final HTTP status code that a Kong plugin sets and that is returned to the client.

Steps To Reproduce

On Kong 3.0.0.0:

1. Create a custom Kong plugin that contains the following:
   1a) In its `init_worker()`, register itself as an exit hook:
   `kong.response.register_hook("exit", self.exit, self)`
   1b) In its `body_filter` function, generate an exit event:
   `return kong.response.exit(500)`
   1c) In its exit hook function `exit(status, body, headers)`, simply return the arguments:
   `return status, body, headers`
2. Create a Kong upstream with a target server. Any HTTP echo service works, for example http://httpbin.org/headers. Make sure the passive unhealthy health check is enabled, and leave everything else at its default.
3. Assign the upstream to a Kong service, then attach the custom plugin to the service.
4. Send a burst of HTTP requests that trigger the service and the custom plugin, and verify that the error log contains `{"message":"failure to get a peer from the ring-balancer"}`.
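Steps 1a to 1c can be sketched as a single handler file. This is a minimal sketch under stated assumptions, not the reporter's plugin: the table name `ExitHookHandler`, the priority, and the version are hypothetical, and the exit call is placed in the `response` phase (as the sample plugin described in the follow-up comment does) rather than in `body_filter`. Only `kong.response.register_hook` and `kong.response.exit` are taken from the steps above.

```lua
-- handler.lua: hypothetical minimal reproduction plugin
local ExitHookHandler = {
  PRIORITY = 1000,    -- arbitrary value chosen for illustration
  VERSION  = "0.1.0", -- arbitrary value chosen for illustration
}

-- 1a) Register the plugin's own exit() as an exit hook.
function ExitHookHandler:init_worker()
  kong.response.register_hook("exit", self.exit, self)
end

-- 1b) Force an exit with a status that the passive health check
--     treats as unhealthy (phase choice is an assumption, see above).
function ExitHookHandler:response(conf)
  return kong.response.exit(500)
end

-- 1c) The exit hook passes its arguments through unchanged.
function ExitHookHandler:exit(status, body, headers)
  return status, body, headers
end

-- A real handler.lua would end with: return ExitHookHandler
```

The functions only touch the global `kong` when they are called, so the file loads outside Kong, but it does nothing useful without the Kong runtime.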

Anything else?

O/S: Linux


shanexguo · 2023-02-13T15:14:09Z

Attached is a sample plugin that demonstrates the issue:

The plugin

It inserts some headers in its access() and response() functions. In response(), it also generates an exit event with the HTTP status code 502 and attaches the original HTTP body and headers it received from the upstream. Since it also registers itself as an exit hook, its own exit() function then gets called. All relevant debugging information is logged in Kong's error log. This simulates our own custom plugins.

The Pongo tests

To test it, start the Pongo Docker container with the `pongo up && pongo shell` command. Inside the Pongo shell, run the 03 spec file:
`busted /kong-plugin/spec/api-version/03-integration_spec.lua`

The test sends the same request to https://httpbin.org/anything three times in a row via the configured upstreams. Because the plugin exits with HTTP status 502, which is listed as one of the upstream's passive unhealthy HTTP status codes, the Kong load balancer passes the first requests and fails the later ones. Kong then returns HTTP 503 with the message `{"message":"failure to get a peer from the ring-balancer"}`. Note that the 503 is set by Kong, not by the plugin.

This happens because Kong evaluates the upstream's passive unhealthy health check against the plugin's HTTP status code instead of the upstream's original HTTP status code. We believe this is a bug: Kong should always use the upstream's original HTTP status code to decide whether the upstream's targets are healthy. The error log shows that the original status code is always 200.
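The sequence "first requests pass, later requests fail" can be illustrated with a toy model of a passive unhealthy counter. This is a hypothetical simplification, not Kong's implementation: the function names, the threshold of 2 failures, and the single-entry status list are all assumptions made for the example.

```lua
-- Toy model of a passive "unhealthy" counter (not Kong's code).
local function new_target(http_failures)
  return { failures = 0, threshold = http_failures, healthy = true }
end

-- Record one observed status; mark the target down at the threshold.
local function report_http_status(target, status, unhealthy_statuses)
  if unhealthy_statuses[status] then
    target.failures = target.failures + 1
    if target.failures >= target.threshold then
      target.healthy = false
    end
  end
end

local unhealthy_statuses = { [502] = true } -- assumed configuration
local target = new_target(2)                -- assumed threshold

local results = {}
for request = 1, 3 do
  if not target.healthy then
    results[request] = 503 -- no healthy peer left, Kong answers itself
  else
    -- The target answered 200, but the plugin exited with 502, and the
    -- health check is fed the plugin's status.
    local final_status = 502
    report_http_status(target, final_status, unhealthy_statuses)
    results[request] = final_status
  end
end

print(table.concat(results, ", ")) -- 502, 502, 503
```

Feeding the counter the target's own status (200) instead of `final_status` would leave the target healthy for all three requests.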

To further verify the above, you can remove the 502 status code from the upstream's passive unhealthy HTTP status codes in 03-integration_spec.lua and add it to the passive healthy HTTP status codes instead. All three requests should then succeed.
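The two configurations being compared could look roughly like this, written as plain Lua tables in the shape of Kong's upstream `healthchecks` field. The variable names, thresholds, and exact status lists are assumptions for illustration; the actual values are in `03-integration_spec.lua` in the attached archive.

```lua
-- Variant A (bug visible): 502 counts as an unhealthy status.
local healthchecks_unhealthy_502 = {
  passive = {
    healthy   = { successes = 1, http_statuses = { 200 } },
    unhealthy = { http_failures = 1, http_statuses = { 500, 502 } },
  },
}

-- Variant B (all three requests succeed): 502 moved to the healthy list.
local healthchecks_healthy_502 = {
  passive = {
    healthy   = { successes = 1, http_statuses = { 200, 502 } },
    unhealthy = { http_failures = 1, http_statuses = { 500 } },
  },
}
```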

shanexguo · 2023-02-13T15:17:41Z

kong-api-version-plugin.tar.gz
Here is the sample plugin and the Pongo test files described above.

locao · 2023-02-14T15:28:58Z

Hi @shanexguo! Thanks for your report. That's a known behavior. We are discussing whether we should change it in the future or document it better.

We expect the upstream's passive unhealthy health check on HTTP status codes to mark targets as "down" (unreachable or not proxiable) based only on the HTTP status code returned by the target. It should not be affected by the final HTTP status code set by a Kong plugin, which is returned to the client. So we change the passive health check implementation to use the nginx variable `upstream_code`, which is not changed by any plugin. FTI-4841 Fix #10281
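The idea behind this change can be sketched as a small status-selection function. This is a hedged illustration, not the merged code: the function name is hypothetical, and it assumes the variable meant is nginx's documented `$upstream_status`, whose value may be absent, `-`, or a list such as `502, 200` when several upstream tries happened.

```lua
-- Pick the status that the passive health check should see.
-- upstream_status: raw value of nginx's $upstream_status (assumed input)
-- final_status:    status returned to the client, possibly set by a plugin
local function health_check_status(upstream_status, final_status)
  -- Keep only the last three characters, i.e. the last upstream try.
  local last_try = string.sub(upstream_status or "", -3)
  -- Fall back to the final status when no upstream status is available.
  return tonumber(last_try) or final_status
end

print(health_check_status("200", 502))      -- 200
print(health_check_status("502, 200", 500)) -- 200
print(health_check_status(nil, 503))        -- 503
```

With this selection, a plugin that exits with 502 after the target answered 200 no longer adds to the target's unhealthy count.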

shanexguo · 2023-03-01T17:21:10Z

Many thanks! In which Kong general release will this be available?


locao · 2023-03-01T17:52:49Z

This change will be included in Kong 3.3.

shanexguo · 2023-03-03T15:55:20Z

Hi, do you have an ETA for the Kong 3.3.0 general release? Thanks,


hbagdi · 2023-03-03T20:21:47Z

Roughly middle of May.

shanexguo · 2023-03-08T16:27:31Z

Hi Kong support team, is it possible to get a hot patch for this issue for Kong 3.0? We do have an official Kong Enterprise support contract. Thanks,


locao · 2023-03-08T19:22:52Z

Hi @shanexguo,

Yes, Kong EE 3.0 is still under full support. Could you contact your technical support representative to open a ticket, so we can track this down and get it prioritized, please?

Thank you!

shanexguo · 2023-03-09T18:16:03Z

Will do. Thanks.


x108207 · 2023-03-10T17:54:26Z

New case for Hotfix:
https://support.konghq.com/support/s/case/5001K00001D8zQSQAZ/hotfix-for-kong-upstream-passive-health-check

We expect the upstream's passive unhealthy health check on HTTP status codes to mark targets as "down" (unreachable or not proxiable) based only on the HTTP status code returned by the target. It should not be affected by the final HTTP status code set by a Kong plugin, which is returned to the client. So we change the passive health check implementation to use the nginx variable `upstream_code`, which is not changed by any plugin. FTI-4841 Fix #10281 (cherry picked from commit dbe8d94)


pluveto added the core/balancer label Feb 13, 2023

samugi added the bug label Feb 14, 2023

oowl mentioned this issue Feb 20, 2023

fix(runloop): use upstream status code in passive health check #10325

Merged

3 tasks

dndx closed this as completed in #10325 Mar 1, 2023

locao mentioned this issue Mar 17, 2023

fix(runloop): use upstream status code in passive health check (#10325) #10516

Closed

3 tasks

locao mentioned this issue Mar 17, 2023

fix(runloop): use upstream status code in passive health check (#10325) #10517

Merged

3 tasks

locao mentioned this issue Mar 17, 2023

fix(runloop): use upstream status code in passive health check (#10325) #10518

Merged

3 tasks

locao mentioned this issue Mar 17, 2023

fix(runloop): use upstream status code in passive health check (#10325) #10519

Merged

3 tasks

curiositycasualty pushed a commit that referenced this issue Oct 15, 2024

tests(helpers): move client functions (#10281)

6218872

https://konghq.atlassian.net/browse/KAG-5253 Cherry-picked from #13674
