Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

*-log - 4xx response from Kong are not logged #892

Closed
ahmadnassri opened this issue Jan 19, 2016 · 13 comments
Closed

*-log - 4xx response from Kong are not logged #892

ahmadnassri opened this issue Jan 19, 2016 · 13 comments
Assignees
Labels
task/feature Requests for new features in Kong

Comments

@ahmadnassri
Copy link
Contributor

when Kong responds in a 4xx (e.g. 401, 403) the logging plugins (all *-log and Galileo) are not logging those requests.

this might affect runscope as well.

@subnetmarco
Copy link
Member

I think this happens when a plugin terminates the request prematurely (like authentication not validated), but not when the upstream service returns a 4** response.

@thibaultcha
Copy link
Member

Yes, logging plugins have always been intended to log successful requests.

See #490 and https://groups.google.com/forum/#!topic/konglayer/Dx45HAHdf8k for why.

@nijikokun
Copy link
Contributor

Perhaps it should be configurable based on the client's needs. Such as per method, status code, path, and headers.

Not sure whether we want to get into filtering out certain strings and data that is being passed but I know for a fact that certain logging utilities (raygun) already provide such a service in their agents and libraries.

@thibaultcha thibaultcha added task/feature Requests for new features in Kong area/plugins and removed question labels Jan 20, 2016
@thibaultcha thibaultcha changed the title [plugin] *-log - 4xx response from Kong are not logged *-log - 4xx response from Kong are not logged Jan 20, 2016
@ahmadnassri
Copy link
Contributor Author

See #490 and https://groups.google.com/forum/#!topic/konglayer/Dx45HAHdf8k for why.

we'll have to come up with a possible solution for logging errors in this scenario, a production system would be lacking if those types of errors are not captured in logs.

@ryanwilsonperkin
Copy link

Hey all, I'd love to weigh in on this since it's a feature we'd be interested in taking advantage of. We're hoping to use the rate-limiting plugin and use Galileo for visualizing of how many requests are being rate limited, and especially being able to break those down by header or path. We'd also love to visualize this for other plugins, such as how many requests are being rejected for authentication reasons.

If there's any other information we can provide, feel free to ask. @ahmadnassri thanks for filing this issue.

@nicklanng
Copy link

Hi, I've just been working on a PoC of Kong and run into this very issue - has there been any more work done in this area? Thanks.

@ahmadnassri ahmadnassri added the BC label May 13, 2016
@BrianHutchison
Copy link

+1 for letting us control how Kong logs in potential DDOS scenarios.

We would like to enable alerting in our production environment if we are denying requests (past some threshold) based on reaching a rate limit.

@ahmadnassri Have you come up with a work-around?

@Tieske
Copy link
Member

Tieske commented May 25, 2016

Code wise it happens when a plugin (or any other piece of Kong) exits prematurely like @thefosk mentioned. To catch those we could try and wrap those ngx functions/methods and catch the errors in between.

To prevent any dos problems we could set a limit of how many of a particular error type are logged per second. Eg. max 10 per second. So if more than 10 errors of 4xx are generated, it will stop logging those and only report the total count once per second until the rate goes below the 10 per second again.

@codebien
Copy link

For our service this feature is a must-have, so we are discussing internal on a workaround. It could be ok for us to support the cost for developing a basic implementation of this feature and pull request it.

IMO, to develop this feature in the minimum possible time, the rate limit issue shouldn't be taken into account.
It could be handled outside the plugin and Kong for now. Then, in future and advanced versions, one or more rate limit techniques could be added.

We are thinking of 2 possibilities:

  • @thibaultcha says that the current plugin is for logging proxied requests, so we could implement a log plugin for not proxied requests. @mashape-team Are proxied requests identifiable from within the plugin scope?
  • A more complex but much more powerful solution would be to implement a per request access logger with a global vision directly in kong core.
    This would allow to achieve 2 goals:
    1- Catching all requests, includnig requests that do not match any API (404 generated by kong), while still keeping per-request logs.
    2- Loggins calls to the Admin API the same way gateway would do.

This is my simple idea to implement it:

For each request, a new Logger Record instance is generated. The information available at Kong core is added (The arrived request, the returned response, etc.). In addition, the Logger instance is injected within each plugin, so that they can log additional information available only within the plugin's scope.

@ALL What do you think, guys?

Thanks!

@thibaultcha thibaultcha removed the BC label Sep 1, 2016
@ewilde
Copy link

ewilde commented Jan 13, 2017

Just to mention this issue affects the runscope plugin as well.

It's a shame. It would be very useful for us to help diagnose customer authentication issues if 401's were being forwarded on. At the moment we only see 200s in the runscope traffic monitor sent by the plugin.

@Krieke
Copy link

Krieke commented Jan 25, 2017

+1

@Tieske Tieske self-assigned this Jan 26, 2017
@robotnic
Copy link

robotnic commented Aug 8, 2017

+1

thibaultcha added a commit that referenced this issue Dec 12, 2017
Background context
------------------

Prior to this change, if a plugin short-circuits a request (e.g. auth
plugin with HTTP 401, or rate limiting with HTTP 429) then subsequent
plugins with a lower priority would not be executed by Kong. This is
even more of an issue for logging plugins as they would be blind to
requests short-circuited by Kong itself.

The purpose of this patch is to allow other plugins in general to run
after a plugin short-circuited a request in the access phase.

Underlying issue
----------------

Our current plugins' run-loop is implemented in such a way that it both
constructs the list of plugins to execute, and yields to its caller (for
loop) at the same time. Once yielded, a plugin is instantly executed by
the caller (for loop). If the plugin uses the `ngx.exit()` API, then the
execution of the access phase is interrupted, and our run-loop never has
the chance to add logging plugins to the list of plugins to execute for
the current request (that is because logging plugins have a lower
priority than our authentication plugins, which must run first).

Possible solutions
------------------

One could think of several solutions to this issue:

1. Increase the priority of the logging plugins, so that they run
   earlier than auth plugin, and will be added to the list of plugins to
   execute for this request before the access phase is short-circuited.
2. Re-implement our run-loop (`plugins_iterator.lua` and
   `kong/init.lua`) so that it somehow builds the list of plugins to
   execute for a request first, then execute said plugin _after_.
3. Force the run-loop to rebuild the entire list of plugins inside of
   the logging phase.

However, none of these solutions comes without a trade-off.

1. By updating the priority order of each plugin, we run the risk of
   unnecessarily breaking logic depending on the current order of
   execution. We also risk not fixing this issue for custom plugins
   without those plugins also bumping their priority order, which
   could cause cascading issues if other plugins depend on those
   plugins being executed at later phases.
2. While this is definitely a long-term goal, the breaking change
   nature of this solution should tell us that we would rather
   post-pone it until a better case study is made against it.
   Re-implementing our run-loop will benefit Kong in many ways
   (easier to write plugins, more fine-grained hooks, more
   sandboxing...), but doing so now would be disruptive for current
   plugins. One of this reasons behind this is that such a new run-loop
   should probably not follow the same paradigm of building itself and
   yielding at the same time. Instead, we should think of a run-loop
   executing global plugins first, then authentication plugins, then
   API/Consumer-specific plugin. Such an approach as of today would be
   extremely disruptive and break many assumptions made in existing
   plugins both defaults ones and in the wild.
3. The major flaw with this approach is that the run-loop depends on the
   datastore cache, which itself results in DB calls whenever a miss is
   encountered. However, relying on the datastore cache in the log phase
   is a very bad idea, since any sort of cache miss would trigger a DB
   request, which aren't supported in the log phase of NGINX due
   (rightfully so) to the lack of cosocket support in this phase.

Proposed solution
-----------------

This could be seen as a hack, or as a slightly more complex run-loop
with some state. We take advantage of the fact that all of our plugins
(and, very likely, most of third-party plugins out there) use the
`responses` module to send their HTTP responses and short-circuit
requests. The introduction of a flag in the request's context *delays*
such a response, which gives the run-loop a chance to finish building
the list of plugins to execute (but subsequent plugins do not run
anymore once a plugin short-circuited the request).

Once the list of plugins to execute is complete, we finally
short-circuit the execution of the access phase, not giving Kong a
chance to run the "after access" handler, thus not falsely leading other
plugins into believe the request was proxied.

Once the log phase kicks in, it will undoubtedly execute the registered
plugins, even if their priority was lesser than that of the
short-circuiting plugin.

This way, we've achieved the desired result with minimal impact:

* no plugin needs to update its `priority` constant
* no plugin needs to see their code updated, as long as they use the
  `responses` module
* the performance impact is minimal; we are only doing a few `ngx.ctx`
  accesses and there is no need to re-run the plugins iterators
* the code change is minimal

Changes
-------

* Implemented a `ctx.delay_response` flag to play nice with the
  `responses` module. If set, we delay the flushing of the response
  until the plugins run-loop has finished running.
  Plugins can also make use of a custom flush callback for delayed
  response if they do not wish to use the `responses.send` API.
  They can do so by setting `ctx.delayed_response = true` and
  `ctx.delayed_response_callback` to a function accepting `ngx.ctx.`
  as its sole argument.
* Ensure all plugins follow the correct pattern of always calling
  `responses.send()` with a `return` statement.
* Implement regression tests for the subsequent-phases to run upon
  short-circuiting.

Fix #490
Fix #892
From #3079
thibaultcha added a commit that referenced this issue Dec 13, 2017
Background context
------------------

Prior to this change, if a plugin short-circuits a request (e.g. auth
plugin with HTTP 401, or rate limiting with HTTP 429) then subsequent
plugins with a lower priority would not be executed by Kong. This is
even more of an issue for logging plugins as they would be blind to
requests short-circuited by Kong itself.

The purpose of this patch is to allow other plugins in general to run
after a plugin short-circuited a request in the access phase.

Underlying issue
----------------

Our current plugins' run-loop is implemented in such a way that it both
constructs the list of plugins to execute, and yields to its caller (for
loop) at the same time. Once yielded, a plugin is instantly executed by
the caller (for loop). If the plugin uses the `ngx.exit()` API, then the
execution of the access phase is interrupted, and our run-loop never has
the chance to add logging plugins to the list of plugins to execute for
the current request (that is because logging plugins have a lower
priority than our authentication plugins, which must run first).

Possible solutions
------------------

One could think of several solutions to this issue:

1. Increase the priority of the logging plugins, so that they run
   earlier than auth plugin, and will be added to the list of plugins to
   execute for this request before the access phase is short-circuited.
2. Re-implement our run-loop (`plugins_iterator.lua` and
   `kong/init.lua`) so that it somehow builds the list of plugins to
   execute for a request first, then execute said plugin _after_.
3. Force the run-loop to rebuild the entire list of plugins inside of
   the logging phase.

However, none of these solutions comes without a trade-off.

1. By updating the priority order of each plugin, we run the risk of
   unnecessarily breaking logic depending on the current order of
   execution. We also risk not fixing this issue for custom plugins
   without those plugins also bumping their priority order, which
   could cause cascading issues if other plugins depend on those
   plugins being executed at later phases.
2. While this is definitely a long-term goal, the breaking change
   nature of this solution should tell us that we would rather
   post-pone it until a better case study is made against it.
   Re-implementing our run-loop will benefit Kong in many ways
   (easier to write plugins, more fine-grained hooks, more
   sandboxing...), but doing so now would be disruptive for current
   plugins. One of this reasons behind this is that such a new run-loop
   should probably not follow the same paradigm of building itself and
   yielding at the same time. Instead, we should think of a run-loop
   executing global plugins first, then authentication plugins, then
   API/Consumer-specific plugin. Such an approach as of today would be
   extremely disruptive and break many assumptions made in existing
   plugins both defaults ones and in the wild.
3. The major flaw with this approach is that the run-loop depends on the
   datastore cache, which itself results in DB calls whenever a miss is
   encountered. However, relying on the datastore cache in the log phase
   is a very bad idea, since any sort of cache miss would trigger a DB
   request, which aren't supported in the log phase of NGINX due
   (rightfully so) to the lack of cosocket support in this phase.

Proposed solution
-----------------

This could be seen as a hack, or as a slightly more complex run-loop
with some state. We take advantage of the fact that all of our plugins
(and, very likely, most of third-party plugins out there) use the
`responses` module to send their HTTP responses and short-circuit
requests. The introduction of a flag in the request's context *delays*
such a response, which gives the run-loop a chance to finish building
the list of plugins to execute (but subsequent plugins do not run
anymore once a plugin short-circuited the request).

Once the list of plugins to execute is complete, we finally
short-circuit the execution of the access phase, not giving Kong a
chance to run the "after access" handler, thus not falsely leading other
plugins into believe the request was proxied.

Once the log phase kicks in, it will undoubtedly execute the registered
plugins, even if their priority was lesser than that of the
short-circuiting plugin.

This way, we've achieved the desired result with minimal impact:

* no plugin needs to update its `priority` constant
* no plugin needs to see their code updated, as long as they use the
  `responses` module
* the performance impact is minimal; we are only doing a few `ngx.ctx`
  accesses and there is no need to re-run the plugins iterators
* the code change is minimal

Changes
-------

* Implemented a `ctx.delay_response` flag to play nice with the
  `responses` module. If set, we delay the flushing of the response
  until the plugins run-loop has finished running.
  Plugins can also make use of a custom flush callback for delayed
  response if they do not wish to use the `responses.send` API.
  They can do so by setting `ctx.delayed_response = true` and
  `ctx.delayed_response_callback` to a function accepting `ngx.ctx.`
  as its sole argument.
* Ensure all plugins follow the correct pattern of always calling
  `responses.send()` with a `return` statement.
* Implement regression tests for the subsequent-phases to run upon
  short-circuiting.

Fix #490
Fix #892
From #3079
thibaultcha added a commit that referenced this issue Jan 16, 2018
Background context
------------------

Prior to this change, if a plugin short-circuits a request (e.g. auth
plugin with HTTP 401, or rate limiting with HTTP 429) then subsequent
plugins with a lower priority would not be executed by Kong. This is
even more of an issue for logging plugins as they would be blind to
requests short-circuited by Kong itself.

The purpose of this patch is to allow other plugins in general to run
after a plugin short-circuited a request in the access phase.

Underlying issue
----------------

Our current plugins' run-loop is implemented in such a way that it both
constructs the list of plugins to execute, and yields to its caller (for
loop) at the same time. Once yielded, a plugin is instantly executed by
the caller (for loop). If the plugin uses the `ngx.exit()` API, then the
execution of the access phase is interrupted, and our run-loop never has
the chance to add logging plugins to the list of plugins to execute for
the current request (that is because logging plugins have a lower
priority than our authentication plugins, which must run first).

Possible solutions
------------------

One could think of several solutions to this issue:

1. Increase the priority of the logging plugins, so that they run
   earlier than auth plugin, and will be added to the list of plugins to
   execute for this request before the access phase is short-circuited.
2. Re-implement our run-loop (`plugins_iterator.lua` and
   `kong/init.lua`) so that it somehow builds the list of plugins to
   execute for a request first, then execute said plugin _after_.
3. Force the run-loop to rebuild the entire list of plugins inside of
   the logging phase.

However, none of these solutions comes without a trade-off.

1. By updating the priority order of each plugin, we run the risk of
   unnecessarily breaking logic depending on the current order of
   execution. We also risk not fixing this issue for custom plugins
   without those plugins also bumping their priority order, which
   could cause cascading issues if other plugins depend on those
   plugins being executed at later phases.
2. While this is definitely a long-term goal, the breaking change
   nature of this solution should tell us that we would rather
   post-pone it until a better case study is made against it.
   Re-implementing our run-loop will benefit Kong in many ways
   (easier to write plugins, more fine-grained hooks, more
   sandboxing...), but doing so now would be disruptive for current
   plugins. One of this reasons behind this is that such a new run-loop
   should probably not follow the same paradigm of building itself and
   yielding at the same time. Instead, we should think of a run-loop
   executing global plugins first, then authentication plugins, then
   API/Consumer-specific plugin. Such an approach as of today would be
   extremely disruptive and break many assumptions made in existing
   plugins both defaults ones and in the wild.
3. The major flaw with this approach is that the run-loop depends on the
   datastore cache, which itself results in DB calls whenever a miss is
   encountered. However, relying on the datastore cache in the log phase
   is a very bad idea, since any sort of cache miss would trigger a DB
   request, which aren't supported in the log phase of NGINX due
   (rightfully so) to the lack of cosocket support in this phase.

Proposed solution
-----------------

This could be seen as a hack, or as a slightly more complex run-loop
with some state. We take advantage of the fact that all of our plugins
(and, very likely, most of third-party plugins out there) use the
`responses` module to send their HTTP responses and short-circuit
requests. The introduction of a flag in the request's context *delays*
such a response, which gives the run-loop a chance to finish building
the list of plugins to execute (but subsequent plugins do not run
anymore once a plugin short-circuited the request).

Once the list of plugins to execute is complete, we finally
short-circuit the execution of the access phase, not giving Kong a
chance to run the "after access" handler, thus not falsely leading other
plugins into believe the request was proxied.

Once the log phase kicks in, it will undoubtedly execute the registered
plugins, even if their priority was lesser than that of the
short-circuiting plugin.

This way, we've achieved the desired result with minimal impact:

* no plugin needs to update its `priority` constant
* no plugin needs to see their code updated, as long as they use the
  `responses` module
* the performance impact is minimal; we are only doing a few `ngx.ctx`
  accesses and there is no need to re-run the plugins iterators
* the code change is minimal

Changes
-------

* Implemented a `ctx.delay_response` flag to play nice with the
  `responses` module. If set, we delay the flushing of the response
  until the plugins run-loop has finished running.
  Plugins can also make use of a custom flush callback for delayed
  response if they do not wish to use the `responses.send` API.
  They can do so by setting `ctx.delayed_response = true` and
  `ctx.delayed_response_callback` to a function accepting `ngx.ctx.`
  as its sole argument.
* Ensure all plugins follow the correct pattern of always calling
  `responses.send()` with a `return` statement.
* Implement regression tests for the subsequent-phases to run upon
  short-circuiting.

Fix #490
Fix #892
From #3079
@thibaultcha
Copy link
Member

Closing this now that Kong 0.12.0 has been released and logs such errors when returned by plugins!

thibaultcha added a commit that referenced this issue Jun 12, 2018
Context
-------

Before this patch, several scenarios would make Kong/Nginx produce
errors that did not execute plugins:

1. Nginx-produced client errors (HTTP 4xx) on invalid requests per the
   Nginx settings (e.g. headers too large, URI too large, etc...)
2. Proxy TCP/HTTP errors on the last balancer try (e.g. connection
   refused, connection timeout, read/write timeout, etc...)
3. Any other Nginx-produced server errors (HTTP 5xx), if any

Because plugins are not executed in such cases, logging plugins in
particular do not get a chance to report such errors, leaving Kong
administrator blind when relying on bundled or homegrown logging
plugins. Additionally, other plugins, e.g. transformation, do not get a
chance to run either and could lead to scenarios where those errors
produce headers or response bodies that are undesired by the
administrator.

This issue is similar to that fixed by #3079, which handled the
following case:

4. Requests short-circuited by Kong itself (vs Nginx in 1, 2, and 3.)

See also #490 & #892 for issues related to 4. This patches addresses 1,
2, and 3.

Problem
-------

Nginx-produced errors (4xx and 5xx) are redirected to the
`/kong_error_handler` location by way of the `error_page` directive,
but the runloop is not given a chance to run, since the error handler
is only instructed to override the Nginx-defined HTML responses, and
produces Kong-friendly ones.

In short, a patch solving this issue must enable the runloop within the
error handler.

Proposed solution
-----------------

Several (many) possibilities were evaluated and explored to fix this.
The proposed patch is the result of many attempts, observations, and
adjustments to initial proposals. In a gist, it proposes:

* Executing the response part of the runloop in the error_page location.
* Ensure that if the request part of the runloop did not have a chance
  to run, we do build the list of plugins to execute (thus, global plugins
  only since the request was not processed).
* Ensure that if the request part of the runloop did run, we preserve
  its `ngx.ctx` values.

A few catches:

* Currently, `ngx.req.get_method()` in the `error_page` location block
  will return `GET` even if the request method was different (only
  `HEAD` is preserved). A follow up patch for Nginx is being prepared.
  As such, several tests are currently marked as "Pending".
* When the `rewrite` phase did not get a chance to run, some parts of
  the request may not have been parsed by Nginx (e.g. on HTTP 414), thus
  some variables, such as `$request_uri` or request headers may not have
  been parsed. Code should be written defensively against such
  possibilities, and logging plugins may report incomplete request data
  (but will still report the produced error).
* The `uninitialized_variable_warn` directive must be disabled in the
  `error_page` location block for cases when the `rewrite` phase did not
  get a chance to run (Nginx-produced 4xx clients errors).
* The `ngx.ctx` variables must be copied by reference for plugins to be
  able to access it in the response phases when the request was
  short-circuited by Nginx (e.g. HTTP 502/504 from the upstream module).
* Using a named location for the `error_page` handler is not possible,
  since in some cases the URI might not be parsed by Nginx, and thus Nginx
  refuses to call named locations in such cases (e.g. on HTTP 414).

Resulting behavior
------------------

The `03-plugins_triggering_spec.lua` test suite is now a good reference
for the behavior observed upon Nginx-produced errors. This behavior can
be classified in 2 categories:

1. When the request is not parsed by Nginx (HTTP 4xx)
    a. The `error_page` directive is called (`rewrite` and `access` are
       skipped)
    b. The runloop retrieves the list of **global** plugins to run (it
       cannot retrieve more specific plugins since the request could not be
       parsed)
    c. The error content handler overrides the Nginx body with a Kong-friendly
       one (unchanged from previous behavior)
    d. It executes the `header_filter` phase of each global plugin.
    e. It executes the `body_filter` phase of each global plugin.
    f. It executes the `log` phase of each global plugin.

2. When the error was produced during upstream module execution (HTTP 5xx)
    a. The `rewrite` and `access` phases ran normally (we stashed the
       `ngx.ctx` reference)
    b. The `error_page` directive is called
    c. The error content handler applies the original `ngx.ctx` by
       reference
    d. It does **not** run the plugins loop, since `ngx.ctx` contains
       the list of plugins to run (global **and** specific plugins in
       this case)
    e. The error content handler overrides the Nginx body with a Kong-friendly
       one (unchanged from previous behavior)
    f. It executes the `header_filter` phase of each global plugin.
    g. It executes the `body_filter` phase of each global plugin.
    h. It executes the `log` phase of each global plugin.

Fix #3193
thibaultcha added a commit that referenced this issue Jun 14, 2018
Context
-------

Before this patch, several scenarios would make Kong/Nginx produce
errors that did not execute plugins:

1. Nginx-produced client errors (HTTP 4xx) on invalid requests per the
   Nginx settings (e.g. headers too large, URI too large, etc...)
2. Proxy TCP/HTTP errors on the last balancer try (e.g. connection
   refused, connection timeout, read/write timeout, etc...)
3. Any other Nginx-produced server errors (HTTP 5xx), if any

Because plugins are not executed in such cases, logging plugins in
particular do not get a chance to report such errors, leaving Kong
administrator blind when relying on bundled or homegrown logging
plugins. Additionally, other plugins, e.g. transformation, do not get a
chance to run either and could lead to scenarios where those errors
produce headers or response bodies that are undesired by the
administrator.

This issue is similar to that fixed by #3079, which handled the
following case:

4. Requests short-circuited by Kong itself (vs Nginx in 1, 2, and 3.)

See also #490 & #892 for issues related to 4. This patches addresses 1,
2, and 3.

Problem
-------

Nginx-produced errors (4xx and 5xx) are redirected to the
`/kong_error_handler` location by way of the `error_page` directive,
but the runloop is not given a chance to run, since the error handler
is only instructed to override the Nginx-defined HTML responses, and
produces Kong-friendly ones.

In short, a patch solving this issue must enable the runloop within the
error handler.

Proposed solution
-----------------

Several (many) possibilities were evaluated and explored to fix this.
The proposed patch is the result of many attempts, observations, and
adjustments to initial proposals. In a gist, it proposes:

* Executing the response part of the runloop in the error_page location.
* Ensure that if the request part of the runloop did not have a chance
  to run, we do build the list of plugins to execute (thus, global plugins
  only since the request was not processed).
* Ensure that if the request part of the runloop did run, we preserve
  its `ngx.ctx` values.

A few catches:

* Currently, `ngx.req.get_method()` in the `error_page` location block
  will return `GET` even if the request method was different (only
  `HEAD` is preserved). A follow up patch for Nginx is being prepared.
  As such, several tests are currently marked as "Pending".
* When the `rewrite` phase did not get a chance to run, some parts of
  the request may not have been parsed by Nginx (e.g. on HTTP 414), thus
  some variables, such as `$request_uri` or request headers may not have
  been parsed. Code should be written defensively against such
  possibilities, and logging plugins may report incomplete request data
  (but will still report the produced error).
* The `uninitialized_variable_warn` directive must be disabled in the
  `error_page` location block for cases when the `rewrite` phase did not
  get a chance to run (Nginx-produced 4xx clients errors).
* The `ngx.ctx` variables must be copied by reference for plugins to be
  able to access it in the response phases when the request was
  short-circuited by Nginx (e.g. HTTP 502/504 from the upstream module).
* Using a named location for the `error_page` handler is not possible,
  since in some cases the URI might not be parsed by Nginx, and thus Nginx
  refuses to call named locations in such cases (e.g. on HTTP 414).

Resulting behavior
------------------

The `03-plugins_triggering_spec.lua` test suite is now a good reference
for the behavior observed upon Nginx-produced errors. This behavior can
be classified in 2 categories:

1. When the request is not parsed by Nginx (HTTP 4xx)
    a. The `error_page` directive is called (`rewrite` and `access` are
       skipped)
    b. The runloop retrieves the list of **global** plugins to run (it
       cannot retrieve more specific plugins since the request could not be
       parsed)
    c. The error content handler overrides the Nginx body with a Kong-friendly
       one (unchanged from previous behavior)
    d. It executes the `header_filter` phase of each global plugin.
    e. It executes the `body_filter` phase of each global plugin.
    f. It executes the `log` phase of each global plugin.

2. When the error was produced during upstream module execution (HTTP 5xx)
    a. The `rewrite` and `access` phases ran normally (we stashed the
       `ngx.ctx` reference)
    b. The `error_page` directive is called
    c. The error content handler applies the original `ngx.ctx` by
       reference
    d. It does **not** run the plugins loop, since `ngx.ctx` contains
       the list of plugins to run (global **and** specific plugins in
       this case)
    e. The error content handler overrides the Nginx body with a Kong-friendly
       one (unchanged from previous behavior)
    f. It executes the `header_filter` phase of each global plugin.
    g. It executes the `body_filter` phase of each global plugin.
    h. It executes the `log` phase of each global plugin.

Fix #3193
thibaultcha added a commit that referenced this issue Jun 15, 2018
Context
-------

Before this patch, several scenarios would make Kong/Nginx produce
errors that did not execute plugins:

1. Nginx-produced client errors (HTTP 4xx) on invalid requests per the
   Nginx settings (e.g. headers too large, URI too large, etc...)
2. Proxy TCP/HTTP errors on the last balancer try (e.g. connection
   refused, connection timeout, read/write timeout, etc...)
3. Any other Nginx-produced server errors (HTTP 5xx), if any

Because plugins are not executed in such cases, logging plugins in
particular do not get a chance to report such errors, leaving Kong
administrator blind when relying on bundled or homegrown logging
plugins. Additionally, other plugins, e.g. transformation, do not get a
chance to run either and could lead to scenarios where those errors
produce headers or response bodies that are undesired by the
administrator.

This issue is similar to that fixed by #3079, which handled the
following case:

4. Requests short-circuited by Kong itself (vs Nginx in 1, 2, and 3.)

See also #490 & #892 for issues related to 4. This patches addresses 1,
2, and 3.

Problem
-------

Nginx-produced errors (4xx and 5xx) are redirected to the
`/kong_error_handler` location by way of the `error_page` directive,
but the runloop is not given a chance to run, since the error handler
is only instructed to override the Nginx-defined HTML responses, and
produces Kong-friendly ones.

In short, a patch solving this issue must enable the runloop within the
error handler.

Proposed solution
-----------------

Several (many) possibilities were evaluated and explored to fix this.
The proposed patch is the result of many attempts, observations, and
adjustments to initial proposals. In a gist, it proposes:

* Executing the response part of the runloop in the error_page location.
* Ensure that if the request part of the runloop did not have a chance
  to run, we do build the list of plugins to execute (thus, global plugins
  only since the request was not processed).
* Ensure that if the request part of the runloop did run, we preserve
  its `ngx.ctx` values.

A few catches:

* Currently, `ngx.req.get_method()` in the `error_page` location block
  will return `GET` even if the request method was different (only
  `HEAD` is preserved). A follow up patch for Nginx is being prepared.
  As such, several tests are currently marked as "Pending".
* When the `rewrite` phase did not get a chance to run, some parts of
  the request may not have been parsed by Nginx (e.g. on HTTP 414), thus
  some variables, such as `$request_uri` or request headers may not have
  been parsed. Code should be written defensively against such
  possibilities, and logging plugins may report incomplete request data
  (but will still report the produced error).
* The `uninitialized_variable_warn` directive must be disabled in the
  `error_page` location block for cases when the `rewrite` phase did not
  get a chance to run (Nginx-produced 4xx clients errors).
* The `ngx.ctx` variables must be copied by reference for plugins to be
  able to access it in the response phases when the request was
  short-circuited by Nginx (e.g. HTTP 502/504 from the upstream module).
* Using a named location for the `error_page` handler is not possible,
  since in some cases the URI might not be parsed by Nginx, and thus Nginx
  refuses to call named locations in such cases (e.g. on HTTP 414).

Resulting behavior
------------------

The `03-plugins_triggering_spec.lua` test suite is now a good reference
for the behavior observed upon Nginx-produced errors. This behavior can
be classified in 2 categories:

1. When the request is not parsed by Nginx (HTTP 4xx)
    a. The `error_page` directive is called (`rewrite` and `access` are
       skipped)
    b. The runloop retrieves the list of **global** plugins to run (it
       cannot retrieve more specific plugins since the request could not be
       parsed)
    c. The error content handler overrides the Nginx body with a Kong-friendly
       one (unchanged from previous behavior)
    d. It executes the `header_filter` phase of each global plugin.
    e. It executes the `body_filter` phase of each global plugin.
    f. It executes the `log` phase of each global plugin.

2. When the error was produced during upstream module execution (HTTP 5xx)
    a. The `rewrite` and `access` phases ran normally (we stashed the
       `ngx.ctx` reference)
    b. The `error_page` directive is called
    c. The error content handler applies the original `ngx.ctx` by
       reference
    d. It does **not** run the plugins loop, since `ngx.ctx` contains
       the list of plugins to run (global **and** specific plugins in
       this case)
    e. The error content handler overrides the Nginx body with a Kong-friendly
       one (unchanged from previous behavior)
    f. It executes the `header_filter` phase of each global plugin.
    g. It executes the `body_filter` phase of each global plugin.
    h. It executes the `log` phase of each global plugin.

Fix #3193
thibaultcha added a commit that referenced this issue Jun 15, 2018
Context
-------

Before this patch, several scenarios would make Kong/Nginx produce
errors that did not execute plugins:

1. Nginx-produced client errors (HTTP 4xx) on invalid requests per the
   Nginx settings (e.g. headers too large, URI too large, etc...)
2. Proxy TCP/HTTP errors on the last balancer try (e.g. connection
   refused, connection timeout, read/write timeout, etc...)
3. Any other Nginx-produced server errors (HTTP 5xx), if any

Because plugins are not executed in such cases, logging plugins in
particular do not get a chance to report such errors, leaving Kong
administrator blind when relying on bundled or homegrown logging
plugins. Additionally, other plugins, e.g. transformation, do not get a
chance to run either and could lead to scenarios where those errors
produce headers or response bodies that are undesired by the
administrator.

This issue is similar to that fixed by #3079, which handled the
following case:

4. Requests short-circuited by Kong itself (vs Nginx in 1, 2, and 3.)

See also #490 & #892 for issues related to 4. This patches addresses 1,
2, and 3.

Problem
-------

Nginx-produced errors (4xx and 5xx) are redirected to the
`/kong_error_handler` location by way of the `error_page` directive,
but the runloop is not given a chance to run, since the error handler
is only instructed to override the Nginx-defined HTML responses, and
produces Kong-friendly ones.

In short, a patch solving this issue must enable the runloop within the
error handler.

Proposed solution
-----------------

Several (many) possibilities were evaluated and explored to fix this.
The proposed patch is the result of many attempts, observations, and
adjustments to initial proposals. In a gist, it proposes:

* Executing the response part of the runloop in the error_page location.
* Ensure that if the request part of the runloop did not have a chance
  to run, we do build the list of plugins to execute (thus, global plugins
  only since the request was not processed).
* Ensure that if the request part of the runloop did run, we preserve
  its `ngx.ctx` values.

A few catches:

* Currently, `ngx.req.get_method()` in the `error_page` location block
  will return `GET` even if the request method was different (only
  `HEAD` is preserved). A follow up patch for Nginx is being prepared.
  As such, several tests are currently marked as "Pending".
* When the `rewrite` phase did not get a chance to run, some parts of
  the request may not have been parsed by Nginx (e.g. on HTTP 414), thus
  some variables, such as `$request_uri` or request headers may not have
  been parsed. Code should be written defensively against such
  possibilities, and logging plugins may report incomplete request data
  (but will still report the produced error).
* The `uninitialized_variable_warn` directive must be disabled in the
  `error_page` location block for cases when the `rewrite` phase did not
  get a chance to run (Nginx-produced 4xx clients errors).
* The `ngx.ctx` variables must be copied by reference for plugins to be
  able to access it in the response phases when the request was
  short-circuited by Nginx (e.g. HTTP 502/504 from the upstream module).
* Using a named location for the `error_page` handler is not possible,
  since in some cases the URI might not be parsed by Nginx, and thus Nginx
  refuses to call named locations in such cases (e.g. on HTTP 414).

Resulting behavior
------------------

The `03-plugins_triggering_spec.lua` test suite is now a good reference
for the behavior observed upon Nginx-produced errors. This behavior can
be classified in 2 categories:

1. When the request is not parsed by Nginx (HTTP 4xx)
    a. The `error_page` directive is called (`rewrite` and `access` are
       skipped)
    b. The runloop retrieves the list of **global** plugins to run (it
       cannot retrieve more specific plugins since the request could not be
       parsed)
    c. The error content handler overrides the Nginx body with a Kong-friendly
       one (unchanged from previous behavior)
    d. It executes the `header_filter` phase of each global plugin.
    e. It executes the `body_filter` phase of each global plugin.
    f. It executes the `log` phase of each global plugin.

2. When the error was produced during upstream module execution (HTTP 5xx)
    a. The `rewrite` and `access` phases ran normally (we stashed the
       `ngx.ctx` reference)
    b. The `error_page` directive is called
    c. The error content handler applies the original `ngx.ctx` by
       reference
    d. It does **not** run the plugins loop, since `ngx.ctx` contains
       the list of plugins to run (global **and** specific plugins in
       this case)
    e. The error content handler overrides the Nginx body with a Kong-friendly
       one (unchanged from previous behavior)
    f. It executes the `header_filter` phase of each global plugin.
    g. It executes the `body_filter` phase of each global plugin.
    h. It executes the `log` phase of each global plugin.

Fix #3193
From #3533
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
task/feature Requests for new features in Kong
Projects
None yet
Development

No branches or pull requests