fault: only check for overflow when performing faults #12843

snowp · 2020-08-27T15:27:56Z

This shuffles the check for overflow around so that we only increment
the overflow value when attempting to apply a fault instead of for
each request. This ensures that if applying a low % fault the overflow
stat will be in line with that, instead of counting each request once
the overflow threshold has been hit.
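
To illustrate the difference, here is a minimal, self-contained sketch (not the filter's actual code). `FaultStats`, `faultSelected`, and both counting functions are hypothetical names, the percentage check is replaced by a deterministic 1-in-N selection, and faults are assumed never to finish during the run.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the filter's stats; not Envoy's real types.
struct FaultStats {
  uint64_t active_faults = 0;   // gauge: faults currently in flight
  uint64_t faults_overflow = 0; // counter: faults skipped due to the limit
};

// Deterministic stand-in for the fault percentage: 1 request in every `one_in`.
inline bool faultSelected(uint64_t request_index, uint64_t one_in) {
  return request_index % one_in == 0;
}

// Behavior before this change, as described above: the overflow check runs
// for every request, whether or not a fault would have been applied.
inline uint64_t countOverflowPerRequest(uint64_t requests, uint64_t one_in,
                                        uint64_t max_active) {
  FaultStats stats;
  for (uint64_t i = 0; i < requests; ++i) {
    if (stats.active_faults >= max_active) {
      ++stats.faults_overflow;
      continue;
    }
    if (faultSelected(i, one_in)) {
      ++stats.active_faults; // assumption: the fault never completes
    }
  }
  return stats.faults_overflow;
}

// Behavior after this change: the overflow check runs only when a fault
// is actually about to be applied.
inline uint64_t countOverflowPerFaultAttempt(uint64_t requests, uint64_t one_in,
                                             uint64_t max_active) {
  FaultStats stats;
  for (uint64_t i = 0; i < requests; ++i) {
    if (!faultSelected(i, one_in)) {
      continue; // no fault for this request, so nothing can overflow
    }
    if (stats.active_faults >= max_active) {
      ++stats.faults_overflow;
      continue;
    }
    ++stats.active_faults; // assumption: the fault never completes
  }
  return stats.faults_overflow;
}
```

With 100 requests, a 1-in-10 fault, and a limit of 1 active fault, the per-request variant reports 99 overflows while the per-attempt variant reports 9.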

Signed-off-by: Snow Pettersen <[email protected]>

Risk Level: Medium
Testing: New UT, existing tests
Docs Changes: n/a
Release Notes: n/a
Fixes #12816


snowp · 2020-08-27T15:28:59Z

source/extensions/filters/http/fault/fault_filter.cc

+  // single request might increment the counter more than once if it tries to apply multiple faults,
+  // though it is also possible for it to fail the first check then succeed on the second (should
+  // another thread decrement the fault).
+  // TODO(snowp): Is this behavior ideal? Should we track whether we've rejected faults for this


This is the main open question in my mind; I'm not sure how important we feel this is.

I do not think that this is a big issue for us.

IMO it's probably fine either way. I would go with whatever is easiest and clearest to implement. Can you remove the TODO and make it clear what we have chosen? Also, I'm not sure if it's worth adding docs on the stat to make this more clear; up to you.

I think the docs imply the behavior I changed it to (it counts the number of faults that were skipped due to overflow), so I'll leave it at that. I updated the comment a bit and removed the TODO.
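
For readers following the thread, the two cases in the code comment can be sketched as below. This is a simplified, single-threaded model rather than the filter's code: `FaultCounters`, `tryStartFault`, and `finishFault` are hypothetical names, and another thread finishing a fault is modeled by calling `finishFault` between attempts.

```cpp
#include <cstdint>

// Hypothetical counters standing in for the active_faults gauge and the
// faults_overflow counter.
struct FaultCounters {
  uint64_t active_faults = 0;
  uint64_t faults_overflow = 0;
};

// One fault attempt: counts an overflow when the limit is already reached,
// otherwise takes a slot.
inline bool tryStartFault(FaultCounters& counters, uint64_t max_active) {
  if (counters.active_faults >= max_active) {
    ++counters.faults_overflow;
    return false;
  }
  ++counters.active_faults;
  return true;
}

// Models some other request completing its fault and freeing a slot.
inline void finishFault(FaultCounters& counters) {
  if (counters.active_faults > 0) {
    --counters.active_faults;
  }
}
```

If the limit is already reached and one request attempts two faults (say a delay and then an abort), `faults_overflow` goes up by two. If `finishFault` runs between the two attempts, the first attempt is counted as an overflow and the second succeeds.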

mattklein123 · 2020-08-27T16:19:36Z

@Augustyniak can you take a first pass?

Augustyniak · 2020-08-27T17:01:47Z

@Augustyniak can you take a first pass?

Sure.

Augustyniak

2 nit comments

Augustyniak · 2020-08-27T18:30:42Z

source/extensions/filters/http/fault/fault_filter.cc

+  // single request might increment the counter more than once if it tries to apply multiple faults,
+  // though it is also possible for it to fail the first check then succeed on the second (should
+  // another thread decrement the fault).
+  // TODO(snowp): Is this behavior ideal? Should we track whether we've rejected faults for this


I do not think that this is a big issue for us.

Augustyniak · 2020-08-27T18:36:47Z

source/extensions/filters/http/fault/fault_filter.cc

@@ -391,15 +401,29 @@ FaultFilterStats FaultFilterConfig::generateStats(const std::string& prefix, Sta
                                 POOL_GAUGE_PREFIX(scope, final_prefix))};
 }

-void FaultFilter::maybeIncActiveFaults() {
+bool FaultFilter::maybeDoFault() {


This function doesn't really perform a fault; it just decides whether a fault can be injected. What do you think about renaming it to something like canApplyFault()?

Yeah I tried to capture the fact that we also increment the gauge here, but I agree that it's a bit misleading. I'll go with your suggestion

Actually, I'll go with tryIncActiveFaults, which I think is a bit clearer.
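
The naming point is that the function both answers a question and has a side effect on the gauge. A rough sketch of that shape, under stated assumptions: `SharedFaultStats` and `FaultFilterSketch` are hypothetical, the slot is released in the destructor, and each filter instance holds at most one slot. None of this is taken from the actual diff.

```cpp
#include <cstdint>

// Hypothetical shared stats standing in for the filter's gauge and counter.
struct SharedFaultStats {
  uint64_t active_faults = 0;
  uint64_t faults_overflow = 0;
};

// Hypothetical per-request filter object (single-threaded model).
class FaultFilterSketch {
public:
  FaultFilterSketch(SharedFaultStats& stats, uint64_t max_active_faults)
      : stats_(stats), max_active_faults_(max_active_faults) {}

  // Releases the slot, if one was taken, when the request goes away.
  ~FaultFilterSketch() {
    if (holds_slot_) {
      --stats_.active_faults;
    }
  }

  FaultFilterSketch(const FaultFilterSketch&) = delete;
  FaultFilterSketch& operator=(const FaultFilterSketch&) = delete;

  // Returns true if a fault may be injected. On success the gauge is
  // incremented, which is why "try inc" describes it better than "can apply".
  bool tryIncActiveFaults() {
    if (holds_slot_) {
      return true; // assumption: at most one slot per filter instance
    }
    if (stats_.active_faults >= max_active_faults_) {
      ++stats_.faults_overflow;
      return false;
    }
    ++stats_.active_faults;
    holds_slot_ = true;
    return true;
  }

private:
  SharedFaultStats& stats_;
  const uint64_t max_active_faults_;
  bool holds_slot_ = false;
};
```

A caller would check the return value before injecting a delay or abort, and skip the fault when it is false.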

Augustyniak · 2020-08-27T21:33:57Z

test/extensions/filters/http/fault/fault_filter_test.cc


-  EXPECT_CALL(runtime_.snapshot_, getInteger("fault.http.max_active_faults", 0))
+  EXPECT_CALL(runtime_.snapshot_,


On line #528, can we verify that the active_faults counter is equal to 0 when there is a faults_overflow?

mattklein123

LGTM with small comment, thank you!

/wait

mattklein123 · 2020-08-28T18:50:47Z

source/extensions/filters/http/fault/fault_filter.cc

+  // single request might increment the counter more than once if it tries to apply multiple faults,
+  // though it is also possible for it to fail the first check then succeed on the second (should
+  // another thread decrement the fault).
+  // TODO(snowp): Is this behavior ideal? Should we track whether we've rejected faults for this


IMO it's probably fine either way. I would go with whatever is easiest and clearest to implement. Can you remove the TODO and make it clear what we have chosen? Also, I'm not sure if it's worth adding docs on the stat to make this more clear; up to you.

This shuffles the check for overflow around so that we only increment the overflow value when attempting to apply a fault instead of for each request. This ensures that if applying a low % fault the overflow stat will be in line with that, instead of counting each request once the overflow threshold has been hit.

Signed-off-by: Snow Pettersen <[email protected]>
Signed-off-by: Clara Andrew-Wani <[email protected]>

snowp added 3 commits August 27, 2020 15:16

comment

73b6a51

Signed-off-by: Snow Pettersen <[email protected]>

format

84af011

Signed-off-by: Snow Pettersen <[email protected]>

snowp requested a review from alyssawilk as a code owner August 27, 2020 15:27

snowp assigned Augustyniak and mattklein123 Aug 27, 2020

Augustyniak reviewed Aug 27, 2020

mattklein123 added the waiting label Aug 27, 2020

pr feedback

ea28541

Signed-off-by: Snow Pettersen <[email protected]>

repokitteh-read-only bot removed the waiting label Aug 28, 2020

Augustyniak previously approved these changes Aug 28, 2020

mattklein123 requested changes Aug 28, 2020

repokitteh-read-only bot added the waiting label Aug 28, 2020

remove todo

f49029f

Signed-off-by: Snow Pettersen <[email protected]>

snowp dismissed Augustyniak’s stale review via f49029f August 28, 2020 22:07

repokitteh-read-only bot removed the waiting label Aug 28, 2020

update comment

43aacc2

Signed-off-by: Snow Pettersen <[email protected]>

mattklein123 approved these changes Aug 28, 2020

snowp merged commit 2f68705 into envoyproxy:master Aug 29, 2020
