[ML] Fix influencer count and influence calculation #150

hendrikmuhs · 2018-07-10T13:05:28Z

This fixes the counting of influencers per bucket. Prior to this fix, the count was always set to 1.

Notes:

The fix is the 1st commit; the 2nd and 3rd add test coverage and improve documentation and robustness.
It turned out that this functionality already had a test (testVarp), but the test was never added to the test suite.
The change affects results if influencers are used, because influencer scores now change.

Fixes #24

Release note: Fixes influence count per bucket for metric population analyses, which was
wrong and led to incorrect influencer scoring

hendrikmuhs · 2018-07-10T13:06:11Z

lib/model/CProbabilityAndInfluenceCalculator.cc

-                               params, influencedValue[0]);
+        if (computeInfluencedValue(value, count, i->second.first, i->second.second,
+                                   params, influencedValue[0]) == false) {
+            LOG_ERROR(<< "Failed to compute influencer value");


@tveasey do you have a better idea for this error message?

I think, at least for us, it is very useful to see the function arguments, i.e. value, count, i->second.first and i->second.second.
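To illustrate the suggestion above, here is a minimal sketch of a message that carries the function arguments. The helper name `describeInfluenceFailure`, its parameter names, and the use of `std::vector<double>` in place of the project's vector types are assumptions for illustration only; in the real code the resulting text would presumably be streamed into `LOG_ERROR`.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of the code under review): builds an error
// message that includes every input of the failed influence computation.
std::string describeInfluenceFailure(const std::vector<double>& value,
                                     double count,
                                     const std::vector<double>& influencerValue,
                                     double influencerCount) {
    // Prints a vector as "[a, b, c]".
    auto print = [](std::ostringstream& os, const std::vector<double>& v) {
        os << '[';
        for (std::size_t i = 0; i < v.size(); ++i) {
            os << (i > 0 ? ", " : "") << v[i];
        }
        os << ']';
    };

    std::ostringstream message;
    message << "Failed to compute influencer value: value = ";
    print(message, value);
    message << ", count = " << count << ", influencer value = ";
    print(message, influencerValue);
    message << ", influencer count = " << influencerCount;
    return message.str();
}
```

With the arguments in the message, a failing bucket can be diagnosed from the log alone, without reproducing the job.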

tveasey

LGTM. Left a couple of suggestions only.

tveasey · 2018-07-11T08:54:27Z

lib/model/CProbabilityAndInfluenceCalculator.cc

    }
 };

 //! \brief Computes the value of the variance statistic on a set difference.
 class CVarianceDifference {
 public:
    //! Features.
-    void operator()(const TDouble1Vec& v,


Maybe document the parameters to this function too? i.e. v == overall variance and mean, n == overall count, vi == influencer variance and mean, etc.
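As a sketch of what such parameter documentation could describe, the following self-contained example computes a variance on a set difference from summary statistics. It is not the `CVarianceDifference` implementation: the `SMoments` struct, the function name `varianceDifference`, and the use of population variance (dividing by the count) are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical stand-in for a (mean, variance) pair; not the ml-cpp types.
struct SMoments {
    double s_Mean;
    double s_Variance; // population variance (assumption)
};

//! Computes the mean and variance of the values that remain after removing
//! an influencer's values from the overall set.
//!
//! \param[in] v The overall mean and variance.
//! \param[in] n The overall count.
//! \param[in] vi The influencer mean and variance.
//! \param[in] ni The influencer count.
//! \param[out] result The mean and variance of the set difference.
//! \return False if no values remain or the inputs are inconsistent.
bool varianceDifference(const SMoments& v, double n,
                        const SMoments& vi, double ni,
                        SMoments& result) {
    double nr = n - ni; // count of the remaining values
    if (ni < 0.0 || nr <= 0.0) {
        return false;
    }
    // Sum and sum of squares are additive, so subtract the influencer's share.
    double mean = (n * v.s_Mean - ni * vi.s_Mean) / nr;
    double sumSquares = n * (v.s_Variance + v.s_Mean * v.s_Mean) -
                        ni * (vi.s_Variance + vi.s_Mean * vi.s_Mean);
    double variance = sumSquares / nr - mean * mean;
    if (!std::isfinite(mean) || !std::isfinite(variance)) {
        return false;
    }
    // Clamp small negative values caused by floating point cancellation.
    result = SMoments{mean, std::max(variance, 0.0)};
    return true;
}
```

For example, with overall values {1, 2, 3, 4, 5, 6} and influencer values {5, 6}, the remaining set is {1, 2, 3, 4}, whose mean is 2.5 and population variance is 1.25. Returning `false` when the counts are inconsistent gives the caller something to check and log.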

tveasey · 2018-07-11T08:57:49Z

docs/CHANGELOG.asciidoc

@@ -59,6 +59,7 @@ Age seasonal components in proportion to the fraction of values with which they'
 Persist and restore was missing some of the trend model state ({pull}#99[#99])
 Stop zero variance data generating a log error in the forecast confidence interval calculation ({pull}#107[#107])
 Fix corner case failing to calculate lgamma values and the correspoinding log errors ({pull}#126[#126])
+Influence count per bucket was wrong and lead to wrong influencer scoring ({pull}#150[#150])


On second thoughts, I'd say "Influence count per bucket for metric population analyses was wrong..."


Hendrik Muhs added 3 commits July 9, 2018 10:10

use the correct bucket count

f36bf21

reactivate unit test

dc4c4cb

add extra checks for influencer calculations

7fc0054

hendrikmuhs added >bug v7.0.0 :ml v6.4.0 affects-results labels Jul 10, 2018

hendrikmuhs requested a review from tveasey July 10, 2018 13:05

hendrikmuhs commented Jul 10, 2018


add changelog entry

a6f66aa

tveasey approved these changes Jul 11, 2018


tveasey reviewed Jul 11, 2018


address review comments

b8b3f01

hendrikmuhs merged commit d41de34 into elastic:master Jul 11, 2018

hendrikmuhs mentioned this pull request Jul 11, 2018

[6.4][ML] Fix influencer count and influence calculation #153

Merged

hendrikmuhs pushed a commit that referenced this pull request Jul 12, 2018

[ML] Fix influencer count and influence calculation (#150)

7410d96

Fix counting of influencer per bucket for metric population analyses, prior this fix the count has always been set to 1. Fixes #24
