Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix flaky ModelInsight tests #395

Merged
merged 7 commits into from
Aug 30, 2019
Merged

Fix flaky ModelInsight tests #395

merged 7 commits into from
Aug 30, 2019

Conversation

TuanNguyen27
Copy link
Collaborator

@codecov
Copy link

codecov bot commented Aug 27, 2019

Codecov Report

Merging #395 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master     #395   +/-   ##
=======================================
  Coverage   86.82%   86.82%           
=======================================
  Files         336      336           
  Lines       10962    10962           
  Branches      572      572           
=======================================
  Hits         9518     9518           
  Misses       1444     1444

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9815b59...b28865c. Read the comment docs.

@@ -755,8 +755,8 @@ class ModelInsightsTest extends FlatSpec with PassengerSparkFixtureTest with Dou
val bigCoeffSum = orginalbigCoeff * math.sqrt(smallFeatureVariance) / labelStd + descaledbigCoeff
val absError2 = math.abs(originalsmallCoeff * math.sqrt(bigFeatureVariance) / labelStd - descaledsmallCoeff)
val smallCoeffSum = originalsmallCoeff * math.sqrt(bigFeatureVariance) / labelStd + descaledsmallCoeff
absError / bigCoeffSum < tol shouldBe true
absError2 / smallCoeffSum < tol shouldBe true
absError / (2 * bigCoeffSum) < tol shouldBe true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you hardcode this change instead of adjusting the tolerance? Is there a reason for a factor of 2 here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think back when i wrote this test you suggested abs(x_1 - x_2) / 2 * (x_1 + x_2) as one way to compute how close x_1 is to x_2, or did I misremember?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggested difference / avg. value, which is abs(x_1 - x_2) * 2 / (x_1 + x_2). It probably makes more sense to change it to that, but you should probably increase the overall tolerance too to cut down on flakiness.

@@ -755,8 +755,8 @@ class ModelInsightsTest extends FlatSpec with PassengerSparkFixtureTest with Dou
val bigCoeffSum = orginalbigCoeff * math.sqrt(smallFeatureVariance) / labelStd + descaledbigCoeff
val absError2 = math.abs(originalsmallCoeff * math.sqrt(bigFeatureVariance) / labelStd - descaledsmallCoeff)
val smallCoeffSum = originalsmallCoeff * math.sqrt(bigFeatureVariance) / labelStd + descaledsmallCoeff
absError / bigCoeffSum < tol shouldBe true
absError2 / smallCoeffSum < tol shouldBe true
2 * absError / bigCoeffSum < tol shouldBe true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand your fix correctly that you double the left-hand side the test would also be fixed by:

absError should be < tol * smallCoeffSum

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

at any rate please use "should be < tol" as it seems to read more clear.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that would also provide a better error message ;) cool!

@tovbinm tovbinm merged commit b91ffe3 into master Aug 30, 2019
@tovbinm tovbinm deleted the tn/fix_flaky branch August 30, 2019 22:04
@gerashegalov gerashegalov mentioned this pull request Sep 8, 2019
gerashegalov added a commit that referenced this pull request Sep 11, 2019
Bug fixes:
- Ensure correct metrics despite model failures on some CV folds [#404](#404)
- Fix flaky `ModelInsight` tests [#395](#395)
- Avoid creating `SparseVector`s for LOCO [#377](#377)

New features / updates:
- Model combiner [#385](#399)
- Added new sample for HousingPrices [#365](#365)
- Test to verify that custom metrics appear in model insight metrics [#387](#387)
- Add `FeatureDistribution` to `SerializationFormat`s [#383](#383)
- Add metadata to `OpStandadrdScaler` to allow for descaling [#378](#378)
- Improve json serde error in `evalMetFromJson` [#380](#380)
- Track mean & standard deviation as metrics for numeric features and for text length of text features [#354](#354)
- Making model selectors robust to failing models [#372](#372)
- Use compact and compressed model json by default [#375](#375)
- Descale feature contribution for Linear Regression & Logistic Regression [#345](#345)

Dependency updates:   
- Update tika version [#382](#382)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants