bump rcf to 3.0-rc2.1 #519

Merged: amitgalitz merged 5 commits into opensearch-project:main on May 3, 2022

Conversation

amitgalitz
Member

@amitgalitz amitgalitz commented Apr 22, 2022

Signed-off-by: Amit Galitzky [email protected]

Description

  1. Bumped RCF to 3.0-rc2.1 from Maven.
  2. Changed V1JsonToV2StateConverter usage to V1JsonToV3StateConverter.
  3. Added a test that deserializes a model from an rc1 checkpoint and checks that, on the same data, the scores are the same whether the checkpoint is read with the rc2.1 dependency or the rc1 dependency.
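The comparison described in item 3 can be sketched roughly as follows. This is a minimal illustration, not the test added in this PR: `ScoreParityCheck`, `maxAbsDiff`, and `samplePoints` are hypothetical names, and the two scorers are stateless stand-ins for "model restored with rc1" and "model restored with rc2.1" (a real RCF model also updates its state as it scores).

```java
import java.util.Random;
import java.util.function.ToDoubleFunction;

/** Hypothetical helper; not part of the AD plugin or the RCF library. */
final class ScoreParityCheck {

    /** Largest absolute score difference between two scorers over the same points. */
    static double maxAbsDiff(ToDoubleFunction<double[]> a, ToDoubleFunction<double[]> b, double[][] points) {
        double max = 0.0;
        for (double[] p : points) {
            max = Math.max(max, Math.abs(a.applyAsDouble(p) - b.applyAsDouble(p)));
        }
        return max;
    }

    /** Deterministic test data: a fixed seed feeds both scorers identical input. */
    static double[][] samplePoints(int count, int dimensions, long seed) {
        Random random = new Random(seed);
        double[][] points = new double[count][dimensions];
        for (double[] p : points) {
            for (int i = 0; i < dimensions; i++) {
                p[i] = random.nextDouble();
            }
        }
        return points;
    }

    public static void main(String[] args) {
        double[][] points = samplePoints(100, 8, 42L);
        // Stand-ins (assumptions) for the same checkpoint restored by two library versions.
        ToDoubleFunction<double[]> restoredWithOld = p -> p[0] + p[1];
        ToDoubleFunction<double[]> restoredWithNew = p -> p[0] + p[1];
        double diff = maxAbsDiff(restoredWithOld, restoredWithNew, points);
        System.out.println(diff <= 1e-9 ? "scores match" : "scores diverge: " + diff);
    }
}
```

Feeding both scorers the same seeded data is what makes the comparison meaningful; the tolerance of 1e-9 is an arbitrary choice for the sketch.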

Other manual testing done:

  1. Real-time single stream and HCAD (create and start a detector, confirm it emits results, then stop and restart the cluster and confirm results continue to appear)
  2. Historical single stream and HCAD
  3. Backward compatibility tests: v1 to v3 model, v2 to v3 model (B/G simulation)
  4. Also answered this question: "Given a v2 model interpreted by rc1 and a v2 model interpreted by rc2, would they produce the same score (RCF score and grade)? We expect those should be the same." (Added a unit test for this.)
  5. Memory usage is 20-30% smaller on average; however, the memory formula is kept unchanged so that it still covers worst-case scenarios.

Check List

  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@amitgalitz amitgalitz requested a review from a team April 22, 2022 22:13
@amitgalitz amitgalitz changed the title bump rcf to 3.0-rc2.1 and add unit test bump rcf to 3.0-rc2.1 Apr 22, 2022
@opensearch-trigger-bot opensearch-trigger-bot bot added backport 2.x infra Changes to infrastructure, testing, CI/CD, pipelines, etc. labels Apr 22, 2022
@codecov-commenter

codecov-commenter commented Apr 22, 2022

Codecov Report

Merging #519 (9e33a6f) into main (be319d1) will decrease coverage by 0.01%.
The diff coverage is 100.00%.

@@             Coverage Diff              @@
##               main     #519      +/-   ##
============================================
- Coverage     79.02%   79.00%   -0.02%     
- Complexity     4195     4203       +8     
============================================
  Files           296      296              
  Lines         17663    17681      +18     
  Branches       1878     1877       -1     
============================================
+ Hits          13958    13969      +11     
- Misses         2806     2813       +7     
  Partials        899      899              
Flag Coverage Δ
plugin 79.00% <100.00%> (-0.02%) ⬇️

Impacted Files Coverage Δ
.../main/java/org/opensearch/ad/ml/CheckpointDao.java 69.55% <ø> (ø)
.../java/org/opensearch/ad/AnomalyDetectorPlugin.java 96.53% <100.00%> (ø)
...java/org/opensearch/ad/task/ADBatchTaskRunner.java 81.76% <0.00%> (-1.98%) ⬇️
...ansport/handler/AnomalyResultBulkIndexHandler.java 69.35% <0.00%> (-1.62%) ⬇️
...ain/java/org/opensearch/ad/task/ADTaskManager.java 76.67% <0.00%> (-0.23%) ⬇️
...rch/ad/transport/AnomalyResultTransportAction.java 80.82% <0.00%> (+0.68%) ⬆️
.../transport/SearchAnomalyResultTransportAction.java 84.11% <0.00%> (+11.07%) ⬆️

assertEquals(30, forest.getForest().getNumberOfTrees());
}

public void testDeserializeTRCFModel() throws Exception {
Collaborator

Could you add the test's purpose?

Member Author

added

@@ -1000,4 +1005,77 @@ private double[] getPoint(int dimensions, Random random) {
}
return point;
}

public void testDeserializeRCFModelPreINIT() throws Exception {
Collaborator

Are the following 2 tests used for single-stream detectors' checkpoints? If so, could you add comments?

Member Author

added

kaituo previously approved these changes Apr 25, 2022
@@ -367,7 +367,7 @@ public Collection<Object> createComponents(
mapper.setSaveExecutorContextEnabled(true);
mapper.setSaveTreeStateEnabled(true);
mapper.setPartialTreeStateEnabled(true);
- V1JsonToV2StateConverter converter = new V1JsonToV2StateConverter();
+ V1JsonToV3StateConverter converter = new V1JsonToV3StateConverter();
Collaborator

Do we need a V2 to V3 converter? How about adding some comments to explain what V1, V2, and V3 are?

Member Author

After testing and reading the RCF code, we don't need a converter from v2 to v3, since RCF handles that on its side. I also added a unit test that parses a v2 checkpoint with RCF 3.0-rc2.1 (v3) as the dependency and gets the same result as parsing it with rc1 (v2). I'll add some comments explaining v1, v2, and v3 in the CheckpointDao class.
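As a rough illustration of the routing this reply describes (only v1 needs an explicit conversion step, while v2 and v3 go straight to the library's reader), here is a minimal sketch. Everything in it is hypothetical: `CheckpointVersionRouting`, `ModelVersion`, and the two `Function` parameters are illustrative stand-ins, not the CheckpointDao code or the RCF API.

```java
import java.util.function.Function;

/** Hypothetical sketch; the class, enum, and parameter names are illustrative only. */
final class CheckpointVersionRouting {

    enum ModelVersion { V1, V2, V3 }

    /**
     * Routes a serialized checkpoint to a loader. Only v1 goes through an
     * explicit converter; v2 and v3 share the library's own reader, which is
     * assumed to handle the v2-to-v3 difference internally.
     */
    static <M> M load(ModelVersion version, String payload,
                      Function<String, M> v1Converter,
                      Function<String, M> libraryReader) {
        switch (version) {
            case V1:
                return v1Converter.apply(payload);
            case V2:
            case V3:
                return libraryReader.apply(payload);
            default:
                throw new IllegalArgumentException("Unknown version: " + version);
        }
    }

    public static void main(String[] args) {
        // String-tagging stand-ins (assumptions) so the chosen path is visible.
        Function<String, String> v1Converter = s -> "converted:" + s;
        Function<String, String> libraryReader = s -> "read:" + s;
        System.out.println(load(ModelVersion.V1, "a", v1Converter, libraryReader));
        System.out.println(load(ModelVersion.V2, "b", v1Converter, libraryReader));
    }
}
```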

kaituo previously approved these changes May 2, 2022
@@ -117,6 +117,14 @@ public class CheckpointDao {

private Gson gson;
private RandomCutForestMapper mapper;

// For further reference v1, v2 and v3 refer to the different variations of RCF models
// used by AD. v1 was originally used with the lunch of OS 1.0. We later converted to v2
Collaborator

lunch -> launch

Member Author

fixed

kaituo previously approved these changes May 2, 2022
ylwu-amzn previously approved these changes May 2, 2022
amitgalitz added 5 commits May 3, 2022 16:16
Signed-off-by: Amit Galitzky <[email protected]>
Signed-off-by: Amit Galitzky <[email protected]>
Signed-off-by: Amit Galitzky <[email protected]>
Signed-off-by: Amit Galitzky <[email protected]>
@amitgalitz amitgalitz dismissed stale reviews from ylwu-amzn and kaituo via d1dfefb May 3, 2022 16:16
@amitgalitz amitgalitz merged commit 8227e32 into opensearch-project:main May 3, 2022
opensearch-trigger-bot bot pushed a commit that referenced this pull request May 3, 2022
Signed-off-by: Amit Galitzky <[email protected]>
(cherry picked from commit 8227e32)
amitgalitz added a commit that referenced this pull request May 4, 2022
Signed-off-by: Amit Galitzky <[email protected]>
(cherry picked from commit 8227e32)
amitgalitz added a commit to amitgalitz/anomaly-detection-1 that referenced this pull request May 12, 2022
amitgalitz added a commit that referenced this pull request May 13, 2022
Signed-off-by: Amit Galitzky <[email protected]>
Labels
backport 2.x backport 2.0 infra Changes to infrastructure, testing, CI/CD, pipelines, etc.