renaming metrics #1224

dhrubo-os · 2023-08-19T01:53:08Z

Description

We renamed the node-level metrics:

ml_node_executing_task_count --> ml_executing_task_count
ml_node_total_model_count --> ml_deployed_model_count
ml_node_total_failure_count --> ml_failure_count
ml_node_total_circuit_breaker_trigger_count --> ml_circuit_breaker_trigger_count
ml_node_total_request_count --> ml_request_count
ml_node_jvm_heap_usage --> ml_jvm_heap_usage
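For anyone whose dashboards or alerts are keyed on the old names, the rename list above could be captured as a lookup table. This is only a sketch: `MetricRenames` and `translate` are hypothetical names used for illustration and are not part of ml-commons. Only the metric names themselves come from this PR.

```java
import java.util.Map;

// Hypothetical helper (not part of ml-commons): maps each old metric name
// from the PR description to its replacement.
class MetricRenames {
    static final Map<String, String> OLD_TO_NEW = Map.of(
        "ml_node_executing_task_count", "ml_executing_task_count",
        "ml_node_total_model_count", "ml_deployed_model_count",
        "ml_node_total_failure_count", "ml_failure_count",
        "ml_node_total_circuit_breaker_trigger_count", "ml_circuit_breaker_trigger_count",
        "ml_node_total_request_count", "ml_request_count",
        "ml_node_jvm_heap_usage", "ml_jvm_heap_usage"
    );

    // Returns the new name for a renamed metric, or the input unchanged otherwise.
    static String translate(String name) {
        return OLD_TO_NEW.getOrDefault(name, name);
    }
}
```

Names that were not renamed pass through unchanged, so the helper can be applied to a mixed list of old and new names.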

Issues Resolved

[List any issues this PR will resolve]

Check List

- New functionality includes testing.
  - All tests pass
- New functionality has been documented.
  - New functionality has javadoc added
- Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Dhrubo Saha <[email protected]>

jackiehanyang · 2023-08-21T18:15:44Z

Are we removing all NODE keywords from the metric names, e.g. changing ML_NODE_TOTAL_MODEL_COUNT to ML_TOTAL_MODEL_COUNT? What's the reason for that?

dhrubo-os · 2023-08-21T18:17:54Z

Previously we had only data nodes. But with the model serving framework we also introduced ML nodes, so keeping the ML_NODE prefix might confuse users.

jackiehanyang · 2023-08-21T18:42:35Z

LGTM. Could you check why all the builds are failing? I will approve once CI is passing.

dhrubo-os · 2023-08-21T18:46:55Z

It's failing here: we applied a node-level vs. cluster-level distinction in the stats, and now all of them are being treated as cluster-level stats.

Signed-off-by: Dhrubo Saha <[email protected]>

ylwu-amzn · 2023-08-21T19:16:08Z

plugin/src/main/java/org/opensearch/ml/rest/RestMLStatsAction.java

@@ -148,6 +149,10 @@ protected RestChannelConsumer prepareRequest(RestRequest request, NodeClient cli
    }

    MLStatsInput createMlStatsInputFromRequestParams(RestRequest request) {
+
+        Set<String> mlNodeStatNames = EnumSet.allOf(MLNodeLevelStat.class).stream()


Should we construct a new Set for every request?

We can construct the set once at class initialization. Let me do that.
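A minimal sketch of that change, assuming the set holds the lower-cased enum names (the diff line above is truncated, so the mapping step is a guess). `NodeLevelStatSketch` stands in for the plugin's `MLNodeLevelStat`, and `StatsRequestParserSketch` and `isNodeLevelStat` are hypothetical names:

```java
import java.util.EnumSet;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

// Stand-in for the plugin's enum; constants follow the renamed metrics in this PR.
enum NodeLevelStatSketch {
    ML_JVM_HEAP_USAGE,
    ML_EXECUTING_TASK_COUNT,
    ML_REQUEST_COUNT,
    ML_FAILURE_COUNT,
    ML_DEPLOYED_MODEL_COUNT,
    ML_CIRCUIT_BREAKER_TRIGGER_COUNT
}

class StatsRequestParserSketch {
    // Built once when the class is initialized, then shared by every request.
    // Assumption: stat names are compared in lower case.
    private static final Set<String> NODE_STAT_NAMES = EnumSet
        .allOf(NodeLevelStatSketch.class)
        .stream()
        .map(stat -> stat.name().toLowerCase(Locale.ROOT))
        .collect(Collectors.toUnmodifiableSet());

    // Per-request work is now a single set lookup instead of rebuilding the set.
    static boolean isNodeLevelStat(String requestedName) {
        return NODE_STAT_NAMES.contains(requestedName.toLowerCase(Locale.ROOT));
    }
}
```

Because the set is derived from the enum itself rather than from an `ML_NODE` name prefix, renaming a constant cannot silently reclassify it as a cluster-level stat.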

Signed-off-by: Dhrubo Saha <[email protected]>

ylwu-amzn · 2023-08-22T00:38:51Z

plugin/src/main/java/org/opensearch/ml/model/MLModelManager.java

@@ -392,9 +391,6 @@ private void indexRemoteModel(MLRegisterModelInput registerModelInput, MLTask ml
        String taskId = mlTask.getTaskId();
        FunctionName functionName = mlTask.getFunctionName();
        try (ThreadContext.StoredContext context = client.threadPool().getThreadContext().stashContext()) {
-            mlStats.getStat(MLNodeLevelStat.ML_REQUEST_COUNT).increment();
-            mlStats.createCounterStatIfAbsent(functionName, REGISTER, ML_ACTION_REQUEST_COUNT).increment();


This line tracks the number of register requests at the function level. If we remove it, can we still track that?

Yes, because we already track this in the parent function registerMLModel.
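To illustrate the reasoning, here is a sketch of counting at the entry point only. The class and method names are hypothetical stand-ins that loosely mirror `registerMLModel` and `indexRemoteModel`; this is not the plugin's code.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical stand-in for the manager class discussed above.
class RegisterCountSketch {
    private final LongAdder requestCount = new LongAdder();

    // Entry point: the only place where the request counter is incremented.
    void registerModel(boolean remote) {
        requestCount.increment();
        if (remote) {
            indexRemoteModel();
        } else {
            indexLocalModel();
        }
    }

    // The helpers do not increment again; doing so would count each request twice.
    private void indexRemoteModel() { }

    private void indexLocalModel() { }

    long requestCount() {
        return requestCount.sum();
    }
}
```

With the increment kept only in the parent, every register request is counted exactly once regardless of which helper handles it.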

Signed-off-by: Dhrubo Saha <[email protected]>

ylwu-amzn · 2023-08-22T01:09:17Z

plugin/src/main/java/org/opensearch/ml/stats/MLNodeLevelStat.java

@@ -11,7 +11,8 @@
 */
 public enum MLNodeLevelStat {
    ML_JVM_HEAP_USAGE,
-    ML_EXECUTING_TASK_COUNT,
+    ML_EXECUTING_TASK_COUNT, // How many tasks are executing currently. If any task starts, then it will be 1, if the task finished then it


"then it will be 1" -> "then it will increase by 1"

"will get back to 0" -> "will decrease by 1"?
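The suggested wording matches how a gauge of in-flight tasks behaves when several tasks overlap. A minimal sketch, with hypothetical names (`ExecutingTaskGaugeSketch`, `track`, `current`) and no claim about how ml-commons implements the stat:

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

// Hypothetical gauge: counts tasks that are currently executing.
class ExecutingTaskGaugeSketch {
    private final AtomicLong executing = new AtomicLong();

    // Increases the gauge by 1 when a task starts and decreases it by 1
    // when the task finishes, whether it succeeds or throws.
    <T> T track(Supplier<T> task) {
        executing.incrementAndGet();
        try {
            return task.get();
        } finally {
            executing.decrementAndGet();
        }
    }

    long current() {
        return executing.get();
    }
}
```

With two overlapping tasks the gauge reads 2, which is why "increase by 1" and "decrease by 1" describe it better than "will be 1" and "back to 0".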

Signed-off-by: Dhrubo Saha <[email protected]>

ylwu-amzn · 2023-08-22T01:11:58Z

plugin/src/main/java/org/opensearch/ml/action/register/TransportRegisterModelAction.java

@@ -234,12 +233,6 @@ private void registerModel(MLRegisterModelInput registerModelInput, ActionListen
                throw new IllegalArgumentException("URL can't match trusted url regex");
            }
        }
-        // mlStats.getStat(MLNodeLevelStat.ML_NODE_EXECUTING_TASK_COUNT).increment();
-        mlStats.getStat(MLNodeLevelStat.ML_NODE_TOTAL_REQUEST_COUNT).increment();


Why remove this line?

We are already counting this in the registerMLModel function of the MLModelManager class.

* renaming metrics
* updating tests
* updating test cases
* removing the ML_NODE checking for node level stats
* updating constructing new set
* spotless Apply
* updating ML_NODE_TOTAL_MODEL_COUNT to ML_DEPLOYED_MODEL_COUNT
* fixing metrics count
* spotless
* fixing executing task
* updating comment

---------

Signed-off-by: Dhrubo Saha <[email protected]>
(cherry picked from commit 86eb953)

renaming metrics

29d6398

Signed-off-by: Dhrubo Saha <[email protected]>

dhrubo-os had a problem deploying to ml-commons-cicd-env August 19, 2023 01:53 — with GitHub Actions Failure

dhrubo-os had a problem deploying to ml-commons-cicd-env August 19, 2023 01:53 — with GitHub Actions Error

dhrubo-os had a problem deploying to ml-commons-cicd-env August 21, 2023 17:12 — with GitHub Actions Failure

dhrubo-os had a problem deploying to ml-commons-cicd-env August 21, 2023 17:12 — with GitHub Actions Error

updating tests

1b703f9

Signed-off-by: Dhrubo Saha <[email protected]>

dhrubo-os had a problem deploying to ml-commons-cicd-env August 21, 2023 17:30 — with GitHub Actions Failure

dhrubo-os had a problem deploying to ml-commons-cicd-env August 21, 2023 17:30 — with GitHub Actions Error

updating test cases

8439f45

Signed-off-by: Dhrubo Saha <[email protected]>

dhrubo-os had a problem deploying to ml-commons-cicd-env August 21, 2023 17:55 — with GitHub Actions Error

dhrubo-os had a problem deploying to ml-commons-cicd-env August 21, 2023 17:55 — with GitHub Actions Failure

removing the ML_NODE checking for node level stats

30b1761

Signed-off-by: Dhrubo Saha <[email protected]>

dhrubo-os had a problem deploying to ml-commons-cicd-env August 21, 2023 19:12 — with GitHub Actions Error

dhrubo-os had a problem deploying to ml-commons-cicd-env August 21, 2023 19:12 — with GitHub Actions Failure

ylwu-amzn reviewed Aug 21, 2023

updating constructing new set

1a057eb

Signed-off-by: Dhrubo Saha <[email protected]>

ylwu-amzn reviewed Aug 22, 2023

fixing executing task

62b0e7b

Signed-off-by: Dhrubo Saha <[email protected]>

dhrubo-os temporarily deployed to ml-commons-cicd-env August 22, 2023 01:02 — with GitHub Actions Inactive

ylwu-amzn reviewed Aug 22, 2023

updating comment

0b61450

Signed-off-by: Dhrubo Saha <[email protected]>

dhrubo-os had a problem deploying to ml-commons-cicd-env August 22, 2023 01:11 — with GitHub Actions Error

dhrubo-os had a problem deploying to ml-commons-cicd-env August 22, 2023 01:11 — with GitHub Actions Failure

dhrubo-os temporarily deployed to ml-commons-cicd-env August 22, 2023 01:11 — with GitHub Actions Inactive

ylwu-amzn reviewed Aug 22, 2023

ylwu-amzn approved these changes Aug 22, 2023

dhrubo-os temporarily deployed to ml-commons-cicd-env August 22, 2023 01:30 — with GitHub Actions Inactive

dhrubo-os had a problem deploying to ml-commons-cicd-env August 22, 2023 01:30 — with GitHub Actions Failure

dhrubo-os temporarily deployed to ml-commons-cicd-env August 22, 2023 01:49 — with GitHub Actions Inactive

b4sjoo approved these changes Aug 22, 2023

dhrubo-os merged commit 86eb953 into opensearch-project:2.x Aug 22, 2023

dhrubo-os added the backport 2.9 label Aug 22, 2023

opensearch-trigger-bot bot mentioned this pull request Aug 22, 2023

[Backport 2.9] renaming metrics #1229

Merged
