Add training stats and library initialized stats #191

Merged
merged 10 commits into from
Nov 9, 2021

jmazanec15
Member

Description

This change adds the last few stats for the 1.2 release:

  1. faiss_initialized
  2. nmslib_initialized
  3. training_requests
  4. training_errors
  5. training_memory_usage
  6. training_memory_usage_percentage

It does not collect fine-grained, per-job training stats (i.e., how much memory a particular training job is using on a node). This is acceptable because, at the moment, only one training job can execute on a node at a time. This could be improved in the future.

The stats API response now looks like this:

{
  "_nodes" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "cluster_name" : "integTest",
  "circuit_breaker_triggered" : false,
  "model_index_status" : null,
  "nodes" : {
    "<node-id>" : {
      "graph_memory_usage_percentage" : 0.0,
      "graph_query_requests" : 0,
      "graph_memory_usage" : 0,
      "cache_capacity_reached" : false,
      "load_success_count" : 0,
      "training_memory_usage" : 0,
      "indices_in_cache" : { },
      "script_query_errors" : 0,
      "hit_count" : 0,
      "knn_query_requests" : 0,
      "total_load_time" : 0,
      "miss_count" : 0,
      "training_memory_usage_percentage" : 0.0,
      "graph_index_requests" : 0,
      "faiss_initialized" : false,
      "load_exception_count" : 0,
      "training_errors" : 0,
      "eviction_count" : 0,
      "nmslib_initialized" : false,
      "script_compilations" : 0,
      "script_query_requests" : 0,
      "graph_query_errors" : 0,
      "indexing_from_model_degraded" : false,
      "graph_index_errors" : 0,
      "training_requests" : 0,
      "script_compilation_errors" : 0
    }
  }
}
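As a rough illustration of how node-level values like the new training stats could be tracked, here is a minimal, self-contained sketch. It is not the plugin's implementation: the class `TrainingStatsSketch`, its method names, and the memory limit passed to the constructor are all hypothetical, and the percentage formula (usage divided by a node-wide limit) is an assumption. Only the stat names in the snapshot come from the response above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: none of these class or method names come from the k-NN plugin.
final class TrainingStatsSketch {
    private final AtomicLong trainingRequests = new AtomicLong();
    private final AtomicLong trainingErrors = new AtomicLong();
    // Node-wide total only; per-job usage is deliberately not tracked,
    // mirroring the limitation described in the PR description.
    private final AtomicLong trainingMemoryUsage = new AtomicLong();
    private final AtomicBoolean faissInitialized = new AtomicBoolean(false);
    // Assumed node-wide limit, in the same unit as the usage counter.
    private final long trainingMemoryLimit;

    TrainingStatsSketch(long trainingMemoryLimit) {
        if (trainingMemoryLimit <= 0) {
            throw new IllegalArgumentException("limit must be positive");
        }
        this.trainingMemoryLimit = trainingMemoryLimit;
    }

    void onTrainingRequest() { trainingRequests.incrementAndGet(); }

    void onTrainingError() { trainingErrors.incrementAndGet(); }

    // Called once the native library has been loaded successfully.
    void onFaissLoaded() { faissInitialized.set(true); }

    void reserveTrainingMemory(long size) { trainingMemoryUsage.addAndGet(size); }

    void releaseTrainingMemory(long size) { trainingMemoryUsage.addAndGet(-size); }

    // Assumed formula: current usage as a percentage of the limit.
    double trainingMemoryUsagePercentage() {
        return 100.0 * trainingMemoryUsage.get() / trainingMemoryLimit;
    }

    // Builds a per-node map keyed by the stat names shown in the response.
    Map<String, Object> snapshot() {
        Map<String, Object> stats = new LinkedHashMap<>();
        stats.put("training_requests", trainingRequests.get());
        stats.put("training_errors", trainingErrors.get());
        stats.put("training_memory_usage", trainingMemoryUsage.get());
        stats.put("training_memory_usage_percentage", trainingMemoryUsagePercentage());
        stats.put("faiss_initialized", faissInitialized.get());
        return stats;
    }

    public static void main(String[] args) {
        TrainingStatsSketch stats = new TrainingStatsSketch(1000L);
        stats.onTrainingRequest();
        stats.reserveTrainingMemory(250L);
        System.out.println(stats.snapshot());
    }
}
```

Because only one training job runs on a node at a time, a single node-wide counter is enough in this sketch; tracking per-job usage would need a map keyed by job, which the description leaves as future work.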

Issues Resolved

#151

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed as per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@jmazanec15 jmazanec15 requested a review from VijayanB November 9, 2021 18:05
Signed-off-by: John Mazanec <[email protected]>
@codecov-commenter
codecov-commenter commented Nov 9, 2021

Codecov Report

Merging #191 (1c3c154) into main (7a9d1e0) will decrease coverage by 0.08%.
The diff coverage is 75.80%.

❗ Current head 1c3c154 differs from pull request most recent head 47d84f3. Consider uploading reports for the commit 47d84f3 to get more accurate results

@@             Coverage Diff              @@
##               main     #191      +/-   ##
============================================
- Coverage     83.16%   83.08%   -0.09%     
- Complexity      846      858      +12     
============================================
  Files           122      123       +1     
  Lines          3706     3759      +53     
  Branches        358      359       +1     
============================================
+ Hits           3082     3123      +41     
- Misses          465      475      +10     
- Partials        159      161       +2     
Impacted Files Coverage Δ
...in/transport/TrainingJobRouterTransportAction.java 80.00% <ø> (ø)
...org/opensearch/knn/training/TrainingJobRunner.java 49.29% <0.00%> (-4.56%) ⬇️
...plugin/transport/TrainingModelTransportAction.java 83.87% <50.00%> (-8.44%) ⬇️
...rch/knn/index/memory/NativeMemoryCacheManager.java 94.05% <83.33%> (-2.46%) ⬇️
.../java/org/opensearch/knn/index/util/KNNEngine.java 100.00% <100.00%> (ø)
...java/org/opensearch/knn/index/util/KNNLibrary.java 84.61% <100.00%> (+0.37%) ⬆️
...main/java/org/opensearch/knn/jni/FaissService.java 85.71% <100.00%> (+2.38%) ⬆️
...ain/java/org/opensearch/knn/jni/NmslibService.java 85.71% <100.00%> (+2.38%) ⬆️
...va/org/opensearch/knn/plugin/stats/KNNCounter.java 100.00% <100.00%> (ø)
...rg/opensearch/knn/plugin/stats/KNNStatsConfig.java 96.87% <100.00%> (+0.72%) ⬆️
... and 2 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 7a9d1e0...47d84f3.

Signed-off-by: John Mazanec <[email protected]>
@jmazanec15 jmazanec15 requested a review from vamshin November 9, 2021 18:23
src/main/java/org/opensearch/knn/jni/FaissService.java (outdated, resolved)
- routeRequest(request, listener);
- }, listener::onFailure));
+ routeRequest(request, wrappedListener);
+ }, wrappedListener::onFailure));

Suggested change
- }, wrappedListener::onFailure));
+ }, ex -> {
+     KNNCounter.TRAINING_ERRORS.increment();
+     listener.onFailure(ex);
+ }));

Member Author

@jmazanec15 jmazanec15 Nov 9, 2021

In its new location, I think it makes sense for it to be a variable.
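The idea discussed in this thread, counting a training error whenever the failure path fires, can be sketched as a listener wrapper. This is a minimal sketch under stated assumptions, not the code that was merged: `TrainingListener` is a hypothetical stand-in for the OpenSearch listener type, and the `AtomicLong` stands in for `KNNCounter.TRAINING_ERRORS`.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a response/failure callback; not the OpenSearch type.
interface TrainingListener<T> {
    void onResponse(T response);

    void onFailure(Exception e);
}

final class ErrorCountingListenerSketch {
    // Stand-in for a counter such as KNNCounter.TRAINING_ERRORS.
    static final AtomicLong TRAINING_ERRORS = new AtomicLong();

    // Wraps a listener so that every failure is counted before it is passed on.
    static <T> TrainingListener<T> countFailures(TrainingListener<T> delegate) {
        return new TrainingListener<T>() {
            @Override
            public void onResponse(T response) {
                delegate.onResponse(response);
            }

            @Override
            public void onFailure(Exception e) {
                TRAINING_ERRORS.incrementAndGet();
                delegate.onFailure(e);
            }
        };
    }

    public static void main(String[] args) {
        TrainingListener<String> original = new TrainingListener<String>() {
            @Override
            public void onResponse(String response) {
                System.out.println("response: " + response);
            }

            @Override
            public void onFailure(Exception e) {
                System.out.println("failure: " + e.getMessage());
            }
        };

        // Held in a variable so several call sites can share the same wrapper.
        TrainingListener<String> wrappedListener = countFailures(original);
        wrappedListener.onResponse("trained");
        wrappedListener.onFailure(new IllegalStateException("training failed"));
        System.out.println("training_errors = " + TRAINING_ERRORS.get());
    }
}
```

Keeping the wrapper in a variable, as the author suggests, means every place that reports a failure goes through the same counting logic, whereas an inline lambda would only cover the one call site where it is written.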

@jmazanec15 jmazanec15 requested a review from VijayanB November 9, 2021 18:32
Signed-off-by: John Mazanec <[email protected]>
Signed-off-by: John Mazanec <[email protected]>
vamshin
vamshin previously approved these changes Nov 9, 2021
Member

@vamshin vamshin left a comment

LGTM! Thanks

Signed-off-by: John Mazanec <[email protected]>
@jmazanec15 jmazanec15 removed the request for review from vamshin November 9, 2021 19:54
@jmazanec15 jmazanec15 merged commit 5fcd818 into opensearch-project:main Nov 9, 2021
jmazanec15 added a commit to jmazanec15/k-NN-1 that referenced this pull request Nov 9, 2021
@jmazanec15 jmazanec15 added the Enhancements label (Increases software capabilities beyond original client specifications) Nov 15, 2021
martin-gaievski pushed a commit to martin-gaievski/k-NN that referenced this pull request Mar 7, 2022
martin-gaievski pushed a commit to martin-gaievski/k-NN that referenced this pull request Mar 7, 2022
martin-gaievski pushed a commit to martin-gaievski/k-NN that referenced this pull request Mar 30, 2022