[FEATURE] Add support in the SDK to retrieve IndexingPressure information from OpenSearch #655
Comments
This is available in the Nodes API:
Should be a quick add of the appropriate client call in SDK client similar to Field Mappings. |
See example:
"indexing_pressure" : {
  "memory" : {
    "current" : {
      "combined_coordinating_and_primary_in_bytes" : 0,
      "coordinating_in_bytes" : 0,
      "primary_in_bytes" : 0,
      "replica_in_bytes" : 0,
      "all_in_bytes" : 0
    },
    "total" : {
      "combined_coordinating_and_primary_in_bytes" : 40256,
      "coordinating_in_bytes" : 40256,
      "primary_in_bytes" : 45016,
      "replica_in_bytes" : 0,
      "all_in_bytes" : 40256,
      "coordinating_rejections" : 0,
      "primary_rejections" : 0,
      "replica_rejections" : 0
    },
    "limit_in_bytes" : 53687091
  }
}, |
Have tried and failed a few approaches, recording here for posterity:
This leaves two options:
So assuming I go with option 1, I'm thinking of creating an SDK-side Transport Action that does the request. However, that leads to the follow-on question: what are we going to do with this information? In the AD application, the indexing pressure is considered on the node which is currently executing the request. In the AD Extension we're processing data on a remote node and sending it back via REST calls, and we don't know which node will handle them (Hello Hash Ring?). So I'm thinking this issue needs to be paused pending the Hash Ring implementation. Thoughts, anyone? |
Useful blog post, I think this is the way forward: https://opensearch.org/blog/shard-indexing-backpressure-in-opensearch Current code is local-node based and just measures memory as a signal of whether to index. Instead we should query the REST APIs documented in that post and use the unthrottled/soft limit thresholds as our criterion. |
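A minimal sketch of that criterion, assuming the extension has already fetched the memory.current.all_in_bytes and memory.limit_in_bytes values shown in the example earlier in this thread. The class name, the method names and the 0.6 soft-limit fraction used below are hypothetical placeholders, not values taken from OpenSearch or from the blog post:

```java
/** Hypothetical holder for two fields of the indexing_pressure "memory" stats block. */
final class IndexingPressureSnapshot {
    private final long currentAllInBytes; // memory.current.all_in_bytes
    private final long limitInBytes;      // memory.limit_in_bytes

    IndexingPressureSnapshot(long currentAllInBytes, long limitInBytes) {
        if (currentAllInBytes < 0 || limitInBytes <= 0) {
            throw new IllegalArgumentException("byte counts must be non-negative and the limit positive");
        }
        this.currentAllInBytes = currentAllInBytes;
        this.limitInBytes = limitInBytes;
    }

    /** Fraction of the node's indexing memory limit that is currently in use. */
    double utilization() {
        return (double) currentAllInBytes / limitInBytes;
    }

    /** True when utilization has reached the caller-supplied soft limit (an assumed threshold). */
    boolean shouldDeferIndexing(double softLimitFraction) {
        return utilization() >= softLimitFraction;
    }
}
```

With the numbers from the example, new IndexingPressureSnapshot(0L, 53687091L).shouldDeferIndexing(0.6) would report that indexing can proceed. Which threshold to pass in is exactly the open question here.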
Putting this issue back into the backlog for now. Future plans:
|
Isn't this issue a blocker for multi entity detectors? |
Yes, Indexing pressure is needed for the |
Not if throttling is not needed. If it is, we can look at the above API to determine what thresholds to use. Existing code assumes you are on the node doing the work, not injecting via API. |
Summary: replicating AD code has no meaning on an extension. We may need a different API call to decide whether to skip some bulk indexing, but I don't know what thresholds we should use with the new call. We should try performance testing without any limit. If we end up needing to add one, the test will help us know what to use. |
Is your feature request related to a problem?
The ADResultBulkTransportAction handles bulk indexing requests to OpenSearch to index Anomaly Results. This is used for multi-entity detectors.
This particular transport action requires an object of type IndexingPressure to be injected via Guice, which tracks the incoming index requests per shard/node in the cluster and provides memory accounting. Anomaly Detection uses this indexing pressure to calculate an indexing pressure limit. This calculation is then used to determine whether to continue indexing Anomaly Results or to queue them for later.
What solution would you like?
The SDK should provide a mechanism to request the state of the IndexingPressure instantiated in OpenSearch and enable extensions to retrieve this information from the ExtensionRunner to perform these calculations. This workflow should follow a similar design to the SDK's sendClusterStateRequest.
Here is a high-level overview of what the workflow should look like:
ExtensionRunner should define a method called sendIndexingPressureRequest(TransportService) that instantiates an IndexingPressureResponseHandler and uses the transport service to send a request to the ExtensionsManager.
ExtensionsManager should include the IndexingPressureService as a class field and provide a method to set this field after the IndexingPressureService is instantiated in Node.java.
IndexingPressureService's ShardIndexingPressure object,
Do you have any additional context?
Within Node.java, the IndexingPressureService is instantiated and is then bound to Guice. Internally this includes an object of type ShardIndexingPressure that extends the IndexingPressure class.
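For illustration, the request/response workflow proposed under "What solution would you like?" could be sketched roughly as follows. This is a self-contained toy, not SDK or OpenSearch code: the Transport interface, the classes with a Sketch suffix, the IndexingPressureResponse fields, the action name string and the use of a Supplier in place of the real IndexingPressureService are all hypothetical stand-ins.

```java
import java.util.function.Supplier;

/** Hypothetical response payload carrying two indexing pressure values. */
final class IndexingPressureResponse {
    final long currentBytes;
    final long limitBytes;

    IndexingPressureResponse(long currentBytes, long limitBytes) {
        this.currentBytes = currentBytes;
        this.limitBytes = limitBytes;
    }
}

/** Collects the response sent back by the manager side. */
final class IndexingPressureResponseHandler {
    private IndexingPressureResponse response;

    void handleResponse(IndexingPressureResponse response) {
        this.response = response;
    }

    IndexingPressureResponse getResponse() {
        return response;
    }
}

/** Hypothetical stand-in for the transport layer; not the OpenSearch TransportService API. */
interface Transport {
    void sendRequest(String action, IndexingPressureResponseHandler handler);
}

/** Extension side: creates the handler and sends the request over the transport. */
final class ExtensionRunnerSketch {
    static final String ACTION = "indexing-pressure-request"; // hypothetical action name

    IndexingPressureResponseHandler sendIndexingPressureRequest(Transport transport) {
        IndexingPressureResponseHandler handler = new IndexingPressureResponseHandler();
        transport.sendRequest(ACTION, handler);
        return handler;
    }
}

/** OpenSearch side: holds the service as a field that is set after node startup. */
final class ExtensionsManagerSketch {
    // A Supplier stands in for the real IndexingPressureService (assumption).
    private Supplier<IndexingPressureResponse> indexingPressureService;

    void setIndexingPressureService(Supplier<IndexingPressureResponse> service) {
        this.indexingPressureService = service;
    }

    void handleIndexingPressureRequest(IndexingPressureResponseHandler handler) {
        if (indexingPressureService == null) {
            throw new IllegalStateException("indexing pressure service has not been set");
        }
        handler.handleResponse(indexingPressureService.get());
    }
}
```

In this toy, a lambda such as (action, handler) -> manager.handleIndexingPressureRequest(handler) can play the role of the transport, so the round trip completes synchronously. A real implementation would be asynchronous and would need to serialize the response.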