-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve string terms aggregation performance using Collector#setWeight #11643
Conversation
❌ Gradle check result for 006f404: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❌ Gradle check result for 6667d18: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Compatibility status:Checks if related components are compatible with change cdc4204 Incompatible componentsSkipped componentsCompatible componentsCompatible components: [https://github.com/opensearch-project/custom-codecs.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/flow-framework.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/sql.git] |
❌ Gradle check result for ce1082c: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
...ava/org/opensearch/search/aggregations/bucket/terms/GlobalOrdinalsStringTermsAggregator.java
Outdated
Show resolved
Hide resolved
...ava/org/opensearch/search/aggregations/bucket/terms/GlobalOrdinalsStringTermsAggregator.java
Outdated
Show resolved
Hide resolved
...ava/org/opensearch/search/aggregations/bucket/terms/GlobalOrdinalsStringTermsAggregator.java
Outdated
Show resolved
Hide resolved
❌ Gradle check result for a5b3baa: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❌ Gradle check result for e4e0b3c: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❌ Gradle check result for 851b759: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❌ Gradle check result for e005a9c: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
To test the performance improvements, I had edited one of the sub-aggregation search body to a simple term aggregation, like this. Basically in OSB, my search body/workload looks like this:
The changes are run on 2.11 cluster: Without Changes:
With Current Changes:
Clearly 4x (p100) - 6x (p90) improvements can be seen. @msfroh I'm working next to see if I can trim in more corners in implementation, refactor further and relevant cases, but please feel free to take initial look and provide comments. Also, with OSB, I will open up a separate issue with OSB workload to incorporate vanilla term aggregations in their workloads since currently we do not have any term aggregations workload like the one I tested. |
❌ Gradle check result for 6d05716: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❌ Gradle check result for 2a9cce7: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❌ Gradle check result for 82d1532: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❌ Gradle check result for 58716d2: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❌ Gradle check result for 8922b88: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❌ Gradle check result for b555118: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Sandesh Kumar <[email protected]>
Signed-off-by: Sandesh Kumar <[email protected]>
Signed-off-by: Sandesh Kumar <[email protected]>
Signed-off-by: Sandesh Kumar <[email protected]>
Signed-off-by: Sandesh Kumar <[email protected]>
@msfroh Thanks for the comments, made the rewording related changes as you pointed out. Let me know if it looks more accurate now. |
…onally match-all for a segment (#11643) --------- Signed-off-by: Sandesh Kumar <[email protected]> (cherry picked from commit 7dac98c) Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
…onally match-all for a segment (opensearch-project#11643) --------- Signed-off-by: Sandesh Kumar <[email protected]>
…query is functionally match-all for a segment (#12629) Quickly compute terms aggregations when the top-level query is functionally match-all for a segment (#11643) --------- Signed-off-by: Sandesh Kumar <[email protected]>
…onally match-all for a segment (opensearch-project#11643) --------- Signed-off-by: Sandesh Kumar <[email protected]>
…onally match-all for a segment (opensearch-project#11643) --------- Signed-off-by: Sandesh Kumar <[email protected]> Signed-off-by: Shivansh Arora <[email protected]>
Description
Utilize Collector#setWeight to smartly short-circuit certain aggregation paths. Basically cases when weight#count does not returns
-1
:weight#count > 0
&weight#count == maxdocs
in segments -> can leverage reading from termsEnumCases accounted for (for which the optimization will not work):
Related Issues
Resolves #10954
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.