Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Force using the last centroid during merging #111644

Merged
merged 5 commits into from
Aug 8, 2024
Merged

Conversation

kkrik-es
Copy link
Contributor

@kkrik-es kkrik-es commented Aug 6, 2024

The MergingDigest uses pre-allocated arrays for constant memory footprint and improved performance. Points get added to an intermediate buffer that gets merged to the canonical arrays of means and weights (constituting the centroids). During merging, the values in the latter arrays get updated with the merged values. However, the current logic may try to update a value after the array bound, for values in the very low or very high percentiles. This PR fixes this.

Fixes #111065

@kkrik-es kkrik-es added >bug :Analytics/Aggregations Aggregations Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) auto-backport-and-merge v8.15.1 labels Aug 6, 2024
@kkrik-es kkrik-es requested a review from not-napoleon August 6, 2024 15:59
@kkrik-es kkrik-es self-assigned this Aug 6, 2024
@elasticsearchmachine
Copy link
Collaborator

Hi @kkrik-es, I've created a changelog YAML for you.

@@ -152,7 +152,7 @@ public void testQuantile() {
hist2.compress();
double x1 = hist1.quantile(0.5);
double x2 = hist2.quantile(0.5);
assertEquals(Dist.quantile(0.5, data), x1, 0.2);
assertEquals(Dist.quantile(0.5, data), x1, 0.25);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is orthogonal, noticed it in CI: https://gradle-enterprise.elastic.co/s/36zxzfevfwrcc

Verified that it happens regardless of the change in MergingDigest.

@kkrik-es kkrik-es marked this pull request as ready for review August 6, 2024 18:05
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@kkrik-es
Copy link
Contributor Author

kkrik-es commented Aug 8, 2024

Thanks Mark!

@kkrik-es kkrik-es merged commit 0688cc2 into elastic:main Aug 8, 2024
15 checks passed
weizijun added a commit to weizijun/elasticsearch that referenced this pull request Aug 9, 2024
* upstream/main: (22 commits)
  Prune changelogs after 8.15.0 release
  Bump versions after 8.15.0 release
  EIS integration (elastic#111154)
  Skip LOOKUP/INLINESTATS cases unless on snapshot (elastic#111755)
  Always enforce strict role validation (elastic#111056)
  Mute org.elasticsearch.xpack.esql.analysis.VerifierTests testUnsupportedAndMultiTypedFields elastic#111753
  [ML] Force time shift integration test (elastic#111620)
  ESQL: Add tests for sort, where with unsupported type (elastic#111737)
  [ML] Force time shift documentation (elastic#111668)
  Fix remote cluster credential secure settings reload   (elastic#111535)
  ESQL: Fix for overzealous validation in case of invalid mapped fields (elastic#111475)
  Pass allow security manager flag in gradle test policy setup plugin (elastic#111725)
  Rename streamContent/Separator to bulkContent/Separator (elastic#111716)
  Mute org.elasticsearch.tdigest.ComparisonTests testSparseGaussianDistribution elastic#111721
  Remove 8.14 from branches.json
  Only emit product origin in deprecation log if present (elastic#111683)
  Forward port release notes for v8.15.0 (elastic#111714)
  [ES|QL] Combine Disjunctive CIDRMatch (elastic#111501)
  ESQL: Remove qualifier from attrs (elastic#110581)
  Force using the last centroid during merging (elastic#111644)
  ...

# Conflicts:
#	server/src/main/java/org/elasticsearch/TransportVersions.java
#	x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/InferenceNamedWriteablesProvider.java
@kkrik-es kkrik-es deleted the fix/111065 branch August 19, 2024 11:35
elasticsearchmachine pushed a commit that referenced this pull request Aug 26, 2024
* [8.15] Force using the last centroid during merging

* update test
cbuescher pushed a commit to cbuescher/elasticsearch that referenced this pull request Sep 4, 2024
* Force using the last centroid during merging

* Update docs/changelog/111644.yaml

* spotless

* add asserts in test
davidkyle pushed a commit to davidkyle/elasticsearch that referenced this pull request Sep 5, 2024
* Force using the last centroid during merging

* Update docs/changelog/111644.yaml

* spotless

* add asserts in test
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/Aggregations Aggregations >bug Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v8.15.1 v8.16.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ArrayIndexOutOfBoundsException in TDigest library
3 participants