Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Percentiles aggregation validation checks for range #51871

Merged
merged 6 commits into from
Feb 11, 2020

Conversation

hendrikmuhs
Copy link

@hendrikmuhs hendrikmuhs commented Feb 4, 2020

disallow to specify percentile out of range [0,100]. This also fixes a problem in transform by failing validation if an invalid percentile configuration is used.

Note: The duplication check is a breaking change if you are picky (question is, if it can be found in the wild). -> duplication moved out to PR: TBD

For release notes: Please list it under aggs (percentiles in transforms is a new feature for 7.7.0)

relates to #51808

@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-analytics-geo (:Analytics/Aggregations)

@elasticmachine
Copy link
Collaborator

Pinging @elastic/ml-core (:ml/Transform)

@hendrikmuhs
Copy link
Author

run elasticsearch-ci/bwc
run elasticsearch-ci/default-distro

@hendrikmuhs hendrikmuhs force-pushed the percentile-aggregation-checks branch from 41fe6b8 to 442fb22 Compare February 5, 2020 20:13
@benwtrent benwtrent removed the :ml/Transform Transform label Feb 6, 2020
@hendrikmuhs hendrikmuhs added the :ml/Transform Transform label Feb 6, 2020
@nik9000 nik9000 self-requested a review February 6, 2020 16:37
throw new IllegalArgumentException("percent must be in [0,100], got [" + percent + "]: [" + name + "]");
}

if(percent == previousPercent) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should have an extra space after if and before (.

@nik9000
Copy link
Member

nik9000 commented Feb 6, 2020

Note: The duplication check is a breaking change if you are picky (question is, if it can be found in the wild).

I imagine it can happen, mostly in the case where folks build the percentiles with a tool. What does it look like when there are dups?

I certainly would feel more comfortable making the change in 8.0 only.

@hendrikmuhs
Copy link
Author

I imagine it can happen, mostly in the case where folks build the percentiles with a tool. What does it look like when there are dups?

  "aggregations" : {
    "percentile_field" : {
      "values" : {
        "1.0" : 0.0,
        "5.0" : 274.3552087114338,
        "25.0" : 3009.2182284054206,
        "50.0" : 5480.849716847411,
        "50.0" : 5480.849716847411,
        "75.0" : 8015.768454300776,
        "95.0" : 10010.003716690042,
        "99.0" : 17937.91377777778
      }
    }
  }

It depends on the json parser to accept this, it seems common behavior to silently remove the double entry and take the last value.

This issue exists now for a long time without no one noticed or at least not finding it problematic. I therefore agree we can wait for 8.0 to fix it.

I certainly would feel more comfortable making the change in 8.0 only.

I can split the PR:

  • the range validation is important for transforms and required for 7.7, this is a non-breaking change.
  • for the double percentiles output I added a workaround in [Transform] add support for percentile aggs #51808, so it won't break for transforms, I can move that part into a separate 8.0 only PR

@hendrikmuhs hendrikmuhs changed the title Percentiles aggregation validation checks for range and dups Percentiles aggregation validation checks for range Feb 7, 2020
@hendrikmuhs
Copy link
Author

I removed the duplication check. I will create a separate PR for this, 8.0 only.
However I require the changes from this PR, so I will do it after this has been merged.

Copy link
Member

@nik9000 nik9000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@hendrikmuhs hendrikmuhs merged commit 8b9e18e into elastic:master Feb 11, 2020
@hendrikmuhs hendrikmuhs deleted the percentile-aggregation-checks branch February 11, 2020 16:24
hendrikmuhs pushed a commit that referenced this pull request Feb 11, 2020
disallow to specify percentile out of range [0,100]. This also fixes a problem in transform by failing
validation if an invalid percentile configuration is used.
hendrikmuhs pushed a commit that referenced this pull request Feb 26, 2020
Disallow specifying the same percentile multiple times in percentiles
aggregation

Related: #51871
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants