Add edge segment size to filter out change points that are observed on the data edge #28780

Merged: 5 commits into apache:master on Oct 7, 2023

@AnandInguva AnandInguva commented Oct 2, 2023

Workaround for #28757.

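When a change point is observed on the edge of the data (within the most recent edge_segment_size runs), we can discard it for now. If the change point persists, it will still raise an alert, only after edge_segment_size more runs have been collected.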

Fixes: #28757
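
As a rough illustration of the idea (a minimal sketch, not the code in this PR): a change point is treated as an edge change point when it falls within the last edge_segment_size runs of the series. The exact signature, the 0-based index convention, and the value 3 are assumptions made for the example.

```python
# Hypothetical value for illustration; the PR reads the real one from
# constants._EDGE_SEGMENT_SIZE.
EDGE_SEGMENT_SIZE = 3


def is_edge_change_point(change_point_index, data_size,
                         edge_segment_size=EDGE_SEGMENT_SIZE):
    """Return True if the change point lies in the trailing edge segment.

    Assumed convention: indices are 0-based and the last
    `edge_segment_size` positions of the series form the edge.
    """
    return change_point_index >= data_size - edge_segment_size


# With 10 runs and an edge segment of 3, indices 7, 8 and 9 are on the edge.
print(is_edge_change_point(8, data_size=10))  # True
print(is_edge_change_point(5, data_size=10))  # False
```

A change point at index 8 is suppressed today, but after three more runs the series has 13 points and index 8 is no longer on the edge, so the alert is delayed rather than lost.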


@github-actions github-actions bot added the python label Oct 2, 2023
codecov bot commented Oct 3, 2023

Codecov Report

Merging #28780 (c72b47d) into master (31e1c7a) will decrease coverage by 0.03%.
Report is 36 commits behind head on master.
The diff coverage is 12.50%.

@@            Coverage Diff             @@
##           master   #28780      +/-   ##
==========================================
- Coverage   72.23%   72.21%   -0.03%     
==========================================
  Files         684      685       +1     
  Lines      101198   101518     +320     
==========================================
+ Hits        73102    73307     +205     
- Misses      26518    26633     +115     
  Partials     1578     1578              
Flag Coverage Δ
python 82.70% <12.50%> (-0.10%) ⬇️


Files Coverage Δ
.../python/apache_beam/testing/analyzers/constants.py 100.00% <100.00%> (ø)
...ache_beam/testing/analyzers/perf_analysis_utils.py 18.44% <0.00%> (-3.35%) ⬇️

... and 16 files with indirect coverage changes


github-actions bot commented Oct 3, 2023

Assigning reviewers. If you would like to opt out of this review, comment assign to next reviewer:

R: @damccorm for label python.

# https://github.com/apache/beam/issues/28757
# Remove this workaround once we have a good solution to deal
# with the edge change points.
if is_edge_change_point(change_point_index,
Contributor

I would move this logic into find_latest_change_point_index since there are other considerations in that function to filter out noise.

Contributor

You could also skip adding the extra param to find_latest_change_point_index until we have a use case that needs a non-default value.

Contributor Author

Done
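
The suggestion in this thread could look roughly like the sketch below: the edge check lives inside find_latest_change_point_index and reads a module-level constant instead of taking an extra parameter. This is an assumption-laden illustration, not the merged code. The real function runs a change point detector on the metric values, whereas here `change_point_indices` is passed in directly as a stand-in, and both the constant's value and the log wording are made up for the example.

```python
import logging

# Hypothetical value; stands in for constants._EDGE_SEGMENT_SIZE.
_EDGE_SEGMENT_SIZE = 3


def find_latest_change_point_index(change_point_indices, data_size):
    """Return the latest change point, or None if there is nothing to alert on.

    `change_point_indices` stands in for the detector output (assumption).
    No edge-size parameter is exposed: the module constant is used until
    a caller actually needs a different value.
    """
    if not change_point_indices:
        return None
    change_point_index = max(change_point_indices)
    # Edge filter sits next to the other noise filters in this function.
    if change_point_index >= data_size - _EDGE_SEGMENT_SIZE:
        logging.info(
            'Change point at index %s is within the last %s runs; '
            'skipping it until more data is available.',
            change_point_index, _EDGE_SEGMENT_SIZE)
        return None
    return change_point_index
```

Keeping the constant private to the module means callers cannot accidentally disable the filter, and a parameter can still be added later without breaking existing call sites.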

@AnandInguva AnandInguva requested a review from tvalentyn October 3, 2023 22:39
'awaiting additional data. Should the change point persist after '
'gathering more data, an alert will be raised.' %
(change_point_index, constants._EDGE_SEGMENT_SIZE))
return None
Contributor

Should we also consider the earlier entries in change_points_indices? That is, instead of looking only at change_points_indices[-1], return the latest change point that is not in the edge segment.

Contributor Author

I think we should keep looking at the latest change point only. Even if we ignore it for, say, 3 days, it still gets filed once it is no longer on the edge.

We also ignore change points that occurred more than 14 days ago. Most often change_points_indices[-2] lies outside that window or doesn't exist, so we can just follow the current approach.
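
To make the trade-off concrete, here is a small sketch comparing the two strategies discussed above. It assumes one run per day, a hypothetical edge segment of 3, and the 14-day window mentioned in the comment. The function names and the sample data are invented for the example and are not part of the PR.

```python
from datetime import datetime, timedelta

EDGE_SEGMENT_SIZE = 3  # hypothetical value
WINDOW_DAYS = 14       # window mentioned in the discussion


def latest_only(change_point_indices, data_size):
    """Current approach: consider only the latest change point."""
    if not change_point_indices:
        return None
    latest = max(change_point_indices)
    return None if latest >= data_size - EDGE_SEGMENT_SIZE else latest


def latest_non_edge(change_point_indices, data_size):
    """Alternative from the review: fall back to an earlier change point."""
    non_edge = [
        i for i in change_point_indices if i < data_size - EDGE_SEGMENT_SIZE
    ]
    return max(non_edge) if non_edge else None


def within_window(index, timestamps, now, window_days=WINDOW_DAYS):
    """True if the run at `index` happened within the last `window_days`."""
    return now - timestamps[index] <= timedelta(days=window_days)


# 30 daily runs; change points detected at run 10 and run 28.
timestamps = [datetime(2023, 9, 1) + timedelta(days=i) for i in range(30)]
now = timestamps[-1]
change_points = [10, 28]

print(latest_only(change_points, len(timestamps)))      # None (28 is on the edge)
fallback = latest_non_edge(change_points, len(timestamps))
print(fallback)                                         # 10
print(within_window(fallback, timestamps, now))         # False (19 days old)
```

In this example the fallback change point is 19 days old, so the 14-day window discards it anyway and both strategies end up raising no alert, which matches the argument for keeping the simpler approach.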

Contributor

OK. Yes, I think it should get filed eventually.

@AnandInguva AnandInguva requested a review from tvalentyn October 5, 2023 16:47
@AnandInguva AnandInguva merged commit d5b8fb8 into apache:master Oct 7, 2023
74 of 75 checks passed
Development

Successfully merging this pull request may close these issues.

Add min number of data points to raise an perf alert
2 participants