Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Python] Allow users to pass service name for profiler #26220

Merged
merged 15 commits into from
May 3, 2023

Conversation

riteshghorse
Copy link
Contributor

@riteshghorse riteshghorse commented Apr 11, 2023

This PR adds an experimental pipeline option for users to pass a service name for profiler. By default, cloud profiler uses JOB_NAME as the service name.
Users can specify this as --dataflow_service_options=enable_google_cloud_profiler=new_profiler

Sample Streaming Wordcount Job: https://pantheon.corp.google.com/dataflow/jobs/us-central1/2023-05-03_07_19_52-5911424696235359677;graphView=0?pageState=(%22dfTime%22:(%22s%22:%222023-05-03T14:19:52.473Z%22,%22e%22:%222023-05-03T14:31:38.548Z%22))&project=google.com:clouddfe

Profiler with the set name:
image

Fixes #26280


Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

  • Mention the appropriate issue in your description (for example: addresses #123), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, comment fixes #<ISSUE NUMBER> instead.
  • Update CHANGES.md with noteworthy changes.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

See the Contributor Guide for more tips on how to make review process smoother.

To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md

GitHub Actions Tests Status (on master branch)

Build python source distribution and wheels
Python tests
Java tests
Go tests

See CI.md for more information about GitHub Actions CI.

@riteshghorse riteshghorse marked this pull request as ready for review April 13, 2023 20:08
@riteshghorse
Copy link
Contributor Author

R: @tvalentyn

@github-actions
Copy link
Contributor

Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control

@codecov
Copy link

codecov bot commented Apr 17, 2023

Codecov Report

Merging #26220 (1706a55) into master (80f1f6c) will increase coverage by 0.01%.
The diff coverage is 68.96%.

@@            Coverage Diff             @@
##           master   #26220      +/-   ##
==========================================
+ Coverage   81.07%   81.08%   +0.01%     
==========================================
  Files         469      469              
  Lines       67165    67199      +34     
==========================================
+ Hits        54453    54488      +35     
+ Misses      12712    12711       -1     
Flag Coverage Δ
python 81.08% <68.96%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
.../python/apache_beam/runners/worker/data_sampler.py 100.00% <ø> (ø)
...thon/apache_beam/runners/worker/sdk_worker_main.py 77.71% <43.75%> (+0.83%) ⬆️
...dks/python/apache_beam/options/pipeline_options.py 94.19% <100.00%> (+0.16%) ⬆️

... and 20 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

CHANGES.md Outdated Show resolved Hide resolved
sdks/python/apache_beam/runners/worker/sdk_worker_main.py Outdated Show resolved Hide resolved
@tvalentyn
Copy link
Contributor

thanks, looking better!

@tvalentyn
Copy link
Contributor

Thanks, @riteshghorse . We can merge once tests pass, feel free to ping me. Please also re-do the manual sanity check as described in the PR description just to make sure.

@tvalentyn
Copy link
Contributor

Run Python_Runners PreCommit

@tvalentyn
Copy link
Contributor

Run Python_Integration PreCommit

@riteshghorse
Copy link
Contributor Author

I'm seeing a strange behavior with Dataflow UI. The profiler is generated for this Job. But the Job Info panel on job page doesn't show a job profiler tab as it should. (It shows the tab when I pass dataflow_service_options=enable_google_cloud_profiler but not when dataflow_service_options=enable_google_cloud_profiler=custom is passed)

@riteshghorse
Copy link
Contributor Author

Let's hold merging this until we get response from UI team

@riteshghorse
Copy link
Contributor Author

Changes are live in the UI. Updated the description with new job and profiler example.

Merging this change.

@riteshghorse riteshghorse merged commit a13749f into apache:master May 3, 2023
cushon pushed a commit to cushon/beam that referenced this pull request May 24, 2024
* handle service name for profiler

* add debug messages

* remove logs

* move import statement and logs

* rm unnecessary check

* add helper to GCloudOption

* rm lint

* added new helper and unit test

* separate functions

* updated unit tests

* refactored plus check for envvar

* updated changes.md
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Task]: Allow users to pass their own service name for google cloud profiler
2 participants