
Support ONNX runtime in RunInference API #24911

Merged: 34 commits into apache:master on Feb 10, 2023
Conversation

@ziqi-ma (Contributor) commented Jan 6, 2023

Addresses #22972

Adding ONNX support to Beam. ONNX Runtime is a popular framework for accelerating ML inference:
https://onnxruntime.ai/. It supports a broad range of model frameworks including PyTorch, TensorFlow/Keras, TFLite, scikit-learn, etc. This PR lets you call an ONNX model from Beam by passing in a model path (see the usage sketch after the list below).

Specifically:

  • Added onnx_inference.py, which implements the ONNX RunInference API.
  • Added onnx_inference_test.py, which contains unit tests for ONNX models exported from PyTorch, TensorFlow, and scikit-learn.
  • Added onnx_inference_it_test.py, which contains an integration test for sentiment classification with an ONNX version of RoBERTa.
  • Added examples/onnx_sentiment_classification.py, an example of using an ONNX version of RoBERTa for sentiment classification, also used in the integration test.
  • Configured the test with Python 3.8 in tox and Gradle.
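For illustration, usage looks roughly like the sketch below. The handler class name, its argument name, and the model path are assumptions based on this description, not necessarily the exact API added by the PR:

```python
import numpy as np
import apache_beam as beam
from apache_beam.ml.inference.base import RunInference
# Module added by this PR; the handler class name below is an assumption.
from apache_beam.ml.inference.onnx_inference import OnnxModelHandlerNumpy

# Hypothetical model path; point this at an exported .onnx file.
model_handler = OnnxModelHandlerNumpy(model_uri='gs://my-bucket/model.onnx')

with beam.Pipeline() as p:
    _ = (
        p
        | 'CreateInputs' >> beam.Create([np.zeros((1, 4), dtype=np.float32)])
        | 'RunInference' >> RunInference(model_handler)
        | 'PrintPredictions' >> beam.Map(print))
```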


@codecov (bot) commented Jan 6, 2023

Codecov Report

Merging #24911 (d3c535c) into master (16cb63b) will decrease coverage by 0.12%.
The diff coverage is 0.00%.

@@            Coverage Diff             @@
##           master   #24911      +/-   ##
==========================================
- Coverage   72.95%   72.84%   -0.12%     
==========================================
  Files         745      748       +3     
  Lines       99174    99361     +187     
==========================================
+ Hits        72356    72378      +22     
- Misses      25453    25617     +164     
- Partials     1365     1366       +1     
Flag Coverage Δ
python 82.27% <0.00%> (-0.19%) ⬇️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
...xamples/inference/onnx_sentiment_classification.py 0.00% <0.00%> (ø)
.../python/apache_beam/ml/inference/onnx_inference.py 0.00% <0.00%> (ø)
sdks/go/pkg/beam/core/metrics/dumper.go 49.20% <0.00%> (-4.77%) ⬇️
sdks/python/apache_beam/utils/interactive_utils.py 95.12% <0.00%> (-2.44%) ⬇️
sdks/python/apache_beam/io/localfilesystem.py 90.97% <0.00%> (-0.76%) ⬇️
...python/apache_beam/runners/worker/worker_status.py 74.66% <0.00%> (-0.67%) ⬇️
...hon/apache_beam/runners/worker/bundle_processor.py 93.25% <0.00%> (-0.30%) ⬇️
sdks/python/apache_beam/transforms/util.py 96.16% <0.00%> (-0.13%) ⬇️
...dks/python/apache_beam/options/pipeline_options.py 93.97% <0.00%> (ø)
...thon/apache_beam/ml/inference/pytorch_inference.py 0.00% <0.00%> (ø)
... and 8 more


@AnandInguva (Contributor) commented:

One question: do we need GPUs to run the ONNX unit tests? I know very little about ONNX, but I will read up on it during the review.

@ziqi-ma (Contributor, Author) commented Jan 7, 2023

> One question: do we need GPUs to run the ONNX unit tests? I know very little about ONNX, but I will read up on it during the review.

In the code I specified the ONNX execution providers as a priority list, so that on NVIDIA GPUs the tests should use the GPU provider, while on CPU they fall back to the CPU provider. But I ran my tests on CPU only - if we want to make sure this works properly in a GPU environment (and which type of GPU also matters), then I need to test in those environments too.
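For context, the fallback described above relies on ONNX Runtime's provider priority list; a minimal sketch of that mechanism (the model path is hypothetical, and this is not the exact code in the PR):

```python
import onnxruntime as ort

# ONNX Runtime tries the providers in order and falls back to the CPU
# provider (with a warning) when CUDA is not available on the machine.
session = ort.InferenceSession(
    'model.onnx',  # hypothetical path to an exported ONNX model
    providers=['CUDAExecutionProvider', 'CPUExecutionProvider'])

# Shows which providers were actually loaded for this session.
print(session.get_providers())
```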

@AnandInguva (Contributor) commented Jan 10, 2023

Hi, once the PR is ready for another review, comment on this with PTAL @<username>.
Thanks for starting this.

@ziqi-ma (Contributor, Author) commented Jan 14, 2023

> Hi, once the PR is ready for another review, comment on this with PTAL @<username>.
> Thanks for starting this.

PTAL @AnandInguva [I'm still working on the Gradle ONNX test (if that is the right direction to go), but in the meantime could you let me know how to put data at a GCP location for the integration test? I think maybe that should come before the Gradle part.]

@AnandInguva (Contributor) commented Jan 17, 2023

> > Hi, once the PR is ready for another review, comment on this with PTAL @<username>.
> > Thanks for starting this.
>
> PTAL @AnandInguva [I'm still working on the Gradle ONNX test (if that is the right direction to go), but in the meantime could you let me know how to put data at a GCP location for the integration test? I think maybe that should come before the Gradle part.]

Hi, gs://apache-beam-samples/run_inference in the apache-beam-testing project should be open to the public. You can upload the files/folders you need there for testing.

@AnandInguva (Contributor) commented:

Hi, I will take another review this weekend. If you have some time, you can fix the formatting issues.

[screenshot of the failing formatting checks]

@ziqi-ma (Contributor, Author) commented Jan 21, 2023

> Hi, I will take another review this weekend. If you have some time, you can fix the formatting issues.

Fixed the yapf formatting issues.

@ziqi-ma closed this Jan 21, 2023
@ziqi-ma (Contributor, Author) commented Jan 21, 2023

> Hi, I will take another review this weekend. If you have some time, you can fix the formatting issues.

Fixed the yapf formatting issues.

@ziqi-ma reopened this Jan 21, 2023
@ziqi-ma (Contributor, Author) commented Jan 21, 2023

> > > Hi, once the PR is ready for another review, comment on this with PTAL @<username>.
> > > Thanks for starting this.
> >
> > PTAL @AnandInguva [I'm still working on the Gradle ONNX test (if that is the right direction to go), but in the meantime could you let me know how to put data at a GCP location for the integration test? I think maybe that should come before the Gradle part.]
>
> Hi, gs://apache-beam-samples/run_inference in the apache-beam-testing project should be open to the public. You can upload the files/folders you need there for testing.

For apache-beam-samples/run_inference, I am able to read but not write, getting the error below:

User [[email protected]] does not have permission to access b instance [apache-beam-samples] (or it may not exist): [email protected] does not have storage.objects.create access to the Google Cloud Storage object. Permission 'storage.objects.create' denied on resource (or it may not exist).

However, it seems like the other tests use gs://apache-beam-ml/models/ and gs://apache-beam-ml/datasets. For these I do not have read access.

@AnandInguva (Contributor) left a comment

Most of this looks good to me.

I would clean up the lint and formatting errors so the checks go green, and then we can start finalizing the PR.

@ziqi-ma (Contributor, Author) commented Feb 4, 2023

> Most of this looks good to me.
>
> I would clean up the lint and formatting errors so the checks go green, and then we can start finalizing the PR.

Hi - thanks. I fixed the formatting errors, but it seems like the two remaining failures are about files/tests that should not be touched by this PR:

  • codecov/patch is complaining about a 0.00% diff hit (it seems to be one line not covered in sdks/python/apache_beam/runners/worker/bundle_processor.py - I'm not sure how to cover it).
  • The py310 test is failing due to a segfault, but I only added ONNX tests for py38.

It would be great if you could give some pointers regarding these, thanks!

@AnandInguva (Contributor) commented:

You can ignore codecov, and the other error is not relevant to this PR - in general it could be flaky.

@ziqi-ma changed the title from "Ziqima/onnx" to "Support ONNX runtime in RunInference API" on Feb 5, 2023
@AnandInguva (Contributor) commented Feb 9, 2023

@ziqi-ma Hi, is it ready for the final review?

I confirm the tests are running for ONNX:
link: https://ci-beam.apache.org/job/beam_PreCommit_Python_Coverage_Commit/398/testReport/apache_beam.ml.inference.onnx_inference_test/
I am not a committer, so passing this on to @jrmccluskey @damccorm @riteshghorse @tvalentyn.

Thanks again for the changes.

Also, can you add this feature to CHANGES.md (https://github.com/apache/beam/blob/master/CHANGES.md)?

Something like

* RunInference PTransform will accept model paths as SideInputs in Python SDK. ([#24042](https://github.com/apache/beam/issues/24042))
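For this PR, the entry might look something like the line below (the wording is only a suggestion; #22972 is the issue this PR addresses):

* Support for ONNX Runtime in the RunInference API added to the Python SDK ([#22972]).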

I would like to get this in before Feb 22nd so that it can be included in the next release, 2.46.0.

@riteshghorse (Contributor) left a comment

One minor nit, LGTM.
@riteshghorse (Contributor) commented:

Since you've already added an example, please add a section to the README on how to run it. You can add it in a new PR as well if you'd like.

@damccorm (Contributor) commented Feb 9, 2023

LGTM - we can merge once Ritesh's comments are responded to.

@ziqi-ma (Contributor, Author) commented Feb 10, 2023

> Since you've already added an example, please add a section to the README on how to run it. You can add it in a new PR as well if you'd like.

Added.

@ziqi-ma (Contributor, Author) commented Feb 10, 2023

> @ziqi-ma Hi, is it ready for the final review?
>
> I confirm the tests are running for ONNX:
> link: https://ci-beam.apache.org/job/beam_PreCommit_Python_Coverage_Commit/398/testReport/apache_beam.ml.inference.onnx_inference_test/
> I am not a committer, so passing this on to @jrmccluskey @damccorm @riteshghorse @tvalentyn.
>
> Thanks again for the changes.
>
> Also, can you add this feature to CHANGES.md (https://github.com/apache/beam/blob/master/CHANGES.md)?
>
> Something like
>
> * RunInference PTransform will accept model paths as SideInputs in Python SDK. ([#24042](https://github.com/apache/beam/issues/24042))
>
> I would like to get this in before Feb 22nd so that it can be included in the next release, 2.46.0.

Hi - I have addressed all the comments. I think this is ready to merge.

@damccorm (Contributor) left a comment

This is awesome, thank you!

@damccorm (Contributor) commented:

retest this please

@damccorm (Contributor) commented:

(The above comment was for the CI bot.) I'm going to let the precommit checks run to completion to make sure they pass, then I will merge.

@damccorm merged commit 874bd45 into apache:master Feb 10, 2023
@damccorm (Contributor) commented:

Thanks @ziqi-ma!

@Abacn (Contributor) commented Feb 13, 2023

This breaks the Python PostCommit hdfsIntegrationTest task; the error message is below (taken from https://ci-beam.apache.org/view/PostCommit/job/beam_PostCommit_Python310/470/consoleFull):

04:11:40 test_1      | Traceback (most recent call last):
04:11:40 test_1      |   File "/usr/local/bin/tox", line 8, in <module>
04:11:40 test_1      |     sys.exit(cmdline())
04:11:40 test_1      |   File "/usr/local/lib/python3.10/site-packages/tox/session/__init__.py", line 42, in cmdline
04:11:40 test_1      |     main(args)
04:11:40 test_1      |   File "/usr/local/lib/python3.10/site-packages/tox/session/__init__.py", line 62, in main
04:11:40 test_1      |     config = load_config(args)
04:11:40 test_1      |   File "/usr/local/lib/python3.10/site-packages/tox/session/__init__.py", line 78, in load_config
04:11:40 test_1      |     config = parseconfig(args)
04:11:40 test_1      |   File "/usr/local/lib/python3.10/site-packages/tox/config/__init__.py", line 262, in parseconfig
04:11:40 test_1      |     ParseIni(config, config_file, content)
04:11:40 test_1      |   File "/usr/local/lib/python3.10/site-packages/tox/config/__init__.py", line 1126, in __init__
04:11:40 test_1      |     raise tox.exception.ConfigError(
04:11:40 test_1      | tox.exception.ConfigError: ConfigError: py{38}-onnx-{113} failed with ConfigError: substitution key '38' not found at Traceback (most recent call last):
04:11:40 test_1      |   File "/usr/local/lib/python3.10/site-packages/tox/config/__init__.py", line 1102, in run
04:11:40 test_1      |     results[name] = cur_self.make_envconfig(name, section, subs, config)
04:11:40 test_1      |   File "/usr/local/lib/python3.10/site-packages/tox/config/__init__.py", line 1226, in make_envconfig
04:11:40 test_1      |     res = meth(env_attr.name, env_attr.default, replace=replace)
04:11:40 test_1      |   File "/usr/local/lib/python3.10/site-packages/tox/config/__init__.py", line 1407, in getpath
04:11:40 test_1      |     path = self.getstring(name, defaultpath, replace=replace)
04:11:40 test_1      |   File "/usr/local/lib/python3.10/site-packages/tox/config/__init__.py", line 1486, in getstring
04:11:40 test_1      |     x = self._replace_if_needed(x, name, replace, crossonly)
04:11:40 test_1      |   File "/usr/local/lib/python3.10/site-packages/tox/config/__init__.py", line 1491, in _replace_if_needed
04:11:40 test_1      |     x = self._replace(x, name=name, crossonly=crossonly)
04:11:40 test_1      |   File "/usr/local/lib/python3.10/site-packages/tox/config/__init__.py", line 1517, in _replace
04:11:40 test_1      |     replaced = Replacer(self, crossonly=crossonly).do_replace(value)
04:11:40 test_1      |   File "/usr/local/lib/python3.10/site-packages/tox/config/__init__.py", line 1557, in do_replace
04:11:40 test_1      |     expanded = substitute_once(value)
04:11:40 test_1      |   File "/usr/local/lib/python3.10/site-packages/tox/config/__init__.py", line 1551, in substitute_once
04:11:40 test_1      |     return self.RE_ITEM_REF.sub(self._replace_match, x)
04:11:40 test_1      |   File "/usr/local/lib/python3.10/site-packages/tox/config/__init__.py", line 1598, in _replace_match
04:11:40 test_1      |     return self._replace_substitution(match)
04:11:40 test_1      |   File "/usr/local/lib/python3.10/site-packages/tox/config/__init__.py", line 1633, in _replace_substitution
04:11:40 test_1      |     val = self._substitute_from_other_section(sub_key)
04:11:40 test_1      |   File "/usr/local/lib/python3.10/site-packages/tox/config/__init__.py", line 1627, in _substitute_from_other_section
04:11:40 test_1      |     raise tox.exception.ConfigError("substitution key {!r} not found".format(key))
04:11:40 test_1      | tox.exception.ConfigError: ConfigError: substitution key '38' not found
04:11:40 test_1      | 

see #25443 for details
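Reading the trace, tox appears to treat the literal {38} in the env name py{38}-onnx-{113} as a section-substitution key while building the env's paths, presumably because the braces in the env name were not expanded in that container's tox; an env name without single-element braces (e.g. py38-onnx-113) would avoid the ambiguity, though this is my reading of the log rather than a confirmed root cause.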
