Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Hexagon] Move aot/graph_executor interactions into launcher #10907

Merged

Conversation

Lunderberg
Copy link
Contributor

Follow-up from #10581, applying similar changes to the AOT and graph executor interactions. This moves the file management and upload/download from the unit tests into the launcher.

Follow-up from apache#10581, applying
similar changes to the AOT and graph executor interactions.  This
moves the file management and upload/download from the unit tests into
the launcher.
Copy link
Contributor

@kparzysz-quic kparzysz-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add this to the launcher class, and call functions from it from Session?

My general concern is that when we reuse a session, we load and run many modules into the same OS process[1]. Since we don't unload any of the previous modules, the prior loads/runs may affect subsequent tests. Starting a new session each time starts a fresh OS process, so any prior misbehaving tests won't affect the current one. It's not necessary to use a new session every time, but it's good to have that option.

[1] It's the fastrpc shell on device and the entire QuRT image on simulator.

@Lunderberg
Copy link
Contributor Author

For the re-use, each unit test is run with an independent hexagon session. The hexagon_session test fixture is a function-scope fixture, so it gets regenerated before every test (or parametrization of a test), and every test runs with a unique session.

@Lunderberg
Copy link
Contributor Author

For moving the implementation from the session to the launcher, is that primarily to allow them to be overloaded by the HexagonLauncherAndroid and HexagonLauncherSimulator subclasses? Otherwise, I'm not quite following how moving the implementation from Session to HexagonLauncherRPC would avoid the re-use of the session object.

I had been thinking of the session as the primary way that a real or simulated device is exposed to the user, since it holds the session.device, and wanted to avoid requiring test code to interact with both a launcher and a session generated from that launcher. Since the session is always generated from a launcher, the session can track which launcher it was generated from, and delegate any functionality that is needed at that point. That said, whichever has the full implementation should be called by the other. On a re-read, I realized that I didn't HexagonLauncherRPC.get_graph_executor and HexagonLauncherRPC.get_aot_executor functions should be updated to directly call the Session functions, so the two have different accepted argument types.

@kparzysz-quic
Copy link
Contributor

Let me think about it for a bit.

@kparzysz-quic
Copy link
Contributor

IMO there are two types of interactions with a device: ones that have to do with setting up the device, and ones that have to do with running code on it. The first group would be, for example, uploading files to it, starting servers, etc, while the second would be loading modules into memory, creating arrays in memory, invoking functions, etc. The first set would be the responsibility of the launcher, while the second would belong to the session.

In this view, creating executors would belong to the session, so your changes are ok in that sense. Still, I'd like to make the division of responsibilities clear (not necessarily in this PR though).

Finally, I don't like the idea of passing executor factories to the launcher/session code. Executor factories are not persistent objects. What's persistent (i.e. can be saved/loaded) are the actual modules. We should operate in terms of those instead. In particular, the export_library stuff does not belong in either launcher or session.

@kparzysz-quic
Copy link
Contributor

We could have a utility function somewhere in tvm/contrib/hexagon that does the export_library together with the link-step workaround. That should save some work from the tests using AoT.

@Lunderberg
Copy link
Contributor Author

I like the general division between launcher and session, and that mirrors the general compile-time / run-time split that exists throughout TVM.

From what I can tell, there's two distinct use-cases that are pointing toward the different preferred designs. In a production environment repeatedly running the same model, the file copying should only be done once, with each session loading the model that has already been uploaded. In this use case, the saved binary file is fundamental; created by the user, uploaded by the launcher, and loaded by the session. In a testing environment where each model is different, the file copying must be done prior to each session. In this use case, the saved binary file is a temporary intermediate, whose entire purpose is replicate the local built module onto the remote session.

I think passing the executor factories makes sense to support the second use case, because those are the objects that have sufficient information to fully describe the state being replicated onto the remote session.

@Lunderberg
Copy link
Contributor Author

We could have a utility function somewhere in tvm/contrib/hexagon that does the export_library together with the link-step workaround. That should save some work from the tests using AoT.

Good point, and I like the idea of centralizing the link-step workaround, so it will only need to be removed from one spot once it is not longer needed. I was thinking that it would make sense to have it directly in tvm.contrib.hexagon.create_aot_shared, since right now it is necessary for all invocations.

The utility function would get a lot of the same benefits. I had been hoping to also avoid requiring functionality tests to handle file management and cleanup, as the file management isn't relevant to the behavior being tested. I'm open to having a testing-specific utility function, though I'd still prefer having it be part of the session object, as that would allow for additional ergonomic benefits in the future (e.g. Uploads triggered through the session having a session-specific prefix to avoid name collisions, and being cleaned up after the session.)

@kparzysz-quic
Copy link
Contributor

Could we have separate function inside Session that takes the factory objects (something like get_executor_from_factory or something like that)? Then the other functions would take a module, or module+json.

@Lunderberg
Copy link
Contributor Author

Sure. I like that division, because it avoids a lot of the argument checks that are necessary if both behaviors are in a single function.

Copy link
Contributor

@kparzysz-quic kparzysz-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks!

@kparzysz-quic kparzysz-quic merged commit cd6aa7b into apache:main Apr 12, 2022
@Lunderberg Lunderberg deleted the hexagon_launcher_executor_ergonomics branch April 12, 2022 14:10
AndrewZhaoLuo added a commit to AndrewZhaoLuo/tvm that referenced this pull request Apr 15, 2022
* main: (527 commits)
  [hexagon] 'add_hvx' test to explore HVX usage. (apache#10604)
  [COMMUNITY] @yzh119 -> Reviewer (apache#10993)
  [Metaschedule] Make custom schedule_rule registration optional (apache#10975)
  [ONNX] Add imports for BERT contrib operators (apache#10949)
  sort axes (apache#10985)
  [Hexagon] Remove HexagonBuffer external constructor and support (apache#10978)
  [CI] Update GPU image (apache#10992)
  [Runtime][Vulkan] Add RGP support to TVM for vulkan device (apache#10953)
  [FIX] resolve int64/32 for AttrStmtNode (apache#10983)
  [TVMC] Allow output module name to be passed as a command line argument (apache#10962)
  [ONNX] Add MatMulInteger importer (apache#10450)
  [COMMUNITY] @guberti -> Reviewer (apache#10976)
  Support `qnn.conv2d` in FoldExplicitPading (apache#10982)
  change Hexagon docker version (apache#10981)
  remove exception handling of autotvm xgboost extract functions (apache#10948)
  [CUDNN] Add partitioning support for conv2d and log_softmax (apache#10961)
  [Hexagon][LLVM] Enable/test tensorized Hexagon DMA on 2d transformed layout (apache#10905)
  [Hexagon] Move aot/graph_executor interactions into launcher (apache#10907)
  [HEXAGON] Split huge 1D DMA Transfers into smaller transfers with legal sizes. (apache#10971)
  [CI][DOCKER] Add pytest-lazy-fixture to images (apache#10970)
  ...
Lucien0 pushed a commit to Lucien0/tvm that referenced this pull request Apr 19, 2022
…10907)

* [Hexagon] Move aot/graph_executor interactions into launcher

Follow-up from apache#10581, applying
similar changes to the AOT and graph executor interactions.  This
moves the file management and upload/download from the unit tests into
the launcher.

* Added Session.test_executor to avoid duplication in graph/aot test.

* Resolve lint errors

* Moved link flags workaround out of session, into create_aot_shared

* Separated Session.get_*_executor and Session.get_executor_from_factory

* Updated to resolve lint error
altanh pushed a commit to altanh/tvm that referenced this pull request Apr 28, 2022
…10907)

* [Hexagon] Move aot/graph_executor interactions into launcher

Follow-up from apache#10581, applying
similar changes to the AOT and graph executor interactions.  This
moves the file management and upload/download from the unit tests into
the launcher.

* Added Session.test_executor to avoid duplication in graph/aot test.

* Resolve lint errors

* Moved link flags workaround out of session, into create_aot_shared

* Separated Session.get_*_executor and Session.get_executor_from_factory

* Updated to resolve lint error
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants