Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Hexagon] [runtime] Manage RPC and runtime buffers separately #13028

Merged
merged 8 commits into from
Oct 12, 2022

Conversation

janetsc
Copy link
Collaborator

@janetsc janetsc commented Oct 11, 2022

Creates HexagonPageAllocator to manage allocations for the RPC server

Adds new Device APIs for RPC buffer management

Allocations are tracked by two separate buffer managers:

  • rpc_hexbuffs is used exclusively for RPC buffers
  • runtime_hexbuffs is used exclusively for runtime buffers

This will fix the throw on shutdown in the simulator.

cc: @csullivan, @kparzysz-quic, @adstraw

@janetsc janetsc marked this pull request as ready for review October 11, 2022 13:47
Copy link
Contributor

@Lunderberg Lunderberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, looks good, just a few questions and locations where comments/errors could be improved.

src/runtime/hexagon/hexagon_device_api.cc Outdated Show resolved Hide resolved
src/runtime/hexagon/hexagon_device_api.cc Outdated Show resolved Hide resolved
src/runtime/hexagon/hexagon_device_api.cc Outdated Show resolved Hide resolved
src/runtime/hexagon/hexagon_device_api.h Outdated Show resolved Hide resolved
src/runtime/hexagon/hexagon_device_api.h Show resolved Hide resolved
@tvm-bot
Copy link
Collaborator

tvm-bot commented Oct 12, 2022

Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.

Generated by tvm-bot

Copy link
Contributor

@Lunderberg Lunderberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look good to me, thank you! A couple additional nitpicks that can go either way.

src/runtime/hexagon/hexagon_buffer_manager.h Outdated Show resolved Hide resolved
src/runtime/hexagon/hexagon_buffer_manager.h Outdated Show resolved Hide resolved
src/runtime/hexagon/hexagon_buffer_manager.h Outdated Show resolved Hide resolved
@driazati driazati merged commit b18f642 into apache:main Oct 12, 2022
@janetsc janetsc deleted the investigate_rpc_allocs branch October 12, 2022 18:58
//! \brief Returns a vector of currently allocated pointers, owned by the manager.
// Note - this should only be used by the device API to keep track of what
// was in the manager when HexagonDeviceAPI::ReleaseResources is called.
std::vector<void*> current_allocations() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I apologize for the post-merge feedback. But ... having this method seems like a break in the abstraction of the HexagonBufferManager class, to me. If we have to do this it feels like something else is amiss.

Copy link
Collaborator Author

@janetsc janetsc Oct 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review.

I agree, this isn't ideal. However, I wanted to make sure that I don't give a "free pass" to all frees that come after ReleaseResources. The scenario I was trying to avoid was:

User allocates buffer B
Session ends, ReleaseResources is called and B is freed
User frees buffer B as objects are being torn down, which should not throw as it is somewhat expected
User frees buffer C, which SHOULD throw, as it was never allocated in the first place

I added this so that I can check if there's no runtime_hexbuffs, then I make sure that what is being attempted to be freed is something that we owned before ReleaseResources was called. In which case, that Free is a no-op. I also want to detect spurious calls to Free that wouldn't have been legit before ReleaseResources was called.

I'm happy to discuss more, and brainstorm other ways to accomplish this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an alternative, what if the HexagonBufferManager allows three states for buffers instead of two.

  • Valid to use, valid to free (Initial state after allocating memory)
  • Invalid to use, valid to free (State after closing a session)
  • Invalid to use, invalid to free (State of unknown pointer)

That way, instead of discarding the HexagonBufferManager runtime_hexbuffs altogether when the session closes, it instead moves all keys from the internal std::unordered_map<void*, std::unique_ptr<HexagonBuffer>> hexagon_buffer_map_; to an internal std::unordered_set<void*>. This avoids exposing any of the buffers to the outside, but does allow the HexagonBufferManager to identify a previously freed pointer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Lunderberg, I think your suggestion would satisfy my concern. What do you think @janetsc? This would allow us to make valid frees a nop and to catch invalid frees as well. If I am understanding correctly runtime_hexbuffs would be instantiated by the Hexagon Device API in the same way that we instantiate the rpc_hexbuffs ... HexagonBufferManager runtime_hexbuffs;? If that is the case, we could potentially have just one HeagonBufferManager that implements the API and internal state that @Lunderberg suggests.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's also another state not captured - the buffer manager is in an invalid state to allocate more memory.

There needs to be two buffer managers, because on ReleaseResources how do we know which buffers need to be freed?

I'm thinking that a better model is to enforce that all frees should take place before ReleaseResources. That's cleaner. I will investigate which ones are outstanding. All objects should be cleanly torn down by the time the session ends. I will file a task to investigate.

mgr->FreeHexagonBuffer(ptr);
if (runtime_hexbuffs) {
runtime_hexbuffs->FreeHexagonBuffer(ptr);
} else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure on the value-add for this else case. The only thing that happens is that we might fail in the CHECK if we free a buffer after we release resources that was not in the released_runtime_buffers. Do we want this behavior? What happens on-system if we just get rid of this else case? We can then get rid of released_runtime_buffers and the need for HexagonBufferManager::current_allocations. We can simply ...

  if (runtime_hexbuffs) {
    runtime_hexbuffs->FreeHexagonBuffer(ptr);
  }

With some comments to explain why the "free" may happen after the "release".

Copy link
Collaborator Author

@janetsc janetsc Oct 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My previous reply is related.

If there's a manager, I can just use it to call Free. If there isn't (it has been released, or maybe acquire was never called), I can check what I had allocated before Release. If that buffer is in there, I will not throw. But if there's a bug and the user is attempting to free some pointer we never knew about, I do want that to throw.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making sure all "temporal" frees occur before Release is a good idea and worthwhile endeavor. ✅

I am not convinced that two HexagonBufferManager objects are required. Instead, you could have separate interfaces in a single HexagonBufferManager class to track "permanent" (for RPC e.g.) versus "temporal" allocations. And, in this way, alleviate the need for the current_allocations method which exposes the internals of the HexagonBufferManager class to its users. Something to consider for a future cleanup, perhaps.

xinetzone pushed a commit to daobook/tvm that referenced this pull request Nov 25, 2022
…#13028)

Creates HexagonPageAllocator to manage allocations for the RPC server

Adds new Device APIs for RPC buffer management

Allocations are tracked by two separate buffer managers:

- rpc_hexbuffs is used exclusively for RPC buffers
- runtime_hexbuffs is used exclusively for runtime buffers

This will fix the throw on shutdown in the simulator.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants