[SharedCache] Split state into initial, loaded, and modified #6393
base: dev
Conversation
I'm a pretty big fan of everything this proposes, including the next steps re: splitting the logic into separate classes.
Sounds great, and certainly the way things needed to go. This is more for the future; I haven't had time to dive into the actual commit, so this may not be an issue, but when it comes to concurrent loading of images I think further changes will be required. An issue I've had there is the UI blocking for a long time. I came to the realization that while it may be a Binary Ninja issue, I think the solution is to use the bulk add segments / memory region APIs to prevent the UI from trying to update an unnecessary number of times during concurrent loading. Whether that will prevent the UI from locking up entirely or not, I don't know, but it should help as far as I can tell. The problem is that I believe it will require a little bit of rewriting of how images are loaded, because sections can't be added until the bulk add operation has ended. So concurrent loading of the segments / memory regions would have to occur first; then the adding of sections and post-processing of them could be done. That's what I gathered when I tried to wrap concurrent image loading with bulk add segments / memory region API calls.
In the short term this makes concurrent loading slightly less practical, as it uses a single lock within `SharedCache`. As we've talked about, the existing locking is inconsistent and has correctness issues. Some of these are a result of my change that introduced COW. I felt it was best to make the locking clearly correct for now, with the goal of moving to finer-grained locking when functionality is split out of `SharedCache`. A next step to support concurrent image loading could be to move responsibility for loading a single image (and the state associated with that) into a separate type. If adding segments and sections proves to be a bottleneck, then batching of mutations could be accommodated by splitting the responsibility slightly further. The image loader could populate state local to the shared cache plug-in, with a separate step applying the results of completed image loads to the view. Honestly, that'd probably be a win for clarity as well. That said, I'm not sure whether adding segments and sections is still a bottleneck after the recent changes in that area.
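The two-phase ordering discussed above (bulk-add all segments first, then add sections once the bulk operation has ended) can be sketched roughly as follows. All of the type and method names here are hypothetical stand-ins; none of them are taken from the Binary Ninja API, and the real bulk API will differ:

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-ins for a view's segment/section model.
struct Segment { uint64_t start, length; };
struct Section { std::string name; uint64_t start, length; };

struct View {
    std::vector<Segment> segments;
    std::vector<Section> sections;
    bool bulk = false;
    void beginBulkAddSegments() { bulk = true; }          // suppress per-segment UI updates
    void addSegment(Segment s) { segments.push_back(s); }
    void endBulkAddSegments() { bulk = false; }           // one UI update happens here
    void addSection(Section s) { sections.push_back(s); }
};

// Phase 1: add every segment inside one bulk operation so the UI refreshes
// once rather than once per segment. Phase 2: only after the bulk operation
// ends, add sections and do their post-processing.
void LoadImages(View& view,
                const std::vector<Segment>& segments,
                const std::vector<Section>& sections) {
    view.beginBulkAddSegments();
    for (const Segment& s : segments)
        view.addSegment(s);
    view.endBulkAddSegments();
    for (const Section& s : sections)
        view.addSection(s);
}
```

In this shape, the concurrent part of image loading would feed the `segments` list, and section handling becomes a strictly later, serial phase.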
NOTE: I'm sending this as a draft for the sake of discussion. This is complete, but would ideally be split into smaller changes for easier review. I've had it in my tree for six weeks now and haven't found the energy to split it up. I figured I'd check whether there are any major objections to this approach before I spend the time splitting it up.
The bottom line here is that this change:

1. Never copies `SharedCache`'s state.
2. Never re-serializes `SharedCache`'s state to JSON.

The end result is a > 2x performance improvement on loading images from the shared cache, with a clear path towards safely loading images in parallel.
@0cyn @WeiN76LQh
The initial state, `CacheInfo`, is initialized during `PerformInitialLoad` and is immutable after that point. This required some slight restructuring of how information about memory regions is tracked, as that was previously modified as regions were loaded. Memory regions are now stored in a map from their address range to the `MemoryRegion` object. This makes it cheap to look them up by address, which is a common operation.

The modified state, `ModifiedState`, consists of changes since the last save to the `DSCView` / `ViewSpecificState`. This means it is no longer necessary to copy any state when mutating a `SharedCache` instance for the first time. Instead, its data structures start off empty and are populated as images, sections, or symbol information is loaded.

The loaded state, `State`, consists of all modified state that has since been saved. It lives on the `ViewSpecificState`. Saving modified state merges it into the existing loaded state. `State` and `ModifiedState` are almost identical in what they contain, but are separate types to make it clearer which contains all state vs. just modifications.

This pattern is carried over to the `Metadata` stored on the `DSCView`. The initial state is stored under its own metadata key, and each modified state is stored under a key with an incrementing number. This means each save of the state only needs to serialize the state that changed, rather than reserializing all of the state every time.

There are two huge benefits from these changes:

1. At no point does `SharedCache` have to copy its in-memory state. The basic copy-on-write approach introduced in [SharedCache] Use basic copy-on-write for viewStateCache #6129 reduced how often these copies are made, but they're still frequent and very expensive. This eliminates the 🐄.
2. At no point does `SharedCache` have to re-serialize state to JSON that it has already serialized. JSON serialization previously added hundreds of milliseconds to any mutating operation on `SharedCache`.

As a result, this speeds up the initial load of the shared cache by around 2x, and loading of subsequent images improves by about the same.
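The address-range-keyed map described above can be sketched as follows. The type and field names here are illustrative rather than the plugin's actual ones (the real `MemoryRegion` carries much more state); the point is that an ordered map keyed by start address gives O(log n) containing-region lookups:

```cpp
#include <cstdint>
#include <map>
#include <string>

// Illustrative stand-in for the plugin's MemoryRegion.
struct MemoryRegion {
    std::string name;
    bool loaded = false;
};

// Key regions by their [start, end) address range. Ordering by start address
// lets point lookups use upper_bound.
struct AddressRange {
    uint64_t start;
    uint64_t end;  // exclusive
    bool operator<(const AddressRange& other) const { return start < other.start; }
};

class MemoryRegionMap {
public:
    void Add(AddressRange range, MemoryRegion region) {
        m_regions.emplace(range, std::move(region));
    }

    // Find the region containing `address`, if any: take the first range that
    // starts strictly after `address`, step back one, and check containment.
    const MemoryRegion* Find(uint64_t address) const {
        auto it = m_regions.upper_bound(AddressRange{address, 0});
        if (it == m_regions.begin())
            return nullptr;
        --it;
        if (address >= it->first.start && address < it->first.end)
            return &it->second;
        return nullptr;
    }

private:
    std::map<AddressRange, MemoryRegion> m_regions;
};
```

Because the map is keyed by the (immutable) address range while mutable flags live in the value, regions can be looked up cheaply without the lookup structure itself needing to change as regions are loaded.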
One trade-off is that the serialization / deserialization logic is more complicated. There are two reasons for this:

1. The state is now split across multiple metadata keys and needs to be merged when it is loaded.
2. The in-memory representation uses pointers to identify memory regions. These relationships have to be re-established after the JSON is deserialized.
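The merge-on-load half of this can be pictured with a minimal sketch. The structures and key scheme below are invented for illustration (the plugin's `State` / `ModifiedState` hold far more than an image map); the idea is simply that each save wrote only a delta, so loading replays the deltas in order:

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Illustrative stand-ins. In the plugin, State and ModifiedState are nearly
// identical in content but deliberately separate types; a type alias is used
// here only to keep the sketch short.
struct State {
    std::map<std::string, uint64_t> loadedImages;  // image name -> header address
};
using ModifiedState = State;

// Each save stored one delta under an incrementing metadata key
// ("state_0", "state_1", ...). Loading merges every delta, in save order,
// into a single State; later saves win on conflicting entries.
State LoadState(const std::vector<ModifiedState>& savedDeltas) {
    State merged;
    for (const ModifiedState& delta : savedDeltas) {
        for (const auto& [name, addr] : delta.loadedImages)
            merged.loadedImages[name] = addr;
    }
    return merged;
}
```

The second complication (re-establishing pointer relationships between deserialized memory regions) happens after a merge like this, once every region object has its final address in memory.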
As a future direction, it is worth considering whether the logic owned by `SharedCache` could be split in a similar manner to the data. The initial loading of the cache header, the loading of images, and the handling of symbol information are all mostly independent and work on separate data. If the logic were split into separate classes, it would be easier to reason about which data is valid when, and it would easily permit concurrent loading of multiple images from the shared cache in a thread-safe manner.
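As a rough design sketch of that split (every name here is hypothetical, not the plugin's code): because `CacheInfo` is immutable after the initial load, per-image loaders can read it from many threads without locking, and only publishing results needs synchronization:

```cpp
#include <cstddef>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical immutable output of the header-loading phase.
struct CacheInfo {
    std::vector<std::string> imageNames;  // fixed after the initial load
};

struct LoadedImage {
    std::string name;
};

// Owns only what it needs to load one image; reads the shared CacheInfo
// without mutation, so instances are safe to run concurrently.
class ImageLoader {
public:
    explicit ImageLoader(const CacheInfo& info) : m_info(info) {}
    LoadedImage Load(std::size_t index) const { return {m_info.imageNames[index]}; }

private:
    const CacheInfo& m_info;
};

// Load every image on its own thread, merging results under a single lock.
std::vector<LoadedImage> LoadAllImages(const CacheInfo& info) {
    ImageLoader loader(info);
    std::vector<LoadedImage> results;
    std::mutex resultsLock;
    std::vector<std::thread> threads;
    for (std::size_t i = 0; i < info.imageNames.size(); ++i) {
        threads.emplace_back([&, i] {
            LoadedImage image = loader.Load(i);
            std::lock_guard<std::mutex> guard(resultsLock);
            results.push_back(std::move(image));
        });
    }
    for (std::thread& t : threads)
        t.join();
    return results;
}
```

The lock here guards only the result collection, not the loading itself, which is the property the proposed class split would make easy to reason about.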