Support for ArrayOps functionality? #39
Comments
Unfortunately that would trigger boxing twice on every operation and can cause up to an order of magnitude slow-down. Our current strategy is to re-implement all of the methods people care about ourselves. What methods are you missing? |
The most important ones would be (in order of importance):
Thank you! |
Hi Denys, I can take this into a branch and work on it in parallel while we flesh out the remaining jemalloc implementation. |
@arosenberger Sounds good. Implementation of the current array macros lies in macros/Array.scala. There isn't much documentation on internals yet so let me know if something is not clear. |
@densh I'm unsure if there is a way to do the filter efficiently from avoiding either
Could be I'm missing something - what do you think? |
@arosenberger
I think the rule of thumb for scala-offheap should be: if we can't implement something efficiently, we should not have it at all, rather than provide a slow default implementation as the standard Scala collections do. |
I've got a close to working implementation of filter that does a malloc with the length of the original array, adds elements as they pass the predicate, then a final realloc to the actual filtered element count. We can run some jmh benchmarks on different array sizes vs some baselines like on-heap array filter or e.g. offheap map. We can throw it away if it looks to be too slow. |
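The over-allocate-then-shrink approach described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the implementation from the branch: it uses `sun.misc.Unsafe` as a stand-in allocator rather than scala-offheap's own allocator API, it handles `Int` elements only, and `OffheapFilterSketch`, `IntBlock`, `fromArray`, `toArray` and `free` are hypothetical names introduced for the example.

```scala
import sun.misc.Unsafe

object OffheapFilterSketch {
  // Assumption: sun.misc.Unsafe stands in for a real off-heap allocator.
  private val unsafe: Unsafe = {
    val field = classOf[Unsafe].getDeclaredField("theUnsafe")
    field.setAccessible(true)
    field.get(null).asInstanceOf[Unsafe]
  }

  // Size of one Int element in bytes.
  private val IntBytes = 4L

  /** Hypothetical handle: address of the first element plus the element count. */
  final case class IntBlock(addr: Long, length: Int)

  /** Copies an on-heap array into freshly allocated off-heap memory. */
  def fromArray(xs: Array[Int]): IntBlock = {
    val addr = unsafe.allocateMemory(math.max(1L, xs.length * IntBytes))
    var i = 0
    while (i < xs.length) { unsafe.putInt(addr + i * IntBytes, xs(i)); i += 1 }
    IntBlock(addr, xs.length)
  }

  /** Copies an off-heap block back on heap, for inspection. */
  def toArray(b: IntBlock): Array[Int] =
    Array.tabulate(b.length)(i => unsafe.getInt(b.addr + i * IntBytes))

  def filter(src: IntBlock)(p: Int => Boolean): IntBlock = {
    // Over-allocate for the worst case in which every element passes.
    val tmp = unsafe.allocateMemory(math.max(1L, src.length * IntBytes))
    var kept = 0
    var i = 0
    while (i < src.length) {
      val x = unsafe.getInt(src.addr + i * IntBytes)
      if (p(x)) { unsafe.putInt(tmp + kept * IntBytes, x); kept += 1 }
      i += 1
    }
    // Shrink to the filtered size. The size argument is in bytes, not elements.
    val addr = unsafe.reallocateMemory(tmp, math.max(1L, kept * IntBytes))
    IntBlock(addr, kept)
  }

  def free(b: IntBlock): Unit = unsafe.freeMemory(b.addr)
}
```

The predicate runs exactly once per element and the data is copied once, at the cost of temporarily holding a buffer as large as the source. Whether the final shrink is cheap depends on the allocator, which is the kind of question the proposed jmh benchmarks could answer.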
The problem is that it needs to work for any allocator (that is taken as an implicit parameter to all allocating methods), not just Looking forward to jmh performance comparison anyway, maybe it's not that bad in practice and the cost can be tolerated. |
One trick can speed this up: allocate an array of longs holding one bit per element (the original array length divided by 64, rounded up) and set a bit whenever the predicate passes. Why longs? Because you can collect a batch of 64 predicate results and make a single write. Additionally, maintain a counter of how many tests have succeeded. Then run a second pass that copies the elements whose bit is set. You could use an array of Booleans instead and rely on write combining inside the CPU, but because you execute a custom user-supplied predicate, it is very easy for writes to stop being combined. |
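A minimal sketch of the two-pass bitmask idea, for illustration only. It works on ordinary on-heap `Array[Int]` values so that it stays self-contained; an off-heap version would allocate the result through the implicit allocator instead. `BitmaskFilterSketch` is a hypothetical name, not part of scala-offheap.

```scala
object BitmaskFilterSketch {
  def filter(src: Array[Int])(p: Int => Boolean): Array[Int] = {
    val n = src.length
    // One bit per element, rounded up to whole 64-bit words.
    val mask = new Array[Long]((n + 63) / 64)
    var count = 0

    // Pass 1: evaluate the predicate once per element, batching 64 results
    // into a local Long so that each word is written to memory only once.
    var word = 0
    while (word < mask.length) {
      val base = word * 64
      val end = math.min(base + 64, n)
      var bits = 0L
      var i = base
      while (i < end) {
        if (p(src(i))) { bits |= 1L << (i - base); count += 1 }
        i += 1
      }
      mask(word) = bits
      word += 1
    }

    // Pass 2: the exact result size is now known, so allocate once
    // and copy only the elements whose bit is set.
    val out = new Array[Int](count)
    var j = 0
    word = 0
    while (word < mask.length) {
      var bits = mask(word)
      val base = word * 64
      while (bits != 0L) {
        val bit = java.lang.Long.numberOfTrailingZeros(bits)
        out(j) = src(base + bit)
        j += 1
        bits &= bits - 1 // clear the lowest set bit
      }
      word += 1
    }
    out
  }
}
```

Because the counter gives the final length before any element is copied, this variant needs a single exactly-sized allocation and no reallocation, so it does not rely on any particular allocator supporting an efficient shrink.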
@DarkDimius That's an excellent idea, we need to try that. |
@DarkDimius Thanks for the suggestion, I will try that as a variant |
@arosenberger It looks like you might be forgetting to call |
@densh I cleaned this up some and got a test mostly passing. But I'm unsure as to what I'm doing wrong for the final array size/length. It is possible that the realloc is not actually shrinking the allocated memory, or more likely I've not set things correctly in the macro body. I'll add some debug code to the jni binding on my other machine and test. Gist is updated, and this test produces this output test("filter") { Element 2 matches predicate |
@arosenberger Reallocate takes a number of bytes, not elements. You need to multiply the computed element count by the sizeOf of the element type. |
@densh I believe I'm doing so. This is the size argument I'm passing to reallocate: ${newArrayFinalSize.symbol} = |
@densh First pull request opened for filter. Going to switch to other requested array methods before trying to optimize filter implementation |
I'm splitting this ticket into a number of smaller actionable tickets: |
Is there the possibility / an easy way of supporting the various methods from ArrayOps?