Add fallback when `resize!` does not work #31
Codecov Report

```
@@            Coverage Diff             @@
##           master      #31      +/-   ##
==========================================
- Coverage   98.59%   97.63%   -0.97%
==========================================
  Files           7        7
  Lines         142      211      +69
==========================================
+ Hits          140      206      +66
- Misses          2        5       +3
```

Continue to review full report at Codecov.
* fix typo in identitymap
* generate undef Vectors
* tests: clean up and add new
@cako, thanks for working on this. This is not a final opinion, but one thing I don't like about the try/catch block is that it may silently fall through to the catch branch when there is something wrong with the code that the user would want to (or should) know about. Hardcoding error catching like this has a somewhat unpleasant flavor. I acknowledge, though, the following (mis-)behavior: … fails for the same reason.
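The concern about a blanket catch can be made concrete with a small sketch (the function name and shapes here are hypothetical, not from the package): an unrelated bug inside the `try` body, such as a typo, is silently converted into the fallback allocation instead of surfacing as an error.

```julia
# Hypothetical illustration: a blanket catch swallows an unrelated bug.
function fill_dest_blanket(A::Matrix{T}) where {T}
    dest = T[]
    try
        resize!(dest, sizee(A, 1))  # typo: `sizee` should be `size` -> UndefVarError
    catch
        dest = Vector{T}(undef, size(A, 1))  # taken silently; the typo never surfaces
    end
    return dest
end

fill_dest_blanket(zeros(3, 3))  # returns a length-3 vector with no hint of the bug
```

The caller gets a plausible-looking result, so the typo can survive unnoticed through refactorings, which is exactly the failure mode being described.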
I think something like this can detect whether an `Array` has shared data:

```julia
_flags(A::Array) = unsafe_load(convert(Ptr{UInt16}, pointer_from_objref(A)), 9)
_isshared(A::Array) = !(_flags(A) & 0x4000 == 0x0000)
```

I am, however, a bit worried whether this depends on the architecture (i.e. 32 vs 64 bits); in particular, the value 9 in the first function will probably have to be different on 32 bit. So it probably needs to be

```julia
if Sys.WORD_SIZE == 64
    _flags(A::Array) = unsafe_load(convert(Ptr{UInt16}, pointer_from_objref(A)), 9)
elseif Sys.WORD_SIZE == 32
    _flags(A::Array) = unsafe_load(convert(Ptr{UInt16}, pointer_from_objref(A)), 5)
else
    error("Unknown word size")
end

_isshared(A::Array) = !(_flags(A) & 0x4000 == 0x0000)
```

But I am not sure, and I don't have a 32-bit machine to test this.
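For context, a minimal self-contained version of this flag check (the 64-bit variant copied from the snippet above) can be exercised as follows. Note this is internals-dependent: the offset 9 and the `0x4000` bit are assumptions tied to the internal `Array` layout of Julia versions around the time of this thread, and `reshape` is one way to make a buffer shared.

```julia
# Internals-dependent sketch: reads the Array header flags directly.
# The offset (9) and the shared bit (0x4000) are version-specific assumptions.
_flags(A::Array) = unsafe_load(convert(Ptr{UInt16}, pointer_from_objref(A)), 9)
_isshared(A::Array) = !(_flags(A) & 0x4000 == 0x0000)

a = zeros(4)
_isshared(a)          # expected false for a fresh Array (on matching Julia versions)
b = reshape(a, 2, 2)  # shares (and marks as shared) a's buffer
_isshared(a)          # expected true after reshaping (on matching Julia versions)
```

On other Julia versions the functions still return a `UInt16` and a `Bool`, but the values read are not meaningful, which is precisely the fragility being discussed.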
That sounds like a plan: detecting the specific error.
Another way out is that we pass …
Thanks for the comments, I think we are making progress! @dkarrasch I think in general it is definitely better if we can use something other than try/catch, but in this case it might be warranted. See the end of this comment for my reasoning!

@Jutho, let me see if I got this right: every Julia … `convert(Ptr{UInt16}, pointer_from_objref(A))` … Now, this pointer is the memory location of … This is where I start getting confused: in 32 bits, …

With regard to speed, I ran your solution against the previous benchmarks, and this is what I got: … The conclusion is that both try/catch and flag-checking have virtually the same performance as a bare …

In terms of "cleanness", although it is a matter of opinion, I quite like the try/catch block. Although in general all-inclusive try/catch blocks are not great, as @dkarrasch has pointed out, in this case I think it makes sense to use one. The alternative solution provided by @Jutho hit a snag, as pointed out by @chethega in issue 24909 (namely, we don't know if it is possible to estimate …).

Let me know what you think. On a separate note, if you'd like, I can start writing my example as a test and add the commit to this pull request.
Regardless of how we're going to fix this issue, I think adding the test would be good. This is a generic problem that we would want to be able to solve. Unfortunately, there is not much to be learned from the implementation of …
@cako, the numbers 5 and 9 are measured in units of … But I agree that this approach is too fragile and clearly advised against. The …
@dkarrasch cool, I will see if I can clean up/simplify the test and commit it. As for the benchmarks, you can find them here. Basically, I write functions which allocate a size-1_000 array and then resize, with an optional try/catch or flag test. I benchmark that function, then benchmark only allocating the size-1_000 array, and subtract the latter from the former. That gives the benchmarks I used for the graphs above. I've run it a few times, and I reliably get slightly worse performance for small arrays (<1000) and essentially the same performance for larger arrays.

@Jutho and @dkarrasch Perhaps a compromise is to catch only the shared-data error:

```julia
try
    resize!(dest, size(A.maps[n], 1))
catch err
    if err == ErrorException("cannot resize array with shared data") # use `==` instead of `isa` because `resize!` throws a general exception
        dest = Array{T}(undef, size(A.maps[n], 1))
    else
        rethrow(err)
    end
end
```
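As a usage sketch of this compromise (the wrapper name `resize_or_alloc!` and the `reshape` trigger are mine, not from the PR), the specific shared-data error is caught while any other error still propagates:

```julia
# Hypothetical wrapper around the pattern above: resize when possible,
# allocate a fresh vector only on the specific shared-data error.
function resize_or_alloc!(dest::Vector{T}, n::Integer) where {T}
    try
        resize!(dest, n)
    catch err
        if err == ErrorException("cannot resize array with shared data")
            dest = Vector{T}(undef, n)
        else
            rethrow(err)
        end
    end
    return dest
end

a = collect(1.0:4.0)
reshape(a, 2, 2)                # shares a's buffer; resize! may now refuse to grow a
length(resize_or_alloc!(a, 8))  # 8 (either resized in place or freshly allocated)
```

Any unrelated exception, say a `MethodError` from a bad argument, reaches the caller unchanged via `rethrow`, which addresses the earlier concern about silent catches.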
Yes, I think that's a good compromise. In my own code (unrelated to this package), I used try/catch blocks (perhaps in a bad way) and had difficulty finding mistakes after changes in the code, and just got …
Thanks @cako for the nice tests. I made some minor changes. The transpose/adjoint of a CompositeMap is a CompositeMap again, and therefore gets handled by the …
Since GitHub still states "This branch has no conflicts with the base branch", I think this can still safely be merged into master and all edits (commits) will survive. But we can also try to rebase; this can be done even on GitHub, if you fold out the "Squash and merge" drop-down menu.
I wasn't sure what is meant by "base branch" — whether it's our current master or the one where @cako branched off. I'll simply squash, merge, and tag a new fix release 2.2.2 then.
Alternatively, @cako could rebase locally on the latest master and then push (maybe `push -f` for a forced push) here.
@dkarrasch Thanks for fixing the transpose/adjoints, I couldn't figure out how to do it! I can do the rebase locally, but only later today! |
No worries. It's a bit tricky (and in fact, I believed up until recently that those methods were not needed at all ;-) ). The rebase is mainly to have all tests together, let code coverage run again, and get good grades.
I have been thinking about issue #24 (using `resize!` within the composition operator) and have come to the conclusion that users should not be expected to always provide "clean" arrays which can be resized. Take my case, for example: I had to essentially change my function by adding a couple of `copy` commands to ensure `resize!` was working. I was only able to do this after guidance. In fact, I have a more complex example (which I won't share because it is pretty long and specific) that I simply cannot get to work with the current code. Copying data both at the input and at the output does not work. I have another (somewhat contrived) example that demonstrates how this can happen. It can also be used for code coverage if the commit is accepted.

So either we accept that some people's code will mess with the array in such a way that it cannot be resized, or we make them accept that the library is not for them. I don't think the latter is beneficial for anyone!
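The kind of `copy`-based workaround described above can be sketched as follows (the function names and shapes are hypothetical, not from the issue): a user function that returns a reshaped view hands back an array whose buffer is shared, while inserting a `copy` yields a "clean" array that `resize!` can always grow.

```julia
# Hypothetical sketch of the workaround: a map function returning a
# reshaped view hands back an array with a shared buffer, while adding
# a `copy` yields a "clean" array whose data is private.
f_shared(x::Matrix) = reshape(x, length(x))        # output may share x's buffer
f_clean(x::Matrix)  = copy(reshape(x, length(x)))  # output owns its buffer

y = f_clean(zeros(2, 2))
resize!(y, 8)   # safe: y's data is a private copy
length(y)       # 8
```

The cost of this workaround is exactly the extra allocation and copy that the proposed fallback would only pay when resizing actually fails.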
On the other hand, I fully appreciate that `resize!` is much faster than array creation. I benchmarked resizing an array of size `1000` to several different sizes, ranging from `100` to `100_000`. The cost of only the resizing is shown in orange, while the cost of allocating from scratch is shown in green. Clearly, allocating from scratch grows linearly, while resizing may eventually be linear as well, but it is offset by how much has already been allocated.
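The comparison above can be sketched roughly as follows (the sizes and the single-shot `@elapsed` timing are mine; the actual benchmarks used repeated runs and subtracted the setup allocation cost, so these one-off numbers are only indicative):

```julia
# Rough sketch of the comparison: grow/shrink an existing length-1_000
# array versus allocating the target size from scratch. Single-shot
# @elapsed timings are noisy; treat them as indicative only.
function cost_resize(n::Int)
    a = zeros(1_000)
    return @elapsed resize!(a, n)
end

cost_alloc(n::Int) = @elapsed Vector{Float64}(undef, n)

for n in (100, 1_000, 10_000, 100_000)
    println(n, "  resize: ", cost_resize(n), " s   alloc: ", cost_alloc(n), " s")
end
```

For careful numbers one would use BenchmarkTools.jl rather than `@elapsed`, which is presumably what the plotted results were based on.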
So I propose a very simple solution that I think will make everyone happy: we resize when we can, and when we can't, we allocate. I have added a simple try/catch block to do this. As shown in the figure above, this "fix" (blue line) retains decent performance for larger arrays and is still much better than allocating for smaller arrays. It also has the obvious benefit of not breaking anyone's code!

The commits are very simple: they wrap all `resize!` calls in a try/catch block which, in the catch branch, allocates a new array.