-
Notifications
You must be signed in to change notification settings - Fork 90
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: structure removal order for unions #3397
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pfackeldey - Great! Thanks for providing the fix so quickly! Indeed, iterating over the unique index fixes the issue.
this PR does however make the assumption that the number of unique tags corresponds to the |
@pfackeldey just reading the comments and not the diff -- import awkward as ak
import numpy as np
x = ak.Array([1, "hi", 2, "bye"])
y = x[[0,2]]
assert np.unique(y.layout.tags).size == len(y.layout.contents) |
Ah I see. In this case this implementation is still working correct though: # with this PR
x = ak.Array([1, "hi", 2, "bye"])
y = x[[0,2]]
ak.ravel(y)
# <Array [1, 2] type='2 * int64'> on main this actually fails, which I would consider wrong aswell (so this PR might fix another bug aswell?): x = ak.Array([1, "hi", 2, "bye"])
y = x[[0,2]]
ak.ravel(y)
# ... > AssertionError: cannot merge NumpyArray with ListOffsetArray The loop over the unique ordered tags will pick (in order) the contents where tags point to. This is always less or equal to the length of contents. Do you see any problem with that @agoose77 ? |
@pfackeldey the original code has two bugs (which are the same kind of bug)! This PR fixes one, but not the other I think. The intention of For trivial-unions (e.g. a union of integer and bool), the proper way to flatten the layout in an order-preserving way is to write a kernel that fills a new buffer with Moreover, as we are dealing with a union, to remove the structure probably means we want to simplify a union over the results such that the union disappears and the result is simplified to a common type. So, maybe the proper way to do this is to build a new union via But, I also feel compelled to say -- |
ff36fdf
to
d5b2d9c
Compare
@pfackeldey and @agoose77 - please, check when you have time. I'm sure, it's not the most efficient implementation - I just want to make sure it is correct and I'm not missing anything. Thanks! |
…kit-hep/awkward into pfackeldey/fix_ravel_order_uniontags
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is the right direction 👍, but the code won't be compatible with typetracers as is right now. I'm not sure though how to fix this properly - I'll give this some thoughts...
We should make sure that the following works (this is on main):
import awkward as ak
a = ak.Array([[1, 2, 3], 4, [5, 6], 7, 8], backend="typetracer")
ak.ravel(a)
# <Array-typetracer [...] type='## * int64'>
@pfackeldey - #3409 should fix the failing tests. It will be auto-merged soon. This PR will need to be re-based. |
Fixes: #3180