@jorisvandenbossche and I have been looking at Cythonizing some of GeoPandas. This has resulted in us having an object that holds onto a numpy array of pointers to C-level Geometry objects. We would maybe like to include this numpy-like object as a column in a Pandas dataframe. However, we would still like to hold onto the object itself, rather than have these pointers just join an integer block in the block manager. Keeping the object is useful for things like garbage collection on the C side, odd indexing rules, etc.
My intuition says that putting a numpy-like object into a Pandas dataframe without it being coerced into part of a numpy array is probably not feasible with present-day Pandas, but I thought I'd check first just in case. We have backup plans if this isn't feasible, so it's not a big deal either way.
AFAICT this would require implementing a new Block type and patching BlockManager to recognize it. I've been toying with something similar but have shied away from it as being "too internal".
@gfyoung where does this sort of thing lie on a scale of "What's The Worst That Could Happen" to "You're Gonna Have A Bad Time"?
@jbrockmendel : Working with internals is not going to be a cakewalk. That being said, you're not developing in production, so "what's the worst that could happen?" 😄
I wouldn't worry about it being "too internal" - as long as you can surface it in the end with the desired behavior, that's all that counts.
Let's close this in favor of #17144 (allow external Blocks), as I think that is the only option for including non-numpy arrays in a DataFrame (and it is the one we are using now in geopandas).