You're right that we don't expose this very well (i.e., at all) via Arrow.Table right now. Using Arrow.Stream, however, gives you back an iterator that yields one Arrow.Table per record batch. We could probably also expose a way via Arrow.Table to get at the individual tables. Something to think about, or at the very least we should improve the docs to mention Arrow.Stream for this use case.
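For anyone landing here, a minimal sketch of the Arrow.Stream approach, assuming Arrow.jl and Tables.jl are installed. The column names, the sample data, and the temporary file path are made up for illustration; only `Arrow.write`, `Tables.partitioner`, and `Arrow.Stream` come from the discussion above.

```julia
using Arrow, Tables

# Two hypothetical partitions with the same schema; each one should be
# written as its own record batch.
parts = [
    (a = [1, 2], b = ["x", "y"]),
    (a = [3, 4, 5], b = ["z", "w", "v"]),
]

# Illustrative temporary path, not taken from the original report.
path = tempname() * ".arrow"
Arrow.write(path, Tables.partitioner(parts))

# Arrow.Stream iterates the file one record batch at a time,
# yielding an Arrow.Table for each batch.
batch_lengths = Int[]
for tbl in Arrow.Stream(path)
    push!(batch_lengths, length(tbl.a))
end

# Arrow.Table, by contrast, presents all batches as one combined table.
full = Arrow.Table(path)
total_rows = length(full.a)
```

Note that this walks the batches sequentially. It does not give random access to the n-th batch the way the Java reader linked in the original post does.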
We don't have to do what the Python implementation does; that's specific to Python. A record batch is a well-defined thing in the file format, independent of which implementation we're talking about. It's purely a question of how we get to a given batch from the schema and metadata, and of what the interface for the user should be.
Maybe related to #353
It is already possible to use Tables.partitioner to write multiple record batches to a single Arrow file. However, when I read that file with Arrow.Table, I do not know how to access a specific record batch, as shown here: https://arrow.apache.org/docs/java/ipc.html#writing-and-reading-random-access-files
According to the docs, this should be possible, but I am not sure whether it is not implemented yet or simply not documented.