ARROW-236: Bridging IO interfaces under the hood in pyarrow #104
…ow::io::RandomAccessFile Change-Id: I7982556541a60ca03b3064a333b207fd45e323c3
Change-Id: I86e43b42582276302332eb3c61afffd6f7187c40
@@ -99,7 +106,7 @@ class ARROW_EXPORT FileReader {
   virtual ~FileReader();

  private:
-  class Impl;
+  class PARQUET_NO_EXPORT Impl;
This should be ARROW_NO_EXPORT?
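For context on the macro question above, here is a minimal sketch of the pimpl-plus-visibility pattern the diff touches. The `SKETCH_EXPORT` and `SKETCH_NO_EXPORT` macros are hypothetical stand-ins for a library's own export macros (they are not the real Arrow or Parquet definitions), the Windows branch is deliberately simplified to empty macros, and the `num_columns` member is invented purely for illustration.

```cpp
#include <memory>

// Hypothetical stand-ins for a library's export macros (assumption, not the
// actual Arrow/Parquet definitions). Simplified: empty on Windows.
#if defined(_WIN32)
#define SKETCH_EXPORT
#define SKETCH_NO_EXPORT
#else
#define SKETCH_EXPORT __attribute__((visibility("default")))
#define SKETCH_NO_EXPORT __attribute__((visibility("hidden")))
#endif

// Public class: its symbols are visible to users of the shared library.
class SKETCH_EXPORT FileReader {
 public:
  explicit FileReader(int num_columns);
  ~FileReader();

  int num_columns() const;

 private:
  // The implementation class is marked hidden so that its symbols are not
  // exported alongside the public class.
  class SKETCH_NO_EXPORT Impl;
  std::unique_ptr<Impl> impl_;
};

// Normally this definition would live in the .cc file only.
class FileReader::Impl {
 public:
  explicit Impl(int num_columns) : num_columns_(num_columns) {}
  int num_columns() const { return num_columns_; }

 private:
  int num_columns_;
};

FileReader::FileReader(int num_columns) : impl_(new Impl(num_columns)) {}
// Defined after Impl is complete so unique_ptr can destroy it.
FileReader::~FileReader() = default;
int FileReader::num_columns() const { return impl_->num_columns(); }
```

Presumably the reviewer's point is that the hidden-visibility macro should come from the same library that exports the enclosing class, so the two macros stay consistent.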
…apping C++ file interfaces Change-Id: I4a3d0c4d2a763abb02ca546df35b9556f1060c0e
Change-Id: I3f6329e159431df781959d6266b4e016e4f6fa2c
Miraculously, I was able to get this working mere minutes before my talk at Data Science Summit. Let me know any comments on the general approach.
Change-Id: I8e3a1f90907357d138d875b2761a7833b069b86f
The general approach looks good, +1.
Thanks -- I'll get the build passing and merge. I am not able to read very many flat Parquet files right now (for example, Impala's Parquet files do not have the UTF8 annotation for strings, and timestamps are stored as Int96), so I will create a bunch of JIRAs to track these. In the absence of a separate metadata store (like the Hive metastore), we'll have to make some default guesses about the actual schema and eventually provide some options to set the column logical types explicitly when there is ambiguity.
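To illustrate the Int96 timestamp case mentioned above, here is a rough sketch of one way such a value could be decoded. It assumes the layout commonly written by Impala and Hive: 8 little-endian bytes of nanoseconds within the day, followed by 4 little-endian bytes holding the Julian day number. The helper name `Int96ToUnixNanos` is hypothetical and is not part of this PR or of any library API.

```cpp
#include <cstdint>

// Julian day number corresponding to the Unix epoch (1970-01-01).
constexpr int64_t kJulianEpochDay = 2440588;
constexpr int64_t kNanosPerDay = 86400LL * 1000 * 1000 * 1000;

// Hypothetical helper (assumption): `bytes` points at a 12-byte Int96 value
// laid out as 8 bytes nanoseconds-of-day, then 4 bytes Julian day, both
// little-endian. Returns nanoseconds since the Unix epoch.
inline int64_t Int96ToUnixNanos(const uint8_t* bytes) {
  // Assemble the fields byte by byte so the result does not depend on the
  // endianness of the host machine.
  uint64_t nanos_of_day = 0;
  for (int i = 7; i >= 0; --i) {
    nanos_of_day = (nanos_of_day << 8) | bytes[i];
  }
  uint32_t julian_day = 0;
  for (int i = 11; i >= 8; --i) {
    julian_day = (julian_day << 8) | bytes[i];
  }
  const int64_t days_since_epoch =
      static_cast<int64_t>(julian_day) - kJulianEpochDay;
  return days_since_epoch * kNanosPerDay + static_cast<int64_t>(nanos_of_day);
}
```

Because Int96 carries no logical-type annotation, a reader still has to decide (by default guess or by a user-supplied option) that a given Int96 column should be interpreted this way.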
Change-Id: I2df54d0dc25457055011cd8a2b798fc28b1640d1
+1
…Bytes of ArrowBuf (apache#104) We have BOUNDS_CHECKING_SKIP in ArrowBuf.setByte and ArrowBuf.getByte, which helps remove unnecessary bounds checks. However, it doesn't exist in ArrowBuf.setBytes or ArrowBuf.getBytes, so bounds checking costs about 10% of CPU time in our environment. Closes apache#13161 from jackylee-ch/skip_bounds_check_for_set_or_get_bytes Authored-by: stczwd <[email protected]> Signed-off-by: David Li <[email protected]> Co-authored-by: stczwd <[email protected]>