ARROW-236: Bridging IO interfaces under the hood in pyarrow #104

wesm · 2016-07-12T07:50:44Z

No description provided.


xhochy · 2016-07-12T19:42:36Z

cpp/src/arrow/parquet/reader.h

@@ -99,7 +106,7 @@ class ARROW_EXPORT FileReader {
  virtual ~FileReader();

 private:
-  class Impl;
+  class PARQUET_NO_EXPORT Impl;


This should be ARROW_NO_EXPORT?


wesm · 2016-07-12T23:21:43Z

Miraculously, I was able to get this working mere minutes before my talk at Data Science Summit. Let me know any comments on the general approach.


xhochy · 2016-07-13T21:43:09Z

The general approach looks good, +1.

wesm · 2016-07-15T14:24:19Z

Thanks -- I'll get the build passing and merge. I can't yet read many flat Parquet files (for example, Impala's Parquet files lack the UTF8 annotation for strings, and timestamps are stored as Int96), so I'll create a bunch of JIRAs to track these issues.
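To illustrate the Int96 issue mentioned above, here is a minimal sketch of decoding one such value in plain Python. It assumes the layout commonly used by Impala and Hive writers: 8 little-endian bytes of nanoseconds within the day, followed by 4 little-endian bytes holding the Julian day number. The function name `int96_to_datetime` is hypothetical and is not part of pyarrow or parquet-cpp. The result is a naive `datetime`, because how the value relates to a time zone depends on the writer.

```python
import struct
from datetime import datetime, timedelta

# Julian day number that corresponds to 1970-01-01.
JULIAN_DAY_OF_UNIX_EPOCH = 2440588


def int96_to_datetime(raw: bytes) -> datetime:
    """Decode a 12-byte Int96 timestamp (assumed Impala/Hive layout)."""
    if len(raw) != 12:
        raise ValueError("expected exactly 12 bytes, got %d" % len(raw))
    # "<qi": little-endian int64 (nanoseconds of day), then int32 (Julian day).
    nanos_of_day, julian_day = struct.unpack("<qi", raw)
    days_since_epoch = julian_day - JULIAN_DAY_OF_UNIX_EPOCH
    # datetime only has microsecond resolution, so sub-microsecond
    # precision is truncated here.
    return datetime(1970, 1, 1) + timedelta(
        days=days_since_epoch, microseconds=nanos_of_day // 1000
    )
```

A reader that wants full nanosecond precision would need a wider target type than `datetime`; this sketch only shows the byte layout.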

In the absence of separate metadata (like the Hive metastore), we'll have to make some default guesses about the actual schema and eventually provide options to set column logical types explicitly when there is ambiguity.
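One way to picture "default guesses plus explicit options" is the sketch below. It is only an illustration under stated assumptions: the guess table, the type names, and `resolve_logical_types` are all hypothetical and do not reflect an actual pyarrow API. Each column is described by its Parquet physical type and its logical annotation (or `None` when the writer left it out), and user-supplied overrides take precedence over the defaults.

```python
from typing import Dict, Optional, Tuple

# (physical type, logical annotation or None) for a single column.
ColumnInfo = Tuple[str, Optional[str]]

# Hypothetical default guesses; a real reader could choose differently.
DEFAULT_GUESSES: Dict[ColumnInfo, str] = {
    ("BYTE_ARRAY", "UTF8"): "string",
    ("BYTE_ARRAY", None): "binary",   # ambiguous: could also be a string
    ("INT96", None): "timestamp",     # ambiguous: assumed to be a timestamp
}


def resolve_logical_types(
    columns: Dict[str, ColumnInfo],
    overrides: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Pick a logical type per column, letting explicit overrides win."""
    overrides = overrides or {}
    unknown = set(overrides) - set(columns)
    if unknown:
        raise KeyError("overrides name unknown columns: %s" % sorted(unknown))
    resolved = {}
    for name, info in columns.items():
        if name in overrides:
            resolved[name] = overrides[name]
        else:
            # Fall back to the lower-cased physical type when no guess exists.
            resolved[name] = DEFAULT_GUESSES.get(info, info[0].lower())
    return resolved
```

For an Impala-style file where a string column lacks the UTF8 annotation, a caller could pass `overrides={"name": "string"}` to resolve the ambiguity for a column called `name` while leaving the other columns to the defaults.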


wesm · 2016-07-18T22:37:05Z

+1

…Bytes of ArrowBuf (apache#104)

ArrowBuf.setByte and ArrowBuf.getByte honor BOUNDS_CHECKING_SKIP, which helps to remove unneeded bounds checks. ArrowBuf.setBytes and ArrowBuf.getBytes do not, which costs about 10% of CPU time on bounds checking in our environment.

Closes apache#13161 from jackylee-ch/skip_bounds_check_for_set_or_get_bytes

Authored-by: stczwd
Signed-off-by: David Li
Co-authored-by: stczwd

wesm added 2 commits July 11, 2016 23:54

Implement alternate ctor to construct parquet::FileReader from an arrow::io::RandomAccessFile

e6724de

Change-Id: I7982556541a60ca03b3064a333b207fd45e323c3

Provide a means to expose abstract native file handles

c7a913e

Change-Id: I86e43b42582276302332eb3c61afffd6f7187c40

xhochy reviewed Jul 12, 2016

wesm added 2 commits July 12, 2016 13:52

Slight refactoring of read table to be able to also handle classes wrapping C++ file interfaces

06ddd06

Change-Id: I4a3d0c4d2a763abb02ca546df35b9556f1060c0e

Barely working direct HDFS-Parquet reads

9b9d94d

Change-Id: I3f6329e159431df781959d6266b4e016e4f6fa2c

wesm changed the title ~~WIP ARROW-236: Bridging IO interfaces under the hood in pyarrow~~ ARROW-236: Bridging IO interfaces under the hood in pyarrow Jul 12, 2016

Do not let Parquet close an Arrow file

94bcd30

Change-Id: I8e3a1f90907357d138d875b2761a7833b069b86f

wesm added 2 commits July 16, 2016 20:39

Check in io.pxd

f2cd77f

Change-Id: I2df54d0dc25457055011cd8a2b798fc28b1640d1

cpplint

73648e0

Change-Id: Icf2093b4a379bf159b3b1ecce119c7fde77c96ef

asfgit closed this in 59e5f98 Jul 18, 2016

wesm deleted the ARROW-236 branch July 18, 2016 23:25

paleolimbot mentioned this pull request Jan 28, 2023

[R] Crash on MacOS (x86) when running tests with homebrew apache-arrow also installed #33903

Closed
