[Go] Random segmentation faults when calling Read() on a pqarrow.RecordReader #29
Comments
Hi @reiades, thanks for opening this issue and sharing so much detail. Can you share more about how often this issue occurs? If there's a certain number of iterations or level of concurrency at which it can be reliably reproduced, it will be much easier to isolate the problem.
Hi @joellubi, thanks for responding! It is pretty tough to reproduce as it happens very randomly. The above code gets run every minute in a single goroutine (side note: I know that seems excessive and unnecessary if the file is exactly the same each time - it definitely is, but there are other requirements at play). I see this segfault anywhere from 0 to 4 times a day. I will try to think of other ways I can reproduce the problem because I know that's probably not very helpful :( I am trying to iterate through
Ok got it, I'll take a look and see if I have any luck reproducing. One more question, what OS and architecture are you seeing this on?
linux and arm64 - thank you so much!
Hi @reiades. I haven't had any luck reproducing this yet. Do you have a sample parquet file you can share that you know has had this issue? If that's not possible, could you share the schema, numRows, any encodings used, etc?
Hello - sorry I have been a bit busy so had to put this aside. I don't think I will be able to share a sample parquet file but can tell you more about the schema, num rows, and encodings used. I will get back to you; thanks again for your help!
@reiades I've been skimming through the open Issues here and saw this one. Just wanted to poke you to see if you can get back with any information that might help us reproduce this issue so we can address and fix it.
Hello!
I am currently using github.com/apache/arrow/go/v16/parquet to read the records of a downloaded S3 parquet file (75KB, stored in a bytes.Buffer). My implementation is the following:
I am reading the same file each time and the majority of the reads into rec are successful. However, on occasion, I get a segmentation fault inside of rr.Read(). I have confirmed that the file is successfully downloaded each time and that buf.Bytes() is the same on successful and failed reads. I have also confirmed that I can get the schema from the file on both successful and failed reads, which leads me to believe something is happening inside the RecordReader.
Here are some logs from the stack trace that I thought could be helpful for debugging.
It seems that the segmentation fault is happening inside of (*recordReader).next, so I was curious whether anyone familiar with this library had some insight into why this is happening. I can share a longer stack trace if that would be helpful. I am also using v16 but saw the same error in v13 as well. Thanks in advance!
Component(s)
Go
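The check described above, that buf.Bytes() is the same on successful and failed reads, can be made cheap to log by comparing a digest instead of the raw bytes. This is a minimal sketch using only the standard library; `fingerprint` is a hypothetical helper and the buffer contents are placeholder data, not the actual file.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// fingerprint returns the hex-encoded SHA-256 digest of data.
func fingerprint(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

func main() {
	// Placeholder contents; in the scenario above this would be the
	// downloaded parquet file held in a bytes.Buffer.
	buf := bytes.NewBufferString("placeholder parquet bytes")

	before := fingerprint(buf.Bytes())
	// ... the read under test would run here (not shown) ...
	after := fingerprint(buf.Bytes())

	fmt.Println("digest before:", before)
	fmt.Println("unchanged during read:", before == after)
}
```

Logging the digest both before and after each read would also show whether the buffer's backing array was modified while the read was in progress, which a single comparison per run cannot reveal.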