-
Notifications
You must be signed in to change notification settings - Fork 40
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(r): Add bindings to IPC writer #608
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. Left some notes, feel free to ignore or open separate issues. I'll note that I couldn't do a thorough job reviewing the C parts of this.
@@ -107,6 +109,42 @@ read_nanoarrow.connection <- function(x, ..., lazy = FALSE) { | |||
check_stream_if_requested(reader, lazy) | |||
} | |||
|
|||
#' @rdname read_nanoarrow |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This puts write_nanoarrow
on read_nanoarrow
's page which either needs to be updated or maybe write_nanoarrow
just needs its own help page. I'm happy to contribute the latter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a basic level of inclusion on the read page but feel free to contribute something better!
r/src/ipc.c
Outdated
static SEXP call_writebin(void* hdata) { | ||
struct ConnectionInputStreamHandler* data = (struct ConnectionInputStreamHandler*)hdata; | ||
|
||
// Write 16MB chunks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where does the 16MB value come from?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a note about this...it's somewhere between a big value (to reduce the number of R calls) and a small value (to minimize the copying/ensure there's an R call every once in a while to catch an interrupt).
r/src/ipc.c
Outdated
|
||
int code = ArrowIpcWriterInit(writer, output_stream); | ||
if (code != NANOARROW_OK) { | ||
Rf_error("ArrowIpcWriterInit() failed"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could a more useful error message be produced here? It looks like code is an errno-style error so it seems like it'd be possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added the errno value! Pretty much all that can happen here is a failure to allocate (rare, since it's a few bytes). I never quite got to adding a macro like ERROR_NOT_OK()
to help with this.
This PR adds a basic level of support for IPC writing in the R package. This is basically a thin wrapper around
ArrowIpcWriterWriteStream()
and could be more feature-rich like the Python version (that allows schemas and batches to be written individually).I also added a bit of code to handle interrupts (which should catch interrupts on read and write and wasn't handled before).
Created on 2024-09-14 with reprex v2.1.1