[R] Using dplyr::tally with an Arrow FileSystemDataset crashes R #33807
Comments
This might be a clue
Hi @ablack3, thanks for reporting this! I haven't been able to reproduce this myself, though I am using Ubuntu 22.04 and not macOS. You could get more verbose output by attaching the C++ debugger before running R via the instructions here: https://arrow.apache.org/docs/dev/r/articles/developers/debugging.html Can you show me the output of running |
It might be useful to see the output of |
Sorry for the delay. Here is arrow::arrow_info()
#> Arrow package version: 11.0.0.2
#>
#> Capabilities:
#>
#> dataset TRUE
#> substrait FALSE
#> parquet TRUE
#> json TRUE
#> s3 TRUE
#> gcs TRUE
#> utf8proc TRUE
#> re2 TRUE
#> snappy TRUE
#> gzip TRUE
#> brotli TRUE
#> zstd TRUE
#> lz4 TRUE
#> lz4_frame TRUE
#> lzo FALSE
#> bz2 TRUE
#> jemalloc TRUE
#> mimalloc TRUE
#>
#> Memory:
#>
#> Allocator mimalloc
#> Current 0 bytes
#> Max 0 bytes
#>
#> Runtime:
#>
#> SIMD Level sse4_2
#> Detected SIMD Level sse4_2
#>
#> Build:
#>
#> C++ Library Version 11.0.0
#> C++ Compiler AppleClang
#> C++ Compiler Version 10.0.0.10001145
Created on 2023-02-16 with reprex v2.0.2
Thanks for the debugging instructions @thisisnic.
I am on an M1 Mac running an x86_64 build of R. I used to run the arm version of R but had issues with odbc drivers not working with arm, so I had to move my R installation to x86_64 via Rosetta.
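For anyone unsure which build of R they are running, a quick check from the R console might look like the sketch below. It uses only base R; `r_build_info()` is a hypothetical helper name, and the values mentioned in the comments are general expectations rather than output captured in this thread.

```r
# Hypothetical helper: report which architecture this R binary targets.
r_build_info <- function() {
  list(
    # Architecture R was built for: "x86_64" for an Intel build,
    # "aarch64" for a native Apple silicon build.
    r_arch = R.version$arch,
    # Machine type the operating system reports to this process.
    machine = Sys.info()[["machine"]]
  )
}

r_build_info()
```

An x86_64 value on Apple silicon hardware suggests the session is running through Rosetta.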
Thank you for this! I am guessing that whatever runtime detection mechanism we're using might not work under Rosetta. Do we know if there's any way to force Arrow to pretend that SIMD doesn't exist at runtime?
You can try and set the environment variable |
This is still crashing R on my machine. I'm using arrow v11.0.0.2
We ran into something like this a few times at @thisisnic and @stephhazlitt's workshop. Some folks with Apple ARM-based machines were using R built for x86 (running under Rosetta emulation), and therefore received Arrow package binaries intended for x86, which crash with illegal opcodes. R has had native ARM builds for macOS for a long time now (and there are native ARM builds of arrow that work well), so if people are using ARM-based Macs, we recommend installing native R and native arrow. I will also send a PR shortly that adds detection and a warning on package load if we detect this, so that folks know they should run native R and things will work fine.
(#37777)
Resolves #33807 and #37034
### Rationale for this change
If someone is running R under emulation, arrow segfaults without error. We can detect this when we load so can also warn people that this is not recommended. Though the version of R being run is not directly an arrow issue, arrow fails very quickly in this configuration.
### What changes are included in this PR?
Detect when running under rosetta (on macOS only) and warn when the library is attached
### Are these changes tested?
No, given the paucity of ARM-based mac CI, testing this organically would be difficult. But the logic is straightforward.
### Are there any user-facing changes?
Yes, a warning when someone loads arrow under emulation.
* Closes: #33807
Authored-by: Jonathan Keane <[email protected]>
Signed-off-by: Jonathan Keane <[email protected]>
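The detection described above can be sketched in base R roughly as follows. This is an illustration under stated assumptions, not necessarily the code the PR added: `running_under_rosetta()` is a hypothetical name, and the check relies on the macOS `sysctl.proc_translated` key, which reports `1` for a process translated by Rosetta.

```r
# Hypothetical helper: TRUE only when this R process is an x86_64 binary
# being translated by Rosetta on macOS; FALSE everywhere else.
running_under_rosetta <- function() {
  if (!identical(tolower(Sys.info()[["sysname"]]), "darwin")) {
    return(FALSE)
  }
  # sysctl.proc_translated is "1" for a translated process and "0" for a
  # native one. The key does not exist on Intel Macs, so the call can fail;
  # any failure is treated as "not translated".
  translated <- tryCatch(
    suppressWarnings(
      system2("sysctl", c("-n", "sysctl.proc_translated"),
              stdout = TRUE, stderr = FALSE)
    ),
    error = function(e) character(0)
  )
  identical(translated, "1")
}

# Warn at the point of use; a package would more likely do this on attach.
if (running_under_rosetta()) {
  warning(
    "R appears to be running under Rosetta emulation; ",
    "consider installing a native arm64 build of R.",
    call. = FALSE
  )
}
```

Returning `FALSE` on every failure path keeps the check silent on Linux, Windows, and Intel Macs.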
Describe the bug, including details regarding any error messages, version, and platform.
The following code snippet crashes R. I'm using arrow 10.0.1
Platform information
Component(s)
R