arrow::write_feather error: Capacity error: array cannot contain more than 2147483646 bytes #8732

Closed
jangorecki opened this issue Nov 21, 2020 · 4 comments

jangorecki commented Nov 21, 2020

I tried to log in to my existing Jira account, but I keep getting this error:

Sorry, your userid is required to answer a CAPTCHA question correctly.

So I am reporting the issue here instead.

I cannot write an Arrow (Feather) file because of an error about an array size limit. Reproducible example in R:

library(arrow)
d = data.frame(id1 = factor(paste0("id",1:4e8)))
write_feather(d, "data.feather")
#Error in Table__from_dots(dots, schema) : 
#  Capacity error: array cannot contain more than 2147483646 bytes, have 2147483649

arrow 2.0.0
R 4.0.3

Ubuntu 18.04
kernel 5.4.0

@nealrichardson
Member

Thanks for the report. That's odd about your ASF Jira account; maybe a password reset would fix it?

The issue is that the R package isn't chunking the data.frame when converting to Arrow. In other words, this is not a limitation of the Arrow/Feather format, only of the R package as it stands now. We're working on it (https://issues.apache.org/jira/browse/ARROW-9293 among others) and hope to have some improvements in the next release.

If you're interested, you may be able to work around this now by doing the chunking yourself, something like

write_feather(Table$create(d[1:2e8, , drop = FALSE], d[2e8 + 1:2e8, , drop = FALSE]), "data.feather")
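For readers who want to try the chunking idea on a small scale first, here is a minimal sketch, not the package's own fix. It assumes that `Table$create()` accepts a set of record batches and assembles them into one table with chunked columns. The `chunk_rows` value and the tiny stand-in data.frame are illustrative only. Note that subsetting a factor in R keeps all of its levels, so each chunk of the original `id1` factor would still carry the full level set; the sketch uses a character column to sidestep that.

```r
library(arrow)

# Small stand-in for the large data.frame in the report.
# Assumption: a character column is used instead of a factor.
d <- data.frame(id1 = paste0("id", 1:10), stringsAsFactors = FALSE)

# Illustrative chunk size; for real data this would be far larger.
chunk_rows <- 4L
starts <- seq(1L, nrow(d), by = chunk_rows)

# Convert each slice of rows into its own record batch.
batches <- lapply(starts, function(s) {
  e <- min(s + chunk_rows - 1L, nrow(d))
  chunk <- d[s:e, , drop = FALSE]
  rownames(chunk) <- NULL  # drop the carried-over row names
  record_batch(chunk)
})

# Assemble the batches into a single Table and write it out.
tab <- do.call(Table$create, batches)
path <- tempfile(fileext = ".feather")
write_feather(tab, path)

# Read the file back to check the round trip.
back <- read_feather(path)
```

Whether this avoids the capacity error at full scale depends on each chunk's string data staying under the per-array byte limit, so the chunk size would need to be chosen accordingly.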

@nealrichardson
Member

You could also generate the data from Python (pyarrow) and I believe it would handle the chunking correctly.

ekt-dar commented Apr 13, 2023

Any news on this issue?

wbelzak commented Jan 5, 2024

Bumping because I am having the same issue.
