R session breaks when using non-ascii characters on description #127

Closed
leonardoshibata opened this issue Nov 21, 2019 · 14 comments

@leonardoshibata

No description provided.

@javierluraschi
Contributor

@leonardoshibata do you have a simple example to reproduce this? Thank you!

@leonardoshibata
Author

Hi @javierluraschi,
Sorry for not providing an example. pin(mtcars, description = "á") will do the trick.
Thanks!

@javierluraschi
Contributor

Works for me on OS X. Are you on Windows? Can you share sessionInfo()?

@leonardoshibata
Author

Yes, I am using Windows. That might be the cause.

> sessionInfo()

R version 3.6.1 (2019-07-05)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18362)

Matrix products: default

locale:
[1] LC_COLLATE=Portuguese_Brazil.1252  LC_CTYPE=Portuguese_Brazil.1252    LC_MONETARY=Portuguese_Brazil.1252
[4] LC_NUMERIC=C                       LC_TIME=Portuguese_Brazil.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] pins_0.2.0.0002

loaded via a namespace (and not attached):
[1] compiler_3.6.1 magrittr_1.5   tools_3.6.1 

@leonardoshibata
Author

I just tried on a different computer (Mac) and it works without breaking the R session.

> sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.6

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] pins_0.2.0.0002

loaded via a namespace (and not attached):
 [1] fansi_0.4.0      assertthat_0.2.1 zeallot_0.1.0    utf8_1.1.4      
 [5] crayon_1.3.4     rappdirs_0.3.1   jsonlite_1.6     backports_1.1.5 
 [9] magrittr_1.5     pillar_1.4.2     rlang_0.4.0      cli_1.1.0       
[13] rstudioapi_0.10  vctrs_0.2.0      tools_3.6.1      yaml_2.2.0      
[17] compiler_3.6.1   pkgconfig_2.0.3  tibble_2.1.3  

@javierluraschi
Contributor

javierluraschi commented Dec 4, 2019

I was able to reproduce this crash on Windows as follows; investigating...

yaml::write_yaml(list(description = "á"), "test.yml")

[Screenshot: Screen Shot 2019-12-04 at 3 14 19 PM]

@javierluraschi
Contributor

Unfortunately, this stopped reproducing for me once I installed additional debugging tools... looks like it's easier to reproduce on a brand new EC2 machine.

@kevinushey

Does the crash depend on the system language / locale? Can you reproduce with a more complicated character, e.g. 鬼?

@augustohassel

augustohassel commented Sep 17, 2020

It has also crashed for me because of an "í" in the description.

sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)

Matrix products: default

locale:
[1] LC_COLLATE=Spanish_Argentina.1252  LC_CTYPE=Spanish_Argentina.1252    LC_MONETARY=Spanish_Argentina.1252 LC_NUMERIC=C                       LC_TIME=Spanish_Argentina.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] digest_0.6.25   pins_0.4.3      log4r_0.3.2     xlsx_0.6.3      xtable_1.8-4    rmarkdown_2.3   knitr_1.29      shiny_1.5.0     sodium_1.1      mailR_0.4.1     jsonlite_1.7.0  httr_1.4.1      RJDBC_0.2-8     rJava_0.9-13   
[15] odbc_1.2.3      RMySQL_0.10.20  DBI_1.1.0       lubridate_1.7.9 forcats_0.5.0   stringr_1.4.0   dplyr_1.0.0     purrr_0.3.4     readr_1.3.1     tidyr_1.1.0     tibble_3.0.3    ggplot2_3.3.2   tidyverse_1.3.0

loaded via a namespace (and not attached):
 [1] fs_1.4.2          usethis_1.6.1     devtools_2.3.0    bit64_0.9-7.1     filelock_1.0.2    rprojroot_1.3-2   tools_4.0.2       backports_1.1.8   R6_2.4.1          colorspace_1.4-1  withr_2.2.0       tidyselect_1.1.0  prettyunits_1.1.1
[14] processx_3.4.3    curl_4.3          bit_1.1-15.2      compiler_4.0.2    cli_2.0.2         rvest_0.3.5       xml2_1.3.2        desc_1.2.0        scales_1.1.1      callr_3.4.3       rappdirs_0.3.1    R.utils_2.9.2     pkgconfig_2.0.3  
[27] htmltools_0.5.0   sessioninfo_1.1.1 dbplyr_1.4.4      fastmap_1.0.1     rlang_0.4.7       readxl_1.3.1      rstudioapi_0.11   generics_0.0.2    R.oo_1.23.0       magrittr_1.5      Rcpp_1.0.5        munsell_0.5.0     fansi_0.4.1      
[40] lifecycle_0.2.0   R.methodsS3_1.8.0 stringi_1.4.6     yaml_2.2.1        pkgbuild_1.1.0    grid_4.0.2        blob_1.2.1        promises_1.1.1    crayon_1.3.4      haven_2.3.1       xlsxjars_0.6.1    hms_0.5.3         ps_1.3.3         
[53] pillar_1.4.6      pkgload_1.1.0     reprex_0.3.0      glue_1.4.1        evaluate_0.14     remotes_2.1.1     modelr_0.1.8      vctrs_0.3.2       httpuv_1.5.4      testthat_2.3.2    cellranger_1.1.0  gtable_0.3.0      assertthat_0.2.1 
[66] xfun_0.15         mime_0.9          broom_0.7.0       later_1.1.0.1     memoise_1.1.0     ellipsis_0.3.1 

@kevinushey

I can now reproduce the crash previously reported by @javierluraschi -- simply running

yaml::write_yaml(list(description = "á"), "test.yml")

will cause R to crash (regardless of whether you're in RStudio or RGui).

It appears to be encoding related; e.g.

# ok
data <- list(description = enc2utf8("á"))
yaml::write_yaml(data, "test.yml")

# barf
data <- list(description = enc2native("á"))
yaml::write_yaml(data, "test.yml")
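
To see what differs between the two cases above, it can help to look at the encoding mark and the raw bytes of each string. The following is only a sketch: it assumes a Latin-1 string is a fair stand-in for the native encoding on a Windows-1252 system (the two agree on "á"), and it uses iconv() so that it behaves the same on any platform. The variable names are illustrative.

```r
# "\u00e1" is "á"; a \u escape always yields a UTF-8 marked string.
utf8 <- "\u00e1"

# Assumption: Latin-1 stands in for the native (Windows-1252) encoding.
latin1 <- iconv(utf8, from = "UTF-8", to = "latin1")

Encoding(utf8)     # "UTF-8"
Encoding(latin1)   # "latin1"

# Same character, different bytes on disk and in memory.
charToRaw(utf8)    # c3 a1
charToRaw(latin1)  # e1

# enc2utf8() brings the Latin-1 string back to the UTF-8 byte sequence.
roundtrip <- enc2utf8(latin1)
identical(charToRaw(roundtrip), charToRaw(utf8))  # TRUE
```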

@kevinushey

Filed an issue for the yaml package here: vubiostat/r-yaml#90

We'll have to see if there are places in RStudio where we're not handling the encoding correctly.

@kevinushey

For the pins package, this error could be avoided by ensuring all strings are converted to UTF-8 before calling as.yaml() or write_yaml().
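
A minimal sketch of that workaround, assuming the metadata is an ordinary nested list. The helper `to_utf8()` is hypothetical and is not part of pins or yaml; it only illustrates converting every string (and every name) to UTF-8 before the data reaches the YAML writer.

```r
# Hypothetical helper (not part of pins or yaml): recursively convert all
# character vectors and names in a nested list to UTF-8.
to_utf8 <- function(x) {
  if (is.character(x) || is.list(x)) {
    # Names can carry non-UTF-8 text too.
    if (!is.null(names(x))) {
      names(x) <- enc2utf8(names(x))
    }
  }
  if (is.character(x)) {
    return(enc2utf8(x))
  }
  if (is.list(x)) {
    # x[] <- keeps the list's attributes (names, class) intact.
    x[] <- lapply(x, to_utf8)
    return(x)
  }
  # Other types (numbers, logicals, factors, ...) are returned unchanged.
  x
}

# Assumption: a Latin-1 string stands in for native text on Windows-1252.
latin1 <- iconv("\u00e1", from = "UTF-8", to = "latin1")
data <- list(
  description = latin1,
  nested = list(tags = c(latin1, "plain")),
  n = 1L
)

clean <- to_utf8(data)
Encoding(clean$description)  # "UTF-8"
```

The cleaned list could then be passed on, for example as `yaml::write_yaml(to_utf8(data), "test.yml")`. Factor levels are left untouched in this sketch, so a real fix might need to handle them separately.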

@javierluraschi added this to the 0.5.0 milestone on Dec 9, 2020

@javierluraschi
Contributor

We should fix this one, since the fix is really easy and safe and the outcome of hitting this is pretty bad.

@github-actions

This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.

@github-actions bot locked and limited conversation to collaborators on Aug 30, 2022