as.yaml() causing crash in some cases #21

Closed
sckott opened this issue Jan 14, 2015 · 6 comments

sckott commented Jan 14, 2015

We are getting an error when using as.yaml() in some cases. The corresponding issue in the rfigshare package (https://github.com/ropensci/rfigshare) is ropensci-archive/rfigshare#88.

To reproduce this, run:

install.packages("rfigshare")
library("rfigshare")
fs_search(query="Boettiger")

Some HTML strings are causing this problem, e.g., this string:

string <- "<p>The membrane surface is depicted in the presence of one, two, or three bonds with a rigid substrate (xy- and z- coordinates are not to scale); <i>\xc3<sub>s</sub></i>\n=\n1000 pN/nm, <i>l<sub>g</sub></i>\n=\n45 nm, <i>\xc3<sub>g</sub></i>\n=\n0.01 pN/nm. The inlays are the corresponding xy- contour maps of the z- membrane displacements. Note that larger areas of the membrane are brought in closer proximity to the ECM substrate when more bonds are placed in close proximity to each other.</p>"

throws an error

as.yaml(string)

Error in as.yaml(string)  : 
  Emitter error: expected SCALAR, SEQUENCE-START, MAPPING-START, or ALIAS

Here's another string that causes the same error

"<p>The plots in (A) are temporal snapshots of the xy- positions of inactive integrins (red circles), active unbound integrins (light blue squares), and bound integrins (dark blue dots) obtained during simulation of integrin dynamics on a rigid ECM substrate with best-estimate parameters (<a href=\"http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1000604#pcbi-1000604-t001\" target=\"_blank\">Table 1</a>). The corresponding equilibrium z-direction membrane deformations are depicted in (B). Simulated area: 3 \xb5m\xd73 \xb5m.</p>"

Should we process our HTML somehow to avoid this? Or should something be done in the yaml package?

Contributor

viking commented Jan 14, 2015

That is definitely a bug. I'll check it out.

Author

sckott commented Jan 14, 2015

great, thanks


ofurkusi commented Feb 3, 2016

This seems to be related to the bug described by @reinholdsson in #4

Author

sckott commented Feb 3, 2016

thanks, but still doesn't work, whether using unicode=TRUE or unicode=FALSE


ofurkusi commented Feb 4, 2016

When you define the string, R stores \xc3 as the single byte 0xC3, which is the character Ã in Latin-1 (on its own it is not valid UTF-8). When you run as.yaml() on the string, it fails because of this non-ASCII character, just as in #4.

You may be able to work around this by escaping the backslash and, if that is not enough, by inserting an invisible zero-width non-joiner (&zwnj;) in between, e.g. \\&zwnj;xc3.

In the latter string \xb5 is converted to µ and \xd7 to ×.

You can verify this by printing the strings, or by typing

"\xc3\xb5\xd7"
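
A minimal sketch of that check, assuming base R 3.3.0 or later (for validUTF8()) and assuming the bytes were meant as Latin-1. It only inspects and re-encodes the string; whether re-encoding avoids the emitter error in as.yaml() is not confirmed in this thread.

```r
# The three escapes from the thread; each \x escape is stored as one byte.
x <- "\xc3\xb5\xd7"

# Show the raw bytes R actually stored for the string.
print(charToRaw(x))

# Check whether those bytes form a valid UTF-8 sequence.
print(validUTF8(x))

# Assumption: the bytes are Latin-1. Re-encode them as UTF-8 so that each
# character becomes a proper multi-byte UTF-8 sequence.
x_utf8 <- iconv(x, from = "latin1", to = "UTF-8")
print(charToRaw(x_utf8))
print(validUTF8(x_utf8))
```

If the first validUTF8() call returns FALSE and the second returns TRUE, the original string held bytes that are not valid UTF-8, which is one way to narrow down which inputs trigger the problem.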

@spgarbet
Member

Moved into ticket #113
