Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

switch from RJSONIO to jsonlite #572

Closed
kforner opened this issue Aug 20, 2014 · 30 comments · Fixed by #606
Closed

switch from RJSONIO to jsonlite #572

kforner opened this issue Aug 20, 2014 · 30 comments · Fixed by #606
Milestone

Comments

@kforner
Copy link

kforner commented Aug 20, 2014

Hello,

The RJSONIO is known to be incompatible with devtools, cf r-lib/devtools#427
Would it be possible to switch to jsonlite, which is a fork of RJSONIO is ?
Thanks

@wch
Copy link
Collaborator

wch commented Aug 20, 2014

It would probably be good to do that eventually, but as RJSONIO and jsonlite have slightly different behavior, making the change and having it work reliably will be a non-trivial task.

@wch wch closed this as completed Aug 20, 2014
@yihui yihui added this to the 0.10.2 milestone Aug 20, 2014
@yihui
Copy link
Member

yihui commented Aug 20, 2014

Yeah, if we were to start shiny over, I would prefer jsonlite over RJSONIO, but it is too late now.

@kforner
Copy link
Author

kforner commented Aug 20, 2014

Ok I understand.
By the way, a good and extensive test suite allows to test this kind of refactoring/changes impact quite quickly.

@jcheng5
Copy link
Member

jcheng5 commented Aug 20, 2014

Ehhh. jsonlite has its own issues that make it not a great fit for Shiny.

> jsonlite::toJSON(list(a=NULL))
{"a":{}}
> jsonlite::toJSON(list(b="hi"))
{"b":["hi"]}

@yihui
Copy link
Member

yihui commented Aug 20, 2014

Good point. Cc @jeroenooms

> jsonlite::fromJSON(jsonlite::toJSON(list(a=NULL)))
$a
list()

@jeroen
Copy link

jeroen commented Aug 20, 2014

@jcheng5 @yihui This is intended. The motivation behind this is that the json null type is reserved for missing values, because it is frequently used that way in APIs. For example:

> json = '["foo", null, "bar"]'
> fromJSON(json)
[1] "foo" NA    "bar"

The NULL value in R cannot appear within an atomic vector, and therefore it actually carries a very different meaning than the json null type, even though they have the same name. The NULL in R is somewhat similar to undefined in javascript, but that is not part of JSON.

However note that in R, the NULL is actually an empty pairlist:

> identical(NULL, pairlist())
[1] TRUE

That is the motivation behind the current behavior, and that is why the NULL turns into a {}, which is the same as an empty named list. I think this is really the most sensible way of representing this type, that does not really have a native JSON type.

@jeroen
Copy link

jeroen commented Aug 20, 2014

Also @jcheng5 if you find any other issues I would love to hear about it obviously. If you are unhappy with the fact that jsonlite turns atomic vectors of lenght 1 into an array, you can use the unbox feature:

> jsonlite::toJSON(list(b=unbox("hi")))
{"b":"hi"}

If you automatically want to unbox all atomic vectors of length 1 as RJSONIO does it, you can use the auto_unbox argument:

> jsonlite::toJSON(list(a=pi, b="hi"), auto_unbox=TRUE)
{"a":3.1416,"b":"hi"}

However because this is very risky when you work with dynamic data (it is actually one of the main reasons I forked RJSONIO), this is not done by default.

@jcheng5
Copy link
Member

jcheng5 commented Aug 20, 2014

@jeroenooms I understand. For the use case of passing R data to JS, the jsonlite behavior is very sensible. But a lot of what Shiny does is passing R metadata (e.g. the options for a datatable) which has a quite different set of desirable defaults, both for null handling and for unboxing.

The auto_unbox feature is good to be aware of, thanks. I would love it if even if auto_unbox is TRUE, data frames would never be unboxed, as I think data frame implies that the data is actually tabular:

> jsonlite::toJSON(data.frame(a=1, b=2), auto_unbox=TRUE)
[{"a":1,"b":2}]

(We had to hack this behavior into RJSONIO for Shiny)

@jeroen
Copy link

jeroen commented Aug 20, 2014

@jcheng5 The example that you give has nothing to do with auto_unbox:

> jsonlite::toJSON(data.frame(a=1, b=2))
[{"a":1,"b":2}]

It is because jsonlite encodes data frames by row rather than by column. For example:

> jsonlite::toJSON(iris[1:3,], pretty=T)
[
    {
        "Sepal.Length" : 5.1,
        "Sepal.Width" : 3.5,
        "Petal.Length" : 1.4,
        "Petal.Width" : 0.2,
        "Species" : "setosa"
    },
    {
        "Sepal.Length" : 4.9,
        "Sepal.Width" : 3,
        "Petal.Length" : 1.4,
        "Petal.Width" : 0.2,
        "Species" : "setosa"
    },
    {
        "Sepal.Length" : 4.7,
        "Sepal.Width" : 3.2,
        "Petal.Length" : 1.3,
        "Petal.Width" : 0.2,
        "Species" : "setosa"
    }
]

If you rather want arrays, you can either cast your data frame to a list using as.list or set the argument in toJSON:

> jsonlite::toJSON(iris[1:3,], dataframe = "columns", pretty=T)
{
    "Sepal.Length" : [
        5.1,
        4.9,
        4.7
    ],
    "Sepal.Width" : [
        3.5,
        3,
        3.2
    ],
    "Petal.Length" : [
        1.4,
        1.4,
        1.3
    ],
    "Petal.Width" : [
        0.2,
        0.2,
        0.2
    ],
    "Species" : [
        "setosa",
        "setosa",
        "setosa"
    ]
}

@jeroen
Copy link

jeroen commented Aug 20, 2014

@jcheng5 In my experience, it is safer to explicitly use unbox where an atomic vector should turn into a json primitive, rather than making toJSON do this automatically for all vectors of length 1 as with auto_unbox or RJSIONO. It might seem handy at first, but it can easily result in nasty type errors for dynamic data where you expect a vector to turn into an array, but the vector happens to be length 1 sometimes. This is explained with an example in section 2.1.3 of http://arxiv.org/pdf/1403.2805v1.pdf

@jcheng5
Copy link
Member

jcheng5 commented Aug 21, 2014

Oh, I totally understand the danger and I'm not happy about the ambiguity. But the usability hit of explicit unboxing is just devastating in my opinion (for metadata, not data). RJSONIO just picks a different default, use I() for explicit boxing.

No easy answers anywhere here--just different tradeoffs.

@jeroen
Copy link

jeroen commented Aug 21, 2014

In jsonlite I decided to go with the safer default, but provide the same convenience as RJSONIO via the auto_unbox argument if you know what you are doing. I think you should be able to create any json structure with the same ease as RJSONIO, but if you have a counter example I would love to see it :-)

@jcheng5
Copy link
Member

jcheng5 commented Aug 22, 2014

Grrrrrrrrrrr. RJSONIO is disappointing me today.

> cat(RJSONIO::toJSON(list(a=character()), pretty=TRUE))
{
    "a" : 
}

@jeroenooms Keep in mind we don't have the luxury of controlling both ends of the JSON serialization/deserialization; Shiny has been out in the wild for two years and allows user code in the browser to do whatever it wants with the deserialized representation. So at this point we'd really need to be, if not bug-for-bug compatible with RJSONIO, at least, to use the same representations for all but the most egregious edge cases.

@jcheng5
Copy link
Member

jcheng5 commented Aug 22, 2014

My apologies to RJSONIO, apparently the bug I just posted is fixed in the latest CRAN release :)

@yihui
Copy link
Member

yihui commented Aug 22, 2014

I do not trust Duncan or Jeroen because of who they are, but I trust Jeroen's tests 👍 Duncan does have a tests directory, but it seems they are "eyeball" tests instead of unit tests, which always makes me nervous since the behavior of functions is not well defined. It is hard to say which behavior is correct/desirable in JSON conversions, but a behavior that is not consistent is certainly not good.

@wch
Copy link
Collaborator

wch commented Aug 25, 2014

There are a couple issues related to escaping backslashes in names in RJSONIO, which have been causing me some issues:
duncantl/RJSONIO#4
duncantl/RJSONIO#16

jsonlite also has one of these:
jeroen/jsonlite#30

@wch
Copy link
Collaborator

wch commented Aug 25, 2014

@jeroenooms As Joe mentioned, we often don't have control of both ends; in some cases, we want to construct JSON objects that are passed directly to Javascript functions in external libraries. That may mean creating a JSON object like {"x": null}, or {"vec": 4}.

If we were to migrate to jsonlite, I think we can get partway there with toJSON(x, auto_unbox = TRUE, force = TRUE), but I think we would also need options to enable:

  • Converting NULL to null
  • Converting named atomic vectors to JSON objects, instead of arrays (and therefore losing the names).

(There may be other issues that I haven't encountered.)

An aside: I think that it's conceptually more accurate to say that an empty pairlist is the same as NULL, rather than that NULL is an empty pairlist. This is because NULL is a special object in R. Similarly, you wouldn't say that NULL is a missing object in a list, even though this is true: identical(NULL, list(1,2,3)$foo).

@jeroen
Copy link

jeroen commented Aug 25, 2014

The json escaping bug should be fixed now.

Several people have requested that NULL should be null, so I'll probably add that. Maybe as an option or maybe that should just be the default.

However let me try to explain why I think {} leads to greater type-safety. The help for NULL states:

NULL is also used as the empty pairlist.

In practice, NULL often appears for functions that would otherwise return a named list. For example the attributes function returns either a named list, or if the object has no attributes, it returns NULL. In R this is fine because NULL behaves as an empty list:

> x = NULL
> y = list()
> x$foo
NULL
> y$foo
NULL

However in JavaScript null is a primitive that cannot be subsetted, so it is very different from the empty set. This can lead to type errors.

For example, suppose you encode an object with attributes to json:

json <- toJSON(
  data = as.list(myobject),
  attr = attributes(myobject)
)

And then the JavaScript client does:

var object = JSON.parse(json)
alert(object.attr.names)

All goes well as long as the object has at least one attribute. But if for whatever reason one day the object has zero attributes, R will return NULL and JavaScript will run into type errors:

TypeError: Cannot read property 'names' of null

For this reason I felt that {} might be a safer default than null when encoding NULL.

@wch
Copy link
Collaborator

wch commented Aug 26, 2014

That makes sense. I think the root of the problem is that there isn't a straightforward bijective mapping between R and JSON data structures, so some compromises are necessary -- which compromises you make depends on your goals.

In your case, you want to make it possible to safely index into object.attr (with object.attr.names) even if attr was NULL on the R side. In my case, I want to construct a JSON object with nulls in it, and it's very natural to write R code that supplies NULL in those places; putting an NA in those places requires a little extra work.

Another factor to consider is that RJSONIO::toJSON preserves type when mapping these back and forth, but jsonlite::toJSON doesn't:

RJSONIO::fromJSON(RJSONIO::toJSON(list(x=NULL)))         # x is NULL
# $x
# NULL
RJSONIO::fromJSON(RJSONIO::toJSON(list(x=list())))       # x is empty list
# $x
# list()
RJSONIO::fromJSON(RJSONIO::toJSON(list(x=list(a=1)[0]))) # x is empty named list
# $x
# named list()

jsonlite::fromJSON(jsonlite::toJSON(list(x=NULL)))
# $x
# list()
jsonlite::fromJSON(jsonlite::toJSON(list(x=list())))
# $x
# list()
jsonlite::fromJSON(jsonlite::toJSON(list(x=list(a=1)[0])))
# $x
# list()

@jeroen
Copy link

jeroen commented Sep 4, 2014

@jch @jcheng5 @yihui In the latest version of jsonlite, the auto_unbox argument is now ignored for data frames. I hope I understood that request correctly.

> test <- list(x = 123, y = iris[1,])
> toJSON(test, dataframe="col", auto_unbox=TRUE, pretty=TRUE)
{
    "x" : 123,
    "y" : {
        "Sepal.Length" : [
            5.1
        ],
        "Sepal.Width" : [
            3.5
        ],
        "Petal.Length" : [
            1.4
        ],
        "Petal.Width" : [
            0.2
        ],
        "Species" : [
            "setosa"
        ]
    }
}

@jcheng5
Copy link
Member

jcheng5 commented Sep 4, 2014

@jeroenooms That's great, thanks!

@jeroen
Copy link

jeroen commented Sep 5, 2014

jsonlite 0.9.11 is on cran now. I think it should be possible to mimic the most relevant toJSON behavior of RJSONIO using parameters for dataframe, null, na and auto_unbox:

# Compatibility mode
toJSON_compat <- function(x, ...){
  jsonlite::toJSON(x, dataframe="columns", null="null", na="null", auto_unbox=TRUE, ...)
}

A simple test case:

> test <- list(num = pi, data = iris[1,], list=list("foo", TRUE, NULL, NA, c(1,2,NA)))
> json1 <- toJSON_compat(test)
> json2 <- minify(RJSONIO::toJSON(test))
> identical(json1, json2)
[1] TRUE

In the case of fromJSON, the following statements should be equivalent:

RJSONIO::fromJSON(json1, simplify=FALSE, simplifyWithNames=FALSE)
jsonlite::fromJSON(json1, simplifyVector=FALSE)

However when simplification is enabled, jsonlite and RJSONIO start behaving very differently.

@wch
Copy link
Collaborator

wch commented Oct 3, 2014

I think we're getting pretty close to having jsonlite replace toJSON... Here's one more case where they still differ though:

d <- list(matrix(1:2, ncol=1))

cat(RJSONIO::toJSON(d))
# [
#  [ [ 1 ],
# [ 2 ] ] 
# ]
jsonlite::toJSON(d, auto_unbox=T)
# [[1,2]] 

@jeroenooms Do you know of a way to work around this, while still preserving the auto_unbox behavior for other data objects? I think we want it to happen for data frames, but not for matrices.

@kevinushey
Copy link
Contributor

Just a thought -- could auto_unbox be a function supplied by the user that unboxes for elements where it returns TRUE?

jsonlite::toJSON(d, auto_unbox = function(x) !(is.data.frame(x) || !is.matrix(x)))

or something to that effect (I may have mixed things up)?

@jeroen
Copy link

jeroen commented Oct 3, 2014

Yes that is unfortunate. I fixed this in the devel version, but it will be at least a couple of weeks until the next release.

Edit: actually if you really need it I can probably release sooner. If you send me a heads up when you are planning on releasing a new shiny, then I'll push out a version of jsonlite.

@wch
Copy link
Collaborator

wch commented Oct 3, 2014

Thanks - we just released a new version of Shiny, so it'll be at least another month before we do another.

@jeroen
Copy link

jeroen commented Oct 3, 2014

OK cool. I'm also looking at the num to string stuff right now.

@wch
Copy link
Collaborator

wch commented Oct 3, 2014

Some more comparisons. This reveals some RJSONIO bugs. :) But I think the digits handling in RJSONIO makes more sense, where digits appears to specify the precision of the number. @jeroenooms this relates to the digits vs. precision stuff I did in the code I wrote for converting numeric vectors to strings.

compare <- function(x, ...) {
  resr <- jsonlite::minify(RJSONIO::toJSON(x, ...))
  resj <- jsonlite::minify(jsonlite::toJSON(x, dataframe="columns",
            null="null", na="null", auto_unbox=TRUE, ...))

  if (!identical(resr, resj)) {
    cat(paste0("Results differ.",
      "\nRJSONIO:  ", resr,
      "\njsonlite: ", resj
    ))
  } else {
    cat(resj)
  }

  invisible()
}

compare(list(2))
# [2]

compare(data.frame(x=1:2, y=c("a","b")))
# {"x":[1,2],"y":["a","b"]}

# Differences in displaying digits
compare(
  data.frame(x=c(1.2323123231, 5.12345678901234e19), y=c("a","b"))
)
# Results differ.
# RJSONIO:  {"x":[1.2323,5.1235e+19],"y":["a","b"]}
# jsonlite: {"x":[1.2323,51234567890123399168],"y":["a","b"]}

# In RJSONIO, digits controls precision; in jsonlite, it controls digits after
# the decimal.
compare(
  data.frame(x=c(1.2323123231, 5.12345678901234e19), y=c("a","b")),
  digits = 2
)
# Results differ.
# RJSONIO:  {"x":[1.2,5.1e+19],"y":["a","b"]}
# jsonlite: {"x":[1.23,51234567890123399168],"y":["a","b"]}

# This is a RJSONIO bug
compare(list(x = matrix(nrow = 0, ncol = 1)))
#  Error: parse error: unallowed token at this point in JSON text
#                              {  "x":   }
#                      (right here) ------^

compare(list(matrix(1:2, nrow=1)))
# [[[1,2]]]

compare(list(matrix(1:2, ncol=1)))
# [[[1],[2]]]

compare(list(x = matrix(1:4, nrow=2)))
# {"x":[[1,3],[2,4]]}

# Difference in 1x1 matrix - I think jsonlite is correct
compare(list(x = matrix(1)))
# Results differ.
# RJSONIO:  {"x":[1]}
# jsonlite: {"x":[[1]]}

compare(list(x=NULL))
# {"x":null}

compare(list(as.data.frame(t(matrix(1:2, ncol=1)))))
# [{"V1":[1],"V2":[2]}]

@jeroen
Copy link

jeroen commented Oct 21, 2014

FYI the fix above on on cran now (jsonlite 0.9.13)

@wch
Copy link
Collaborator

wch commented Oct 21, 2014

Great, thanks!

gordonwoodhull added a commit to att/rcloud.shiny that referenced this issue Jul 5, 2015
shiny probably switched json libraries
(see long and fascinating discussion of R-JSON encoding here:
rstudio/shiny#572)

so now they're sending scalars as 1-length arrays with annotations
which we capture as.. objects.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants