Support decompression in jsonlite::read_json #414

Open
dhimmel opened this issue Feb 21, 2023 · 2 comments

Comments

dhimmel commented Feb 21, 2023

It's nice that jsonlite::read_json supports URLs for paths:

url = "https://github.com/related-sciences/nxontology-data/raw/71cf538dc5c258ada880d58663b0205b7b7f8561/087_chembl_target_tree.json"
jsonlite::read_json(path = url)

However, it doesn't appear to support URLs where the data is compressed:

url = "https://github.com/related-sciences/nxontology-data/raw/3330eb4ce4b29eb7399adb4e317a10a1c91138a7/mesh_topical_descriptor_descendants.json.gz"
jsonlite::read_json(path = url)  # fails

It would be nice to implement compression support similar to readr's (docs):

Files ending in .gz, .bz2, .xz, or .zip will be automatically uncompressed.

It might also be worth adding a compression argument, as in pandas.read_json, so that users can override the compression inferred from the path.

Owner

jeroen commented Feb 21, 2023

R has built-in compression tools so you can easily write a wrapper yourself.

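read_compressed <- function(url){
  # Download to a temporary file and remove it when the function exits
  tmp <- tempfile()
  on.exit(unlink(tmp))
  curl::curl_download(url, tmp)
  # gzfile() decompresses while parse_json() reads from the connection
  jsonlite::parse_json(gzfile(tmp))
}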

url = "https://github.com/related-sciences/nxontology-data/raw/3330eb4ce4b29eb7399adb4e317a10a1c91138a7/mesh_topical_descriptor_descendants.json.gz"
read_compressed(url)

This also works:

url = "https://github.com/related-sciences/nxontology-data/raw/3330eb4ce4b29eb7399adb4e317a10a1c91138a7/mesh_topical_descriptor_descendants.json.gz"
jsonlite::parse_json(gzcon(url(url))) 
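
For the readr-style behaviour requested above, a wrapper could pick the connection from the file extension and let the caller override it. This is only a rough sketch under some assumptions: `read_json_compressed` and its `compression` argument are hypothetical names, not part of jsonlite, and it handles local paths only (a URL could first be fetched with `curl::curl_download` as in the wrapper above).

```r
# Hypothetical helper, not part of jsonlite: read JSON from a local path,
# choosing a decompressing connection from the file extension unless the
# caller overrides it through `compression`.
read_json_compressed <- function(path,
                                 compression = c("infer", "none", "gz", "bz2", "xz"),
                                 ...) {
  compression <- match.arg(compression)
  if (compression == "infer") {
    # tools::file_ext("x.json.gz") gives "gz"
    ext <- tolower(tools::file_ext(path))
    compression <- if (ext %in% c("gz", "bz2", "xz")) ext else "none"
  }
  con <- switch(compression,
    gz   = gzfile(path),
    bz2  = bzfile(path),
    xz   = xzfile(path),
    none = file(path)
  )
  # Open the connection here so that closing it is this function's job
  open(con, "rb")
  on.exit(close(con))
  jsonlite::parse_json(con, ...)
}

# Usage: write a small gzipped JSON file, then read it back
tmp <- tempfile(fileext = ".json.gz")
out <- gzfile(tmp, "w")
writeLines('{"a": [1, 2, 3], "b": "x"}', out)
close(out)
str(read_json_compressed(tmp))
```

Passing `compression = "gz"` explicitly would cover paths whose extension does not reveal the format.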

Author

dhimmel commented Feb 22, 2023

@jeroen, I much appreciate the code for reading the gzipped URL data.

It would be nice to have jsonlite abstract away the compression handling, but I understand if that's out of scope for the package.
