ne_download URL strange/wrong #29

Closed
rix133 opened this issue Jun 19, 2019 · 28 comments

@rix133

rix133 commented Jun 19, 2019

So after clean install using the latest "rnaturalearth" on Windows 7 (R 3.5.3):

The URL seems strange to me:
http://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/cultural/ne_10m_admin_0_countries.zip

Command that I ran:
countries10 <- ne_download(scale = 10, type = 'countries', category = 'cultural', returnclass = "sf")

Error in utils::download.file(file.path(address), zip_file <- tempfile()) : cannot open URL 'http://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/cultural/ne_10m_admin_0_countries.zip'
2. utils::download.file(file.path(address), zip_file <- tempfile())
1. ne_download(scale = 10, type = "countries", category = "cultural", returnclass = "sf")
@Nowosad

Nowosad commented Jun 19, 2019

@andysouth
Contributor

The command works from R for me and, as @Nowosad points out, should work in the browser too
(but you are right, it does look a bit strange).
It must be some other download issue.

@rix133
Author

rix133 commented Jun 19, 2019

Does the link www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/cultural/ne_10m_admin_0_countries.zip work in your browser?

Actually, it does, so I guess it can't be a firewall or a similar problem. I have no idea what is going on and must investigate further. It also fails with the same message on a clean install of R 3.6.0.

@andysouth
Contributor

The code saves the downloaded file to a temporary location, so it might be worth checking that this works for you:
write.csv(data.frame(), tempfile())

@andysouth
Contributor

Try this to check connection from R:
library(curl)
curl::has_internet()
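
The two checks suggested above (writing to a temporary file and testing the connection from R) can be combined into one small diagnostic. This is only a sketch: `can_write_tempfile()` and `has_connection()` are hypothetical helper names, not part of rnaturalearth, and the connection check is skipped when the curl package is not installed.

```r
# Check 1: can R write to the session's temporary directory?
can_write_tempfile <- function() {
  path <- tempfile(fileext = ".csv")
  on.exit(unlink(path), add = TRUE)
  ok <- tryCatch({
    utils::write.csv(data.frame(x = 1), path, row.names = FALSE)
    file.exists(path)
  }, error = function(e) FALSE)
  isTRUE(ok)
}

# Check 2: does R see an internet connection?
# Returns NA when the curl package is not installed.
has_connection <- function() {
  if (!requireNamespace("curl", quietly = TRUE)) return(NA)
  curl::has_internet()
}

checks <- c(tempfile_writable = can_write_tempfile(),
            internet = has_connection())
print(checks)
```

If both values are TRUE, the problem is more likely in how the URL is fetched than in the local setup.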

@rix133
Author

rix133 commented Jun 20, 2019

Try this to check connection from R:
library(curl)
curl::has_internet()

TRUE

write.csv(data.frame(), tempfile())
gives no error

Furthermore, other URLs work, e.g.:
utils::download.file(file.path('https://file-examples.com/wp-content/uploads/2017/02/zip_2MB.zip'), zip_file <- tempfile())

yields:
trying URL 'https://file-examples.com/wp-content/uploads/2017/02/zip_2MB.zip' Content type 'application/zip' length 2036861 bytes (1.9 MB) downloaded 1.9 MB

Downloading this URL over http instead of https works as well.

@rix133
Author

rix133 commented Jun 20, 2019

UPDATE:
It works if I set the download method to libcurl in download.file, i.e.:
utils::download.file(file.path(address), zip_file <- tempfile(), method = "libcurl")

So I can set a global option that ne_download then picks up:

options("download.file.method" = "libcurl")
countries10 <- ne_download(scale = 10, type = 'countries', category = 'cultural', returnclass = "sf")

yields

(trying URL 'http://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/cultural/ne_10m_admin_0_countries.zip'
Content length 283 bytes
downloaded 4.7 MB)
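
Setting the option globally affects every later download in the session. A minimal sketch of scoping the workaround to a single call is shown here; `with_download_method()` is a hypothetical helper, not an rnaturalearth or base R function, and it relies only on `options()` returning the previous values.

```r
# Temporarily set download.file.method, evaluate `code`, then restore
# whatever value (including "unset") was there before.
with_download_method <- function(method, code) {
  old <- options(download.file.method = method)
  on.exit(options(old), add = TRUE)
  force(code)  # `code` is evaluated lazily, i.e. after the option is set
}

# Usage (commented out because it needs rnaturalearth and network access):
# countries10 <- with_download_method(
#   "libcurl",
#   rnaturalearth::ne_download(scale = 10, type = "countries",
#                              category = "cultural", returnclass = "sf")
# )
```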

Should I close the issue?

@andysouth
Contributor

Thanks Richard, good work finding a solution. Leave it open for now while I ask Twitter whether it's wise to set that option in the package.

@barryrowlingson

download.file uses this code to decide on the method:

    method <- if (missing(method)) 
        getOption("download.file.method", default = "auto")
    else match.arg(method, c("auto", "internal", "libcurl", "wget", 
        "curl", "lynx"))
    if (method == "auto") {
        if (length(url) != 1L || typeof(url) != "character") 
            stop("'url' must be a length-one character vector")
        method <- if (grepl("^file:", url)) 
            "internal"
        else "libcurl"
    }

So if method isn't supplied and download.file.method isn't set, method defaults to "auto", which then resolves to "internal" if the URL starts with file: and to "libcurl" otherwise. If your download works with an explicit method = "libcurl" but not with the method argument missing, then the default must be coming from somewhere else. Are you sure you haven't set download.file.method to something else? Something that breaks on double slashes in a URL? That's the only odd thing in that URL... Maybe it parses the string up to the second // instead of the first?
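
The selection logic described above can be mirrored in a small standalone function to see which method a given setup would end up with. This is a sketch of the quoted snippet only, not base R's actual implementation; `resolve_download_method()` and its `configured` argument are names made up for illustration.

```r
# Mirror of the quoted (non-Windows) logic: explicit argument first,
# then the download.file.method option, then "auto" resolution by URL.
resolve_download_method <- function(url, method = NULL,
                                    configured = getOption("download.file.method")) {
  if (is.null(method)) {
    method <- if (is.null(configured)) "auto" else configured
  }
  if (identical(method, "auto")) {
    stopifnot(is.character(url), length(url) == 1L)
    method <- if (grepl("^file:", url)) "internal" else "libcurl"
  }
  method
}

url <- "http://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/cultural/ne_10m_admin_0_countries.zip"
resolve_download_method(url, configured = NULL)       # option unset
resolve_download_method(url, configured = "wininet")  # option set elsewhere
```

Comparing the two calls makes it easy to see when a value set outside the call (for example by an IDE) overrides the "auto" default.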

@rix133
Author

rix133 commented Jun 20, 2019

Sure you haven't set the download.file.method to something else?

So the error appears in both the RStudio console and the R console.
options("download.file.method") returns:
NULL in the R console
"wininet" in the latest RStudio

I looked at the definition of download.file in R 3.5.3 (Windows), and it seems to default to "wininet" when the method is "auto". Snippet from download.file:

if (method == "auto") {
    if (length(url) != 1L || typeof(url) != "character") 
      stop("'url' must be a length-one character vector")
    method <- if (grepl("^ftps:", url) && capabilities("libcurl")) 
      "libcurl"
    else "wininet"
  }
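
Because the effective default differs between platforms and front ends, it can help to print the relevant settings in one place when reporting this kind of problem. A minimal sketch follows; `download_settings()` is a hypothetical helper, and the `RSTUDIO` environment variable check is an assumption about how RStudio marks its sessions.

```r
# Collect the settings that influence which download method is used.
download_settings <- function() {
  list(
    os_type = .Platform$OS.type,                           # "unix" or "windows"
    configured_method = getOption("download.file.method"), # NULL when unset
    has_libcurl = unname(capabilities("libcurl")),
    in_rstudio = identical(Sys.getenv("RSTUDIO"), "1")     # assumption, see above
  )
}

str(download_settings())
```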

@petrpajdla

Having the same issue on Linux, R version 4.0.3 with rnaturalearth version 0.1.0 and 0.2.0.

options("download.file.method")
$download.file.method
[1] "libcurl"

ne_download() returns a weird link. There is no problem with the internet connection (curl::has_internet() returns TRUE), and the link does not work in the browser either, even without a firewall; WordPress returns "There has been a critical error on your website."

ne_download(scale = 10, type = 'rivers_lake_centerlines', category = 'physical', destdir = destdir, load = FALSE)

rnaturalearth version 0.1.0 returns:

trying URL 'http://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/physical/ne_10m_rivers_lake_centerlines.zip'
Error in utils::download.file(file.path(address), zip_file <- tempfile()) : 
  cannot open URL 'http://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/physical/ne_10m_rivers_lake_centerlines.zip'

and rnaturalearth version 0.2.0 returns http status 500:

trying URL 'http://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/raster/GRAY_HR_SR.zip'
download failed
NULL
Warning message:
In utils::download.file(file.path(address), zip_file <- tempfile()) :
  cannot open URL 'http://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/raster/GRAY_HR_SR.zip': HTTP status was '500 Internal Server Error'

Any ideas how to solve this yet?

@rmgriffin

Same issue here

@Grelot

Grelot commented Jan 13, 2021

Same issue on Windows 10 with rnaturalearth 0.1.0 and R 4.3

trying URL 'http://www.naturalearthdata.com/http//www.naturalearthdata.com/download/50m/physical/ne_50m_coastline.zip'
Error in utils::download.file(file.path(address), zip_file <- tempfile()) : 
  cannot open URL 'http://www.naturalearthdata.com/http//www.naturalearthdata.com/download/50m/physical/ne_50m_coastline.zip'
In addition: Warning message:
In utils::download.file(file.path(address), zip_file <- tempfile()) :
  InternetOpenUrl failed: 'The operation timed out

@bienflorencia

bienflorencia commented Apr 28, 2021

I believe it is related to 'http' being used instead of 'https' in the ne_file_name function.
Is there a way to change this locally so that it works?

function (scale = 110, type = "countries", category = c("cultural", 
  "physical", "raster"), full_url = FALSE) 
{
  scale <- check_scale(scale)
  category <- match.arg(category)
  if (type %in% c("countries", "map_units", "map_subunits", 
    "sovereignty", "tiny_countries", "boundary_lines_land", 
    "pacific_groupings", "breakaway_disputed_areas", "boundary_lines_disputed_areas", 
    "boundary_lines_maritime_indicator")) {
    type <- paste0("admin_0_", type)
  }
  if (type == "states") 
    type <- "admin_1_states_provinces_lakes"
  if (category == "raster") {
    file_name <- paste0(type)
  }
  else {
    file_name <- paste0("ne_", scale, "m_", type)
  }
  if (full_url) 
    file_name <- paste0("http://www.naturalearthdata.com/http//", 
      "www.naturalearthdata.com/download/", scale, "m/", 
      category, "/", file_name, ".zip")
  return(file_name)
}

Edit: I changed that in the function and it's still not working. Could it be something related to a firewall?
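
One way to experiment locally without editing the package is a standalone copy of the URL-building logic with the base URL passed in as an argument. The sketch below is adapted from the function printed above; `ne_url_local()` and `base_url` are hypothetical names, the original `check_scale()` step is replaced by a plain numeric check, and which base URL actually serves the files is left to the caller.

```r
# Types that the printed function prefixes with "admin_0_".
admin_0_types <- c("countries", "map_units", "map_subunits", "sovereignty",
                   "tiny_countries", "boundary_lines_land", "pacific_groupings",
                   "breakaway_disputed_areas", "boundary_lines_disputed_areas",
                   "boundary_lines_maritime_indicator")

# Build a download URL from a caller-supplied base URL.
ne_url_local <- function(base_url, scale = 110, type = "countries",
                         category = c("cultural", "physical", "raster")) {
  category <- match.arg(category)
  stopifnot(scale %in% c(10, 50, 110))  # simplification of check_scale()
  if (type %in% admin_0_types) type <- paste0("admin_0_", type)
  if (type == "states") type <- "admin_1_states_provinces_lakes"
  file_name <- if (category == "raster") type else paste0("ne_", scale, "m_", type)
  base_url <- sub("/+$", "", base_url)  # drop trailing slashes from the base
  paste0(base_url, "/", scale, "m/", category, "/", file_name, ".zip")
}

# Reproduces the URL from the original report when given the same base:
ne_url_local("http://www.naturalearthdata.com/http//www.naturalearthdata.com/download",
             scale = 10, type = "countries", category = "cultural")
```

The resulting string can then be passed to utils::download.file() directly, which separates "is the URL wrong?" from "is the download method wrong?".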

@jmarshallnz

It appears to be a new issue on the website. If you go directly to the website and browse, you get the same link as in the R package (though with https:// instead of http://). If you click on the link, it invokes an onclick event that goes through urchinTracker. If you copy the link into a browser, it fails with a WordPress problem. Possibly some issue with redirection?

Thus, the ne_download() function is currently broken in the R package. It seems that the natural earth data folk have been made aware of this, see here: nvkelso/natural-earth-vector#528

@nvkelso

nvkelso commented May 7, 2021

This should be fixed now.

@dlebauer

I just ran into this error with R 4.0.2, rnaturalearth 0.1.0 (543e3cb, current version in the repository), Windows 10

urban_areas <- rnaturalearth::ne_download(scale = 'large', type = 'urban_areas', returnclass = 'sf')
#> Warning in utils::download.file(file.path(address), zip_file <- tempfile()):
#> InternetOpenUrl failed: 'The server name or address could not be resolved'
#> Error in utils::download.file(file.path(address), zip_file <- tempfile()): cannot open URL 'http://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/cultural/ne_10m_urban_areas.zip'

Created on 2021-09-24 by the reprex package (v2.0.0)

Session info
sessioninfo::session_info()
#> - Session info ---------------------------------------------------------------
#>  setting  value                       
#>  version  R version 4.0.2 (2020-06-22)
#>  os       Windows 10 x64              
#>  system   x86_64, mingw32             
#>  ui       RTerm                       
#>  language (EN)                        
#>  collate  English_United States.1252  
#>  ctype    English_United States.1252  
#>  tz       America/Phoenix             
#>  date     2021-09-24                  
#> 
#> - Packages -------------------------------------------------------------------
#>  package       * version date       lib source        
#>  assertthat      0.2.1   2019-03-21 [1] CRAN (R 4.0.2)
#>  backports       1.2.1   2020-12-09 [1] CRAN (R 4.0.3)
#>  class           7.3-17  2020-04-26 [2] CRAN (R 4.0.2)
#>  classInt        0.4-3   2020-04-07 [1] CRAN (R 4.0.2)
#>  cli             2.3.1   2021-02-23 [1] CRAN (R 4.0.4)
#>  crayon          1.4.1   2021-02-08 [1] CRAN (R 4.0.2)
#>  DBI             1.1.1   2021-01-15 [1] CRAN (R 4.0.3)
#>  digest          0.6.27  2020-10-24 [1] CRAN (R 4.0.3)
#>  dplyr           1.0.4   2021-02-02 [1] CRAN (R 4.0.3)
#>  e1071           1.7-6   2021-03-18 [1] CRAN (R 4.0.4)
#>  ellipsis        0.3.1   2020-05-15 [1] CRAN (R 4.0.2)
#>  evaluate        0.14    2019-05-28 [1] CRAN (R 4.0.2)
#>  fansi           0.4.2   2021-01-15 [1] CRAN (R 4.0.3)
#>  fs              1.5.0   2020-07-31 [1] CRAN (R 4.0.3)
#>  generics        0.1.0   2020-10-31 [1] CRAN (R 4.0.3)
#>  glue            1.4.2   2020-08-27 [1] CRAN (R 4.0.2)
#>  highr           0.8     2019-03-20 [1] CRAN (R 4.0.2)
#>  htmltools       0.5.1.1 2021-01-22 [1] CRAN (R 4.0.3)
#>  KernSmooth      2.23-17 2020-04-26 [2] CRAN (R 4.0.2)
#>  knitr           1.31    2021-01-27 [1] CRAN (R 4.0.3)
#>  lattice         0.20-41 2020-04-02 [2] CRAN (R 4.0.2)
#>  lifecycle       1.0.0   2021-02-15 [1] CRAN (R 4.0.4)
#>  magrittr        2.0.1   2020-11-17 [1] CRAN (R 4.0.3)
#>  pillar          1.5.1   2021-03-05 [1] CRAN (R 4.0.4)
#>  pkgconfig       2.0.3   2019-09-22 [1] CRAN (R 4.0.2)
#>  proxy           0.4-25  2021-03-05 [1] CRAN (R 4.0.4)
#>  purrr           0.3.4   2020-04-17 [1] CRAN (R 4.0.2)
#>  R6              2.5.0   2020-10-28 [1] CRAN (R 4.0.3)
#>  Rcpp            1.0.7   2021-07-07 [1] CRAN (R 4.0.5)
#>  reprex          2.0.0   2021-04-02 [1] CRAN (R 4.0.5)
#>  rlang           0.4.10  2020-12-30 [1] CRAN (R 4.0.3)
#>  rmarkdown       2.7     2021-02-19 [1] CRAN (R 4.0.4)
#>  rnaturalearth   0.1.0   2017-03-21 [1] CRAN (R 4.0.5)
#>  sessioninfo     1.1.1   2018-11-05 [1] CRAN (R 4.0.2)
#>  sf              0.9-7   2021-01-06 [1] CRAN (R 4.0.4)
#>  sp              1.4-5   2021-01-10 [1] CRAN (R 4.0.3)
#>  stringi         1.5.3   2020-09-09 [1] CRAN (R 4.0.3)
#>  stringr         1.4.0   2019-02-10 [1] CRAN (R 4.0.2)
#>  styler          1.3.2   2020-02-23 [1] CRAN (R 4.0.2)
#>  tibble          3.1.0   2021-02-25 [1] CRAN (R 4.0.4)
#>  tidyselect      1.1.0   2020-05-11 [1] CRAN (R 4.0.2)
#>  units           0.7-1   2021-03-16 [1] CRAN (R 4.0.4)
#>  utf8            1.2.1   2021-03-12 [1] CRAN (R 4.0.5)
#>  vctrs           0.3.6   2020-12-17 [1] CRAN (R 4.0.3)
#>  withr           2.4.1   2021-01-26 [1] CRAN (R 4.0.3)
#>  xfun            0.20    2021-01-06 [1] CRAN (R 4.0.3)
#>  yaml            2.2.1   2020-02-01 [1] CRAN (R 4.0.2)
#> 
#> [1] C:/Users/David/Documents/lib/R
#> [2] C:/Program Files/R/R-4.0.2/library

@nvkelso

nvkelso commented Sep 29, 2021

There's a gist showing where to find the files on S3.

@dlebauer

@nvkelso should the package be updated to use the new URLs? Or are the currently used URLs expected to come back online?

@nvkelso

nvkelso commented Sep 29, 2021 via email

@nmarchio

Encountering the same bug and I believe there is still an issue with the URLs on the website. I am able to manually download from the website, but this could be related to the onclick event mentioned earlier in the thread.

This code:
ocean <- ne_download(type = 'ocean', scale = 'large', category = 'physical', returnclass='sf')

Returns:

trying URL 'http://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/physical/ne_10m_ocean.zip'
Error in utils::download.file(file.path(address), zip_file <- tempfile()) : 
  cannot open URL 'http://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/physical/ne_10m_ocean.zip'
In addition: Warning message:
In utils::download.file(file.path(address), zip_file <- tempfile()) :
  cannot open URL 'http://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/physical/ne_10m_ocean.zip': HTTP status was '500 Internal Server Error'

Further, when I try to download directly using the following code (which works on other files):

filedir <- paste0(tempdir())
unlink(filedir, recursive = TRUE)
dir.create(filedir)
ocean_shp <- paste0('https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/110m/physical/ne_110m_ocean.zip')
download.file(url = ocean_shp, destfile = paste0(filedir, basename(ocean_shp)))
unzip(paste0(filedir,basename(ocean_shp)), exdir= filedir)
list.files(path = filedir)

I get a similar error:

trying URL 'https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/110m/physical/ne_110m_ocean.zip'
Error in download.file(url = ocean_shp, destfile = paste0(filedir, basename(ocean_shp))) : 
  cannot open URL 'https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/110m/physical/ne_110m_ocean.zip'
In addition: Warning message:
In download.file(url = ocean_shp, destfile = paste0(filedir, basename(ocean_shp))) :
  cannot open URL 'https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/110m/physical/ne_110m_ocean.zip': HTTP status was '500 Internal Server Error'
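
When a server intermittently answers with an error, it can be convenient to wrap the direct download so that a failure returns NULL instead of stopping the script. A minimal sketch is given here, assuming a hypothetical helper `safe_download()`; it also builds the destination with file.path() so that a separator is always placed between directory and file name.

```r
# Download `url` into `destdir`; return the local path on success, NULL on failure.
safe_download <- function(url, destdir = tempdir(), ...) {
  dir.create(destdir, showWarnings = FALSE, recursive = TRUE)
  destfile <- file.path(destdir, basename(url))
  ok <- tryCatch({
    status <- utils::download.file(url, destfile, mode = "wb", quiet = TRUE, ...)
    status == 0L  # download.file() returns 0 on success
  },
  warning = function(w) FALSE,
  error = function(e) FALSE)
  if (isTRUE(ok)) destfile else NULL
}

# Usage (commented out because it needs network access):
# zip_path <- safe_download(ocean_shp, file.path(tempdir(), "ne_ocean"))
# if (!is.null(zip_path)) utils::unzip(zip_path, exdir = dirname(zip_path))
```

Treating warnings as failures is a deliberate, conservative choice in this sketch; a caller who wants to keep partial results could drop the warning handler.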

@nvkelso

nvkelso commented Feb 27, 2022

I suspect this was during a rare maintenance window on the Natural Earth server. 500s are server errors. Downloading that link works for me today.

If you switch over to the S3 URLs, then that is much less likely to affect you.
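
Since 500 responses are server-side and may be temporary, a caller can also retry a failed download a few times before giving up. The following is a generic sketch, not something rnaturalearth does; `retry()` is a hypothetical helper that re-runs any function that signals an error, waiting a little longer after each failed attempt.

```r
# Call f() up to `times` times, doubling the wait after each failure.
retry <- function(f, times = 3, wait = 1) {
  stopifnot(times >= 1)
  for (attempt in seq_len(times)) {
    result <- tryCatch(list(value = f(), ok = TRUE),
                       error = function(e) list(error = e, ok = FALSE))
    if (result$ok) return(result$value)
    if (attempt < times) Sys.sleep(wait * 2^(attempt - 1))
  }
  stop("all ", times, " attempts failed; last error: ",
       conditionMessage(result$error))
}

# Usage (commented out because it needs rnaturalearth and network access):
# ocean <- retry(function() rnaturalearth::ne_download(
#   type = "ocean", scale = "large", category = "physical", returnclass = "sf"))
```

Retrying only helps with transient failures; a URL that is permanently wrong (for example a 404) will fail on every attempt.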

@jbenjamin-rms

jbenjamin-rms commented Jan 4, 2023

This issue has come up for me today, after years of using this package without issues. It happens on both Windows and Linux; I have posted the Windows session information below.

R version 4.2.1 (2022-06-23 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19044)
rnaturalearth_0.1.0

If I try to use the ne_download function for any file (and even using the defaults) I get the following errors:

ne_download()
trying URL 'http://www.naturalearthdata.com/http//www.naturalearthdata.com/download/110m/cultural/ne_110m_admin_0_countries.zip'
Error in utils::download.file(file.path(address), zip_file <- tempfile()) : 
  cannot open URL 'http://www.naturalearthdata.com/http//www.naturalearthdata.com/download/110m/cultural/ne_110m_admin_0_countries.zip'
In addition: Warning message:
In utils::download.file(file.path(address), zip_file <- tempfile()) :
  cannot open URL 'https://www.naturalearthdata.com/http/www.naturalearthdata.com/download/110m/cultural/ne_110m_admin_0_countries.zip': HTTP status was '404 Not Found'

@nvkelso

nvkelso commented Jan 4, 2023

I'm not sure if &/or why this changed, but note the http/ versus http// in the URL path.

In any event, please switch over to the S3 links.
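
The `http/` versus `http//` difference noted above is what one would see if something along the way collapsed repeated slashes in the URL path. Whether the server or a redirect actually does this is an assumption; the sketch below only shows that collapsing the slashes turns the package-style path into the path from the 404 message. `collapse_path_slashes()` is a hypothetical helper, and both URLs are written with https:// so that only the path differs.

```r
# Collapse runs of "/" into a single "/", except the "//" right after the scheme.
collapse_path_slashes <- function(url) {
  gsub("(?<!:)/{2,}", "/", url, perl = TRUE)
}

requested <- "https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/110m/cultural/ne_110m_admin_0_countries.zip"
reported  <- "https://www.naturalearthdata.com/http/www.naturalearthdata.com/download/110m/cultural/ne_110m_admin_0_countries.zip"

identical(collapse_path_slashes(requested), reported)
```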

@jbenjamin-rms

I did note that. I can switch over to using the S3 links, that's no problem; I was just wondering whether the functions in the package would be updated to reflect these changes, as they offer a more convenient and streamlined way of downloading and using the data for my use case. Thank you!

@jaum20

jaum20 commented Jan 30, 2023

More than 3 years have passed and this bug still exists:

urban = try(rnaturalearth::ne_download(scale = 'medium', type = 'urban_areas', category = 'cultural'), silent = TRUE)
trying URL 'http://www.naturalearthdata.com/http//www.naturalearthdata.com/download/50m/cultural/ne_50m_urban_areas.zip'
Warning message:
In utils::download.file(file.path(address), zip_file <- tempfile()) :
  cannot open URL 'https://www.naturalearthdata.com/http/www.naturalearthdata.com/download/50m/cultural/ne_50m_urban_areas.zip': HTTP status was '404 Not Found'

@PMassicotte
Contributor

At some point, you should update packages...

@PMassicotte
Contributor

rnaturalearth::ne_download(scale = "medium", type = "urban_areas", category = "cultural", returnclass = "sf")
#> Simple feature collection with 2143 features and 4 fields
#> Geometry type: POLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -157.984 ymin: -46.26844 xmax: 174.97 ymax: 69.35127
#> Geodetic CRS:  WGS 84
#> # A tibble: 2,143 × 5
#>    scalerank featurecla area_sqkm min_zoom                              geometry
#>        <dbl> <chr>          <dbl>    <dbl>                         <POLYGON [°]>
#>  1         3 Urban area    1003.       3.7 ((-121.3788 38.39169, -121.3788 38.3…
#>  2         5 Urban area     165.       5   ((-122.8139 38.506, -122.8139 38.506…
#>  3         5 Urban area      90.6      5   ((-122.1707 38.08574, -122.1707 38.0…
#>  4         2 Urban area    2538.       3.6 ((-122.4463 37.57833, -122.4463 37.5…
#>  5         5 Urban area     514.       5   ((-121.2264 37.88368, -121.2264 37.8…
#>  6         5 Urban area     131.       5   ((-122.5233 38.02747, -122.5233 38.0…
#>  7         5 Urban area     258.       5   ((-121.7905 37.7324, -121.7905 37.73…
#>  8         5 Urban area     322.       5   ((-120.9724 37.75669, -120.9724 37.7…
#>  9         4 Urban area     401.       4   ((-119.7376 36.88969, -119.7376 36.8…
#> 10         6 Urban area      98.6      6   ((-121.6897 36.74055, -121.6897 36.7…
#> # … with 2,133 more rows

Created on 2023-01-30 with reprex v2.0.2
