Prework
Confirm that your issue is most likely a genuine bug in the targets package itself and not a user error or known limitation. For usage issues and troubleshooting, please post to the discussions instead.
Description
Currently, when a URL-type target runs into any non-200 HTTP status code, it returns the same error ("could not access url"). This could be improved to provide more specific feedback, since it can get difficult to debug (see below). Also, the current wording could be read to suggest a 404 error, which is not always the case.
Specifically, I have come across a case where a server (a) does not permit the HEAD method and (b) unhelpfully returns 400 instead of the appropriate 405 status, so even the status code on its own is misleading. This is quite hard to debug because a plain curl command or a web browser returns the resource correctly, with ETag and Last-Modified headers.
Unfortunately this particular error relates to an instance of Socrata, a data publication solution which is used by quite a few open data publishers, so this issue may affect a broader range of users.
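One way to confirm the behaviour described above is to request the same URL twice, once without a body (HEAD-style) and once with a normal GET, and compare the status codes. The following is a minimal sketch, assuming the curl R package is installed; probe_status() and diagnose_probe() are hypothetical helper names and are not part of targets.

```r
# Hypothetical helper: return only the HTTP status code for a URL.
# Requires the curl package. nobody = TRUE asks libcurl to skip the body,
# which makes it send a HEAD request; nobody = FALSE sends a normal GET.
probe_status <- function(url, head_only = TRUE) {
  handle <- curl::new_handle(nobody = head_only)
  curl::curl_fetch_memory(url, handle = handle)$status_code
}

# Hypothetical helper: interpret a pair of status codes without any
# network access, so it can be checked in isolation.
diagnose_probe <- function(head_status, get_status) {
  head_ok <- head_status >= 200 && head_status < 300
  get_ok <- get_status >= 200 && get_status < 300
  if (head_ok && get_ok) {
    "HEAD and GET both succeed"
  } else if (!head_ok && get_ok) {
    sprintf(
      "GET succeeds but HEAD returns %d: the server may not support HEAD",
      head_status
    )
  } else {
    sprintf(
      "GET fails with status %d: the URL itself may be unreachable",
      get_status
    )
  }
}
```

Calling diagnose_probe(probe_status(url, TRUE), probe_status(url, FALSE)) with the affected URL would then separate "the server rejects HEAD" from "the URL is unreachable". Note that the GET probe downloads the whole resource, so it is only suitable for one-off debugging.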
Reproducible example
Run tar_make() using the following _targets.R:
Here is a quick investigation of what headers and content the server returns using different curl options:
Created on 2021-02-07 by the reprex package (v0.3.0)
Expected result
It would be useful to see more about what error occurred - status code and error text. Even the current text could be adapted so as not to suggest a 404 HTTP error.
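As an illustration of the kind of message meant here, the sketch below builds an error string that names the status code and its standard reason phrase. url_error_message() is a hypothetical function, not the message targets currently emits, and the sketch assumes the failing check is a HEAD request, as in the case described above. The lookup table is deliberately small and not exhaustive.

```r
# Hypothetical message builder, shown only to illustrate the suggestion.
url_error_message <- function(url, status_code) {
  # A short, non-exhaustive lookup of standard HTTP reason phrases.
  reasons <- c(
    "400" = "Bad Request",
    "401" = "Unauthorized",
    "403" = "Forbidden",
    "404" = "Not Found",
    "405" = "Method Not Allowed",
    "500" = "Internal Server Error"
  )
  reason <- unname(reasons[as.character(status_code)])
  # Fall back to a neutral label for codes missing from the table.
  if (is.na(reason)) {
    reason <- "unrecognized status"
  }
  sprintf(
    "HEAD request to %s failed with HTTP status %d (%s).",
    url, status_code, reason
  )
}
```

With a message like this, a 400 from a server that rejects HEAD would no longer read as if the resource were missing.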
I don't see a way to get around the fact that the server does not accept HEAD requests: the same server does not respect the range curl option telling the server to only return e.g. the first 500 bytes, which suggests using this option will not be reliable. Failing that, the only other option is to GET the whole resource from the URL, which would defeat most of the purpose of the URL-format target.
Diagnostic information