404, not found when providing format=original in Data Access API #6408

Closed
tainguyenbui opened this issue Nov 25, 2019 · 11 comments

Comments

@tainguyenbui
Contributor

tainguyenbui commented Nov 25, 2019

A 404 Not Found error is returned when attempting to download a file in its original format.

Use cases

Data has been ingested

The dataset contains at least two files: the original file and the ingested version.
For example, when a .csv file is uploaded, Dataverse ingests it and converts it to .tab, keeping the .csv as the original.

When downloading the file, we can request either the ingested version:
GET http://$SERVER/api/access/datafile/$FileId

or

the original version:
GET http://$SERVER/api/access/datafile/$FileId?format=original

The above scenario works as expected
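
As a rough illustration of the two requests above, here is a minimal Python sketch that only builds the URLs. The helper name `datafile_url` and the file id `42` are made up for the example; the path and the `format=original` parameter are the ones quoted in this issue.

```python
from urllib.parse import urlencode


def datafile_url(server: str, file_id: int, original: bool = False) -> str:
    """Build a Data Access API download URL for a single file.

    `datafile_url` is a hypothetical helper, not part of any Dataverse client.
    """
    url = f"{server.rstrip('/')}/api/access/datafile/{file_id}"
    if original:
        # Ask for the file as it was uploaded rather than the ingested version.
        url += "?" + urlencode({"format": "original"})
    return url


# 42 is a placeholder file id.
print(datafile_url("https://demo.dataverse.org", 42))
print(datafile_url("https://demo.dataverse.org", 42, original=True))
```

For an ingested .csv, the first URL would return the .tab version and the second the .csv.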

Data has not been ingested

The dataset contains at least 1 file, the one that was uploaded

GET http://$SERVER/api/access/datafile/$FileId works as expected

GET http://$SERVER/api/access/datafile/$FileId?format=original returns 404

Questions:

  • Is the above expected behaviour?
  • Should the API not return the only file available when retrieving the original?

Notes:

The API guides already specify that the format query parameter is only available for tabular data

@pdurbin
Member

pdurbin commented Nov 25, 2019

@tainguyenbui I think you're saying that you are unable to download a file that wasn't ingested. No, this is unexpected. Many, many files are not ingested (only certain tabular formats are) and they should all be downloadable. Can you reproduce this problem on https://demo.dataverse.org ?

@tainguyenbui
Contributor Author

Sorry @pdurbin, I might have explained it badly. I am able to download files. However, there is a behaviour that does not seem right to me. If a file cannot be ingested and keeps its original extension, then calling
GET http://$SERVER/api/access/datafile/$FileId?format=original
returns a 404. I don't think that is quite right, because if the file was not processed at all, requesting its original format should still let me download the file.

Of course, the above request without the ?format=original does work.
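
To make the difference concrete, the reported behaviour and the behaviour being asked for can be sketched as one small decision function. This is purely illustrative and is not Dataverse's server code; `resolve_download`, its parameters, and the `fallback` switch are all invented for the example.

```python
from typing import Optional, Tuple


def resolve_download(
    stored_file: str,
    saved_original: Optional[str],
    format_param: Optional[str],
    fallback: bool,
) -> Tuple[int, Optional[str]]:
    """Return (HTTP status, file served) for one download request.

    stored_file:    the file served by default (the .tab after ingest,
                    or the uploaded file if it was never ingested).
    saved_original: the preserved upload, present only for ingested files.
    fallback:       False models the behaviour reported in this issue,
                    True models the behaviour being requested.
    """
    if format_param != "original":
        return 200, stored_file
    if saved_original is not None:
        return 200, saved_original
    # Not ingested: there is no separate "original" copy on record.
    if fallback:
        return 200, stored_file
    return 404, None


# Ingested file: format=original works either way.
print(resolve_download("data.tab", "data.csv", "original", fallback=False))
# Non-ingested file: reported behaviour versus requested behaviour.
print(resolve_download("report.pdf", None, "original", fallback=False))
print(resolve_download("report.pdf", None, "original", fallback=True))
```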

@pdurbin
Member

pdurbin commented Nov 25, 2019

@tainguyenbui ok, I was confused by this:

[Screenshot: Screen Shot 2019-11-25 at 10 34 20 AM]

Is it really a 404 in the case above?

@tainguyenbui
Contributor Author

tainguyenbui commented Nov 25, 2019

my bad @pdurbin, I corrected it

@pdurbin
Member

pdurbin commented Nov 25, 2019

@tainguyenbui thanks! It's all much clearer now. 😄

To boil it down... are you saying you'd like format=original to always allow you to download the original file so you don't have to think about what kind of file it is?

If so, I feel like this idea has been discussed before but I don't have any issues handy to link to. 😄

@tainguyenbui
Contributor Author

In this case, I do not want to modify an existing behaviour unless it makes sense to everyone.

I would love to discuss and understand why it would not return the actual original format if the file has not been modified.

thanks a lot for your lightning responses @pdurbin

@pdurbin
Member

pdurbin commented Nov 25, 2019

@tainguyenbui well, you'll never get consensus, of course. 😄

I guess we could add a brand new query parameter rather than changing the behavior of the old one.

Do you have a workaround? From your comment at #6385 (comment) I'm guessing that you might be checking for the presence of fields like originalFileFormat or originalFormatLabel to know if the file was ingested or not.

I'm curious if you're allowing users of https://github.com/IQSS/dataverse-client-javascript to not worry about these details. You could offer them a "download original file" feature that does some checking and gives them what they want without thinking hard about this stuff. 😄
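
A client-side "download original file" helper along those lines might look like the sketch below. It assumes the file metadata is available as a flat dict with an `id` key, which is an assumption about shape rather than the documented response format; only the field names `originalFileFormat` and `originalFormatLabel` come from this thread, and the function names are hypothetical.

```python
def was_ingested(data_file: dict) -> bool:
    """Guess whether a file was ingested from the optional original* fields."""
    return any(
        data_file.get(key)
        for key in ("originalFileFormat", "originalFormatLabel")
    )


def original_download_path(data_file: dict) -> str:
    """Return a path that should yield the uploaded file without a 404.

    Assumes data_file["id"] holds the file id (hypothetical metadata shape).
    """
    path = f"/api/access/datafile/{data_file['id']}"
    if was_ingested(data_file):
        # Only ingested files have a separate original to ask for.
        return path + "?format=original"
    # Non-ingested file: the stored file already is the original.
    return path


# Placeholder metadata for illustration only.
ingested = {"id": 7, "originalFileFormat": "text/csv"}
not_ingested = {"id": 8}
print(original_download_path(ingested))
print(original_download_path(not_ingested))
```

The caller never has to know whether the file was ingested; the helper hides the check.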

@tainguyenbui
Contributor Author

tainguyenbui commented Nov 25, 2019

@pdurbin you are very, very close to the approach we have taken. We have two different endpoints in our own backend, one of them ending with the path /original.

Then I updated the client with a flag, 'getOriginalFile', which is currently false by default; when it is false, the query param ?format=original is not appended to the URL.

Right now, we are looking at the optional properties original... to determine whether we should be asking for the original or not 😬

of course, a lot of pain could be saved if this was handled at the very end 🤣

@pdurbin
Member

pdurbin commented Feb 10, 2023

@tainguyenbui heads up that we poked a bit at this as part of #9374 and we think it might be fixed already. We'll leave this issue open for now until we do some more checking.

@cmbz

cmbz commented Aug 20, 2024

To focus on the most important features and bugs, we are closing issues created before 2020 (version 5.0) that are not new feature requests with the label 'Type: Feature'.

If you created this issue and you feel the team should revisit this decision, please reopen the issue and leave a comment.

@cmbz cmbz closed this as completed Aug 20, 2024