Getting error "Error: Internal error (invalid zip archive). Please try again." Take 2 #360
Comments
Thanks @corneliusroemer, we are continuing to look into this. Would you mind updating to Best, |
I am also seeing this error in our automated pipelines for zika, mpox, measles, and dengue, which are all scheduled to run at 9AM PDT. If I rerun the workflow at a later time, the error goes away. Does the time coincide with the datasets updates? |
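Since rerunning the workflow later makes the error go away, a scheduled pipeline could retry with a growing delay instead of failing on the first attempt. Below is a minimal Python sketch, assuming the CLI exits with a non-zero status when it reports this error (not confirmed in this thread). `run_with_retries` and its parameters are hypothetical names, and the command to run is supplied by the caller.

```python
import subprocess
import time
from typing import Callable, Sequence


def run_with_retries(
    cmd: Sequence[str],
    attempts: int = 4,
    base_delay: float = 60.0,
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run cmd until it exits with status 0; return the number of tries used.

    Assumption: a failed download shows up as a non-zero exit status.
    The wait doubles after each failure (base_delay, 2x, 4x, ...).
    """
    for attempt in range(1, attempts + 1):
        result = run(list(cmd))
        if result.returncode == 0:
            return attempt
        if attempt < attempts:
            # Back off before the next try; no sleep after the final failure.
            sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError(f"command failed after {attempts} attempts: {' '.join(cmd)}")
```

The `run` and `sleep` parameters exist only so the function can be exercised without starting a real download or waiting.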
@ericcox1 Yes, getting the error with 16.16.0 as well. An example run is: Is it possible that some part of the server struggles with the number of requests it's getting? As part of a project, I'm doing dataset downloads via CLI for a few taxa around every 3 minutes (it's run as part of CI). It's done with an API key and the allowed rate is 10 requests per second, so we should be far away from that limit, but it might still be that no one else has hitherto sent requests so frequently. |
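To put the "far away from that limit" estimate in numbers, here is a rough Python calculation. The comment only says "a few taxa", so the figure of 5 requests per batch is a placeholder, and a single CLI invocation may issue more than one HTTP request, which this sketch ignores.

```python
def average_request_rate(requests_per_batch: int, batch_interval_s: float) -> float:
    """Average requests per second for batches sent at a fixed interval."""
    return requests_per_batch / batch_interval_s


ALLOWED_RATE = 10.0  # requests per second, as quoted in the comment above

# Placeholder: 5 taxa per batch, one batch every 3 minutes (180 s).
rate = average_request_rate(requests_per_batch=5, batch_interval_s=180.0)
headroom = ALLOWED_RATE / rate

print(f"{rate:.3f} req/s on average, about {headroom:.0f}x below the allowed rate")
```

An average like this says nothing about bursts: requests fired together at the start of a CI run still arrive close together.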
I've been getting the same error (Error: Internal error (invalid zip archive). Please try again) repeatedly for the past several days while trying to get influenza A genomes with this command:
Here is the gzipped --debug output: datasets.log.gz The download proceeds for a varying amount of time (~two to 39 minutes) and downloads a varying amount of data (haven't kept track but noticed different numbers of GB) before exiting with the error. I'm using datasets version: 16.17.0 |
Earlier today, this command succeeded for me:
-- it's the first example command on https://www.ncbi.nlm.nih.gov/datasets/docs/v2/how-tos/virus/get-influenza-genomes/ . In 87 minutes, it downloaded a 555MB (530MiB) file that includes data_report.jsonl and genome.fna, but not biosample.jsonl. Unfortunately the command above with |
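A downloaded package can also be checked locally to see whether it is a valid zip archive and which files it contains. The following is a minimal sketch using Python's standard `zipfile` module. `inspect_archive` and `missing_files` are hypothetical helper names, and matching on the base file name is an assumption made so the check does not depend on the directory layout inside the archive.

```python
import posixpath
import zipfile
from typing import Iterable, List


def inspect_archive(path: str) -> List[str]:
    """Return the member names of a zip archive, raising ValueError if it is damaged."""
    if not zipfile.is_zipfile(path):
        raise ValueError(f"{path} is not a valid zip archive")
    with zipfile.ZipFile(path) as zf:
        # testzip() reads every member and returns the first one with a bad CRC.
        bad_member = zf.testzip()
        if bad_member is not None:
            raise ValueError(f"corrupt member in {path}: {bad_member}")
        return zf.namelist()


def missing_files(member_names: Iterable[str], expected: Iterable[str]) -> List[str]:
    """Return the expected base file names that do not appear in the archive."""
    present = {posixpath.basename(name) for name in member_names}
    return [name for name in expected if name not in present]
```

For example, `missing_files(inspect_archive("ncbi_dataset.zip"), ["data_report.jsonl", "genome.fna", "biosample.jsonl"])` would list whichever of the three files the archive lacks.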
Hi AngieHinrichs, Thanks for opening the issue. We're looking into it. Nuala |
Can you run this again with the |
OK, I am kicking off this command (there's no
|
OK, PHID is 2F4065564DC261B8F1FA965F. Log attached. |
Hi AngieHinrichs, We need to take a deeper look at the issue. We'll post here when we have a fix. Nuala |
Thanks @olearyna! |
Hi, Any good news on this? I have had the same error since Monday; I thought it was something wrong with my code until I read this post. |
Hi carolinasisco, We are actively working on a fix and aim to have it released within the week. We apologize for any inconvenience this may have caused. Thanks for the patience! Nuala |
Hi carolinasisco and AngieHinrichs, We have released a fix in the latest version (v16.18.1) of the command line tool that we believe addresses the reported issues. Please test this update and let us know if you encounter any further errors. Thanks |
Thanks @olearyna, I'll try it out right away! |
It worked and it was much faster than before! Thanks again! |
Great! I'll close this issue. |
Hi, it did not work for me, any suggestions? |
Thanks so much @olearyna and @ericcox1! I will comment as soon as I see failures again. @carolinasisco are you sure you're using version 16.18.1? I think it would help the devs if you could run with --debug then and share the PHID 😀 |
Hi @carolinasisco, Yes, if you are still having issues with the latest version can you run |
Hi @olearyna, I updated through conda --update, and the version showing is 16.18.1. This is my code (I ran it with --debug as suggested):
datasets download gene accession --inputfile ~/Desktop/wp_1_50 --filename wp150 --include gene,protein --debug
Error: Download error: http2: server sent GOAWAY and closed the connection; LastDownloading: ncbi_dataset.zip 4.62MB error
Thanks! |
Hi carolinasisco, Thanks for the information! I think this is a separate issue from the Nuala |
Hi, thank you. I'm trying to download a large set of sequences (nt and aa) from pseudomonas. |
Hi, I would like to add another example of this error, in hopes of it being helpful in finding a solution. I am using ncbi datasets version 16.31.0. I was trying to download Streptococcus genomic sequences using the following command: This results in the following outcome: On several attempts, the validation of the package files reaches 6 - 9 %. I reran the command while including either genomes or gbff. When downloading genomes only ( |
Hi @mverce, Thanks for your report. I wasn't able to reproduce this error and we think you may have encountered a temporary problem. If you don't mind trying this one more time, please add the
Best, |
Hi @ericcox1, I have tried it again with the commands that were problematic yesterday, as well as with your exact command (incl. --filename strep.zip), but the problem persists. The last Ncbi-Phid from the debug output is: 1CA6C01E4134F3592F685054.6.1 Thanks and best regards, |
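Pulling the last PHID out of a long debug log by hand is tedious, so a short script can help. This is a minimal Python sketch, assuming the log contains the text `Ncbi-Phid` followed by separator characters and then the value; the exact log layout is not shown in this thread. `last_phid` is a hypothetical helper name.

```python
import re
from typing import Optional

# Assumption: the header name is followed by non-word separators (for example
# ": " or ": [") and then a value made of letters, digits, and dots.
PHID_PATTERN = re.compile(r"Ncbi-Phid\W+([0-9A-Za-z.]+)", re.IGNORECASE)


def last_phid(log_text: str) -> Optional[str]:
    """Return the last Ncbi-Phid value found in the log text, or None."""
    matches = PHID_PATTERN.findall(log_text)
    if not matches:
        return None
    # Drop a trailing dot in case the value ends a sentence.
    return matches[-1].rstrip(".")
```

For a gzipped log, the text can be read with `gzip.open(path, "rt")` before being passed to `last_phid`.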
I tried the same command as Eric listed and can't reproduce |
Sadly the issue is still active, at least for the taxa ebola-zaire and mpox.
See #356
Originally posted by @corneliusroemer in #356 (comment)