
FTP upload in upload form uses a separate request for each selected file #1090

Closed
blankenberg opened this issue Nov 13, 2015 · 9 comments

Comments

@blankenberg
Member

This means my computer must remain on and connected to the internet until all of the FTP files have been added to the history (a running/queued state is OK). If I disconnect or leave the page, my files stop being added to the history/job queue for upload. Since the files are already on the server, I should be able to expect this to be a very quick operation on the user's end, not one that requires staying connected for more than a few seconds after clicking upload.

This process can take several minutes to hours, depending on the number of files I am trying to import into my history.

It also causes a significant long-term client resource drain (and possibly browser warning messages?) due to long-running scripts.

Each selected file also becomes its own upload job (instead of batching them together as one job with many files) -- I'm not sure if this is better or worse. Multiple uploads will finish faster, but at the expense of being able to run other (non-upload) jobs while all of the individual upload jobs run, and all for what is a single user action.
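
For illustration, a minimal sketch of the per-file behavior described above, as I understand it: the upload form loops over the selected FTP files and issues one request (and so one job) per file, which is why the browser must stay open and connected until the loop finishes. The endpoint, tool id, parameter names, and function name below are assumptions for the sake of the example, not the actual client code.

```typescript
// Hypothetical sketch (not the actual Galaxy client code): one POST and one
// upload job per selected FTP file. The browser tab must stay open until the
// loop has finished, which is the behavior this issue describes.
async function importFtpFilesOneByOne(ftpPaths: string[], historyId: string): Promise<void> {
    for (const path of ftpPaths) {
        // Assumed endpoint and payload shape, for illustration only.
        await fetch("/api/tools", {
            method: "POST",
            headers: { "Content-Type": "application/json" },
            body: JSON.stringify({
                tool_id: "upload1",
                history_id: historyId,
                inputs: { "files_0|ftp_files": path },
            }),
        });
    }
}
```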

@guerler
Contributor

guerler commented Nov 15, 2015

We could implement something like this: https://gist.github.com/guerler/6fab29fe67a02972b70c
However, this would still trigger multiple jobs and cause performance issues. After all, we do not limit the number of FTP files, i.e. a user could instantly trigger several thousand jobs, and the client would then be waiting on all of the 'job accepted' responses at the same time. Alternatively, we could trigger a batch-mode job run with the upload tool, but batch mode currently only operates on dataset selections; as far as I know, iterating over other parameters is not supported yet.
What do you mean by 'significant long-term client resource drain'?

@blankenberg
Member Author

> What do you mean by 'significant long-term client resource drain'?

When I click to 'upload' ~800 small FASTQ files into my history, code keeps running in that browser window for about an hour, consuming resources. I must leave that window alone, consuming resources and connected to the internet, for the entire hour; closing the window or disconnecting from the internet during that time means the remaining FTP files will not be added to the history.

It used to be one click, one job (one POST is the important part), with the executed job returned from the upload tool. The files would churn in the background and a user could keep using Galaxy while the FTP files were being imported.
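
A minimal sketch of that one-POST behavior, assuming a hypothetical batched payload (the endpoint, tool id, parameter names, and function name are illustrative, not the actual upload tool interface): all selected FTP paths go out in a single request, the server answers once the job is accepted, and the browser can be closed right afterwards.

```typescript
// Hypothetical sketch (assumed endpoint and parameter names): batch every
// selected FTP path into a single tool request so that one POST yields the
// upload job(s) and the client is done within seconds.
async function importFtpFilesBatched(ftpPaths: string[], historyId: string): Promise<void> {
    const inputs: Record<string, string> = {};
    ftpPaths.forEach((path, i) => {
        // One repeat block per file inside the same request payload.
        inputs[`files_${i}|ftp_files`] = path;
    });
    // A single POST; the response only needs to confirm that the job was
    // accepted, after which the client no longer has to stay connected.
    await fetch("/api/tools", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ tool_id: "upload1", history_id: historyId, inputs }),
    });
}
```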

@guerler
Contributor

guerler commented Nov 16, 2015

Thanks for the details, and yes, that needs to be fixed then.

@chambm
Contributor

chambm commented Jun 3, 2016

+1. I run into this as well. Even the multi-select function is slow with a few hundred items and causes a script-kill alert in Chrome.

@blankclemens
Member

+1: Chrome's average CPU consumption sits at 45-65% while adding files that were already uploaded via FTP -- 768 files, 3.9 GB, 30+ minutes? ;)

@guerler
Contributor

guerler commented Jul 27, 2017

@martenson let me know if you need help and/or are working on this.

@martenson
Member

@guerler I have not been working on this; it is on my shortlist, but after the conference month I think I will start with libraries first. In other words, feel free to tackle this.

@martenson
Member

martenson commented Oct 9, 2017

This does not seem solved, since it still uses one request per file:
[screenshot: 2017-10-09 16:40:45]

Edit: my mistake, these are not upload requests; they are history requests. Still a bug, but not directly related to this one.

martenson reopened this Oct 9, 2017
@martenson
Member

The issue above is tracked in #4780.
