Remove empty file from a dataset collection #5090

Closed
alimatai opened this issue Nov 29, 2017 · 7 comments

Comments

@alimatai
Contributor

Hello,

I'm working on a tool that consists of two scripts run one after the other; the second script takes the output of the first as its input.

The problem is that with some data files, the first script fails and produces no output file. I have implemented a proper exit in the second script to prevent the tool from crashing, but when I use a dataset collection, an empty output file remains (it is green and empty, not red).

Is it possible to delete it, or to keep it from appearing at all? If I'm correct, there is a tool that removes failed datasets from a dataset collection; could it work in this case?

All the best,

Mataivic.

@jmchilton
Member

Thanks for the issue - I don't have any progress to report but I wanted to let you know I saw this and that I think it is a really good idea.

@alimatai
Contributor Author

alimatai commented Mar 1, 2018

@jmchilton I'm looking at the class FilterFailedDatasetsTool (here). I assume an element of a collection is returned in the filtered collection when valid == True?

Is there an element attribute that indicates that an element is empty? If so, an additional condition would prevent empty files from being returned in the filtered collection, right? For example, it could use an empty state/attribute on the dataset, or check whether the size of the file equals 0.

I'm trying to find out whether there is an element class or an empty attribute, but I don't know if there is one (it's my first time in the Galaxy source code :) ). The only things I found are here:

  • line 1272, in the class History: `dict_element_visible_keys = ['id', 'name', 'genome_build', 'deleted', 'purged', 'update_time', 'published', 'importable', 'slug', 'empty']`

  • line 1777, in the class DataSet: `states = Bunch(NEW='new', UPLOAD='upload', QUEUED='queued', RUNNING='running', OK='ok', EMPTY='empty', ....)`

Would it be possible to use the 'empty' entry of these lists?
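
The size-based check mentioned above can be sketched outside Galaxy as follows. This is only a minimal illustration, assuming plain file paths rather than Galaxy dataset objects; `drop_empty_files` and `demo` are hypothetical helpers and not part of the Galaxy code base.

```python
import os
import tempfile


def drop_empty_files(paths):
    """Return only the paths that exist and contain at least one byte."""
    return [p for p in paths if os.path.isfile(p) and os.path.getsize(p) > 0]


def demo():
    """Create one empty and one non-empty file, then filter them."""
    with tempfile.TemporaryDirectory() as tmp:
        empty = os.path.join(tmp, "empty.txt")
        full = os.path.join(tmp, "full.txt")
        open(empty, "w").close()  # zero-byte file
        with open(full, "w") as handle:
            handle.write("data\n")
        kept = drop_empty_files([empty, full])
        return [os.path.basename(p) for p in kept]
```

Inside Galaxy the equivalent test would have to go through the dataset object rather than the file system, so this only shows the idea of the condition, not where it would live.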

@alimatai
Contributor Author

alimatai commented Mar 3, 2018

I found a way to do it: write `if element.is_ok and element.has_data():` instead of only `if element.is_ok():`. It seems to work fine on my Galaxy instance. I can open a PR on Monday.
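
The combined condition can be illustrated with a small stand-alone sketch. `FakeElement` and `filter_ok_and_non_empty` are hypothetical stand-ins written for this example; they only mimic the `is_ok` attribute and `has_data()` method named above and are not Galaxy classes. The sketch assumes `is_ok` behaves like a property and `has_data()` like a method, as in the first expression quoted above.

```python
from dataclasses import dataclass


@dataclass
class FakeElement:
    """Hypothetical stand-in for a collection element (not a Galaxy class)."""
    name: str
    state: str
    size: int

    @property
    def is_ok(self):
        # Assumption: "ok" is the state of a successfully finished dataset.
        return self.state == "ok"

    def has_data(self):
        # Assumption: an element has data when its size is non-zero.
        return self.size > 0


def filter_ok_and_non_empty(elements):
    """Keep elements that finished successfully and are not empty."""
    return [e for e in elements if e.is_ok and e.has_data()]


elements = [
    FakeElement("failed", "error", 120),
    FakeElement("green_but_empty", "ok", 0),
    FakeElement("good", "ok", 42),
]
kept_names = [e.name for e in filter_ok_and_non_empty(elements)]
```

With these three elements only `good` is kept: the failed element is dropped by the first half of the condition and the green but empty one by the second.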

@hexylena
Member

I think we have a filter-failed tool now.

@nsoranzo
Member

@erasche This was for filtering empty, not failed, datasets in a collection.

nsoranzo reopened this Aug 14, 2019

@hexylena
Member

hexylena commented Aug 14, 2019

> the first script fails and there is no output file. I have implemented a proper exit in the second script to prevent a crash of the tool

So they should revert this proper exit, and then it's solved :)
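
A rough sketch of that suggestion: instead of exiting cleanly, the second script reports a missing or empty input with a non-zero status, so the resulting dataset can be treated as failed and removed by the filter for failed datasets. `check_input` is a hypothetical helper, and the sketch assumes the tool is set up so that a non-zero exit code marks the job as failed.

```python
import os
import sys


def check_input(path):
    """Return 0 if the input file is usable, 1 if it is missing or empty."""
    if not os.path.isfile(path) or os.path.getsize(path) == 0:
        # Report the problem on stderr so it shows up in the job output.
        print(f"error: input {path!r} is missing or empty", file=sys.stderr)
        return 1
    return 0
```

The script would then end with something like `sys.exit(check_input(input_path))` before doing any real work, where `input_path` is whatever argument the script receives.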

@mvdbeek
Member

mvdbeek commented Aug 14, 2019

But we have this now as well; @Mataivic implemented it in #5640.

mvdbeek closed this as completed Aug 14, 2019