Relabel List Identifiers doesn't change Name field? #5239

Closed
mblue9 opened this issue Dec 21, 2017 · 9 comments

Comments

@mblue9
Contributor

mblue9 commented Dec 21, 2017

Hello, I was trying to use the Dataset Collections tool "Relabel List Identifiers" to change the name of datasets in a collection, to then use the new names as the identifiers, e.g. so featurecounts will add that identifier to the counts file header as here: galaxyproject/tools-iuc#1582

But after relabelling the collection, while the name looks like it has changed in the History, the Name field retains the old name, and that's what featurecounts uses instead of the new name. See the screenshot below: the Name field is still showing the old name "HISAT on data..." instead of the new name "ib6-3".

Is Relabel meant to change the Name field?
[Screenshot: screen shot 2017-12-21 at 3 34 08 pm]

@jmchilton
Member

This is confusing, but datasets in collections are also datasets in histories (called HDAs internally). HDAs have names, which can change over time and aren't super useful for sample tracking (IMO). Datasets in collections also have identifiers (the thing displayed instead of the name in the history bar in your screenshot), which are preserved and are more useful for sample tracking. If I could redo collections I would change this - they shouldn't have names - but they do, and these names leak out and are visible in the GUI in different ways (but fewer ways over time).

The Relabel List Identifiers tool only builds a new collection with new identifiers - the name is indeed unchanged. Hopefully most tools consume and use identifiers in reporting instead of names - if there is a tool that is using the name, we should fix the tool (e.g. deeptools/deepTools#500). If we fix the tools, it will hopefully not matter that the old names are preserved - does that make sense? Does that help clarify this?
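
To make the name/identifier distinction concrete, here is a minimal Python sketch. It is only an illustration: `Dataset`, `Element`, and `relabel` are hypothetical names invented for this example and are not Galaxy's actual data model or API. It assumes a collection is simply a list of elements, each pairing an identifier with a dataset that carries its own name.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Dataset:
    """Hypothetical stand-in for a history dataset (HDA); not Galaxy's class."""
    name: str


@dataclass(frozen=True)
class Element:
    """Hypothetical collection element: an identifier pointing at a dataset."""
    identifier: str
    dataset: Dataset


def relabel(elements: List[Element], mapping: Dict[str, str]) -> List[Element]:
    """Build a new collection with new identifiers; the datasets are reused as-is."""
    return [
        Element(mapping.get(e.identifier, e.identifier), e.dataset)
        for e in elements
    ]


# The old name is written exactly as it appears (elided) in the report above.
original = [Element("HISAT on data...", Dataset("HISAT on data..."))]
relabelled = relabel(original, {"HISAT on data...": "ib6-3"})

print(relabelled[0].identifier)    # the new identifier
print(relabelled[0].dataset.name)  # the dataset name, untouched by relabelling
```

In this sketch a tool that reports `dataset.name` would still show the old label after relabelling, while a tool that reports `identifier` would show the new one.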

@mblue9
Contributor Author

mblue9 commented Dec 21, 2017

Thanks for the clarification @jmchilton! Yes, that does make sense: the relabel tool changes the identifier and not the name. I'm still confused, though, as to why the version of featurecounts that I'm using (1.6.0.1), which was changed to use the identifier in galaxyproject/tools-iuc#1582, is still adding the name/old identifier "HISAT2 on..." and not the relabelled new identifier to the column heading in the output counts file. I don't know if that's a tool issue or something else. I'll try to look into it further.

@mvdbeek
Member

mvdbeek commented Dec 31, 2017

Hmm, the element_identifier may not be available in all situations (that is, only the name fallback is available ...). From the wrapper point of view it should work, but it may be that the <action/> scope doesn't have access to the element identifier, which would make this a missing feature in Galaxy.
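
The fallback behaviour described here can be sketched in a few lines of Python. This is an assumption-laden illustration, not the featureCounts wrapper or Galaxy code: `header_label` is a hypothetical helper, and passing `None` stands in for a scope where the element identifier is not available.

```python
from typing import Optional


def header_label(name: str, element_identifier: Optional[str] = None) -> str:
    """Prefer the collection element identifier; fall back to the dataset name.

    Hypothetical helper for illustration only.
    """
    return element_identifier if element_identifier else name


# Identifier available: the relabelled identifier is used for the column header.
with_identifier = header_label("HISAT2 on...", "ib6-3")

# Identifier unavailable in this scope: only the name fallback is left.
without_identifier = header_label("HISAT2 on...")

print(with_identifier)
print(without_identifier)
```

If the identifier never reaches the code that builds the header, the output looks exactly like the reported symptom: the old name appears even though the collection was relabelled.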

@mblue9
Contributor Author

mblue9 commented Jan 1, 2018

Ah I see, thanks for looking into this @mvdbeek! Do you have any idea how difficult it would be to add that feature? It would be so good to get the naming working so that the real sample identifier is added to the header for featurecounts (and not the really hard-to-keep-track-of "HISAT2 on.."). I need this for lots of workflows and I don't have a good way around it at the moment :(

@mvdbeek
Member

mvdbeek commented Jan 1, 2018

I'll have a look tomorrow.

@mvdbeek
Member

mvdbeek commented Jan 2, 2018

Hmm, I can't recapitulate the problem on 17.09 or dev, it seems that I am getting the correct header, but maybe we're not doing the same thing. I have uploaded the test bam twice and created a collection from the test data, where the identifiers are A and B, while I didn't change the name, which is still featureCounts_input1.bam.

[Screenshot: screen shot 2018-01-02 at 11 02 14]

I did try this with 17.05 as well, where it does actually do what you have described, due to #5049 not being in 17.05. Are you still on 17.05? Or maybe on a 17.09 that doesn't include 1973338?

@mblue9
Contributor Author

mblue9 commented Jan 2, 2018

Ah we are on 17.09 from before that commit - 24da84a so I guess that's why!

Thanks a lot for figuring that out @mvdbeek !!

@mvdbeek
Member

mvdbeek commented Jan 2, 2018

I'll close this, but feel free to re-open if you still have problems after updating!

@mblue9
Contributor Author

mblue9 commented Jan 10, 2018

Just to confirm - updating 17.09 fixed this issue. Thanks again!
