Add the trackhub composite datatype #2348

Merged
merged 9 commits into from
Jul 22, 2016
3 changes: 3 additions & 0 deletions config/datatypes_conf.xml.sample
Original file line number Diff line number Diff line change
@@ -530,6 +530,9 @@
<datatype extension="mothur.axes" type="galaxy.datatypes.mothur:Axes" display_in_upload="true"/>
<datatype extension="mothur.sff.flow" type="galaxy.datatypes.mothur:SffFlow" display_in_upload="true"/>
<datatype extension="mothur.count_table" type="galaxy.datatypes.mothur:CountTable" display_in_upload="true"/>
<datatype extension="trackhub" type="galaxy.datatypes.tracks:UCSCTrackHub" display_in_upload="true">
    <display file="ucsc/trackhub.xml" />
</datatype>
</registration>
<sniffers>
<!--
6 changes: 6 additions & 0 deletions display_applications/ucsc/trackhub.xml
@@ -0,0 +1,6 @@
<display id="ucsc_trackhub" version="1.0.0" name="display at Track Hub UCSC">
    <link id="main" name="main">
        <url>https://genome.ucsc.edu/cgi-bin/hgHubConnect?hubUrl=${qp($hub_file.url + '/myHub/hub.txt')}&amp;hgHub_do_firstDb=on&amp;hgHub_do_redirect=on&amp;hgHubConnect.remakeTrackHub=on</url>
        <param type="data" name="hub_file" url="galaxy_${DATASET_HASH}" allow_extra_files_access="True" />
    </link>
</display>
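The link template above can be read as plain string assembly. The following is a minimal sketch, assuming that `qp` in the template behaves like Python's `quote_plus`; the function name `build_hub_connect_url` and the example dataset URL are hypothetical and not part of this PR.

```python
from urllib.parse import quote_plus

UCSC_HUB_CONNECT = "https://genome.ucsc.edu/cgi-bin/hgHubConnect"


def build_hub_connect_url(dataset_url):
    """Return an hgHubConnect link for a hub stored under dataset_url."""
    # The hub.txt location is quoted so it survives as a single query value.
    hub_txt = quote_plus(dataset_url + "/myHub/hub.txt")
    # Same flags as in the display application template.
    flags = "hgHub_do_firstDb=on&hgHub_do_redirect=on&hgHubConnect.remakeTrackHub=on"
    return "%s?hubUrl=%s&%s" % (UCSC_HUB_CONNECT, hub_txt, flags)


# Hypothetical dataset URL, for illustration only.
print(build_hub_connect_url("https://galaxy.example.org/d/galaxy_abc"))
```

Quoting matters here because the hub location is itself a URL, so its `:` and `/` characters would otherwise be ambiguous inside the query string.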
58 changes: 58 additions & 0 deletions lib/galaxy/datatypes/hubAssembly.py
@@ -0,0 +1,58 @@
"""
HubAssembly datatype
"""
Member

Can we squeeze this datatype into another file? tracks.py maybe?

Contributor Author

Sure, sounds logical!

import logging
import galaxy.version as version

# Support for Galaxy <= 16.01
if version.VERSION_MAJOR <= "16.01":
    from galaxy.datatypes.images import Html
else:
    from galaxy.datatypes.text import Html
Member

Oo, why is this needed?
Can we have one central import to avoid such things in the future?

Contributor Author

@remimarenco May 12, 2016

Technically, importing from galaxy.datatypes.images should work in any version because of this.
But I preferred to use the text lib, so that the images one can be dropped in the future (once we drop support for < 16.04... so in a long time, heh).

How do you think we should solve this?

Member

You don't need to care about older Galaxy versions, since this PR will be merged into the dev branch and the datatype will be released only in Galaxy 16.07.

Contributor Author

OK, I was worried about the case where somebody wants to use the datatype on 16.01 or earlier (which was my case, for various reasons). So as you wish :)

Member

@remimarenco this still needs to be addressed, doesn't it?

Member

New features are not cherry-picked. If you backport locally, then you can also add the workaround!

Contributor Author

I don't get why these lines prevent merging into the Galaxy codebase. Do they harm something? I'd just like to know, so I don't make the same mistake in the future.

Contributor

I believe the argument is that since datatypes are distributed within the Galaxy code, this case should never happen. It's dead code.

Member

We distribute datatypes but they are a plugin - I'd be totally willing to include workarounds like this in datatypes code in core.

Contributor Author

Thanks for the explanation @jxtx. I got that point, and as I said, I had a case where I needed my datatype although I was not on 16.04.

But I guess this is a corner case, so I understand :).

Thanks for the support @jmchilton, I removed it, as the majority seems not to be in favor :).
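For anyone backporting locally, as suggested in this thread, here is a minimal sketch of two ways such a workaround could be written. It assumes release strings of the form `major.minor`; `parse_version` and `import_first` are hypothetical helpers, not Galaxy APIs, and the release string `9.05` is invented to show the comparison pitfall.

```python
import importlib


def parse_version(text):
    """Turn a release string such as '16.01' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def import_first(candidates, attribute):
    """Return `attribute` from the first importable module in `candidates`."""
    for module_name in candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue
        if hasattr(module, attribute):
            return getattr(module, attribute)
    raise ImportError("%s not found in any of %s" % (attribute, candidates))


# String comparison is lexicographic, so it misorders some release strings.
print("9.05" <= "16.01")                                # False
print(parse_version("9.05") <= parse_version("16.01"))  # True

# Standard-library stand-ins; in this thread the candidates would be
# galaxy.datatypes.text followed by galaxy.datatypes.images.
loads = import_first(["no_such_module_here", "json"], "loads")
print(loads("[1, 2]"))
```

Trying the preferred module first and falling back on `ImportError` avoids depending on the version string at all, which may be the simpler option for a local backport.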


log = logging.getLogger(__name__)


class HubAssembly( Html ):
    """
    derived class for BioC data structures in Galaxy
    """

    file_ext = 'huba'
    composite_type = 'auto_primary_file'

    def __init__(self, **kwd):
        Html.__init__(self, **kwd)

    def generate_primary_file( self, dataset=None ):
        """
        This is called only at upload to write the html file
        cannot rename the datasets here - they come with the default unfortunately
        """
        rval = [
            '<html><head><title>Files for Composite Dataset (%s)</title></head><p/>\
            This composite dataset is composed of the following files:<p/><ul>' % (
                self.file_ext)]
        for composite_name, composite_file in self.get_composite_files( dataset=dataset ).iteritems():
Member

just items()

Member

Why items() over iteritems(), out of curiosity? Use of the generator seems fine practice here to me.

Member

@dannon we are trying to slowly port Galaxy to Python 3, aren't we? I assumed items() is more portable, and speed is not a concern in this call for Python 2 installations.
But yes, it is not a must.

Member

Ahh, I see, you were thinking about python3. Good to have the rationale here instead of simply 'just items()' :)

Definitely not a requirement to change this now @remimarenco, up to you.

Member

Sorry, was a breakfast-review ;)

Contributor Author

I like the idea of being more Python 3 compatible, will change it! :)
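The portability argument in this thread can be checked with a small sketch, written for Python 3. The dictionary below is a hypothetical stand-in for the result of `get_composite_files()`, with made-up file names.

```python
# Hypothetical mapping of composite file name -> optional flag.
composite_files = {"myHub/hub.txt": False, "myHub/genomes.txt": True}

# dict.items() is available on both Python 2 and Python 3.
names = sorted(name for name, optional in composite_files.items())
print(names)

# dict.iteritems() was removed in Python 3, so this prints False there.
print(hasattr(composite_files, "iteritems"))
```

On Python 2, `items()` builds a list instead of a lazy iterator, which costs a little memory but is negligible for the handful of files in a composite dataset.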

            opt_text = ''
            if composite_file.optional:
                opt_text = ' (optional)'
            rval.append('<li><a href="%s">%s</a>%s' % ( composite_name, composite_name, opt_text) )
        rval.append('</ul></html>')
        return "\n".join(rval)

    def set_peek( self, dataset, is_multi_byte=False ):
        if not dataset.dataset.purged:
            dataset.peek = "Track Hub structure: Visualization in UCSC Track Hub"
        else:
            dataset.peek = 'file does not exist'
            dataset.blurb = 'file purged from disk'

    def display_peek( self, dataset ):
        try:
            return dataset.peek
        except:
            return "Track Hub structure: Visualization in UCSC Track Hub"

    def sniff( self, filename ):
        return False
47 changes: 47 additions & 0 deletions lib/galaxy/datatypes/tracks.py
@@ -5,6 +5,8 @@
import binary
import logging

from galaxy.datatypes.text import Html

log = logging.getLogger(__name__)

# GeneTrack is no longer supported but leaving the datatype since
@@ -36,3 +38,48 @@ def __init__(self, **kwargs):
# link = "%s?filename=%s&hashkey=%s&input=%s&GALAXY_URL=%s" % ( url, encoded, hashkey, data_id, galaxy_url )
# ret_val.append( ( name, link ) )
# return ret_val


class UCSCTrackHub( Html ):
    """
    Datatype for UCSC TrackHub
    """

    file_ext = 'trackhub'
    composite_type = 'auto_primary_file'

    def __init__(self, **kwd):
        Html.__init__(self, **kwd)

    def generate_primary_file( self, dataset=None ):
        """
        This is called only at upload to write the html file
        cannot rename the datasets here - they come with the default unfortunately
        """
        rval = [
            '<html><head><title>Files for Composite Dataset (%s)</title></head><p/>\
            This composite dataset is composed of the following files:<p/><ul>' % (
                self.file_ext)]
        for composite_name, composite_file in self.get_composite_files( dataset=dataset ).items():
            opt_text = ''
            if composite_file.optional:
                opt_text = ' (optional)'
            rval.append('<li><a href="%s">%s</a>%s' % ( composite_name, composite_name, opt_text) )
        rval.append('</ul></html>')
        return "\n".join(rval)

    def set_peek( self, dataset, is_multi_byte=False ):
        if not dataset.dataset.purged:
            dataset.peek = "Track Hub structure: Visualization in UCSC Track Hub"
        else:
            dataset.peek = 'file does not exist'
            dataset.blurb = 'file purged from disk'

    def display_peek( self, dataset ):
        try:
            return dataset.peek
        except:
            return "Track Hub structure: Visualization in UCSC Track Hub"

    def sniff( self, filename ):
        return False
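To see what `generate_primary_file` produces without a running Galaxy instance, the listing logic can be exercised on its own. This is a minimal sketch written for Python 3: `CompositeFile` and `render_index` are hypothetical stand-ins for Galaxy's composite file objects and the method above, and the file names are made up.

```python
from collections import OrderedDict, namedtuple

# Hypothetical stand-in for Galaxy's composite file description.
CompositeFile = namedtuple("CompositeFile", ["optional"])


def render_index(file_ext, composite_files):
    """Build the HTML listing the same way generate_primary_file does."""
    lines = [
        "<html><head><title>Files for Composite Dataset (%s)</title></head><p/>"
        "This composite dataset is composed of the following files:<p/><ul>" % file_ext
    ]
    for name, composite_file in composite_files.items():
        # Optional files are flagged after their link.
        opt_text = " (optional)" if composite_file.optional else ""
        lines.append('<li><a href="%s">%s</a>%s' % (name, name, opt_text))
    lines.append("</ul></html>")
    return "\n".join(lines)


files = OrderedDict([
    ("hub.txt", CompositeFile(optional=False)),
    ("extra.txt", CompositeFile(optional=True)),
])
html = render_index("trackhub", files)
print(html)
```

The sketch uses adjacent string literals instead of the backslash continuation, which keeps source indentation out of the generated HTML.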