TAD alignment problem between two samples #19

Open
shenlinyong opened this issue Aug 9, 2022 · 1 comment

shenlinyong commented Aug 9, 2022

Thank you for developing such great software. I would like to use the tadlib.hitad.aligner.DomainSet class to find differential TADs between two samples. However, I don't understand what "en : str, Unique identifier for input domain set" means. How should I prepare this input for my data? Also, is the domainlist I prepared correct? It looks like this: [['chr1', 150000, 360000, 0], ['chr1', 360000, 440000, 0], ['chr1', 440000, 860000, 0], ['chr1', 860000, 1200000, 0], ['chr1', 1200000, 1340000, 2], ['chr1', 1200000, 1590000, 1]]

class tadlib.hitad.aligner.DomainSet(en, domainlist, res, hier=True)[[source]](https://xiaotaowang.github.io/TADLib/_modules/tadlib/hitad/aligner.html#DomainSet)
Parse and hold a hierarchical domain set.

Parameters
en : str
Unique identifier for input domain set.

domainlist : list
List of domains. See [tadlib.hitad.aligner.BoundSet](https://xiaotaowang.github.io/TADLib/hitad_api.html#tadlib.hitad.aligner.BoundSet) for details.

This is the Python script I ran:

import sys
sys.path.append('/home/SLY68/anaconda3/envs/tadlib/lib/python3.7/site-packages/')
from tadlib.hitad.aligner import DomainAligner as DA
import pandas as pds
***************************************************************************
Version 0.4.3 is out of date, Version 0.4.4 is available.

***************************************************************************
fat_tad = pds.read_table("./2022/hic/juicer/down_analysis/tad/tadlib/fat.txt")
print(fat_tad[0:6])
   chr1        0   150000  0
0  chr1   150000   360000    0
1  chr1   360000   440000    0
2  chr1   440000   860000    0
3  chr1   860000  1200000    0
4  chr1  1200000  1340000    2
5  chr1  1200000  1590000    1
fat_tad=fat_tad.apply(lambda x: list(x), axis=1).values.tolist()
print(fat_tad[0:6])
[['chr1', 150000, 360000, 0], ['chr1', 360000, 440000, 0], ['chr1', 440000, 860000, 0], ['chr1', 860000, 1200000, 0], ['chr1', 1200000, 1340000, 2], ['chr1', 1200000, 1590000, 1]]
fat_data = DA("fat", fat_tad, 10000)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/tmp/ipykernel_208882/2791293192.py in <module>
----> 1 fat_data = DA("fat", fat_tad, 10000)

~/anaconda3/envs/tadlib/lib/python3.7/site-packages/tadlib/hitad/aligner.py in __init__(self, *args)
    564         self.DomainSets = {}
    565         for domains in args:
--> 566             self.DomainSets[domains.Label] = domains
    567         self.Results = {}
    568 

AttributeError: 'str' object has no attribute 'Label'

Thanks again for your help!

XiaoTaoWang (Owner) commented

Sorry for the late response and thank you for your interest!

To align/compare between two domain sets, you first need to read the domain lists of two samples using the readHierDomain function:

>>> from tadlib.hitad.aligner import readHierDomain, DomainSet, DomainAligner
>>> list1 = readHierDomain('sample1.txt')
>>> list2 = readHierDomain('sample2.txt')
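
For anyone who wants to inspect their input before loading it, here is a minimal sketch of turning a four-column text file (chromosome, start, end, hierarchy level) into the list-of-lists layout shown in the question. This is not the implementation of readHierDomain: `parse_domain_lines` is a hypothetical helper, and the whitespace-separated, header-free layout is an assumption based on the output printed in the question. Note that in that pandas output the first record (`chr1 0 150000 0`) ended up as the column header, so when reading with pandas you would probably want `header=None` to avoid losing it.

```python
import io


def parse_domain_lines(handle):
    """Parse whitespace-separated 'chrom start end level' records.

    Hypothetical helper for illustration only; not part of TADLib.
    """
    domains = []
    for line in handle:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        chrom, start, end, level = fields[:4]
        # Coordinates and hierarchy level are stored as integers.
        domains.append([chrom, int(start), int(end), int(level)])
    return domains


# In-memory stand-in for a real file; the rows are taken from the question.
example = io.StringIO(
    "chr1\t0\t150000\t0\n"
    "chr1\t150000\t360000\t0\n"
    "chr1\t1200000\t1340000\t2\n"
    "chr1\t1200000\t1590000\t1\n"
)
domains = parse_domain_lines(example)
print(domains[0])
```

With a real file, the same function can be called on an open file handle, for example `with open(path) as fh: domains = parse_domain_lines(fh)`.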

After that, pass the above lists to DomainSet, which represents the hierarchical domains as trees:

>>> sample1 = DomainSet('sample1', list1, 10000) # supposing your domains were called at the 10kb resolution
>>> sample2 = DomainSet('sample2', list2, 10000)
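
Before building a DomainSet, it can help to sanity-check the list. The sketch below uses a hypothetical helper (`check_domains` is not part of TADLib) that flags records whose start is not below their end, whose coordinates are not multiples of the resolution, or whose level is negative. Whether TADLib itself requires bin-aligned coordinates is an assumption here, not something confirmed in this thread.

```python
def check_domains(domainlist, res):
    """Return a list of (index, reason) pairs for suspicious records.

    Hypothetical helper for illustration only; not part of TADLib.
    """
    problems = []
    for i, (chrom, start, end, level) in enumerate(domainlist):
        if start >= end:
            problems.append((i, "start is not smaller than end"))
        if start % res or end % res:
            problems.append((i, "coordinates are not multiples of res"))
        if level < 0:
            problems.append((i, "negative hierarchy level"))
    return problems


domains = [
    ['chr1', 150000, 360000, 0],
    ['chr1', 1200000, 1340000, 2],
    ['chr1', 1200000, 1595000, 1],  # deliberately off-grid at 10 kb
]
problems = check_domains(domains, 10000)
print(problems)
```

An empty result only means the records passed these basic checks; it does not guarantee that the hierarchy itself is consistent.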

Finally, perform the alignment using DomainAligner:

>>> test_align = DomainAligner(sample1, sample2)
>>> test_align.align('sample1', 'sample2')
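
This also explains the AttributeError in the question. According to the traceback, DomainAligner.__init__ loops over its positional arguments and reads a `Label` attribute from each one, so every argument has to be a DomainSet object rather than a string, a list, or a resolution. The sketch below reproduces that behavior with stand-in classes. `FakeDomainSet` and `FakeAligner` are hypothetical: they only mimic the two lines visible in the traceback, and the idea that DomainSet stores its identifier as `Label` is an assumption drawn from that traceback.

```python
class FakeDomainSet:
    """Stand-in for DomainSet; assumed to expose its identifier as .Label."""

    def __init__(self, en, domainlist, res):
        self.Label = en
        self.domains = domainlist
        self.res = res


class FakeAligner:
    """Stand-in mirroring the __init__ loop shown in the traceback."""

    def __init__(self, *args):
        self.DomainSets = {}
        for domains in args:
            self.DomainSets[domains.Label] = domains


fat = [['chr1', 150000, 360000, 0]]

# Passing raw values, as in the question, fails on the first argument ("fat").
try:
    FakeAligner("fat", fat, 10000)
    error_message = None
except AttributeError as exc:
    error_message = str(exc)
print(error_message)

# Wrapping the raw values first gives the aligner objects it can index by label.
aligner = FakeAligner(FakeDomainSet("fat", fat, 10000))
print(list(aligner.DomainSets))
```

In other words, the identifier, the domain list, and the resolution belong in the DomainSet call, and only the resulting objects are handed to the aligner.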

The different types of domain-level alignments can then be accessed through this object:

>>> conserved = test_align.conserved('sample1', 'sample2') # Conserved TADs
>>> semi = test_align.inner_changed('sample1', 'sample2') # Semi-Conserved TADs
>>> merged = test_align.merged('sample1', 'sample2') # Merged TADs
>>> split = test_align.split('sample1', 'sample2') # Split TADs

Let me know if you have any further questions.
