fanc pairs takes a long time without any further output #42
Comments
Hey, I'm sorry you are experiencing these slowdown issues - HiC-Pro import is not particularly optimised. I am not sure where exactly it might be stuck. If the HiC-Pro pairs file is very big, perhaps a manual parallelisation would work. I.e. you could split the file into several smaller chunks and then run the pairs command on each one individually, without any filtering. I would also recommend an SSD for this, if you are not using one already. If you are on a network file system enabling the

There isn't a built-in command line function at the moment to merge the individual pairs files, but you can use this code from within a Python shell:

import fanc
pairs_list = ["first.pairs", "second.pairs"] # replace as necessary
pairs = [fanc.load(file_name) for file_name in pairs_list]
merged = fanc.ReadPairs.merge(pairs, file_name="/path/to/output.pairs")
merged.close()

Once you have the merged file, you can run the filtering on it.
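The manual splitting step suggested above could be sketched roughly as follows. This is only an illustration, not part of FAN-C: the helper name `split_pairs_file`, the chunk size, and the chunk naming scheme are all assumptions, and it assumes the HiC-Pro valid pairs file is plain text with one pair per line and no header.

```python
from pathlib import Path


def split_pairs_file(input_path, output_dir, lines_per_chunk=10_000_000):
    """Split a line-oriented pairs file into chunks of at most
    lines_per_chunk lines each and return the chunk paths in order.

    Hypothetical helper, not a FAN-C function. Assumes one pair per
    line and no header line in the input file.
    """
    if lines_per_chunk < 1:
        raise ValueError("lines_per_chunk must be at least 1")

    input_path = Path(input_path)
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)

    chunk_paths = []
    chunk = None
    try:
        with open(input_path, "r") as source:
            # Stream line by line so the whole file never sits in memory.
            for line_number, line in enumerate(source):
                if line_number % lines_per_chunk == 0:
                    if chunk is not None:
                        chunk.close()
                    # Keep the original file name as a suffix so the
                    # extension of each chunk is unchanged (an assumption
                    # about what downstream tools may look at).
                    name = f"chunk{len(chunk_paths):04d}_{input_path.name}"
                    chunk_path = output_dir / name
                    chunk = open(chunk_path, "w")
                    chunk_paths.append(chunk_path)
                chunk.write(line)
    finally:
        if chunk is not None:
            chunk.close()
    return chunk_paths


# Example call (file names are placeholders):
# chunks = split_pairs_file("sample.validPairs", "chunks", lines_per_chunk=5_000_000)
```

Each resulting chunk could then be passed to the pairs command separately, without the filtering option, and the per-chunk outputs merged with the snippet above before filtering the merged file.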
Dear @kaukrise Thank you. I tried Best wishes,
The dev version won't affect HiC-Pro import, unfortunately. For CHESS, you could create a Juicer file; I think they also support import from pairs files.
Dear @kaukrise Yes, a Juicer file is also OK. The CHESS publication used fanc and gives detailed normalisation and filtering instructions, so it is clearer for me to use fanc. If I use a Juicer hic file as input for CHESS, I cannot find the corresponding filtering options, such as balancing by chromosome and masking the low-interaction bins. Best wishes,
Juicer automatically balances by chromosome. Bins with 0 interactions are automatically masked by FAN-C, also in Juicer files. I think using a Juicer file directly should be quite okay for your needs. If you really need filtering for sparsely populated bins and want to keep working with FAN-C files, you could
But since you appear to be working with quite large matrices, the conversion from Juicer to FAN-C will take some time again.
Thank you for your quick reply. I will give it a try.
Dear all,
I used the valid pairs generated by HiC-Pro as input for fanc pairs with the command below, but it runs slowly and has produced no further output for three days. The timestamp of the output file has stayed at the time I submitted the command. Do you have any suggestions for speeding this up?
fanc pairs ${pre}_mm10_index.bwt2pairs.validPairs ${pre}_fanc.pairs -g fragment.bed -t 10 -s ${pre}.statistic --filter-pcr-duplicates 1
Best wishes,
Zheng zhuqing