ecDNA-finder is a modified and improved version of CircleMap, which is released under the MIT license.
The idea behind ecDNA-finder is partly the same as that of CircleMap: ecDNA-finder reuses CircleMap's approach and modifies it to efficiently detect ecDNA (which is larger than eccDNA, the focus of CircleMap).
Three files, extract.py, utils.py, and mates.py, are modifications of extract_circle_SV_reads.py, utils.py, and repeats.py, respectively, in CircleMap/circlemap/.
Thanks to @iprada for developing CircleMap, which greatly facilitated the development of ecDNA-finder.
To run the software, you only need to prepare a coordinate-sorted BAM file and a queryname-sorted BAM file.
Use the following command to convert a coordinate-sorted BAM file into a queryname-sorted one.
samtools sort -n -@ {threads_num} -o {queryname-sorted} {coordinate-sorted}
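If the sorting step needs to be scripted, for example to process several samples, the same command can be wrapped in Python. The following is only a minimal sketch: the function names `build_sort_command` and `sort_by_queryname` and the default thread count are illustrative assumptions, not part of ecDNA-finder, and `samtools` is assumed to be on the `PATH`.

```python
import shlex
import subprocess


def build_sort_command(coordinate_bam, queryname_bam, threads=4):
    """Return the samtools argument list that sorts a BAM file by query name.

    Hypothetical helper: it mirrors the samtools command shown above.
    """
    return [
        "samtools", "sort",
        "-n",                  # sort by read (query) name
        "-@", str(threads),    # number of sorting and compression threads
        "-o", queryname_bam,   # output file (queryname-sorted)
        coordinate_bam,        # input file (coordinate-sorted)
    ]


def sort_by_queryname(coordinate_bam, queryname_bam, threads=4):
    """Run the sort; raises subprocess.CalledProcessError if samtools fails."""
    command = build_sort_command(coordinate_bam, queryname_bam, threads)
    print("Running:", shlex.join(command))
    subprocess.run(command, check=True)
```

Passing the arguments as a list, rather than as one shell string, avoids quoting problems when file paths contain spaces.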
- numpy
- pandas
- pysam
All of these dependencies can be easily installed with pip or conda.
Run simply
python main.py -coord {coordinate-sorted} -query {queryname-sorted} -dir {dirname}
-cutoff: the default value is 0, which means that round(depth_average / 20) is used as the cutoff for both peaks and split-read mates. depth_average is calculated automatically. The minimum allowed value is 1. You can set this value yourself to change the results, but the running time will vary accordingly.
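The rule described for -cutoff can be written out as a small function. This is a sketch based only on the description above, not the code in main.py: the name `effective_cutoff` is hypothetical, and clamping both the automatic and the user-supplied value to a minimum of 1 is an assumption about how "minimum allowed value" is applied.

```python
def effective_cutoff(cutoff, depth_average):
    """Return the cutoff actually used, following the rule described above.

    cutoff == 0 means "derive it from the average depth".
    Assumption: every value, automatic or user-supplied, is clamped to >= 1.
    """
    if cutoff == 0:
        # Note: Python's round() rounds exact halves to the nearest even integer.
        cutoff = round(depth_average / 20)
    return max(1, cutoff)


# Average depth of 100 with the default setting gives 100 / 20 = 5.
print(effective_cutoff(0, 100.0))
```

With a very low average depth the automatic value would round down to 0, which is why the floor of 1 matters in this sketch.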
Results are placed in a new directory named {dirname} in the current directory. circ_results.tsv is a tab-delimited file.
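Because circ_results.tsv is tab-delimited, it can be loaded with the standard library (or with pandas, which is already a dependency). The sketch below makes no assumption about the column names or about whether a header row is present, since neither is documented here; `load_circ_results` is a hypothetical helper name.

```python
import csv
from pathlib import Path


def load_circ_results(dirname):
    """Read {dirname}/circ_results.tsv and return its rows as lists of strings."""
    path = Path(dirname) / "circ_results.tsv"
    with path.open(newline="") as handle:
        return list(csv.reader(handle, delimiter="\t"))
```

Each returned row is a list of the fields on one line; any conversion to numbers is left to the caller because the column layout is not specified above.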
usage: main.py [-h] -coord COORDINATE -query QUERYNAME -dir DIRNAME
               [-cutoff CUTOFF]

ecDNA-finder

optional arguments:
  -h, --help            show this help message and exit
  -coord COORDINATE, --coordinate COORDINATE
                        bam file sorted by coordinate
  -query QUERYNAME, --queryname QUERYNAME
                        bam file sorted by queryname
  -dir DIRNAME, --dirname DIRNAME
                        result directory
  -cutoff CUTOFF, --cutoff CUTOFF
                        seed interval cutoff and support read cutoff
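For readers who want to call the tool from their own scripts, the help text above is consistent with an argparse parser along the following lines. This is a reconstruction from the usage message only, not the actual code in main.py; in particular, `build_parser` is a hypothetical name and `type=int` for -cutoff is an assumption.

```python
import argparse


def build_parser():
    """Build a parser matching the usage message shown above (a reconstruction)."""
    parser = argparse.ArgumentParser(prog="main.py", description="ecDNA-finder")
    parser.add_argument("-coord", "--coordinate", required=True,
                        help="bam file sorted by coordinate")
    parser.add_argument("-query", "--queryname", required=True,
                        help="bam file sorted by queryname")
    parser.add_argument("-dir", "--dirname", required=True,
                        help="result directory")
    # Assumption: the cutoff is an integer; 0 selects the automatic value.
    parser.add_argument("-cutoff", "--cutoff", type=int, default=0,
                        help="seed interval cutoff and support read cutoff")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.coordinate, args.queryname, args.dirname, args.cutoff)
```

The three required options appear without brackets in the usage line, which is why they are marked `required=True` here.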
A PBS job script recommended for lab use
#!/bin/sh
#PBS -N PBS_ecDNA
#PBS -l nodes=1:ppn=5
#PBS -l walltime=16:00:00
#PBS -S /bin/bash
#PBS -q normal_3
#PBS -o /public/home/zhangjing1/yangjk/ecDNA/result/ecDNA_out
#PBS -e /public/home/zhangjing1/yangjk/ecDNA/result/ecDNA_out
# Print a timestamped start banner to both stdout and stderr.
start='------------START------------'$(date "+%Y %h %d %H:%M:%S")'------------START------------'
echo "$start"
echo "$start" >&2
# Sample name; also used as the name of the result directory.
dirname=SRR8236745
cd /public/home/zhangjing1/yangjk/ecDNA/result/ || exit 1
coordinate=/public/home/zhangjing1/yangjk/data/bam/${dirname}/sorted_coordinate.bam
queryname=/public/home/zhangjing1/yangjk/data/bam/${dirname}/sorted_query_name.bam
/public/home/liuxs/anaconda3/envs/ecDNA/bin/python /public/home/zhangjing1/yangjk/ecDNA/code/main.py -coord "${coordinate}" -query "${queryname}" -dir "${dirname}"
# Print a timestamped end banner to both stdout and stderr.
end='-------------END-------------'$(date "+%Y %h %d %H:%M:%S")'-------------END-------------'
echo "$end"
echo "$end" >&2