-
Notifications
You must be signed in to change notification settings - Fork 17
Detailed Use: Quantifying Peak Coverages from Multiple BAM Files
A detailed description can be found here
Quantification of peaks is a one step process with BAMscale. As input, you will need a bed file (with peaks to quantify) as well as one or more BAM files.
Basic Command (example)
./BAMscale cov --bed <preaks.bed> --bam <SAMPLE1.bam> --bam <SAMPLE2.bam> ... --bam <SAMPLEn.bam>
This command will generate for files containing the quantified peaks:
1. Raw Peak Coverages
2. FPKM Normalized Peaks
3. TPM Normalized Peaks
4. Library-Size Normalized Peaks
For input data, we use ATAC-seq from:
"SLFN11 Blocks Stressed Replication Forks Independently of ATR. Mol Cell, 2018"
A detailed walkthrough of the processing can be seen below!
After the alignment, peak-calling, and peak-merging, we are left with 6 BAM (coverage) files and one BED (peak) file. Three BAM files originate from CEM-CCRF cell line, that are either untreated, or treated with CPT for 2 and 4 hours. The other three BAM files are from CB45 (CEM-CCRF SLFN11 gene knockout) cells that were untreated, or treated with CPT for 2 and 4 hours.
Quantifying peaks to compare the global ATAC-seq signal changes in the BAM files can be done with the following command:
./BAMscale cov --bed merged_peaks.bed --prefix ATAC_peaks.quant \
--bam CEM_CPT0h.sorted.dedup.bam \
--bam CEM_CPT2h.sorted.dedup.bam \
--bam CEM_CPT4h.sorted.dedup.bam \
--bam CB45_CPT0h.sorted.dedup.bam \
--bam CB45_CPT2h.sorted.dedup.bam \
--bam CB45_CPT4h.sorted.dedup.bam
Four files are generated from this step:
ATAC_peaks.quant.FPKM_normalized_coverages.tsv
ATAC_peaks.quant.TPM_normalized_coverages.tsv
ATAC_peaks.quant.Library_normalized_coverages.tsv
ATAC_peaks.quant.raw_coverages.tsv
The generated matrices with scaled peak scores can be used to plot with preferred tools, or with the R script we provided here (here LINK). The script uses ggplot2.
A) SLFN11+ (CEM) cells untreated [x-axis] vs. treated with CPT (2h, 4h) [y-axis]
As we can see from the dotplots (heatmap), there are some ATAC sites that open from treatment.
B) SLFN11- (CB45) cells untreated [x-axis] vs. treated with CPT (2h, 4h) [y-axis]
In case of the SLFN11 negative cells, ATAC-sites do not change from treatment
As we can see from the dotplots (heatmap), a big portion of peaks does not change, but there are some ATAC sites that open from treatment.
The first step is to download the samples using the sratoolkit.
fasterq-dump SRR5831755 --split-3
fasterq-dump SRR5831756 --split-3
fasterq-dump SRR5831757 --split-3
fasterq-dump SRR5831758 --split-3
fasterq-dump SRR5831759 --split-3
fasterq-dump SRR5831760 --split-3
Raw reads are aligned with the BWA mem aligner using default alignment settings (with the exception of multi-threading) to the hg19 human genome. Reads aligned with BWA were directly converted from SAM format to BAM format using samtools.
Sample CEM_CPT0h (SRR5831755)
bwa mem -t 8 <hg19.idx> SRR5831755_1.fastq SRR5831755_2.fastq \
| samtools view -Sbh > CEM_CPT0h.unsorted.bam
Sample CEM_CPT4h (SRR5831756)
bwa mem -t 8 <hg19.idx> SRR5831756_1.fastq SRR5831756_2.fastq \
| samtools view -Sbh > CEM_CPT2h.unsorted.bam
Sample CEM_CPT4h (SRR5831757)
bwa mem -t 8 <hg19.idx> SRR5831757_1.fastq SRR5831757_2.fastq \
| samtools view -Sbh > CEM_CPT4h.unsorted.bam
Sample CEM_CPT0h (SRR5831758)
bwa mem -t 8 <hg19.idx> SRR5831758_1.fastq SRR5831758_2.fastq \
| samtools view -Sbh > CB45_CPT0h.unsorted.bam
Sample CEM_CPT4h (SRR5831759)
bwa mem -t 8 <hg19.idx> SRR5831759_1.fastq SRR5831759_2.fastq \
| samtools view -Sbh > CB45_CPT2h.unsorted.bam
Sample CEM_CPT4h (SRR5831760)
bwa mem -t 8 <hg19.idx> SRR5831760_1.fastq SRR5831760_2.fastq \
| samtools view -Sbh > CB45_PT4h.unsorted.bam
Next, we sort reads, mark duplicates, and index the BAM files.
Sample CEM_CPT0h (SRR5831755)
samtools sort -o CEM_CPT0h.sorted.bam CEM_CPT0h.unsorted.bam
samtools index CEM_CPT0h.sorted.bam
java -Xmx8g -jar picardtools.jar MarkDuplicates \
I=CEM_CPT0h.sorted.bam \
O=CEM_CPT0h.sorted.dedup.bam \
M=CEM_CPT0h.sorted.dedup.metrics
samtools index CEM_CPT0h.sorted.dedup.bam
Sample CEM_CPT2h (SRR5831756)
samtools sort -o CEM_CPT2h.sorted.bam CEM_CPT0h.unsorted.bam
samtools index CEM_CPT2h.sorted.bam
java -Xmx8g -jar picardtools.jar MarkDuplicates \
I=CEM_CPT2h.sorted.bam \
O=CEM_CPT2h.sorted.dedup.bam \
M=CEM_CPT2h.sorted.dedup.metrics
samtools index CEM_CPT2h.sorted.dedup.bam
Sample CEM_CPT4h (SRR5831757)
samtools sort -o CEM_CPT4h.sorted.bam CEM_CPT4h.unsorted.bam
samtools index CEM_CPT4h.sorted.bam
java -Xmx8g -jar picardtools.jar MarkDuplicates \
I=CEM_CPT4h.sorted.bam \
O=CEM_CPT4h.sorted.dedup.bam \
M=CEM_CPT4h.sorted.dedup.metrics
samtools CEM_CPT4h.sorted.dedup.bam
Sample CB45_CPT0h (SRR5831758)
samtools sort -o CB45_CPT0h.sorted.bam CB45_CPT0h.unsorted.bam
samtools index CB45_CPT0h.sorted.bam
java -Xmx8g -jar picardtools.jar MarkDuplicates \
I=CB45_CPT0h.sorted.bam \
O=CB45_CPT0h.sorted.dedup.bam \
M=CB45_CPT0h.sorted.dedup.metrics
samtools index CB45_CPT0h.sorted.dedup.bam
Sample CB45_CPT2h (SRR5831759)
samtools sort -o CB45_CPT2h.sorted.bam CB45_CPT2h.unsorted.bam
samtools index CB45_CPT2h.sorted.bam
java -Xmx8g -jar picardtools.jar MarkDuplicates \
I=CB45_CPT2h.sorted.bam \
O=CB45_CPT2h.sorted.dedup.bam \
M=CB45_CPT2h.sorted.dedup.metrics
samtools index CB45_CPT2h.sorted.dedup.bam
Sample CB45_CPT4h (SRR5831760)
samtools sort -o CB45_CPT4h.sorted.bam CB45_CPT4h.unsorted.bam
samtools index CB45_CPT4h.sorted.bam
java -Xmx8g -jar picardtools.jar MarkDuplicates \
I=CB45_CPT4h.sorted.bam \
O=CB45_CPT4h.sorted.dedup.bam \
M=CB45_CPT4h.sorted.dedup.metrics
samtools index CB45_CPT4h.sorted.dedup.bam
4) Calling Peaks with MACS
macs2 callpeak -t CEM_CPT0h.sorted.dedup.bam --nomodel -g hs -f BAMPE -q 0.01 -n CEM_CPT0h
macs2 callpeak -t CEM_CPT2h.sorted.dedup.bam --nomodel -g hs -f BAMPE -q 0.01 -n CEM_CPT2h
macs2 callpeak -t CEM_CPT4h.sorted.dedup.bam --nomodel -g hs -f BAMPE -q 0.01 -n CEM_CPT4h
macs2 callpeak -t CB45_CPT0h.sorted.dedup.bam --nomodel -g hs -f BAMPE -q 0.01 -n CB45_CPT0h
macs2 callpeak -t CB45_CPT2h.sorted.dedup.bam --nomodel -g hs -f BAMPE -q 0.01 -n CB45_CPT2h
macs2 callpeak -t CB45_CPT4h.sorted.dedup.bam --nomodel -g hs -f BAMPE -q 0.01 -n CB45_CPT4h
5) Merging Called Peaks with bedtools
In this step we combine the called peaks in the previous step, sort the peaks, and merge them into a unique st of peaks.
cat *.narrowPeak | bedtools sort | bedtools merge > merged_peaks.bed
./BAMscale cov --bed merged_peaks.bed --prefix ATAC_peaks.quant --bam CEM_CPT0h.sorted.dedup.bam --bam CEM_CPT2h.sorted.dedup.bam --bam CEM_CPT4h.sorted.dedup.bam --bam CB45_CPT0h.sorted.dedup.bam --bam CB45_CPT2h.sorted.dedup.bam --bam CB45_CPT4h.sorted.dedup.bam
Finally, we generate the scaled coverage tracks.
./BAMscale scale --bam CEM_CPT0h.sorted.dedup.bam -t 4
./BAMscale scale --bam CEM_CPT2h.sorted.dedup.bam -t 4
./BAMscale scale --bam CEM_CPT4h.sorted.dedup.bam -t 4
./BAMscale scale --bam CB45_CPT0h.sorted.dedup.bam -t 4
./BAMscale scale --bam CB45_CPT2h.sorted.dedup.bam -t 4
./BAMscale scale --bam CB45_CPT4h.sorted.dedup.bam -t 4
The output of the two commands will be two files in BigWig format. BAMscale "scale" function appends the ".bw" suffix to the end of the input file. For these two examples, the output names will be:
1) CEM_CPT0H.sorted.dedup.bam.bw
2) CEM_CPT2H.sorted.dedup.bam.bw
3) CEM_CPT4H.sorted.dedup.bam.bw
4) CB45_CPT0H.sorted.dedup.bam.bw
5) CB45_CPT2H.sorted.dedup.bam.bw
6) CB45_CPT4H.sorted.dedup.bam.bw
- Main page
- Home
- Installation
- Benchmarking
- Brief examples
- Detailed Manuals
- Visualization scripts