Andy Pohl edited this page Oct 23, 2013 · 5 revisions

Usage:

bwtool summary - provide some summary stats for each region in a bed file
   or at regular intervals.
usage:
   bwtool summary loci input.bw[:chr:start-end] output.txt
where:
   -"loci" corresponds to either (a) a bed file with regions to summarize or
    (b) a size of interval to summarize genome-wide.
options:
   -with-quantiles  output 10%/25%/75%/90% quantiles as well surrounding the
                    median.  With -total, this essentially provides a boxplot.
   -with-sum-of-squares
                    output sum of squared deviations from the mean along with 
                    the other fields
   -with-sum        output sum, also
   -keep-bed        if the loci bed is given, keep as many bed file
   -total           only output a summary as if all of the regions are pasted
                    together
   -header          put in a header (fields are easy to forget)

Examples

Using the same example from the aggregate page:

we can first get the summary of each 10 bp window: 1-10, 11-20, 21-30, and 31-36 (the last window is only 6 bp). For this demonstration we'll use the -header option, but it is often unnecessary, especially if the summary will be post-processed:

$ bwtool summary 10 main.bigWig /dev/stdout -header -with-sum
#chrom	start	end	size	num_data	min	max	mean	median	sum
chr	0	10	10	10	1.00	6.00	4.00	5.00	40.00
chr	10	20	10	10	0.00	10.00	4.00	3.50	40.00
chr	20	30	10	6	1.00	4.00	2.33	2.00	14.00
chr	30	36	6	6	2.00	6.00	4.33	4.00	26.00
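
The start, end, and size columns above can be reproduced with a small sketch. This is only an illustration of tiling a 36 bp chromosome into 10 bp windows, not bwtool's own code; the function names `tile_windows` and `to_one_based` are hypothetical.

```python
def tile_windows(chrom_size, step):
    """Yield zero-based, half-open (start, end) windows covering [0, chrom_size)."""
    for start in range(0, chrom_size, step):
        # The last window is truncated at the chromosome end.
        yield start, min(start + step, chrom_size)


def to_one_based(start, end):
    """Convert a zero-based half-open interval to 1-based inclusive bases."""
    return start + 1, end


windows = list(tile_windows(36, 10))
for start, end in windows:
    first, last = to_one_based(start, end)
    # With half-open coordinates the size is simply end - start.
    print(f"BED {start}-{end}\tsize={end - start}\tbases {first}-{last}")
```

The zero-based pairs match the start and end columns of the output, and the 1-based pairs match the 1-10, 11-20, 21-30, and 31-36 ranges used in the text.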

The output columns should be relatively self-explanatory, although it's worth mentioning that these examples sometimes switch between referring to bases as 1-10 and 0-10. This is because BED format uses zero-based, half-open intervals while WIG format uses 1-based coordinates. The former makes it easy to calculate the size of a region (end minus start), while the latter is easier when drawing pictures. In bases 21-30 the num_data column is 6 because 4 of the 10 bases have missing data. If one prefers to treat missing data as zero, the -fill=0 option can be used:

$ bwtool summary 10 main.bigWig /dev/stdout -header -with-sum -fill=0
#chrom	start	end	size	num_data	min	max	mean	median	sum
chr	0	10	10	10	1.00	6.00	4.00	5.00	40.00
chr	10	20	10	10	0.00	10.00	4.00	3.50	40.00
chr	20	30	10	10	0.00	4.00	1.40	1.50	14.00
chr	30	36	6	6	2.00	6.00	4.33	4.00	26.00

with num_data, min, mean, and median for the third window (bases 21-30) changing correspondingly, while the sum stays the same.
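
To see why those fields change, here is a minimal sketch of the same statistics computed over one window with and without filling. It is not bwtool's implementation. The per-base values and the positions of the missing bases are not given on this page, so the `window` list below is a hypothetical set of values chosen only because it is consistent with the reported statistics for bases 21-30; `summarize` is likewise a hypothetical helper.

```python
from statistics import mean, median


def summarize(values, fill=None):
    """Summarize one window; None marks a base with missing data."""
    if fill is not None:
        # Treat missing bases as the fill value (as with -fill=0).
        values = [fill if v is None else v for v in values]
    data = [v for v in values if v is not None]
    return {
        "size": len(values),
        "num_data": len(data),
        "min": min(data),
        "max": max(data),
        "mean": round(mean(data), 2),
        "median": median(data),
        "sum": sum(data),
    }


# Hypothetical values: 6 bases with data, 4 missing.
window = [1, 2, None, None, 2, 2, None, 3, 4, None]

print(summarize(window))          # missing bases are ignored
print(summarize(window, fill=0))  # missing bases are counted as zero
```

Filling adds four zeros to the data, so the count rises from 6 to 10 and the minimum, mean, and median drop, but the sum and maximum are unaffected.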
