Andy Pohl edited this page Oct 23, 2013 · 5 revisions

Usage:

bwtool summary - provide some summary stats for each region in a bed file
   or at regular intervals.
usage:
   bwtool summary loci input.bw[:chr:start-end] output.txt
where:
   -"loci" corresponds to either (a) a bed file with regions to summarize or
    (b) a size of interval to summarize genome-wide.
options:
   -with-quantiles  output 10%/25%/75%/90% quantiles as well surrounding the
                    median.  With -total, this essentially provides a boxplot.
   -with-sum-of-squares
                    output sum of squared deviations from the mean along with 
                    the other fields
   -with-sum        output sum, also
   -keep-bed        if the loci bed is given, keep as many bed file
   -total           only output a summary as if all of the regions are pasted
                    together
   -header          put in a header (fields are easy to forget)
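As a rough illustration of what the -with-sum-of-squares field is useful for, the sketch below derives a variance and standard deviation from a sum of squared deviations. It assumes the field is exactly what the help text describes (squared deviations from the mean) and that dividing by num_data (population variance) is the desired convention. The input values and the function name are hypothetical, not taken from bwtool.

```python
import math

def spread_from_sum_of_squares(sum_sq_dev, num_data):
    """Return (population variance, standard deviation) for one summarized region."""
    variance = sum_sq_dev / num_data
    return variance, math.sqrt(variance)

# Hypothetical per-base values for a 4 bp region (not real bwtool data).
values = [1.0, 2.0, 5.0, 6.0]
mean = sum(values) / len(values)

# The quantity the help text describes: sum of squared deviations from the mean.
sum_sq_dev = sum((v - mean) ** 2 for v in values)

variance, std_dev = spread_from_sum_of_squares(sum_sq_dev, len(values))
print(f"mean={mean:.2f} sum_sq_dev={sum_sq_dev:.2f} "
      f"variance={variance:.2f} std_dev={std_dev:.2f}")
```

Divide by num_data - 1 instead if a sample variance is wanted.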

Examples

Using the same example from the aggregate page:

we can first get the summary of each 10 bp: 1-10, 11-20, 21-30, and 31-36 (6 bp in this case). This demonstration uses the -header option, but it is often unnecessary, especially when the summary will be post-processed:

$ bwtool summary agg1.bed main.bigWig /dev/stdout -header -with-sum 
#chrom	start	end	size	num_data	min	max	mean	median	sum
chr	0	4	4	4	1.00	6.00	3.50	3.50	14.00
chr	9	19	10	10	0.00	10.00	4.30	4.00	43.00
chr	28	35	7	7	3.00	6.00	4.43	4.00	31.00
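The relationships between the columns can be checked with a short post-processing script. The following is a minimal sketch in Python that parses the three rows shown above and recomputes size as end minus start and mean as sum divided by num_data. The helper name `parse_summary` is hypothetical, and splitting on whitespace is an assumption that holds for this output.

```python
# Rows copied from the example output above.
EXAMPLE_OUTPUT = """\
#chrom start end size num_data min max mean median sum
chr 0 4 4 4 1.00 6.00 3.50 3.50 14.00
chr 9 19 10 10 0.00 10.00 4.30 4.00 43.00
chr 28 35 7 7 3.00 6.00 4.43 4.00 31.00
"""

def parse_summary(text):
    """Parse summary output that was produced with -header into a list of dicts."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].lstrip("#").split()
    rows = []
    for ln in lines[1:]:
        row = dict(zip(header, ln.split()))
        # Every column except the chromosome name is numeric.
        for key in header[1:]:
            row[key] = float(row[key])
        rows.append(row)
    return rows

rows = parse_summary(EXAMPLE_OUTPUT)
for row in rows:
    size = row["end"] - row["start"]       # half-open interval: end minus start
    mean = row["sum"] / row["num_data"]    # should agree with the mean column
    print(row["chrom"], int(row["start"]), int(row["end"]),
          int(size), round(mean, 2))
```

The recomputed means agree with the mean column to two decimal places, which is the precision bwtool prints here.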

Most of the output is self-explanatory, but note that these examples sometimes switch between referring to bases as 1-10 and 0-10. This is because BED format uses zero-based, half-open intervals while WIG format uses 1-based coordinates. The former makes it easy to calculate the size of a region (end minus start), while the latter is easier when drawing pictures.
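The two conventions can be sketched in a few lines of Python. The example below splits a 36 bp region (the length implied by the 31-36 window mentioned earlier) into 10 bp windows in BED-style coordinates and prints the equivalent 1-based ranges. All function names are hypothetical helpers for illustration and are not part of bwtool.

```python
def bed_to_one_based(start, end):
    """Convert a zero-based, half-open interval to a 1-based, inclusive one."""
    return start + 1, end

def one_based_to_bed(first, last):
    """Convert a 1-based, inclusive interval to a zero-based, half-open one."""
    return first - 1, last

def bed_size(start, end):
    """Size of a half-open interval is simply end minus start."""
    return end - start

def fixed_windows(length, step):
    """Yield zero-based, half-open windows of `step` bases; the last may be shorter."""
    for start in range(0, length, step):
        yield start, min(start + step, length)

for start, end in fixed_windows(36, 10):
    first, last = bed_to_one_based(start, end)
    print(f"BED {start}-{end} -> bases {first}-{last} ({bed_size(start, end)} bp)")
```

The last window comes out as BED 30-36, i.e. bases 31-36, which is the 6 bp window described in the example text.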
