summary
Andy Pohl edited this page Oct 23, 2013 · 5 revisions
Usage:
bwtool summary - provide some summary stats for each region in a bed file
or at regular intervals.
usage:
bwtool summary loci input.bw[:chr:start-end] output.txt
where:
-"loci" corresponds to either (a) a bed file with regions to summarize or
(b) a size of interval to summarize genome-wide.
options:
-with-quantiles output 10%/25%/75%/90% quantiles as well surrounding the
median. With -total, this essentially provides a boxplot.
-with-sum-of-squares
output sum of squared deviations from the mean along with
the other fields
-with-sum output sum, also
-keep-bed if the loci bed is given, keep as many bed file
-total only output a summary as if all of the regions are pasted
together
-header put in a header (fields are easy to forget)
Using the same example from the aggregate page, we can first get the summary of each 10 bp: 1-10, 11-20, 21-30, and 31-36 (6 bp in this case). For this demonstration we'll use the -header option, but this option often isn't necessary, especially if the summary is going to be post-processed:
$ bwtool summary agg1.bed main.bigWig /dev/stdout -header -with-sum
#chrom start end size num_data min max mean median sum
chr 0 4 4 4 1.00 6.00 3.50 3.50 14.00
chr 9 19 10 10 0.00 10.00 4.30 4.00 43.00
chr 28 35 7 7 3.00 6.00 4.43 4.00 31.00
Everything should be relatively self-explanatory here, although it's worth mentioning that these examples sometimes switch between referring to bases as 1-10 and 0-10. This is because BED format uses zero-based, half-open intervals while WIG format uses 1-based coordinates. The former is convenient for calculating the size of a region (end minus start), while the latter is easier when drawing pictures.
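The relationship between the two coordinate conventions can be sketched in a few lines of Python. This is only an illustration and not part of bwtool: the function names `bed_size` and `bed_to_one_based` are made up for this example, and the regions are copied from the start and end columns of the output above.

```python
# Illustrative helpers (hypothetical names, not part of bwtool).

def bed_size(start, end):
    """Size of a zero-based, half-open BED interval [start, end)."""
    return end - start

def bed_to_one_based(start, end):
    """Convert a BED interval to 1-based, inclusive coordinates.

    The start shifts up by one; the end stays the same because the
    half-open end already equals the last base in 1-based numbering.
    """
    return start + 1, end

# Regions taken from the chrom/start/end columns of the summary output.
regions = [("chr", 0, 4), ("chr", 9, 19), ("chr", 28, 35)]

for chrom, start, end in regions:
    first, last = bed_to_one_based(start, end)
    print(f"{chrom}\tBED {start}-{end}\tsize {bed_size(start, end)}\t"
          f"1-based {first}-{last}")
```

The computed sizes (4, 10 and 7) agree with the size column in the summary output, and the 1-based form of the second region, 10-19, covers the same ten bases as the BED interval 9-19.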