Output
On each execution, CAMSA generates multiple files in the output directory specified with the -o flag in the command line invocation. If no output directory name is given, it is automatically named camsa_{date}.
The structure of the output directory is shown below:
.output\
+-- input
| +-- camsa_config.txt
| +-- as1.camsa.points
| +-- as2.camsa.points
| +-- ...
+-- libs
| +-- js
| | +-- xx.js
| +-- images
| | +-- xx.png
| +-- fonts
| | +-- xx.ttf
| +-- css
| | +-- xx.css
+-- comparative
| +-- subgroups
| | +-- group1.camsa.points
| | +-- group2.camsa.points
| | +-- ...
| +-- collapsed.camsa.points
| +-- original.camsa.points
+-- merged
| +-- merged.camsa.points
+-- report.html
- camsa_config.txt: a plain text file with a human-readable dump of the multilayered configuration with which CAMSA was executed in this particular experiment.
- asx.camsa.points: all the unchanged input files that were submitted to CAMSA.

- xx.js: JavaScript library files required for the interactive HTML report to work properly.
- xx.png: images used in the interactive HTML report.
- xx.ttf: font files required for the interactive HTML report to work properly.
- xx.css: CSS files required for the interactive HTML report to work properly.

- groupx.camsa.points: a set of assembly points that were reported by all of the assemblies in the specified group. These points are never mentioned in any of the subgroups of groupx. The groupx part of the filename is composed of the names of the assemblies that constitute the group, separated by a period (.).
- original.camsa.points: a set of all of the assembly points in all of the input assemblies, with the original assembly point ids assigned.
- collapsed.camsa.points: a set of all unique assembly points in all of the input assemblies. Each collapsed point has an assigned collapsed id, a reference to the ids of the original points it is composed of, and information about all the conflicts that the collapsed assembly point (and consequently all the original assembly points it is composed of) is involved in.
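The subgroup naming convention described above can be unpacked in a few lines of Python. This is a minimal sketch, assuming that assembly names themselves contain no period and that every file ends in .camsa.points; `group_members` is a hypothetical helper for illustration and is not part of CAMSA.

```python
SUFFIX = ".camsa.points"


def group_members(filename):
    """Return the assembly names encoded in a subgroup filename.

    Assumes (hypothetically) that assembly names contain no '.' character.
    """
    if not filename.endswith(SUFFIX):
        raise ValueError(f"not a CAMSA points file: {filename}")
    stem = filename[: -len(SUFFIX)]
    return stem.split(".")


if __name__ == "__main__":
    print(group_members("as1.as2.camsa.points"))
```

If assembly names may contain periods, the filename alone is ambiguous and the group membership would have to be taken from another source.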
report.html is an interactive HTML, JavaScript, and CSS powered document that contains information about all the comparative metrics computed for the input assemblies as well as for individual assembly points.
The report consists of 5 sections:
- per assembly overview
- grouped assemblies overview comparison
- individual assemblies
- aggregated assembly points
- visualization of SAG, corresponding to a (sub)set of assembly points
All tables in the report utilize the DataTables JavaScript library, which makes them responsive, sortable, and searchable.
The per assembly overview table provides, alongside each assembly's short id and full name, an overview of the computed assembly-wise statistics, such as:
- #Aps: total number of assembly points
- Orientedness:
  - O: number of oriented assembly points in the scaffold assembly
  - SO: number of semi-oriented assembly points in the scaffold assembly
  - U: number of unoriented assembly points in the scaffold assembly
- NC: number of assembly points that do not conflict with any other assembly
- OC: number of assembly points that are out conflicting in the scaffold assembly
- OSC: number of assembly points that are out semi-conflicting in the scaffold assembly
- IC: number of assembly points that are in conflicting within the scaffold assembly
- ISC: number of assembly points that are in semi-conflicting within the scaffold assembly
- MAP: percentage of assembly points from the scaffold assembly that participate, by any of their realizations, in the merged assembly computed by CAMSA
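To make the orientedness counts concrete, here is a minimal Python sketch. It rests on assumptions that this page does not state: each assembly point is taken to carry two orientation values, an undetermined orientation is marked with "?", and a point counts as oriented, semi-oriented, or unoriented when zero, one, or both of its orientations are unknown. The function name `orientedness` is hypothetical.

```python
from collections import Counter

UNKNOWN = "?"  # assumed marker for an undetermined orientation


def orientedness(points):
    """Count oriented (O), semi-oriented (SO), and unoriented (U) points.

    `points` is an iterable of (orientation1, orientation2) pairs.
    """
    labels = {0: "O", 1: "SO", 2: "U"}
    counts = Counter({"O": 0, "SO": 0, "U": 0})
    for or1, or2 in points:
        # Number of unknown orientations in this point: 0, 1, or 2.
        unknown = (or1 == UNKNOWN) + (or2 == UNKNOWN)
        counts[labels[unknown]] += 1
    return dict(counts)


if __name__ == "__main__":
    sample = [("+", "-"), ("+", "?"), ("?", "?"), ("-", "-")]
    print(orientedness(sample))
```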
This graph is powered by the Highcharts JavaScript library. Here we group assembly points by their origin: for each assembly point we compute the subset of input assemblies that reported it, and then draw a chart of these subsets, where the length of each subset's entry represents the number of assembly points in that subset. Note that each assembly point is counted only once, for the full subset of input assemblies that reported it, and not for any smaller subset. For each subset of assemblies we provide the following information:
- total number of assembly points that were reported by the assemblies subset
- number of assembly points that participate in the merged scaffold assembly and were reported by the assemblies subset
- number of assembly points that are in conflicting with respect to this subset of input scaffold assemblies
- number of assembly points that are in semi-conflicting with respect to this subset of input scaffold assemblies
The image can be altered by enabling/disabling certain metrics and further exported in multiple formats.
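The grouping behind this chart can be sketched as follows. This is an illustrative Python example, not CAMSA's own code: `group_by_origin` is a hypothetical name, and an assembly point is represented by any hashable identifier. Each point is assigned to the exact set of assemblies that reported it, so it is counted once and never for a smaller subset.

```python
from collections import defaultdict


def group_by_origin(reports):
    """Map each exact set of reporting assemblies to its assembly points.

    `reports` is an iterable of (assembly_name, point) pairs, where `point`
    is any hashable identifier of an assembly point.
    """
    sources = defaultdict(set)
    for assembly, point in reports:
        sources[point].add(assembly)
    groups = defaultdict(list)
    for point, assemblies in sources.items():
        # frozenset makes the set of assemblies usable as a dictionary key.
        groups[frozenset(assemblies)].append(point)
    return dict(groups)


if __name__ == "__main__":
    reports = [("as1", "p1"), ("as2", "p1"), ("as1", "p2"),
               ("as2", "p3"), ("as1", "p3")]
    for assemblies, points in group_by_origin(reports).items():
        print(sorted(assemblies), len(points))
```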
In the individual assemblies section, we provide a separate table for each individual assembly (in a collapsible frame), where each assembly point is represented as a separate row. This overview allows one to filter by multiple parameters, obtain specific subsets of assembly points in each individual assembly, and thus narrow the scope of further analysis of the corresponding genomic region.
This table section provides the most comprehensive view of all of the assembly points in all input scaffold assemblies. The table features built-in DataTables filter options. In addition, a separate (collapsible) advanced search panel is provided, allowing regexp-based filtering/keeping with logical "or" and "and" operators across multiple columns. Together, these tools allow one to focus on any (sub)set of assembly points in the input assemblies for further processing and analysis.
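The same kind of multi-column regexp filtering can be reproduced offline on an exported set of points. The sketch below is a generic Python illustration, not the report's actual implementation; `filter_rows` and the column names "origin" and "seq1" are hypothetical.

```python
import re


def filter_rows(rows, patterns, mode="and"):
    """Keep rows whose columns match the given regular expressions.

    `rows` is a list of dicts; `patterns` maps a column name to a regex.
    With mode="and" every pattern must match, with mode="or" at least one.
    """
    if mode not in ("and", "or"):
        raise ValueError("mode must be 'and' or 'or'")
    combine = all if mode == "and" else any
    compiled = {col: re.compile(rx) for col, rx in patterns.items()}
    return [
        row for row in rows
        if combine(rx.search(str(row.get(col, "")))
                   for col, rx in compiled.items())
    ]


if __name__ == "__main__":
    rows = [
        {"origin": "as1", "seq1": "scaffold1"},
        {"origin": "as2", "seq1": "scaffold2"},
        {"origin": "as2", "seq1": "contig7"},
    ]
    patterns = {"origin": "^as2$", "seq1": "^scaffold"}
    print(filter_rows(rows, patterns, mode="and"))
    print(len(filter_rows(rows, patterns, mode="or")))
```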
The (filtered) table can be exported at any point in two possible ways:
- as a set of assembly points
- as a JSON representation of a Scaffold Assembly Graph, suitable for further Cytoscape processing
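For orientation, the generic Cytoscape.js elements JSON shape (nodes and edges, each wrapped in a "data" object) looks like the output of the sketch below. The exact schema that CAMSA exports is not described on this page, so this Python example is only an assumption-laden illustration: `to_cytoscape_json` is a hypothetical helper, and each assembly point is simplified to a pair of sequence names.

```python
import json


def to_cytoscape_json(points):
    """Serialize (seq1, seq2) assembly points as Cytoscape.js elements.

    Simplification: one node per sequence name, one edge per assembly point.
    """
    nodes = sorted({name for pair in points for name in pair})
    elements = {
        "nodes": [{"data": {"id": name}} for name in nodes],
        "edges": [
            {"data": {"id": f"e{i}", "source": s, "target": t}}
            for i, (s, t) in enumerate(points)
        ],
    }
    return json.dumps(elements, indent=2)


if __name__ == "__main__":
    print(to_cytoscape_json([("scaffold1", "scaffold2"),
                             ("scaffold2", "scaffold3")]))
```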
The interactive HTML report produced by CAMSA employs the Cytoscape.js JavaScript library to enable Scaffold Assembly (sub)graph visualization. We provide a set of 3 layouts for graph visualization:
- grid (fastest, but least pretty)
- dagre (quite fast, quite pretty)
- cose-bilkent (slowest, prettiest)
Sergey Aganezov & Max A. Alekseyev, Computational Biology Institute, The George Washington University