Sergey Aganezov jr edited this page Dec 1, 2018 · 9 revisions

On each execution, CAMSA generates multiple files in the output directory specified with the -o flag in the command line invocation. If no output directory name is given, the directory is automatically named camsa_{date}.

The structure of the output directory is shown below:

.output\
+-- input
|  +-- camsa_config.txt
|  +-- as1.camsa.points
|  +-- as2.camsa.points
|  +-- ...
+-- libs
|  +-- js
|  |  +-- xx.js
|  +-- images
|  |  +-- xx.png
|  +-- fonts
|  |  +-- xx.ttf
|  +-- css
|  |  +-- xx.css
+-- comparative
|  +-- subgroups
|  |  +-- group1.camsa.points
|  |  +-- group2.camsa.points
|  |  +-- ...
|  +-- collapsed.camsa.points
|  +-- original.camsa.points
+-- merged
|  +-- merged.camsa.points
+-- report.html
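
To inspect a finished run programmatically, the layout above can be walked with a few lines of Python. This is a minimal sketch, not part of CAMSA itself: the helper name `list_points_files` is hypothetical, and the only thing it relies on is the `.camsa.points` suffix shown in the tree.

```python
from pathlib import Path


def list_points_files(output_dir):
    """Group every *.camsa.points file under output_dir by its top-level subdirectory.

    Hypothetical helper: it relies only on the file suffix shown in the tree above.
    """
    root = Path(output_dir)
    grouped = {}
    for path in sorted(root.rglob("*.camsa.points")):
        relative = path.relative_to(root)
        # First path component, e.g. "input", "comparative" or "merged".
        top_level = relative.parts[0]
        grouped.setdefault(top_level, []).append(relative.as_posix())
    return grouped
```

Calling `list_points_files` on a CAMSA output directory returns a dictionary keyed by `input`, `comparative`, and `merged`; `report.html` and the contents of `libs` are skipped because they do not carry the `.camsa.points` suffix.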

input

  • camsa_config.txt: a plain text file with a human-readable dump of the multilayered configuration with which CAMSA was executed in this particular experiment.
  • asx.camsa.points: all the input files that were submitted to CAMSA, unchanged

libs

  • xx.js: JavaScript library files required for the interactive HTML report to work properly
  • xx.png: images used in the interactive HTML report
  • xx.ttf: font files required for the interactive HTML report to work properly
  • xx.css: CSS files required for the interactive HTML report to work properly

comparative

  • groupx.camsa.points: the set of assembly points that were reported by all of the assemblies in the specified group. These points are never listed in any of the subgroups of groupx. The groupx part of the filename consists of the names of the assemblies that constitute the group, separated by a dot (.).
  • original.camsa.points: the set of all assembly points from all of the input assemblies, with their original assembly point ids assigned.
  • collapsed.camsa.points: the set of all unique assembly points across the input assemblies. Each collapsed point has an assigned collapsed id, a reference to the ids of the original points it comprises, and information about all the conflicts in which the collapsed assembly point (and consequently every original assembly point it comprises) is involved.
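
The subgroup filename convention can be decoded mechanically. The sketch below assumes the separator between assembly names is a single dot and that assembly names themselves contain no dots; the function name `assemblies_in_group` is hypothetical.

```python
from pathlib import Path

POINTS_SUFFIX = ".camsa.points"


def assemblies_in_group(filename):
    """Recover the assembly names encoded in a subgroup filename.

    Assumption: names are joined with "." and contain no dots themselves.
    """
    name = Path(filename).name
    if not name.endswith(POINTS_SUFFIX):
        raise ValueError(f"not a CAMSA points file: {filename!r}")
    group_part = name[: -len(POINTS_SUFFIX)]
    return group_part.split(".")
```

Under these assumptions, a file named `as1.as2.camsa.points` in the subgroups directory maps to the group made of `as1` and `as2`. If an assembly name did contain a dot, the split would be ambiguous and the names would have to be taken from the input file names instead.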

report.html

An interactive document powered by HTML, JavaScript, and CSS that contains all the comparative metrics computed for the input assemblies as well as for individual assembly points.

The report consists of 5 sections:

  • per assembly overview
  • grouped assemblies overview comparison
  • individual assemblies
  • aggregated assembly points
  • visualization of the Scaffold Assembly Graph (SAG) corresponding to a (sub)set of assembly points

All tables in the report utilize the DataTables JavaScript library, which makes them responsive, sortable, and searchable.

Per assembly overview

Alongside each assembly's short id and full name, this table provides an overview of the computed assembly-wise statistics:

  • #Aps: total number of assembly points
  • Orientedness:
    • O: number of oriented assembly points in the scaffold assembly
    • SO: number of semi-oriented assembly points in the scaffold assembly
    • U: number of unoriented assembly points in the scaffold assembly
  • NC: number of assembly points that do not conflict with any other assembly
  • OC: number of assembly points that are out conflicting in the scaffold assembly
  • OSC: number of assembly points that are out semi-conflicting in the scaffold assembly
  • IC: number of assembly points that are in conflicting within the scaffold assembly
  • ISC: number of assembly points that are in semi-conflicting within the scaffold assembly
  • MAP: percentage of assembly points from the scaffold assembly that participate, through any of their realizations, in the merged assembly computed by CAMSA

Grouped assemblies overview comparison

This graph is powered by the Highcharts JavaScript library. Here we group assembly points by their origin: for each assembly point we compute the subset of input assemblies that reported it, and then chart these subsets, where the length of each entry in the chart represents the number of assembly points in the corresponding subset. Note that each assembly point is counted exactly once, for the full subset of input assemblies that reported it, and not for any smaller subset. For each subset of assemblies we provide the following information:

  • total number of assembly points that were reported by the assemblies subset
  • number of assembly points that participate in the merged scaffold assembly and were reported by the assemblies subset
  • number of assembly points that are in conflicting with respect to this subset of input scaffold assemblies
  • number of assembly points that are in semi-conflicting with respect to this subset of input scaffold assemblies
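
The grouping by origin described above can be sketched in a few lines. This is an illustration of the counting rule only, not CAMSA's implementation: assembly points are treated as opaque hashable values, whereas real assembly points also carry orientations and further attributes, and the name `group_by_origin` is hypothetical.

```python
from collections import Counter


def group_by_origin(assemblies):
    """Count assembly points per exact subset of assemblies that reported them.

    assemblies: dict mapping assembly name -> iterable of hashable points.
    Returns a Counter keyed by frozenset of assembly names.
    """
    origin = {}
    for name, points in assemblies.items():
        for point in set(points):
            # Collect every assembly that reported this point.
            origin.setdefault(point, set()).add(name)
    # Each point contributes once, to the full subset that reported it.
    return Counter(frozenset(names) for names in origin.values())
```

For example, with three assemblies A, B, and C, a point reported by both A and B is counted under the subset {A, B} only, and does not add to the counts for {A} or {B}.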

The chart can be altered by enabling or disabling individual metrics and can then be exported in multiple formats.

Individual assemblies

In the individual assemblies section, we provide a separate table for each individual assembly (in a collapsible frame), where each assembly point is represented as a separate row. This overview allows one to filter by multiple parameters, obtain specific subsets of assembly points in each individual assembly, and thus narrow the scope of further analysis of the corresponding genomic region.

Aggregated assembly points

This section provides the most comprehensive view of all the assembly points in all input scaffold assemblies. The table features the built-in DataTables filter options. In addition, a separate (collapsible) advanced search panel is provided, which allows regexp-based filtering or keeping of rows, combined with logical OR and AND operators across multiple columns. Together these tools allow one to focus on any (sub)set of assembly points in the input assemblies for further processing and analysis.
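
The same kind of regexp filtering with AND/OR semantics can be reproduced offline on an exported table. The sketch below is an assumption-laden illustration rather than the report's own code: rows are plain dictionaries, the column names used in the usage note are made up, and `filter_rows` is a hypothetical helper.

```python
import re


def filter_rows(rows, conditions, mode="and"):
    """Keep rows whose columns match the given regular expressions.

    rows: list of dicts (one per assembly point).
    conditions: dict mapping column name -> regexp pattern.
    mode: "and" keeps rows matching every condition, "or" keeps rows matching any.
    """
    if mode not in ("and", "or"):
        raise ValueError("mode must be 'and' or 'or'")
    compiled = {column: re.compile(pattern) for column, pattern in conditions.items()}
    combine = all if mode == "and" else any
    return [
        row
        for row in rows
        if combine(regex.search(str(row.get(column, ""))) for column, regex in compiled.items())
    ]
```

With hypothetical columns `seq1` and `sources`, `filter_rows(rows, {"seq1": r"^scaffold", "sources": r"B"}, mode="and")` keeps only the rows where both patterns match, while `mode="or"` keeps rows where at least one of them does.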

The (filtered) table can be exported at any point in two possible ways:

  • as a set of assembly points
  • as a JSON representation of a Scaffold Assembly Graph, suitable for further Cytoscape processing
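
The exact schema of the exported JSON is not described on this page. As a rough illustration of what a Cytoscape-style graph document looks like in general, the sketch below builds the nodes/edges "elements" layout that Cytoscape.js accepts. It is a deliberate simplification and an assumption: it uses one node per sequence and one edge per assembly point, ignores orientations, and the function name `to_cytoscape_elements` is hypothetical.

```python
import json


def to_cytoscape_elements(points):
    """Serialize (seq1, seq2) pairs as a Cytoscape.js-style elements JSON string.

    Simplification (assumption): one node per sequence, one edge per pair,
    no orientation information. CAMSA's real export may differ.
    """
    pairs = list(points)
    node_ids = sorted({seq for pair in pairs for seq in pair})
    elements = {
        "nodes": [{"data": {"id": node_id}} for node_id in node_ids],
        "edges": [
            {"data": {"id": f"e{index}", "source": source, "target": target}}
            for index, (source, target) in enumerate(pairs)
        ],
    }
    return json.dumps(elements, indent=2)
```

Each node and edge is wrapped in a `data` object, and edges reference nodes through `source` and `target` ids, which is the convention the Cytoscape.js elements format uses.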

Visualization

The interactive HTML report produced by CAMSA employs the Cytoscape.js JavaScript library to enable Scaffold Assembly (sub)graph visualization. We provide a set of 3 layouts for graph visualization:

  • grid (fastest, but least pretty)
  • dagre (quite fast, quite pretty)
  • cose-bilkent (slowest, prettiest)