Metagenomics Practical
======================
Metagenomics assemblies with Ray
--------------------------------
Ray is a particularly interesting genome assembler due to several
unusual features:

- It can scale to arbitrary numbers of processors and machines by distributing its assembly graph
- It has several functions specific to metagenome assembly ("Ray Meta")
- Ray's author, ``@sebhtml``, is incredibly responsive on Twitter :)
- Ray will happily mix input from several different sequencing techniques, e.g. Illumina and 454
- If run with the ``-write-kmers`` option enabled, the resulting assembly graph may be viewed using the separate Ray Cloud Browser software
Installing Ray
--------------
Dependencies
~~~~~~~~~~~~
::

    sudo apt-get install build-essential
    sudo apt-get install git
    sudo apt-get install openmpi1.6-bin openmpi1.6-common libopenmpi1.6-dev
Installing Ray from source code
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
::

    git clone https://github.com/sebhtml/ray
    git clone https://github.com/sebhtml/RayPlatform
    cd ray
    HAVE_LIBZ=y MAXKMERLENGTH=64 make
From inside the ``ray`` directory, you can add the build directory to your PATH:

::

    export PATH=$PATH:$(pwd)
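Before launching an assembly it can help to confirm that both ``mpirun`` and ``Ray`` are visible on the PATH. The following is a minimal sketch; ``check_tools`` is a hypothetical helper name, not part of Ray.

```shell
#!/bin/bash
# Hypothetical helper: report which of the given commands are on PATH.
# Returns non-zero if at least one command is missing.
check_tools() {
    local missing=0 tool
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool -> $(command -v "$tool")"
        else
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return $missing
}

# Check the two programs used in the rest of this practical.
check_tools mpirun Ray || echo "fix PATH before running the assembly"
```

If ``Ray`` is reported missing, re-run the ``export PATH`` line from inside the directory where ``make`` was run.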
The following command lines run Ray on 8 processes with a k-mer length of 31.
For paired-end reads:

::

    mpirun -np 8 Ray -k 31 -p pair1.fastq.gz pair2.fastq.gz -o output_directory

For interleaved paired-end reads:

::

    mpirun -np 8 Ray -k 31 -i pairs.fastq.gz -o output_directory

For single-end reads:

::

    mpirun -np 8 Ray -k 31 -s reads.fastq.gz -o output_directory
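The three variants differ only in the input flag (``-p``, ``-i`` or ``-s``), so the choice can be wrapped in a small function. This is a sketch only: ``ray_cmd`` and its mode names are hypothetical, and it prints the command rather than running it so that it can be inspected first.

```shell
#!/bin/bash
# Hypothetical helper: print a Ray command line for one read library.
# Usage: ray_cmd <np> <k> <outdir> <mode> <reads...>
#   mode is one of: paired, interleaved, single
ray_cmd() {
    if [ "$#" -lt 5 ]; then
        echo "usage: ray_cmd np k outdir mode reads..." >&2
        return 1
    fi
    local np=$1 k=$2 outdir=$3 mode=$4 flag
    shift 4
    case "$mode" in
        paired)      flag="-p" ;;
        interleaved) flag="-i" ;;
        single)      flag="-s" ;;
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
    # This helper handles one library, so paired mode takes exactly two files.
    if [ "$mode" = paired ] && [ "$#" -ne 2 ]; then
        echo "paired mode needs exactly two files" >&2
        return 1
    fi
    echo "mpirun -np $np Ray -k $k $flag $* -o $outdir"
}

ray_cmd 8 31 output_directory paired pair1.fastq.gz pair2.fastq.gz
```

Once the printed command looks right, it can be pasted into the shell or run with ``eval "$(ray_cmd ...)"``.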
If you want to run Ray Cloud Browser, you will want the ``-write-kmers``
option:

::

    mpirun -np 8 Ray -write-kmers -k 31 -p pair1.fastq.gz pair2.fastq.gz -o output_directory
If you run on a cluster, e.g. one created with StarCluster, mpirun can be set to execute
on multiple machines with the ``-H`` option:

::

    mpirun -np 8 -H host1,host2,host3,host4 Ray -k 31 -p pair1.fastq.gz pair2.fastq.gz -o x
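On a larger cluster the host names usually live in a file, one per line. The sketch below joins such a file into the comma-separated list that ``-H`` expects; ``host_list`` and the file layout (blank lines and ``#`` comments ignored) are assumptions made for illustration. The command is printed rather than executed.

```shell
#!/bin/bash
# Hypothetical helper: join the non-empty, non-comment lines of a host file
# into a comma-separated list suitable for "mpirun -H".
host_list() {
    grep -v -e '^[[:space:]]*$' -e '^[[:space:]]*#' "$1" | paste -sd, -
}

# Example with a throwaway host file.
hostfile=$(mktemp)
printf '%s\n' host1 host2 '# spare node' host3 host4 > "$hostfile"
echo "mpirun -np 8 -H $(host_list "$hostfile") Ray -k 31 -p pair1.fastq.gz pair2.fastq.gz -o x"
rm -f "$hostfile"
```

Note that the program name ``Ray`` comes after the mpirun options and before Ray's own options.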
For more command-line options, see:
https://github.com/sebhtml/ray/blob/master/MANUAL_PAGE.txt
Ray Cloud Browser
~~~~~~~~~~~~~~~~~
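Here is a useful script to set up Ray Cloud Browser from the ``kmers.txt`` and ``Contigs.fasta`` files:

::

    #!/bin/bash
    # Arguments: tag kmerfile contigfile mapid sectionid
    tag=$1          # label for the map, also used as prefix for the .dat files
    kmerfile=$2     # the kmers.txt file
    contigfile=$3   # the Contigs.fasta file
    mapid=$4        # passed to add-section
    sectionid=$5    # passed to create-map-annotations-with-section

    # Build the map from the k-mers and register it in config.json
    RayCloudBrowser-client create-map "$kmerfile" "$tag.dat"
    RayCloudBrowser-client add-map config.json "$tag" "$tag.dat"
    # Build the contig section, annotate the map with it, and register it
    RayCloudBrowser-client create-section "$contigfile" "$tag-contigs.dat"
    RayCloudBrowser-client create-map-annotations-with-section "$tag.dat" "$tag-contigs.dat" "$sectionid"
    RayCloudBrowser-client add-section config.json "$mapid" "$tag Contigs" "$tag-contigs.dat"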
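The script above takes five positional arguments and does not check them. A small guard placed at the top of the script could catch a wrong argument count or an unreadable input file before any ``RayCloudBrowser-client`` command runs. This is a sketch under assumptions: ``check_browser_args`` and the script name ``setup-browser.sh`` are hypothetical.

```shell
#!/bin/bash
# Hypothetical guard for the setup script: verify that five arguments were
# given and that the k-mer and contig files (arguments 2 and 3) are readable.
check_browser_args() {
    if [ "$#" -ne 5 ]; then
        echo "usage: setup-browser.sh tag kmerfile contigfile mapid sectionid" >&2
        return 1
    fi
    local f
    for f in "$2" "$3"; do
        if [ ! -r "$f" ]; then
            echo "cannot read input file: $f" >&2
            return 1
        fi
    done
    return 0
}

# Inside the script this would be called as:
#   check_browser_args "$@" || exit 1
```

With the guard in place, calling the script with too few arguments prints the usage line and stops instead of creating partial ``.dat`` files.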