Variant calling tutorial notes:
BQSR?
VQSR?
GATK and freebayes should proceed from the same alignment set instead of duplicating effort.
should the text of the tutorials show command output, or should output only be displayed in the terminal?
show help output from running, e.g. `freebayes` with no input?
show lines from files, e.g. samfile? fastq?
SCRIPTS:
check to ensure that slurm resource requests match what the processes in each script actually need.
check to ensure --job-name= is correct
where will reference genome/indexes reside?
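One possible way to do the two script checks above is to print the `#SBATCH` lines of every script side by side and review them by eye. This is only a sketch: `show_sbatch` is a hypothetical helper name, and it assumes the tutorial scripts carry their slurm settings as `#SBATCH` header lines.

```shell
#!/bin/sh
# Sketch: list the #SBATCH directives of each script given as an argument.
# show_sbatch is a hypothetical helper, not part of slurm.

show_sbatch() {
    for script in "$@"; do
        # Skip anything that is not a regular file (e.g. an unmatched glob).
        [ -f "$script" ] || continue
        echo "== ${script}"
        grep -E '^#SBATCH' "$script" || echo "(no #SBATCH lines found)"
    done
}

# Review every shell script in the current directory.
show_sbatch ./*.sh
```

Reading the `--job-name=` and processor requests next to the script name makes mismatches easier to spot than opening each file in turn.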
add section in part 1 on IGV.
Add note on license for distribution
edit to keep "introduction" very short. add a "motivation" section after the table of contents to cover more background.
- done, but motivation sections are not all written.
- motivation sections could include a flow chart?
consider making parts 1 and 3 completely independent.
add flow chart(s) to readme.md that can be revisited in subsequent sections
make or add figures to explain freebayes, gatk.
asking for more resources for align script (12 procs instead of 4) to try to get all three alignments done quicker. check to see if typical workshop size + teaching partition can support this.
tried this with two bwa instances run as background processes. one of them didn't use any processor. bwa does not like to run as a background process?
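For reference, the usual shell pattern for running the three alignments side by side is to start each one with `&`, record its process ID, and `wait` on each. The sketch below is not the actual align script: `align_one` is a hypothetical placeholder that only prints a message, and the even split of 12 processors into 4 threads per sample is an assumption. It does not explain why one bwa instance sat idle.

```shell
#!/bin/sh
# Sketch: run one alignment per sample concurrently, then wait for all of them.

align_one() {
    # $1 = sample name, $2 = thread count.
    # Hypothetical placeholder: the real bwa command from the align script
    # would go here, with the thread count passed to bwa mem via -t.
    echo "aligning $1 with $2 threads"
}

run_all() {
    threads="$1"
    shift
    pids=""
    failed=0
    for sample in "$@"; do
        align_one "$sample" "$threads" &
        pids="$pids $!"          # remember each background process ID
    done
    for pid in $pids; do
        wait "$pid" || failed=1  # collect the exit status of every job
    done
    return "$failed"
}

# 12 requested processors split across three samples gives 4 threads each.
run_all 4 son mom dad
```

Waiting on each process ID separately means a failed alignment is reported instead of being silently ignored, which a bare `wait` would not do.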
## 19.09.06
checked parts 1,2,3. all scripts run fine.
alignment may take too long. consider trimming focal region to 2-3 Mb instead of 5 Mb?
in piped example, son takes 10 min, mom and dad 18 min each.
need to time the other scripts.
## 19.09.09
get most recent freebayes on the cluster
consider removing -L targets from gatk section; that would make it independent of Part 4a
major problem with GATK
"16:49:17.475 WARN PairHMM - ***WARNING: Machine does not have the AVX instruction set support needed for the accelerated AVX PairHmm. Falling back to the MUCH slower LOGLESS_CACHING implementation!"
This leads to much longer run time when submitted through slurm than when executed in interactive session.
## 19.09.11
consider adding discussion of ins and outs of targets, when to specify them. target/exome capture is a good extra discussion point.
get freebayes updated
you will eventually have to point at a different version of the human genome, right?
issue with AVX PairHMM is confined to nodes 11-13 (so far).
this is a cluster problem.
need to make GATK run only on nodes that have AVX instruction set
can specify xeon partition, will likely solve the problem.
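A quick way to confirm which nodes are affected is to check the CPU flags on the node where the job lands. The sketch below assumes Linux nodes that expose `/proc/cpuinfo`; `has_avx` is a hypothetical helper name. The partition name `xeon` is the one mentioned above, and whether every node in it has AVX is still an assumption to verify.

```shell
#!/bin/sh
# Sketch: report whether the current node advertises the AVX instruction set.
# In the GATK script, a header line such as
#   #SBATCH --partition=xeon
# would request the xeon partition (name taken from the notes above).

has_avx() {
    # Optional argument: a saved copy of cpuinfo; defaults to the live file.
    cpuinfo="${1:-/proc/cpuinfo}"
    # Look at the first "flags" line and match avx as a whole word,
    # so that avx2 alone does not count.
    grep -m1 '^flags' "$cpuinfo" 2>/dev/null | grep -qw 'avx'
}

if has_avx; then
    echo "AVX available on $(uname -n)"
else
    echo "no AVX on $(uname -n): expect the slow PairHMM fallback"
fi
```

Putting this check at the top of the GATK script would record in the slurm log which node each run used and whether the slow fallback should be expected there.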
## 19.09.12
need to get some software installed/updated.
vt
vcflib
## 19.09.23
where are we going to get the reference genome from?!