DB to track files? #20
The convention I'm moving toward is to have only a single task per task-directory. The run-directory is inferred as the directory of the script. This has 2 advantages:

Let's say you have a directory like this:

PATH/myjob/input.a
PATH/myjob/input.b
PATH/myjob/run.sh

For (1), the wrapper creates this:

/tmp/myjob/input.a -> symlink to PATH/input.a
/tmp/myjob/input.b -> symlink to PATH/input.b
/tmp/myjob/run.sh -> symlink to PATH/run.sh

Then it runs our script. And when done, the wrapper copies everything that is not a symlink back to

For (2), we simply add a symlink like this:

This part is already implemented, except that the symlink today points to

Currently, the contents of

$ ls mypwatcher/
exits heartbeats jobs state.py wrappers
$ ls mypwatcher/jobs
J0aaef15a4a19a1293ffc4111f962fc46e01157888ddeee2863d40b94255b63a5 J7af4926326d614413d9f80e5e6b3f523fa31d8288c63ea845a67ff5aefd52461
...
$ ls mypwatcher/wrappers/
run-J0aaef15a4a19a1293ffc4111f962fc46e01157888ddeee2863d40b94255b63a5.bash run-J7af4926326d614413d9f80e5e6b3f523fa31d8288c63ea845a67ff5aefd52461.bash
...
$ ls mypwatcher/exits/
exit-J0aaef15a4a19a1293ffc4111f962fc46e01157888ddeee2863d40b94255b63a5 exit-J7af4926326d614413d9f80e5e6b3f523fa31d8288c63ea845a67ff5aefd52461

So we have 2 good reasons to have only 1 task per directory. That's not a difficult restriction to follow. (Note that for (1), we would also sometimes like to copy the input into

The bottom line is that, by following simple conventions, we don't really need a DB. (There is a DB for pwatcher, but that's different.)

Observe:

$ ls -trd run-*
run-bam2fasta run-fasta2referenceset run-pbalign-00 run-pbalign_gather run-gc-01 run-gc-gather
run-falcon run-pbalign-scatter run-pbalign-01 run-gc_scatter run-gc-00 run-polished-assembly-report
$ ls -1trgG run-*
run-bam2fasta:
total 120
-rw-rw-r-- 1 241 May 18 17:41 run_bam2fasta.sh
lrwxrwxrwx 1 242 May 18 17:41 pwatcher.dir -> /home/UNIXHOME/cdunn/repo/pb/smrtanalysis-client/smrtanalysis/siv/testkit-jobs/sa3_pipelines/hgap5_fake/synth5k/job_output/tasks/falcon_ns.tasks.task_hgap_run-0/mypwatcher/jobs/J71c0fd091f2bce465a8cdab4debe4e053b0ef6ba8aeb88e808f46daebb761950
-rw-rw-r-- 1 1798 May 18 17:41 filtered.subreadset.xml
-rw-rw-r-- 1 102740 May 18 17:41 input.fasta
run-falcon:
total 316
-rw-rw-r-- 1 187 May 18 17:41 raw_reads.fofn
-rw-rw-r-- 1 893 May 18 17:41 fc.cfg
-rw-rw-r-- 1 1060 May 18 17:41 fc.json
-rw-rw-r-- 1 414 May 18 17:41 run_falcon.sh
lrwxrwxrwx 1 242 May 18 17:41 pwatcher.dir -> /home/UNIXHOME/cdunn/repo/pb/smrtanalysis-client/smrtanalysis/siv/testkit-jobs/sa3_pipelines/hgap5_fake/synth5k/job_output/tasks/falcon_ns.tasks.task_hgap_run-0/mypwatcher/jobs/J18f9e040a90143f08dec1ca0f85ca2c4a763a3c0dc3dae1b7b63342f18adef53
drwxrwxr-x 2 4096 May 18 17:41 scripts
drwxrwxr-x 2 4096 May 18 17:41 sge_log
drwxrwxr-x 6 4096 May 18 17:41 mypwatcher
drwxrwxr-x 5 4096 May 18 17:42 0-rawreads
-rw-rw-r-- 1 1013 May 18 17:42 fc.log
drwxrwxr-x 4 4096 May 18 17:42 1-preads_ovl
drwxrwxr-x 2 4096 May 18 17:42 2-asm-falcon
-rw-rw-r-- 1 267790 May 18 17:42 pypeflow.log
lrwxrwxrwx 1 21 May 18 17:42 asm.fasta -> 2-asm-falcon/p_ctg.fa
lrwxrwxrwx 1 30 May 18 17:42 preads.fofn -> 1-preads_ovl/input_preads.fofn
-rw-rw-r-- 1 26 May 18 17:42 asm.fasta.fai
run-fasta2referenceset:
total 8
...
run-polished-assembly-report:
total 116
-rw-rw-r-- 1 899 May 18 17:43 run_report.sh
lrwxrwxrwx 1 242 May 18 17:43 pwatcher.dir -> /home/UNIXHOME/cdunn/repo/pb/smrtanalysis-client/smrtanalysis/siv/testkit-jobs/sa3_pipelines/hgap5_fake/synth5k/job_output/tasks/falcon_ns.tasks.task_hgap_run-0/mypwatcher/jobs/J393c6e14ce6037183c55e2c3ddac9c52938b0c43e95e73fe5c89e78fa8fb6e8e
-rw-rw-r-- 1 76169 May 18 17:43 alignment.summary.gff
-rw-rw-r-- 1 13300 May 18 17:43 polished_coverage_vs_quality.png
-rw-rw-r-- 1 2974 May 18 17:43 polished_coverage_vs_quality_thumb.png
-rw-rw-r-- 1 1359 May 18 17:43 polished_assembly_report.json
-rw-rw-r-- 1 69 May 18 17:43 polished_coverage_vs_quality.csv
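
Going back to advantage (1), the tmpdir wrapper described earlier in this comment could look roughly like the sketch below. This is only an illustration, not pwatcher's actual wrapper: the function name `run_in_tmpdir` and its parameters are hypothetical, it uses a uniquely named scratch directory rather than a fixed `/tmp/myjob`, and it assumes that results are copied back into the original run-directory.

```python
import os
import shutil
import subprocess
import tempfile


def run_in_tmpdir(run_dir, script="run.sh", tmp_root=None):
    """Run `script` inside a scratch dir that mirrors run_dir via symlinks.

    Hypothetical helper: names and defaults are assumptions for illustration.
    """
    run_dir = os.path.abspath(run_dir)
    tmp_root = tmp_root or tempfile.gettempdir()
    # Assumption: a unique scratch dir, so same-named jobs cannot collide.
    scratch = tempfile.mkdtemp(prefix=os.path.basename(run_dir) + "-", dir=tmp_root)
    try:
        # Mirror every entry of the run-dir as a symlink in the scratch dir.
        for name in os.listdir(run_dir):
            os.symlink(os.path.join(run_dir, name), os.path.join(scratch, name))
        # Run the task script with the scratch dir as working directory.
        rc = subprocess.call(["/bin/bash", script], cwd=scratch)
        # Whatever is not a symlink was created by the task: copy it back.
        # Assumption: the destination is the original run-dir.
        for name in os.listdir(scratch):
            src = os.path.join(scratch, name)
            if os.path.islink(src):
                continue
            dst = os.path.join(run_dir, name)
            if os.path.isdir(src):
                shutil.copytree(src, dst, dirs_exist_ok=True)  # Python 3.8+
            else:
                shutil.copy2(src, dst)
        return rc
    finally:
        # rmtree unlinks the symlinks without touching their targets.
        shutil.rmtree(scratch, ignore_errors=True)
```

Because inputs are only symlinked, the "is it a symlink?" test is enough to tell outputs from inputs, which is why this only works cleanly with one task per directory.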
…-PATH to develop
* commit 'd45f554a5a4d09606881380b05dece0bd0085b77':
  use_tmpdir=False for local Task
  Use Dist for NPROC/MB; allow sge_option per task for fs_based; support local dist
  Stop telling about heartbeats (unused)
  Catch bad job_queue usage early (spaces are for blocking pwatcher)
  Moved gen_task a bit
  Add /bin to PATH in run.sh
  /bin/bash, since /bin might not be in $PATH
@pb-jchin wrote:
There is pwatcher/state.py, but maybe you really want a forward link from the run-dir into pwatcher.
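
A forward link of this kind, like the `pwatcher.dir` symlinks visible in the `ls` listing, can be created and followed with nothing more than the filesystem. The sketch below is a minimal illustration under that assumption. The helper names `link_run_dir_to_job` and `job_dirs_by_run_dir` are hypothetical and are not part of pwatcher.

```python
import os


def link_run_dir_to_job(run_dir, job_dir, link_name="pwatcher.dir"):
    """Create run_dir/<link_name> -> job_dir, replacing any older link.

    Hypothetical helper; only the link name comes from the listing above.
    """
    link = os.path.join(run_dir, link_name)
    if os.path.islink(link):
        os.unlink(link)
    os.symlink(os.path.abspath(job_dir), link)
    return link


def job_dirs_by_run_dir(tasks_root, link_name="pwatcher.dir"):
    """Map each run-dir name under tasks_root to its link target.

    Run-dirs without the link are skipped.
    """
    found = {}
    for name in sorted(os.listdir(tasks_root)):
        link = os.path.join(tasks_root, name, link_name)
        if os.path.islink(link):
            found[name] = os.readlink(link)
    return found
```

With one task per directory, this mapping is unambiguous, so the directory tree itself plays the role that a separate DB would otherwise fill.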