Reduce memory and CPU for CreateImportTsvs task, check for files before attempting load (#7121)

* reduce memory for ImportGenomes and add this branch to dockstore yml

* do not fail bq load if no files to ingest

* remove feature branch from dockstore
mmorgantaylor committed Apr 6, 2021
1 parent ffd7bae commit 349ea29
Showing 1 changed file with 12 additions and 5 deletions.
17 changes: 12 additions & 5 deletions scripts/variantstore/wdl/ImportGenomes.wdl
@@ -210,10 +210,10 @@ task CreateImportTsvs {
 >>>
 runtime {
 docker: docker
-memory: "10 GB"
+memory: "3.75 GB"
 disks: "local-disk " + disk_size + " HDD"
 preemptible: select_first([preemptible_tries, 5])
-cpu: 2
+cpu: 1
 }
 output {
 String done = "true"
@@ -322,15 +322,22 @@ task LoadTable {
 # even for non-superpartitioned tables (e.g. metadata), the TSVs do have the suffix
 FILES="~{datatype}_${PADDED_TABLE_ID}_*"

+NUM_FILES=$(gsutil ls "${DIR}${FILES}" | wc -l)
+
 if [ ~{superpartitioned} = "true" ]; then
 TABLE="~{dataset_name}.${PREFIX}~{datatype}_${PADDED_TABLE_ID}"
 else
 TABLE="~{dataset_name}.${PREFIX}~{datatype}"
 fi

-bq load --location=US --project_id=~{project_id} --skip_leading_rows=1 --source_format=CSV -F "\t" $TABLE $DIR$FILES ~{schema} || exit 1
-echo "ingested ${FILES} file from $DIR into table $TABLE"
-gsutil mv $DIR$FILES ${DIR}done/
+if [ $NUM_FILES -gt 0 ]; then
+bq load --location=US --project_id=~{project_id} --skip_leading_rows=1 --source_format=CSV -F "\t" $TABLE $DIR$FILES ~{schema} || exit 1
+echo "ingested ${FILES} file from $DIR into table $TABLE"
+gsutil mv $DIR$FILES ${DIR}done/
+else
+echo "no ${FILES} files to process in $DIR"
+fi
+
 >>>

 runtime {
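
The guard added to LoadTable (count the matching files first, load only when the count is positive) can be tried locally. What follows is a minimal sketch under stated assumptions, not the workflow's code: a local directory with `ls` and `mv` stands in for the bucket and for `gsutil ls` / `gsutil mv`, and `load_files` and `load_if_present` are hypothetical names, with `load_files` standing in for the `bq load` call.

```shell
#!/usr/bin/env bash
# Sketch only: local files stand in for the GCS bucket, and load_files is a
# hypothetical placeholder for the real `bq load` invocation.

load_files() {
  # First argument is the table name; the remaining arguments are the files.
  echo "loading $(( $# - 1 )) file(s) into $1"
}

load_if_present() {
  local dir="$1" pattern="$2" table="$3"
  local num_files

  # Count matching files. ls fails when nothing matches, so its stderr is
  # discarded and its failure ignored, leaving wc to report 0.
  num_files=$( { ls "${dir}"${pattern} 2>/dev/null || true; } | wc -l | tr -d ' ')

  if [ "${num_files}" -gt 0 ]; then
    # Load first; move the inputs aside only if the load succeeded.
    load_files "${table}" "${dir}"${pattern} || return 1
    echo "ingested ${pattern} files from ${dir} into table ${table}"
    mkdir -p "${dir}done"
    mv "${dir}"${pattern} "${dir}done/"
  else
    # Nothing to ingest is reported, not treated as an error.
    echo "no ${pattern} files to process in ${dir}"
  fi
}
```

Calling `load_if_present "/some/dir/" "pet_001_*" "dataset.pet_001"` on an empty directory prints the "no ... files to process" message and returns successfully; with matching files present it loads them and moves them into `done/`. The directory argument is expected to end in a slash, mirroring how `$DIR$FILES` is concatenated in the task.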