Github actions for system testing failed with error "No space left on device" #78

ManavalanG · 2023-06-20T19:45:18Z

GitHub Actions for system testing started failing even though there had been no significant code changes. The failure occurred during the step "Run QuaC system testing - WGS mode AND no prior QC data", with the following error in multiple (>5) snakemake-triggered jobs:

FATAL:   while extracting /home/runner/work/quac/quac/.snakemake/singularity/e0c80565ed6b26b379a971ee706979ce.simg: root filesystem extraction failed: extract command failed: WARNING: passwd file doesn't exist in container, not updating
WARNING: group file doesn't exist in container, not updating
WARNING: Skipping mount /etc/hosts [binds]: /etc/hosts doesn't exist in container
WARNING: Skipping mount /etc/localtime [binds]: /etc/localtime doesn't exist in container
WARNING: Skipping mount proc [kernel]: /proc doesn't exist in container
WARNING: Skipping mount /opt/hostedtoolcache/singularity/3.8.3/x64/var/singularity/mnt/session/tmp [tmp]: /tmp doesn't exist in container
WARNING: Skipping mount /opt/hostedtoolcache/singularity/3.8.3/x64/var/singularity/mnt/session/var/tmp [tmp]: /var/tmp doesn't exist in container
WARNING: Skipping mount /opt/hostedtoolcache/singularity/3.8.3/x64/var/singularity/mnt/session/etc/resolv.conf [files]: /etc/resolv.conf doesn't exist in container

Write on output file failed because No space left on device

FATAL ERROR:writer: failed to write file /image/root/usr/local/bin/x86_64-conda_cos6-linux-gnu-objcopy
Parallel unsquashfs: Using 2 processors
17464 inodes (25034 blocks) to write

: exit status 1

My suspicion was that something had changed on the GitHub runners' side, so I reran the workflow multiple times over several days. The runs kept failing with the same error, albeit at different snakemake-triggered jobs. Next, I reran the workflow for a commit that had succeeded in the past, and it failed this time around. This strengthened the notion that the runners were the cause of these failures.

As a next step, after discussion with James, we printed out storage usage at multiple stages of the GitHub Actions workflow - e754213

At the beginning of the workflow, the root dir had 22G available, and it had 5.6G available when the workflow errored out. Note that GitHub runners are said to have 14GB of storage available, and the storage consumed here was ~16G. The suspicion at this stage was that we were using more storage than we are supposed to.
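For anyone wanting to reproduce this kind of check, here is a minimal sketch of printing storage at different stages. It is not what e754213 does; the function names (`disk_usage_gib`, `report`, `consumed_gib`) and the stage labels are made up for illustration, and it only relies on Python's standard `shutil.disk_usage`.

```python
import shutil

GIB = 1024 ** 3  # bytes per GiB


def disk_usage_gib(path="/"):
    """Return total, used and free space (in GiB) for the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    return {
        "total": usage.total / GIB,
        "used": usage.used / GIB,
        "free": usage.free / GIB,
    }


def report(label, path="/"):
    """Print a one-line storage summary tagged with a (hypothetical) stage label."""
    stats = disk_usage_gib(path)
    line = (
        f"[{label}] {path}: {stats['free']:.1f}G free of "
        f"{stats['total']:.1f}G ({stats['used']:.1f}G used)"
    )
    print(line)
    return line


def consumed_gib(free_before, free_after):
    """Storage consumed between two measurements of free space."""
    return free_before - free_after


if __name__ == "__main__":
    report("start of workflow")
    # ... run the step under investigation here ...
    report("after system testing")
```

Plugging in the numbers above, `consumed_gib(22.0, 5.6)` comes out at roughly 16.4, which matches the ~16G figure.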

We thought about GitHub larger runners, but they would cost us:

* For larger runners, there is no additional cost for configurations that assign public static IP addresses to a larger runner. For more information on larger runners, see "Using larger runners."
* Entitlement minutes cannot be used for larger runners.
* The larger runners are not free for public repositories.

Source

We then decided to free up storage space after seeing this thread - 126f5be. The workflow is still running, but it is already past the step that used to error out. Overall, it freed 29G from the root folder. Note that the step "Run QuaC system testing - WGS mode AND no prior QC data" of the workflow consumed ~7G.
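The general idea of the cleanup can be sketched as below. This is only an illustration under assumptions: the helper name `free_space` and the placeholder path are hypothetical, and the directories actually removed are whatever 126f5be specifies, not anything listed here.

```python
import shutil
from pathlib import Path


def free_space(paths, probe="/"):
    """Delete each existing directory in `paths`.

    Returns the change in free bytes on the filesystem holding `probe`.
    The number is approximate, since other processes may write concurrently.
    """
    before = shutil.disk_usage(probe).free
    for p in map(Path, paths):
        if p.is_dir():
            # ignore_errors=True: skip files that cannot be removed instead of aborting
            shutil.rmtree(p, ignore_errors=True)
    after = shutil.disk_usage(probe).free
    return after - before


if __name__ == "__main__":
    # Hypothetical placeholder; replace with directories the job does not need.
    candidates = ["/path/to/unused/preinstalled/toolchain"]
    freed = free_space(candidates)
    print(f"Freed about {freed / 1024 ** 3:.1f}G")
```

Running something like this before the system testing step is what matters; freeing space after the container images have been extracted would be too late.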


ManavalanG · 2023-06-20T20:00:16Z

The workflow with the fix to free up storage space before running the system testing succeeded. Note that the ubuntu runner had 61G of used space at the beginning and ended with 45G used.

ManavalanG added the bug label Jun 20, 2023

ManavalanG closed this as completed Jun 20, 2023

ManavalanG mentioned this issue Jun 20, 2023

Adds system testing workflow to github actions #75

Merged

7 tasks
