A minimal working example of DMTCP checkpoint-restart inside a Singularity container.
You'll need to have Singularity installed locally.
make shub
make
make demonstrate
make test
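Before running the targets above, it can help to confirm that the prerequisites are actually on your PATH. The snippet below is a minimal sketch assuming a POSIX shell; `have_tool` is a hypothetical helper written for illustration and is not part of this repository.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repository):
# succeeds if the named tool can be found on PATH, fails otherwise.
have_tool() {
  command -v "$1" >/dev/null 2>&1
}

# Report on each tool the make targets rely on, without aborting on a miss.
for tool in singularity make; do
  if have_tool "$tool"; then
    echo "found: $tool"
  else
    echo "missing: $tool"
  fi
done
```

If `singularity` is reported as missing, install it before trying any of the make targets.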
This example uses Singularity v2.6.x and DMTCP v3.0.0 (e.g., from the tip of master circa December 2018).
I didn't play around with other versions of either tool.
Unfortunately, this example doesn't seem to be totally portable.
I was able to get the example to run on my own laptop just fine.
In order to get Singularity checkpoint/restart to work on CircleCI's virtual machines (i.e., the machine executor), I had to disable a runtime assert in the source for DMTCP (see here).
On a Michigan State University High Performance Computing Center development node, which runs CentOS 7 and uses Singularity v2.5.2-dist, the demonstration currently crashes out at the first attempted checkpoint.
The iCER staff put together a nice tutorial of DMTCP checkpointing on the HPCC here.
With some further finessing along those lines, checkpointing Singularity containers on our HPCC might be possible.
.circleci materials were pilfered from Container Tool's example builder for Singularity containers using Circle Continuous Integration.
Thanks @vsoch!