
Generating shell scripts instead of functions for wrappers without specifying on the container.yaml #565

Closed
surak opened this issue Jul 18, 2022 · 15 comments · Fixed by #586

Comments

@surak (Contributor) commented Jul 18, 2022

Now that the wrapper_scripts option is enabled by default, shpc still only creates scripts when the container.yaml file has the docker_scripts/singularity_scripts tag.

My suggestion would be that, instead of creating shell functions for calling the containers, simple default shell scripts with the same content be created. That way, Slurm and other tools will be happy to call them.

The current default is to only generate shell scripts if there's enough info in the container.yaml.

Changes needed:

  • The module would need to export a path, which is not done now.
  • Each function (like container_run, container_exec, container_container, container_shell) would be created as a shell script instead of a shell function.

What do you think? That would be better than having to add a script/wrapper entry to every single container in the registry.
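
For illustration (names and paths hypothetical, not actual shpc output), the kind of default script I mean is just the body of today's function written to an executable file, e.g. a default run wrapper along the lines of:

#!/bin/bash
# Hypothetical sketch of a default run wrapper: the same singularity call the
# module currently defines as a shell function, written to an executable file
# so that Slurm and other tools can call it.
singularity ${SINGULARITY_OPTS} run ${SINGULARITY_COMMAND_OPTS} /path/to/container.sif "$@"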

@vsoch (Member) commented Jul 18, 2022

Hmm, this sounds like a bug - the scripts should be generated using the default templates from the settings.yaml. @muffato, was this added with the PR that updated wrapper scripts?

@muffato (Contributor) commented Jul 18, 2022

Yes, it sounds like a bug to me too! Scripts should be created regardless of whether they have a template defined/overridden in singularity_scripts. I can take a look later tonight.

@muffato (Contributor) commented Jul 18, 2022

Hi. I can't reproduce the problem on a clean install.
I'm on fb9a99a

$ git status
On branch main
Your branch is up-to-date with 'origin/main'.

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

        modified:   .gitignore
        modified:   shpc/settings.yml

no changes added to commit (use "git add" and/or "git commit -a")

$ git diff shpc/settings.yml
diff --git a/shpc/settings.yml b/shpc/settings.yml
index 703359384..dc3cd85f4 100644
--- a/shpc/settings.yml
+++ b/shpc/settings.yml
@@ -5,7 +5,7 @@
 # set a default module system
 # Currently lmod and tcl are supported. To request an additional system, 
 # please open an issue https://github.com/singularityhub/singularity-hpc
-module_sys: lmod
+module_sys: tcl
 
 # config editor
 config_editor: vim

$ shpc install quay.io/biocontainers/blast
singularity pull (...)
(...)
Module quay.io/biocontainers/blast:2.12.0--hf3cf87c_4 was created.

$ find modules/
modules/
modules/quay.io
modules/quay.io/biocontainers
modules/quay.io/biocontainers/blast
modules/quay.io/biocontainers/blast/2.12.0--hf3cf87c_4
modules/quay.io/biocontainers/blast/2.12.0--hf3cf87c_4/bin
modules/quay.io/biocontainers/blast/2.12.0--hf3cf87c_4/bin/blast_formatter
modules/quay.io/biocontainers/blast/2.12.0--hf3cf87c_4/bin/blastdb_aliastool
modules/quay.io/biocontainers/blast/2.12.0--hf3cf87c_4/bin/blastdbcheck
modules/quay.io/biocontainers/blast/2.12.0--hf3cf87c_4/bin/blastdbcmd
modules/quay.io/biocontainers/blast/2.12.0--hf3cf87c_4/bin/blastn
modules/quay.io/biocontainers/blast/2.12.0--hf3cf87c_4/bin/blastp
modules/quay.io/biocontainers/blast/2.12.0--hf3cf87c_4/bin/blastx
modules/quay.io/biocontainers/blast/2.12.0--hf3cf87c_4/bin/cleanup-blastdb-volumes.py
modules/quay.io/biocontainers/blast/2.12.0--hf3cf87c_4/bin/convert2blastmask
modules/quay.io/biocontainers/blast/2.12.0--hf3cf87c_4/bin/deltablast
modules/quay.io/biocontainers/blast/2.12.0--hf3cf87c_4/bin/eblast
modules/quay.io/biocontainers/blast/2.12.0--hf3cf87c_4/bin/makeblastdb
modules/quay.io/biocontainers/blast/2.12.0--hf3cf87c_4/bin/psiblast
modules/quay.io/biocontainers/blast/2.12.0--hf3cf87c_4/bin/rpsblast
modules/quay.io/biocontainers/blast/2.12.0--hf3cf87c_4/bin/rpstblastn
modules/quay.io/biocontainers/blast/2.12.0--hf3cf87c_4/bin/tblastn
modules/quay.io/biocontainers/blast/2.12.0--hf3cf87c_4/bin/tblastx
modules/quay.io/biocontainers/blast/2.12.0--hf3cf87c_4/bin/update_blastdb.pl
modules/quay.io/biocontainers/blast/2.12.0--hf3cf87c_4/module.tcl
modules/quay.io/biocontainers/blast/2.12.0--hf3cf87c_4/99-shpc.sh

$ cat modules/quay.io/biocontainers/blast/2.12.0--hf3cf87c_4/bin/blastn
#!/bin/bash

script=`realpath $0`
wrapper_bin=`dirname $script`
moduleDir=`dirname $wrapper_bin`

singularity ${SINGULARITY_OPTS} exec ${SINGULARITY_COMMAND_OPTS} -B $moduleDir/99-shpc.sh:/.singularity.d/env/99-shpc.sh  /path/to/singularity-hpc/containers/quay.io/biocontainers/blast/2.12.0--hf3cf87c_4/quay.io-biocontainers-blast-2.12.0--hf3cf87c_4-sha256:9df91dee10f97405384734f964021feae38fcf68a721315f706be99be9366d86.sif /usr/local/bin/blastn "$@"

@surak: what settings do you have? In particular the wrapper_scripts settings, including its subkeys singularity and docker.
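
(For a quick check, and assuming a repository checkout like the session above - the path will differ if your install reads a user-level settings file - something like this prints the relevant block:)

# Show the wrapper_scripts block from the settings file used above; adjust the
# path if your settings live elsewhere.
grep -A 5 'wrapper_scripts' shpc/settings.yml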

@vsoch (Member) commented Jul 19, 2022

@surak that would be great if you could give the complete steps you took so we can reproduce! I can try as well.

@surak (Contributor, Author) commented Jul 19, 2022

$ find .
.
./nvcr.io
./nvcr.io/nvidia
./nvcr.io/nvidia/tensorflow
./nvcr.io/nvidia/tensorflow/.version
./nvcr.io/nvidia/tensorflow/22.06-tf2-py3
./nvcr.io/nvidia/tensorflow/22.06-tf2-py3/module.lua
./nvcr.io/nvidia/tensorflow/22.06-tf2-py3/99-shpc.sh
./nvcr.io/nvidia/tensorflow/22.06-tf2-py3/nvcr.io-nvidia-tensorflow-22.06-tf2-py3-sha256:ea76aabaa17ee0e308a443fa1e7666f98350865f266b02610a60791bd0feeb40.sif

I can confirm that the quay.io container works with the aliases tag from the container.yaml file, but not with the template functions.

@surak (Contributor, Author) commented Aug 1, 2022

So, I think we are talking about different things.

In quay.io/biocontainers/blast/container.yaml, the aliases are like this:

aliases:
  blast_formatter: /usr/local/bin/blast_formatter
  blastdb_aliastool: /usr/local/bin/blastdb_aliastool
  blastdbcheck: /usr/local/bin/blastdbcheck
  blastdbcmd: /usr/local/bin/blastdbcmd
  blastn: /usr/local/bin/blastn
  blastp: /usr/local/bin/blastp
  blastx: /usr/local/bin/blastx
  cleanup-blastdb-volumes.py: /usr/local/bin/cleanup-blastdb-volumes.py
  convert2blastmask: /usr/local/bin/convert2blastmask
  deltablast: /usr/local/bin/deltablast
  eblast: /usr/local/bin/eblast
  makeblastdb: /usr/local/bin/makeblastdb
  psiblast: /usr/local/bin/psiblast
  rpsblast: /usr/local/bin/rpsblast
  rpstblastn: /usr/local/bin/rpstblastn
  tblastn: /usr/local/bin/tblastn
  tblastx: /usr/local/bin/tblastx
  update_blastdb.pl: /usr/local/bin/update_blastdb.pl

Which makes sense: they are aliases to binaries on the container's filesystem.

This is related to the aliases generated in the template. For example, here: https://github.com/singularityhub/singularity-hpc/blob/main/shpc/main/modules/templates/singularity.lua, where you have:

 - {|module_name|}-run:
       singularity run {% if features.gpu %}{{ features.gpu }} {% endif %}{% if features.home %}-B {{ features.home }} --home {{ features.home }} {% endif %}{% if features.x11 %}-B {{ features.x11 }} {% endif %}{% if settings.environment_file %}-B <moduleDir>/{{ settings.environment_file }}:/.singularity.d/env/{{ settings.environment_file }}{% endif %} {% if settings.bindpaths %}-B {{ settings.bindpaths }} {% endif %}<container> "$@"
 - {|module_name|}-shell:
       singularity shell -s {{ settings.singularity_shell }} {% if features.gpu %}{{ features.gpu }} {% endif %}{% if features.home %}-B {{ features.home }} --home {{ features.home }} {% endif %}{% if features.x11 %}-B {{ features.x11 }} {% endif %}{% if settings.environment_file %}-B <moduleDir>/{{ settings.environment_file }}:/.singularity.d/env/{{ settings.environment_file }}{% endif %} {% if settings.bindpaths %}-B {{ settings.bindpaths }} {% endif %}<container>
 - {|module_name|}-exec:
       singularity exec {% if features.gpu %}{{ features.gpu }} {% endif %}{% if features.home %}-B {{ features.home }} --home {{ features.home }} {% endif %}{% if features.x11 %}-B {{ features.x11 }} {% endif %}{% if settings.environment_file %}-B <moduleDir>/{{ settings.environment_file }}:/.singularity.d/env/{{ settings.environment_file }}{% endif %} {% if settings.bindpaths %}-B {{ settings.bindpaths }} {% endif %}<container> "$@"
 - {|module_name|}-inspect-runscript:
       singularity inspect -r <container>
 - {|module_name|}-inspect-deffile:
       singularity inspect -d <container>
 - {|module_name|}-container:
       echo "$SINGULARITY_CONTAINER"

This creates shell functions when loading the module, and those won't work in Slurm.
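
A quick way to see the problem (the module and alias names, and the output, are only approximate): the function exists only in the current shell, while srun execs a new process and looks the command up on PATH:

$ module load quay.io/biocontainers/blast/2.12.0--hf3cf87c_4
$ type blast-run                # hypothetical alias/function name
blast-run is a function         # defined only in this shell
$ srun blast-run -h             # srun cannot exec a shell function
slurmstepd: error: execve(): blast-run: No such file or directory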

So if I understand it correctly, the "bug" is in the TCL template that generates the module file, like here:

set execCmd "{{ command }} \${PODMAN_OPTS} run -i{% if settings.enable_tty %}t{% endif %} \${PODMAN_COMMAND_OPTS} -u `id -u`:`id -g` --rm {% if settings.environment_file %} --env-file ${moduleDir}/{{ settings.environment_file }}{% endif %} {% if settings.bindpaths %}-v {{ settings.bindpaths }}{% endif %}{% if features.home %}-v {{ features.home }} {% endif %} -v $workdir -w $workdir"

and here:

set shellCmd "singularity \${SINGULARITY_OPTS} shell \${SINGULARITY_COMMAND_OPTS} -s {{ settings.singularity_shell }} {% if features.gpu %}{{ features.gpu }} {% endif %}{% if features.home %}-B {{ features.home | replace("$", "\$") }} --home {{ features.home | replace("$", "\$") }} {% endif %}{% if features.x11 %}-B {{ features.x11 | replace("$", "\$") }} {% endif %}{% if settings.environment_file %}-B ${moduleDir}/{{ settings.environment_file }}:/.singularity.d/env/{{ settings.environment_file }}{% endif %} {% if settings.bindpaths %}-B {{ settings.bindpaths }}{% endif %} ${containerPath}"

@muffato (Contributor) commented Aug 1, 2022

Hi @surak. Right, I think I understand, and your original message is becoming clearer. This issue is about ${module_name}-(run|shell|inspect-runscript|inspect-deffile|container) being shell functions even when the wrapper scripts option is turned on?
I've always seen it like this in shpc, and I don't know if it was designed this way on purpose. @vsoch?
I definitely think that -run and -shell should be treated like the command aliases. Maybe the case for -inspect-* and -container is less strong, but it might be simpler to treat them all the same.

@vsoch (Member) commented Aug 1, 2022

Ah yes, now I understand the issue - correct, they indeed should be wrapper scripts and not aliases; it's just an oversight. I think I didn't change them because I suspected the use case for using them in a workflow wasn't as strong, and the aliases would work most of the time.

@surak (Contributor, Author) commented Aug 1, 2022

> Hi @surak. Right, I think I understand, and your original message is becoming clearer. This issue is about ${module_name}-(run|shell|inspect-runscript|inspect-deffile|container) being shell functions even when the wrapper scripts option is turned on?
> I've always seen it like this in shpc, and I don't know if it was designed this way on purpose. @vsoch?
> I definitely think that -run and -shell should be treated like the command aliases. Maybe the case for -inspect-* and -container is less strong, but it might be simpler to treat them all the same.

Exactly. The thing is, Slurm won't accept shell functions as executables.

So, right now, to run the tensorflow container installed with shpc install nvcr.io/nvidia/tensorflow, I have to create a Slurm submission script that contains this:

srun --mpi=pspmix --cpu_bind=none,v env PMIX_SECURITY_MODE=native \
     singularity ${SINGULARITY_OPTS} exec --nv ${SINGULARITY_COMMAND_OPTS} \
     /p/project/ccstao/cstao05/easybuild/juwelsbooster/modules/containers/nvcr.io/nvidia/tensorflow/22.06-tf2-py3/nvcr.io-nvidia-tensorflow-22.06-tf2-py3-sha256:ea76aabaa17ee0e308a443fa1e7666f98350865f266b02610a60791bd0feeb40.sif  \
     python  MYCODE.py

If that looks like tensorflow-run from the module, it's because it is. And it kind of defeats the purpose of having a module for a container, right? I could simply use singularity/apptainer directly in this case.

With a shell script instead of a function, I could simply put something like this in the batch submission script:

module load nvcr.io/nvidia/tensorflow/22.06-tf2-py3
srun tensorflow-run mycode.py

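For reference, a tensorflow-run wrapper script following the same pattern as the generated alias wrappers (like the blastn one shown earlier) would be enough to make that work - this is only a sketch, with the container path abbreviated:

#!/bin/bash
# Hypothetical tensorflow-run wrapper, modeled on the generated alias wrappers;
# the container path is abbreviated.
script=$(realpath "$0")
wrapper_bin=$(dirname "$script")
moduleDir=$(dirname "$wrapper_bin")

singularity ${SINGULARITY_OPTS} run --nv ${SINGULARITY_COMMAND_OPTS} \
    -B $moduleDir/99-shpc.sh:/.singularity.d/env/99-shpc.sh \
    /path/to/containers/nvcr.io/nvidia/tensorflow/22.06-tf2-py3/nvcr.io-nvidia-tensorflow-22.06-tf2-py3-sha256:...sif "$@"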

@vsoch (Member) commented Aug 1, 2022

Ah indeed, run should definitely be a wrapper script, not an alias!

@vsoch (Member) commented Aug 1, 2022

I'm pretty deep into the remote registry refactor, but I can fix this up right after!

@vsoch (Member) commented Aug 30, 2022

Will work on this soon!

@vsoch (Member) commented Aug 31, 2022

Heyo! Nothing is ready yet, but I just wanted to update you that I have a proof of concept - just for one command (a singularity shell, e.g. "python-shell"). I think this approach should work, so I should be able to do run/exec/container/inspects and then the same for docker, and I probably just need a few more evenings for a PR. So stay tuned! Here is the progress:

https://github.com/singularityhub/singularity-hpc/compare/shell-script-wrappers?expand=1

I refactored the wrapper logic into clearer generation functions and put them in their own module, and the singularity/docker-specific templates for the various commands will live in the wrappers templates folder, named for their respective container technology. So WIP - more soon (probably not tonight!)

@vsoch (Member) commented Aug 31, 2022

Okay, here is a first shot: #586. This should generate the container command wrapper scripts, akin to the alias ones. The Jinja2 is a bit hard to write, so review/additional testing would be greatly appreciated!

vsoch closed this as completed in #586 on Sep 5, 2022
@surak (Contributor, Author) commented Sep 5, 2022

Wooohooo! Thanks!
