RP fails on Amarel due to SLURM old version (--nodefile not recognized) #2451

Closed
AymenFJA opened this issue Oct 12, 2021 · 4 comments

Comments

@AymenFJA
Contributor

With PR #2448, RP fails on Amarel for the following reason:

/usr/bin/srun: unrecognized option '--nodefile=/home/afa64/radical.pilot.sandbox/rp.session.amarel2.amarel.rutgers.edu.afa64.018907.0010/pilot.0000/task.000013//task.000013.nodes'

The cause is clear: Amarel runs an old version of SLURM (18.08.8), and according to the archived documentation (https://slurm.schedmd.com/archive/slurm-18.08.8/srun.html) its srun does not support the --nodefile flag. How should we proceed? Should we maintain backward compatibility with this old SLURM version?
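
One way to tell whether a given srun accepts the flag, without hard-coding version numbers, might be to look at its help text. This is only a minimal sketch, assuming `srun --help` lists the supported long options; the function names `option_in_help` and `srun_supports` are hypothetical and not part of RP:

```python
import re
import subprocess


def option_in_help(help_text: str, option: str) -> bool:
    """Return True if `option` (e.g. '--nodefile') appears in `help_text`
    as a complete long option, not as a prefix of a longer one."""
    pattern = re.escape(option) + r"(?![\w-])"
    return re.search(pattern, help_text) is not None


def srun_supports(option: str, srun: str = "srun") -> bool:
    """Run `<srun> --help` and report whether `option` is listed.

    Hypothetical helper: returns False if srun cannot be executed
    or does not answer in time.
    """
    try:
        result = subprocess.run([srun, "--help"], capture_output=True,
                                text=True, timeout=10)
    except (OSError, subprocess.TimeoutExpired):
        return False
    # Some tools print help on stderr, so check both streams.
    return option_in_help(result.stdout + result.stderr, option)
```

A launcher could call `srun_supports("--nodefile")` once at startup and pick its node-selection flag accordingly.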

@andre-merzky
Member

Per the discussion on the devel call, we should switch to --nodelist, as it is faster (less file I/O). We are limited by the maximum command-line length, though; beyond that limit we need to switch to --nodefile for the command to work at all, and in that case Amarel is out of luck. We should address that case only once a use case or user requests it. Mikhael also suggested looking into environment variables to work around the command-length limit.
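
The selection logic described above could be sketched roughly as follows. This is not RP's implementation: the function name `srun_node_option`, the `DEFAULT_LIMIT` threshold, and the one-node-per-line file layout are all assumptions made for illustration.

```python
import os
from typing import List, Optional

# Assumed conservative threshold (in bytes) for one command-line argument.
# This is an illustrative value, not a measured system limit.
DEFAULT_LIMIT = 100_000


def srun_node_option(nodes: List[str], nodefile_path: str,
                     limit: Optional[int] = None) -> str:
    """Return the srun node-selection argument for `nodes`.

    Prefer an inline --nodelist (no file I/O). If that argument would
    exceed `limit` bytes, write the nodes to `nodefile_path` and return
    a --nodefile argument instead.
    """
    if limit is None:
        limit = DEFAULT_LIMIT

    inline = "--nodelist=" + ",".join(nodes)
    if len(inline.encode()) <= limit:
        return inline

    # Fallback: the node list is too long for the command line.
    # Assumed file layout: one node name per line.
    with open(nodefile_path, "w") as fh:
        fh.write("\n".join(nodes) + "\n")
    return "--nodefile=" + nodefile_path
```

On a system whose srun rejects --nodefile, the fallback branch would still fail, which is the case deferred above until someone actually needs it.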

@AymenFJA : what is the priority on this?

@AymenFJA
Contributor Author

@andre-merzky medium!

@AymenFJA
Contributor Author

Hello @andre-merzky. Any news regarding this issue? Thanks.

@AymenFJA
Contributor Author

This is related to #2480 and it is solved now! I will reopen it if something comes up.
