Add --exclusive and --mem=0 to GPU example jobscripts #125

Status: Open (1 commit to merge into base branch production)

Danzelot (Contributor) commented Aug 7, 2023

Many users run the example script on small-g and dev-g without realising that these partitions do not allocate nodes exclusively and have different default memory settings.

Danzelot requested a review from olouant on August 7, 2023, 10:43
klust (Contributor) commented Aug 16, 2023

I'd use --mem=480g instead of --mem=0, because that would also protect against memory leaks in the OS, which we have already experienced.

olouant (Contributor) commented Aug 16, 2023

I would comment out --exclusive to avoid problems with users who run on 1 GPU on small-g/dev-g and copy-paste the example without properly checking their job script. With the line commented out, users have to edit the script to activate exclusive mode. We don't want tickets complaining about being billed at an 8x rate because of our examples.

klust (Contributor) commented Aug 16, 2023

In fact, Fredrik wants us to go after such users and is working on tools to detect this. On the other hand, he wants jobs that need 4 nodes or fewer to make more use of the small partitions, to reduce the load on standard in particular. So we should make it very clear what this example is for: only for using full nodes on small-g. I therefore agree with Orian that we should at least add a warning to the job script and/or comment the line out.
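
For illustration, the suggestions so far (an explicit memory request, a warning, and a commented-out --exclusive) could be combined roughly as follows. This is only a sketch under assumptions, not the example from the pull request: the time limit, the 8 GPUs per node, the account placeholder and the application name are all hypothetical and would need to be checked against the actual documentation page.

```shell
#!/bin/bash
# Sketch of a full-node job on small-g. All values are illustrative
# assumptions, not the maintainers' final example.
#
# WARNING: only use this script if the job really needs a whole node.
# With --exclusive you are billed for the full node, even if the job
# uses a single GPU.
#
# Placeholder: replace <project> with your own project account.
#SBATCH --account=<project>
#SBATCH --partition=small-g
#SBATCH --nodes=1
# Assumption: a full node exposes 8 GPUs to Slurm.
#SBATCH --gpus-per-node=8
#SBATCH --time=00:10:00
# Explicit memory request, as suggested above, rather than --mem=0.
#SBATCH --mem=480G
# Deliberately commented out: remove one '#' to request the node exclusively.
##SBATCH --exclusive

# SLURM_JOB_ID and SLURM_JOB_PARTITION are set by Slurm inside a job;
# the fallbacks keep the script runnable outside Slurm for a dry run.
summary="job=${SLURM_JOB_ID:-none} partition=${SLURM_JOB_PARTITION:-none}"
echo "$summary"

# Launch the application here, e.g. (hypothetical binary name):
# srun ./my_gpu_app
```

Because the --exclusive line starts with two hash signs, Slurm ignores it, so a user who copy-pastes the script unchanged does not get an exclusive node by accident.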

olouant (Contributor) commented Aug 16, 2023

OK, then I think a complete rewrite of this page would be better, highlighting how to choose the right partition for the job's requirements, with corresponding job script examples. The same goes for the equivalent page for LUMI-C.

3 participants