Draft lesson outline #1

Closed
reid-a opened this issue Apr 20, 2023 · 6 comments

reid-a commented Apr 20, 2023

In light of the decision made at the April co-working meeting, namely that we should never run the Amdahl code on the head node or on learners' laptops, some revisions to the lesson content are appropriate relative to the old lesson.

Initial Amdahl runs will be done on the cluster via batch files, then Snakemake basics will be explored with "Hello, world" executables on the head node, and then the two will be combined. Graduates of HPC Intro will know how to do the first thing, but realistically we should add some time for refreshing this knowledge for actual human learners with imperfect retention.

So a high-level version of the set of tasks now maybe looks something like this:

  1. Run the amdahl code on the cluster. Learners should be able to identify what output files the code generates, and know what data is in them.
  2. Introduce the Snakemake tool, and construct a "Hello, world" snakefile. Learners should be able to correctly predict whether the rule in the snakefile will fire or not, based on the presence and currency of the output file.
  3. Generate a multi-rule snakefile, with a dependency, to introduce the concept of the task graph and illustrate the order of operations. We can continue to use "Hello, world" level executables here. Learners should be able to correctly predict which snakemake rules will fire on an invocation, and in what order, based on the presence and currency of the output targets.
  4. Generate a single-rule snakefile that runs on the cluster. At first, manually specify all the cluster stuff, like the partition name and so forth, to foreground it. Learners should be able to predict how their snakefile will dispatch to the cluster, and predict the location and character of the resulting outputs.
  5. Introduce the cluster config file, and populate it for the local cluster. Repeat the task from the previous step, but with the cluster info implicit in the configuration. Same learner capability, I guess?
  6. Dispatch multiple jobs to the cluster via snakemake. Observe that the snakemake process itself remains active on the head node until the jobs are finished. (Deal with the thing where a cluster rule exits at dispatch-time, but the target doesn't appear until later?) Once this content is more developed, the goal can probably be clarified, beyond the obvious "learners should be able to correctly predict the sequence of operations that will result from running their snakefile", which is the emerging theme here.
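
To make the "predict which rules fire, and in what order" goal from items 2 and 3 concrete, here is a rough Python sketch of a make-style staleness check over a two-rule task graph. This is a simplified model for discussion only, not Snakemake's actual decision procedure: it treats every rule as a requested target and looks only at file presence and modification times. The rule names, file names, and the `rules_to_run` helper are all hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical toy rules: name -> (input files, output file).
RULES = {
    "hello": ([], "hello.txt"),
    "shout": (["hello.txt"], "shout.txt"),
}


def rules_to_run(rules, mtimes):
    """Return the names of the rules that would fire, in dependency order.

    `mtimes` maps each existing file to its modification time; a file
    that is absent from the mapping is treated as missing on disk.
    """
    producer = {out: name for name, (_, out) in rules.items()}
    # Task graph: each rule depends on the rules that produce its inputs.
    graph = {
        name: {producer[i] for i in inputs if i in producer}
        for name, (inputs, _) in rules.items()
    }
    fired = []
    will_update = set()  # outputs that will be regenerated in this run
    for name in TopologicalSorter(graph).static_order():
        inputs, out = rules[name]
        missing = out not in mtimes
        stale = any(
            i in will_update or mtimes.get(i, 0) > mtimes.get(out, 0)
            for i in inputs
        )
        if missing or stale:
            fired.append(name)
            will_update.add(out)
    return fired
```

With no files present, `rules_to_run(RULES, {})` gives `["hello", "shout"]`. With both outputs present and `shout.txt` newer than `hello.txt`, nothing fires. If `hello.txt` is then touched so that it is newer than `shout.txt`, only `shout` fires.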

From here, the tasks get a bit more murky in my mind, but the two beats to hit include:

  1. Plan and execute the workflow that generates the data needed for the Amdahl plot.
  2. Actually generate the Amdahl plot, and observe and appreciate the diminishing returns to increased parallelism.
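
For reference, the diminishing returns in beat 2 are what Amdahl's law predicts: with a parallelizable fraction p of the work and n workers, the ideal speedup is 1 / ((1 - p) + p / n). A minimal Python sketch follows, assuming an illustrative value of p = 0.8 (not measured from the amdahl code) and a hypothetical helper name `amdahl_speedup`.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Ideal speedup under Amdahl's law: 1 / ((1 - p) + p / n)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# Assumed example value: 80% of the work parallelizes.
P = 0.8
for n in (1, 2, 4, 8, 16, 32):
    print(f"{n:>3} workers: speedup {amdahl_speedup(P, n):.2f}x")
```

Under this assumption, going from 1 to 4 workers gives a 2.5x speedup, but going from 16 to 32 workers only moves it from 4.00x to about 4.44x, and no number of workers can exceed 1 / (1 - p) = 5x.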

ocaisa commented Jun 1, 2023

Need to include localrule to ensure simple things are done locally, only "heavy" work on the cluster (ref. jdblischak/smk-simple-slurm#19)

tkphd commented Jun 1, 2023

(docs on localrule)

reid-a commented Jun 1, 2023

I think I have some of this in hand. It looks like in the absence of a profile, rules are run locally by default, at least in my initial experiments -- snakemake won't try to run sbatch unless you tell it to either via a profile or with the --slurm command-line argument.

ocaisa commented May 16, 2024

@reid-a I think this can be closed?

reid-a commented May 16, 2024

Agreed!

reid-a closed this as completed May 16, 2024