Commit

source commit: 3d286cb
actions-user committed Feb 12, 2024
0 parents commit d9643ec
Showing 24 changed files with 845 additions and 0 deletions.
13 changes: 13 additions & 0 deletions CODE_OF_CONDUCT.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
---
title: "Contributor Code of Conduct"
---

As contributors and maintainers of this project,
we pledge to follow [The Carpentries Code of Conduct][coc].

Instances of abusive, harassing, or otherwise unacceptable behavior
may be reported by following our [reporting guidelines][coc-reporting].


[coc-reporting]: https://docs.carpentries.org/topic_folders/policies/incident-reporting.html
[coc]: https://docs.carpentries.org/topic_folders/policies/code-of-conduct.html
79 changes: 79 additions & 0 deletions LICENSE.md
@@ -0,0 +1,79 @@
---
title: "Licenses"
---

## Instructional Material

All Carpentries (Software Carpentry, Data Carpentry, and Library Carpentry)
instructional material is made available under the [Creative Commons
Attribution license][cc-by-human]. The following is a human-readable summary of
(and not a substitute for) the [full legal text of the CC BY 4.0
license][cc-by-legal].

You are free:

- to **Share**---copy and redistribute the material in any medium or format
- to **Adapt**---remix, transform, and build upon the material

for any purpose, even commercially.

The licensor cannot revoke these freedoms as long as you follow the license
terms.

Under the following terms:

- **Attribution**---You must give appropriate credit (mentioning that your work
is derived from work that is Copyright (c) The Carpentries and, where
practical, linking to <https://carpentries.org/>), provide a [link to the
license][cc-by-human], and indicate if changes were made. You may do so in
any reasonable manner, but not in any way that suggests the licensor endorses
you or your use.

- **No additional restrictions**---You may not apply legal terms or
technological measures that legally restrict others from doing anything the
license permits.

Notices:

* You do not have to comply with the license for elements of the material in
the public domain or where your use is permitted by an applicable exception
or limitation.
* No warranties are given. The license may not give you all of the permissions
necessary for your intended use. For example, other rights such as publicity,
privacy, or moral rights may limit how you use the material.

## Software

Except where otherwise noted, the example programs and other software provided
by The Carpentries are made available under the [OSI][osi]-approved [MIT
license][mit-license].

Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the "Software"), to deal in
the Software without restriction, including without limitation the rights to
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
of the Software, and to permit persons to whom the Software is furnished to do
so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

## Trademark

"The Carpentries", "Software Carpentry", "Data Carpentry", and "Library
Carpentry" and their respective logos are registered trademarks of [Community
Initiatives][ci].

[cc-by-human]: https://creativecommons.org/licenses/by/4.0/
[cc-by-legal]: https://creativecommons.org/licenses/by/4.0/legalcode
[mit-license]: https://opensource.org/licenses/mit-license.html
[ci]: https://communityin.org/
[osi]: https://opensource.org
126 changes: 126 additions & 0 deletions amdahl_foundation.md
@@ -0,0 +1,126 @@
---
title: "Running a Parallel Application on the Cluster"
teaching: 10
exercises: 2
---

:::::::::::::::::::::::::::::: questions

- What output does the Amdahl code generate?
- Why does parallelizing the amdahl code make it faster?

::::::::::::::::::::::::::::::::::::::::

::::::::::::::::::::::::::::: objectives

- Run the amdahl parallel code on the cluster
- Note what output is generated, and where it goes
- Predict the trend of execution time vs parallelism

::::::::::::::::::::::::::::::::::::::::

## Introduction

A high-performance computing cluster offers powerful
computational resources to its users, but taking advantage
of these resources is not always straightforward. The
cluster system does not work in the same way as systems
you may be more familiar with.

The software we will use in this lesson is a model of
the kind of parallel task that is well-adapted to
high-performance computing resources. It's called "amdahl",
named for Gene Amdahl, a famous computer scientist who
coined "Amdahl's Law", which is about the advantages and
limitations of parallelism in code execution.

:::::::::::::::::::::::::::::::: callout

[Amdahl's Law](https://en.wikipedia.org/wiki/Amdahl%27s_law) is
a statement about how much benefit you can expect to get by
parallelizing a computer program.

The limitation arises from the fact that, in any application,
there is some fraction of the work to be done which is inherently
serial, and some fraction which is amenable to parallelization.
The law is a quantitative expression of the fact that
parallelizing the code can only ever make the parallel
part faster; it cannot reduce the execution time of the
serial part.

As a practical matter, this means that developer effort spent
on parallelization has diminishing returns on the overall
reduction in execution time.

::::::::::::::::::::::::::::::::::::::::
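
The statement in the callout above can be made concrete with a few lines of
Python. This is a minimal sketch, not part of the `amdahl` package: the
function name `amdahl_speedup` and the 20% serial fraction are assumptions
chosen for illustration. It evaluates the usual form of Amdahl's Law,
speed-up = 1 / (s + (1 - s) / n), where `s` is the serial fraction and `n`
is the number of processors.

```python
def amdahl_speedup(serial_fraction: float, n_procs: int) -> float:
    """Ideal speed-up predicted by Amdahl's Law.

    serial_fraction: share of the single-processor run time that
                     cannot be parallelized (between 0 and 1).
    n_procs:         number of processors sharing the parallel part.
    """
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    if n_procs < 1:
        raise ValueError("n_procs must be at least 1")
    parallel_fraction = 1.0 - serial_fraction
    # Only the parallel part is divided among the processors.
    return 1.0 / (serial_fraction + parallel_fraction / n_procs)


# Hypothetical example: 20% of the work is inherently serial.
for n in (1, 2, 4, 8, 1024):
    print(f"{n:5d} processors -> speed-up {amdahl_speedup(0.2, n):.2f}x")
```

With a serial fraction of 0.2, the speed-up can never exceed 1 / 0.2 = 5,
no matter how many processors are added, which is the diminishing return
described above.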

## The Amdahl Code

Download and install the `amdahl` package with pip.
Note that `amdahl` depends on MPI,
so make sure that an MPI implementation is also available.

On the HPC Carpentry cluster:

``` shell
[user@login1 ~]$ module load OpenMPI
[user@login1 ~]$ module load Python
[user@login1 ~]$ pip install amdahl
```

## Running It on the Cluster

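Create a batch script that runs `amdahl` on a single core,
and submit it with `sbatch`. Once the job has finished,
use the `sacct` command to see the run-time.
The run-time is also recorded in the output itself.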

``` shell
[user@login1 ~]$ nano amdahl_1.sh
```

``` bash
#!/bin/bash
#SBATCH -t 00:01:00 # max 1 minute (HH:MM:SS)
#SBATCH -p smnodes # max 4 cores
#SBATCH -n 1 # use 1 core
#SBATCH -o amdahl-np1.out # record result

module load OpenMPI
module load Python

mpirun amdahl
```

``` shell
[user@login1 ~]$ sbatch amdahl_1.sh
```

:::::::::::::::::::::::::::::: challenge

Run the amdahl code with a few (small!) levels
of parallelism. Make a quantitative estimate of
how much faster the code will run with 3 processors
than with 2. The naive estimate is that it would run
at 1.5× the speed or, equivalently, that it would
complete in 2/3 of the time.

:::::::::::::::: solution

``` shell
[user@login1 ~]$ sbatch amdahl_1.sh # serial job ~ 25 sec
[user@login1 ~]$ sbatch amdahl_2.sh # 2-way parallel ~ 20 sec
[user@login1 ~]$ sbatch amdahl_3.sh # 3-way parallel ~ 16 sec
```

The amdahl code runs faster with 3 processors than with
2, but the speed-up is less than 1.5×.
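
To put a number on that comparison, the short Python sketch below works out
the observed speed-up from the approximate timings quoted in this solution.
The timings are rough values, and the helper name `speedup` is an assumption
made for illustration.

```python
# Approximate run times in seconds, keyed by number of processors,
# taken from the comments in the solution.
run_times = {1: 25.0, 2: 20.0, 3: 16.0}


def speedup(times, n_from, n_to):
    """How many times faster the n_to-way run is than the n_from-way run."""
    return times[n_from] / times[n_to]


observed = speedup(run_times, 2, 3)  # 20 s / 16 s
naive = 3 / 2                        # ideal scaling from 2 to 3 processors

print(f"observed speed-up, 2 -> 3 processors: {observed:.2f}x")
print(f"naive estimate:                       {naive:.2f}x")
```

The observed ratio falls short of the naive estimate because only the
parallel part of the work benefits from the extra processor.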

:::::::::::::::::::::::::

::::::::::::::::::::::::::::::::::::::

:::::::::::::::::::::::::::::: keypoints

- The amdahl code is a model of a parallel application
- The execution speed depends on the degree of parallelism

::::::::::::::::::::::::::::::::::::::::
61 changes: 61 additions & 0 deletions amdahl_snakemake.md
@@ -0,0 +1,61 @@
---
title: "Amdahl Parallel Runs"
teaching: 10
exercises: 2
---

:::::::::::::::::::::::::::::: questions

- How can we collect data on Amdahl run times?

::::::::::::::::::::::::::::::::::::::::

::::::::::::::::::::::::::::: objectives

- Collect systematic data on the runtime of the amdahl code

::::::::::::::::::::::::::::::::::::::::

## Systematic Data Collection

Using what we have learned so far, including Snakemake
profiles and rules, we will now compose a Snakefile
that runs the Amdahl example code over a range of
parallel widths. This workflow will generate the
data we will use in the next module to demonstrate
the diminishing returns of increasing parallelism.

## Write a File

Compose the Snakemake file that does what we want.

We can put the widths in a list and iterate over
them. We will use the profile generated previously
to ensure that the jobs run on the cluster.
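
As a rough illustration of "put the widths in a list and iterate over them",
the plain-Python sketch below builds one output file name and one command
per width. The widths, the `amdahl_np{n}.txt` naming scheme, and the helper
names are assumptions made for this example; the `mpirun -np ... amdahl`
command mirrors the one used in `files/Snakefile_amdahl_cluster`.

```python
# Hypothetical parallel widths; keep them within what your partition allows.
widths = [1, 2, 3, 4]


def output_name(n_tasks: int) -> str:
    """File that will hold the amdahl output for a given width (assumed naming)."""
    return f"amdahl_np{n_tasks}.txt"


def command(n_tasks: int) -> str:
    """Shell command that runs amdahl with n_tasks MPI processes."""
    return f"mpirun -np {n_tasks} amdahl > {output_name(n_tasks)}"


# One target file per width: this is the list a workflow would request.
targets = [output_name(n) for n in widths]

for n in widths:
    print(command(n))
```

Inside a Snakefile, the same idea is usually expressed by making the width a
wildcard in the output file name and asking for the whole list of targets in
a first rule; Snakemake's `expand` helper can generate such a list.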

## Run Snakemake

Throw the switch!

:::::::::::::::::::::::::::::: challenge

Our example has a single parameter, the parallelism,
that we vary. How would you generalize this to arbitrary
parameters?

:::::::::::::::: solution

Arbitrary parameters are still finite, so you could
just generate a flat list of all the combinations, and iterate
over that. Or you could generate two lists and do a nested
loop.
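
Both approaches can be sketched in a few lines of Python. The second
parameter, `repeats`, is a made-up example; only the parallel width comes
from this lesson.

```python
from itertools import product

# Hypothetical parameter lists.
widths = [1, 2, 4]
repeats = [1, 2]

# Option 1: a flat list of every (width, repeat) combination.
combinations = list(product(widths, repeats))

# Option 2: the equivalent nested loop over the two lists.
nested = []
for w in widths:
    for r in repeats:
        nested.append((w, r))

for w, r in combinations:
    print(f"width={w} repeat={r}")
```

Both options visit the same combinations in the same order; the flat list
is often easier to extend when a third or fourth parameter is added.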

:::::::::::::::::::::::::

::::::::::::::::::::::::::::::::::::::::

:::::::::::::::::::::::::::::: keypoints

- A relatively compact Snakemake file can collect run-time data over a range of parallel widths.

::::::::::::::::::::::::::::::::::::::::
83 changes: 83 additions & 0 deletions config.yaml
@@ -0,0 +1,83 @@
#------------------------------------------------------------
# Values for this lesson.
#------------------------------------------------------------

# Which carpentry is this (swc, dc, lc, or cp)?
# swc: Software Carpentry
# dc: Data Carpentry
# lc: Library Carpentry
# cp: Carpentries (to use for instructor training for instance)
# incubator: The Carpentries Incubator
carpentry: 'incubator'

# Overall title for pages.
title: 'HPC Workflow Management with Snakemake' # FIXME

# Date the lesson was created (YYYY-MM-DD, this is empty by default)
created: 2023-04-19

# Comma-separated list of keywords for the lesson
keywords: 'HPC Carpentry, snakemake, workflows, hpc'

# Life cycle stage of the lesson
# possible values: pre-alpha, alpha, beta, stable
life_cycle: 'pre-alpha'

# License of the lesson
license: 'CC-BY 4.0'

# Link to the source repository for this lesson
source: 'https://github.com/carpentries-incubator/hpc-workflows'

# Default branch of your lesson
branch: 'main'

# Who to contact if there are any issues
contact: '[email protected]'

# Navigation ------------------------------------------------
#
# Use the following menu items to specify the order of
# individual pages in each dropdown section. Leave blank to
# include all pages in the folder.
#
# Example -------------
#
# episodes:
# - introduction.md
# - first-steps.md
#
# learners:
# - setup.md
#
# instructors:
# - instructor-notes.md
#
# profiles:
# - one-learner.md
# - another-learner.md

# Order of episodes in your lesson
episodes:
- amdahl_foundation.md
- snakemake_single.md
- snakemake_multiple.md
- snakemake_cluster.md
- snakemake_profiles.md
- amdahl_snakemake.md

# Information for Learners
learners:

# Information for Instructors
instructors:

# Learner Profiles
profiles:

# Customisation ---------------------------------------------
#
# This space below is where custom yaml items (e.g. pinning
# sandpaper and varnish versions) should live


8 changes: 8 additions & 0 deletions files/Snakefile_amdahl_cluster
@@ -0,0 +1,8 @@
rule one:
    input:
    output: 'amdahl_cluster.txt'
    resources:
        mpi="mpirun",
        tasks=3
    shell:
        "module load OpenMPI; mpirun -np {resources.tasks} amdahl > amdahl_cluster.txt"
4 changes: 4 additions & 0 deletions files/Snakefile_cluster
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
rule:
    input:
    output: 'host.txt'
    shell: 'hostname > host.txt'