generated from carpentries/workbench-template-rmd
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
0 parents
commit d9643ec
Showing 24 changed files with 845 additions and 0 deletions.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
--- | ||
title: "Contributor Code of Conduct" | ||
--- | ||
|
||
As contributors and maintainers of this project, | ||
we pledge to follow [The Carpentries Code of Conduct][coc]. | ||
|
||
Instances of abusive, harassing, or otherwise unacceptable behavior | ||
may be reported by following our [reporting guidelines][coc-reporting]. | ||
|
||
|
||
[coc-reporting]: https://docs.carpentries.org/topic_folders/policies/incident-reporting.html | ||
[coc]: https://docs.carpentries.org/topic_folders/policies/code-of-conduct.html |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,79 @@ | ||
--- | ||
title: "Licenses" | ||
--- | ||
|
||
## Instructional Material | ||
|
||
All Carpentries (Software Carpentry, Data Carpentry, and Library Carpentry) | ||
instructional material is made available under the [Creative Commons | ||
Attribution license][cc-by-human]. The following is a human-readable summary of | ||
(and not a substitute for) the [full legal text of the CC BY 4.0 | ||
license][cc-by-legal]. | ||
|
||
You are free: | ||
|
||
- to **Share**---copy and redistribute the material in any medium or format | ||
- to **Adapt**---remix, transform, and build upon the material | ||
|
||
for any purpose, even commercially. | ||
|
||
The licensor cannot revoke these freedoms as long as you follow the license | ||
terms. | ||
|
||
Under the following terms: | ||
|
||
- **Attribution**---You must give appropriate credit (mentioning that your work | ||
is derived from work that is Copyright (c) The Carpentries and, where | ||
practical, linking to <https://carpentries.org/>), provide a [link to the | ||
license][cc-by-human], and indicate if changes were made. You may do so in | ||
any reasonable manner, but not in any way that suggests the licensor endorses | ||
you or your use. | ||
|
||
- **No additional restrictions**---You may not apply legal terms or | ||
technological measures that legally restrict others from doing anything the | ||
license permits. | ||
|
||
Notices: | ||
|
||
* You do not have to comply with the license for elements of the material in | ||
the public domain or where your use is permitted by an applicable exception | ||
or limitation. | ||
* No warranties are given. The license may not give you all of the permissions | ||
necessary for your intended use. For example, other rights such as publicity, | ||
privacy, or moral rights may limit how you use the material. | ||
|
||
## Software | ||
|
||
Except where otherwise noted, the example programs and other software provided | ||
by The Carpentries are made available under the [OSI][osi]-approved [MIT | ||
license][mit-license]. | ||
|
||
Permission is hereby granted, free of charge, to any person obtaining a copy of | ||
this software and associated documentation files (the "Software"), to deal in | ||
the Software without restriction, including without limitation the rights to | ||
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies | ||
of the Software, and to permit persons to whom the Software is furnished to do | ||
so, subject to the following conditions: | ||
|
||
The above copyright notice and this permission notice shall be included in all | ||
copies or substantial portions of the Software. | ||
|
||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | ||
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | ||
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | ||
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | ||
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | ||
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE | ||
SOFTWARE. | ||
|
||
## Trademark | ||
|
||
"The Carpentries", "Software Carpentry", "Data Carpentry", and "Library | ||
Carpentry" and their respective logos are registered trademarks of [Community | ||
Initiatives][ci]. | ||
|
||
[cc-by-human]: https://creativecommons.org/licenses/by/4.0/ | ||
[cc-by-legal]: https://creativecommons.org/licenses/by/4.0/legalcode | ||
[mit-license]: https://opensource.org/licenses/mit-license.html | ||
[ci]: https://communityin.org/ | ||
[osi]: https://opensource.org |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,126 @@ | ||
--- | ||
title: "Running a Parallel Application on the Cluster" | ||
teaching: 10 | ||
exercises: 2 | ||
--- | ||
|
||
:::::::::::::::::::::::::::::: questions | ||
|
||
- What output does the Amdahl code generate? | ||
- Why does parallelizing the amdahl code make it faster? | ||
|
||
:::::::::::::::::::::::::::::::::::::::: | ||
|
||
::::::::::::::::::::::::::::: objectives | ||
|
||
- Run the amdahl parallel code on the cluster | ||
- Note what output is generated, and where it goes | ||
- Predict the trend of execution time vs parallelism | ||
|
||
:::::::::::::::::::::::::::::::::::::::: | ||
|
||
## Introduction | ||
|
||
A high-performance computing cluster offers powerful | ||
computational resources to its users, but taking advantage | ||
of these resources is not always straightforward. The | ||
cluster system does not work in the same way as systems | ||
you may be more familiar with. | ||
|
||
The software we will use in this lesson is a model of | ||
the kind of parallel task that is well-adapted to | ||
high-performance computing resources. It's called "amdahl", | ||
named for Gene Amdahl, a famous computer scientist who | ||
formulated "Amdahl's Law", which describes the advantages and | ||
limitations of parallelism in code execution. | ||
|
||
:::::::::::::::::::::::::::::::: callout | ||
|
||
[Amdahl's Law](https://en.wikipedia.org/wiki/Amdahl%27s_law) is | ||
a statement about how much benefit you can expect to get by | ||
parallelizing a computer program. | ||
|
||
The limitation arises from the fact that, in any application, | ||
there is some fraction of the work to be done which is inherently | ||
serial, and some fraction which is amenable to parallelization. | ||
The law is a quantitative expression of the fact that, by | ||
parallelizing the code, you can only ever make the parallel | ||
part faster; you cannot reduce the execution time of the | ||
serial part. | ||
|
||
As a practical matter, this means that developer effort spent | ||
on parallelization has diminishing returns on the overall | ||
reduction in execution time. | ||
|
||
:::::::::::::::::::::::::::::::::::::::: | ||
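The callout above can be made concrete with the usual statement of Amdahl's Law: if a fraction `p` of the work parallelizes perfectly over `n` processors, the predicted speed-up is `1 / ((1 - p) + p / n)`. The sketch below is only an illustration of that formula; the function name `amdahl_speedup` and the example value `p = 0.8` are assumptions made here, not part of the `amdahl` package.

```python
def amdahl_speedup(parallel_fraction: float, n_procs: int) -> float:
    """Ideal speed-up predicted by Amdahl's Law for a given parallel fraction."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if n_procs < 1:
        raise ValueError("n_procs must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # Only the parallel part is divided among the processors.
    return 1.0 / (serial_fraction + parallel_fraction / n_procs)


if __name__ == "__main__":
    p = 0.8  # assumed parallel fraction, for illustration only
    for n in (1, 2, 4, 8, 1024):
        print(f"{n:5d} processors -> speed-up {amdahl_speedup(p, n):.2f}x")
```

With `p = 0.8`, the speed-up stays below `1 / (1 - p) = 5` no matter how many processors are used, which is the diminishing return the callout describes.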
|
||
## The Amdahl Code | ||
|
||
Download and install it with pip. | ||
Note that `amdahl` depends on MPI, | ||
so make sure that's also available. | ||
|
||
On the HPC Carpentry cluster: | ||
|
||
``` shell | ||
[user@login1 ~]$ module load OpenMPI | ||
[user@login1 ~]$ module load Python | ||
[user@login1 ~]$ pip install amdahl | ||
``` | ||
|
||
## Running It on the Cluster | ||
|
||
After the job below has finished, use the `sacct` command to see its run-time. | ||
The run-time is also recorded in the output itself. | ||
|
||
``` shell | ||
[user@login1 ~]$ nano amdahl_1.sh | ||
``` | ||
|
||
``` bash | ||
#!/bin/bash | ||
#SBATCH -t 00:01:00 # max 1 minute | ||
#SBATCH -p smnodes # max 4 cores | ||
#SBATCH -n 1 # use 1 core | ||
#SBATCH -o amdahl-np1.out # record result | ||
|
||
module load OpenMPI | ||
module load Python | ||
|
||
mpirun amdahl | ||
``` | ||
|
||
``` shell | ||
[user@login1 ~]$ sbatch amdahl_1.sh | ||
``` | ||
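The challenge that follows submits `amdahl_2.sh` and `amdahl_3.sh`, which differ from `amdahl_1.sh` only in the task count and the output file name. One way to produce such variants, sketched here in Python under that assumption, is to fill in a template. The helper name `write_job_scripts` is hypothetical, and the partition name `smnodes` is specific to the lesson's cluster.

```python
from pathlib import Path

# Template mirroring amdahl_1.sh; {n} is the number of MPI tasks.
TEMPLATE = """#!/bin/bash
#SBATCH -t 00:01:00          # max 1 minute
#SBATCH -p smnodes           # partition used on the lesson's cluster
#SBATCH -n {n}               # number of cores to use
#SBATCH -o amdahl-np{n}.out  # record result

module load OpenMPI
module load Python

mpirun amdahl
"""


def write_job_scripts(widths, directory="."):
    """Write one amdahl_<n>.sh per width and return the paths created."""
    paths = []
    for n in widths:
        path = Path(directory) / f"amdahl_{n}.sh"
        path.write_text(TEMPLATE.format(n=n))
        paths.append(path)
    return paths
```

Calling `write_job_scripts([1, 2, 3])` writes the three scripts into the current directory, after which each can be submitted with `sbatch` as shown above.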
|
||
:::::::::::::::::::::::::::::: challenge | ||
|
||
Run the amdahl code with a few (small!) levels | ||
of parallelism. Make a quantitative estimate of | ||
how much faster the code will run with 3 processors | ||
than 2. The naive estimate would be that it would run | ||
1.5× the speed, or equivalently, that it would | ||
complete in 2/3 the time. | ||
|
||
:::::::::::::::: solution | ||
|
||
``` shell | ||
[user@login1 ~]$ sbatch amdahl_1.sh # serial job ~ 25 sec | ||
[user@login1 ~]$ sbatch amdahl_2.sh # 2-way parallel ~ 20 sec | ||
[user@login1 ~]$ sbatch amdahl_3.sh # 3-way parallel ~ 16 sec | ||
``` | ||
|
||
The amdahl code runs faster with 3 processors than with | ||
2, but the speed-up is less than 1.5×. | ||
|
||
::::::::::::::::::::::::: | ||
|
||
:::::::::::::::::::::::::::::::::::::: | ||
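The comparison in the solution can be checked with a little arithmetic. The sketch below uses the approximate timings quoted above (25, 20, and 16 seconds) and, as an assumption, the Amdahl model `T(n) = T(1) * (s + (1 - s) / n)` to back out the serial fraction `s` implied by each run. The function names are illustrative only.

```python
# Approximate wall-clock times (seconds) quoted in the solution above.
timings = {1: 25.0, 2: 20.0, 3: 16.0}


def speedup(t_before: float, t_after: float) -> float:
    """How many times faster the second run is than the first."""
    return t_before / t_after


def serial_fraction(t_serial: float, t_parallel: float, n_procs: int) -> float:
    """Serial fraction s implied by T(n) = T(1) * (s + (1 - s) / n)."""
    ratio = t_parallel / t_serial
    # Solve ratio = s + (1 - s) / n for s.
    return (n_procs * ratio - 1.0) / (n_procs - 1.0)


if __name__ == "__main__":
    print(f"3 vs 2 processors: {speedup(timings[2], timings[3]):.2f}x (naive: 1.50x)")
    for n in (2, 3):
        s = serial_fraction(timings[1], timings[n], n)
        print(f"n={n}: implied serial fraction ~ {s:.2f}")
```

Going from 2 to 3 processors gives a speed-up of 20/16 = 1.25×, short of the naive 1.5×. The two implied serial fractions do not agree exactly, which is expected when the timings are rough estimates.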
|
||
:::::::::::::::::::::::::::::: keypoints | ||
|
||
- The amdahl code is a model of a parallel application | ||
- The execution speed depends on the degree of parallelism | ||
|
||
:::::::::::::::::::::::::::::::::::::::: |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,61 @@ | ||
--- | ||
title: "Amdahl Parallel Runs" | ||
teaching: 10 | ||
exercises: 2 | ||
--- | ||
|
||
:::::::::::::::::::::::::::::: questions | ||
|
||
- How can we collect data on Amdahl run times? | ||
|
||
:::::::::::::::::::::::::::::::::::::::: | ||
|
||
::::::::::::::::::::::::::::: objectives | ||
|
||
- Collect systematic data on the runtime of the amdahl code | ||
|
||
:::::::::::::::::::::::::::::::::::::::: | ||
|
||
## Systematic Data Collection | ||
|
||
Using what we have learned so far, including Snakemake | ||
profiles and rules, we will now compose a Snakefile | ||
that runs the Amdahl example code over a range of | ||
parallel widths. This workflow will generate the | ||
data we will use in the next module to demonstrate | ||
the diminishing returns of increasing parallelism. | ||
|
||
## Write a File | ||
|
||
Compose the Snakemake file that does what we want. | ||
|
||
We can put the widths in a list and iterate over | ||
them. We will use the profile generated previously | ||
to ensure that the jobs run on the cluster. | ||
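The iteration described above can be sketched in plain Python before it is written as Snakemake rules. This is only a stand-in for the eventual Snakefile: the list `WIDTHS`, the output naming pattern, and the helper names are assumptions made for illustration, while the `mpirun -np ... amdahl` command follows the one used elsewhere in this lesson.

```python
# Hypothetical list of parallel widths; adjust to the cluster's limits.
WIDTHS = [1, 2, 3, 4]


def output_name(width: int) -> str:
    """Output file for one run (the naming pattern is an assumption)."""
    return f"amdahl_np{width}.txt"


def run_command(width: int) -> str:
    """Shell command for one run at the given parallel width."""
    return f"mpirun -np {width} amdahl > {output_name(width)}"


# The full set of files the workflow should produce, one per width.
targets = [output_name(w) for w in WIDTHS]

if __name__ == "__main__":
    for w in WIDTHS:
        print(run_command(w))
```

In a Snakefile, a list like `targets` would typically become the input of a target rule, with the width expressed as a wildcard in a single rule rather than as an explicit loop.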
|
||
## Run Snakemake | ||
|
||
Throw the switch! | ||
|
||
:::::::::::::::::::::::::::::: challenge | ||
|
||
Our example has a single parameter, the parallelism, | ||
that we vary. How would you generalize this to arbitrary | ||
parameters? | ||
|
||
:::::::::::::::: solution | ||
|
||
Arbitrary parameters are still finite, so you could | ||
just generate a flat list of all the combinations, and iterate | ||
over that. Or you could generate two lists and do a nested | ||
loop. | ||
|
||
::::::::::::::::::::::::: | ||
|
||
:::::::::::::::::::::::::::::::::::::::: | ||
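Both approaches from the solution can be sketched in a few lines of Python. The second parameter here, `trials`, is a hypothetical example added for illustration; only the widths correspond to this lesson.

```python
from itertools import product

widths = [1, 2, 4]  # parallel widths, as in this lesson
trials = [1, 2]     # hypothetical second parameter: repeat number


def flat_combinations(*parameter_lists):
    """Flat list of every combination, one tuple per run."""
    return list(product(*parameter_lists))


def nested_combinations(outer, inner):
    """The same combinations built with an explicit nested loop."""
    combos = []
    for a in outer:
        for b in inner:
            combos.append((a, b))
    return combos


if __name__ == "__main__":
    for width, trial in flat_combinations(widths, trials):
        print(f"width={width} trial={trial}")
```

The flat version extends to any number of parameter lists without adding another loop level, which is why it tends to scale better as parameters are added.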
|
||
:::::::::::::::::::::::::::::: keypoints | ||
|
||
- A relatively compact Snakemake file collects interesting data. | ||
|
||
:::::::::::::::::::::::::::::::::::::::: |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,83 @@ | ||
#------------------------------------------------------------ | ||
# Values for this lesson. | ||
#------------------------------------------------------------ | ||
|
||
# Which carpentry is this (swc, dc, lc, or cp)? | ||
# swc: Software Carpentry | ||
# dc: Data Carpentry | ||
# lc: Library Carpentry | ||
# cp: Carpentries (to use for instructor training for instance) | ||
# incubator: The Carpentries Incubator | ||
carpentry: 'incubator' | ||
|
||
# Overall title for pages. | ||
title: 'HPC Workflow Management with Snakemake' # FIXME | ||
|
||
# Date the lesson was created (YYYY-MM-DD, this is empty by default) | ||
created: 2023-04-19 | ||
|
||
# Comma-separated list of keywords for the lesson | ||
keywords: 'HPC Carpentry, snakemake, workflows, hpc' | ||
|
||
# Life cycle stage of the lesson | ||
# possible values: pre-alpha, alpha, beta, stable | ||
life_cycle: 'pre-alpha' | ||
|
||
# License of the lesson | ||
license: 'CC-BY 4.0' | ||
|
||
# Link to the source repository for this lesson | ||
source: 'https://github.com/carpentries-incubator/hpc-workflows' | ||
|
||
# Default branch of your lesson | ||
branch: 'main' | ||
|
||
# Who to contact if there are any issues | ||
contact: '[email protected]' | ||
|
||
# Navigation ------------------------------------------------ | ||
# | ||
# Use the following menu items to specify the order of | ||
# individual pages in each dropdown section. Leave blank to | ||
# include all pages in the folder. | ||
# | ||
# Example ------------- | ||
# | ||
# episodes: | ||
# - introduction.md | ||
# - first-steps.md | ||
# | ||
# learners: | ||
# - setup.md | ||
# | ||
# instructors: | ||
# - instructor-notes.md | ||
# | ||
# profiles: | ||
# - one-learner.md | ||
# - another-learner.md | ||
|
||
# Order of episodes in your lesson | ||
episodes: | ||
- amdahl_foundation.md | ||
- snakemake_single.md | ||
- snakemake_multiple.md | ||
- snakemake_cluster.md | ||
- snakemake_profiles.md | ||
- amdahl_snakemake.md | ||
|
||
# Information for Learners | ||
learners: | ||
|
||
# Information for Instructors | ||
instructors: | ||
|
||
# Learner Profiles | ||
profiles: | ||
|
||
# Customisation --------------------------------------------- | ||
# | ||
# This space below is where custom yaml items (e.g. pinning | ||
# sandpaper and varnish versions) should live | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
rule one: | ||
    input: | ||
    output: 'amdahl_cluster.txt' | ||
    resources: | ||
        mpi="mpirun", | ||
        tasks=3 | ||
    shell: | ||
        "module load OpenMPI; mpirun -np {resources.tasks} amdahl > amdahl_cluster.txt" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
rule: | ||
    input: | ||
    output: 'host.txt' | ||
    shell: 'hostname > host.txt' |