Function #14

Merged
merged 31 commits into from
Dec 25, 2024
d5b7910
Use penguins for whole lesson
carpentries-bot Jun 3, 2023
37e0c70
Update lesson
carpentries-bot Jun 3, 2023
6e7464d
Update functions
carpentries-bot Jun 3, 2023
3875a30
Fix repo address
carpentries-bot Jun 3, 2023
1f405f3
Make sure to source most recent raw code
carpentries-bot Jun 3, 2023
6d9d4a2
Other updates to use penguins
carpentries-bot Jun 3, 2023
e55ae1b
Fix required packages
carpentries-bot Jun 3, 2023
d1950d2
Try to get quarto to stick
carpentries-bot Jun 3, 2023
e13e2ef
Skip compiling quarto report
carpentries-bot Jun 3, 2023
3998e94
Add description of binder to setup
carpentries-bot Jun 3, 2023
aa3903b
Update renv.lock
carpentries-bot Jun 3, 2023
2320795
Update setup
carpentries-bot Jun 3, 2023
553274b
Add section on where to get more info, closes #1
carpentries-bot Jun 3, 2023
945ca74
Add explanation about deleting cache
carpentries-bot Jun 3, 2023
22aa245
Update instructor notes
carpentries-bot Jun 3, 2023
0f68c50
Change from binder to posit
carpentries-bot Jun 6, 2023
602c18a
fix conflicts
drmowinckels Jun 14, 2023
ff94149
draft functions
drmowinckels Jun 14, 2023
4714b31
draft functions
drmowinckels Jun 14, 2023
f628dce
change rproj filename
drmowinckels Jun 14, 2023
84be40f
change rproj filename
drmowinckels Jun 14, 2023
9090f63
add tidyverse incubator lesson link
drmowinckels Jun 14, 2023
70cb64f
fix #6 - switch to tidyr::drop_na over ggplot2:remove_missing
drmowinckels Jun 14, 2023
c18a90b
Merge branch 'main' into function
drmowinckels Jun 14, 2023
45e187a
Rstudio extraction to callout
drmowinckels Jun 19, 2023
ecb4957
Merge branch 'function' of github.com:drmowinckels/targets-workshop i…
drmowinckels Jun 19, 2023
995c5d3
Merge main into drmowinckels/function
joelnitta Dec 24, 2024
4addea0
Fix function loading
joelnitta Dec 24, 2024
1c7ec97
Adjust teaching time
joelnitta Dec 24, 2024
9d37d6d
Fix grammar and code style
joelnitta Dec 24, 2024
29a3840
Add sentence to intro about use of custom fns in `targets`
joelnitta Dec 24, 2024
1 change: 1 addition & 0 deletions config.yaml
@@ -61,6 +61,7 @@ contact: '[email protected]'
episodes:
- introduction.Rmd
- basic-targets.Rmd
- functions.Rmd
- cache.Rmd
- lifecycle.Rmd
- organization.Rmd
3 changes: 2 additions & 1 deletion episodes/basic-targets.Rmd
@@ -137,7 +137,8 @@
penguins_csv_file
```

We will use the `tidyverse` set of packages for loading and manipulating the data. We don't have time to cover all the details about using `tidyverse` now, but if you want to learn more about it, please see the ["Manipulating, analyzing and exporting data with tidyverse" lesson](https://datacarpentry.org/R-ecology-lesson/03-dplyr.html).
We will use the `tidyverse` set of packages for loading and manipulating the data. We don't have time to cover all the details about using `tidyverse` now, but if you want to learn more about it, please see the ["Manipulating, analyzing and exporting data with tidyverse" lesson](https://datacarpentry.org/R-ecology-lesson/03-dplyr.html), or the Carpentries Incubator lesson ["R and the tidyverse for working with datasets"](https://carpentries-incubator.github.io/r-tidyverse-4-datasets/).


Let's load the data with `read_csv()`.

@@ -167,7 +168,7 @@
For the purposes of this analysis, we only need species name, bill length, and bill depth.
In the raw data, the rather technical term "culmen" is used to refer to the bill.

![Illustration of bill (culmen) length and depth. Artwork by @allison_horst.](https://allisonhorst.github.io/palmerpenguins/reference/figures/culmen_depth.png)

Check warning on line 171 in episodes/basic-targets.Rmd (GitHub Actions jobs "Build markdown source files if valid" and "Build Full Site"): [image missing alt-text]: https://allisonhorst.github.io/palmerpenguins/reference/figures/culmen_depth.png

Let's clean up the data to make it easier to use for downstream analyses.
We will also remove any rows with missing data, because this could cause errors for some functions later.
191 changes: 191 additions & 0 deletions episodes/functions.Rmd
@@ -0,0 +1,191 @@
---
title: 'A brief introduction to functions'
teaching: 20
exercises: 1
---

:::::::::::::::::::::::::::::::::::::: questions

- What are functions?
- Why should we know how to write them?
- What are the main components of a function?

::::::::::::::::::::::::::::::::::::::::::::::::

::::::::::::::::::::::::::::::::::::: objectives

- Understand the usefulness of custom functions
- Understand the basic concepts around writing functions

::::::::::::::::::::::::::::::::::::::::::::::::

::::::::::::::::::::::::::::::::::::: {.instructor}

Episode summary: A very brief introduction to functions, for learners who have no experience with them.

::::::::::::::::::::::::::::::::::::::::::::::::

```{r}
#| label: setup
#| echo: FALSE
#| message: FALSE
#| warning: FALSE
library(targets)

if (interactive()) {
  setwd("episodes")
}

source("files/lesson_functions.R")
```

## Create a function

### About functions

We are used to thinking of functions in R as something that comes from a package. You find, install, and use specialized functions from packages to get your work done.

But you can, and arguably should, be writing your own functions too!
Functions are a great way of making it easy to repeat the same operation with different settings.
How many times have you copy-pasted the exact same code in your script, changing only a couple of things (a variable, an input, etc.) before running it again, only to discover later that there was an error in the code?
When you fix it, you need to remember to do so in every place where you copied that code.

By writing functions you can reduce this back and forth and create a more efficient workflow for yourself.
When you find a bug, you fix it in a single place, the function you made, and every subsequent call of that function will use the fixed version.

Furthermore, `targets` makes extensive use of custom functions, so a basic understanding of how they work is very important to successfully using it.

### Writing a function

There is not much difference between writing your own function and writing other code in R; you are still coding with R!
Let's imagine we want to convert the millimeter measurements in the Penguins data to centimeters.

```{r}
#| label: targets-functions-problem
library(palmerpenguins)
library(tidyverse)

penguins |>
  mutate(
    bill_length_cm = bill_length_mm / 10,
    bill_depth_cm = bill_depth_mm / 10
  )

```

This is not a complicated operation, but we might want to make a convenient custom function that can do this conversion for us anyway.

To write a function, you need to use the `function()` function.
Inside its parentheses we list the input arguments of the function, and inside the curly braces `{}` that follow we write what the function will do with those arguments.
The name of the object we assign this to will become the function's name.

```{r}
#| label: targets-functions-skeleton
#| eval: false
my_function <- function(argument1, argument2) {
  # the things the function will do
}
# call the function
my_function(1, "something")
```
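
As a concrete instance of this skeleton, here is a minimal sketch of a function with two arguments, one of which has a default value. The name `scale_by` and its arguments are made up for illustration and are not part of the lesson code.

```{r}
#| label: targets-functions-skeleton-example
# Hypothetical example: `scale_by` is not part of the lesson code
scale_by <- function(x, divisor = 10) {
  # The value of the last evaluated expression is what the function returns
  x / divisor
}

# Call the function, relying on the default divisor
scale_by(50)
# Call it again, overriding the default by naming the argument
scale_by(50, divisor = 100)
```

Note that there is no explicit `return()` here: an R function returns the value of the last expression it evaluates.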

For our mm to cm conversion the function would look like so:

```{r}
#| label: targets-functions-cm
mm2cm <- function(x) {
  x / 10
}
# use it
penguins |>
  mutate(
    bill_length_cm = mm2cm(bill_length_mm),
    bill_depth_cm = mm2cm(bill_depth_mm)
  )
```

Our custom function will now transform any numerical input by dividing it by 10.
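
The function is not tied to the penguins data. As a small sketch (the input values below are made up for illustration), it can be called directly on a single number or on a whole vector:

```{r}
#| label: targets-functions-cm-vector
# Same function as above, repeated so this example is self-contained
mm2cm <- function(x) {
  x / 10
}

# Works on a single number ...
mm2cm(181)
# ... and element-wise on a vector; a missing value stays missing (NA)
mm2cm(c(39.1, 18.7, NA))
```

Because division in R is vectorized, the same one-line function works inside `mutate()`, where each column is passed in as a vector.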

### Make a function from existing code

Often, we already have a piece of code that we'd like to turn into a function.
For instance, we may have copy-pasted a section of code several times and realized that it is repetitive, so a function is in order.
Or we may be converting a workflow to `targets` and need to change a script into a series of functions that `targets` will call.

Recall the code snippet we had to clean our Penguins data:

```{r}
#| label: code-to-convert-to-function
#| eval: false
penguins_data_raw |>
  select(
    species = Species,
    bill_length_mm = `Culmen Length (mm)`,
    bill_depth_mm = `Culmen Depth (mm)`
  ) |>
  drop_na()
```

We need to adapt this code to become a function, and this function needs a single argument, which is the dataset it should clean.

It should look like this:
```{r}
#| label: clean-data-function
clean_penguin_data <- function(penguins_data_raw) {
  penguins_data_raw |>
    select(
      species = Species,
      bill_length_mm = `Culmen Length (mm)`,
      bill_depth_mm = `Culmen Depth (mm)`
    ) |>
    drop_na()
}
```
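
To see the function in action without the real file, here is a minimal sketch that runs it on a tiny hand-made data frame. The object `toy_raw` and its values are hypothetical; it only mimics the column names of the raw data.

```{r}
#| label: clean-data-function-example
library(dplyr)
library(tidyr)
library(tibble)

# Function from above, repeated so this example is self-contained
clean_penguin_data <- function(penguins_data_raw) {
  penguins_data_raw |>
    select(
      species = Species,
      bill_length_mm = `Culmen Length (mm)`,
      bill_depth_mm = `Culmen Depth (mm)`
    ) |>
    drop_na()
}

# Hypothetical toy data with the same column names as the raw data
toy_raw <- tibble(
  Species = c("Adelie", "Gentoo", "Chinstrap"),
  Island = c("Torgersen", "Biscoe", "Dream"),
  `Culmen Length (mm)` = c(39.1, NA, 46.5),
  `Culmen Depth (mm)` = c(18.7, 14.3, 17.9)
)

# The row with a missing bill length is dropped, and only three columns remain
toy_clean <- clean_penguin_data(toy_raw)
toy_clean
```

The object passed in does not need to have the same name as the function's argument: inside the function, `penguins_data_raw` simply refers to whatever data frame was supplied.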

::::::::::::::::: callout

# RStudio function extraction

RStudio also has a handy helper to extract a function from a piece of code.
Once you have basic familiarity with functions, it may help you figure out the necessary input when turning code into a function.

To use it, highlight the piece of code you want to make into a function.
In our case that is the entire pipeline from `penguins_data_raw` to the `drop_na()` statement.
Once you have done this, in RStudio go to the "Code" menu in the top bar, and select "Extract function" from the list.
A prompt will open asking you to hit enter, after which the extracted function should appear in your script where the cursor was.

This function will not work, however, because it has more arguments than it needs.
This is because tidyverse uses non-standard evaluation, which lets us write unquoted column names inside `select()`.
The function extractor thinks that all unquoted (or back-ticked) text in the code is a reference to an object.
You will need to do some manual cleaning to get the function working, which is why it's more convenient if you have a little experience with functions already.

::::::::::::::::::
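
The confusion described in the callout can be illustrated with base R. This is only a sketch of the general idea, not a description of how RStudio implements its extractor: `all.vars()` lists every name in a piece of code that is not being called as a function, and it cannot tell column names from real objects either.

```{r}
#| label: extract-function-nse-illustration
# Capture the cleaning pipeline as unevaluated code (nothing is run here)
pipeline <- quote(
  penguins_data_raw |>
    select(
      species = Species,
      bill_length_mm = `Culmen Length (mm)`,
      bill_depth_mm = `Culmen Depth (mm)`
    ) |>
    drop_na()
)

# List all names used as variables in that code
all.vars(pipeline)
```

Only `penguins_data_raw` refers to an object that must be supplied as an argument. The other names in the result are columns inside that data frame, which is why they have to be removed by hand from the arguments of an automatically extracted function.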

::::::::::::::::::::::::::::::::::::: {.challenge}

## Challenge: Write a function that takes a numerical vector and returns its mean divided by 10.

:::::::::::::::::::::::::::::::::: {.solution}

```{r}
#| label: write-function-answer
vecmean <- function(x) {
  mean(x) / 10
}
```

::::::::::::::::::::::::::::::::::

:::::::::::::::::::::::::::::::::::::
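
A quick way to check a solution like this is to call it on a few small inputs. The sketch below also adds a hypothetical variant, `vecmean_narm`, which is not part of the lesson and shows one way to handle missing values such as those in the penguins data.

```{r}
#| label: write-function-answer-usage
# Solution from the challenge, repeated so this example is self-contained
vecmean <- function(x) {
  mean(x) / 10
}

# The mean of 10, 20, and 30 is 20, so this gives 2
vecmean(c(10, 20, 30))

# mean() returns NA when any value is missing, and so does our function
vecmean(c(10, 20, NA))

# Hypothetical variant that lets the caller drop missing values first
vecmean_narm <- function(x, na.rm = FALSE) {
  mean(x, na.rm = na.rm) / 10
}
vecmean_narm(c(10, 20, NA), na.rm = TRUE)
```

Passing `na.rm` through to `mean()` keeps the default behavior unchanged while giving the caller a choice.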

Congratulations, you've started a whole new journey into functions!
This was a very brief introduction to functions, and you will likely need to get more help in learning about them.
There is an episode in the R Novice lesson from Carpentries that is [all about functions](https://swcarpentry.github.io/r-novice-gapminder/10-functions.html) which you might want to read.

::::::::::::::::::::::::::::::::::::: keypoints

- Functions are crucial when repeating the same code many times with minor differences.
- RStudio's "Extract function" tool can help you get started with converting code into functions.
- Functions are an essential part of how `targets` works.

::::::::::::::::::::::::::::::::::::::::::::::::
7 changes: 7 additions & 0 deletions instructors/instructor-notes.md
@@ -5,3 +5,10 @@ title: 'Instructor Notes'
## General notes

The examples gradually build up to a [full analysis](https://github.com/joelnitta/penguins-targets) of the [Palmer Penguins dataset](https://allisonhorst.github.io/palmerpenguins/). However, there are a few places where completely different code is demonstrated to explain certain concepts. Since a given `targets` project can only have one `_targets.R` file, this means the participants may have to delete their existing `_targets.R` file and write a new one to follow along with the examples. This may cause frustration if they can't keep a record of what they have done so far. One solution would be to save the old `_targets.R` file as `_targets_old.R` or similar, then rename it when it should be run again.

## Optional episodes
The "Function" episode is optional, and whether to run it will depend on the learners coming to your workshop.
We recommend asking for a show of hands (or stickies) to see who has experience with functions, and running this episode if some learners do not.

`targets` relies so heavily on functions that we believe it is worth spending a little time on them if you have inexperienced learners. Without a short introduction, they will quickly fall behind and will not be empowered to use `targets` at the end of the workshop.

File renamed without changes.