diff --git a/config.yaml b/config.yaml
index 63382b36..2cb20e43 100644
--- a/config.yaml
+++ b/config.yaml
@@ -61,6 +61,7 @@ contact: 'joelnitta@gmail.com'
 episodes:
 - introduction.Rmd
 - basic-targets.Rmd
+- functions.Rmd
 - cache.Rmd
 - lifecycle.Rmd
 - organization.Rmd
diff --git a/episodes/basic-targets.Rmd b/episodes/basic-targets.Rmd
index 5c1f4e2b..4ac83a0d 100644
--- a/episodes/basic-targets.Rmd
+++ b/episodes/basic-targets.Rmd
@@ -137,7 +137,8 @@ penguins_csv_file <- path_to_file("penguins_raw.csv")
 penguins_csv_file
 ```
 
-We will use the `tidyverse` set of packages for loading and manipulating the data. We don't have time to cover all the details about using `tidyverse` now, but if you want to learn more about it, please see the ["Manipulating, analyzing and exporting data with tidyverse" lesson](https://datacarpentry.org/R-ecology-lesson/03-dplyr.html).
+We will use the `tidyverse` set of packages for loading and manipulating the data. We don't have time to cover all the details about using `tidyverse` now, but if you want to learn more about it, please see the ["Manipulating, analyzing and exporting data with tidyverse" lesson](https://datacarpentry.org/R-ecology-lesson/03-dplyr.html), or the Carpentry incubator lesson [R and the tidyverse for working with datasets](https://carpentries-incubator.github.io/r-tidyverse-4-datasets/).
+
 
 Let's load the data with `read_csv()`.
 
diff --git a/episodes/functions.Rmd b/episodes/functions.Rmd
new file mode 100644
index 00000000..c2e4e316
--- /dev/null
+++ b/episodes/functions.Rmd
@@ -0,0 +1,191 @@
+---
+title: 'A brief introduction to functions'
+teaching: 20
+exercises: 1
+---
+
+:::::::::::::::::::::::::::::::::::::: questions
+
+- What are functions?
+- Why should we know how to write them?
+- What are the main components of a function?
+
+::::::::::::::::::::::::::::::::::::::::::::::::
+
+::::::::::::::::::::::::::::::::::::: objectives
+
+- Understand the usefulness of custom functions
+- Understand the basic concepts around writing functions
+
+::::::::::::::::::::::::::::::::::::::::::::::::
+
+::::::::::::::::::::::::::::::::::::: {.instructor}
+
+Episode summary: A very brief introduction to functions, for learners who have no experience with them.
+
+::::::::::::::::::::::::::::::::::::::::::::::::
+
+```{r}
+#| label: setup
+#| echo: FALSE
+#| message: FALSE
+#| warning: FALSE
+library(targets)
+
+if (interactive()) {
+  setwd("episodes")
+}
+
+source("files/lesson_functions.R")
+```
+
+## Create a function
+
+### About functions
+
+We are used to thinking of functions in R as something that comes from a package. You find, install and use specialized functions from packages to get your work done.
+
+But you can, and arguably should, be writing your own functions too!
+Functions are a great way of making it easy to repeat the same operation but with different settings.
+How many times have you copy-pasted the exact same code in your script, only to change a couple of things (a variable, an input etc.) before running it again?
+Only to then discover that there was an error in the code, so that you have to remember to fix it in every place where you copied that code.
+
+By writing functions, you can reduce this back and forth and create a more efficient workflow for yourself.
+When you find a bug, you fix it in a single place (the function you made), and every subsequent call of that function uses the fix.
+
+Furthermore, `targets` makes extensive use of custom functions, so a basic understanding of how they work is very important to successfully using it.
+
+### Writing a function
+
+There is not much difference between writing your own function and writing other code in R; you are still coding with R!
+Let's imagine we want to convert the millimeter measurements in the Penguins data to centimeters.
+
+```{r}
+#| label: targets-functions-problem
+library(palmerpenguins)
+library(tidyverse)
+
+penguins |>
+  mutate(
+    bill_length_cm = bill_length_mm / 10,
+    bill_depth_cm = bill_depth_mm / 10
+  )
+
+```
+
+This is not a complicated operation, but we might want to make a convenient custom function that can do this conversion for us anyway.
+
+To write a function, you need to use the `function()` function.
+We list the function's input arguments inside its parentheses, and put what the function does with those arguments in curly braces `{}` after the parentheses.
+The object name we assign this to will become the function's name.
+
+```{r}
+#| label: targets-functions-skeleton
+#| eval: false
+my_function <- function(argument1, argument2) {
+  # the things the function will do
+}
+# call the function
+my_function(1, "something")
+```
+
+For our mm to cm conversion the function would look like so:
+
+```{r}
+#| label: targets-functions-cm
+mm2cm <- function(x) {
+  x / 10
+}
+# use it
+penguins |>
+  mutate(
+    bill_length_cm = mm2cm(bill_length_mm),
+    bill_depth_cm = mm2cm(bill_depth_mm)
+  )
+```
+
+Our custom function will now transform any numerical input by dividing it by 10.
+
+### Make a function from existing code
+
+Many times, we might already have a piece of code that we'd like to use to create a function.
+For instance, we've copy-pasted a section of code several times and realize that this piece of code is repetitive, so a function is in order.
+Or we are converting our workflow to `targets` and need to change our script into a series of functions that `targets` will call.
+
+Recall the code snippet we had to clean our Penguins data:
+
+```{r}
+#| label: code-to-convert-to-function
+#| eval: false
+penguins_data_raw |>
+  select(
+    species = Species,
+    bill_length_mm = `Culmen Length (mm)`,
+    bill_depth_mm = `Culmen Depth (mm)`
+  ) |>
+  drop_na()
+```
+
+We need to adapt this code to become a function, and this function needs a single argument, which is the dataset it should clean.
+
+It should look like this:
+```{r}
+#| label: clean-data-function
+clean_penguin_data <- function(penguins_data_raw) {
+  penguins_data_raw |>
+    select(
+      species = Species,
+      bill_length_mm = `Culmen Length (mm)`,
+      bill_depth_mm = `Culmen Depth (mm)`
+    ) |>
+    drop_na()
+}
+```
+
+::::::::::::::::: callout
+
+# RStudio function extraction
+
+RStudio also has a handy helper to extract a function from a piece of code.
+Once you have basic familiarity with functions, it may help you figure out the necessary input when turning code into a function.
+
+To use it, highlight the piece of code you want to make into a function.
+In our case that is the entire pipeline from `penguins_data_raw` to the `drop_na()` statement.
+Once you have done this, in RStudio go to the "Code" section in the top bar, and select "Extract function" from the list.
+A prompt will open asking you to hit enter, and you should have the following code in your script where the cursor was.
+
+This function will not work, however, because it takes more arguments than it needs.
+This is because tidyverse uses non-standard evaluation, and we can write unquoted column names inside `select()`.
+The function extractor thinks that all unquoted (or back-ticked) text in the code is a reference to an object.
+You will need to do some manual cleaning to get the function working, which is why it's more convenient if you have a little experience with functions already.
+
+::::::::::::::::::
+
+::::::::::::::::::::::::::::::::::::: {.challenge}
+
+## Challenge: Write a function that takes a numerical vector and returns its mean divided by 10.
+
+:::::::::::::::::::::::::::::::::: {.solution}
+
+```{r}
+#| label: write-function-answer
+vecmean <- function(x) {
+  mean(x) / 10
+}
+```
+
+::::::::::::::::::::::::::::::::::
+
+:::::::::::::::::::::::::::::::::::::
+
+Congratulations, you've started a whole new journey into functions!
+This was a very brief introduction to functions, and you will likely need to get more help in learning about them.
+There is an episode in the R Novice lesson from Carpentries that is [all about functions](https://swcarpentry.github.io/r-novice-gapminder/10-functions.html) which you might want to read.
+
+::::::::::::::::::::::::::::::::::::: keypoints
+
+- Functions are crucial when repeating the same code many times with minor differences
+- RStudio's "Extract function" tool can help you get started with converting code into functions
+- Functions are an essential part of how `targets` works.
+
+::::::::::::::::::::::::::::::::::::::::::::::::
diff --git a/instructors/instructor-notes.md b/instructors/instructor-notes.md
index 697f3a03..d6d0293f 100644
--- a/instructors/instructor-notes.md
+++ b/instructors/instructor-notes.md
@@ -5,3 +5,10 @@ title: 'Instructor Notes'
 ## General notes
 
 The examples gradually build up to a [full analysis](https://github.com/joelnitta/penguins-targets) of the [Palmer Penguins dataset](https://allisonhorst.github.io/palmerpenguins/). However, there are a few places where completely different code is demonstrated to explain certain concepts. Since a given `targets` project can only have one `_targets.R` file, this means the participants may have to delete their existing `_targets.R` file and write a new one to follow along with the examples. This may cause frustration if they can't keep a record of what they have done so far.
One solution would be to save the old `_targets.R` file as `_targets_old.R` or similar, then rename it when it should be run again.
+
+## Optional episodes
+The "functions" episode is optional; whether to run it depends on the learners coming to your workshop.
+We recommend asking for a show of hands (or stickies) to see who has experience with functions and, if some learners do not, running this episode.
+
+`targets` relies so heavily on functions that we believe the episode is worth a little time if you have learners inexperienced with them. Without a short introduction, they will quickly fall behind and will not be empowered to use `targets` at the end of the workshop.
+
diff --git a/FIXME.Rproj b/targets-workshop.Rproj
similarity index 100%
rename from FIXME.Rproj
rename to targets-workshop.Rproj