1111 python
mvanrongen committed Nov 11, 2024
1 parent 99cd2fe commit 61c2b41
Showing 7 changed files with 159 additions and 46 deletions.
4 changes: 2 additions & 2 deletions CITATION.cff
@@ -23,12 +23,12 @@ authors:
family-names: van Rongen
affiliation: Cambridge Centre for Research Informatics Training
orcid: 'https://orcid.org/0000-0002-1441-367X'
alias: 'writing - review & editing; conceptualisation; software'
alias: 'writing - original draft; conceptualisation; software'
- given-names: Alexia
family-names: Cardona
affiliation: Cambridge Centre for Research Informatics Training
orcid: 'https://orcid.org/0000-0002-7877-5565'
alias: 'conceptualisation'
alias: 'writing - original draft; conceptualisation'
repository-code: 'https://github.com/cambiotraining/quarto-course-template'
url: 'https://cambiotraining.github.io/found-coding-for-research-r/'
license: CC-BY-4.0
4 changes: 2 additions & 2 deletions _freeze/materials/01-intro-software/execute-results/html.json

Large diffs are not rendered by default.

@@ -1,8 +1,8 @@
{
"hash": "464cc6f60270c235167ae2143b0f7d08",
"hash": "e14abd0305f35f4a00354f71d9b92eb4",
"result": {
"engine": "knitr",
"markdown": "---\ntitle: Objects & data types\n---\n\n\n\n\n::: {.callout-tip}\n#### Learning objectives\n\n- \n:::\n\n\n## Context\n\nWe’ve seen examples where we entered data directly into a function. Most of the time we have data from elsewhere, such as a spreadsheet. In the previous section we created single objects. We’ll build up from this and introduce vectors and tabular data. We'll also briefly mention other data types, such as matrices, arrays.\n\n## Objects\n\n### Creating objects\n\nJust running lines of code can be helpful if you only need an answer, but in order to do useful and interesting things, we often need to save values so we can work with them.\n\nTo do this, we *assign* values to *objects*. An object acts as a container for that value.\n\nTo create an object, we need to give it a name followed by the\nassignment operator `<-`, and the value we want to give it, for example:\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\ntemperature <- 23\n```\n:::\n\n\n\n\n::: {.callout-important}\n## The assignment operator\n\nIn R we use `<-` as the assignment operator. It assigns values on the right to objects on\nthe left. So, after executing:\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nx <- 3\n```\n:::\n\n\n\n\nthe value of `x` is `3`. You can think of it as 3 **goes into** `x`.\n\nFor historical reasons you can also use `=` for assignments, but not in every context. 
Because of the\n[slight](http://blog.revolutionanalytics.com/2008/12/use-equals-or-arrow-for-assignment.html)\n[differences](http://r.789695.n4.nabble.com/Is-there-any-difference-between-and-tp878594p878598.html)\nin syntax, it is good practice to always use `<-` for assignments.\n\nIn RStudio, typing <kbd>Alt</kbd> + <kbd>-</kbd> (push <kbd>Alt</kbd> at the\nsame time as the <kbd>-</kbd> key) will write ` <- ` in a single keystroke on a PC, while typing <kbd>Option</kbd> + <kbd>-</kbd> (push <kbd>Option</kbd> at the\nsame time as the <kbd>-</kbd> key) does the same on a Mac.\n:::\n\nObjects can be given almost any name such as `x`, `current_temperature`, or\n`subject_id`. You want the object names to be explicit and short. There are some exceptions / considerations (see below).\n\n::: {.callout-warning}\n## Restrictions on object names\n\n* Object names are not allowed to start with a number (`2x` is not valid, but `x2` is).\n* R is case sensitive (e.g., `weight_kg` is different from `Weight_kg`).\n* Some names are forbidden, because they are the names of fundamental functions in R (e.g.,\n`if`, `else`, `for`, see\n[here](https://stat.ethz.ch/R-manual/R-devel/library/base/html/Reserved.html)\nfor a complete list).\n* Generally, avoid using other function names (e.g., `c`, `T`, `mean`, `data`, `df`, `weights`), even if it is allowed. If in doubt, check the help to see if the name is already in use.\n* Avoid dots (`.`) within an object name as in `my.dataset`. There are many\nfunctions in R with dots in their names for historical reasons, but because dots\nhave a special meaning in R (for methods) and other programming languages, it's\nbest to avoid them.\n* It's important to be consistent in the **styling** of your\ncode (where you put spaces, how you name objects, etc.). Using a consistent\ncoding style makes your code clearer to read for your future self and your\ncollaborators. 
In R, popular style guides are:\n * [tidyverse's](http://style.tidyverse.org/).\n * [Google's](https://google.github.io/styleguide/Rguide.xml)\n \nYou can install the [`lintr`](https://github.com/jimhester/lintr) package to automatically check\nfor issues in the styling of your code.\n:::\n\n### Using objects\nLO: using objects (calculations etc)\n\n### Vectors\n\n* LO: create vectors\n* LO: operations with vectors\n* LO: subsetting (indices, conditionals with TRUE/FALSE)\n\n### Dealing with missing data\n\n* LO: why is missing data important?\n* LO: good practices of dealing with missing data\n\n## Summary\n\n::: {.callout-tip}\n#### Key points\n\n- \n:::\n",
"markdown": "---\ntitle: Objects & data types\n---\n\n\n\n\n::: {.callout-tip}\n#### Learning objectives\n\n- \n:::\n\n\n## Context\n\nWe’ve seen examples where we entered data directly into a function. Most of the time we have data from elsewhere, such as a spreadsheet. In the previous section we created single objects. We’ll build up from this and introduce vectors and tabular data. We'll also briefly mention other data types, such as matrices, arrays.\n\n## Objects\n\n### Creating objects\n\nJust running lines of code can be helpful if you only need an answer, but in order to do useful and interesting things, we often need to save values so we can work with them.\n\nTo do this, we *assign* values to *objects*. An object acts as a container for that value.\n\nTo create an object, we need to give it a name followed by the\nassignment operator `<-`, and the value we want to give it, for example:\n\n::: {.panel-tabset group=\"language\"}\n## R\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\ntemperature <- 23\n```\n:::\n\n\n\n\nWe can read the code as: the value `23` is assigned to the object `temperature`. Note that when you run this line of code the object you just created appears on your environment tab (top-right panel).\n\nWhen assigning a value to an object, R does not print anything on the console. You can print the value by typing the object name on the console.\n\n::: {.callout-important}\n## The assignment operator\n\nIn R we use `<-` as the assignment operator. It assigns values on the right to objects on\nthe left. So, after executing:\n\nIn RStudio, typing <kbd>Alt</kbd> + <kbd>-</kbd> (push <kbd>Alt</kbd> at the same time as the <kbd>-</kbd> key) will write ` <- ` in a single keystroke on a PC, while typing <kbd>Option</kbd> + <kbd>-</kbd> (push <kbd>Option</kbd> at the same time as the <kbd>-</kbd> key) does the same on a Mac.\n\n:::\n\nObjects can be given almost any name such as `x`, `current_temperature`, or\n`subject_id`. 
You want the object names to be explicit and short. There are some exceptions / considerations (see below).\n\n::: {.callout-warning}\n## Restrictions on object names\n\nObject names can contain letters, numbers, underscores and periods. They *cannot start with a number nor contain spaces*. Different people use different conventions for long variable names, two common ones being:\n\nUnderscore: my_long_named_object\n\nCamel case: myLongNamedObject\n\nWhat you use is up to you, but be consistent. R is **case-sensitive** so `temperature` is different from `Temperature.`\n\n* Some names are forbidden, because they are the names of fundamental functions in R (e.g.,\n`if`, `else`, `for`, see\n[here](https://stat.ethz.ch/R-manual/R-devel/library/base/html/Reserved.html)\nfor a complete list).\n* Avoid using function names (e.g., `c`, `T`, `mean`, `data`, `df`, `weights`), even if allowed. If in doubt, check the help to see if the name is already in use.\n* Avoid full-stops (`.`) within an object name as in `my.data`. Full-stops often have meaning in programming languages, so it's best to avoid them.\n* Use consistent styling. In R, popular style guides are:\n * [tidyverse's](http://style.tidyverse.org/).\n * [Google's](https://google.github.io/styleguide/Rguide.xml)\n\n**Whatever style you use, be consistent!**\n:::\n:::\n\n### Using objects\n\nNow that we have the `temperature` in memory, we can use it to perform operations. 
For example, this might the temperature in Celsius and we might want to calculate it to Kelvin.\n\nTo do this, we need to add `273.15`:\n\n::: {.panel-tabset group=\"language\"}\n## R\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\ntemperature + 273.15\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] 296.15\n```\n\n\n:::\n:::\n\n\n\n:::\n\nWe can change an object's value by assigning a new one:\n\n::: {.panel-tabset group=\"language\"}\n## R\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\ntemperature <- 36\ntemperature + 273.15\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] 309.15\n```\n\n\n:::\n:::\n\n\n\n:::\n\nFinally, assigning a value to one object does not change the values of other objects. For example, let’s store the outcome in Kelvin into a new object `temp_K`:\n\n::: {.panel-tabset group=\"language\"}\n## R\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\ntemp_K <- temperature + 273.15\n```\n:::\n\n\n\n:::\n\nChanging the value of `temperature` does not change the value of `temp_K`.\n\n::: {.panel-tabset group=\"language\"}\n## R\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\ntemperature <- 14\ntemp_K\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] 309.15\n```\n\n\n:::\n:::\n\n\n\n:::\n\n### Vectors\n\n* LO: create vectors\n* LO: operations with vectors\n* LO: subsetting (indices, conditionals with TRUE/FALSE)\n\n### Dealing with missing data\n\n* LO: why is missing data important?\n* LO: good practices of dealing with missing data\n\n## Summary\n\n::: {.callout-tip}\n#### Key points\n\n- \n:::\n",
"supporting": [
"02-basic-objects-and-data-types_files"
],
2 changes: 1 addition & 1 deletion _quarto.yml
@@ -56,7 +56,7 @@ book:
- icon: mastodon
href: https://genomic.social/@BioInfoCambs
aria-label: Bioinformatics Training Facility Mastodon
title: "Coding for research in R"
title: "Coding for research"
chapters:
- part: "Welcome"
chapters:
8 changes: 4 additions & 4 deletions index.md
@@ -1,18 +1,18 @@
---
title: "Coding for research in R"
author: "Alexia Cardona, Hugo Tavares, Martin van Rongen"
title: "Coding for research"
author: "Hugo Tavares, Alexia Cardona, Martin van Rongen"
date: today
number-sections: false
---

## Overview

These sessions provide an introduction to coding in R. The aim is to get you comfortable with coding techniques commonly used in scientific research.
These sessions provide an introduction to coding in R and Python. The aim is to get you comfortable with coding techniques commonly used in scientific research.

::: {.callout-tip}
### Learning Objectives

- Get familiar with the R programming language
- Get familiar with the R or Python programming language
- Learn to visualise data
- Be able to manipulate and transform data
:::
87 changes: 82 additions & 5 deletions materials/01-intro-software.qmd
@@ -270,11 +270,23 @@ This creates a little table of contents in the bottom-left corner of the script

The simplest way of using R is to type directly into the console. For example, you can use R as a glorified calculator:

::: {.panel-tabset group="language"}
## R

```{r}
3 + 5
12 / 7
```

## Python

```{python}
3 + 5
12 / 7
```

:::
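
The same calculator-style usage extends to other operators. A minimal Python sketch, with values chosen only for illustration (in R, the last three operators are written `^`, `%/%` and `%%`):

```python
# Basic arithmetic, as you might type it at the console.
total = 3 + 5        # addition
ratio = 12 / 7       # true division, always returns a float
power = 2 ** 3       # exponentiation
whole = 12 // 7      # floor division (whole-number part)
rest = 12 % 7        # remainder after division

print(total, ratio, power, whole, rest)
```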

Running code like this directly in the console is generally not a good idea, because then we can't keep track of what we are doing. So, we first need to create a script to save our code in. Then we can play around.

::: {.callout-important}
@@ -285,23 +297,42 @@ Please complete @ex-createscript and @ex-runningcode.

Functions are "canned scripts" that automate more complicated sets of commands
including operations, assignments, etc. Many functions are predefined, or can be
made available by importing R *packages* (more on that later). A function
made available by importing *packages* (more on that later). A function
usually takes one or more inputs called *arguments*. Functions often (but not
always) return a *value*. A typical example would be the function `sqrt()`. The
input (the argument) must be a number, and the return value (in fact, the
output) is the square root of that number.

::: {.panel-tabset group="language"}
## R

```{r}
#| eval: false
sqrt(9)
```

## Python

The `sqrt()` function is not available by default, but is stored in the `math` module. Before we can use it, we need to load this module:

```{python}
import math
```

Next, we can use the `sqrt()` function, specifying that it comes from the `math` module. We separate the two with a full-stop (`.`):

```{python}
math.sqrt(9)
```

:::

Here, the value `9` is given to the `sqrt()` function. This function
calculates the square root, and returns the value. This function is very simple, because it takes just one argument.
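
A function's return value can be stored and reused like any other value. The Python sketch below also shows an alternative import style, `from math import sqrt`, which lets you call `sqrt()` without the `math.` prefix. The object names `root` and `same_root` are illustrative choices only.

```python
import math
from math import sqrt  # import only the sqrt() function from the math module

# Both calls run the same function; math.sqrt() always returns a float.
root = math.sqrt(9)
same_root = sqrt(9)

print(root)               # 3.0
print(root == same_root)  # True
```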

The return 'value' of a function need not be numerical (like that of `sqrt()`),
and it also does not need to be a single item: it can be a set of things, or
even a data set. We'll see that when we read data files into R.
even a data set. We'll see that when we read data files.
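
To illustrate with a few Python built-ins (chosen here only as examples): the string method `upper()` returns text, `sorted()` returns a list of values, and `divmod()` returns two values at once.

```python
# A return value can be text rather than a number.
shout = "celsius".upper()

# It can be a collection of values.
ordered = sorted([23, 14, 36])

# It can even be several values at once (here a quotient and a remainder).
quotient, remainder = divmod(12, 7)

print(shout)                 # CELSIUS
print(ordered)               # [14, 23, 36]
print(quotient, remainder)   # 1 5
```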


### Arguments
@@ -315,15 +346,30 @@ of your choice which will be used instead of the default.

Let's try a function that can take multiple arguments: `round()`.

::: {.panel-tabset group="language"}
## R

```{r}
round(3.14159)
```

## Python

```{python}
round(3.14159)
```

:::

Here, we've called `round()` with just one argument, `3.14159`, and it has
returned the value `3`. That's because the default is to round to the nearest
whole number. If we want more digits we can see how to do that by getting
information about the `round` function. We can use `args(round)` to find what
arguments it takes, or look at the help for this function using `?round`.
information about the `round()` function.

::: {.panel-tabset group="language"}
## R

We can use `args(round)` to find what arguments it takes, or look at the help for this function using `?round`.

```{r}
args(round)
@@ -352,7 +398,38 @@ And if you do name the arguments, you can switch their order:
round(digits = 2, x = 3.14159)
```

It's good practice to put the non-optional arguments (like the number you're rounding) first in your function call, and to then specify the names of all optional arguments. If you don't, someone reading your code might have to look up the definition of a function with unfamiliar arguments to understand what you're doing.
## Python
We can use `help(round)` to find what arguments it takes.

```{python}
#| eval: false
help(round)
```

We see that if we want a different number of digits, we can
type `ndigits = 2` or however many we want. For example:

```{python}
round(3.14159, ndigits = 2)
```

If you provide the arguments in the exact same order as they are defined you
don't have to name them:

```{python}
round(3.14159, 2)
```

Python still expects unnamed arguments to come before named ones, so this gives a `SyntaxError`:

```{python}
#| eval: false
round(ndigits = 2, 3.14159)
```
:::
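
The error from the last Python call is raised before any of the line runs, because an unnamed (positional) argument follows a named one. A minimal sketch that checks this without stopping a script, using the built-in `compile()` to parse the call as text (the label `<example>` is an arbitrary placeholder for a filename):

```python
call = "round(ndigits = 2, 3.14159)"

try:
    # compile() only parses the text; nothing is executed.
    compile(call, "<example>", "eval")
    parsed = True
except SyntaxError as err:
    parsed = False
    print("SyntaxError:", err.msg)

print(parsed)  # False
```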

It's good practice to be explicit about the names of the arguments. That way you can avoid confusion later on when looking back at your code or when sharing it.
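
As a sketch of this in Python, the function `to_kelvin()` below is hypothetical (it is not part of the course materials or of any library) and simply adds `273.15` to a temperature in Celsius. Naming the arguments makes each call readable on its own, and named arguments can be given in any order.

```python
def to_kelvin(celsius, ndigits=2):
    """Convert a temperature from Celsius to Kelvin (hypothetical helper)."""
    return round(celsius + 273.15, ndigits)

# Positional call: the reader has to know what the second value means.
a = to_kelvin(23, 1)

# Named calls: the intent is clear, and the order no longer matters.
b = to_kelvin(celsius=23, ndigits=1)
c = to_kelvin(ndigits=1, celsius=23)

print(a == b == c)  # True
```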


## Adding functionality using packages
LO: adding functionality (installing + loading packages)