RStudio is a free, open-source integrated development environment (IDE) for R. It provides a built-in editor along with many other advantages, such as integration with version control and project management.
When you first open RStudio, you will be greeted by three panels:
- The interactive R console (entire left)
- The Environment/History (tabbed in upper right)
- The Files/Plots/Packages/Help/Viewer (tabbed in lower right)
The interactive console is where you will run all of your code, and it can be a useful environment to try out ideas before adding them to an R script file. This console in RStudio is the same as the one you would get if you typed R in your command-line environment. The first thing you will see in the R interactive session is a bunch of information, followed by a ">" and a blinking cursor.
Once you open a file, such as an R script, an editor panel will also open in the top left.
There are two main ways one can work within RStudio.
You can type and run code using the interactive R console. This works well when you are doing small tests, but it's hard to keep track of what you are doing.
You can also begin by writing in a .R file, using RStudio's command to execute selected lines in the interactive R console. This is a great way to start because all your code is saved for later, and you can execute an entire workflow with the click of a button.
When writing scripts, anything that follows a # is ignored when R executes code. This allows you to add "comments" to your script. A # can come at the beginning or the middle of the line. How many #s are used is a style choice.
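As a small sketch of the comment rules described above (the variable names radius and area are made up for illustration):

```r
# A whole-line comment: R skips this line entirely.
radius <- 2            # a trailing comment: R runs the code and ignores the rest
## Doubling the # is purely a style choice.
area <- pi * radius^2  # area of a circle with that radius
area
```

Only the code to the left of each # is executed, so the comments can be as long as you like.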
One advantage that RStudio has over R on its own is that it has autocompletion abilities that allow you to more easily look up functions, their arguments, and the values that they take.
The simplest thing you can do with R is arithmetic.
If you type the following at the console:
1 + 100
You will see the following output:
[1] 101
When using R as a calculator, the order of operations is the same as you would have learnt back in school. Use parentheses to group operations in order to force the order of evaluation if it differs from the default, or to make clear what you intend.
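From highest to lowest precedence (multiplication and division share one level, as do addition and subtraction, and operators on the same level are evaluated left to right):
- Parentheses: (, )
- Exponents: ^ or **
- Divide: /
- Multiply: *
- Add: +
- Subtract: -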
3 + 5 * 2
[1] 13
and
(3 + 5) * 2
[1] 16
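A few more expressions, offered as an illustrative sketch, show how these rules play out when operators of equal precedence meet and when an exponent is involved:

```r
8 / 2 * 2     # equal precedence, evaluated left to right: (8 / 2) * 2 gives 8
10 - 4 + 2    # likewise left to right: (10 - 4) + 2 gives 8
2 * 3^2       # the exponent is applied first: 2 * 9 gives 18
(2 * 3)^2     # parentheses force the multiplication first: 6^2 gives 36
```

When in doubt, adding parentheses costs nothing and makes the intent obvious to the next reader.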
When the output is a really small or large number, it will be reported in scientific notation. The e is shorthand for "multiplied by 10^XX", so 2e-4 is shorthand for 2 * 10^(-4):
2/10000
[1] 2e-04
You can write numbers in scientific notation too, using e in place of "* 10^":
5e3
[1] 5000
R has many built-in mathematical functions. To call a function, we type its name, followed by opening and closing parentheses. Anything we type inside the parentheses is called the function's arguments:
sin(1) # trigonometry functions
[1] 0.841471
log(1) # natural logarithm
[1] 0
log10(10) # base-10 logarithm
[1] 1
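Some functions take more than one argument, and arguments can be given by name. The calls below are a brief sketch using base R functions; the particular input values are arbitrary:

```r
sqrt(16)                     # one argument: the square root of 16 is 4
log(8, base = 2)             # a second, named argument: log base 2 of 8 is 3
round(3.14159, digits = 2)   # round to two decimal places: 3.14
```

Naming an argument, as in base = 2, makes the call easier to read than relying on position alone.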
To figure out the syntax for a mathematical function, you can search for it online. If you can remember the start of the function's name, you can use the tab completion in RStudio.
Typing a ? before the name of a command will open the help page for that command. As well as providing a detailed description of the command and how it works, scrolling to the bottom of the help page will usually show a collection of code examples which illustrate command usage. We'll go through an example later.
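As a quick sketch, here is what asking for help on the log function looks like. The help() and args() calls are base R alternatives that are not mentioned above:

```r
?log          # open the help page for log()
help("log")   # the same request written as an ordinary function call
args(log)     # print only the argument list, without opening the help page
```

In RStudio the help page appears in the Help tab of the lower-right panel.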
We can also do comparisons in R:
1 == 1 # equality (note two equals signs, read as "is equal to")
[1] TRUE
1 != 2 # inequality (read as "is not equal to")
[1] TRUE
1 < 2 # less than
[1] TRUE
1 <= 1 # less than or equal to
[1] TRUE
1 > 0 # greater than
[1] TRUE
1 >= -9 # greater than or equal to
[1] TRUE
We can store values in variables using the assignment operator <-. You will notice that assignment does not print a value; instead, the value is stored for later in something called a variable. After the assignment below, x contains the value 0.025:
x <- 1/40
x
[1] 0.025
More precisely, the stored value is a decimal approximation of this fraction called a floating point number.
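One practical consequence of floating point storage, sketched below, is that two calculations you would expect to be equal can differ by a tiny rounding error. The base R function all.equal(), not introduced above, compares numbers with a small tolerance:

```r
0.1 + 0.2 == 0.3                    # FALSE: the two sides differ by a tiny rounding error
isTRUE(all.equal(0.1 + 0.2, 0.3))   # TRUE: equal within a small tolerance
```

For this reason, == is best reserved for whole numbers and other exact values.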
Look for the Environment tab in one of the panes of RStudio, and you will see that x and its value have appeared. Our variable x can be used in place of a number in any calculation that expects a number:
log(x)
[1] -3.688879
Variables can be reassigned. We can easily reassign the value of x from 0.025 to 100 with the command:
x <- 100
x
[1] 100
The value being assigned can refer to the variable being assigned to. The right-hand side of the assignment can be any valid R expression, and it is fully evaluated before the assignment occurs.
x <- x + 1 # notice how RStudio updates its description of x in the top right tab
x
[1] 101
It is also possible to use the = operator for assignment; however, there are occasionally places where it is less confusing to use <- than =. We recommend using the more commonly used <-, but the most important thing is to be consistent with the operator you use.
x = 1/40
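One place where the two operators behave differently is inside a function call. The sketch below uses the base R function rep() purely as an illustration; it is not part of the lesson itself:

```r
rep(1, times = 3)                   # = names the argument; no variable is created
exists("times", inherits = FALSE)   # FALSE: nothing called times exists in this environment yet
rep(1, times <- 3)                  # <- creates a variable called times, then passes its value
times                               # 3
```

Both calls return the same vector, but only the second one leaves a new variable behind, which is rarely what you want.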
What will be the value of each variable after each statement in the following program?
mass <- 47.5
age <- 122
mass <- mass * 2.3
age <- age - 20
Run the code from the previous challenge, and write a command to compare mass and age. Is mass larger than age?
Variable names can contain letters, numbers, underscores and periods. Variable names cannot start with a number nor contain spaces at all.
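If you are unsure whether a name is allowed, one rough check is to compare it with what the base R function make.names() would turn it into: a name that comes back unchanged is syntactically valid. This is only a sketch, and the candidate names are invented for illustration:

```r
candidates <- c("total_count", "3rd_try", "my var")

# make.names() rewrites invalid names (for example, "my var" becomes "my.var"),
# so a name that is left unchanged was already valid.
candidates == make.names(candidates)   # TRUE FALSE FALSE
```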
Different people use different conventions for long variable names. What you use is up to you, but being consistent will help make your research more reproducible. Some example conventions include:
- periods.between.words
- underscores_between_words
- CamelCaseToSeparateWords
Which of the following are valid R variable names?
- min_height
- max.height
- _age
- .mass
- MaxLength
- min-length
- 2widths
- celsius2kelvin
- celsius kelvin
There are a few useful commands you can use to interact with the R session.
ls will list all of the variables and functions stored in the global environment (your working R session):
ls()
[1] "x"
If you want to see hidden variables (they begin with a .), type:
ls(all.names = TRUE)
[1] ".mass" "x"
If we type ls by itself, R will print out the source code for that function!
ls
You can use rm to delete objects you no longer need:
rm(x)
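rm also accepts several names at once. A minimal sketch, using throwaway variable names chosen only for this example:

```r
scratch_total <- 10
scratch_count <- 4

rm(scratch_total, scratch_count)   # delete both objects in a single call
exists("scratch_total")            # FALSE: the object is gone
```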
The command getwd() will print your working directory. This is very useful for figuring out where you are.
The command setwd('path_to_bin') is used to set the working directory. I like to set my working directory as the directory with my scripts. For instance, setwd('~/Desktop/FilesForRCourse/bin').
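The two commands work well together. The sketch below uses R's temporary directory, tempdir(), as a stand-in for a real scripts folder so that it runs on any machine:

```r
original_dir <- getwd()   # remember where we started
setwd(tempdir())          # move somewhere else (a stand-in for your scripts folder)
getwd()                   # now reports the new location
setwd(original_dir)       # go back to where we started
```

Saving the original location first makes it easy to undo the change.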
Clean up your working environment by deleting the mass and age variables.
Pay attention when R does something unexpected! Errors are thrown when R cannot proceed with a calculation. Warnings, on the other hand, usually mean that the function has run, but it probably hasn't worked as expected. In both cases, the message that R prints out usually gives you clues about how to fix the problem.
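To see the difference, here is a small sketch with one call that warns and one that fails. The failing call is wrapped in the base R function try(), which is not covered above, so that the remaining lines of a script would still run:

```r
log(-1)        # warning: R still returns a value (NaN) but tells you something is off
try(1 + "a")   # error: R cannot add a number to text, so no result is produced
```

Reading the message closely is usually the fastest way to find out which of the two you are dealing with.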
It is possible to add functions to R by writing a package, or by obtaining a package written by someone else. As of this writing, there are over 8,300 packages available on CRAN (the Comprehensive R Archive Network). R and RStudio have functionality for managing packages:
- To see what packages are installed, type installed.packages()
- To install a package, type install.packages("packagename"), where packagename is the package name, in quotes.
- To make a package available for use, type library(packagename)
- To update already-installed packages, type update.packages()
- To remove a package, type remove.packages("packagename")
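A common pattern, sketched here, is to install a package only when it is missing and then load it. The example uses stats, which ships with R, so the install step is skipped. The base R function requireNamespace() is an addition to the commands listed above:

```r
pkg <- "stats"   # ships with every R installation, so nothing is downloaded here

# Install only if the package is not already available.
if (!requireNamespace(pkg, quietly = TRUE)) {
  install.packages(pkg)
}

# character.only = TRUE tells library() that pkg holds the name as a string.
library(pkg, character.only = TRUE)
```

Replacing "stats" with the name of a package from CRAN would trigger the install step the first time the code is run.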
Install and load the following packages: ggplot2, dplyr, gapminder, cowplot