Skip to content

Commit

Permalink
Tutorial 4
Browse files Browse the repository at this point in the history
  • Loading branch information
katerobsau committed Nov 24, 2024
1 parent c276bd6 commit ef2cba9
Showing 1 changed file with 235 additions and 0 deletions.
235 changes: 235 additions & 0 deletions week4/tutorial/tutorial-04.qmd
Original file line number Diff line number Diff line change
@@ -0,0 +1,235 @@
---
title: "Tutorial-04"
author: Kate Saunders
format:
html:
toc: true
css: tutorial.css
embed-resources: true
pdf:
toc: true
editor: visual
---

## Visualisation in R

### Learning Objectives

- Practice how to create plots in R.

- This will involve understanding the grammar of graphics and what each of the different layers are.

- You will learn to create a range of different plots.

### Preparation

- Ensure you installed R and RStudio.

- Ensure you have installed the `ggplot2` package. You can install this package directly, but this package is also part of the `tidyverse` package.

- Before you get started set yourself up an R project. This will help you to direct R to where your data is installed.\

- If you need any help doing the above, refer to the Lecture 1 and Tutorial 1 material.

- Download the dataset from Moodle, `boston_celtics.csv,` and place it in a folder called `data` within your R Project.

### Task

Today you will be creating visualisations in `ggplot2` to analyse sporting statistics from the Boston Celtics NBA basketball team.

```{r echo = FALSE}
show_solutions = FALSE
```

### Exercise 1

1. Read your data set into R and follow the Reading Data Checklist from the lecture.\
\
We recommend installing the package `here` to help keep things organised when referencing files.

```{r echo = TRUE, warning = FALSE, message = FALSE}
if(!require(here)){
install.packages("here")
}
library(here)
```

```{r echo = TRUE, warning = FALSE, message = FALSE}
library(tidyverse)
boston_celtics <- read_csv(here("data", "boston_celtics.csv"))
```

If you have an trouble run through the check list:
```{r echo = TRUE, eval = FALSE, warning = FALSE, message = FALSE}
# Check your working directory
getwd()
# Check your data file is where you think it is
file.exists(here("data","boston_celtics.csv"))
# Read in your data
library(tidyverse)
boston_celtics <- read_csv(here("data", "boston_celtics.csv"))
# Look at your data
View(boston_celtics)
# Look at a summary
summary(boston_celtics)
```


2. Create a scatter plot that shows the `game_date` and `team_scores`. Change the colour of the points using `team_winner` to show if the team won the game. Make the colour of the points `forestgreen`. Focus on the geometry and aesthetic layer for now.

```{r eval = FALSE}
ggplot(data = ???) +
geom_???(aes(x = ???, y = ???), col = ???)
```

```{r echo = show_solutions, eval = show_solutions}
ggplot(data = boston_celtics) +
geom_point(aes(x = game_date, y = team_score), col = "forestgreen")
```

3. Colour the points according to if the team won or lost. Use the hex colour codes from the [Boston Celtics team]("https://teamcolorcodes.com/boston-celtics-color-codes/") to set the colour scheme.

```{r eval = FALSE}
ggplot(data = ???) +
geom_???(aes(x = ???, y = ???, col = ???))
```

```{r echo = show_solutions, eval = show_solutions}
ggplot(data = boston_celtics) +
geom_point(aes(x = game_date, y = team_score, col = team_winner)) +
scale_colour_manual(label = c("Loss", "Win"), values = c("TRUE" = "#007A33", "FALSE" = "#BA9653"))
```

4. As there are a lot of points, change the size and the transparency to reduce overplotting.

```{r eval = FALSE}
ggplot(data = ???) +
geom_???(
aes(x = ???, y = ???, col = ???),
size = ???, alpha = ???
)
```

```{r echo = show_solutions, eval = show_solutions}
ggplot(data = boston_celtics) +
geom_point(aes(x = game_date, y = team_score, col = team_winner), size = 0.75, alpha = 0.4) +
scale_colour_manual(label = c("Loss", "Win"), values = c("TRUE" = "#007A33", "FALSE" = "#BA9653"))
```

5.Look at the [theme options]("https://ggplot2.tidyverse.org/reference/ggtheme.html") and try using a few different ones, before deciding on the theme for your plot.

```{r eval = FALSE}
ggplot(data = ???) +
geom_???(
aes(x = ???, y = ???, col = ???),
size = ???, alpha = ???
) +
theme_???()
```

```{r echo = show_solutions, eval = show_solutions}
ggplot(data = boston_celtics) +
geom_point(aes(x = game_date, y = team_score, col = team_winner), size = 0.75, alpha = 0.4) +
scale_colour_manual(label = c("Loss", "Win"), values = c("TRUE" = "#007A33", "FALSE" = "#BA9653")) +
theme_bw()
```

5. Add appropriate labels to your plot and edit the legend position.

```{r eval = FALSE}
ggplot(data = ???) +
geom_???(
aes(x = ???, y = ???, col = ???),
size = ???, alpha = ???
) +
theme_???() +
labs(title = ???, subtitle = ???, x = ???, y = ???)
```

```{r echo = show_solutions, eval = show_solutions}
ggplot(data = boston_celtics) +
geom_point(aes(x = game_date, y = team_score, col = team_winner), size = 0.75, alpha = 0.4) +
scale_colour_manual(label = c("Loss", "Win"), values = c("TRUE" = "#007A33", "FALSE" = "#BA9653")) +
theme_bw() +
labs(title = "Boston Celtics Team Score",
subtitle = "2021 - Current Season",
x = "Date",
y = "Points")
```
6. Move the legend on the plot to increase the data-density. *(Note the solutions show a more advanced way to move the legend that the lecture notes.)*

```{r eval = FALSE}
ggplot(data = ???) +
geom_???(
aes(x = ???, y = ???, col = ???),
size = ???, alpha = ???
) +
theme_???() +
labs(title = ???, subtitle = ???, x = ???, y = ???)
theme(legend.position = ???)
```

```{r echo = show_solutions, eval = show_solutions}
ggplot(data = boston_celtics) +
geom_point(aes(x = game_date, y = team_score, col = team_winner), size = 0.75, alpha = 0.4) +
scale_colour_manual(label = c("Loss", "Win"), values = c("TRUE" = "#007A33", "FALSE" = "#BA9653")) +
theme_bw() +
labs(title = "Boston Celtics Team Score",
subtitle = "2021 - Current Season",
x = "Date",
y = "Points") +
theme(legend.position = c(0.8, 1.1), legend.title = element_blank(), legend.direction = "horizontal")
```
### Exercise 2

Now we've mastered the basic layers, take some time to creating some other visualisations that are interesting to you!

* You may like to produce plots looking at the distribution of some of the key variables. For example, `assists`, `rebounds`, `steals` etc.

```{r echo = show_solutions, eval = show_solutions}
boston_celtics |>
ggplot(aes(x = assists)) +
geom_histogram(aes(y = ..density..)) +
geom_density()
boston_celtics |>
ggplot(aes(x = steals)) +
geom_histogram(aes(y = ..density..)) +
geom_density()
boston_celtics |>
ggplot() +
geom_histogram(aes(x = total_rebounds), fill = "blue", alpha = 0.3) +
geom_histogram(aes(x = offensive_rebounds), fill = "green", alpha = 0.3) +
geom_histogram(aes(x = defensive_rebounds), fill = "orange", alpha = 0.3)
```

* Maybe you are interested in how the shot percentage varies between `field_goal_pct` (2 pts) and `three_point_field_goal_pct` (3 pts).

```{r echo = show_solutions, eval = show_solutions}
boston_celtics |>
filter(season == 2025) |>
ggplot() +
geom_point(aes(x = field_goal_pct, y = three_point_field_goal_pct))
```

* Perhaps you want to create a bar chart (`geom_bar`) to see how many times they've played opposing teams this season.

```{r echo = show_solutions, eval = show_solutions}
boston_celtics |>
filter(season == 2025) |>
ggplot() +
geom_bar(aes(y = opponent_team_name))
```
### Finishing Up

At the end of the tutorial, share your favourite plot to the discussion forum.

##### Material developed by Dr. Kate Saunders.

##### © Copyright 2024 Monash University

0 comments on commit ef2cba9

Please sign in to comment.