---
output:
md_document:
variant: markdown_github
---
# MLDay18: Random Forests and Gradient Boosting Machines in R

Slides for **Machine Learning Day '18**. This talk provides an overview of the following topics, as well as some of their implementations in the R programming language:

* [Decision trees](https://en.wikipedia.org/wiki/Decision_tree_learning)
* [Random forests](https://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm)
* [Gradient boosting machines](https://projecteuclid.org/euclid.aos/1013203451)

[Launch slides](https://bgreenwell.github.io/MLDay18/MLDay18.html#1)

# Abstract

Good modeling tools should be universally applicable in classification and regression, have state-of-the-art accuracy, scale well to large data sets, and handle missing values effectively. Additionally, it would be nice for these tools to be able to automatically discover which variables are important, how they interact, and whether there are any novel cases or outliers. In this presentation, we discuss two such modeling tools: random forests and gradient boosting machines. The talk will cover a brief background of both methodologies (including decision trees) as well as various implementations of each in the R software environment for statistical computing. The pros and cons of each implementation will also be covered.
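As a quick taste of the kinds of R implementations covered in the talk, the (non-evaluated) sketch below fits a random forest and a gradient boosting machine on built-in data sets, assuming the `randomForest` and `gbm` packages are installed; the specific tuning values are illustrative, not recommendations:

```{r example, eval=FALSE}
library(randomForest)
library(gbm)

# Random forest for classification on the built-in iris data
set.seed(101)
rf <- randomForest(Species ~ ., data = iris, ntree = 500, importance = TRUE)
print(rf)       # OOB error estimate and confusion matrix
varImpPlot(rf)  # variable importance plot

# Gradient boosting machine for regression on the built-in mtcars data
set.seed(102)
bst <- gbm(mpg ~ ., data = mtcars, distribution = "gaussian",
           n.trees = 1000, interaction.depth = 2, shrinkage = 0.01,
           cv.folds = 5)
best_iter <- gbm.perf(bst, method = "cv")  # CV-chosen number of trees
summary(bst, n.trees = best_iter)          # relative influence of predictors
```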
```{r img, echo=FALSE, fig.align="center", out.width="80%"}
knitr::include_graphics("docs/figures/MLDay18.jpg")
```