---
title: Style guide
layout: default
output: oldbookdown::html_chapter
---
# Style guide {#style}
Good coding style is like using correct punctuation. You can manage without it, but it sure makes things easier to read. As with styles of punctuation, there are many possible variations. The following guide describes the style that I use (in this book and elsewhere). It is based on Google's [R style guide][1], with a few tweaks. You don't have to use my style, but you really should use a consistent style. \index{style guide} \index{code style}
Good style is important because while your code only has one author, it'll usually have multiple readers. This is especially true when you're writing code with others. In that case, it's a good idea to agree on a common style up-front. Since no style is strictly better than another, working with others may mean that you'll need to sacrifice some preferred aspects of your style.
The formatR package, by Yihui Xie, makes it easier to clean up poorly formatted code. It can't do everything, but it can quickly get your code from terrible to pretty good. Make sure to read [the notes](https://yihui.name/formatR/) before using it.
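As a rough sketch of what that looks like, the snippet below passes one badly formatted line to `formatR::tidy_source()`. The `messy` string is made up for illustration, and the call is guarded so that nothing happens if formatR isn't installed. The exact output depends on your version of formatR, so treat this as a starting point rather than a recipe.

```{r}
# A made-up line of badly formatted code
messy <- "average=mean(feet/12+inches,na.rm=TRUE)"

if (requireNamespace("formatR", quietly = TRUE)) {
  # arrow = TRUE asks formatR to rewrite `=` assignments as `<-`;
  # output = FALSE returns the result instead of printing it
  tidy <- formatR::tidy_source(text = messy, arrow = TRUE, output = FALSE)
  cat(tidy$text.tidy, sep = "\n")
}
```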
## Notation and naming {#style-notnam}
### File names
File names should be meaningful and end in `.R`.

    # Good
    fit_models.R
    utility_functions.R

    # Bad
    foo.r
    stuff.r
If files need to be run in sequence, prefix them with numbers:

    0_download.R
    1_parse.R
    2_explore.R
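One payoff of numbering files is that a short driver can run them in order. Here is a minimal sketch: `run_in_sequence()` is a hypothetical helper, not part of any package, and it assumes every script is named `<number>_<name>.R`.

```{r}
# Hypothetical helper: source every numbered script in `dir`, in order
run_in_sequence <- function(dir = ".") {
  scripts <- list.files(dir, pattern = "^[0-9]+_.*\\.R$", full.names = TRUE)

  # Sort by the numeric prefix so that 10_ comes after 2_
  prefix <- as.integer(sub("_.*$", "", basename(scripts)))
  scripts <- scripts[order(prefix)]

  for (script in scripts) {
    source(script)
  }

  # Return the names of the scripts that were run, in order
  invisible(basename(scripts))
}
```

Sorting on the numeric prefix, rather than alphabetically, is a deliberate choice here: a plain alphabetical sort would put `10_report.R` before `2_explore.R`.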
Pay attention to capitalization: you or your collaborators might be using an operating system with a case-insensitive file system (e.g., Microsoft Windows or macOS), which can cause problems with case-sensitive revision control systems. Never use file names that differ only in capitalization.
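If you want to check for this mechanically, something like the following sketch would do. `find_case_clashes()` is a hypothetical helper, and it assumes you pass it the file names from a single directory, for example the result of `list.files()`.

```{r}
# Hypothetical helper: which file names differ only in capitalization?
find_case_clashes <- function(paths) {
  lowered <- tolower(paths)
  # Lower-cased names that occur more than once
  clashing <- unique(lowered[duplicated(lowered)])
  paths[lowered %in% clashing]
}

find_case_clashes(c("fit_models.R", "Fit_Models.R", "utils.R"))
```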
### Object names
> "There are only two hard things in Computer Science: cache invalidation and
> naming things."
>
> --- Phil Karlton
Variable and function names should be lowercase. Use an underscore (`_`) to separate words within a name. Generally, variable names should be nouns and function names should be verbs. Strive for names that are concise and meaningful (this is not easy!).
Although base R uses dots extensively in function names (`contrib.url()`, `all.equal()`) and class names (`data.frame`), it's better to use underscores. Dots already have a meaning in S3: the name of a method is the generic and the class joined by a dot, as in `generic.class`. When the generic or the class also contains dots, the result is hard to parse, as in `as.data.frame.data.frame()`, whereas something like `print.my_class()` is unambiguous.
```{r, eval = FALSE}
# Good
day_one
day_1
# Bad
first_day_of_the_month
DayOne
dayone
djm1
```
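To see why the dot in a method name matters, here is a small sketch of S3 dispatch. The class `my_class` and the constructor `new_my_class()` are hypothetical names chosen for illustration.

```{r}
# Hypothetical constructor: underscores in the class name, no dots
new_my_class <- function(x) {
  structure(list(value = x), class = "my_class")
}

# The only dot separates the generic (print) from the class (my_class)
print.my_class <- function(x, ...) {
  cat("<my_class: ", x$value, ">\n", sep = "")
  invisible(x)
}

obj <- new_my_class(1)
print(obj)
```

Because neither the generic nor the class contains a dot, there is only one way to read `print.my_class`.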
Where possible, avoid reusing the names of existing functions and variables. Doing so will confuse the readers of your code.
```{r, eval = FALSE}
# Bad
T <- FALSE
c <- 10
mean <- function(x) sum(x)
```
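The confusion is easy to demonstrate. In this sketch the redefinition of `mean` is wrapped in `local()` so that it doesn't leak into the rest of your session; the numbers are arbitrary.

```{r}
results <- local({
  # Masks base::mean() inside this block only
  mean <- function(x) sum(x)

  c(
    masked = mean(c(1, 2, 3)),       # calls the local definition
    base   = base::mean(c(1, 2, 3))  # explicitly calls the base version
  )
})

results
```

The two calls look almost identical but give different answers, which is exactly the kind of surprise you want to spare your readers.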
## Syntax {#style-syn}
### Spacing
Place spaces around all infix operators (`=`, `+`, `-`, `<-`, etc.). The same rule applies when using `=` in function calls. Always put a space after a comma, and never before (just like in regular English).
```{r, eval = FALSE}
# Good
average <- mean(feet / 12 + inches, na.rm = TRUE)
# Bad
average<-mean(feet/12+inches,na.rm=TRUE)
```
There's a small exception to this rule: `:`, `::` and `:::` don't need spaces around them.
```{r, eval = FALSE}
# Good
x <- 1:10
base::get
# Bad
x <- 1 : 10
base :: get
```
Place a space before left parentheses, except in a function call.
```{r, eval = FALSE}
# Good
if (debug) show(x)
plot(x, y)
# Bad
if(debug)show(x)
plot (x, y)
```
Extra spacing (i.e., more than one space in a row) is ok if it improves alignment of equal signs or assignments (`<-`).
```{r, eval = FALSE}
list(
  total = a + b + c,
  mean  = (a + b + c) / n
)
```
Do not place spaces around code in parentheses or square brackets (unless there's a comma, in which case see above).
```{r, eval = FALSE}
# Good
if (debug) do(x)
diamonds[5, ]
# Bad
if ( debug ) do(x) # No spaces around debug
x[1,] # Needs a space after the comma
x[1 ,] # Space goes after comma not before
```
### Curly braces
An opening curly brace should never go on its own line and should always be followed by a new line. A closing curly brace should always go on its own line, unless it's followed by `else` (or a closing parenthesis).
Always indent the code inside curly braces. When indenting your code, use two spaces. Never use tabs or mix tabs and spaces.
```{r, eval = FALSE}
# Good
if (y < 0 && debug) {
  message("y is negative")
}

if (y == 0) {
  if (x > 0) {
    log(x)
  } else {
    message("x is negative or zero")
  }
} else {
  y ^ x
}

# Bad
if (y < 0 && debug)
message("Y is negative")
if (y == 0)
{
if (x > 0) {
⇥ log(x)
} else {
⇥ message("x is negative or zero")
}
}
else { y ^ x }
```
It's ok to leave very short statements on the same line:
```{r, eval = FALSE}
if (y < 0 && debug) message("Y is negative")
```
### Pipes
If you use the `%>%` operator from the tidyverse, put each verb on its own line. This makes it simpler to rearrange them later, and makes it harder to overlook a step. It is ok to keep a one-step pipe on one line.
```{r, eval = FALSE}
# Good
iris %>%
  group_by(Species) %>%
  summarize_all(mean) %>%
  ungroup() %>%
  gather(measure, value, -Species) %>%
  arrange(value)

iris %>% arrange(Petal.Width)

# Bad
iris %>% group_by(Species) %>% summarize_all(mean) %>%
ungroup() %>% gather(measure, value, -Species) %>%
arrange(value)
```
### Line length
Strive to limit your code to 80 characters per line. This fits comfortably on a printed page with a reasonably sized font. If you find yourself running out of room, this is a good indication that you should encapsulate some of the work in a separate function.
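If you'd like to find the offending lines in an existing file, a few lines of base R are enough. This is only a sketch: `find_long_lines()` is a hypothetical helper, and the `code` vector stands in for what you would normally get from `readLines()`.

```{r}
# Hypothetical helper: indices of lines longer than `limit` characters
find_long_lines <- function(lines, limit = 80) {
  which(nchar(lines, type = "chars") > limit)
}

# Stand-in for readLines("some_file.R"); the second line has 81 characters
code <- c("x <- 1", strrep("a", 81), "y <- 2")
find_long_lines(code)
```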
### Indentation
If a function definition runs over multiple lines, indent the second and later lines so that they line up with the first argument.
```{r, eval = FALSE}
# Good
long_function_name <- function(a = "a long argument",
                               b = "another argument",
                               c = "another long argument") {
  # As usual code is indented by two spaces.
}

# Bad
long_function_name <- function(a = "a long argument",
  b = "another argument",
  c = "another long argument") {
}
```
If a function call is too long, put the function name, each argument, and the closing parenthesis on a separate line. This makes the code easier to read and to change later. You may also place several arguments on the same line if they are closely related to each other, e.g., strings in calls to `paste()` or `stop()`:
```{r, eval = FALSE}
# Good
do_something_very_complicated(
  "that",
  requires = many,
  arguments = "some of which may be long"
)
paste0(
  "Requirement: ", requires, "\n",
  "Result: ", result, "\n"
)

# Bad
do_something_very_complicated("that", requires, many, arguments,
                              "some of which may be long"
                              )
paste0(
  "Requirement: ", requires,
  "\n", "Result: ",
  result, "\n")
```
### Assignment
Use `<-`, not `=`, for assignment. \index{assignment}
```{r}
# Good
x <- 5
# Bad
x = 5
```
### Quotes
Use `"`, not `'`, for quoting text. The only exception is when the text already contains double quotes and no single quotes.
```{r}
# Good
"Text"
'Text with "quotes"'
# Bad
'Text'
'Text with "double" and \'single\' quotes'
```
## Functions {#style-fun}
* Function names should be verbs, where possible.
* Only use `return()` for early returns.
* Strive to keep blocks within a function on one screen, so around
  20-30 lines maximum. Some even argue that if a *function* has 20 lines it
  should be split into smaller functions.
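The rule about `return()` is easiest to see in a small example. `log_positive()` below is a hypothetical function written only to illustrate the pattern.

```{r}
# Hypothetical function: log of the positive values, NA elsewhere
log_positive <- function(x) {
  # Early return: nothing to do, so leave straight away
  if (length(x) == 0) {
    return(numeric())
  }

  x[x <= 0] <- NA

  # The last expression is the result; no return() needed
  log(x)
}

log_positive(c(1, -1))
```

Reserving `return()` for early exits means that every `return()` a reader sees signals that the function is leaving before its end.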
## Organisation {#style-org}
### Commenting guidelines
Comment your code. Each line of a comment should begin with the comment symbol and a single space: `# `. Comments should explain the why, not the what. \index{comments}
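Here is a sketch of the difference between a "what" comment and a "why" comment. The temperatures and the reference data mentioned in the comment are made up for illustration.

```{r}
temps_f <- c(68, 72.5, 75)

# Bad: repeats what the code already says
# Subtract 32, then multiply by 5 / 9
temps_c <- (temps_f - 32) * 5 / 9

# Good: records why the conversion is needed
# The (hypothetical) reference data is in Celsius, so convert before comparing
temps_c <- (temps_f - 32) * 5 / 9
```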
Use commented lines of `-` and `=` to break up your file into easily readable chunks.
```{r, eval = FALSE}
# Load data ---------------------------
# Plot data ---------------------------
```
[1]: http://google-styleguide.googlecode.com/svn/trunk/google-r-style.html