Personnal coding rules in R

Note: the following post is supposed to be updated regularly

Style guides

Some references are unavoidable, the Hadley’s style guide, the Google’s R style guide, and also The R Inferno by Patrick Burns

As Hadley wrote, the most important is to have a consistent and clear style.

Notations

Starting with variable, choices doesn’t really matters.

I use this style of notation:

# GOOD
variableName # camelCasing style
variable_Name

# Not that good (personnal taste)
variable.Name

# BAD
variablename

I like the camelCasing when the capital letters look like the humps on a camel’s back (e.g. camelBack).

I also like to write the name of the variable in first and then it’s specification:

# GOOD
kk = 1:5
kk_max = max(kk)
kk_sd = sd(kk)
kk_log10Mean = log10(mean(kk))

# BAD (personnal taste)
max_kk = max(kk)
sd_kk = sd(kk)
log10Mean_kk = log10(mean(kk))

So then, variables have the follwing look:

# GOOD
frenchMen_weight <- c(80, 84, 74)
frenchMen_size <- c(180, 175, 176)
frenchWomen_weight <- c(NA, 75, 65)
frenchWomen_size <- c(160, 177, 167)

frenchMen_sizeMean <- mean(frenchMen_size)
frenchMen_sizeSD <- sd(frenchMen_size)

Don’t do

Some letters are use in R, like T which is used for the boolean TRUE, therefore, do not use as variable:

# BAD
T
F
TRUE
FALSE

# GOOD
Temperature ; Total
Fox ; Flower

name of functions

I do not have a definitive point of view for naming function. Of course, I do not use name already used:

# BAD
c() # vector
lm() # linear model

And I use S3 dispatch when necessary, with the generic functions plot.myClass, print.myClass, summary.myClass, …

Look at the difference for Figures ?? and See Figure 1

a <- data.frame(col1 = 1:10, col2 = 1:10)

plot(a)

class(a) <- "myClass"

plot.myClass = function(x) plot(x[[1]] ~ x[[2]], type = "l")

plot(a)
Result of S3 dispatch on class myClass.

Figure 1: Result of S3 dispatch on class myClass.

Assignement operator: <- or =

Big question in R: revolutionanalytics? stackoverflow

Following the answer on stackoverflow by Richie Cotton, lets

Let’s take this function: median(x, na.rm = FALSE, ...), and look at all of these codes:

median(x <- 1:10)
median(x = 1:10)

# but:
median(a <- 1:10)
median(a = 1:10)

# and also:
b
median(b <- 1:10)
b

Another thing to test to understand difference between = and <-

a <- b = 5
a = b = 5 ; a ; b
a = 5 = b
6 -> c ; c
c <- 6 -> d ; c ; d

When you create a data.frame, or a list, you have to use = inside, while <- does not return an error:

myDf1 <- data.frame(e <- 6) ; myDf1
##   e....6
## 1      6
myDf2 <- data.frame(e = 6) ; myDf2
##   e
## 1 6
myList1 <- list(e <- 6) ; myList1
## [[1]]
## [1] 6
myList2 <- list(e = 6) ; myList2
## $e
## [1] 6

Finally, just use space in a write way to avoid the following mistake:

a <- 3
a< -3
## [1] FALSE

Finally, I use <- operator for assignement, and = inside functions.

Spacing

Spacing is used to have a clear code. * Spaces around assignements: a = b, a <- b * A space after a coma: myMatrix[2, 2] * No space in brackets: c(1, 4, 9) * Change of line every 80 characters * Change of line when repetting an assignement

# GOOD
myList <- list(a = 1,
               b = "fourchette",
               c = letters[1:10],
               d = 1:10,
               e = matrix(1:9, ncol=3),
               f = "couteau",
               g = "chocolat")

myList[[5]][2, 2]
## [1] 5
## GOOD but, I would put this long text in a separate file, and then load it
text_Miserable_Review <- "Tant qu’il existera, par le fait des lois et des
moeurs, une damnation sociale créant artificiellement, en pleine civilisation,
des enfers, et compliquant d’une fatalité humaine la destinée qui est divine ;
tant que les trois problèmes du siècle, la dégradation de l’homme par le
prolétariat, la déchéance de la femme par la faim, l’atrophie de l’enfant par
la nuit, ne seront pas résolus ; tant que, dans de certaines régions,
l’asphyxie sociale sera possible ; en d’autres termes, et à un point de vue
plus étendu encore, tant qu’il y aura sur la terre ignorance et misère, des
livres de la nature de celui-ci pourront ne pas être inutiles."

# BAD
myList <- list(a = 1, b = "fourchette", c = letters[1:10], d = 1:10, e = matrix(1:9, ncol=3), f = "couteau", g = "chocolat")

text_Miserable_Review <- "Tant qu’il existera, par le fait des lois et des moeurs, une damnation sociale créant artificiellement, en pleine civilisation, des enfers, et compliquant d’une fatalité humaine la destinée qui est divine ; tant que les trois problèmes du siècle, la dégradation de l’homme par le prolétariat, la déchéance de la femme par la faim, l’atrophie de l’enfant par la nuit, ne seront pas résolus ; tant que, dans de certaines régions, l’asphyxie sociale sera possible ; en d’autres termes, et à un point de vue plus étendu encore, tant qu’il y aura sur la terre ignorance et misère, des livres de la nature de celui-ci pourront ne pas être inutiles."

# Very BAD
myList<-list(a=1,b="fourchette",c=letters[1:10],d=1:10,e=matrix(1:9,ncol=3),f="couteau",g="chocolat")

ifelse what else

# GOOD
if (y == 0) {
  log(x)
} else {
  y ^ x
}

# GOOD if less than 80 characters
if (y == 0) log(x)

# GOOD also
if (y == 0) {
  log(x)
} else stop("error message")

Functions

  • Deprecated: deprecate()
  • Errors: stop()
  • Warnings: warning()
  • Return: return()

Boolean

From this answer in stackexchange

Difference between & and &&

((-2:2) >= 0) & ((-2:2) <= 0)
## [1] FALSE FALSE  TRUE FALSE FALSE
((-2:2) >= 0) && ((-2:2) <= 0)
## [1] FALSE

Difference between | and ||

((-5:-1) >= 0) | ((-2:2) <= 0)
## [1]  TRUE  TRUE  TRUE FALSE FALSE
((-5:-1) >= 0) || ((-2:2) <= 0)
## [1] TRUE
a
# Erreur : objet 'a' introuvable
TRUE || a
# [1] TRUE
FALSE && a
# [1] FALSE
TRUE | a
# Erreur : objet 'a' introuvable
FALSE & a
# Erreur : objet 'a' introuvable