Course 2 - R Programming - Week 3 - Notes

Greg Foletta

2019-09-26

Loop Functions

  • lapply() - loop over a list, applying a function to each element.
  • sapply() - same as lapply() but tries to simplify the result.
  • apply() - apply a function over the margins of an array.
  • tapply() - apply a function over subsets of a vector.
  • mapply() - multivariate version of lapply().

split() is also useful in conjunction with lapply().

lapply

Takes a list, a function, and further arguments (...) that are passed on to the function.

If the first argument is not a list, it is coerced into a list with as.list(). The return value is always a list.

## $a
## [1] 10.5
## 
## $b
## [1] -0.08369768

## [[1]]
## [1] 98.2802
## 
## [[2]]
## [1]  99.65505 100.94736
## 
## [[3]]
## [1] 102.7982 100.9787 100.3612
## 
## [[4]]
## [1] 100.62897  99.85544  98.85271  99.18102
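
The code that produced the output above is not preserved in these notes. A minimal sketch of the two patterns described, using made-up inputs (the list x, the seed and the choice of rnorm() are assumptions, so the numbers will differ from those shown):

```r
set.seed(1)  # assumed seed, only so the sketch is reproducible

# A named list: lapply() applies mean() to each element and returns a list
# with the same names.
x <- list(a = 1:20, b = rnorm(10))
means <- lapply(x, mean)

# A non-list first argument (1:4) is coerced with as.list().
# Extra arguments (here mean = 100) are passed on to rnorm() via '...'.
draws <- lapply(1:4, rnorm, mean = 100)

# One draw for the element 1, two draws for the element 2, and so on.
n_draws <- lengths(draws)
```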

sapply

Tries to simplify the result of lapply() where possible.

  • If the result is a list where every element has length 1, a vector is returned.
  • If the result is a list where every element is a vector of the same length (> 1), a matrix is returned.
  • Otherwise a list is returned.

##   a   b 
## 1.5 9.5

##      a  b
## [1,] 1  9
## [2,] 2 10
## [3,] 3 11
## [4,] 4 12
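
The three simplification rules can be sketched as follows. The input list x is an assumption chosen for illustration, not the data behind the output above:

```r
x <- list(a = 1:4, b = 9:12)

# Every result has length 1: simplified to a named vector.
v <- sapply(x, mean)

# Every result has the same length (2): simplified to a matrix
# with one column per list element.
m <- sapply(x, range)

# Results have different lengths (1, 2 and 3): no simplification is
# possible, so a list is returned.
l <- sapply(1:3, seq_len)
```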

apply

Used to evaluate a function over the margins of an array.

  • Most often used to apply a function to the rows or columns of a matrix.
  • Can be used with general arrays e.g. taking the average of an array of matrices.
  • Not faster than a hand-written loop, but more concise: it fits on one line.

## [1] -0.55420076 -1.09144455  0.15551881  0.01448826

## [1] -0.1854239 -0.5523952

##              [,1]        [,2]
## [1,]  0.222265026 -1.33066655
## [2,] -1.040484031 -1.14240508
## [3,]  0.083674911  0.22736271
## [4,] -0.007151469  0.03612798

A margin of c(1, 2) is used with arrays of more than two dimensions: it keeps the first two dimensions and applies the function across the remaining ones.

##              [,1]       [,2]
## [1,]  0.002001498 -0.2832860
## [2,] -0.570540246 -0.3387832

mapply

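Multivariate apply - applies a function in parallel over a set of arguments, i.e. to the first elements of each argument, then the second elements, and so on (not concurrently).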

## [[1]]
## [1] 1 1
## 
## [[2]]
## [1] 2 2 2 2 2 2
## 
## [[3]]
## [1] 3 3
## 
## [[4]]
## [1] 4 4 4 4 4 4

Can be used to vectorise a function that does not itself accept vector arguments.
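
A call of the following form gives a list shaped like the output above; the original chunk is not preserved, so the exact arguments are an assumption. The second half sketches vectorisation with a hypothetical helper, count_multiples():

```r
# Equivalent to list(rep(1, 2), rep(2, 6), rep(3, 2), rep(4, 6)):
# the i-th call takes the i-th element of each argument.
reps <- mapply(rep, 1:4, c(2, 6, 2, 6))

# count_multiples() is a hypothetical helper: how many multiples of 'by'
# lie between 'by' and 'n'. seq() expects scalar 'to' and 'by' here,
# so the function cannot be called on vectors directly.
count_multiples <- function(n, by) length(seq(by, n, by = by))

# mapply() calls it once per pair of arguments and simplifies the result.
counts <- mapply(count_multiples, c(10, 20, 30), c(2, 5, 10))
```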

tapply

Applies a function over subsets of a vector, where the subsets are defined by a factor (or a list of factors) of the same length.

##           1           2           3 
## -0.01922789 10.01989471 29.97146336
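
A minimal sketch with deterministic data; the vector x and the grouping factor f are assumptions, not the random values behind the output above:

```r
x <- 1:30
f <- gl(3, 10)  # factor with levels 1, 2, 3, each repeated 10 times

# One mean per level of f, returned as a named one-dimensional array.
group_means <- tapply(x, f, mean)

# When the function returns more than one value per group,
# the result cannot be simplified and is a list (one element per level).
group_ranges <- tapply(x, f, range)
```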

split

Takes a vector or other object and splits it into groups determined by a factor or a list of factors. The result is always a list, with one element per group.

## $`1`
## [1] 1 2 3 4
## 
## $`2`
## [1] 5 6 7 8
## 
## $`3`
## [1]  9 10 11 12
## 
## $`4`
## [1] 13 14 15 16

The result of split() can then be passed to lapply() or sapply().

## $`1`
## [1] 2.5
## 
## $`2`
## [1] 6.5
## 
## $`3`
## [1] 10.5
## 
## $`4`
## [1] 14.5
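
The two outputs above are consistent with splitting 1:16 by a four-level factor and taking the mean of each group. The original chunk is not preserved, so the following is a reconstruction under that assumption:

```r
x <- 1:16
f <- gl(4, 4)  # factor with levels 1 to 4, each repeated 4 times

# Named list with one element per level of f.
groups <- split(x, f)

# A list of group means, and the same values simplified to a named vector.
means_list <- lapply(groups, mean)
means_vec <- sapply(groups, mean)
```

For a single summary value per group, tapply(x, f, mean) gives the same numbers in one step.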

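split() also works on data frames, e.g. splitting the built-in airquality data by month and taking column means within each month.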

## $`5`
##     Ozone   Solar.R 
##  23.61538 181.29630 
## 
## $`6`
##     Ozone   Solar.R 
##  29.44444 190.16667 
## 
## $`7`
##     Ozone   Solar.R 
##  59.11538 216.48387 
## 
## $`8`
##     Ozone   Solar.R 
##  59.96154 171.85714 
## 
## $`9`
##     Ozone   Solar.R 
##  31.44828 167.43333

With sapply(), because the returned values all have the same length, the results are simplified into a matrix:

##                 5         6         7         8         9
## Ozone    23.61538  29.44444  59.11538  59.96154  31.44828
## Solar.R 181.29630 190.16667 216.48387 171.85714 167.43333
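
The column and group names in the output above match the built-in airquality data set split by Month. A sketch of calls that would give results of that shape; month_means() is a hypothetical helper, and ignoring missing values with na.rm = TRUE is an assumption:

```r
# airquality ships with R (datasets package): daily readings, May to September.
by_month <- split(airquality, airquality$Month)

# Hypothetical helper: column means of two variables, ignoring missing values.
month_means <- function(df) {
  colMeans(df[, c("Ozone", "Solar.R")], na.rm = TRUE)
}

# A list of named vectors, one per month.
as_list <- lapply(by_month, month_means)

# The same values as a 2 x 5 matrix, one column per month.
as_matrix <- sapply(by_month, month_means)
```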

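You can split on more than one factor at once - e.g. sex (male / female) and eye colour - by passing a list of factors; split() combines them with interaction(). Combinations with no observations appear as empty groups; use drop = TRUE to drop them.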

## [1] 1.1 1.1 1.2 2.2 2.3 2.3
## Levels: 1.1 2.1 1.2 2.2 1.3 2.3

## $`1.1`
## [1]  1  2  7  8 13 14
## 
## $`2.1`
## integer(0)
## 
## $`1.2`
## [1]  3  9 15
## 
## $`2.2`
## [1]  4 10 16
## 
## $`1.3`
## integer(0)
## 
## $`2.3`
## [1]  5  6 11 12 17 18
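
The groups above are consistent with the following factors. This is a reconstruction, since the original chunk is not preserved, and the names f1 and f2 are assumptions. The last call shows the effect of drop = TRUE:

```r
x <- 1:18
f1 <- gl(2, 3, 18)  # 1 1 1 2 2 2, repeated to length 18
f2 <- gl(3, 2, 18)  # 1 1 2 2 3 3, repeated to length 18

# The combined factor that split() builds from a list of factors.
combined <- interaction(f1, f2)

# All six combinations, including the two that have no observations.
all_groups <- split(x, list(f1, f2))

# Only the four combinations that actually occur.
kept <- split(x, list(f1, f2), drop = TRUE)
```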

Debugging

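  • message: a generic notification, produced by message(); execution continues.
  • warning: an indication that something is wrong but not fatal, produced by warning(); execution continues.
  • error: an indication that a fatal problem has occurred, produced by stop(); execution stops.
  • condition: a generic concept for indicating that something unexpected can occur; programmers can create their own conditions.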
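
A small sketch of the three built-in condition types and of catching them with tryCatch(). The function check_positive() is hypothetical and exists only for illustration:

```r
# check_positive() is a hypothetical function that can raise each condition.
check_positive <- function(x) {
  if (!is.numeric(x)) stop("x must be numeric")        # error: execution stops
  if (anyNA(x)) warning("x contains missing values")   # warning: not fatal
  message("checking ", length(x), " value(s)")         # message: notification
  x > 0
}

# tryCatch() intercepts a condition and returns the handler's value instead.
err <- tryCatch(check_positive("a"),
                error = function(e) conditionMessage(e))
wrn <- tryCatch(check_positive(c(1, NA)),
                warning = function(w) conditionMessage(w))

# With no error or warning the function returns normally;
# suppressMessages() hides the notification.
ok <- suppressMessages(check_positive(c(-1, 2)))
```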

Aside: to return a value from a function without it being auto-printed at the console, wrap the return value in invisible().
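
A minimal sketch of invisible(); set_threshold() is a hypothetical function used only for illustration:

```r
# set_threshold() is a hypothetical function that returns its value invisibly.
set_threshold <- function(value) {
  invisible(value)
}

set_threshold(10)         # nothing is printed at the console
thr <- set_threshold(10)  # the value is still returned and can be assigned

# withVisible() reports whether a result would have been auto-printed.
shown <- withVisible(set_threshold(10))$visible
```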

Tools

  • traceback() - prints the function call stack after an error occurs.
  • debug() - flags a function for “debug” mode, which lets you step through it one line at a time.
  • browser() - suspends execution wherever it is called and puts the function in debug mode.
  • trace() - lets you insert debugging code into a function without editing the function itself.
  • recover() - lets you modify the error behaviour so that you can browse the function call stack.
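
Most of these tools are interactive, so the sketch below only sets and clears the debug flag in code and shows the console steps as comments. safe_log() is a hypothetical function used only for illustration:

```r
# safe_log() is a hypothetical function used only to illustrate the tools.
safe_log <- function(x) {
  if (x <= 0) stop("x must be positive")
  log(x)
}

debug(safe_log)                  # flag the function for debug mode
flagged <- isdebugged(safe_log)  # TRUE while the flag is set
undebug(safe_log)                # remove the flag again

# At an interactive console (not run here):
#   safe_log(-1)               # raises the error
#   traceback()                # shows the calls that led to the error
#   options(error = recover)   # on the next error, offer the call stack to browse
#   options(error = NULL)      # restore the default error behaviour
# Placing a call to browser() inside safe_log() would pause execution
# at that line whenever the function runs interactively.
```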