Chapter 9 ■ Advanced R Programming

There are a few additional parameters to the Reduce function—to give it an additional initial value instead of just the leftmost elements in the first function call, or to make it apply the function from right to left instead of left to right—but you can check its documentation for details.
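
As a small sketch of those extra parameters (the subtraction example and the variable names are only illustrative choices, not taken from the text), subtraction makes the direction of the reduction visible because it is not associative:

```r
# Default: left to right, starting from the two leftmost elements.
left <- Reduce(`-`, 1:4)                    # ((1 - 2) - 3) - 4

# right = TRUE folds from the right instead.
right <- Reduce(`-`, 1:4, right = TRUE)     # 1 - (2 - (3 - 4))

# The third argument, init, is used as the starting value.
with_init <- Reduce(`-`, 1:4, 10)           # (((10 - 1) - 2) - 3) - 4

# accumulate = TRUE keeps the intermediate results as well.
running <- Reduce(`+`, 1:4, accumulate = TRUE)

left       # -8
right      # -2
with_init  # 0
running    # 1 3 6 10
```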

Function Operations: Functions as Input and Output

Functions can, of course, take functions as input and return functions as output.

This lets you modify functions and create new functions from existing functions.

First, consider two old friends, the factorial and the Fibonacci numbers. You have computed those recursively and using tables. What if you could build a generic function for caching results?

Here is an attempt:

cached <- function(f) {
  force(f)
  table <- list()
  function(n) {
    key <- as.character(n)
    if (key %in% names(table)) {
      print(paste("I have already computed the value for", n))
      table[[key]]
    } else {
      print(paste("Going to compute the value for", n))
      res <- f(n)
      print(paste("That turned out to be", res))
      table[key] <<- res
      print(table)
      res
    }
  }
}

I added some output so it is easier to see what it does below.

It takes a function f and gives you back another function that works like f but remembers the values it has already computed. First, it remembers what the input function was by forcing it. This is necessary for the way we intend to use this cached function. The plan is to replace the function in the global scope with a cached version, so that the recursive calls inside the original function refer to the cached version. If you don’t force f here, lazy evaluation means that f is not looked up until the cached function is first called. By then the global name refers to the cached version, so f would be the cached function itself, and you would end up in an infinite recursion. You can try removing the force(f) call and see what happens.
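
The effect of force can be seen in a smaller setting that does not recurse forever. The following is only a sketch; make_adder_lazy, make_adder_forced, and start are hypothetical names introduced for illustration.

```r
# Without force(): x stays an unevaluated promise until the closure first uses it.
make_adder_lazy <- function(x) {
  function(y) x + y
}

# With force(): x is evaluated when make_adder_forced is called.
make_adder_forced <- function(x) {
  force(x)
  function(y) x + y
}

start <- 1
add_lazy <- make_adder_lazy(start)
add_forced <- make_adder_forced(start)

start <- 100  # rebind the name before either closure has been called

lazy_result <- add_lazy(1)      # the promise is evaluated now and sees start == 100
forced_result <- add_forced(1)  # the value 1 was captured when the closure was made

lazy_result    # 101
forced_result  # 2
```

The same thing happens with cached: between creating the closure and calling it, the name of the function is rebound, and only the forced version still holds the original.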

Next, we create a table. We are using a list, which is a convenient choice for tables in R. A list lets us use strings as indices, so you don’t need to store all values between 1 and n just to have an element with key n in the table.
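
A short sketch of the difference, assuming nothing beyond base R (tbl and v are hypothetical names): storing one value under a string key in a list takes one element, while storing it at a numeric position in a vector fills in every position before it.

```r
# A list indexed by a string key holds only the entries you put in.
tbl <- list()
tbl[["1000"]] <- 42
list_length <- length(tbl)             # one element
has_key <- "1000" %in% names(tbl)      # the key is there
has_other <- "999" %in% names(tbl)     # no other key was created

# A vector indexed by position is padded with NA up to that position.
v <- numeric(0)
v[1000] <- 42
vector_length <- length(v)

list_length    # 1
has_key        # TRUE
has_other      # FALSE
vector_length  # 1000
```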

The rest of the code builds a function that first looks in the table to see if the key is there. If it is, you have already computed the value you want and can get it from the table. If the key is not there, you compute it, put it in the table, and return.

You can try it out on the factorial function:

factorial <- function(n) {
  if (n == 1) {
    1
  } else {
    n * factorial(n - 1)
  }
}

factorial <- cached(factorial)
factorial(4)
## [1] "Going to compute the value for 4"
## [1] "Going to compute the value for 3"
## [1] "Going to compute the value for 2"
## [1] "Going to compute the value for 1"
## [1] "That turned out to be 1"
## $`1`
## [1] 1
##
## [1] "That turned out to be 2"
## $`1`
## [1] 1
##
## $`2`
## [1] 2
##
## [1] "That turned out to be 6"
## $`1`
## [1] 1
##
## $`2`
## [1] 2
##
## $`3`
## [1] 6
##
## [1] "That turned out to be 24"
## $`1`
## [1] 1
##
## $`2`
## [1] 2
##
## $`3`
## [1] 6
##
## $`4`
## [1] 24
## [1] 24

factorial(1)
## [1] "I have already computed the value for 1"
## [1] 1
factorial(2)
## [1] "I have already computed the value for 2"
## [1] 2
factorial(3)
## [1] "I have already computed the value for 3"
## [1] 6
factorial(4)
## [1] "I have already computed the value for 4"
## [1] 24

And on fibonacci:

fibonacci <- function(n) {
  if (n == 1 || n == 2) {
    1
  } else {
    fibonacci(n-1) + fibonacci(n-2)
  }
}

fibonacci <- cached(fibonacci)
fibonacci(4)
## [1] "Going to compute the value for 4"
## [1] "Going to compute the value for 3"
## [1] "Going to compute the value for 2"
## [1] "That turned out to be 1"
## $`2`
## [1] 1
##
## [1] "Going to compute the value for 1"
## [1] "That turned out to be 1"
## $`2`
## [1] 1
##
## $`1`
## [1] 1
##
## [1] "That turned out to be 2"
## $`2`
## [1] 1
##
## $`1`
## [1] 1
##
## $`3`
## [1] 2
##
## [1] "I have already computed the value for 2"
## [1] "That turned out to be 3"
## $`2`
## [1] 1
##
## $`1`
## [1] 1
##
## $`3`
## [1] 2
##
## $`4`
## [1] 3
## [1] 3
fibonacci(1)
## [1] "I have already computed the value for 1"
## [1] 1
fibonacci(2)
## [1] "I have already computed the value for 2"
## [1] 1
fibonacci(3)
## [1] "I have already computed the value for 3"
## [1] 2
fibonacci(4)
## [1] "I have already computed the value for 4"
## [1] 3
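
To get a feel for how much work the caching saves, you can count how often the underlying function body runs. The following is a sketch under a few assumptions: quiet_cached is a hypothetical variant of cached with the printing removed, and fib and calls are names introduced only for this illustration.

```r
# Same idea as cached, but without the diagnostic output.
quiet_cached <- function(f) {
  force(f)
  table <- list()
  function(n) {
    key <- as.character(n)
    if (!(key %in% names(table))) {
      res <- f(n)
      table[[key]] <<- res
    }
    table[[key]]
  }
}

# A Fibonacci function that counts how many times its body is evaluated.
calls <- 0
fib <- function(n) {
  calls <<- calls + 1
  if (n == 1 || n == 2) 1 else fib(n - 1) + fib(n - 2)
}

plain_value <- fib(20)
uncached_calls <- calls

# Replace fib with a cached version and count again.
calls <- 0
fib <- quiet_cached(fib)
cached_value <- fib(20)
cached_calls <- calls

uncached_calls  # 13529
cached_calls    # 20
```

With the cache in place, each value from 1 to 20 is computed exactly once, while the plain recursive version recomputes the smaller values over and over.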

Ellipsis Parameters

Before you see any more examples of function operations, you need to know about a special function parameter, the ellipsis or “three-dots” parameter.

This is a magical parameter that lets you write a function that can take any number of arguments, named or unnamed, and pass them on to other functions.

Without it, you would get an error if you provided a parameter to a function that it doesn’t know about.

f <- function(a, b) NULL
f(a = 1, b = 2, c = 3)
## Error in f(a = 1, b = 2, c = 3): unused argument (c = 3)

With it, you can provide any named parameter you want.

g <- function(a, b, ...) NULL
g(a = 1, b = 2, c = 3)
## NULL

Of course, it isn’t much of a feature to allow a function to take arguments that it doesn’t know what to do with. But you can pass those arguments to other functions that maybe do know what to do with them, and that is the purpose of the ... parameter.

You can see this in effect with a very simple function that just passes the ... parameter on to list. This works exactly like calling list directly with the same parameters, so nothing magical is going on here, but it shows how the named parameters are being passed along.

tolist <- function(...) list(...)

tolist()
## list()
tolist(a = 1)
## $a
## [1] 1
tolist(a = 1, b = 2)
## $a
## [1] 1
##
## $b
## [1] 2

This parameter has some uses in itself because it lets you write a function that calls other functions, and you can provide those functions parameters without explicitly passing them along. It is particularly important for generic functions (a topic we cover in the next chapter) and for modifying functions in function operators.
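
As a minimal sketch of passing arguments along, here is a hypothetical wrapper called center that never mentions na.rm itself but forwards it to mean through the ... parameter:

```r
# center does not declare na.rm; anything extra is handed on to mean.
center <- function(x, ...) mean(x, ...)

x <- c(1, 2, NA, 6)
with_na <- center(x)                   # mean sees the NA and returns NA
without_na <- center(x, na.rm = TRUE)  # na.rm travels through ... to mean

with_na     # NA
without_na  # 3
```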

Most of what you can do with function operators is beyond the scope of this book, so if you are interested in learning more, you should check out the chapter about them in Hadley Wickham’s Advanced R Programming book (see http://adv-r.had.co.nz/Function-operators.html).

Here we will just have a quick second example, taken from Advanced R Programming, that modifies a function. It wraps a function to time how long it takes to run.

The following function wraps the function f into a function that times it and returns the time usage rather than the result of the function. It will work for any function since it just passes all parameters from the closure we create to the function we wrap (although the error profile will be different since the wrapping function will accept any named parameter while the original function f might not allow that).

time_it <- function(f) {
  force(f)
  function(...) {
    system.time(f(...))
  }
}

You can try it out like this:

ti_mean <- time_it(mean)
ti_mean(runif(1e6))
##    user  system elapsed
##   0.025   0.002   0.026
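
The remark about the different error profile can be illustrated with a sketch like the following, where half is a hypothetical function with a single parameter x. The wrapper accepts an argument named y without complaint. The error only appears once the argument is passed on to half inside the wrapper.

```r
time_it <- function(f) {
  force(f)
  function(...) {
    system.time(f(...))
  }
}

half <- function(x) x / 2   # hypothetical function that only knows about x
ti_half <- time_it(half)

timing <- ti_half(10)       # 10 is passed through ... to half

# y = 10 is accepted by the wrapper; the failure comes from the inner call to half.
# system.time may also print a "Timing stopped at" note when the call fails.
msg <- tryCatch(
  ti_half(y = 10),
  error = function(e) conditionMessage(e)
)
msg
```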

Exercises

Try the following exercises to become more comfortable with the concepts discussed in this chapter.

between

Write a vectorized function that takes a vector x and two numbers, lower and upper, and replaces all elements in x smaller than lower or greater than upper with NA.

apply_if

Consider the function apply_if you implemented in this chapter. There we used a loop. Implement it using Filter and Map instead.

For the specific instance we used in the example:

apply_if(v, function(x) x %% 2 == 0, function(x) x^2)

We only have vectorized functions. Rewrite this function call using a vectorized expression.

power

We previously defined the generic power function and the instances square and cube this way:

power <- function(n) function(x) x^n
square <- power(2)
cube <- power(3)

If you instead defined this:

power <- function(x, n) x^n

How would you then define square and cube?

Row and Column Sums

Using apply, write the rowsum and colsum functions to compute the row sums and column sums, respectively, of a matrix.

Factorial Again

Write a vectorized factorial function. It should take a vector as input and compute the factorial of each element in the vector.

Try to make a version that remembers factorials it has already computed so you don’t need to recompute them (without using the cached function, of course).

This article is from:

Beginning of Data Science in R
