Superassignment (`<<-`)

An Evil Little Bastard or Just Misunderstood?

BB

fortunes::fortune("<<-")
  
##
  ## I wish <<- had never been invented, as it makes an esoteric and dangerous
  ## feature of the language *seem* normal and reasonable. If you want to dumb
  ## down R/S into a macro language, this is the operator for you.
  ##    -- Bill Venables
  ##       R-help (July 2001)
  

Environments have been covered ad nauseam (see Advanced R: Environments), so I see little point in repeating that here. However, given how frequently the superassignment operator is misused or misunderstood, I figure I'll try to write about it a little...

Prior paragraph notwithstanding, it is important to use a common parlance when discussing environments. As you are well aware, the main working environment in R is the global environment.

environment()
  
## <environment: R_GlobalEnv>
  
identical(environment(),globalenv())
  
## [1] TRUE
  

You can find the parent of an environment with `parent.env()`. The parent of the global environment is the most recently attached package.

# No third-party packages loaded so far except what RStudio autoloads...
  parent.env(globalenv())
  
## <environment: package:knitr>
  ## attr(,"name")
  ## [1] "package:knitr"
  ## attr(,"path")
  ## [1] "C:/Users/Ben/Documents/R/win-library/3.4/knitr"
  
search()
  
##  [1] ".GlobalEnv"        "package:knitr"     "package:stats"
  ##  [4] "package:graphics"  "package:grDevices" "package:utils"
  ##  [7] "package:datasets"  "package:methods"   "Autoloads"
  ## [10] "package:base"
  

Note that the base environment has a parent which is not listed on the search path because it contains nothing at all...

parent.env(baseenv())
  
## <environment: R_EmptyEnv>
  
# It, however, has no parent...
  parent.env(emptyenv())
  
## Error in parent.env(emptyenv()): the empty environment has no parent
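
The chain of parents shown above can also be walked programmatically. The following is a minimal sketch, not part of the original write-up: `env_chain()` is a hypothetical helper that follows `parent.env()` from a starting environment until it reaches the empty environment, collecting a printable name at each step.

```r
# Hypothetical helper: collect the names of all enclosing environments,
# starting at `env` and stopping once the empty environment is reached.
env_chain <- function(env = globalenv()) {
  out <- character(0)
  repeat {
    out <- c(out, environmentName(env))
    if (identical(env, emptyenv())) break   # the empty environment has no parent
    env <- parent.env(env)
  }
  out
}

chain <- env_chain()
head(chain, 1)  # the starting point: "R_GlobalEnv"
tail(chain, 2)  # the end of the line: "base" "R_EmptyEnv"
```

Everything between the first and last entries depends on which packages happen to be attached in your session, so it will differ from the `search()` output above.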
  

Create a new environment with the `new.env()` function (see `?new.env`). Note that by default it sets the parent to (surprise, surprise!) the 'parent frame' (i.e., the environment from which it was called).

parent.env(new.env())
  
## <environment: R_GlobalEnv>
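
The 'parent frame' default only looks like 'the global environment' because the call above was made at the top level. As a small sketch (the function name `show_default_parent` is made up for illustration), calling `new.env()` inside a function makes that function's execution environment the parent instead:

```r
# Inside a function, the default parent of new.env() is the function's
# own execution environment, not the global environment.
show_default_parent <- function() {
  inner <- new.env()
  identical(parent.env(inner), environment())
}

show_default_parent()  # TRUE

# Passing `parent` explicitly overrides the default.
explicit <- new.env(parent = globalenv())
identical(parent.env(explicit), globalenv())  # TRUE
```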
  

One final note about environments: if you set the parent high up the chain of enclosing environments, the air gets mighty thin. In other words, if you want to evaluate code in the new environment, you won't have access to all the usual functions you are used to...

# Use `?local` or one of the variants of `?eval` to run code in a specified environment...
  test_env <- new.env(parent=emptyenv())

  parent.env(test_env)
  
## <environment: R_EmptyEnv>
  
test_env$x <- 10

  local(x*5,envir = test_env)
  
## Error in x * 5: could not find function "*"
  
local(x <- 50,envir = test_env)
  
## Error in x <- 50: could not find function "<-"
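
The errors above arise because even `*` and `<-` are ordinary functions that live in the base environment, and an environment whose parent is the empty environment cannot see them. A minimal sketch of how to check this (`bare_env` is a name invented for this example):

```r
bare_env <- new.env(parent = emptyenv())
bare_env$x <- 10

exists("x",  envir = bare_env)   # TRUE: bound directly in bare_env
exists("*",  envir = bare_env)   # FALSE: nothing to inherit from
exists("<-", envir = bare_env)   # FALSE
exists("*",  envir = baseenv())  # TRUE: this is where `*` actually lives

# Reading and writing from the outside still works, because get() and
# assign() are looked up where they are called, not inside bare_env.
get("x", envir = bare_env)         # 10
assign("x", 50, envir = bare_env)
bare_env$x                         # 50
```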
  

Although the naming strategy used below is somewhat ridiculous and obnoxious, it makes each environment's 'lineage' clear, which should make things easier to follow...

Instead of the empty environment, let's set the base environment as the parent of our new environment as we will have access to most, but not all, functions there...

base_is_my_daddy <- new.env(parent=baseenv())

  parent.env(base_is_my_daddy)
  
## <environment: base>
  
# Global environment is NOT the daddy!
  identical(parent.env(.GlobalEnv),parent.env(base_is_my_daddy))
  
## [1] FALSE
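
To make 'most, but not all' concrete, here is a short sketch using a fresh environment (`base_child`, a name chosen only for this example) whose parent is the base environment. Arithmetic and base functions are visible, but functions from other attached packages such as stats are not:

```r
base_child <- new.env(parent = baseenv())
base_child$x <- 10

evalq(x * 5, envir = base_child)      # 50: `*` is found in base

exists("sum",    envir = base_child)  # TRUE: sum() lives in base
exists("median", envir = base_child)  # FALSE: median() lives in stats

# From the global environment the search path is available, so median()
# is found there as long as stats is attached (the default).
exists("median", envir = globalenv())
```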
  

OK, enough background on environments; let's play with the `<<-` operator.

# It does nothing special in the global environment...
  x <- 10
  rm(x)
  x <<- 10
  x
  
## [1] 10
  

What if 'x' is assigned in a different environment?

local(x <- c(1,2,3),envir = base_is_my_daddy)
  base_is_my_daddy$x
  
## [1] 1 2 3
  
# Now, what about `<<-`?
  local(x <<- c('a','b','c'),envir = base_is_my_daddy)

  base_is_my_daddy$x
  
## [1] 1 2 3
  
# Still the same above, but now check 'x' in the global environment...

  x
  
## [1] "a" "b" "c"
  

This result might lead one to believe that `<<-` is just a simple way to assign in the global environment, but that is not always the case. As the documentation clearly states:

"The operators <<- and ->> are normally only used in functions, and cause a search to be made through parent environments for an existing definition of the variable being assigned. If such a variable is found (and its binding is not locked) then its value is redefined, otherwise assignment takes place in the global environment. Note that their semantics differ from that in the S language, but are useful in conjunction with the scoping rules of R. See ‘The R Language Definition’ manual for further details and examples."

Only if there is no other object with the same name (in our case 'x') in one of the parents of the environment where `<<-` is called will the variable be deposited in the global environment.

The code above illustrates an important point regarding `<<-`: the search literally begins in the parent environment, and if no matching name is found, the object is assigned in the global environment. What may not be intuitive is that, despite there being an object 'x' in 'base_is_my_daddy', it remains unchanged, because `<<-` does not find and replace in the current environment unless, of course, the current environment is the global environment.
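
The search rule can be written out as a rough sketch. `superassign_target()` is a hypothetical helper, not anything provided by R: it only mimics the lookup described in the documentation, and it ignores locked bindings. The environments `parent_env` and `child_env` and the variable `y` are likewise invented for the example.

```r
# Hypothetical helper: return the environment in which `name <<- value`,
# evaluated in `env`, would create or modify a binding.
superassign_target <- function(name, env) {
  env <- parent.env(env)                  # the search skips `env` itself
  while (!identical(env, emptyenv())) {
    if (exists(name, envir = env, inherits = FALSE)) {
      return(env)                         # first ancestor holding the name
    }
    env <- parent.env(env)
  }
  globalenv()                             # nothing found: fall back to global
}

parent_env <- new.env(parent = baseenv())
child_env  <- new.env(parent = parent_env)
assign("y", "parent", envir = parent_env)
assign("y", "child",  envir = child_env)

# The helper predicts that the parent's binding is the one affected...
identical(superassign_target("y", child_env), parent_env)  # TRUE

# ...and the real operator agrees: the child's own 'y' is left alone.
evalq(y <<- "changed", envir = child_env)
get("y", envir = child_env)   # "child"
get("y", envir = parent_env)  # "changed"
```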

Some additional examples may make this more clear...

First, consider what happens if we create a new environment with 'base_is_my_daddy' as its parent...

base_is_my_granddaddy <- new.env(parent = base_is_my_daddy)

  parent.env(base_is_my_granddaddy)
  
## <environment: 0x00000000111f4470>
  
# Important aside: do note that there is no true 'family tree'. It is really only appropriate to speak about parents and ancestors, not siblings and what not...

  parent.env(globalenv())
  
## <environment: package:knitr>
  ## attr(,"name")
  ## [1] "package:knitr"
  ## attr(,"path")
  ## [1] "C:/Users/Ben/Documents/R/win-library/3.4/knitr"
  
identical(globalenv(),parent.env(base_is_my_granddaddy))
  
## [1] FALSE
  

Let's remind ourselves of the value of 'x' in 'base_is_my_daddy' as well as in the global environment, and then use `<<-` in 'base_is_my_granddaddy'...

x
  
## [1] "a" "b" "c"
  
base_is_my_daddy$x
  
## [1] 1 2 3
  
local(x <<- c('x','y','z'),envir = base_is_my_granddaddy)

  x
  
## [1] "a" "b" "c"
  
base_is_my_daddy$x
  
## [1] "x" "y" "z"
  

Now it becomes more clear. This behavior holds true 'below' the global environment as well (note that this terminology isn't totally correct either)...

global_is_my_daddy <- new.env(parent=.GlobalEnv)

  local(x<<-c('a','b','c'),envir = global_is_my_daddy)

  global_is_my_daddy$x
  
## NULL
  
x
  
## [1] "a" "b" "c"
  
global_is_my_granddaddy <- new.env(parent=global_is_my_daddy)

  global_is_my_daddy$x <- c(1,2,3)

  local(x<<-c('a','b','c'),envir = global_is_my_granddaddy)

  global_is_my_daddy$x
  
## [1] "a" "b" "c"
  

"The good use of superassignment is in conjunction with lexical scope, where an environment stores state for a function or set of functions that modify the state by using superassignment...

...The Evil and Wrong use is to modify variables in the global environment.

-thomas"

Finally, I would be remiss if I did not mention the actual, intended purpose of `<<-` as related to use in functions. Specifically, `<<-` is what lets a 'closure' in R modify state held in its enclosing environment, so that a function can capture and store results in a local scope and then return them to the user. A detailed discussion is beyond the scope of this write-up, and again, others have covered this in considerable depth (see Advanced R: Closures), but the basic technique is as follows...

closure_function <- function() {

      x <- list()
      i <- 1

      function() {

          x[i] <<- runif(1)
          i <<- i+1
          x
      }


  }

  x <- closure_function()

  x()
  
## [[1]]
  ## [1] 0.9987704
  
x()
  
## [[1]]
  ## [1] 0.9987704
  ##
  ## [[2]]
  ## [1] 0.6794095
  
x()
  
## [[1]]
  ## [1] 0.9987704
  ##
  ## [[2]]
  ## [1] 0.6794095
  ##
  ## [[3]]
  ## [1] 0.451239
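
A smaller, deterministic variation on the same idea may help. This is only a sketch, and `make_counter`, `counter_a` and `counter_b` are names invented here. Each call to the outer function creates a fresh enclosing environment, so each closure keeps its own private state and nothing leaks into the global environment:

```r
make_counter <- function() {
  count <- 0                 # state lives in this call's environment
  function() {
    count <<- count + 1      # found in the enclosing environment, so updated there
    count
  }
}

counter_a <- make_counter()
counter_b <- make_counter()

counter_a()  # 1
counter_a()  # 2
counter_b()  # 1: counter_b has its own 'count'

# The state can be inspected through the closure's enclosing environment.
environment(counter_a)$count  # 2
```

This is the 'good use' from the quote above: `<<-` finds `count` one level up, in the environment created by `make_counter()`, and never has to fall through to the global environment.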