Where for (loop) ARt Thou?

By: Jonathan Carroll

Re-posted from: https://jcarroll.com.au/2022/04/22/where-for-loop-art-thou/

I’ve long been interested in exactly how R works; not quite enough to learn all the
internals, but enough that I was surprised I could not find a clear guide to how vectorization works at the deepest level.

Let’s say we want to add two vectors which we’ve defined as x and y

x <- c(2, 4, 6)
y <- c(1, 3, 2)

One way to do this (the verbose, elementwise way) would be to add each pair of elements

c(x[1] + y[1], x[2] + y[2], x[3] + y[3])
## [1] 3 7 8

but if you are familiar with the principle of not repeating yourself, you might write this as a loop. Best practice
involves pre-allocating the result to the correct size

res <- c(NA, NA, NA)
for (i in 1:3) {
  res[i] <- x[i] + y[i]
}
res
## [1] 3 7 8

There, we wrote a for loop.
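
The same pattern generalises to vectors of any length (a small variation on the above, assuming x and y are equal-length numeric vectors):

res <- numeric(length(x))  # pre-allocate a numeric vector of the right size
for (i in seq_along(x)) {
  res[i] <- x[i] + y[i]
}
res
## [1] 3 7 8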

Now, if you’ve ever seen R tutorials or discussions, you’ve probably had it drilled into you that
this is “bad” and should be replaced with something else. One of those options is an apply family
function

plus <- function(i) {
  x[i] + y[i]
}
sapply(1:3, plus)
## [1] 3 7 8

or something ‘tidier’

purrr::map_dbl(1:3, plus)
## [1] 3 7 8

(yes, yes, these functions are ‘bad’ because they don’t take x or y as arguments; a sketch of a version
that does follows the next example) but this still performs the operation elementwise. If you’ve seen even
more R tutorials/discussions, you’ve probably seen that vectorization is very handy; the R function + knows
what to do with objects that aren’t just a single value, and does what you might expect

x + y
## [1] 3 7 8
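
As an aside, the elementwise versions above could take the vectors as explicit arguments. A minimal sketch using base mapply (plus_xy is just an illustrative name):

plus_xy <- function(x, y) {
  x + y
}
mapply(plus_xy, x, y)
## [1] 3 7 8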

Now, if you’ve really read a lot about R, you’ll know that ‘under the hood’ a for-loop
is involved in every one of these, but it’s “lower down”, “at the C level”. Jenny Bryan makes the
point that “Of course someone has to write loops / It doesn’t have to be you” and for this reason,
vectorization in R is of great benefit.
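
If you want to see that benefit on your own machine, a quick comparison along these lines would do it (a sketch only; x_big and y_big are illustrative names, and timings will vary):

x_big <- runif(1e7)
y_big <- runif(1e7)
res <- numeric(length(x_big))  # pre-allocate, as above
system.time(for (i in seq_along(x_big)) res[i] <- x_big[i] + y_big[i])
system.time(x_big + y_big)  # the vectorized version is dramatically faster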

So, there is a loop, but where exactly does that happen?

At some point, the computer needs to add the elements of x to the elements of y, and the simplest version of this
happens one element at a time, in a loop. There’s a big sidetrack here about SIMD
which I’ll try to avoid, but I will mention that the Microsoft fork of R (the artist formerly known as Revolution R) can perform SIMD operations via the Intel MKL when running on Intel chips.

So, let’s start at the operator.

`+`
## function (e1, e2)  .Primitive("+")

Digging into primitives is a little tricky, but {pryr} can help

pryr::show_c_source(.Primitive("+"))
## + is implemented by do_arith with op = PLUSOP

We can browse a copy of the source for do_arith (in arithmetic.c), where we see some
logic paths for scalars and vectors. Let’s assume we’re working with our example, which has length(x) == length(y) > 1. With two non-scalar arguments we satisfy (in pseudocode)

if !IS_SCALAR and argc == length(arg) == 2

which leads us to a call to R_binary.

Depending on the class of the arguments, we need to call different functions, but for the sake of our example let’s say we have non-integer real numbers, so we fork to real_binary. This takes a code argument identifying which type of operation we’re performing; in our case that’s PLUSOP (noted above). There’s a case branch for this in which, provided the arguments are of the same length (n1 == n2), we call

R_ITERATE_CHECK(NINTERRUPT, n, i, da[i] = dx[i] + dy[i];);

That’s starting to look a lot like a loop: there’s an iterator i, and we’re about to expand another macro.

This jumps us over to a different file (Itermacros.h), where we see that LOOP_WITH_INTERRUPT_CHECK definitely performs some sort of loop. It takes the loop body above and the argument R_ITERATE_CORE, which is finally the actual loop!

#define R_ITERATE_CORE(n, i, loop_body) do {    \
   for (; i < n; ++i) { loop_body } \
} while (0)
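
Stripping away the interrupt checking, the work being done for our example boils down to something like this standalone C translation (a sketch for illustration, not the actual R internals):

#include <stdio.h>

int main(void) {
    double dx[] = {2, 4, 6};  /* x */
    double dy[] = {1, 3, 2};  /* y */
    double da[3];             /* the result */
    int n = 3;
    /* the loop at the heart of the "vectorized" + */
    for (int i = 0; i < n; ++i) {
        da[i] = dx[i] + dy[i];
    }
    for (int i = 0; i < n; ++i) {
        printf("%g ", da[i]); /* prints: 3 7 8 */
    }
    printf("\n");
    return 0;
}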

So, that’s where the actual loop in a vectorized R call happens! All of that sits behind the innocent-looking +.

That was thoroughly satisfying, but I did originally have in mind comparing R to another language – one where loops aren’t frowned upon because of performance, but rather encouraged… How do Julia loops differ?

Julia is not a vectorized language per se, but it has a neat ability to “vectorize” any operation, though in Julia syntax it’s “broadcasting”.
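
Loops themselves are perfectly idiomatic in Julia, so before getting to broadcasting, here is a quick sketch mirroring the R loop from earlier:

x = [2, 4, 6]
y = [1, 3, 2]
res = similar(x)           # pre-allocate, matching x's type and shape
for i in eachindex(x, y)   # the indices shared by x and y
    res[i] = x[i] + y[i]
end
res
## 3-element Array{Int64,1}:
##  3
##  7
##  8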

Simple addition can combine scalar values

3+4
## 7

Julia actually has scalar values (in R, even a single value is just a vector of length 1) so a single value could be

typeof(3)
## Int64

whereas several values need to be an Array, even if it only has 1 dimension

Vector{Int64}([1, 2, 3])
## 3-element Array{Int64,1}:
##  1
##  2
##  3

Trying to add two Arrays does work

[1, 2, 3] + [4, 5, 6]
## 3-element Array{Int64,1}:
##  5
##  7
##  9

but only because a specific method has been written for this case, i.e.

methods(+, (Array, Array))
## # 1 method for generic function "+":
## [1] +(A::Array, Bs::Array...) in Base at arraymath.jl:43

One thing I particularly like is that we can see exactly which method was called using the @which macro

@which [1, 2, 3, 4] + [1, 2, 3, 4]
## +(A::Array, Bs::Array...) in Base at arraymath.jl:43

something that I really wish was easier to do in R. The @edit macro even jumps us right into the actual code for this dispatched call.

This ‘add vectors’ problem can be solved through broadcasting, which performs an operation elementwise

[1, 2, 3] .+ [4, 5, 6]
## 3-element Array{Int64,1}:
##  5
##  7
##  9

A fun fact I recently learned is that broadcasting works on any operation, even if that operation is the pipe itself

["a", "list", "of", "strings"] .|> [uppercase, reverse, titlecase, length]
## 4-element Array{Any,1}:
##   "A"
##   "tsil"
##   "Of"
##  7
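
The same dot syntax broadcasts user-defined functions too; a quick illustrative example (square is just a name I’ve made up here):

square(x) = x^2
square.([1, 2, 3])
## 3-element Array{Int64,1}:
##  1
##  4
##  9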

Back to our loops: the method for + on two Arrays points us to arraymath.jl, which contains

function +(A::Array, Bs::Array...)
    for B in Bs
        promote_shape(A, B) # check size compatibility
    end
    broadcast_preserving_zero_d(+, A, Bs...)
end

The last line is the meaningful part, and it leads us to Broadcast.broadcast_preserving_zero_d.

This starts to get out of my depth, but essentially

@inline function broadcast_preserving_zero_d(f, As...)
    bc = broadcasted(f, As...)
    r = materialize(bc)
    return length(axes(bc)) == 0 ? fill!(similar(bc, typeof(r)), r) : r
end

@inline broadcast(f, t::NTuple{N,Any}, ts::Vararg{NTuple{N,Any}}) where {N} = map(f, t, ts...)

involves a map operation to achieve the broadcasting.
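
As a rough intuition for that last step, mapping + over the paired elements produces the same result (an analogy only, not the actual broadcast machinery):

map(+, [1, 2, 3], [4, 5, 6])
## 3-element Array{Int64,1}:
##  5
##  7
##  9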

Perhaps that’s a problem to tackle when I’m better at digging through Julia.

As always, comments, suggestions, and anything else welcome!

devtools::session_info()
## ─ Session info ───────────────────────────────────────────────────────────────
##  setting  value                       
##  version  R version 4.1.2 (2021-11-01)
##  os       Pop!_OS 21.04               
##  system   x86_64, linux-gnu           
##  ui       X11                         
##  language en_AU:en                    
##  collate  en_AU.UTF-8                 
##  ctype    en_AU.UTF-8                 
##  tz       Australia/Adelaide          
##  date     2022-04-22                  
## 
## ─ Packages ───────────────────────────────────────────────────────────────────
##  package     * version date       lib source        
##  blogdown      1.8     2022-02-16 [1] CRAN (R 4.1.2)
##  bookdown      0.24    2021-09-02 [1] CRAN (R 4.1.2)
##  brio          1.1.1   2021-01-20 [3] CRAN (R 4.0.3)
##  bslib         0.3.1   2021-10-06 [1] CRAN (R 4.1.2)
##  cachem        1.0.3   2021-02-04 [3] CRAN (R 4.0.3)
##  callr         3.7.0   2021-04-20 [1] CRAN (R 4.1.2)
##  cli           3.2.0   2022-02-14 [1] CRAN (R 4.1.2)
##  crayon        1.5.0   2022-02-14 [1] CRAN (R 4.1.2)
##  desc          1.4.1   2022-03-06 [1] CRAN (R 4.1.2)
##  devtools      2.4.3   2021-11-30 [1] CRAN (R 4.1.2)
##  digest        0.6.27  2020-10-24 [3] CRAN (R 4.0.3)
##  ellipsis      0.3.2   2021-04-29 [1] CRAN (R 4.1.2)
##  evaluate      0.14    2019-05-28 [3] CRAN (R 4.0.1)
##  fastmap       1.1.0   2021-01-25 [3] CRAN (R 4.0.3)
##  fs            1.5.0   2020-07-31 [3] CRAN (R 4.0.2)
##  glue          1.6.1   2022-01-22 [1] CRAN (R 4.1.2)
##  htmltools     0.5.2   2021-08-25 [1] CRAN (R 4.1.2)
##  jquerylib     0.1.4   2021-04-26 [1] CRAN (R 4.1.2)
##  jsonlite      1.7.2   2020-12-09 [3] CRAN (R 4.0.3)
##  JuliaCall   * 0.17.4  2021-05-16 [1] CRAN (R 4.1.2)
##  knitr         1.37    2021-12-16 [1] CRAN (R 4.1.2)
##  lifecycle     1.0.1   2021-09-24 [1] CRAN (R 4.1.2)
##  magrittr      2.0.1   2020-11-17 [3] CRAN (R 4.0.3)
##  memoise       2.0.0   2021-01-26 [3] CRAN (R 4.0.3)
##  pkgbuild      1.2.0   2020-12-15 [3] CRAN (R 4.0.3)
##  pkgload       1.2.4   2021-11-30 [1] CRAN (R 4.1.2)
##  prettyunits   1.1.1   2020-01-24 [3] CRAN (R 4.0.1)
##  processx      3.5.2   2021-04-30 [1] CRAN (R 4.1.2)
##  ps            1.5.0   2020-12-05 [3] CRAN (R 4.0.3)
##  purrr         0.3.4   2020-04-17 [3] CRAN (R 4.0.1)
##  R6            2.5.0   2020-10-28 [3] CRAN (R 4.0.2)
##  Rcpp          1.0.6   2021-01-15 [3] CRAN (R 4.0.3)
##  remotes       2.4.2   2021-11-30 [1] CRAN (R 4.1.2)
##  rlang         1.0.1   2022-02-03 [1] CRAN (R 4.1.2)
##  rmarkdown     2.13    2022-03-10 [1] CRAN (R 4.1.2)
##  rprojroot     2.0.2   2020-11-15 [3] CRAN (R 4.0.3)
##  rstudioapi    0.13    2020-11-12 [3] CRAN (R 4.0.3)
##  sass          0.4.0   2021-05-12 [1] CRAN (R 4.1.2)
##  sessioninfo   1.1.1   2018-11-05 [3] CRAN (R 4.0.1)
##  stringi       1.5.3   2020-09-09 [3] CRAN (R 4.0.2)
##  stringr       1.4.0   2019-02-10 [3] CRAN (R 4.0.1)
##  testthat      3.1.2   2022-01-20 [1] CRAN (R 4.1.2)
##  usethis       2.1.5   2021-12-09 [1] CRAN (R 4.1.2)
##  withr         2.5.0   2022-03-03 [1] CRAN (R 4.1.2)
##  xfun          0.30    2022-03-02 [1] CRAN (R 4.1.2)
##  yaml          2.2.1   2020-02-01 [3] CRAN (R 4.0.1)
## 
## [1] /home/jono/R/x86_64-pc-linux-gnu-library/4.1
## [2] /usr/local/lib/R/site-library
## [3] /usr/lib/R/site-library
## [4] /usr/lib/R/library