Author Archives: Jonathan Carroll

Digits Dilemma

By: Jonathan Carroll

Re-posted from: https://jcarroll.com.au/2024/06/15/digits-dilemma/

Another day, another short riddle to be solved with several programming
languages! This one is nice because solving it doesn’t need a lot of code, but
it uses some interesting aspects of evaluation.

I saw this post
which isn’t new (it’s from 2022) that poses a nice problem to solve:

With the numbers 123456789, make them add up to 100.
They must stay in the same order but you can use addition, subtraction,
multiplication, division, brackets etc. All numbers must be used exactly once.

and demonstrates a solution in Haskell

import Control.Monad (forM)
import Data.Either (fromRight)
import Language.Haskell.Interpreter

expressions :: [String]
expressions =
  let ops = [ '+', '-', '/', '*' ]
   in [ [ '1', a, '2', b, '3', c, '4', d, '5', e, '6', f, '7', g, '8', h, '9' ]
        | a <- ops, b <- ops
        , c <- ops, d <- ops
        , e <- ops, f <- ops
        , g <- ops, h <- ops
        ]

result = runInterpreter $ do
  setImports ["Prelude"]
  exprs <- forM expressions evaluate
  pure $ filter (\(_, a) -> a == "100") $ fromRight [] exprs
  where
    -- the original post defines an 'evaluate' helper here, pairing each
    -- expression string with its evaluated result

I’m trying to learn Haskell this year, so it was a great opportunity to try to
follow along. I’m still working my way towards being able to run short snippets
– the ghci tool for interactive use has a bit of a learning curve and doesn’t
immediately let me use the imports (or I’m doing something wrong), so while I
think I can follow the steps as presented, I can’t yet dig into them as
interactively as I’d like.

The general idea, though, is to use a comprehension to expand all combinations of
the allowed operators (+, -, /, and *) between the values 1 to 9. I’m
somewhat familiar with comprehensions and played with them in my post
Pythagorean Triples with Comprehensions
in several languages, including Haskell.

I wanted to see how I might go about this problem in R, and I knew I’d need to make
some adjustments because R does not have comprehensions.

One way to get all the combinations of operators between the values is to use
expand.grid(), which generates all combinations of its inputs

expand.grid(1:3, letters[1:3])
##   Var1 Var2
## 1    1    a
## 2    2    a
## 3    3    a
## 4    1    b
## 5    2    b
## 6    3    b
## 7    1    c
## 8    2    c
## 9    3    c

Defining the operators as strings, I can generate a data.frame of the values and
all combinations of operators between them

ops <- c("*", "+", "-", "/")
combos <- expand.grid(1, ops, 2, ops, 3, ops, 4, ops, 5, ops, 6, ops, 7, ops, 8, ops, 9)
head(combos)
##   Var1 Var2 Var3 Var4 Var5 Var6 Var7 Var8 Var9 Var10 Var11 Var12 Var13 Var14
## 1    1    *    2    *    3    *    4    *    5     *     6     *     7     *
## 2    1    +    2    *    3    *    4    *    5     *     6     *     7     *
## 3    1    -    2    *    3    *    4    *    5     *     6     *     7     *
## 4    1    /    2    *    3    *    4    *    5     *     6     *     7     *
## 5    1    *    2    +    3    *    4    *    5     *     6     *     7     *
## 6    1    +    2    +    3    *    4    *    5     *     6     *     7     *
##   Var15 Var16 Var17
## 1     8     *     9
## 2     8     *     9
## 3     8     *     9
## 4     8     *     9
## 5     8     *     9
## 6     8     *     9

This generates a lot of combinations – with 4 possible operators in each of the 8
positions there are \(4^8\) = 65,536 combinations.
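
As a quick sanity check, the number of rows generated matches \(4^8\) (nothing here beyond the combos object from above)

4^8
## [1] 65536
nrow(combos)
## [1] 65536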

Pasting these numbers and operators together into expressions

exprs <- apply(combos, 1, \(x) paste0(x, collapse = ""))
head(exprs)
## [1] "1*2*3*4*5*6*7*8*9" "1+2*3*4*5*6*7*8*9" "1-2*3*4*5*6*7*8*9"
## [4] "1/2*3*4*5*6*7*8*9" "1*2+3*4*5*6*7*8*9" "1+2+3*4*5*6*7*8*9"

I get something I can evaluate as if I typed 1*2*3 into a console. I can get
the results of evaluating those with

results <- sapply(exprs, \(x) eval(parse(text = x)))
head(results)
## 1*2*3*4*5*6*7*8*9 1+2*3*4*5*6*7*8*9 1-2*3*4*5*6*7*8*9 1/2*3*4*5*6*7*8*9 
##            362880            362881           -362879             90720 
## 1*2+3*4*5*6*7*8*9 1+2+3*4*5*6*7*8*9 
##            181442            181443

Now, I just need to see which of those produces a value of 100. Because sapply()
produced a vector with the expressions themselves as names, I can extract the
names of the results which are equal to 100

answers <- names(which(results == 100))
answers
##  [1] "1*2*3-4*5+6*7+8*9" "1+2+3-4*5+6*7+8*9" "1+2-3*4-5+6*7+8*9"
##  [4] "1-2*3-4-5+6*7+8*9" "1+2-3*4+5*6+7+8*9" "1-2*3-4+5*6+7+8*9"
##  [7] "1-2*3+4*5+6+7+8*9" "1*2*3+4+5+6+7+8*9" "1+2+3+4+5+6+7+8*9"
## [10] "1+2*3+4*5-6+7+8*9" "1+2*3*4*5/6+7+8*9" "1*2*3*4+5+6-7+8*9"
## [13] "1*2*3*4+5+6+7*8+9" "1-2+3*4*5-6+7*8-9" "1-2+3*4*5+6*7+8-9"

Any of those can easily be verified manually

1*2*3*4+5+6+7*8+9
## [1] 100

One thing I noticed here was that I have one more result than the Haskell post
produces

length(answers)
## [1] 15

One of the answers stands out in that it contains a division, and sure enough this
is the one that doesn’t appear in the Haskell post. I’m not quite sure why –
perhaps the Haskell filter, which compares the displayed result against the string
"100", misses a fractional result that displays as "100.0". The operator precedence,
at least, appears to be the same between R and Haskell in terms of these expressions

3 / 2 + 1
## [1] 2.5
ghci> 3 / 2 + 1
2.5

But since I haven’t yet been able to actually run that Haskell code myself, I can’t
verify those solutions.
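
What I can check is the division answer itself in R – the intermediate 2*3*4*5/6 is exactly 20, so there are no floating point surprises and it evaluates to exactly 100

1+2*3*4*5/6+7+8*9
## [1] 100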

My R solution to this puzzle is then

ops <- c("*", "+", "-", "/")
combos <- expand.grid(1, ops, 2, ops, 3, ops, 4, ops, 5, ops, 6, ops, 7, ops, 8, ops, 9)
exprs <- apply(combos, 1, \(x) paste0(x, collapse = ""))
results <- sapply(exprs, \(x) eval(parse(text = x)))
names(which(results == 100))

I did want to try a language which does have comprehensions – let’s try Julia!

Setting up the comprehension makes for a bit of a long line, but comes out okay

ops = ['+', '-', '*', '/']
## 4-element Vector{Char}:
##  '+': ASCII/Unicode U+002B (category Sm: Symbol, math)
##  '-': ASCII/Unicode U+002D (category Pd: Punctuation, dash)
##  '*': ASCII/Unicode U+002A (category Po: Punctuation, other)
##  '/': ASCII/Unicode U+002F (category Po: Punctuation, other)
exprs = ['1' * a * '2' * b * '3' * c * '4' * d * '5' * e * '6' * f * '7' * g * '8' * h * '9' 
         for a in ops, b in ops, c in ops, 
             d in ops, e in ops, f in ops, 
             g in ops, h in ops];
first(exprs, 10)
## 10-element Vector{String}:
##  "1+2+3+4+5+6+7+8+9"
##  "1-2+3+4+5+6+7+8+9"
##  "1*2+3+4+5+6+7+8+9"
##  "1/2+3+4+5+6+7+8+9"
##  "1+2-3+4+5+6+7+8+9"
##  "1-2-3+4+5+6+7+8+9"
##  "1*2-3+4+5+6+7+8+9"
##  "1/2-3+4+5+6+7+8+9"
##  "1+2*3+4+5+6+7+8+9"
##  "1-2*3+4+5+6+7+8+9"

That produces an array with many dimensions

size(exprs)
## (4, 4, 4, 4, 4, 4, 4, 4)

so it needs to be flattened into a vector with vec(). From there, it’s similar
to the R approach and I can use an eval(Meta.parse()) pattern, keeping in mind
that one can ‘broadcast’ scalar operations to vector operations with the dot
(.) operator

results = eval.(Meta.parse.(vec(exprs)));
first(results, 10)
## 10-element Vector{Real}:
##  45
##  41
##  44
##  42.5
##  39
##  35
##  38
##  36.5
##  46
##  34

Finding the values equal to 100 is similar to the R approach

exprs[findall(results .== 100)]
## 15-element Vector{String}:
##  "1*2*3*4+5+6+7*8+9"
##  "1-2+3*4*5+6*7+8-9"
##  "1-2+3*4*5-6+7*8-9"
##  "1+2+3+4+5+6+7+8*9"
##  "1*2*3+4+5+6+7+8*9"
##  "1-2*3+4*5+6+7+8*9"
##  "1+2*3+4*5-6+7+8*9"
##  "1-2*3-4+5*6+7+8*9"
##  "1+2-3*4+5*6+7+8*9"
##  "1+2*3*4*5/6+7+8*9"
##  "1*2*3*4+5+6-7+8*9"
##  "1-2*3-4-5+6*7+8*9"
##  "1+2-3*4-5+6*7+8*9"
##  "1+2+3-4*5+6*7+8*9"
##  "1*2*3-4*5+6*7+8*9"

and again we see the 15 answers, including the one with a division, confirming the
R result.

This was a fun exploration – I don’t think I would want to try to solve it without
code, but the code solutions were a great opportunity to use a few different
languages.

I suspect there are a few different ways of solving this apart from this brute-force
expansion of every combination, perhaps with a solver or something. If you have one, I’d
love to see it. I thought @coolbutuseless had done something like this, but the closest
I could find was this post,
which is slightly different.

If you have comments, suggestions, or improvements, as always, feel free to use
the comment section below, or hit me up on
Mastodon.

devtools::session_info()
## ─ Session info ───────────────────────────────────────────────────────────────
##  setting  value
##  version  R version 4.3.3 (2024-02-29)
##  os       Pop!_OS 22.04 LTS
##  system   x86_64, linux-gnu
##  ui       X11
##  language (EN)
##  collate  en_AU.UTF-8
##  ctype    en_AU.UTF-8
##  tz       Australia/Adelaide
##  date     2024-07-06
##  pandoc   3.2 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/x86_64/ (via rmarkdown)
## 
## ─ Packages ───────────────────────────────────────────────────────────────────
##  package     * version date (UTC) lib source
##  blogdown      1.18    2023-06-19 [1] CRAN (R 4.3.2)
##  bookdown      0.36    2023-10-16 [1] CRAN (R 4.3.2)
##  bslib         0.6.1   2023-11-28 [3] CRAN (R 4.3.2)
##  cachem        1.0.8   2023-05-01 [3] CRAN (R 4.3.0)
##  callr         3.7.3   2022-11-02 [3] CRAN (R 4.2.2)
##  cli           3.6.1   2023-03-23 [1] CRAN (R 4.3.3)
##  crayon        1.5.2   2022-09-29 [3] CRAN (R 4.2.1)
##  devtools      2.4.5   2022-10-11 [1] CRAN (R 4.3.2)
##  digest        0.6.34  2024-01-11 [3] CRAN (R 4.3.2)
##  ellipsis      0.3.2   2021-04-29 [3] CRAN (R 4.1.1)
##  evaluate      0.23    2023-11-01 [3] CRAN (R 4.3.2)
##  fastmap       1.1.1   2023-02-24 [3] CRAN (R 4.2.2)
##  fs            1.6.3   2023-07-20 [3] CRAN (R 4.3.1)
##  glue          1.7.0   2024-01-09 [1] CRAN (R 4.3.3)
##  htmltools     0.5.7   2023-11-03 [3] CRAN (R 4.3.2)
##  htmlwidgets   1.6.2   2023-03-17 [1] CRAN (R 4.3.2)
##  httpuv        1.6.12  2023-10-23 [1] CRAN (R 4.3.2)
##  icecream      0.2.1   2023-09-27 [1] CRAN (R 4.3.2)
##  jquerylib     0.1.4   2021-04-26 [3] CRAN (R 4.1.2)
##  jsonlite      1.8.8   2023-12-04 [3] CRAN (R 4.3.2)
##  JuliaCall     0.17.5  2022-09-08 [1] CRAN (R 4.3.3)
##  knitr         1.45    2023-10-30 [3] CRAN (R 4.3.2)
##  later         1.3.1   2023-05-02 [1] CRAN (R 4.3.2)
##  lifecycle     1.0.4   2023-11-07 [1] CRAN (R 4.3.3)
##  magrittr      2.0.3   2022-03-30 [1] CRAN (R 4.3.3)
##  memoise       2.0.1   2021-11-26 [3] CRAN (R 4.2.0)
##  mime          0.12    2021-09-28 [3] CRAN (R 4.2.0)
##  miniUI        0.1.1.1 2018-05-18 [1] CRAN (R 4.3.2)
##  pkgbuild      1.4.2   2023-06-26 [1] CRAN (R 4.3.2)
##  pkgload       1.3.3   2023-09-22 [1] CRAN (R 4.3.2)
##  prettyunits   1.2.0   2023-09-24 [3] CRAN (R 4.3.1)
##  processx      3.8.3   2023-12-10 [3] CRAN (R 4.3.2)
##  profvis       0.3.8   2023-05-02 [1] CRAN (R 4.3.2)
##  promises      1.2.1   2023-08-10 [1] CRAN (R 4.3.2)
##  ps            1.7.6   2024-01-18 [3] CRAN (R 4.3.2)
##  purrr         1.0.2   2023-08-10 [3] CRAN (R 4.3.1)
##  R6            2.5.1   2021-08-19 [1] CRAN (R 4.3.3)
##  Rcpp          1.0.11  2023-07-06 [1] CRAN (R 4.3.2)
##  remotes       2.4.2.1 2023-07-18 [1] CRAN (R 4.3.2)
##  rlang         1.1.4   2024-06-04 [1] CRAN (R 4.3.3)
##  rmarkdown     2.25    2023-09-18 [3] CRAN (R 4.3.1)
##  rstudioapi    0.15.0  2023-07-07 [3] CRAN (R 4.3.1)
##  sass          0.4.8   2023-12-06 [3] CRAN (R 4.3.2)
##  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.3.2)
##  shiny         1.7.5.1 2023-10-14 [1] CRAN (R 4.3.2)
##  stringi       1.8.3   2023-12-11 [3] CRAN (R 4.3.2)
##  stringr       1.5.1   2023-11-14 [3] CRAN (R 4.3.2)
##  urlchecker    1.0.1   2021-11-30 [1] CRAN (R 4.3.2)
##  usethis       2.2.2   2023-07-06 [1] CRAN (R 4.3.2)
##  vctrs         0.6.5   2023-12-01 [1] CRAN (R 4.3.3)
##  xfun          0.41    2023-11-01 [3] CRAN (R 4.3.2)
##  xtable        1.8-4   2019-04-21 [1] CRAN (R 4.3.2)
##  yaml          2.3.8   2023-12-11 [3] CRAN (R 4.3.2)
## 
##  [1] /home/jono/R/x86_64-pc-linux-gnu-library/4.3
##  [2] /usr/local/lib/R/site-library
##  [3] /usr/lib/R/site-library
##  [4] /usr/lib/R/library
## 
## ──────────────────────────────────────────────────────────────────────────────

My First Julia Package – TriangulArt.jl

By: Jonathan Carroll

Re-posted from: https://jcarroll.com.au/2024/02/04/my-first-julia-package-triangulart-jl/

I’ve tried to get this same image transformation working at least three times now, but
I can finally celebrate that it’s working! I’ve been (re-)learning Julia and I still
love the language, so it was time to take my learning to the next level and actually
build a package.

For those not familiar, Julia is a much newer language than my daily-driver R, and with
that comes the freedom to take a lot of good features from other languages and implement
them. There are some features that R just won’t ever get, but they’re available in Julia and
they’re very nice to use.

I’ve written solutions to the first 20
or so Project Euler problems in Julia … wow, 5 years ago.

More recently, I have solved the first 18 days of Advent of Code 2023 in Julia
(my solutions
are in a fork of a package that I’m not using, so they more or less run independently).

With those under my belt, I revisited a project I’ve tried to implement several times. I like
the low-poly look and wanted to recreate it – it’s just an image transformation,
right? I’m even somewhat familiar with Delaunay Triangulation, or at least its dual, the
Voronoi Tessellation, from my days building spatial maps
of fishing areas.

It sounds like a simple enough problem: choose some points all over the image, triangulate between
all of them, then shade the resulting triangles with the average colour of the pixels they enclose.
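
As a rough sketch of that idea (in R rather than the Julia I ended up using, and with made-up helper names – in_triangle() and mean_triangle_colour() aren’t from any package), handling a single triangle might look something like

in_triangle <- function(px, py, tri) {
  # the sign of the cross product says which side of an edge a point sits on
  sgn <- function(p1, p2) {
    (px - p2[1]) * (p1[2] - p2[2]) - (p1[1] - p2[1]) * (py - p2[2])
  }
  d1 <- sgn(tri[1, ], tri[2, ])
  d2 <- sgn(tri[2, ], tri[3, ])
  d3 <- sgn(tri[3, ], tri[1, ])
  # inside (or on an edge) if the point is not on opposite sides of different edges
  !((d1 < 0 | d2 < 0 | d3 < 0) & (d1 > 0 | d2 > 0 | d3 > 0))
}

mean_triangle_colour <- function(img, tri) {
  # img is assumed to be a height x width x 3 array of RGB values in [0, 1]
  h <- dim(img)[1]
  w <- dim(img)[2]
  coords <- expand.grid(y = seq_len(h), x = seq_len(w))
  inside <- in_triangle(coords$x, coords$y, tri)
  # average each channel over the pixels that fall inside the triangle
  sapply(1:3, \(ch) mean(img[, , ch][inside]))
}

img <- array(runif(60 * 80 * 3), dim = c(60, 80, 3)) # a fake 60x80 RGB image
tri <- rbind(c(10, 10), c(70, 15), c(40, 50))        # (x, y) vertices of one triangle
mean_triangle_colour(img, tri)

Repeat that over every triangle from a triangulation and you have the naive version of the whole thing.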

I found this nice image of a rainbow lorikeet (these frequent my backyard)

Rainbow Lorikeet

so I got to work trying to chop it up into triangles.

Well, the naive approach is simple enough, but it produces some terrible results. I’ve built that version into what I did eventually get working, and it’s… not what I want

It works, but at what cost?

The problem is that by randomly selecting points across the image, you lose all the structure. With
enough triangles you might recover some of that, but then you have a lot of triangles and lose that
low-poly vibe.

After much searching for a better way to do this, I found this article from 2017. It’s
a python approach, but I figured I knew enough Julia and Python now that I could try to make
a 1:1 translation.

The first step is to get the random sampling working, because it allows me to start testing
the triangulation parts quickly. Generating those is pretty clean

function generate_uniform_random_points(image::Matrix{RGB{N0f8}}, n_points::Integer=100)
    # scale points sampled uniformly on the unit square up to the image extent
    ymax, xmax = size(image)[1:2]
    rand(n_points, 2) .* [xmax ymax]
end

The triangulation itself is handled by DelaunayTriangulation::triangulate() –
for once, I’m happy that there’s so much scientific/statistical support in Julia

rng = StableRNG(2)
tri = triangulate(all_points; rng)

Slightly trickier is figuring out which points are in which triangle. For that, I am
thankful for PolygonOps::inpolygon(). With the pixels for each triangle identified,
it was only a matter of averaging the R, G, and B channels to get the average colour
for each triangle.

I got that working, but with the results above – far from pleasant. The next, much harder,
step was to weight the points towards the ‘edges’ of the image. I couldn’t find an easy
way to translate the python code for locally sampling the entropy (via skimage)

so I tried to build something of my own. I tried edge-detection algorithms but
I was clearly doing something wrong with them – partly, I suspect, because I wasn’t doing
the down-weighting that the python version includes.

Since the pixels we want to up-weight are all along lines, choosing these at random can
end up with several right next to each other, which we don’t want. The python version does
something a little clever – it selects one point, then reduces the weighting of the entire image
with a Gaussian around that point, so that nearby points are unlikely to also be selected.
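
A sketch of that trick (in R, with an invented sample_weighted_points() helper and sigma parameter – the real implementation is the python linked above) could look like

sample_weighted_points <- function(weights, n_points = 100, sigma = 10) {
  # weights is a matrix of per-pixel weights, e.g. the local entropy
  h <- nrow(weights)
  w <- ncol(weights)
  ys <- matrix(seq_len(h), h, w)               # row (y) index of every pixel
  xs <- matrix(seq_len(w), h, w, byrow = TRUE) # column (x) index of every pixel
  pts <- matrix(NA_real_, n_points, 2)
  for (i in seq_len(n_points)) {
    # pick one pixel with probability proportional to its current weight
    idx <- sample.int(h * w, 1, prob = as.vector(weights))
    pts[i, ] <- c(xs[idx], ys[idx])
    # suppress the weights near the chosen point with a Gaussian so the
    # next draws are unlikely to land right beside it
    d2 <- (xs - xs[idx])^2 + (ys - ys[idx])^2
    weights <- weights * (1 - exp(-d2 / (2 * sigma^2)))
  }
  pts
}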

In the end, I failed to find a good Julia alternative, but calling python code is (almost) as
simple as using PyCall; @pyimport skimage as skimage (with slight modifications to use in a
package, as I would later discover).

With that in place, I was able to successfully weight towards high-entropy regions; regions
where a larger number of bits is required to encode the distribution of grayscale pixel
values, i.e. where there’s a lot going on. The results are much more pleasing

Entropy-weighted Delaunay Triangulation

Along the way I added some debug features, such as plotting the vertices and edges of the
triangulation on top of the image

Debug mode

With the workflow more or less working, I ran some profiling to see if I could
speed it up. Unsurprisingly, generating the weighted points was one area where a
lot of time was spent, though it’s not yet clear if that’s because it’s python
code or because that’s genuinely one of the most complex parts of the code – my
best Julia alternative was to write my own short Shannon entropy function and
make it search locally with ImageFiltering::mapwindow

using StatsBase: countmap        # countmap() counts occurrences of each grey level
using ImageFiltering: mapwindow

function shannon(w::AbstractMatrix)
   # (scaled) frequencies of each grey level in the window
   cm = collect(values(countmap(w))) / sum(size(w)) / 256
   sum([-x * log2(x) for x in cm])
end

mapwindow(shannon, image, (19, 19))

though this creates a square sampling window, whereas the python version uses a nicer disk.

The profiling shows a lot of blank areas, and I’m not sure how to interpret those

Profile’s @profview output in VS Code

I realised at this point that I actually didn’t know how long the python version takes
to run. I grabbed the original source code
and tried running it (after installing the relevant python packages) but it failed –
some packages had changed their arguments and signatures since this was written. A couple
of small updates later, my fork
now runs the code. It doesn’t take terribly long to run – it doesn’t display the image,
it saves it, and I’m not sure if that’s a factor. I (naively?) expected that the Julia
version would be a lot faster, and I’m hopeful that there’s performance I’ve left on the
table.

If anyone is interested in playing with a small-ish Julia package, feel free to poke at
this – it’s definitely not being used for anything critical.

For now, I’m enjoying throwing images at this and getting some nice looking results

If you’re interested in having a play with this package or helping to improve it,
it’s on GitHub – I’m not planning
to publish it to the registry any time soon, but that’s perhaps something to look
forward to in the future. For now, the main issues I see with this package are:

  • The white border around the produced image remains – I have tried setting
    margin=0mm but that doesn’t appear to help

  • Performance is not as good as it can be, I suspect; the entropy calculation
    (calling python) is definitely a bottleneck.

  • To speed up the processing, only every 10th pixel is used to determine the
    average colour of the triangle – this may fail to find any pixels for an entire
    (small) triangle.

  • CI – I generated this package in VSCode using PkgTemplates and it is the
    first Julia package I’ve built. CI failed immediately, so I’ve probably done
    something wrong.

  • I am still somewhat of a beginner in Julia, so there are probably many places
    in which improvements can be made – feel free to suggest them!

As always, I can be found on Mastodon and
the comment section below.

Now You’re Thinking with Arrays

By: Jonathan Carroll

Re-posted from: https://jcarroll.com.au/2023/08/29/now-you-re-thinking-with-arrays/

I keep hearing the assertion that “writing APL/Haskell/etc… makes you think
differently” and I kept wondering why I agreed with the statement but at the same time
didn’t think too much of it. I believe I’ve figured out that it’s because I happened to have
been using Array-aware languages this whole time! It turns out R is an even better
language for beginners than I thought.

Let’s start with some basics. A “scalar” value is just a number by itself. That might
have some units that may or may not be represented well in what you’re doing, but it’s
a single value on its own, like 42. A “vector” in R is just a collection of these “scalar”
values and is constructed with the c() function

c(3, 4, 5, 6)
## [1] 3 4 5 6

Going right back to basics, the [1] output at the start of the line indicates the index
of the element directly to its right, in this case the first element. If we had more elements,
then each new line starts with the index of the first element on that line. Here I’ve set the
line width smaller than usual so that it wraps sooner

1:42
##  [1]  1  2  3  4  5  6  7  8  9 10 11 12
## [13] 13 14 15 16 17 18 19 20 21 22 23 24
## [25] 25 26 27 28 29 30 31 32 33 34 35 36
## [37] 37 38 39 40 41 42

The quirk of how R works with vectors is that there aren’t actually any scalar values – if
you try to create a vector with only a single element, it’s still a vector

x <- c(42)
x
## [1] 42
is.vector(x)
## [1] TRUE

(note the [1] indicating the first index of both the vector x and the vector TRUE). Even if you don’t try to make it a vector, it still is one

x <- 42
x
## [1] 42
is.vector(x)
## [1] TRUE

Mike Mahoney has
a great post detailing the term
“vector” and how it relates to an R vector as well as the more mathematical definition
which involves constructing an “arrow” in some space so that you describe both “magnitude”
and “direction” at the same time.

“Direction and magnitude”

So, we always have a vector if we have a 1-dimensional collection of data. But wait,
you say, there’s also list!

x <- list(a = 42)
x
## $a
## [1] 42
is.vector(x)
## [1] TRUE

A nice try, but lists are also vectors, it’s just that they’re “recursive”

is.recursive(x)
## [1] TRUE
is.recursive(c(42))
## [1] FALSE

Fine, what about a matrix, then?

x <- matrix(1:9, nrow = 3, ncol = 3)
x
##      [,1] [,2] [,3]
## [1,]    1    4    7
## [2,]    2    5    8
## [3,]    3    6    9
is.vector(x)
## [1] FALSE

No, but that still makes sense – a matrix isn’t a vector. It is, however, an “array” – the
naming convention in R is a bit messy because, while “matrix” and “array” are often the
same thing, as the dimensions increase, more things expect an “array” class, so R tags
a “matrix” with both

class(matrix())
## [1] "matrix" "array"

This recently surprised Josiah Parry, leading to this post explaining some of the internal
inconsistencies of this class of object.

Now that we have vectors figured out, I can get to the point of this post – that
thinking about data with a “vector” or even “array” mindset works differently.

I started learning some APL because I loved some videos by code_report. The person
behind those is Conor Hoekstra. I didn’t realise that I’d actually heard a
CoRecursive
podcast episode interviewing him, so now I need to go back and re-listen to that one. Conor
also hosts The Array Cast podcast that I heard him
mention in yet another of his podcasts (how do
people have the time to make all of these!?!). I was listening to the latest of these;
an interview with Rob Pike
, one
of the co-creators of Go and UTF-8 – it’s a really interesting interview full of history,
insights, and a lot of serious name dropping.

Anyway, Rob is describing what it is he really likes about APL and says

“I saw a talk some time ago, I wish I could remember where it was, where somebody said, this is why programming languages are so difficult for people. Let’s say that I have a list of numbers and I want to add seven to every number on that list. And he went through about a dozen languages showing how you create a list of numbers and then add seven to it. Right? And it went on and on and on. And he said, “Wouldn’t it be nice if you could just type 7+ and then write the list of numbers?” And he said, “Well, you know what? There’s one language that actually does that, and that’s APL.” And I think there’s something really profound in that, that there’s no ceremony in APL. If you want to add two numbers together in any language, you can add two numbers together. But if you want to add a matrix and a matrix, or a matrix and a vector, or a vector and a scalar or whatever, there’s no extra ceremony involved. You just write it down.”

(link to shownotes)

The talk he mentions is linked in the shownotes as “Stop Writing Dead Programs” by Jack Rusher (Strange Loop 2022) (linked to the relevant timestamp, and
which I’m pretty sure I’ve watched before – it’s a great talk!) where
Jack shows how to add 1 to a vector of values in a handful of languages. He demonstrates that in
some languages there’s a lot you need to write that has nothing to do with the problem itself –
allocating memory, looping to some length, etc. – then shows that the way to
do this in APL is

1 + 1 2 3 4

with none of the overhead – just the declaration of what operation should occur.

The excitement with which Rob explains this in the podcast spoke to how important this
idea is; that you can work with more than just scalar values in the mathematical sense
without having to explain to the language what you mean and write a loop around a vector.

Two questions were buzzing at the back of my mind, though:

  1. Why isn’t this such a revelation to me?

  2. Is this not a common feature?

I know R does work this way because I’m very familiar with it, and perhaps that is
the answer to the first question – I know R better than any other language I know, and
perhaps I’ve just become accustomed to being able to do things like “add two vectors”.

a <- c(1, 2, 3, 4, 5) # or 1:5
a + 7
## [1]  8  9 10 11 12

Now you’re thinking with portals vectors!

The ideas of “add two vectors” and “add a number to a vector” are one and the same, as discussed above. The ability to do so is called “rank polymorphism” and
R has a weak version of it – not everything works for every dimension, but single values, vectors, and matrices do generalise for many functions. I can add a value to every element of a matrix, too

m <- matrix(1:12, nrow = 3, ncol = 4)
m
##      [,1] [,2] [,3] [,4]
## [1,]    1    4    7   10
## [2,]    2    5    8   11
## [3,]    3    6    9   12
m + 7
##      [,1] [,2] [,3] [,4]
## [1,]    8   11   14   17
## [2,]    9   12   15   18
## [3,]   10   13   16   19

and adding a vector to a matrix repeats the operation over rows

m <- matrix(1, nrow = 3, ncol = 4)
m
##      [,1] [,2] [,3] [,4]
## [1,]    1    1    1    1
## [2,]    1    1    1    1
## [3,]    1    1    1    1
v <- c(11, 22, 33)
m + v
##      [,1] [,2] [,3] [,4]
## [1,]   12   12   12   12
## [2,]   23   23   23   23
## [3,]   34   34   34   34

Sidenote: the distinction between “repeat over rows” and “repeat over columns” is also
discussed in the Array Cast episode – if you want to know more, there’s “leading axis theory”. R uses column-major order, which
is why the matrix m filled the sequential values down the first column, and why you need
to specify byrow = TRUE if you want to fill the other way. It’s also why m + v repeats
over rows, although if you are expecting it to repeat over columns and try to use a v with 4
elements it will (silently) work, recycling the vector v, and giving you something you
didn’t expect

v <- c(11, 22, 33, 44)
m + v
##      [,1] [,2] [,3] [,4]
## [1,]   12   45   34   23
## [2,]   23   12   45   34
## [3,]   34   23   12   45
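
To make the column-major filling concrete (this is plain base R, nothing new)

# the default fill runs down the columns...
matrix(1:6, nrow = 2)
##      [,1] [,2] [,3]
## [1,]    1    3    5
## [2,]    2    4    6
# ... while byrow = TRUE fills across the rows instead
matrix(1:6, nrow = 2, byrow = TRUE)
##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    4    5    6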

{reticulate} has a really nice explainer
of the differences between R (column-major) and python (row-major), and importantly,
the interop between these two.

So, is working with arrays actually so uncommon? I first thought of Julia,
and since it’s much newer than R and took a lot of inspiration from a variety
of languages, perhaps it works

a = [1, 2, 3, 4, 5]
a + 7
ERROR: MethodError: no method matching +(::Vector{Int64}, ::Int64)
For element-wise addition, use broadcasting with dot syntax: array .+ scalar

Not quite, but the error message is extremely helpful. Julia wants to perform
element-wise addition using the broadcasting operator . so it needs to be

a .+ 7
5-element Vector{Int64}:
  8
  9
 10
 11
 12

Still, that’s a “know the language” thing that’s outside of “add a number to a vector”,
so no credit.

Well, what about python?

>>> a = [1, 2, 3, 4, 5]
>>> a + 7
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can only concatenate list (not "int") to list

The canonical way, I believe, is to use a list comprehension

a = [1, 2, 3, 4, 5]
[i + 7 for i in a]
[8, 9, 10, 11, 12]

and we’re once more using a language feature that’s outside of “add a number to a
vector” so again, no credit. For the pedants: there is library support for this
if you use numpy

import numpy as np

a = [1, 2, 3, 4, 5]
np.array(a) + 7
array([ 8,  9, 10, 11, 12])

but I wouldn’t call that a success.

I asked ChatGPT what other languages could do this and it suggested MATLAB. Now,
that’s a proprietary language I don’t have access to, but Octave is an open-source
alternative that is more or less the same, so I tried that

a = [1, 2, 3, 4, 5];
a
a =

   1   2   3   4   5
a + 7
ans =

    8    9   10   11   12

Okay, yeah – a win for MATLAB. I remember using MATLAB back at Uni in
an early (second year?) Maths (differential equations?) course and it was
probably the very first time I was actually
introduced to a programming language. IIRC, “everything is a matrix” (which works
out okay for engineering and maths use-cases) so a) it probably isn’t surprising that
this works, and b) it makes sense that MATLAB gets lumped in with the “array languages”.

Thinking back to the other programming languages I’ve learned sufficiently, I wondered how Fortran
dealt with this – I used Fortran (90) for all of my PhD and postdoc calculations.
I loved that Fortran had vectors (and n-dimensional arrays) without
having to do any manual memory allocation, and for that reason alone it was well-suited to
theoretical physics modeling. I’ve been re-learning some Fortran via Exercism, so
I gave that a go

$ cat array.f90
program add_to_array

  implicit none
  integer, dimension(5) :: a

  a = (/1, 2, 3, 4, 5/)
  print *, a + 7

end program add_to_array

Compiling and running this…

$ gfortran -o array array.f90
$ ./array
           8           9          10          11          12

A win! Okay, a little ceremony to declare the vector itself, but that’s strict
typing for you.

With these results at hand, I think back to the question

Why isn’t this such a revelation to me?

I learned MATLAB, Fortran, then R, over the course of about a decade, and barely
touched other languages with any seriousness while doing so… I’ve been using
array languages more or less exclusively all this time.

“You merely learned to use arrays, I was born in them, molded by them”

Better still, they’re all column-major array languages.

I think this is why APL seems so beautiful to me – it does what I know I want and
it does it with the least amount of ceremony.

I wrote a bit about this in a previous post – that a language
can hide some complexity for you, like the fact that it does need to internally do a loop
over some elements in order to add two vectors, but when the language itself provides
an interface where you don’t have to worry about that, things get beautiful.

At PyConAU this year there was a keynote “The Complexity of Simplicity” which reminded me a lot of another post
“Complexity Has to Live Somewhere”.
I think APL really nailed removing a lot of the syntax complexity of a language, leaving
mainly just the operations you wish to perform. Haskell does something similar but adds back in
(albeit useful) language features that involve syntax.

Of the languages I did learn first, I would have to say that R wins over MATLAB and Fortran
in terms of suitability as a first programming language, but now that I recognise that the
“array” way of thinking comes along with that, I really do think it has a big advantage
over, say, python in terms of shaping that mindset. Sure, if you start out with numpy you
may gain that same advantage, but either way I would like to think there’s a lot to be
gained from starting with an “array-aware” language.

Did I overlook another language that can work so nicely with arrays? Have you reflected on
how you think in terms of arrays and programming in general? Let me know in the comments or
on Mastodon.

devtools::session_info()
## ─ Session info ───────────────────────────────────────────────────────────────
##  setting  value
##  version  R version 4.1.2 (2021-11-01)
##  os       Pop!_OS 22.04 LTS
##  system   x86_64, linux-gnu
##  ui       X11
##  language (EN)
##  collate  en_AU.UTF-8
##  ctype    en_AU.UTF-8
##  tz       Australia/Adelaide
##  date     2023-08-29
##  pandoc   3.1.1 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
## 
## ─ Packages ───────────────────────────────────────────────────────────────────
##  package     * version date (UTC) lib source
##  blogdown      1.17    2023-05-16 [1] CRAN (R 4.1.2)
##  bookdown      0.29    2022-09-12 [1] CRAN (R 4.1.2)
##  bslib         0.5.0   2023-06-09 [3] CRAN (R 4.3.1)
##  cachem        1.0.8   2023-05-01 [3] CRAN (R 4.3.0)
##  callr         3.7.3   2022-11-02 [3] CRAN (R 4.2.2)
##  cli           3.6.1   2023-03-23 [3] CRAN (R 4.2.3)
##  crayon        1.5.2   2022-09-29 [3] CRAN (R 4.2.1)
##  devtools      2.4.5   2022-10-11 [1] CRAN (R 4.1.2)
##  digest        0.6.33  2023-07-07 [3] CRAN (R 4.3.1)
##  ellipsis      0.3.2   2021-04-29 [3] CRAN (R 4.1.1)
##  evaluate      0.21    2023-05-05 [3] CRAN (R 4.3.0)
##  fastmap       1.1.1   2023-02-24 [3] CRAN (R 4.2.2)
##  fs            1.6.3   2023-07-20 [3] CRAN (R 4.3.1)
##  glue          1.6.2   2022-02-24 [3] CRAN (R 4.2.0)
##  htmltools     0.5.6   2023-08-10 [3] CRAN (R 4.3.1)
##  htmlwidgets   1.5.4   2021-09-08 [1] CRAN (R 4.1.2)
##  httpuv        1.6.6   2022-09-08 [1] CRAN (R 4.1.2)
##  jquerylib     0.1.4   2021-04-26 [3] CRAN (R 4.1.2)
##  jsonlite      1.8.7   2023-06-29 [3] CRAN (R 4.3.1)
##  knitr         1.43    2023-05-25 [3] CRAN (R 4.3.0)
##  later         1.3.0   2021-08-18 [1] CRAN (R 4.1.2)
##  lifecycle     1.0.3   2022-10-07 [3] CRAN (R 4.2.1)
##  magrittr      2.0.3   2022-03-30 [3] CRAN (R 4.2.0)
##  memoise       2.0.1   2021-11-26 [3] CRAN (R 4.2.0)
##  mime          0.12    2021-09-28 [3] CRAN (R 4.2.0)
##  miniUI        0.1.1.1 2018-05-18 [1] CRAN (R 4.1.2)
##  pkgbuild      1.4.0   2022-11-27 [1] CRAN (R 4.1.2)
##  pkgload       1.3.0   2022-06-27 [1] CRAN (R 4.1.2)
##  prettyunits   1.1.1   2020-01-24 [3] CRAN (R 4.0.1)
##  processx      3.8.2   2023-06-30 [3] CRAN (R 4.3.1)
##  profvis       0.3.7   2020-11-02 [1] CRAN (R 4.1.2)
##  promises      1.2.0.1 2021-02-11 [1] CRAN (R 4.1.2)
##  ps            1.7.5   2023-04-18 [3] CRAN (R 4.3.0)
##  purrr         1.0.1   2023-01-10 [1] CRAN (R 4.1.2)
##  R6            2.5.1   2021-08-19 [3] CRAN (R 4.2.0)
##  Rcpp          1.0.9   2022-07-08 [1] CRAN (R 4.1.2)
##  remotes       2.4.2   2021-11-30 [1] CRAN (R 4.1.2)
##  rlang         1.1.1   2023-04-28 [1] CRAN (R 4.1.2)
##  rmarkdown     2.23    2023-07-01 [3] CRAN (R 4.3.1)
##  rstudioapi    0.15.0  2023-07-07 [3] CRAN (R 4.3.1)
##  sass          0.4.7   2023-07-15 [3] CRAN (R 4.3.1)
##  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.1.2)
##  shiny         1.7.2   2022-07-19 [1] CRAN (R 4.1.2)
##  stringi       1.7.12  2023-01-11 [3] CRAN (R 4.2.2)
##  stringr       1.5.0   2022-12-02 [1] CRAN (R 4.1.2)
##  urlchecker    1.0.1   2021-11-30 [1] CRAN (R 4.1.2)
##  usethis       2.1.6   2022-05-25 [1] CRAN (R 4.1.2)
##  vctrs         0.6.3   2023-06-14 [1] CRAN (R 4.1.2)
##  xfun          0.40    2023-08-09 [3] CRAN (R 4.3.1)
##  xtable        1.8-4   2019-04-21 [1] CRAN (R 4.1.2)
##  yaml          2.3.7   2023-01-23 [3] CRAN (R 4.2.2)
## 
##  [1] /home/jono/R/x86_64-pc-linux-gnu-library/4.1
##  [2] /usr/local/lib/R/site-library
##  [3] /usr/lib/R/site-library
##  [4] /usr/lib/R/library
## 
## ──────────────────────────────────────────────────────────────────────────────