Author Archives: Jonathan Carroll

These Languages are Accumulating

By: Jonathan Carroll

Re-posted from: https://jcarroll.com.au/2024/11/28/these-languages-are-accumulating/

I keep saying that the more programming languages you know, the more you will
understand all the others you know – I’m now at the point where I want to solve
every problem I see in a handful of different languages. They all offer
different functionality, and some are certainly more suited to particular
problems than others, but there’s a world of difference between two characters
and importing from two libraries.

A newsletter I follow (and can’t find online copies of) that demonstrates neat
things in python (gotta learn it, despite not loving it) recently covered
accumulate, showing that sum and accumulate were sort of related

>>> my_list = [42, 73, 0, 16, 10]
>>> sum(my_list)
141
>>> from itertools import accumulate
>>> list(accumulate(my_list))
[42, 115, 115, 131, 141]

sum adds up all the elements of the list, while accumulate does the same but
keeps each successive partial sum.
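As a rough illustration of that relationship, here is a minimal sketch of the partial sums built with a plain loop. The helper name running_sums is hypothetical (it is not from the newsletter); the point is only that the last partial sum is the total that sum returns.

```python
from itertools import accumulate


def running_sums(values):
    """Return each successive partial sum of values (hypothetical helper)."""
    total = 0
    out = []
    for v in values:
        total += v         # carry the running total forward
        out.append(total)  # keep every intermediate result
    return out


my_list = [42, 73, 0, 16, 10]
partial = running_sums(my_list)

print(partial)                               # [42, 115, 115, 131, 141]
print(partial == list(accumulate(my_list)))  # True
print(partial[-1] == sum(my_list))           # True
```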

The newsletter rounds out the demo with an alternative function being used in accumulate

>>> from itertools import accumulate
>>> from operator import mul  # mul(2, 3) == 6
>>> initial_investment = 1000
>>> rates = [1.01, 1.01, 1.02, 1.025, 1.035, 1.035, 1.06]
>>> list(
...     accumulate(rates, mul, initial=initial_investment)
... )
[1000, 1010.0, 1020.1, 1040.502, 1066.515, 1103.843, 1142.477, 1211.026]

Now, firstly… from operator import mul??? It looks like there’s no way to
pass * as an argument to a function. I could define a function that performs
the same on known arguments, e.g. lambda x, y: x * y

>>> list(accumulate(rates, lambda x, y: x*y, initial=initial_investment))
[1000, 1010.0, 1020.1, 1040.502, 1066.5145499999999, 1103.8425592499998, 1142.4770488237498, 1211.0256717531747]

but… ew.

It’s possible that there’s a different way to approach this. A list
comprehension comes to mind, e.g. something like

>>> [sum(my_list[0:i]) for i in range(1, len(my_list)+1)]
[42, 115, 115, 131, 141]

but that requires performing a sum for each sub-interval, so performance would
not scale well (admittedly, that was not a consideration here at all). I also
don’t believe there’s a built-in prod so one must import math in order to do
similar

>>> import math
>>> x = [initial_investment] + rates
>>> [math.prod(x[0:i]) for i in range(1, len(x)+1)]
[1000, 1010.0, 1020.1, 1040.502, 1066.5145499999999, 1103.8425592499998, 1142.4770488237498, 1211.0256717531747]
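To put a number on that scaling concern, here is a small sketch that counts multiplications for the prefix-by-prefix approach against a single pass. The counted_mul wrapper is a made-up helper, and functools.reduce stands in for math.prod only so that the multiplications can be counted.

```python
from functools import reduce
from itertools import accumulate

calls = 0


def counted_mul(a, b):
    """Multiply two values and record that a multiplication happened."""
    global calls
    calls += 1
    return a * b


x = [1000, 1.01, 1.01, 1.02, 1.025, 1.035, 1.035, 1.06]

# Recompute the product of every prefix from scratch
calls = 0
by_prefix = [reduce(counted_mul, x[:i]) for i in range(1, len(x) + 1)]
prefix_calls = calls

# Carry the running product forward in one pass
calls = 0
single_pass = list(accumulate(x, counted_mul))
single_calls = calls

print(by_prefix == single_pass)    # True
print(prefix_calls, single_calls)  # 28 7
```

For n values the prefix version performs n(n-1)/2 multiplications against n-1 for the single pass, which is where the poor scaling comes from.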

In R that could use the built-in cumprod for the cumulative product

initial_investment <- 1000
rates = c(1.01, 1.01, 1.02, 1.025, 1.035, 1.035, 1.06)

cumprod(c(initial_investment, rates))
## [1] 1000.000 1010.000 1020.100 1040.502 1066.515 1103.843 1142.477 1211.026

but that has the ‘multiply’ operation hardcoded. cumsum uses + as the
function… hmmm. Maybe R doesn’t have a generalised accumulate?

I’ve been playing around with Haskell lately, so recursive functions to the
rescue! One feature of recursive functions in R that I really like is Recall
which calls the function in which it is defined with a new set of arguments –
perfect for recursion!

accumulate_recall <- function(x, f, i=x[1]) {
  if (!length(x)) return(NULL)
  c(i, Recall(tail(x, -1), f, f(i, x[2])))
}

It’s also robust against renaming the function; the body doesn’t actually call
accumulate_recall by name at all.

This might be inefficient, though – it’s not uncommon to blow out the stack, so
a new Tailcall function (which doesn’t have the same elegance of being robust
against renaming) helps with flagging this as something that can be optimised

accumulate <- function(x, f, i=x[1]) {
  if (!length(x)) return(NULL)
  c(i, Tailcall(accumulate, tail(x, -1), f, f(i, x[2])))
}

With this, I can emulate the cumsum() and cumprod() functions

cumprod(1:6)
## [1]   1   2   6  24 120 720
accumulate(1:6, `*`)
## [1]   1   2   6  24 120 720
cumsum(2:6)
## [1]  2  5  9 14 20
accumulate(2:6, `+`)
## [1]  2  5  9 14 20

unless I try to calculate something too big…

cumprod(5:15)
##  [1]           5          30         210        1680       15120      151200
##  [7]     1663200    19958400   259459200  3632428800 54486432000
accumulate(5:15, `*`)
## Warning in f(i, x[2]): NAs produced by integer overflow
##  [1]         5        30       210      1680     15120    151200   1663200
##  [8]  19958400 259459200        NA        NA

It appears that the built-in functions convert to numeric. That’s easily fixed
on input

accumulate(as.numeric(5:15), `*`)
##  [1]           5          30         210        1680       15120      151200
##  [7]     1663200    19958400   259459200  3632428800 54486432000
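For comparison, the same recursive shape can be sketched in Python. This is my own illustrative translation (the name accumulate_rec is hypothetical), not a recommended replacement for itertools.accumulate: Python's default recursion limit of 1000 frames means long inputs would fail in much the same way as blowing out the stack in R. Python integers do not overflow, so no conversion is needed for the larger products.

```python
from operator import add, mul


def accumulate_rec(xs, f):
    """Recursive scan: [x0, f(x0, x1), f(f(x0, x1), x2), ...]."""
    xs = list(xs)
    if len(xs) <= 1:
        return xs
    head = xs[0]
    # Fold the first two elements together, then recurse on the shorter list
    rest = accumulate_rec([f(head, xs[1])] + xs[2:], f)
    return [head] + rest


print(accumulate_rec(range(1, 7), mul))       # [1, 2, 6, 24, 120, 720]
print(accumulate_rec(range(2, 7), add))       # [2, 5, 9, 14, 20]
print(accumulate_rec(range(5, 16), mul)[-1])  # 54486432000
```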

In any case, there’s a generalised accumulate that takes the bare functions as
arguments.

But it can be so much cleaner than this!

In APL you won’t find any function named “sum” because it is just a reduction
(Reduce in R) with the function +

      sum←+/
      
      sum ⍳6 ⍝ sum the values 1:6
21

      sum 1↓⍳6 ⍝ sum the values 2:6
20

which in R is

sum(1:6)
## [1] 21
sum(2:6)
## [1] 20

Why would you write sum if you can just use +/? It’s fewer
characters to write out the implementation than the name!

For accumulate the terminology in APL is scan which uses a very similar
glyph because the operation itself is very similar; a reduce (/) is just the
last value of a scan (\) which keeps the progressive values. In both cases,
the operator (either slash) takes a binary function as its left operand and
produces a derived function (in these examples, effectively sum and prod)
which is then applied to values on the right. The scan version does the same

      +\⍳6
1 3 6 10 15 21

      ×\⍳6
1 2 6 24 120 720

which matches the R function defined above

accumulate(1:6, `+`)
## [1]  1  3  6 10 15 21
accumulate(1:6, `*`)
## [1]   1   2   6  24 120 720
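The claim that a reduce is just the last value of a scan is easy to check outside APL. A minimal sketch in Python, using functools.reduce and itertools.accumulate as stand-ins for / and \ (the variable names are my own):

```python
from functools import reduce
from itertools import accumulate
from operator import add, mul

values = range(1, 7)  # 1..6, like ⍳6

sums = list(accumulate(values, add))      # scan with +
products = list(accumulate(values, mul))  # scan with ×

print(sums, reduce(add, values))      # [1, 3, 6, 10, 15, 21] 21
print(products, reduce(mul, values))  # [1, 2, 6, 24, 120, 720] 720
print(sums[-1] == reduce(add, values), products[-1] == reduce(mul, values))  # True True
```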

As for the rates example above, we concatenate the initial value with catenate
(,) just like the R example, but otherwise this works fine

      rates ← 1.01 1.01 1.02 1.025 1.035 1.035 1.06
      inv ← 1000
      
      ×/inv, rates
1211.025672

      ×\inv, rates
1000 1010 1020.1 1040.502 1066.51455 1103.842559 1142.477049 1211.025672

So all of that recursive R code made to generalise the cumulative application of
a function provided as an argument is boiled down to just the single glyph \.
Outstanding!

What’s more, there are lots of binary functions one would use this with, all
of which have spelled-out names in other languages

      +/ ⍝ sum (add)
      ×/ ⍝ prod (multiply)
      ∧/ ⍝ all (and)
      ∨/ ⍝ any (or)
      ⌈/ ⍝ maximum (max)
      ⌊/ ⍝ minimum (min)
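The same idea carries over to the scan form of each of these. As a sketch in Python (the sample data is made up for illustration), passing max, min, and the operator module's and_ and or_ to itertools.accumulate gives running versions, and the last element of each agrees with the spelled-out built-in:

```python
from itertools import accumulate
from operator import and_, or_

values = [3, 1, 4, 1, 5, 9, 2, 6]
flags = [True, True, False, True]

running_max = list(accumulate(values, max))
running_min = list(accumulate(values, min))
running_all = list(accumulate(flags, and_))
running_any = list(accumulate(flags, or_))

print(running_max)  # [3, 3, 4, 4, 5, 9, 9, 9]
print(running_min)  # [3, 1, 1, 1, 1, 1, 1, 1]
print(running_all)  # [True, True, False, False]
print(running_any)  # [True, True, True, True]
```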

In summary, it seems that looking across these languages, the available options
range from a single glyph for scan along with the bare binary operator, e.g.
×\; a cumprod() function which isn’t well-generalised but works out of the
box; and then there’s whatever mess this is (once you’ve imported these)

>>> from itertools import accumulate
>>> from operator import mul
>>> list(accumulate(rates, mul, initial=initial_investment))

Where did we go so wrong?

For what it’s worth, Julia has a reduce and an accumulate that behave very
nicely; generalised for the binary function as an argument

julia> reduce(+, 1:6)
21

julia> reduce(*, 1:6)
720

julia> accumulate(+, 1:6)
6-element Vector{Int64}:
  1
  3
  6
 10
 15
 21

julia> accumulate(*, 1:6)
6-element Vector{Int64}:
   1
   2
   6
  24
 120
 720

This is extremely close to the APL approach, but with longer worded names for
the reduce and scan operators. It also defines the more convenient sum,
prod, cumsum, and cumprod; no shortage of ways to do this in Julia!

In Haskell, foldl and scanl are the (left-associative) versions of reduce
and accumulate, and passing an infix operator as an argument necessitates
wrapping it in parentheses

ghci> foldl (+) 0 [1..6]
21

ghci> scanl (+) 0 [1..6]
[0,1,3,6,10,15,21]

ghci> foldl (*) 1 [1..6]
720

ghci> scanl (*) 1 [1..6]
[1,1,2,6,24,120,720]

This requires an explicit starting value, unless one uses the specialised
versions which use the first value as an initial value

ghci> foldl1 (+) [1..6]
21

ghci> scanl1 (+) [1..6]
[1,3,6,10,15,21]

ghci> foldl1 (*) [1..6]
720

ghci> scanl1 (*) [1..6]
[1,2,6,24,120,720]
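The seeded and unseeded variants have a rough counterpart in Python's itertools.accumulate, where the keyword-only initial argument plays the role of the explicit starting value. A sketch of the correspondence (the variable names are mine, and the mapping to scanl and scanl1 is only an analogy):

```python
from itertools import accumulate
from operator import add, mul

values = range(1, 7)

seeded_add = list(accumulate(values, add, initial=0))  # like scanl (+) 0 [1..6]
unseeded_add = list(accumulate(values, add))           # like scanl1 (+) [1..6]
seeded_mul = list(accumulate(values, mul, initial=1))  # like scanl (*) 1 [1..6]

print(seeded_add)    # [0, 1, 3, 6, 10, 15, 21]
print(unseeded_add)  # [1, 3, 6, 10, 15, 21]
print(seeded_mul)    # [1, 1, 2, 6, 24, 120, 720]
```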

I started this post hoping to demonstrate how nice the APL syntax was for this,
but the detour through generalising the R function was a lot of unexpected fun
as well.

Comments, improvements, or your own solutions are most welcome. I can be found
on Mastodon or use the comments below.

Addendums

It should probably be noted that R does have a function scan but it’s for
reading data into a vector – if you ever spot someone using it for that… run.
I have war stories about that function.

I’d love to hear how this is accomplished in some other languages, too – does
yours have a built-in accumulate that takes a binary function?

devtools::session_info()
## ─ Session info ───────────────────────────────────────────────────────────────
##  setting  value
##  version  R version 4.4.1 (2024-06-14)
##  os       macOS Sonoma 14.6
##  system   aarch64, darwin20
##  ui       X11
##  language (EN)
##  collate  en_US.UTF-8
##  ctype    en_US.UTF-8
##  tz       Australia/Adelaide
##  date     2024-11-28
##  pandoc   3.5 @ /opt/homebrew/bin/ (via rmarkdown)
## 
## ─ Packages ───────────────────────────────────────────────────────────────────
##  package     * version    date (UTC) lib source
##  blogdown      1.19       2024-02-01 [1] CRAN (R 4.4.0)
##  bookdown      0.41       2024-10-16 [1] CRAN (R 4.4.1)
##  bslib         0.8.0      2024-07-29 [1] CRAN (R 4.4.0)
##  cachem        1.1.0      2024-05-16 [1] CRAN (R 4.4.0)
##  cli           3.6.3      2024-06-21 [1] CRAN (R 4.4.0)
##  devtools      2.4.5      2022-10-11 [1] CRAN (R 4.4.0)
##  digest        0.6.37     2024-08-19 [1] CRAN (R 4.4.1)
##  ellipsis      0.3.2      2021-04-29 [1] CRAN (R 4.4.0)
##  evaluate      1.0.1      2024-10-10 [1] CRAN (R 4.4.1)
##  fastmap       1.2.0      2024-05-15 [1] CRAN (R 4.4.0)
##  fs            1.6.5      2024-10-30 [1] CRAN (R 4.4.1)
##  glue          1.8.0      2024-09-30 [1] CRAN (R 4.4.1)
##  htmltools     0.5.8.1    2024-04-04 [1] CRAN (R 4.4.0)
##  htmlwidgets   1.6.4      2023-12-06 [1] CRAN (R 4.4.0)
##  httpuv        1.6.15     2024-03-26 [1] CRAN (R 4.4.0)
##  jquerylib     0.1.4      2021-04-26 [1] CRAN (R 4.4.0)
##  jsonlite      1.8.9      2024-09-20 [1] CRAN (R 4.4.1)
##  knitr         1.48       2024-07-07 [1] CRAN (R 4.4.0)
##  later         1.3.2      2023-12-06 [1] CRAN (R 4.4.0)
##  lifecycle     1.0.4      2023-11-07 [1] CRAN (R 4.4.0)
##  magrittr      2.0.3      2022-03-30 [1] CRAN (R 4.4.0)
##  memoise       2.0.1      2021-11-26 [1] CRAN (R 4.4.0)
##  mime          0.12       2021-09-28 [1] CRAN (R 4.4.0)
##  miniUI        0.1.1.1    2018-05-18 [1] CRAN (R 4.4.0)
##  pkgbuild      1.4.5      2024-10-28 [1] CRAN (R 4.4.1)
##  pkgload       1.4.0      2024-06-28 [1] CRAN (R 4.4.0)
##  profvis       0.4.0      2024-09-20 [1] CRAN (R 4.4.1)
##  promises      1.3.0      2024-04-05 [1] CRAN (R 4.4.0)
##  purrr         1.0.2      2023-08-10 [1] CRAN (R 4.4.0)
##  R6            2.5.1      2021-08-19 [1] CRAN (R 4.4.0)
##  Rcpp          1.0.13-1   2024-11-02 [1] CRAN (R 4.4.1)
##  remotes       2.5.0.9000 2024-11-03 [1] Github (r-lib/remotes@5b7eb08)
##  rlang         1.1.4      2024-06-04 [1] CRAN (R 4.4.0)
##  rmarkdown     2.28       2024-08-17 [1] CRAN (R 4.4.0)
##  rstudioapi    0.17.1     2024-10-22 [1] CRAN (R 4.4.1)
##  sass          0.4.9      2024-03-15 [1] CRAN (R 4.4.0)
##  sessioninfo   1.2.2      2021-12-06 [1] CRAN (R 4.4.0)
##  shiny         1.9.1      2024-08-01 [1] CRAN (R 4.4.0)
##  urlchecker    1.0.1      2021-11-30 [1] CRAN (R 4.4.0)
##  usethis       3.0.0      2024-07-29 [1] CRAN (R 4.4.0)
##  vctrs         0.6.5      2023-12-01 [1] CRAN (R 4.4.0)
##  xfun          0.49       2024-10-31 [1] CRAN (R 4.4.1)
##  xtable        1.8-4      2019-04-21 [1] CRAN (R 4.4.0)
##  yaml          2.3.10     2024-07-26 [1] CRAN (R 4.4.0)
## 
##  [1] /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/library
## 
## ──────────────────────────────────────────────────────────────────────────────

Polyglot Maxxie and Minnie

By: Jonathan Carroll

Re-posted from: https://jcarroll.com.au/2024/10/26/polyglot-maxxie-and-minnie/

Continuing my theme of learning all the languages, I took the opportunity of a
programming puzzle to try out the same approach in a handful of different
languages to compare how they work.

For an upcoming APL’ers meetup the challenge was set as posed at the end of
this post, namely

Maxxie and Minnie

The maxxie of a number n is the largest number you can achieve by swapping two
of its digits (in decimal) (or choosing not to swap if it is already the largest
possible). The minnie is the smallest with one swap (though you can’t swap a
zero digit into the most significant position).

Your task is to write a function that takes an integer and returns a tuple of
the maxxie and minnie.

Notes

  • Swap any two decimal digits
  • No leading zeroes
  • Don’t swap if you can’t make it bigger/smaller

with the example solutions given in Clojure

(swapmaxmin 213) ;=> [312, 123]
(swapmaxmin 12345) ;=> [52341, 12345] ;; the number was already the smallest
(swapmaxmin 100) ;=> [100, 100] ;; no swap possible because of zeroes

This seems like fun – and I wanted to see how solutions might look across some
of the different languages I know (including an APL, for the sake of the upcoming
meetup).

I ended up using R, (Dyalog) APL, Julia, Haskell, Python, and Rust; someone
provided a J solution; and I’ll add in any others shared with me. The site
linked above collected Clojure solutions in
this gist.

The common approach I used in all of these cases was:

  • split the number into a vector of digits
  • generate all possible combinations of indices to be swapped
  • apply a swap function to perform all of those swaps
  • append the unswapped vector, if not already included
  • filter out any vectors which start with a 0
  • recombine each vector into a single number
  • return the maximum and minimum numbers
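The steps above can be sketched compactly in Python. This is an illustrative version only, with hypothetical helper names (digits, undigits, swapped, maxxie_minnie), and it deliberately splits and recombines the number arithmetically (a base-10 decode written as a loop) rather than going through strings. Letting the two indices be equal covers the "unswapped vector" step.

```python
def digits(n, base=10):
    """Split a non-negative integer into digits, most significant first."""
    if n == 0:
        return [0]
    out = []
    while n > 0:
        n, d = divmod(n, base)
        out.append(d)
    return out[::-1]


def undigits(ds, base=10):
    """Recombine digits into one integer (base-10 decode as a loop)."""
    total = 0
    for d in ds:
        total = total * base + d
    return total


def swapped(ds, i, j):
    """Return a copy of ds with positions i and j exchanged."""
    out = list(ds)
    out[i], out[j] = out[j], out[i]
    return out


def maxxie_minnie(n):
    ds = digits(n)
    idx = range(len(ds))
    # j starts at i, so the no-op swap (i == j) keeps the original number
    candidates = [swapped(ds, i, j) for i in idx for j in idx if j >= i]
    keep = [c for c in candidates if c[0] != 0]  # no leading zeroes
    vals = [undigits(c) for c in keep]
    return (max(vals), min(vals))


print(maxxie_minnie(213))    # (312, 123)
print(maxxie_minnie(12345))  # (52341, 12345)
print(maxxie_minnie(100))    # (100, 100)
```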

Here are my solutions in each language, presented one language at a time
rather than side by side. The full set of files is here if you’re interested.

  • R

    I’m most familiar with R, so I like to start there. I created a swap
    function that swaps a vector at some indices, along with some helpers
    so that I could use pmap_int() really cleanly.

    swap <- function(x, y, v) {
       xx <- v[x]
       yy <- v[y]
       v[x] <- yy
       v[y] <- xx
       v
    }
    
    chr_swap <- function(x, y, v) {
       paste0(swap(x, y, v), collapse = "")
    }
    
    toInt_swap <- function(x, y, v) {
       as.integer(chr_swap(x, y, v))
    }
    
    maxmin <- function(num) {
      chars <- strsplit(as.character(num), "")[[1]]
      n <- nchar(num)
      s <- seq_len(n)
      opts <- expand.grid(x = s, y = s)
      opts$v <- list(chars)
      vals <- purrr::pmap_int(opts, toInt_swap)
      keeps <- vals[nchar(vals) == n]
      c(max(keeps), min(keeps))
    }
    
    maxmin(213)
    [1] 312 123  
    maxmin(12345)  
    [1] 52341 12345  
    maxmin(100)
    [1] 100 100  
    maxmin(11321)
    [1] 31121 11123 

    The expand.grid() does create some redundant combinations, but these fall
    out naturally so I didn’t bother filtering them out. Also, since this
    includes no-op swaps (e.g. swapping index 2 and 2) it already contains
    the original vector. Rather than filtering to the vectors of integers not
    starting with 0, I filtered to those which contain the right number of digits
    after converting back to integer, which is equivalent.

    Try pasting the code into the
    {webr} online editor here; I’m not sure
    if it’s possible to link to an existing state, but when it asks if you want
    to install {purrr} to the interface, respond that you do.

  • APL

    In Dyalog APL it’s easier to define a swap function; the @ operator takes a
    function (reverse) so s here performs a swap. The outer product is super
    handy for finding all the combinations of x and y: x ∘., y.

    maxmin←{
      ⎕IO←1  ⍝ so that x[1] is subset not x[0]
      n←⍎¨⍕⍵  ⍝ convert int to vec 
      s←{⌽@⍵⊢⍺}  ⍝ swap two elements
      swaps←{n s ⍵}  ⍝ apply swaps to a vec n
      opts←,(⍳≢n)∘.,⍳≢n ⍝ combinations of 1..n
      new←swaps¨opts  ⍝ perform the swaps
      keep←(~0=⊃¨new)/new  ⍝ filter out values starting with 0
      (⌈/,⌊/)10⊥¨keep  ⍝ max and min of ints
    }
    
         maxmin 213 
    312 123
         maxmin 12345 
    52341 12345
         maxmin 100 
    100 100
         maxmin 11321
    31121 11123

    I’m quite pleased with this solution; performing a map is as simple as
    using each (¨) and performing both max and min concatenated together
    with a fork ((⌈/,⌊/)) is just so aesthetic. Conversion from a vector of
    numbers to a single number uses a base-10 decode (10⊥) which is how one
    might need to do that in other languages, but with a loop.

    If I was to take some liberties with what one calls a ‘line’, I could say
    that this is a 1-line solution

    maxmin←{⎕IO←1 ⋄ n←⍎¨⍕⍵ ⋄ s←{⌽@⍵⊢⍺} ⋄ swaps←{n s ⍵} ⋄ opts←,(⍳≢n)∘.,⍳≢n ⋄ new←swaps¨opts ⋄ keep←(~0=⊃¨new)/new ⋄ (⌈/,⌊/)10⊥¨keep }

    You can try this out yourself at tryapl.org

  • Julia

    In Julia the swap function can use destructuring which is nice, but since
    the language uses pass-by-reference semantics, I need to make a copy of the
    vector being swapped, otherwise I’ll just keep swapping it over and over.
    See this recent post of mine for more on that.

    using Combinatorics
    
    function swap(x, i, j)
      y = copy(x)
      y[i], y[j] = y[j], y[i]
      y
    end
    
    function maxmin(x)
        nvec = parse.(Int64, split(string(x), ""))
        opts = collect(combinations(1:length(nvec), 2))
        new = [[nvec]; map(x -> swap(nvec, x...), opts)]
        keep = filter(x -> x[1] != 0, new)
        vals = parse.(Int64, join.(keep))
        (maximum(vals), minimum(vals))
    end
    
    maxmin(213)
    (312, 123)
    maxmin(12345)
    (52341, 12345)  
    maxmin(100)
    (100, 100)  
    maxmin(11321)
    (31121, 11123)    

    The part I probably had the most trouble with here was concatenating together
    the original vector with its swapped versions; it looks clean now, but
    figuring out how to get those all into the same vector-of-vectors took me a
    while.

    The splatting of opts variants in the map was nice; no need to define the
    swap in terms of a tuple. Overall, this is a very clean solution, in my
    opinion – Julia really does make for a lovely language.

  • Haskell

    Continuing my Haskell-learning journey, I figured it would be best to have a
    go at this. As a heavily functional language, one doesn’t do a lot of
    defining of variables, instead one writes a lot of functions which will pass
    data around. This makes it a bit tricky for testing, but I got there
    eventually. I did have to borrow the swapElts function, and nub was a new
    one for me (essentially unique()).

    import Data.List
    import Data.Digits
    
    uniq_pairs l = nub [(x,y) | x <- l, y <- l, x < y]
    opts n = uniq_pairs [0..n-1]
    -- https://gist.github.com/ijt/2010183
    swapElts i j ls = [get k x | (k, x) <- zip [0..length ls - 1] ls]
        where get k x | k == i = ls !! j
                      | k == j = ls !! i
                      | otherwise = x
    doswap t v = swapElts (fst t) (snd t) v
    newlist v = v : map (\x ->  doswap x v) (opts (length v))
    keep v = filter (\x -> (head x /= 0)) (newlist v)
    maxmin n = (maximum(x), minimum(x)) where 
      x = map (unDigits 10) (keep (digits 10 n))
    
    maxmin 213
    (312,123)
    maxmin 12345
    (52341,12345)
    maxmin 100
    (100,100)
    maxmin 11321
    (31121,11123)

    The Data.Digits package was very helpful here – having digits and
    unDigits, though if I was going to use these more I would have curried
    the required base 10 into something like digits10 and unDigits10.

    There are likely improvements to be made here, and I’m interested in any you
    can spot!

  • Python

    “Everyone” uses it, so I gotta learn it… is what I keep telling myself. I’m
    no stranger to the quirks of different languages, but every time I try
    to do something functional in python I end up angry that the print method for
    generators shows the memory address instead of, say, the first few elements.
    Printing a value and seeing <map at 0x7fb928d4a2c0> gets me every. single.
    time. Yes, yes, list(value) “collects” it, but grrr…

    Python has the destructuring syntax which is nice in the swap function, but
    again it’s pass-by-reference so I need to make a copy first.

    import itertools
    
    def swap(x, t):
        y = x.copy()
        i, j = t
        y[i], y[j] = y[j], y[i]
        return y
    
    def minmax(num): 
        nums = [int(i) for i in str(num)]
        opts = itertools.combinations(range(len(nums)), 2)
        new = map(lambda x: swap(nums, x), list(opts))
        keeps = list(filter(lambda x: x[0] != 0, new))
        keeps.append(nums)
        vals = list(map(lambda x: int(''.join(map(str, x))), keeps))
        return (max(vals), min(vals))
    
    minmax(213)
    (312, 123)
    minmax(12345)
    (52341, 12345)
    minmax(100)
    (100, 100)
    minmax(11321)
    (31121, 11123)

    Aside from my grumbles while writing it, the solution is still pretty clean.
    The calls to list() interspersed throughout might be avoidable, but the
    need to do that while developing at least slowed me down.

  • Rust

    I almost didn’t do a Rust solution because I thought I’d done enough. It
    ended up being the most complicated, though – I’m not sure if that’s because
    of me, or Rust.

    use itertools::Itertools;
    
    fn swap(v: Vec<u32>, t1: usize, t2: usize) -> Vec<u32> {
        let mut vv = v;
        let tmp1 = vv[t1];
        let tmp2 = vv[t2];
        vv[t1] = tmp2;
        vv[t2] = tmp1;
        return vv;
    }
    
    fn maxmin(num: u32) -> (u32, u32) {
        let numc = num.to_string();
        let n = numc.len();
        let numv: Vec<u32> = numc
            .to_string()
            .chars()
            .map(|c| c.to_digit(10).unwrap())
            .collect();
        let mut opts = Vec::new();
        for (a, b) in (0..n).tuple_combinations() {
            opts.push((a, b));
        }
        let mut new: Vec<Vec<u32>> = Vec::new();
        new.push(numv.clone());
        for o in opts {
            new.push(swap(numv.clone(), o.0, o.1));
        }
        let keeps: Vec<Vec<u32>> = new.into_iter().filter(|x| x[0] != 0).collect();
        let mut vals = Vec::new();
        for v in keeps {
            let tmp: u32 = v
                .clone()
                .into_iter()
                .map(|x| x.to_string())
                .collect::<String>()
                .parse()
                .unwrap();
            vals.push(tmp);
        }
        let min = *vals.iter().min().unwrap();
        let max = *vals.iter().max().unwrap();
        (max, min)
    }
    
    fn main() {
        println!("{:?}", maxmin(213));
        println!("{:?}", maxmin(12345));
        println!("{:?}", maxmin(100));
        println!("{:?}", maxmin(11321))
    }
    (312, 123)
    (52341, 12345)
    (100, 100)
    (31121, 11123)

    This solution reminded me why I like working with array (or
    at least vector-supporting) languages; not needing to explicitly loop over
    every element of a vector to do something. I had to write a lot of push()
    loops to move data around. max() doesn’t work on a vector (in the sense of
    finding the maximum of n elements); it works that way on an iterator, and may
    fail, hence the longer min and max lines.

    Having to clone() various values explicitly because they can’t be re-used
    was a bit annoying, but I understand why it complains about those.

    This took longer than I would have liked, but of course I learned more by
    doing it.

  • J

    At the APL meetup we discussed one partial J solution which used a slightly
    different approach to the ‘swap’ algorithm. I’m not sure that there is a
    way in J that’s as elegant as the APL solution, but I’d be interested if
    there is.

    Justus Perlwitz offered this solution, the essence of which is

    digits =: 10&#.^:_1
    
    sd =: {{
      amend =. (|.y)}
      swap =. (y { ]) amend ]
      swap &.: digits x
    }}
    
    cart =: {{
      all =. ,/ (,"0)/~ y
      uniq =. ~. /:~"1 all
      l =. 0{"1 uniq
      r =. 1{"1 uniq
      (l ~: r) # uniq
    }}
    
    swapmaxmin =: {{
      ndigits =. [: # digits
      combs =. cart i. ndigits y
      constr =. ((ndigits y) <: [: ndigits"0 ]) # ]
      swaps =. constr y, y sd"1 combs
      (>./ , <./) swaps
    }}
    
    swapmaxmin 213
    312 123
    swapmaxmin 12345
    52341 12345
    swapmaxmin 100
    100 100
    swapmaxmin 11321
    31121 11123

    and which you can run in the J playground

    There’s a lot I want to learn about J, so I’ll be digging through this
    solution myself.

Summary

I was most pleased with the APL solution; it does what it says on the box
without ambiguity because it’s constructed entirely from primitives (or utility
functions defined in terms of those). The Julia solution also feels very clean,
while the Haskell solution, defined entirely from functions, nicely demonstrates
the functional principle.

I found it to be an interesting example of where pass-by-reference is not so
helpful. For packaged Julia functions that distinction is made clear with the
! suffix to denote mutating functions, and it’s common to write both a
mutating and non-mutating version wherever possible.
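To make the pitfall concrete, here is a small Python sketch of a swap with and without the copy; the function names are hypothetical. Python has no ! naming convention, but its built-ins follow a similar split: list.sort() mutates in place while sorted() returns a new list.

```python
def swap_in_place(xs, i, j):
    """Swap two elements of xs itself; the caller's list changes."""
    xs[i], xs[j] = xs[j], xs[i]
    return xs


def swap_copy(xs, i, j):
    """Swap two elements of a copy; the caller's list is left alone."""
    ys = list(xs)
    ys[i], ys[j] = ys[j], ys[i]
    return ys


nums = [2, 1, 3]

a = swap_copy(nums, 0, 2)
print(a, nums)  # [3, 1, 2] [2, 1, 3]

b = swap_in_place(nums, 0, 2)
print(b, nums)    # [3, 1, 2] [3, 1, 2]
print(b is nums)  # True: same object, so repeated swaps would compound
```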

Writing these taught me more and more about using each of these languages, and
I’m of the opinion that just reading solutions is no substitute for getting your
hands dirty in some actual code.


Comments, improvements, or your own solutions are most welcome. I can be found on
Mastodon or use the comments below.

devtools::session_info()
## ─ Session info ───────────────────────────────────────────────────────────────
##  setting  value
##  version  R version 4.4.1 (2024-06-14)
##  os       macOS Sonoma 14.6
##  system   aarch64, darwin20
##  ui       X11
##  language (EN)
##  collate  en_US.UTF-8
##  ctype    en_US.UTF-8
##  tz       Australia/Adelaide
##  date     2024-10-26
##  pandoc   3.2 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/aarch64/ (via rmarkdown)
## 
## ─ Packages ───────────────────────────────────────────────────────────────────
##  package     * version date (UTC) lib source
##  blogdown      1.19    2024-02-01 [1] CRAN (R 4.4.0)
##  bookdown      0.41    2024-10-16 [1] CRAN (R 4.4.1)
##  bslib         0.8.0   2024-07-29 [1] CRAN (R 4.4.0)
##  cachem        1.1.0   2024-05-16 [1] CRAN (R 4.4.0)
##  cli           3.6.3   2024-06-21 [1] CRAN (R 4.4.0)
##  devtools      2.4.5   2022-10-11 [1] CRAN (R 4.4.0)
##  digest        0.6.37  2024-08-19 [1] CRAN (R 4.4.1)
##  ellipsis      0.3.2   2021-04-29 [1] CRAN (R 4.4.0)
##  evaluate      1.0.1   2024-10-10 [1] CRAN (R 4.4.1)
##  fastmap       1.2.0   2024-05-15 [1] CRAN (R 4.4.0)
##  fs            1.6.4   2024-04-25 [1] CRAN (R 4.4.0)
##  glue          1.8.0   2024-09-30 [1] CRAN (R 4.4.1)
##  htmltools     0.5.8.1 2024-04-04 [1] CRAN (R 4.4.0)
##  htmlwidgets   1.6.4   2023-12-06 [1] CRAN (R 4.4.0)
##  httpuv        1.6.15  2024-03-26 [1] CRAN (R 4.4.0)
##  jquerylib     0.1.4   2021-04-26 [1] CRAN (R 4.4.0)
##  jsonlite      1.8.9   2024-09-20 [1] CRAN (R 4.4.1)
##  knitr         1.48    2024-07-07 [1] CRAN (R 4.4.0)
##  later         1.3.2   2023-12-06 [1] CRAN (R 4.4.0)
##  lifecycle     1.0.4   2023-11-07 [1] CRAN (R 4.4.0)
##  magrittr      2.0.3   2022-03-30 [1] CRAN (R 4.4.0)
##  memoise       2.0.1   2021-11-26 [1] CRAN (R 4.4.0)
##  mime          0.12    2021-09-28 [1] CRAN (R 4.4.0)
##  miniUI        0.1.1.1 2018-05-18 [1] CRAN (R 4.4.0)
##  pkgbuild      1.4.4   2024-03-17 [1] CRAN (R 4.4.0)
##  pkgload       1.4.0   2024-06-28 [1] CRAN (R 4.4.0)
##  profvis       0.4.0   2024-09-20 [1] CRAN (R 4.4.1)
##  promises      1.3.0   2024-04-05 [1] CRAN (R 4.4.0)
##  purrr         1.0.2   2023-08-10 [1] CRAN (R 4.4.0)
##  R6            2.5.1   2021-08-19 [1] CRAN (R 4.4.0)
##  Rcpp          1.0.13  2024-07-17 [1] CRAN (R 4.4.0)
##  remotes       2.5.0   2024-03-17 [1] CRAN (R 4.4.0)
##  rlang         1.1.4   2024-06-04 [1] CRAN (R 4.4.0)
##  rmarkdown     2.28    2024-08-17 [1] CRAN (R 4.4.0)
##  rstudioapi    0.17.0  2024-10-16 [1] CRAN (R 4.4.1)
##  sass          0.4.9   2024-03-15 [1] CRAN (R 4.4.0)
##  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.4.0)
##  shiny         1.9.1   2024-08-01 [1] CRAN (R 4.4.0)
##  urlchecker    1.0.1   2021-11-30 [1] CRAN (R 4.4.0)
##  usethis       3.0.0   2024-07-29 [1] CRAN (R 4.4.0)
##  vctrs         0.6.5   2023-12-01 [1] CRAN (R 4.4.0)
##  xfun          0.48    2024-10-03 [1] CRAN (R 4.4.1)
##  xtable        1.8-4   2019-04-21 [1] CRAN (R 4.4.0)
##  yaml          2.3.10  2024-07-26 [1] CRAN (R 4.4.0)
## 
##  [1] /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/library
## 
## ──────────────────────────────────────────────────────────────────────────────

In-Place Modifications

By: Jonathan Carroll

Re-posted from: https://jcarroll.com.au/2024/09/25/in-place-modifications/

In this post I explore some differences between R, python, julia, and APL in
terms of mutability, and try to make something that probably shouldn’t exist.

I watched this code_report video which describes
a leetcode problem:

You are given an integer array nums, an integer k, and an integer multiplier.

You need to perform k operations on nums. In each operation:

  • Find the minimum value x in nums. If there are multiple occurrences of the minimum value, select the one that appears first.
  • Replace the selected minimum value x with x * multiplier.

Return an integer array denoting the final state of nums after performing all k operations.

Conor’s python solution in the video was

def getFinalState(nums, k, m): 
  for _ in range(k): 
    i = nums.index(min(nums)) 
    nums[i] *= m
  return nums

x = [2, 1, 3, 5, 6]
k = 5
mult = 2

getFinalState(x, k, mult)
## [8, 4, 6, 5, 6]

and, as always, I wanted to see how I’d do that in R. I came up with this

getFinalState = function(nums, k, mult) {
  for (i in seq_len(k)) {
    # index of the first minimum; which.min() already returns the first match
    m <- which.min(nums)
    nums[m] <- mult * nums[m]
  }
  nums
}

x <- c(2, 1, 3, 5, 6)
k <- 5
mult <- 2

getFinalState(x, k, mult)
## [1] 8 4 6 5 6

It’s worth noting that I can’t use a map in this function because iterations
are dependent; the minimum value at any iteration depends on the previous
values.
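One way to express that dependence is as a fold, where each step receives the result of the previous one. The sketch below is my own illustration, assuming a hypothetical non-mutating step function; it is not the solution from the video.

```python
from functools import reduce


def step(nums, multiplier):
    """One operation: multiply the first minimum, returning a new list."""
    i = nums.index(min(nums))
    return nums[:i] + [nums[i] * multiplier] + nums[i + 1:]


def final_state(nums, k, multiplier):
    # Each step consumes the previous state, so this is a fold, not a map;
    # the loop variable from range(k) is ignored.
    return reduce(lambda state, _: step(state, multiplier), range(k), list(nums))


x = [2, 1, 3, 5, 6]
result = final_state(x, 5, 2)

print(result)  # [8, 4, 6, 5, 6]
print(x)       # [2, 1, 3, 5, 6] (unchanged)
```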

I also had a chance to discuss this solution with some APL’ers at a meetup and
a J solution was presented, but I don’t think I wrote it down.

My solution is nearly word-for-word the same as the python solution with a
couple of notable exceptions arising from the difference between the two
languages:

First, R has which.min() as a built-in rather than needing to query the index
of the minimum value (and two references to nums). Also, R has no compound
assignment like x *= 2 which modifies in-place – the closest thing I can think
of is the %<>% operator in {magrittr} (not re-exported in {dplyr} because this
behaviour is considered bad practice in R, despite not really being “in-place”)

library(magrittr)

m <- data.frame(x = 1:6, y = letters[1:6])
m
##   x y
## 1 1 a
## 2 2 b
## 3 3 c
## 4 4 d
## 5 5 e
## 6 6 f
m %<>% head(2)
m
##   x y
## 1 1 a
## 2 2 b

although I can certainly see the case for it – this operator avoids repeating
the variable being used and assigned, because the alternative using the
traditional pipe is

m <- data.frame(x = 1:6, y = letters[1:6])
m
##   x y
## 1 1 a
## 2 2 b
## 3 3 c
## 4 4 d
## 5 5 e
## 6 6 f
m <- m %>% head(2)
m
##   x y
## 1 1 a
## 2 2 b

One could argue that writing out even a longer variable name twice still makes
it clear that shadowing is taking place; the value is being overwritten with
a new value, but it does feel a little frustrating to have to type it out twice

important_variable <- important_variable * 2

Back to my R solution, the indexing at a specific set of values got me thinking
that it would be clean if we could pass a function to [ so that we could
write

nums[which.min] <- value

(maybe not so much for this example where m is used twice, but it piqued my
interest)

Let’s say I want to set all the even values of a vector to some other value.
That’s easy enough to do

x[x %% 2 == 0] <- 0

but I don’t love that it requires two references to x, which may (should?) be
a much longer name

important_variable[important_variable %% 2 == 0] <- 0

I want something like x[f] <- y to set the values of x where f(x) is
TRUE to y. This seemed like it might be possible, maybe with a function
method to [<-, but [<- dispatches on the class of x, not what’s inside
[, so no dice. In theory (which will never happen) the built-in [<- could
have some branch logic for dealing with a function passed as the indices to be
modified, but I’m not about to go rebuilding R from source myself just to play
with that idea.

Nonetheless, if I define some functions that do accomplish this

is_even <- function(z) z %% 2 == 0

set_if <- function(x, f, value) {
  x[f(x)] <- value
  x
}

then I can try this out on a vector

a <- 1:10
a
##  [1]  1  2  3  4  5  6  7  8  9 10
set_if(a, is_even, 0)
##  [1] 1 0 3 0 5 0 7 0 9 0
a # unchanged
##  [1]  1  2  3  4  5  6  7  8  9 10

It works, but I’m back to having to write a <- do_stuff(a) because a isn’t
actually modified by this function.

Ideally, my function would operate the same as this does

a <- 1:10
a[is_even(a)] <- 0
a
##  [1] 1 0 3 0 5 0 7 0 9 0

which does appear to modify a in-place; R is not entirely pure, and does
occasionally allow what looks like direct mutation, though under the hood it
may not be. When the vector has more than one reference, as it does here, a new
object is actually created

# not using a range e.g. 1:n because that's internally 
# a "compact" representation
a <- c(2, 3, 4)
.Internal(inspect(a))
## @63a4d9b05be8 14 REALSXP g0c3 [REF(2)] (len=3, tl=0) 2,3,4
a[2] <- 9
.Internal(inspect(a))
## @63a4d9b0fbf8 14 REALSXP g0c3 [REF(1)] (len=3, tl=0) 2,9,4

Note that the memory address has changed.

If I was working with a language which did support (enable?) modify-in-place
then that might look like

def is_even(x):
    return x % 2 == 0

def set_if(x, f, value):
    for i in range(len(x)):
        if f(x[i]):
            x[i] = value

a = list(range(10))
a
## [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
set_if(a, is_even, 0)
a
## [0, 1, 0, 3, 0, 5, 0, 7, 0, 9]

Now, that’s not always a great thing. In such a language with mutable structures
(e.g. lists) we can do maddening things like this

x = [3, 4, 5]
y = x
y is x
## True
y[1] = 9
x # still 'bound' to y
## [3, 9, 5]

Here, is means “are these two things identical in the sense of referring to
the same object in memory”, noting that in CPython small integer literals are
shared that way, but separately created tuples with equal contents need not be

abc = (11, 99)
xyz = (11, 99)
abc is xyz
## False
abc == xyz
## True
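The list aliasing above goes away if the second name is bound to an explicit copy rather than to the same object. A minimal sketch using list.copy(), which makes a shallow copy (nested lists inside it would still be shared):

```python
x = [3, 4, 5]
y = x.copy()  # a new list holding the same elements (a shallow copy)
same_object = y is x
y[1] = 9
print(same_object)  # False
print(x)  # [3, 4, 5]
print(y)  # [3, 9, 5]
```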

The big question is can I hack together some solution that does work in-place
in R? Yeah, with some ill-advised calls

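set_if <- function(x, f, value) {
  # can't use <<- because the value passed in as the x argument isn't 
  # necessarily named 'x' in the parent scope
  .x <- x
  .x[f(.x)] <- value
  # parent.frame() is the caller's environment, where the variable lives;
  # parent.env(environment()) would be where set_if itself was defined
  e <- parent.frame()
  assign(deparse(substitute(x)), .x, envir = e)
  invisible(.x)
}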


a <- 1:10
a
##  [1]  1  2  3  4  5  6  7  8  9 10
set_if(a, is_even, 0)
a
##  [1] 1 0 3 0 5 0 7 0 9 0

As I note in the comment there, I can’t use the super-assignment arrow <<-
inside this function because I don’t know the name of the variable I’m updating;
it needs to be deparsed from the incoming argument.

This means that it works regardless of the name of the variable being modified

b <- 10:20
b
##  [1] 10 11 12 13 14 15 16 17 18 19 20
set_if(b, is_even, 0)
b
##  [1]  0 11  0 13  0 15  0 17  0 19  0

I tried to think of some other languages which might support this sort of in-place
set_if(x, f, value) modification and (Dyalog) APL was worth a thought.

    ⍝ create a vector from 1 to 10
    x←⍳10
    x
1 2 3 4 5 6 7 8 9 10

    ⍝ the function {0=2|⍵} calculates a boolean vector with 
    ⍝ 1 where the value is even
    {0=2|⍵} x
0 1 0 1 0 1 0 1 0 1

    ⍝ the `@` operator takes a value (or function) on the left and 
    ⍝ a function (or boolean values) on the right and applies it to the 
    ⍝ other argument on the right
    0@{0=2|⍵} x 
1 0 3 0 5 0 7 0 9 0

    ⍝ alternatively a point-free function defined as the negation (`~`) of a 
    ⍝ binding (`∘`) of the value 2 to modulo (`|`); the negation is needed
    ⍝ otherwise this returns the result of the modulo, not where it is 0
    0@(~2∘|)⍳10
1 0 3 0 5 0 7 0 9 0

    ⍝ x is, however, unchanged as APL is typically immutable
    x
1 2 3 4 5 6 7 8 9 10

So @ on its own doesn’t modify x in place; the result would have to be assigned
back to x. It is nice, though, that 0@(~2∘|)x only refers to x once.

Julia makes a nice distinction between functions which mutate arguments and
those which don’t; (by convention) the former are named ending with an
exclamation mark, e.g.

vec = collect(1:5)
## 5-element Vector{Int64}:
##  1
##  2
##  3
##  4
##  5
# non-mutating
reverse(vec)
## 5-element Vector{Int64}:
##  5
##  4
##  3
##  2
##  1
vec
## 5-element Vector{Int64}:
##  1
##  2
##  3
##  4
##  5
# mutating
reverse!(vec)
## 5-element Vector{Int64}:
##  5
##  4
##  3
##  2
##  1
vec
## 5-element Vector{Int64}:
##  5
##  4
##  3
##  2
##  1

In Julia, the iseven() function is already built-in, but vectorisation is
explicit via a broadcast operator . and the setting of even values to 0
looks like

x = collect(1:10);
x[iseven.(x)] .= 0;
x
## 10-element Vector{Int64}:
##  1
##  0
##  3
##  0
##  5
##  0
##  7
##  0
##  9
##  0

which looks very much like the R version with some dots where scalar functions
are vectorised. If I don’t use the last . to perform vectorised assignment,
the error tells me that the failure involved the setindex! function which does
sound like what I want, but this doesn’t work

setindex!(x, 0, iseven.(x))

because it’s trying to assign the value 0 multiple times and I only provided one
of them. Instead,

x = collect(1:10);
setindex!(x, zeros(Int8, 5), iseven.(x));
x
## 10-element Vector{Int64}:
##  1
##  0
##  3
##  0
##  5
##  0
##  7
##  0
##  9
##  0

does work, but I had to manually count how many 0 entries this requires, so the
[ approach seems cleaner. Either way, I’ve had to explicitly calculate
iseven.(x) and pass that result somewhere.

Since Julia allows users to extend methods, I could do that modification myself!

import Base.setindex! 
  
function setindex!(A::Vector{Int64}, v::Int64, f::Function) 
  A[f.(A)] .= v
end
## setindex! (generic function with 240 methods)
x = collect(1:10);
setindex!(x, 0, iseven);
x
## 10-element Vector{Int64}:
##  1
##  0
##  3
##  0
##  5
##  0
##  7
##  0
##  9
##  0

which I could just as easily call set_if!

set_if! = setindex!;
x = collect(1:10);
set_if!(x, 0, iseven);
x
## 10-element Vector{Int64}:
##  1
##  0
##  3
##  0
##  5
##  0
##  7
##  0
##  9
##  0

Nice! I do wonder if I can “hack” (ahem, extend) Julia’s [ to get my prized
x[f] = 0 solution but I doubt it’s worth it when the above does the right
thing.
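For comparison, python does allow that spelling if I am willing to subclass list, because __setitem__ receives whatever is written between the brackets and can check whether it is callable. This is a sketch only; PredicateList is a hypothetical name, and the built-in list itself is left untouched.

```python
class PredicateList(list):
    """A list that also accepts a function as an index on assignment (hypothetical helper)."""

    def __setitem__(self, key, value):
        if callable(key):
            # Assign `value` to every position whose element satisfies `key`.
            for i, element in enumerate(self):
                if key(element):
                    super().__setitem__(i, value)
        else:
            # Fall back to ordinary integer and slice assignment.
            super().__setitem__(key, value)


def is_even(n):
    return n % 2 == 0


a = PredicateList(range(1, 11))
a[is_even] = 0
print(a)  # [1, 0, 3, 0, 5, 0, 7, 0, 9, 0]
```

The variable is named once and modified in place, at the cost of needing a special container type rather than a plain list.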

I don’t imagine I’ll package up my set_if() anywhere, and I should probably
even avoid using it myself, but it’s been an interesting journey thinking about
this stuff. Maybe there’s a better way to do it? Maybe there’s a language which
better supports something like that? If you know, or you have comments or
suggestions, you can find me on Mastodon or use the comment section below.

devtools::session_info()
## ─ Session info ───────────────────────────────────────────────────────────────
##  setting  value
##  version  R version 4.3.3 (2024-02-29)
##  os       Pop!_OS 22.04 LTS
##  system   x86_64, linux-gnu
##  ui       X11
##  language (EN)
##  collate  en_AU.UTF-8
##  ctype    en_AU.UTF-8
##  tz       Australia/Adelaide
##  date     2024-09-25
##  pandoc   3.2 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/x86_64/ (via rmarkdown)
## 
## ─ Packages ───────────────────────────────────────────────────────────────────
##  package     * version date (UTC) lib source
##  blogdown      1.19    2024-02-01 [1] CRAN (R 4.3.3)
##  bookdown      0.36    2023-10-16 [1] CRAN (R 4.3.2)
##  bslib         0.8.0   2024-07-29 [1] CRAN (R 4.3.3)
##  cachem        1.1.0   2024-05-16 [1] CRAN (R 4.3.3)
##  callr         3.7.3   2022-11-02 [3] CRAN (R 4.2.2)
##  cli           3.6.1   2023-03-23 [1] CRAN (R 4.3.3)
##  crayon        1.5.2   2022-09-29 [3] CRAN (R 4.2.1)
##  devtools      2.4.5   2022-10-11 [1] CRAN (R 4.3.2)
##  digest        0.6.37  2024-08-19 [1] CRAN (R 4.3.3)
##  ellipsis      0.3.2   2021-04-29 [3] CRAN (R 4.1.1)
##  evaluate      0.24.0  2024-06-10 [1] CRAN (R 4.3.3)
##  fastmap       1.2.0   2024-05-15 [1] CRAN (R 4.3.3)
##  fs            1.6.4   2024-04-25 [1] CRAN (R 4.3.3)
##  glue          1.7.0   2024-01-09 [1] CRAN (R 4.3.3)
##  htmltools     0.5.8.1 2024-04-04 [1] CRAN (R 4.3.3)
##  htmlwidgets   1.6.2   2023-03-17 [1] CRAN (R 4.3.2)
##  httpuv        1.6.12  2023-10-23 [1] CRAN (R 4.3.2)
##  icecream      0.2.1   2023-09-27 [1] CRAN (R 4.3.2)
##  jquerylib     0.1.4   2021-04-26 [1] CRAN (R 4.3.3)
##  jsonlite      1.8.8   2023-12-04 [1] CRAN (R 4.3.3)
##  JuliaCall     0.17.5  2022-09-08 [1] CRAN (R 4.3.3)
##  knitr         1.48    2024-07-07 [1] CRAN (R 4.3.3)
##  later         1.3.1   2023-05-02 [1] CRAN (R 4.3.2)
##  lattice       0.22-5  2023-10-24 [4] CRAN (R 4.3.1)
##  lifecycle     1.0.4   2023-11-07 [1] CRAN (R 4.3.3)
##  magrittr    * 2.0.3   2022-03-30 [1] CRAN (R 4.3.3)
##  Matrix        1.6-5   2024-01-11 [4] CRAN (R 4.3.3)
##  memoise       2.0.1   2021-11-26 [1] CRAN (R 4.3.3)
##  mime          0.12    2021-09-28 [1] CRAN (R 4.3.3)
##  miniUI        0.1.1.1 2018-05-18 [1] CRAN (R 4.3.2)
##  pkgbuild      1.4.2   2023-06-26 [1] CRAN (R 4.3.2)
##  pkgload       1.3.3   2023-09-22 [1] CRAN (R 4.3.2)
##  png           0.1-8   2022-11-29 [1] CRAN (R 4.3.2)
##  prettyunits   1.2.0   2023-09-24 [3] CRAN (R 4.3.1)
##  processx      3.8.3   2023-12-10 [3] CRAN (R 4.3.2)
##  profvis       0.3.8   2023-05-02 [1] CRAN (R 4.3.2)
##  promises      1.2.1   2023-08-10 [1] CRAN (R 4.3.2)
##  ps            1.7.6   2024-01-18 [3] CRAN (R 4.3.2)
##  purrr         1.0.2   2023-08-10 [3] CRAN (R 4.3.1)
##  R6            2.5.1   2021-08-19 [1] CRAN (R 4.3.3)
##  Rcpp          1.0.11  2023-07-06 [1] CRAN (R 4.3.2)
##  remotes       2.4.2.1 2023-07-18 [1] CRAN (R 4.3.2)
##  reticulate    1.34.0  2023-10-12 [1] CRAN (R 4.3.2)
##  rlang         1.1.4   2024-06-04 [1] CRAN (R 4.3.3)
##  rmarkdown     2.28    2024-08-17 [1] CRAN (R 4.3.3)
##  rstudioapi    0.15.0  2023-07-07 [3] CRAN (R 4.3.1)
##  sass          0.4.9   2024-03-15 [1] CRAN (R 4.3.3)
##  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.3.2)
##  shiny         1.7.5.1 2023-10-14 [1] CRAN (R 4.3.2)
##  stringi       1.8.4   2024-05-06 [1] CRAN (R 4.3.3)
##  stringr       1.5.1   2023-11-14 [1] CRAN (R 4.3.3)
##  urlchecker    1.0.1   2021-11-30 [1] CRAN (R 4.3.2)
##  usethis       3.0.0   2024-07-29 [1] CRAN (R 4.3.3)
##  vctrs         0.6.5   2023-12-01 [1] CRAN (R 4.3.3)
##  xfun          0.47    2024-08-17 [1] CRAN (R 4.3.3)
##  xtable        1.8-4   2019-04-21 [1] CRAN (R 4.3.2)
##  yaml          2.3.10  2024-07-26 [1] CRAN (R 4.3.3)
## 
##  [1] /home/jono/R/x86_64-pc-linux-gnu-library/4.3
##  [2] /usr/local/lib/R/site-library
##  [3] /usr/lib/R/site-library
##  [4] /usr/lib/R/library
## 
## ─ Python configuration ───────────────────────────────────────────────────────
##  python:         /home/jono/.virtualenvs/r-reticulate/bin/python
##  libpython:      /usr/lib/python3.10/config-3.10-x86_64-linux-gnu/libpython3.10.so
##  pythonhome:     /home/jono/.virtualenvs/r-reticulate:/home/jono/.virtualenvs/r-reticulate
##  version:        3.10.12 (main, Jul 29 2024, 16:56:48) [GCC 11.4.0]
##  numpy:           [NOT FOUND]
##  
##  NOTE: Python version was forced by VIRTUAL_ENV
## 
## ──────────────────────────────────────────────────────────────────────────────