Frames White 2024-12-08 04:55:10

By: Frames White

Re-posted from: https://www.oxinabox.net/2024/12/08/2024-12-08-bsky_api.html

I have been on bluesky since it was invite only.
Recently a lot of the JuliaLang community showed up there.
Which is interesting times for me.
Since it results in me having a very split following in demographics.
I wanted to analyse that, and I thought I would take the opertunity to write a bit about how to access the bsky API using Julia.

Continue reading

Querying the Bluesky API

By: Frames White

Re-posted from: https://www.oxinabox.net/2024/12/08/bsky_api.html

Recently a lot of the JuliaLang community showed up on BlueSky.
Which is interesting times for me, who has been on there for quite a while.
Since it results in me having a very split following in demographics.
I wanted to analyse that, and I thought I would take the opertunity to write a bit about how to access the bsky API using Julia.

Continue reading

These Languages are Accumulating

By: Jonathan Carroll

Re-posted from: https://jcarroll.com.au/2024/11/28/these-languages-are-accumulating/

I keep saying that the more programming languages you know, the more you will
understand all the others you know – I’m now at the point where I want to solve
every problem I see in a handful of different languages. They all offer
different functionality, and some are certainly more suited to particular
problems than others, but there’s a world of difference between two characters
and importing from two libraries.

A newsletter I follow (and can’t find online copies of) that demonstrates neat
things in python (gotta learn it, despite not loving it) recently covered
accumulate, showing that sum and accumulate were sort of related

>>> my_list = [42, 73, 0, 16, 10]
>>> sum(my_list)
141
>>> from itertools import accumulate
>>> list(accumulate(my_list))
[42, 115, 115, 131, 141]

sum adds up all the elements of the list, while accumulate does the same but
keeps each successive partial sum.

It rounds out the demo with an alternative function being used in accumulate

>>> from itertools import accumulate
>>> from operator import mul  # mul(2, 3) == 6
>>> initial_investment = 1000
>>> rates = [1.01, 1.01, 1.02, 1.025, 1.035, 1.035, 1.06]
>>> list(
...     accumulate(rates, mul, initial=initial_investment)
... )
[1000, 1010.0, 1020.1, 1040.502, 1066.515, 1103.843, 1142.477, 1211.026]

Now, firstly… from operator import mul??? It looks like there’s no way to
pass * as an argument to a function. I could define a function that performs
the same on known arguments, e.g. lambda x, y: x * y

>>> list(accumulate(rates, lambda x, y: x*y, initial=initial_investment))
[1000, 1010.0, 1020.1, 1040.502, 1066.5145499999999, 1103.8425592499998, 1142.4770488237498, 1211.0256717531747]

but… ew.

It’s possible that there’s a different way to approach this. A list
comprehension comes to mind, e.g. something like

>>> [sum(my_list[0:i]) for i in range(1, len(my_list)+1)]
[42, 115, 115, 131, 141]

but that requires performing a sum for each sub-interval, so performance would
not scale well (admittedly, that was not a consideration here at all). I also
don’t believe there’s a built-in prod so one must import math in order to do
similar

>>> import math
>>> x = [initial_investment] + rates
>>> [math.prod(x[0:i]) for i in range(1, len(x)+1)]
[1000, 1010.0, 1020.1, 1040.502, 1066.5145499999999, 1103.8425592499998, 1142.4770488237498, 1211.0256717531747]

In R that could use the built-in cumprod for the cumulative product

initial_investment <- 1000
rates = c(1.01, 1.01, 1.02, 1.025, 1.035, 1.035, 1.06)

cumprod(c(initial_investment, rates))
## [1] 1000.000 1010.000 1020.100 1040.502 1066.515 1103.843 1142.477 1211.026

but that has the ‘multiply’ operation hardcoded. cumsum uses + as the
function… hmmm. Maybe R doesn’t have a generalised accumulate?

I’ve been playing around with Haskell lately, so recursive functions to the
rescue! One feature of recursive functions in R that I really like is Recall
which calls the function in which it is defined with a new set of arguments –
perfect for recursion!

accumulate_recall <- function(x, f, i=x[1]) {
  if (!length(x)) return(NULL)
  c(i, Recall(tail(x, -1), f, f(i, x[2])))
}

It’s also robust against renaming the function; the body doesn’t actually call
accumulate_recall by name at all.

This might be inefficient, though – it’s not uncommon to blow out the stack, so
a new Tailcall function (which doesn’t have the same elegance of being robust
against renaming) helps with flagging this as something that can be optimised

accumulate <- function(x, f, i=x[1]) {
  if (!length(x)) return(NULL)
  c(i, Tailcall(accumulate, tail(x, -1), f, f(i, x[2])))
}

With this, I can emulate the cumsum() and cumprod() functions

cumprod(1:6)
## [1]   1   2   6  24 120 720
accumulate(1:6, `*`)
## [1]   1   2   6  24 120 720
cumsum(2:6)
## [1]  2  5  9 14 20
accumulate(2:6, `+`)
## [1]  2  5  9 14 20

unless I try to calculate something too big…

cumprod(5:15)
##  [1]           5          30         210        1680       15120      151200
##  [7]     1663200    19958400   259459200  3632428800 54486432000
accumulate(5:15, `*`)
## Warning in f(i, x[2]): NAs produced by integer overflow
##  [1]         5        30       210      1680     15120    151200   1663200
##  [8]  19958400 259459200        NA        NA

It appears that the built-in functions convert to numeric. That’s easily fixed
on input

accumulate(as.numeric(5:15), `*`)
##  [1]           5          30         210        1680       15120      151200
##  [7]     1663200    19958400   259459200  3632428800 54486432000

In any case, there’s a generalised accumulate that takes the bare functions as
arguments.

But it can be so much cleaner than this!

In APL you won’t find any function named “sum” because it is just a reduction
(Reduce in R) with the function +

      sum←+/
      
      sum ⍳6 ⍝ sum the values 1:6
21

      sum 1↓⍳6 ⍝ sum the values 2:6
20

which in R is

sum(1:6)
## [1] 21
sum(2:6)
## [1] 20

Why would you write sum if you can just use +/? It’s fewer
characters to write out the implementation than the name!

For accumulate the terminology in APL is scan which uses a very similar
glyph because the operation itself is very similar; a reduce (/) is just the
last value of a scan (\) which keeps the progressive values. In both cases,
the operator (either slash) takes a binary function as the left argument and
produces a modified function – in these examples, effectively sum and prod
which is then applied to values on the right. The scan version does the same

      +\⍳6
1 3 6 10 15 21

      ×\⍳6
1 2 6 24 120 720
accumulate(1:6, `+`)
## [1]  1  3  6 10 15 21
accumulate(1:6, `*`)
## [1]   1   2   6  24 120 720

As for the rates example above, we concatenate the initial value with catenate
(,) just like the R example, but otherwise this works fine

      rates ← 1.01 1.01 1.02 1.025 1.035 1.035 1.06
      inv ← 1000
      
      ×/inv, rates
1211.025672

      ×\inv, rates
1000 1010 1020.1 1040.502 1066.51455 1103.842559 1142.477049 1211.025672

So all of that recursive R code made to generalise the cumulative application of
a function provided as an argument is boiled down to just the single glyph \.
Outstanding!

What’s more, there are lots of binary functions one would use this with, all
of which have spelled-out names in other languages

      +/ ⍝ sum (add)
      ×/ ⍝ prod (multiply)
      ∧/ ⍝ all (and)
      ∨/ ⍝ any (or)
      ⌈/ ⍝ maximum (max)
      ⌊/ ⍝ minimum (min)

In summary, it seems that looking across these languages, the available options
range from a single glyph for scan along with the bare binary operator, e.g.
×/; a cumprod() function which isn’t well-generalised but works out of the
box; and then there’s whatever mess this is (once you’ve installed these)

>>> from itertools import accumulate
>>> from operator import mul
>>> list(accumulate(rates, mul, initial=initial_investment))

Where did we go so wrong?

For what it’s worth, Julia has a reduce and an accumulate that behave very
nicely; generalised for the binary function as an argument

julia> reduce(+, 1:6)
21

julia> reduce(*, 1:6)
720

julia> accumulate(+, 1:6)
6-element Vector{Int64}:
  1
  3
  6
 10
 15
 21

julia> accumulate(*, 1:6)
6-element Vector{Int64}:
   1
   2
   6
  24
 120
 720

This is extremely close to the APL approach, but with longer worded names for
the reduce and scan operators. It also defines the more convenient sum,
prod, cumsum, and cumprod; no shortage of ways to do this in Julia!

In Haskell, foldl and scanl are the (left-associative) version of reduce
and accumulate, and passing an infix as an argument necessitates wrapping it
in parentheses

ghci> foldl (+) 0 [1..6]
21

ghci> scanl (+) 0 [1..6]
[0,1,3,6,10,15,21]

ghci> foldl (*) 1 [1..6]
720

ghci> scanl (*) 1 [1..6]
[1,1,2,6,24,120,720]

This requires an explicit starting value, unless one uses the specialised
versions which use the first value as an initial value

ghci> foldl1 (+) [1..6]
21

ghci> scanl1 (+) [1..6]
[1,3,6,10,15,21]

ghci> foldl1 (*) [1..6]
720

ghci> scanl1 (*) [1..6]
[1,2,6,24,120,720]

I started this post hoping to demonstrate how nice the APL syntax was for this,
but the detour through generalising the R function was a lot of unexpected fun
as well.

Comments, improvements, or your own solutions are most welcome. I can be found
on Mastodon or use the comments below.

Addendums

It should probably be noted that R does have a function scan but it’s for
reading data into a vector – if you ever spot someone using it for that… run.
I have war stories about that function.

I’d love to hear how this is accomplished in some other languages, too – does it
have a built-in accumulate that takes a binary function?

devtools::session_info()
## ─ Session info ───────────────────────────────────────────────────────────────
##  setting  value
##  version  R version 4.4.1 (2024-06-14)
##  os       macOS Sonoma 14.6
##  system   aarch64, darwin20
##  ui       X11
##  language (EN)
##  collate  en_US.UTF-8
##  ctype    en_US.UTF-8
##  tz       Australia/Adelaide
##  date     2024-11-28
##  pandoc   3.5 @ /opt/homebrew/bin/ (via rmarkdown)
## 
## ─ Packages ───────────────────────────────────────────────────────────────────
##  package     * version    date (UTC) lib source
##  blogdown      1.19       2024-02-01 [1] CRAN (R 4.4.0)
##  bookdown      0.41       2024-10-16 [1] CRAN (R 4.4.1)
##  bslib         0.8.0      2024-07-29 [1] CRAN (R 4.4.0)
##  cachem        1.1.0      2024-05-16 [1] CRAN (R 4.4.0)
##  cli           3.6.3      2024-06-21 [1] CRAN (R 4.4.0)
##  devtools      2.4.5      2022-10-11 [1] CRAN (R 4.4.0)
##  digest        0.6.37     2024-08-19 [1] CRAN (R 4.4.1)
##  ellipsis      0.3.2      2021-04-29 [1] CRAN (R 4.4.0)
##  evaluate      1.0.1      2024-10-10 [1] CRAN (R 4.4.1)
##  fastmap       1.2.0      2024-05-15 [1] CRAN (R 4.4.0)
##  fs            1.6.5      2024-10-30 [1] CRAN (R 4.4.1)
##  glue          1.8.0      2024-09-30 [1] CRAN (R 4.4.1)
##  htmltools     0.5.8.1    2024-04-04 [1] CRAN (R 4.4.0)
##  htmlwidgets   1.6.4      2023-12-06 [1] CRAN (R 4.4.0)
##  httpuv        1.6.15     2024-03-26 [1] CRAN (R 4.4.0)
##  jquerylib     0.1.4      2021-04-26 [1] CRAN (R 4.4.0)
##  jsonlite      1.8.9      2024-09-20 [1] CRAN (R 4.4.1)
##  knitr         1.48       2024-07-07 [1] CRAN (R 4.4.0)
##  later         1.3.2      2023-12-06 [1] CRAN (R 4.4.0)
##  lifecycle     1.0.4      2023-11-07 [1] CRAN (R 4.4.0)
##  magrittr      2.0.3      2022-03-30 [1] CRAN (R 4.4.0)
##  memoise       2.0.1      2021-11-26 [1] CRAN (R 4.4.0)
##  mime          0.12       2021-09-28 [1] CRAN (R 4.4.0)
##  miniUI        0.1.1.1    2018-05-18 [1] CRAN (R 4.4.0)
##  pkgbuild      1.4.5      2024-10-28 [1] CRAN (R 4.4.1)
##  pkgload       1.4.0      2024-06-28 [1] CRAN (R 4.4.0)
##  profvis       0.4.0      2024-09-20 [1] CRAN (R 4.4.1)
##  promises      1.3.0      2024-04-05 [1] CRAN (R 4.4.0)
##  purrr         1.0.2      2023-08-10 [1] CRAN (R 4.4.0)
##  R6            2.5.1      2021-08-19 [1] CRAN (R 4.4.0)
##  Rcpp          1.0.13-1   2024-11-02 [1] CRAN (R 4.4.1)
##  remotes       2.5.0.9000 2024-11-03 [1] Github (r-lib/remotes@5b7eb08)
##  rlang         1.1.4      2024-06-04 [1] CRAN (R 4.4.0)
##  rmarkdown     2.28       2024-08-17 [1] CRAN (R 4.4.0)
##  rstudioapi    0.17.1     2024-10-22 [1] CRAN (R 4.4.1)
##  sass          0.4.9      2024-03-15 [1] CRAN (R 4.4.0)
##  sessioninfo   1.2.2      2021-12-06 [1] CRAN (R 4.4.0)
##  shiny         1.9.1      2024-08-01 [1] CRAN (R 4.4.0)
##  urlchecker    1.0.1      2021-11-30 [1] CRAN (R 4.4.0)
##  usethis       3.0.0      2024-07-29 [1] CRAN (R 4.4.0)
##  vctrs         0.6.5      2023-12-01 [1] CRAN (R 4.4.0)
##  xfun          0.49       2024-10-31 [1] CRAN (R 4.4.1)
##  xtable        1.8-4      2019-04-21 [1] CRAN (R 4.4.0)
##  yaml          2.3.10     2024-07-26 [1] CRAN (R 4.4.0)
## 
##  [1] /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/library
## 
## ──────────────────────────────────────────────────────────────────────────────