By: Jonathan Carroll
Re-posted from: https://jcarroll.com.au/2023/08/06/argument-matching-across-languages/
With Functional Programming, we write functions which take arguments and do something with
or based on those arguments. You might not think there’s much to learn given that
tiny description of “an argument to a function”, but the syntax and mechanics of different
languages are actually widely variable and intricate.
Let’s say I have some function in R that takes three arguments, x, y, and z,
and just prints them out in a string in that order.
r_fun <- function(x, y, z) {
  sprintf("arguments are: %s, %s, %s", x, y, z)
}
Calling this function with good practices (specifying all the argument names in full)
would look like this
r_fun(x = "a", y = "b", z = "c")
## [1] "arguments are: a, b, c"
I said “in full” because by default, R will happily do partial matching, so long
as it can uniquely figure out which argument you mean
long_args <- function(alphabet = "a to z", altitude = 100) {
  print(sprintf("alphabet: %s", alphabet))
  print(sprintf("altitude: %d", altitude))
}
long_args(alphabet = "[A-Z]", altitude = 50)
## [1] "alphabet: [A-Z]"
## [1] "altitude: 50"
In this case, both arguments start with "al" so it’s ambiguous up to there
long_args(al = "letters")
## Error in long_args(al = "letters"): argument 1 matches multiple formal arguments
but we only need to specify enough letters to disambiguate
long_args(alpha = "LETTERS", alt = 200)
## [1] "alphabet: LETTERS"
## [1] "altitude: 200"
Relying on this behaviour is dangerous, and it’s recommended to turn on warnings
when this happens with
options(warnPartialMatchArgs = TRUE)
long_args(alpha = "LETTERS", alt = 200)
## Warning in long_args(alpha = "LETTERS", alt = 200): partial argument match of
## 'alpha' to 'alphabet'
## Warning in long_args(alpha = "LETTERS", alt = 200): partial argument match of
## 'alt' to 'altitude'
## [1] "alphabet: LETTERS"
## [1] "altitude: 200"
You don’t have to use argument names when calling the function, though – you can just rely on positional arguments
r_fun("a", "b", "c")
## [1] "arguments are: a, b, c"
and this is very commonly done, even though it is less clear what each of those values
refers to, and it risks breaking if the function changes its argument ordering in an
updated version. It works, though.
Extensive sidenote: square-bracket matrix subsetting officially uses the (poorly? traditionally?)
named arguments i and j as [i, j] but it actually entirely ignores them and uses
positional arguments. The documentation (?`[`) does warn about this
“Note that these operations do not match their index arguments in the standard way:
argument names are ignored and positional matching only is used. So m[j = 2, i = 1] is
equivalent to m[2, 1] and not to m[1, 2].”
but it would be very easy to get bitten by it if one tried to use the names directly
m <- matrix(1:9, 3, 3, byrow = TRUE)
m
##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    4    5    6
## [3,]    7    8    9
m[i = 1, j = 2]
## [1] 2
m[j = 2, i = 1]
## [1] 4
Thomas Lumley notes that
“it used to be that no primitive functions did argument matching by name. "/" and "-"
and switch() and some others still don’t. I’m not sure why "[" wasn’t changed in 2.11
when a bunch of primitives got normal argument matching.”
Worse still, perhaps: the seq() function creates a sequence of values. It has the
formal arguments with defaults from = 1 and to = 1 so you can calculate
seq(from = 2, to = 5)
## [1] 2 3 4 5
or you can leverage the default of from = 1
seq(to = 5)
## [1] 1 2 3 4 5
However, there are five “forms” in which
you can provide arguments to this function and they behave differently. If you only
specify the first argument unnamed, it treats this as to despite the first argument being from
seq(5)
## [1] 1 2 3 4 5
which is extra strange, because if you do specify to with its ostensibly default value 1,
the sequence is backwards
seq(5, to = 1)
## [1] 5 4 3 2 1
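One way to sidestep the single-argument surprise is to use the single-purpose helpers seq_len() and seq_along() from base R. The sketch below is only an illustration; the values of n and x are made up for the example.

```r
n <- 0

# seq() with one unnamed length-one value counts from 1 towards that value,
# so it counts *down* when the value is below 1
seq(n)
## [1] 1 0

# seq_len() only ever means "a sequence of this length"
seq_len(n)
## integer(0)

# seq_along() only ever means "one index per element"
x <- c("a", "b", "c")
seq_along(x)
## [1] 1 2 3
```

Because each helper accepts a single argument with a single meaning, there is no question of which formal argument an unnamed value is matched to.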
Back to our function – a feature that makes R really neat is that you can specify
the named arguments in any order
r_fun(z = "c", x = "a", y = "b")
## [1] "arguments are: a, b, c"
If you don’t specify them by name, R falls back to positions, so if you name just
one (e.g. z) and leave the rest unnamed, R will presume you want the others
in positional order
r_fun(z = "c", "a", "b")
## [1] "arguments are: a, b, c"
Where it gets really interesting is you can go back to named arguments further along
and again, R will figure out that you mean the remaining unnamed argument
r_fun(z = "c", "b", x = "a")
## [1] "arguments are: a, b, c"
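R resolves a call in three passes: exact name matches first, then partial name matches, and finally whatever is left is filled in by position. A small way to watch this happen is match.call(), which returns the call as R understood it. The function show_match() below is a hypothetical helper written only for this illustration.

```r
show_match <- function(x, y, z) {
  # match.call() returns the call with every argument expanded to its
  # full name, in the order of the function definition
  match.call()
}

show_match(z = "c", "b", x = "a")
## show_match(x = "a", y = "b", z = "c")
```

The lone unnamed value "b" ends up as y because x and z were already claimed by name, leaving y as the only open position.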
This mixing only works so neatly when the function doesn’t use the ellipsis ... which
captures “any other arguments” when calling the function, often to be passed
on to another function. If the function signature has ... then any arguments that
don’t match one of the named formal arguments, whether by name or by position, are
captured there. This example function just
combines any other arguments into a comma-separated string, if there
are any (tested with the under-documented ...length() which returns the number
of arguments captured via ...)
dot_f <- function(a = 1, b = 2, ...) {
  print(sprintf("named arguments: %s, %s", a, b))
  if (...length()) {
    print(sprintf("additional arguments: %s", toString(list(...))))
  }
}
You can call this with just the named arguments
dot_f(a = 3, b = 4)
## [1] "named arguments: 3, 4"
or you can add more arguments (no name required)
dot_f(a = 3, b = 4, 5)
## [1] "named arguments: 3, 4"
## [1] "additional arguments: 5"
As before, none of the names are really required, and we can add as
many as we want
dot_f(3, 4, 5, 6, 7)
## [1] "named arguments: 3, 4"
## [1] "additional arguments: 5, 6, 7"
We can name them if we want
dot_f(a = 3, b = 4, blah = 5)
## [1] "named arguments: 3, 4"
## [1] "additional arguments: 5"
but here be danger, because those names can be anything and aren’t matched
to the actual function, so this works (say, I misspelled an argument name a as A)
dot_f(A = 3, B = 4, 5)
## [1] "named arguments: 5, 2"
## [1] "additional arguments: 3, 4"
Notice that the additional arguments are the ones I named (not those in
the function definition); the 5 has been positionally matched to a; and b
has taken its default value of 2 because no other arguments were provided.
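If the extra arguments are only ever expected to be unnamed, one possible guard is to inspect the names of whatever landed in ... and fail loudly when any are present. This is a minimal sketch under that assumption; strict_f() is a hypothetical variant of the function above, not something from a package.

```r
strict_f <- function(a = 1, b = 2, ...) {
  extras <- list(...)
  # names() is NULL when nothing in ... was named, and "" for unnamed entries
  extra_names <- names(extras)
  named <- extra_names[nzchar(extra_names)]
  if (length(named) > 0) {
    stop("unexpected named arguments: ", toString(named))
  }
  sprintf("named arguments: %s, %s", a, b)
}

strict_f(3, 4, 5)
## [1] "named arguments: 3, 4"

# the misspelled names are now reported instead of silently swallowed
tryCatch(strict_f(A = 3, B = 4, 5), error = function(e) conditionMessage(e))
## [1] "unexpected named arguments: A, B"
```

The trade-off is that a function written this way can no longer forward named arguments to another function, so it only suits cases where ... is meant to collect plain values.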
We can still mix up the ordering of positions, provided everything else matches up
dot_f(3, b = 4, 5)
## [1] "named arguments: 3, 4"
## [1] "additional arguments: 5"
dot_f(3, b = 4, 5, a = 2)
## [1] "named arguments: 2, 4"
## [1] "additional arguments: 3, 5"
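The position of ... in the signature matters for partial matching too: formal arguments that come after ... are only matched when their names are given in full. The sketch below uses a hypothetical function, after_dots(), to show an abbreviated name being captured by ... rather than matched.

```r
after_dots <- function(..., altitude = 100) {
  sprintf("altitude: %s, extra arguments: %d", altitude, ...length())
}

# "alt" is not expanded to "altitude" here; it is captured by ... instead
after_dots(alt = 200)
## [1] "altitude: 100, extra arguments: 1"

# the full name is matched as usual
after_dots(altitude = 200)
## [1] "altitude: 200, extra arguments: 0"
```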
The flexibility in all of this is what encouraged Joe Cheng to use R as an
interface to HTML in the form of shiny, which he calls
“a bizarrely good host language”, and he notes that other languages don’t let you do
this sort of mixing up of named and positional arguments.
Okay, that’s R – weird and fun, but a lot of flexibility.
I saw this post mentioned in the #rust hashtag on Mastodon and had a look; it surprised me
at first because I thought “what do you mean Rust doesn’t have named arguments?”…
I’ve become so used to the inline help from VSCode when I’m writing Rust that I
didn’t realise I wasn’t using named arguments.
Here’s a function I wrote for my toy rock-paper-scissors game in Rust
fn play(a: Throw, b: Throw) -> GameResult {
    let result = match a.cmp(&b) {
        Ordering::Equal => GameResult::Tie,
        Ordering::Greater => GameResult::YouWin,
        Ordering::Less => GameResult::YouLose,
    };
    println!("{} {}", "Result:".purple().bold(), result);
    result
}
It has arguments a and b because I did a terrible job naming them; I knew
exactly how I planned to use them, so bad luck to anyone else.
Calling that function further down in the code I have
let user = val.user();
let computer = Throw::computer();
play(user, computer);
BUT what I see in the editor has the argument names, unless I switch off
hints (which I have bound to holding Ctrl+Alt at the moment)
So, I can’t just rearrange arguments in Rust?
If I define a function with two arguments
>> fn two_args(a: f64, b: &str) -> String {
    let res = format!("all arguments: {a}, {b}");
    res
}
then I can call it
>> two_args(42.0, "forty-two")
"all arguments: 42, forty-two"
Just swapping the arguments obviously fails because 42.0 isn’t a &str and
"forty-two" isn’t an f64. But there isn’t a way to say “the value for that
argument is this”; I can’t use any of these
two_args(a = 42.0, b = "forty-two")
two_args(a: 42.0, b: "forty-two")
two_args(b = "forty-two", a = 42.0)
two_args(b: "forty-two", a: 42.0)
I suspect the fact that this was a surprise to me means I’m earlier in my Rust
learning than I had thought – I clearly haven’t built anything that has
functionality I didn’t directly need, because I haven’t had to worry about
calling functions in strange ways yet.
There is one loophole… time to break out another cool toy: {rextendr}
library(rextendr)
rust_function(
'fn two_args(a: f64, b: &str) -> String {
let res = format!("all arguments: {a}, {b}");
res
}'
)
This produces an R function that takes two arguments, a and b, which I can call
as if it were an R function
two_args(a = 42, b = "forty-two")
## [1] "all arguments: 42, forty-two"
I can call it without argument names
two_args(42, "forty-two")
## [1] "all arguments: 42, forty-two"
and I can swap them
two_args(b = "forty-two", a = 42)
## [1] "all arguments: 42, forty-two"
This is just because the argument matching happens before the values get
sent down to the Rust code – the function here is an R function that calls
other code internally
two_args
## function (a, b)
## .Call("wrap__two_args", a, b, PACKAGE = "librextendr1")
## <bytecode: 0x55d873cff7b8>
The idea for this blog post partly came about while I was learning some TypeScript and
came across this in https://github.com/gibbok/typescript-book#typescript-fundamental-comparison-rules
“Function parameters are compared by types, not by their names:”
type X = (a: number) => void;
type Y = (a: number) => void;
let x: X = (j: number) => undefined;
let y: Y = (k: number) => undefined;
y = x; // Valid
x = y; // Valid
which initially struck me as strange, and I needed to work through some examples in a live
setting. On reflection, I think I see that this is exactly what I would specify in
e.g. Haskell: “a function that takes a number”, not “a function with an argument named a
which is a number”
x :: Float -> ()
Because technically all functions in Haskell actually only take a single argument (the notation
Int -> Int -> Int reveals this fact nicely, but in practice the notation makes it feel like
multiple arguments can be used) there is no way to “pass arguments by name” but there is a
neat way to swap the order of arguments that a function expects to receive: flip
flip :: (a -> b -> c) -> b -> a -> c
>>> flip (++) "hello" "world"
"worldhello"
-- or
>>> "hello" ++ "world"
"helloworld"
Those of you familiar with R’s S3 dispatch functionality will perhaps note that
the ‘first’ argument has a special role; it controls exactly which method will
be called. If we had some function which was flexible in the sense that it could
take several different ‘classes’ and do something different with them, we would
write that as
flexi <- function(a, b) {
  UseMethod("flexi")
}
flexi.matrix <- function(a, b) {
  paste0("a is a matrix, b = ", b)
}
flexi.data.frame <- function(a, b) {
  paste0("a is a data.frame, b = ", b)
}
flexi.default <- function(a, b) {
  paste0("a is something else, b = ", b)
}
Now, depending on whether a is a matrix, a data.frame, or something else, one
of the ‘methods’ will be called
flexi(a = matrix(), b = 7)
## [1] "a is a matrix, b = 7"
flexi(a = data.frame(), b = 8)
## [1] "a is a data.frame, b = 8"
flexi(a = 1, b = 9)
## [1] "a is something else, b = 9"
even if we swap the order of the arguments in the call
flexi(b = 3, a = matrix())
## [1] "a is a matrix, b = 3"
S4 dispatch goes even further and dispatches based on more than just the class of
the first argument. Stuart Lee has a great guide on S4. The point is, you can do something
different depending on what you pass to multiple arguments
s4flexi(matrix(), data.frame(), 7)
s4flexi(matrix(), data.frame(), list())
s4flexi(matrix(), data.frame(), NULL)
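The definition of s4flexi isn’t shown above, so the following is only a sketch of how such a generic could be declared, assuming three arguments named a, b, and c and one method per class of the third argument. The method bodies and return strings are invented for illustration.

```r
library(methods)

# the generic declares which arguments take part in dispatch
setGeneric("s4flexi", function(a, b, c) standardGeneric("s4flexi"))

# one method per combination of classes; only the class of c differs here
setMethod("s4flexi", signature("matrix", "data.frame", "numeric"),
  function(a, b, c) paste0("c is a number: ", c))
setMethod("s4flexi", signature("matrix", "data.frame", "list"),
  function(a, b, c) "c is a list")
setMethod("s4flexi", signature("matrix", "data.frame", "NULL"),
  function(a, b, c) "c is NULL")

s4flexi(matrix(), data.frame(), 7)
## [1] "c is a number: 7"
s4flexi(matrix(), data.frame(), list())
## [1] "c is a list"
s4flexi(matrix(), data.frame(), NULL)
## [1] "c is NULL"
```

Unlike the S3 version, the method is chosen from the classes of all three arguments together, so changing only the last one is enough to reach a different method.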
Julia has some of the most interesting argument parsing. I love the Haskell-like
function declarations, with so little boilerplate! We define some function f that
takes two arguments
f(a, b) = a + b
## f (generic function with 1 method)
f(4, 5)
## 9
Similar to the Rust situation, though – these aren’t named outside of the function body,
so we can’t refer to them either in that order or reversed
f(a = 4, b = 5)
MethodError: no method matching f(; a=4, b=5)
Closest candidates are:
f(!Matched::Any, !Matched::Any) at none:3 got unsupported keyword arguments "a", "b"
The reason is that Julia uses a Python-esque keyword argument syntax, where unnamed
(positional) arguments appear first, followed by any keyword arguments after a ;
so we can specify these correctly as
f(; a, b) = a + b
## f (generic function with 2 methods)
f(a = 4, b = 6)
## 10
Julia is optionally typed, which means we can be flippant with the types here, or we
can be very specific: we can specify that a should be an integer and b should be
a string. Keyword arguments don’t take part in dispatch, though, so this replaces the
keyword method we already defined rather than adding a new one (the method count stays
at 2). In this case, I want to return a string with the two values
f(; a::Int, b::String) = "$a; $b"
## f (generic function with 2 methods)
f(a = 42, b = "life, universe, everything")
## "42; life, universe, everything"
Since these are now named, we can swap them
f(b = "L, U, E", a = 42)
## "42; L, U, E"
but what’s even more powerful is we can define a general method, and add type-specific methods
for whatever combination of argument types we want; the first of these returns an integer,
while the other two return strings
g(a, b) = a + b
## g (generic function with 1 method)
g(a::Int, b::String) = "unnamed int, string: $a; $b"
## g (generic function with 2 methods)
g(a::String, b::Int) = "unnamed string, int: $a; $b"
## g (generic function with 3 methods)
Then, depending on what types we provide in each argument, a different method is called
g(3, 2)
## 5
g("abc", 123)
## "unnamed string, int: abc; 123"
g(123, "abc")
## "unnamed int, string: 123; abc"
Similar to S4, but so easy to declare and use! Of course, this doesn’t work if we want these
to be named since that would be ambiguous.
As I’m slowly learning APL, I’ve found it interesting that there’s a well-known approach of
writing “point-free” (“tacit”) functions which don’t specify arguments at all.
Last of all, I’ve had the pleasure of dealing with C this week including passing a pointer
to some object into a function, in which case the value outside of the function is updated.
That’s a whole other post I’m working on.
How does your favourite language use arguments? Let me know! I can be found on Mastodon or use the comments below.
devtools::session_info()
## ─ Session info ───────────────────────────────────────────────────────────────
## setting value
## version R version 4.1.2 (2021-11-01)
## os Pop!_OS 22.04 LTS
## system x86_64, linux-gnu
## ui X11
## language (EN)
## collate en_AU.UTF-8
## ctype en_AU.UTF-8
## tz Australia/Adelaide
## date 2023-08-06
## pandoc 3.1.1 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
##
## ─ Packages ───────────────────────────────────────────────────────────────────
## package * version date (UTC) lib source
## assertthat 0.2.1 2019-03-21 [3] CRAN (R 4.0.1)
## blogdown 1.17 2023-05-16 [1] CRAN (R 4.1.2)
## bookdown 0.29 2022-09-12 [1] CRAN (R 4.1.2)
## brio 1.1.3 2021-11-30 [1] CRAN (R 4.1.2)
## bslib 0.4.1 2022-11-02 [3] CRAN (R 4.2.2)
## cachem 1.0.6 2021-08-19 [3] CRAN (R 4.2.0)
## callr 3.7.3 2022-11-02 [3] CRAN (R 4.2.2)
## cli 3.4.1 2022-09-23 [3] CRAN (R 4.2.1)
## crayon 1.5.2 2022-09-29 [3] CRAN (R 4.2.1)
## DBI 1.1.3 2022-06-18 [3] CRAN (R 4.2.1)
## devtools 2.4.5 2022-10-11 [1] CRAN (R 4.1.2)
## digest 0.6.30 2022-10-18 [3] CRAN (R 4.2.1)
## dplyr 1.0.10 2022-09-01 [3] CRAN (R 4.2.1)
## ellipsis 0.3.2 2021-04-29 [3] CRAN (R 4.1.1)
## evaluate 0.18 2022-11-07 [3] CRAN (R 4.2.2)
## fansi 1.0.3 2022-03-24 [3] CRAN (R 4.2.0)
## fastmap 1.1.0 2021-01-25 [3] CRAN (R 4.2.0)
## fs 1.5.2 2021-12-08 [3] CRAN (R 4.1.2)
## generics 0.1.3 2022-07-05 [3] CRAN (R 4.2.1)
## glue 1.6.2 2022-02-24 [3] CRAN (R 4.2.0)
## htmltools 0.5.3 2022-07-18 [3] CRAN (R 4.2.1)
## htmlwidgets 1.5.4 2021-09-08 [1] CRAN (R 4.1.2)
## httpuv 1.6.6 2022-09-08 [1] CRAN (R 4.1.2)
## jquerylib 0.1.4 2021-04-26 [3] CRAN (R 4.1.2)
## jsonlite 1.8.3 2022-10-21 [3] CRAN (R 4.2.1)
## JuliaCall 0.17.5 2022-09-08 [1] CRAN (R 4.1.2)
## knitr 1.40 2022-08-24 [3] CRAN (R 4.2.1)
## later 1.3.0 2021-08-18 [1] CRAN (R 4.1.2)
## lifecycle 1.0.3 2022-10-07 [3] CRAN (R 4.2.1)
## magrittr 2.0.3 2022-03-30 [3] CRAN (R 4.2.0)
## memoise 2.0.1 2021-11-26 [3] CRAN (R 4.2.0)
## mime 0.12 2021-09-28 [3] CRAN (R 4.2.0)
## miniUI 0.1.1.1 2018-05-18 [1] CRAN (R 4.1.2)
## pillar 1.8.1 2022-08-19 [3] CRAN (R 4.2.1)
## pkgbuild 1.4.0 2022-11-27 [1] CRAN (R 4.1.2)
## pkgconfig 2.0.3 2019-09-22 [3] CRAN (R 4.0.1)
## pkgload 1.3.0 2022-06-27 [1] CRAN (R 4.1.2)
## prettyunits 1.1.1 2020-01-24 [3] CRAN (R 4.0.1)
## processx 3.8.0 2022-10-26 [3] CRAN (R 4.2.1)
## profvis 0.3.7 2020-11-02 [1] CRAN (R 4.1.2)
## promises 1.2.0.1 2021-02-11 [1] CRAN (R 4.1.2)
## ps 1.7.2 2022-10-26 [3] CRAN (R 4.2.2)
## purrr 1.0.1 2023-01-10 [1] CRAN (R 4.1.2)
## R6 2.5.1 2021-08-19 [3] CRAN (R 4.2.0)
## Rcpp 1.0.9 2022-07-08 [1] CRAN (R 4.1.2)
## remotes 2.4.2 2021-11-30 [1] CRAN (R 4.1.2)
## rextendr * 0.3.0 2023-05-30 [1] CRAN (R 4.1.2)
## rlang 1.0.6 2022-09-24 [1] CRAN (R 4.1.2)
## rmarkdown 2.18 2022-11-09 [3] CRAN (R 4.2.2)
## rprojroot 2.0.3 2022-04-02 [1] CRAN (R 4.1.2)
## rstudioapi 0.14 2022-08-22 [3] CRAN (R 4.2.1)
## sass 0.4.2 2022-07-16 [3] CRAN (R 4.2.1)
## sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.1.2)
## shiny 1.7.2 2022-07-19 [1] CRAN (R 4.1.2)
## stringi 1.7.8 2022-07-11 [3] CRAN (R 4.2.1)
## stringr 1.5.0 2022-12-02 [1] CRAN (R 4.1.2)
## tibble 3.1.8 2022-07-22 [3] CRAN (R 4.2.2)
## tidyselect 1.2.0 2022-10-10 [3] CRAN (R 4.2.1)
## urlchecker 1.0.1 2021-11-30 [1] CRAN (R 4.1.2)
## usethis 2.1.6 2022-05-25 [1] CRAN (R 4.1.2)
## utf8 1.2.2 2021-07-24 [3] CRAN (R 4.2.0)
## vctrs 0.5.2 2023-01-23 [1] CRAN (R 4.1.2)
## withr 2.5.0 2022-03-03 [3] CRAN (R 4.2.0)
## xfun 0.34 2022-10-18 [3] CRAN (R 4.2.1)
## xtable 1.8-4 2019-04-21 [1] CRAN (R 4.1.2)
## yaml 2.3.6 2022-10-18 [3] CRAN (R 4.2.1)
##
## [1] /home/jono/R/x86_64-pc-linux-gnu-library/4.1
## [2] /usr/local/lib/R/site-library
## [3] /usr/lib/R/site-library
## [4] /usr/lib/R/library
##
## ──────────────────────────────────────────────────────────────────────────────