Mastering Efficient Array Operations with StaticArrays.jl in Julia

By: Steven Whitaker

Re-posted from:

The Julia programming language is known for being a high-level language that can still compete with C in terms of performance. As such, Julia already has performant data structures built-in, such as arrays. But what if arrays could be even faster? That’s where the StaticArrays.jl package comes in.

StaticArrays.jl provides drop-in replacements for Array, the standard Julia array type. These StaticArrays work just like Arrays, but they provide one additional piece of information in the type: the size of the array. Consequently, you can’t insert or remove elements of a StaticArray; they are statically sized arrays (hence the name). However, this restriction allows more information to be given to Julia’s compiler, which in turn results in more efficient machine code (for example, via loop unrolling and SIMD operations). The resulting speed-up can often be 10x or more!
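To make "the size is in the type" concrete, here is a small sketch (the variable names are only for illustration) that recovers the size from the type alone, without looking at any instance:

```julia
using StaticArrays

v = SA[1.0, 2.0, 3.0]   # a 3-element static vector
T = typeof(v)           # SVector{3, Float64}

# The length is a type parameter, so it can be queried on the type itself.
len_from_type = length(T)
size_from_type = size(T)

# A regular Vector carries no size information in its type.
U = typeof([1.0, 2.0, 3.0])   # Vector{Float64}
```

Because `T` already says "three elements", the compiler knows the loop bounds of any operation on `v` before the code runs.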

In this post, we will learn how to use StaticArrays.jl and compare the performance of StaticArrays to that of regular Arrays for several different operations.

Note that the code examples in this post assume StaticArrays.jl has been installed and loaded:

# Press ] to enter the package prompt.
pkg> add StaticArrays

# Press Backspace to return to the Julia prompt.
julia> using StaticArrays

(Check out our post on the Julia REPL for more details about the package prompt and navigating the REPL.)

How to Use StaticArrays.jl

When working with StaticArrays.jl, typically one will use the SVector type or the SMatrix type. (There is also the SArray type for N-dimensional arrays, but we will focus on 1D and 2D arrays in this post.) SVectors and SMatrixes have both static size and static data, meaning the data contained in such objects cannot be modified. For statically sized arrays whose contents can be modified, StaticArrays.jl provides MVector and MMatrix (and MArray). We will stick with SVectors and SMatrixes in this post unless we specifically need mutability.
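As a quick illustration of the difference between static data and static size only, here is a minimal sketch (the variable names are made up for this example):

```julia
using StaticArrays

s = SVector(1, 2, 3)   # static size and static data
m = MVector(1, 2, 3)   # static size, but the contents can change

m[1] = 10              # fine: MVector supports in-place assignment

# In-place assignment on an SVector throws, so we catch it and record the outcome.
mutated = try
    s[1] = 10
    true
catch
    false
end

# Converting between the two is just a constructor call.
s3 = SVector(m)
```

After running this, `m` holds 10, 2, 3 while `s` is unchanged.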


There are three ways to construct StaticArrays.

  1. Convenience constructor SA:

    julia> SA[1, 2, 3]
    3-element SVector{3, Int64} with indices SOneTo(3):
     1
     2
     3

    julia> SA[1 2; 3 4]
    2×2 SMatrix{2, 2, Int64, 4} with indices SOneTo(2)×SOneTo(2):
     1  2
     3  4

  2. Normal constructor functions:

    julia> SVector(1, 2)
    2-element SVector{2, Int64} with indices SOneTo(2):
     1
     2

    julia> SMatrix{2,3}(1, 2, 3, 4, 5, 6)
    2×3 SMatrix{2, 3, Int64, 6} with indices SOneTo(2)×SOneTo(3):
     1  3  5
     2  4  6

  3. Macros:

    julia> @SVector [1, 2, 3]
    3-element SVector{3, Int64} with indices SOneTo(3):
     1
     2
     3

    julia> @SMatrix [1 2; 3 4]
    2×2 SMatrix{2, 2, Int64, 4} with indices SOneTo(2)×SOneTo(2):
     1  2
     3  4

    Note that using macros also enables a convenient way to create StaticArrays from common array-creation functions (eliminating the need to create an Array first just to convert it immediately to a StaticArray):

    @SVector [10 * i for i = 1:10]
    @SVector zeros(5)
    @SVector rand(7)
    @SMatrix [(i, j) for i = 1:2, j = 1:3]
    @SMatrix zeros(2, 2)
    @SMatrix randn(6, 6)
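The three construction styles above are interchangeable. The following sketch (with throwaway variable names) checks that they yield the same type and value, and that the positional SMatrix constructor fills its entries column by column:

```julia
using StaticArrays

a = SA[1, 2, 3]
b = SVector(1, 2, 3)
c = @SVector [1, 2, 3]

# All three produce the same concrete type.
same_type = typeof(a) === typeof(b) && typeof(b) === typeof(c)

# Positional arguments to SMatrix are consumed in column-major order.
m_positional = SMatrix{2,3}(1, 2, 3, 4, 5, 6)
m_literal = SA[1 3 5; 2 4 6]

# SA_F64 is the Float64 flavor of the convenience constructor.
d = SA_F64[1, 2, 3]
```

Which style to use is mostly a matter of taste; the macros are handy when the array comes from a comprehension or a function like zeros.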

Conversion to/from Array

It may occasionally be necessary to convert to or from Arrays. To convert from an Array to a StaticArray, use the appropriate constructor function. However, because Arrays do not have size information in the type, we ourselves must provide the size to the constructor:

SVector{3}([1, 2, 3])
SMatrix{4,4}(zeros(4, 4))

To convert back to an Array, use the collect function:

julia> collect(SVector(1, 2))
2-element Vector{Int64}:
 1
 2
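Putting both directions together, a round trip might look like the following sketch (variable names are illustrative):

```julia
using StaticArrays

x = [1.0, 2.0, 3.0]      # ordinary Vector
sx = SVector{3}(x)       # the size 3 is supplied by hand
y = collect(sx)          # back to a Vector{Float64}

A = [1 2; 3 4]           # ordinary Matrix
sA = SMatrix{2,2}(A)
B = collect(sA)          # back to a Matrix{Int64}
```

The element type is inferred from the Array, so only the size needs to be spelled out.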

Comparing StaticArrays to Arrays

Once a StaticArray is created, it can be operated on in the same way as an Array. To illustrate, we will run a simple benchmark, both to compare the run-time speeds of the two types of arrays and to show that the same code can work with either type of array.


Here’s the benchmark code, inspired by StaticArrays.jl’s benchmark:

using BenchmarkTools, StaticArrays, LinearAlgebra, Printf

add!(C, A, B) = C .= A .+ B

function run_benchmarks(N)
    A = rand(N, N); A = A' * A
    B = rand(N, N)
    C = Matrix{eltype(A)}(undef, N, N)
    D = rand(N)
    SA = SMatrix{N,N}(A)
    SB = SMatrix{N,N}(B)
    MA = MMatrix{N,N}(A)
    MB = MMatrix{N,N}(B)
    MC = MMatrix{N,N}(C)
    SD = SVector{N}(D)

    speedup = [
        @belapsed($A + $B) / @belapsed($SA + $SB),
        @belapsed(add!($C, $A, $B)) / @belapsed(add!($MC, $MA, $MB)),
        @belapsed($A * $B) / @belapsed($SA * $SB),
        @belapsed(mul!($C, $A, $B)) / @belapsed(mul!($MC, $MA, $MB)),
        @belapsed(norm($D)) / @belapsed(norm($SD)),
        @belapsed(det($A)) / @belapsed(det($SA)),
        @belapsed(inv($A)) / @belapsed(inv($SA)),
        @belapsed($A \ $D) / @belapsed($SA \ $SD),
        @belapsed(eigen($A)) / @belapsed(eigen($SA)),
        @belapsed(map(abs, $A)) / @belapsed(map(abs, $SA)),
        @belapsed(sum($D)) / @belapsed(sum($SD)),
        @belapsed(sort($D)) / @belapsed(sort($SD)),
    ]

    return speedup
end

function main()
    benchmarks = [
        "Addition",
        "Addition (in-place)",
        "Multiplication",
        "Multiplication (in-place)",
        "L2 Norm",
        "Determinant",
        "Inverse",
        "Linear Solve (A \\ b)",
        "Symmetric Eigendecomposition",
        "`map`",
        "Sum of Elements",
        "Sorting",
    ]

    N = [3, 5, 10, 30]
    speedups = map(run_benchmarks, N)

    fmt_header = Printf.Format("%-$(maximum(length.(benchmarks)))s" * " | %7s"^length(N))
    header = Printf.format(fmt_header, "Benchmark", string.("N = ", N)...)
    println(header)
    println("="^length(header))

    fmt = Printf.Format("%-$(maximum(length.(benchmarks)))s" * " | %7.1f"^length(N))
    for i = 1:length(benchmarks)
        println(Printf.format(fmt, benchmarks[i], getindex.(speedups, i)...))
    end
end

main()

Notice that all the functions called when creating the array speedup in run_benchmarks are the same whether using Arrays or StaticArrays, illustrating that StaticArrays are drop-in replacements for standard Arrays.
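The same drop-in behavior holds for user-defined functions. In this small sketch, `unit` is a hypothetical helper (not part of any package) that never mentions an array type, yet works with both kinds of input:

```julia
using StaticArrays, LinearAlgebra

# Hypothetical helper: scale a vector to unit length.
unit(v) = v / norm(v)

u_array = unit([3.0, 4.0])       # input is a Vector, so is the output
u_static = unit(SA[3.0, 4.0])    # input is an SVector, so is the output
```

Julia compiles a separate specialization of `unit` for each argument type, so the SVector call gets the statically sized code path without any change to the function.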

Running the above code prints the following results on my laptop (the numbers indicate the speedup of StaticArrays over normal Arrays; e.g., a value of 17.7 means using StaticArrays was 17.7 times faster than using Arrays):

Benchmark                    |   N = 3 |   N = 5 |  N = 10 |  N = 30
====================================================================
Addition                     |    17.7 |    14.5 |     7.9 |     2.0
Addition (in-place)          |     1.6 |     1.3 |     1.4 |     0.7
Multiplication               |     8.2 |     7.0 |     4.2 |     2.6
Multiplication (in-place)    |     1.9 |     5.9 |     3.0 |     1.0
L2 Norm                      |     4.2 |     4.0 |     5.4 |     9.7
Determinant                  |    66.6 |     2.5 |     1.3 |     0.9
Inverse                      |    54.8 |     5.9 |     1.8 |     0.9
Linear Solve (A \ b)         |    65.5 |     3.7 |     1.8 |     0.9
Symmetric Eigendecomposition |     3.7 |     1.0 |     1.0 |     1.0
`map`                        |    10.6 |     8.2 |     4.9 |     2.1
Sum of Elements              |     1.5 |     1.1 |     1.7 |     2.1
Sorting                      |     7.1 |     2.9 |     1.5 |     1.1

There are two main conclusions from this table. First, using StaticArrays instead of Arrays can result in some nice speed-ups! Second, the gains from using StaticArrays tend to diminish as the sizes of the arrays increase. So, you can’t expect StaticArrays.jl to always magically make your code faster, but if your arrays are small enough (the recommendation being fewer than about 100 elements) then you can expect to see some good speed-ups.

Of course, the above code timed just individual operations; how much faster a particular application would be is a different matter.

For example, consider a physical simulation where many 3D vectors are manipulated over several time steps. Since 3D vectors are static in size (i.e., are 1D arrays with exactly three elements), such a situation is a prime example of where StaticArrays.jl is useful. To illustrate, here is an example (taken from the field of magnetic resonance imaging) of a physical simulation using Arrays vs using StaticArrays:

using BenchmarkTools, StaticArrays, LinearAlgebra

function sim_arrays(N)
    M = Matrix{Float64}(undef, 3, N)
    M[1,:] .= 0.0
    M[2,:] .= 0.0
    M[3,:] .= 1.0
    M2 = similar(M)

    (sin, cos) = sincosd(30)
    R = [1 0 0; 0 cos sin; 0 -sin cos]
    E1 = exp(-0.01)
    E2 = exp(-0.1)
    (sin, cos) = sincosd(1)
    F = [E2 * cos E2 * sin 0; -E2 * sin E2 * cos 0; 0 0 E1]
    FR = F * R
    C = [0, 0, 1 - E1]

    # Run for 100 time steps (each loop iteration does 2 time steps).
    for t = 1:50
        mul!(M2, FR, M)
        M2 .+= C
        mul!(M, FR, M2)
        M .+= C
    end

    total = sum(M; dims = 2)
    return complex(total[1], total[2])
end

function sim_staticarrays(N)
    M = fill(SVector(0.0, 0.0, 1.0), N)

    (sin, cos) = sincosd(30)
    R = @SMatrix [1 0 0; 0 cos sin; 0 -sin cos]
    E1 = exp(-0.01)
    E2 = exp(-0.1)
    (sin, cos) = sincosd(1)
    F = @SMatrix [E2 * cos E2 * sin 0; -E2 * sin E2 * cos 0; 0 0 E1]
    FR = F * R
    C = @SVector [0, 0, 1 - E1]

    # Run for 100 time steps (each loop iteration does 1 time step).
    for t = 1:100
        # Apply simulation dynamics to each 3D vector.
        for i = 1:length(M)
            M[i] = FR * M[i] + C
        end
    end

    total = sum(M)
    return complex(total[1], total[2])
end

function main(N)
    r1 = @btime sim_arrays($N)
    r2 = @btime sim_staticarrays($N)
    @assert r1 ≈ r2 # Make sure the results are the same.
end

The speed-ups on my laptop for different values of N were as follows:

  • N = 10: 14.6x faster
  • N = 100: 7.1x faster
  • N = 1000: 5.2x faster

(Here, N is the number of 3D vectors in the simulation, not the size of the StaticArrays.)

Note also that I wrote sim_arrays to be as performant as possible by doing in-place operations (like mul!), which has the unfortunate side effect of making the code a bit harder to read. Therefore, sim_staticarrays is both faster and easier to read!
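As a side note on readability, the inner loop over 3D vectors can also be written as a single broadcast. This is only a sketch with made-up values for FR and C (not the ones from the simulation above), and it was not part of the timings reported here:

```julia
using StaticArrays

# Made-up dynamics, for illustration only.
FR = @SMatrix [0.9 0.1 0.0; -0.1 0.9 0.0; 0.0 0.0 0.99]
C = @SVector [0.0, 0.0, 0.01]
M = fill(SVector(0.0, 0.0, 1.0), 4)

# Reference: one time step written as an explicit loop, as in sim_staticarrays.
M_loop = copy(M)
for i in eachindex(M_loop)
    M_loop[i] = FR * M_loop[i] + C
end

# The same time step as one broadcast over the outer Vector.
# Ref(...) makes FR and C act as scalars, so each element of M is a whole 3D vector.
M_bcast = Ref(FR) .* M .+ Ref(C)
```

Both forms apply `FR * m + C` to every 3D vector `m`; which one reads better is a matter of preference.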

As another example of how StaticArrays.jl can speed up a more involved application, see the DifferentialEquations.jl docs.


In this post, we discussed StaticArrays.jl. We saw that StaticArrays are drop-in replacements for regular Julia Arrays. We also saw that using StaticArrays can result in some nice speed-ups over using Arrays, at least when the sizes of the arrays are not too big.

Are array operations a bottleneck in your code? Try out StaticArrays.jl and then comment below how it helps!

Announcing JuliaSim Batteries: Advanced Electrochemical Lithium-Ion Battery Simulation

By: Jasmine Chokshi

Re-posted from:

JuliaHub is thrilled to announce a game-changing advancement in lithium-ion battery simulation: JuliaSim Batteries. This newly released tool extends state-of-the-art cell models with electrochemical, thermal, and degradation physics to the pack, enabling highly accurate simulations.

Transforming multiple columns in DataFrames.jl

By: Blog by Bogumił Kamiński

Re-posted from:


Today I want to comment on a recurring topic that DataFrames.jl users raise.
The question is how one should transform multiple columns of a data frame using
operation specification syntax.

The post was written under Julia 1.10.1 and DataFrames.jl 1.6.1.

What is operation specification syntax?

In DataFrames.jl the combine, select, and transform functions allow
users to pass requests for data transformation using operation
specification syntax. This syntax is feature-rich, and you can find its
description for example here. Today I want to focus on its principal concept.

In general, each request for an operation on data has an (E)xtract-(T)ransform-(L)oad form.
That means that we need to specify:

  • source columns to get data from (the extract part);
  • the operation to apply to these columns (the transform part);
  • the target columns where we want to store the result of the operation (the load part).

These three parts are syntactically expressed using the following form:

[source columns specification] => [transformation function] => [target columns specification]

Let me give an example. Assume you have the following data:

julia> using DataFrames

julia> df = DataFrame(reshape(1:15, 5, 3), :auto)
5×3 DataFrame
 Row │ x1     x2     x3
     │ Int64  Int64  Int64
   1 │     1      6     11
   2 │     2      7     12
   3 │     3      8     13
   4 │     4      9     14
   5 │     5     10     15

We want to compute the sum of column "x1" and store it in a column named "x1_sum".
Since the sum function performs the addition, the operation specification should be:

"x1" => sum => "x1_sum"

Let us check it with the combine function:

julia> combine(df, "x1" => sum => "x1_sum")
1×1 DataFrame
 Row │ x1_sum
     │ Int64
   1 │     15
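Since select and transform accept the same operation specification, it can help to see how the three functions differ on this one request. A minimal sketch (the variable names `spec`, `combined`, `selected`, and `transformed` are just for illustration):

```julia
using DataFrames

df = DataFrame(reshape(1:15, 5, 3), :auto)
spec = "x1" => sum => "x1_sum"

combined = combine(df, spec)       # one row holding just the aggregate
selected = select(df, spec)        # five rows: the scalar is repeated for each row
transformed = transform(df, spec)  # five rows: original columns plus x1_sum
```

The request is identical in all three calls; only the shape of the result changes.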

In this syntax it is important to note two things:

  • the "x1" column as a whole was passed to the sum function (as we want to compute its sum);
  • the "x1" column is a single positional argument passed to the sum function.

Two natural questions that arise are the following:

  • What if I do not want to perform an operation on a whole column, but on its elements (a.k.a. vectorization of operation)?
  • What if I want to pass multiple columns as a source for computations?

We will now investigate these two dimensions.

Vectorization of operations

Vectorization in DataFrames.jl is easy. Just wrap the function you use in the ByRow object. Here is an example:

julia> combine(df, "x1" => string => "x1_str")
1×1 DataFrame
 Row │ x1_str
     │ String
   1 │ [1, 2, 3, 4, 5]

julia> combine(df, "x1" => ByRow(string) => "x1_strs")
5×1 DataFrame
 Row │ x1_strs
     │ String
   1 │ 1
   2 │ 2
   3 │ 3
   4 │ 4
   5 │ 5

Note that "x1" => string => "x1_str" passed the whole "x1" column to the string function so we got a single "[1, 2, 3, 4, 5]"
string in the output.

In contrast, writing "x1" => ByRow(string) => "x1_strs" passed each element of the "x1" column to the string function individually,
so in the result we got a vector of five string representations of the numbers from the source.
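One way to build intuition for ByRow is to compare it with an explicitly broadcast anonymous function. The following sketch (variable names are illustrative) shows that the two give the same result, and that a ByRow object can also be called on a vector directly:

```julia
using DataFrames

df = DataFrame(reshape(1:15, 5, 3), :auto)

with_byrow = combine(df, "x1" => ByRow(string) => "x1_strs")
with_broadcast = combine(df, "x1" => (x -> string.(x)) => "x1_strs")

# A ByRow object is callable: it applies the wrapped function to each element.
direct = ByRow(string)([1, 2, 3])
```

Note the parentheses around the anonymous function; without them the `=>` pairs would be parsed as part of the function body.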

Passing multiple columns

Now let us have a look at passing multiple columns. There are two ways you can do it.

The first is when your function accepts multiple positional arguments. An example of such a function is string:

julia> string(df.x1, df.x2)
"[1, 2, 3, 4, 5][6, 7, 8, 9, 10]"

If we pass a collection of columns as a source in operation specification syntax we get this behavior:

julia> combine(df, ["x1", "x2"] => string => "x1_x2_str")
1×1 DataFrame
 Row │ x1_x2_str
     │ String
   1 │ [1, 2, 3, 4, 5][6, 7, 8, 9, 10]

Naturally, the above combines with vectorization. Therefore, since:

julia> string.(df.x1, df.x2)
5-element Vector{String}:
 "16"
 "27"
 "38"
 "49"
 "510"

we also have:

julia> combine(df, ["x1", "x2"] => ByRow(string) => "x1_x2_strs")
5×1 DataFrame
 Row │ x1_x2_strs
     │ String
   1 │ 16
   2 │ 27
   3 │ 38
   4 │ 49
   5 │ 510
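The same pattern works with any function of two positional arguments. As a small sketch, here is a row-wise addition of "x1" and "x2" (the target column name "x1_plus_x2" is made up for this example):

```julia
using DataFrames

df = DataFrame(reshape(1:15, 5, 3), :auto)

# For each row, the x1 and x2 values arrive as two separate positional arguments.
res = combine(df, ["x1", "x2"] => ByRow((a, b) -> a + b) => "x1_plus_x2")
```

The number of source columns must match the number of positional arguments the function accepts.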

However, there are cases when we have a function that expects multiple columns to be passed as a single positional argument.
This is handled in DataFrames.jl with the AsTable wrapper, which you can apply to the source columns.
If you use it then instead of getting multiple positional arguments the function will get a single positional argument
that will be a NamedTuple holding the source columns.

To convince ourselves that this is indeed what happens let us create a helper function:

julia> function helper(x)
           @show x
           return string(x.x1, x.x2)
       end
helper (generic function with 1 method)

This helper function first prints its only argument x, then assumes that x has x1 and x2 fields and applies the string function to them.
Let us first check it in practice:

julia> helper((x1=[1, 2, 3, 4, 5], x2=[6, 7, 8, 9, 10]))
x = (x1 = [1, 2, 3, 4, 5], x2 = [6, 7, 8, 9, 10])
"[1, 2, 3, 4, 5][6, 7, 8, 9, 10]"

Now let us use the helper function with combine:

julia> combine(df, AsTable(["x1", "x2"]) => helper => "x1_x2_str")
x = (x1 = [1, 2, 3, 4, 5], x2 = [6, 7, 8, 9, 10])
1×1 DataFrame
 Row │ x1_x2_str
     │ String
   1 │ [1, 2, 3, 4, 5][6, 7, 8, 9, 10]

Indeed, we see that helper got a named tuple holding two columns of the source data frame.

Again, this syntax plays well with ByRow:

julia> combine(df, AsTable(["x1", "x2"]) => ByRow(helper) => "x1_x2_strs")
x = (x1 = 1, x2 = 6)
x = (x1 = 2, x2 = 7)
x = (x1 = 3, x2 = 8)
x = (x1 = 4, x2 = 9)
x = (x1 = 5, x2 = 10)
5×1 DataFrame
 Row │ x1_x2_strs
     │ String
   1 │ 16
   2 │ 27
   3 │ 38
   4 │ 49
   5 │ 510

We see that this time helper got a separate named tuple for each row of the source data frame.
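A common practical use of this combination is a row-wise reduction over several columns. The following sketch (the target column names are made up) sums all columns of each row and, separately, multiplies two fields picked from the named tuple:

```julia
using DataFrames

df = DataFrame(reshape(1:15, 5, 3), :auto)

# Each row arrives as a NamedTuple (x1 = ..., x2 = ..., x3 = ...); sum adds up its values.
totals = combine(df, AsTable(:) => ByRow(sum) => "total")

# Fields of the named tuple can be accessed by name inside the function.
products = combine(df, AsTable(["x1", "x2"]) => ByRow(nt -> nt.x1 * nt.x2) => "x1_times_x2")
```

Here `AsTable(:)` selects all columns, so the function does not need to know how many columns there are.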


In summary, today we discussed two special operations in DataFrames.jl operation specification syntax:

  • ByRow, which vectorizes the function passed to it;
  • AsTable, which allows us to pass source columns as a single named tuple to the transformation function
    (instead of passing them as consecutive positional arguments, which is the default).

I hope these examples were useful in helping you understand the design of operation specification syntax.