BYO-Closures For Performance

By: Jacob Quinn

Some may be familiar with the idea of closures, which, in short, are local functions that capture state from their enclosing context. Julia supports closures, and they often just look like anonymous functions; the docs on anonymous functions note that “Functions in Julia are first-class objects”, which means you can think of them as being defined in the language itself. Indeed, we could take the example of the exponent method for IEEEFloat arguments in Base, which is defined (slightly abbreviated) as:

function exponent(x::T) where T<:IEEEFloat
    xs = reinterpret(Unsigned, x) & ~sign_mask(T)
    k = Int(xs >> significand_bits(T))
    if k == 0 # x is subnormal
        m = leading_zeros(xs) - exponent_bits(T)
        k = 1 - m
    end
    return k - exponent_bias(T)
end

And think of this method definition being “lowered” to:

struct exponentFunction <: Function
end

function (f::exponentFunction)(x::T) where T<:IEEEFloat
    xs = reinterpret(Unsigned, x) & ~sign_mask(T)
    k = Int(xs >> significand_bits(T))
    if k == 0 # x is subnormal
        m = leading_zeros(xs) - exponent_bits(T)
        k = 1 - m
    end
    return k - exponent_bias(T)
end

const exponent = exponentFunction()

So here we’re defining a struct exponentFunction, which is a subtype of Function, the abstract type that all function types inherit from (you can check this yourself by querying supertype(typeof(Base.exponent))). Then we’re defining a method with some unusual syntax, function (f::exponentFunction)(x::T), which makes instances of exponentFunction callable, as in exponentFunction()(3.14). Finally, we declare our const exponent to just be an instance of exponentFunction; a callable object like this is often known as a “functor”. (Functors are covered in more detail in the Julia manual.)
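To make the correspondence concrete, here is a minimal sketch that puts a hand-written functor next to the closure it mimics. The names Adder and make_adder are made up for illustration and are not anything from Base.

```julia
# Hand-written functor: a struct whose instances are callable.
struct Adder <: Function
    n::Int
end
(a::Adder)(x) = x + a.n

# Ordinary closure capturing `n` from the enclosing function.
make_adder(n) = x -> x + n

add2_functor = Adder(2)
add2_closure = make_adder(2)

println(add2_functor(3))   # calls the method defined on Adder
println(add2_closure(3))   # calls the anonymous function
println(add2_closure.n)    # the captured variable is stored as a field of the closure object
```

Both objects carry their state (n) as a field and apply it when called; the closure is just the version where Julia generates the struct for you.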

Ok, so why start off a blog post by going over a bunch of material from the Julia manual? Well, in a recent refactoring, I ran into a fairly well-known performance issue with closures, for which a few different solutions have been suggested, but none quite fit my use case. Now, I have to admit to not fully understanding the fundamental language issue causing the performance hit here; what I do understand from refactoring and using my own code is that when you keep using a variable after it has been captured as closure state by a closure definition or call, Julia ends up creating a Core.Box object to hold the variable’s value. This massively affects performance because the variable’s inferred type is essentially Any, so every use relies on runtime (dynamic) dispatch.
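A minimal sketch of the behavior described above, assuming nothing beyond Base. The function names are invented for this example, and the Ref-based variant is a commonly used workaround shown only for contrast, not something taken from this post.

```julia
# `total` is captured by the closure and reassigned inside it,
# so Julia stores it in a Core.Box and loses its concrete type.
function boxed_sum(xs)
    total = 0
    foreach(x -> (total += x), xs)
    return total
end

# Same computation, but the captured variable is never reassigned:
# a Ref is captured once and its contents are mutated in place.
function ref_sum(xs)
    total = Ref(0)
    foreach(x -> (total[] += x), xs)
    return total[]
end

println(boxed_sum([1, 2, 3]))
println(ref_sum([1, 2, 3]))
```

Both functions return the same value; the difference only shows up when you inspect them, for example with @code_warntype boxed_sum([1, 2, 3]) in the REPL, where total should appear as a Core.Box.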

Part of my aforementioned refactoring involved moving some common code into higher-order functions that, for example, apply a functor argument to each field of a struct, which meant my code was now relying on closures passed to those higher-order functions. Luckily, I tend to lean on my favorite JuliaLang feature, its code-inspection tools (see @code_typed, for example), to see what core functions are getting inferred and lowered to, and I noticed a bunch of red flags in the form of Core.Box for key variables. Digging a little further, it became clear that I was a victim of issue #15276 and would need to figure out a solution. One solution suggested in the issue thread is to use let blocks, declaring the closure-captured state variables to make it explicit which variables will be captured. In my case, however, I needed the variables to be updated within the closure and then needed those updated values afterwards (my parsing functions pass the current parsing state down into the closures and then need to pass it along to parse the next object).
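For reference, the let-block workaround looks roughly like the following sketch, which follows the pattern shown in the Julia manual’s performance tips. The function name abmult is illustrative only.

```julia
function abmult(r::Int)
    if r < 0
        r = -r          # r is reassigned before the closure is created
    end
    # Rebinding r in a let block gives the closure a fresh variable
    # that is never reassigned, so it does not need to be boxed.
    f = let r = r
        x -> x * r
    end
    return f
end

times3 = abmult(-3)
println(times3(4))
```

This works when the closure only reads the captured value. It does not help when the closure has to update the value and the caller needs the result afterwards, which is the situation described above.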

So my solution? Well, a distinctive feature of the Julia language is how much of the language is written in itself, and how many major constructs are true, first-class citizens of the language. So I decided to write my own closure object!

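mutable struct StructClosure{T, KW}
    buf::T    # input being parsed
    pos::Int  # current position; updated on every call (Int is assumed here)
    len::Int  # length of the input (Int is assumed here)
    kw::KW    # keyword arguments forwarded to readvalue
end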

@inline function (f::StructClosure)(i, nm, TT)
    pos_i, x_i = readvalue(f.buf, f.pos, f.len, TT; f.kw...)
    f.pos = pos_i
    return x_i
end

So, similar to our exponent example before, we define a StructClosure functor object, which this time has a few fields representing the closure-captured state variables that we need access to inside our actual function code. Also note that we made our functor a mutable struct because, in our function, we actually want to update our position variable after we’ve read a value (f.pos = pos_i).

We end up using our home-grown closure like:

@inline function read(::Struct, buf, pos, len, b, ::Type{T}; kw...) where {T}
    if b != UInt8('{')
        error = ExpectedOpeningObjectChar
        @goto invalid
    end
    pos += 1
    b = getbyte(buf, pos)
    if b == UInt8('}')
        pos += 1
        return pos, T()
    elseif b != UInt8('"')
        error = ExpectedOpeningQuoteChar
        @goto invalid
    end
    pos += 1
    c = StructClosure(buf, pos, len, kw)
    x = StructTypes.construct(c, T)
    return c.pos, x

@label invalid
    invalid(error, buf, pos, T)
end

So we first create an instance of our closure, c = StructClosure(buf, pos, len, kw), and then pass it to the higher-order function, as in x = StructTypes.construct(c, T). Finally, you’ll notice how we return the closure’s updated position at the end with return c.pos, x. How’s the performance? Back in line with our fully unrolled, pre-higher-order-function code. Ultimately, this felt like a pretty simple, even clever, way to clean up my code and use some common, well-tested higher-order functions to do some fancier code unrolling.
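The same pattern can be tried in isolation with a toy version. Everything in this sketch is hypothetical: TokenClosure, Point, and this construct function are stand-ins invented for the example, and they are not the real parser or the StructTypes API.

```julia
# Hypothetical stand-in for the parser state: a list of tokens and a position.
mutable struct TokenClosure
    tokens::Vector{String}
    pos::Int
end

# Calling the functor parses one value of type T and advances the stored position.
function (f::TokenClosure)(::Type{T}) where {T}
    val = parse(T, f.tokens[f.pos])
    f.pos += 1
    return val
end

struct Point
    x::Int
    y::Float64
end

# Hypothetical higher-order function: calls f once per field type, in order,
# and passes the results to the constructor of T.
construct(f, ::Type{T}) where {T} = T(map(f, fieldtypes(T))...)

c = TokenClosure(["3", "4.5", "rest"], 1)
p = construct(c, Point)
println(p)
println(c.pos)   # the updated position is available after the call
```

Because pos is a concretely typed field of a mutable struct rather than a captured local variable, there is nothing for the compiler to box, and the caller can still read the updated value once the higher-order function returns.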

Recent advancements in differential equation solver software

By: Christopher Rackauckas

This was a talk given at the Modelica Jubilee Symposium – Future Directions of System Modeling and Simulation.

Since the time when classic Fortran solvers like dop853 and DASSL were created, many advancements in numerical analysis, computational methods, and hardware have accelerated computing. However, many applications of differential equations still rely on the same older software, possibly to their own detriment. In this talk we will describe the recent advancements being made in differential equation solver software, focusing on the Julia-based DifferentialEquations.jl ecosystem. We will show how high-order Rosenbrock and IMEX methods have proven advantageous over traditional BDF implementations in certain problem domains, and the types of issues that give rise to the general performance differences between the methods. Extensions of these solver methods to adaptive high-order methods for stochastic differential-algebraic and delay differential-algebraic equations will be demonstrated, and the potential use cases of these new solvers will be discussed. Acceleration and generalization of adjoint sensitivity analysis through source-to-source reverse-mode automatic differentiation and GPU compatibility will be demonstrated on neural differential equations: differential equations that incorporate trainable latent neural networks into their derivative functions to automatically learn dynamics from data.

Everyone’s Favorite Blogpost: CSV Benchmarks

By: Jacob Quinn

We all know ’em, we all hate ’em, let’s get a good benchmarking blogpost up in here. Most people who groggily glance at their phone at 5:30 AM just roll over and go back to sleep. Some of us also open up our email really quick, just to see if there’s anything interesting. And for the very small minority out there, we find a “performance issue” opened on one of our precious darling open-source libraries and THAT’S IT! No more sleep until this internet stranger can be proven wrong! (For the record, the issue in question is here, and xiaodaigh isn’t an internet stranger, but a great buddy I got to meet at JuliaCon 2019 in Baltimore this year who is doing some really cool work on grouping performance in Julia; but you know, “internet stranger” is a lot funnier.)

Some of you may have heard or seen that multithreaded CSV parsing support recently landed in CSV.jl, my aforementioned precious darling. So naturally, it’s a good time to round up some benchmarks and show how competitive we are in the CSV parsing landscape. I apologize for the lack of fancy graphics and pretty charts, but I’m more interested in the numbers; pretty chart PRs welcome to CSV.jl!

Hey! Look at that! CSV.jl is basically on par with R’s fread!

Now, I’ll add my personal opinion on these kinds of benchmark comparisons: always take them with a grain of salt. Benchmarking tends to rely on contrived data that can sometimes be biased one way or another; results can be system dependent in a bunch of ways; and they are only accurate for a limited time while packages continue to develop or decay. Caveat, caveat, caveat. BUT, they also tend to be directionally accurate, and that’s what I’m most pleased with here, particularly with regard to multithreading support, fread’s one remaining advantage over CSV.jl. (For a full feature comparison between CSV.jl, fread, and pandas, see the CSV.jl 0.5 release announcement.)

The files and benchmark script can be found here; the single-typed files and mixed.csv are derived from this benchmark site. The benchmark numbers shown above were run on my system: a 2019 MacBook Pro with a 2.4 GHz Intel Core i9 and 32 GB of 2400 MHz DDR4 RAM. The benchmarks were run using the CSV.jl#master branch, while the last few things get ironed out before a release that will include multithreading support for Julia versions 1.3+.

