
JSOC 2015 project – NullableArrays.jl

By: Julia Developers

Re-posted from: http://feedproxy.google.com/~r/JuliaLang/~3/qD26JOmniX0/nullablearrays

My project under the 2015 Julia Summer of Code program has been to develop the NullableArrays package, which provides the NullableArray data type and its respective interface. I first encountered Julia earlier this year as a suggestion for which language I ought to learn as a matriculating PhD student in statistics. This summer has been an incredible opportunity for me both to develop as a young programmer and to contribute to an open-source community as full of possibility as Julia’s. I’d be remiss not to thank Alan Edelman’s group at MIT, NumFocus, and the Gordon & Betty Moore Foundation for their financial support, John Myles White for his mentorship and guidance, and many others of the Julia community who have helped to contribute both to the package and to my edification as a programmer over the summer. Much of my work on this project was conducted at the Recurse Center, where I received the support of an amazing community of self-directed learners.

The NullableArray data structure

NullableArrays are array structures that efficiently represent missing values without incurring the performance difficulties that face DataArray objects, which have heretofore been used to store data that include missing values. The core issue responsible for DataArrays’ performance woes concerns the way in which the latter represent missing values, i.e. through a token NA object of type NAtype. In particular, indexing into, say, a DataArray{Int} can return an object either of type Int or of type NAtype. This design does not provide sufficient information to Julia’s type inference system at JIT-compilation time to support the sort of static analysis that Julia’s compiler can otherwise leverage to emit efficient machine code. We can illustrate as much through the following example, in which we calculate the sum of five million random Float64s stored in a DataArray:

julia> using DataArrays
# warnings suppressed…

julia> A = rand(5_000_000);

julia> D = DataArray(A);

julia> function f(D::AbstractArray)
           x = 0.0
           for i in eachindex(D)
               x += D[i]
           end
           x
       end
f (generic function with 1 method)

julia> f(D);

julia> @time f(D)
  0.163567 seconds (10.00 M allocations: 152.598 MB, 9.21% gc time)
2.500102419334644e6

Looping through and summing the elements of D is over twenty times slower and allocates far more memory than running the same loop over A:

julia> f(A);

julia> @time f(A)
  0.007465 seconds (5 allocations: 176 bytes)
2.500102419334644e6

This is because the code generated for f(D) must assume that getindex(D, i) for an arbitrary index i may return an object either of type Float64 or of type NAtype and hence must “box” every object returned from indexing into D. The performance penalty incurred by this requirement is reflected in the comparison above. (The interested reader can find more about these issues here.)
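
One way to observe this for yourself is to ask the compiler what it can infer about scalar indexing (output elided here; the point is that the inferred return type is not a single concrete type):

julia> Base.return_types(getindex, Tuple{typeof(D), Int});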

On the other hand, NullableArrays are designed to support the sort of static analysis used by Julia’s type inference system to generate efficient machine code. The crux of the strategy is to use a single type — Nullable{T} — to represent both missing and present values. Nullable{T} objects are specialized containers that hold precisely either one or zero values. A Nullable that wraps, say, 5 can be taken to represent a present value of 5, whereas an empty Nullable{Int} can represent a missing value that, if it had been present, would have been of type Int. Crucially, both such objects are of the same type, i.e. Nullable{Int}. Interested readers can hear a bit more about these design considerations in my JuliaCon 2015 lightning talk.
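
As a quick illustration of the container itself, here is a minimal sketch using the Nullable type from Base (outputs suppressed):

julia> x = Nullable(5);        # wraps a present value; x is of type Nullable{Int64}

julia> y = Nullable{Int}();    # an empty container representing a missing Int

julia> isnull(x), isnull(y);   # (false, true)

julia> get(x);                 # 5; calling get(y) would throw a NullException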

Here is the result of running the same loop over a comparable NullableArray:

julia> using NullableArrays

julia> X = NullableArray(A);

julia> function f(X::NullableArray)
           x = Nullable(0.0)
           for i in eachindex(X)
               x += X[i]
           end
           x
       end
f (generic function with 1 method)

julia> f(X);

julia> @time f(X)
  0.009812 seconds (5 allocations: 192 bytes)
Nullable(2.500102419334644e6)

As can be seen, naively looping over a NullableArray is on the same order of magnitude as naively looping over a regular Array in terms of both time elapsed and memory allocated. Below is a set of plots (drawn with Gadfly.jl) that visualize the results of running 20 benchmark samples of f over both NullableArray and DataArray arguments each consisting of 5,000,000 random Float64 values and containing either zero null entries or approximately half randomly chosen null entries.

Of course, it is possible to bring the performance of such a loop over a DataArray up to par with that of a loop over an Array. But such optimizations generally introduce additional complexity that oughtn’t to be required to achieve acceptable performance in such a simple task. Considerably more complex code can be required to achieve performance in more involved implementations, such as that of broadcast!. We intend for NullableArrays to perform well on involved tasks with missing data while requiring as little interaction with NullableArray internals as possible. This includes allowing users to leverage extant implementations without sacrificing performance. Consider for instance the results of relying on Base’s implementation of broadcast! for DataArray and NullableArray arguments (i.e., having omitted the respective src/broadcast.jl from each package’s source code). Below are plots that visualize the results of running 20 benchmark samples of broadcast!(dest, src1, src2), where dest and src2 are 5_000_000 x 2 Arrays, NullableArrays or DataArrays, and src1 is a 5_000_000 x 1 Array, NullableArray or DataArray. As above, the NullableArray and DataArray arguments are tested in cases with either zero or approximately half null entries:

We have designed the NullableArray type to feel as much like a regular Array as possible. However, that NullableArrays return Nullable objects is a significant departure from both Array and DataArray behavior. Arguably the most important issue is supporting user-defined functions that lack methods for Nullable arguments when they interact with Nullable and NullableArray objects. Throughout my project I have also worked to develop interfaces that make dealing with Nullable objects user-friendly and safe.

Given a method f defined on an argument signature of types (U1, U2, …, UN), we would like to provide an accessible, safe and performant way for a user to call f on an argument of signature (Nullable{U1}, Nullable{U2}, …, Nullable{UN}) without having to extend f herself. Doing so should return Nullable(f(get(u1), get(u2), …, get(uN))) if each argument is non-null, and should return an empty Nullable if any argument is null. Systematically extending an arbitrary method f over Nullable argument signatures is often referred to as “lifting” f over the Nullable arguments.
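
To make the idea concrete, here is a minimal single-argument sketch of lifting (a hypothetical helper written for illustration only; it is not part of the package’s API):

# Hypothetical helper: lift f over a single Nullable argument.
function lift{T}(f, x::Nullable{T})
    if isnull(x)
        # Stay missing. For simplicity we reuse T as the result's type parameter;
        # a real implementation would compute f's return type instead.
        return Nullable{T}()
    else
        return Nullable(f(get(x)))
    end
end

lift(abs, Nullable(-3))      # Nullable(3)
lift(abs, Nullable{Int}())   # an empty Nullable{Int}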

NullableArrays offers keyword arguments for certain methods, such as broadcast and map, that direct those methods to lift the passed function over their NullableArray arguments:

julia> X = NullableArray(collect(1:10), rand(Bool, 10))
10-element NullableArray{Int64,1}:
 #NULL
 #NULL
 #NULL
     4
     5
     6
     7
     8
 #NULL
    10

julia> f(x::Int) = 2x
f (generic function with 2 methods)

julia> map(f, X)
ERROR: MethodError: `f` has no method matching f(::Nullable{Int64})
Closest candidates are:
  f(::Any, ::Any)
 [inlined code] from /Users/David/.julia/v0.4/NullableArrays/src/map.jl:93
 in _F_ at /Users/David/.julia/v0.4/NullableArrays/src/map.jl:124
 in map at /Users/David/.julia/v0.4/NullableArrays/src/map.jl:172

julia> map(f, X; lift=true)
10-element NullableArray{Int64,1}:
 #NULL
 #NULL
 #NULL
     8
    10
    12
    14
    16
 #NULL
    20

I also plan to release a small package shortly that will offer a more flexible “lift” macro, which will be able to lift function calls over Nullable arguments within a variety of expression types.

We hope that the new NullableArrays package will help to support not only Julia’s statistical computing ecosystem as it moves forward but also any endeavor that requires an efficient, developed interface for handling arrays of Nullable objects. Please do try the package, submit feature requests, report bugs, and, if you’re interested, submit a PR or two. Happy coding!

Julia 0.4 Release Announcement

By: Julia Developers

Re-posted from: http://feedproxy.google.com/~r/JuliaLang/~3/tv3e4d-IFiE/julia-0.4-release

We are pleased to announce the release of Julia 0.4.0. This release contains
major language refinements and numerous standard library improvements.
A summary of changes is available in the
NEWS log
found in our main repository. We will be making regular 0.4.x bugfix releases from
the release-0.4 branch of the codebase, and we recommend the 0.4.x line for users
requiring a more stable Julia environment.


We welcome bug reports on our GitHub tracker, and general usage questions on the
users mailing list, StackOverflow, and several community forums.

Binaries are available from the
main download page, or visit JuliaBox
to try 0.4 from the comfort of your browser. Happy Coding!


Notable compiler and language news:


Upcoming work for 0.5

Nightly builds will use the versioning scheme 0.5.0-dev.

  • A major focus of 0.5 will be further (breaking) improvements to core array functionality, as detailed
    in this issue.
  • We plan to merge the threading branch,
    but the functionality will be considered experimental and only available as a
    compile-time flag for the near future.

Community News

The Julia ecosystem continues to grow, and there are now
over 700 registered packages! (highlights below)

The second JuliaCon was held in Cambridge (USA) in June, 2015.
Over 60 talks were recorded and
are available for viewing.

JuliaCon India will be held in Bangalore on 9 and 10 October.

JuliaBloggers is going strong! A notable recent feature is
the #MonthOfJulia series exploring the core language and a number of packages.

Topical Highlights

JuliaStats – statistical and machine learning community.
JuliaOpt – optimization community.
JuliaQuantum – Julia libraries for quantum-science and technology.
JuliaGPU – GPU libraries and tooling.
IJulia – notebook interface built on IPython.
Images – image processing and I/O library.
Gadfly – Grammar of Graphics-inspired statistical plotting.
Winston – 2D plotting.
JunoLab – LightTable-based interactive environment.

Auto Diff In Julia

By: Julia Developers

Re-posted from: http://feedproxy.google.com/~r/JuliaLang/~3/9uXzpsX1lxo/auto-diff-in-julia

This summer, I’ve had the good fortune to be able to participate in the first ever Julia Summer of Code (JSoC), generously sponsored by the Gordon and Betty Moore Foundation. My JSoC project was to explore the use of Julia for automatic differentiation (AD), a topic with a wide array of applications in the field of optimization.

Under the mentorship of Miles Lubin and Theodore Papamarkou, I completed a major overhaul of ForwardDiff.jl, a Julia package for calculating derivatives, gradients, Jacobians, Hessians, and higher-order derivatives of native Julia functions (or any callable Julia type, really).

By the end of this post, you’ll hopefully know a little bit about how ForwardDiff.jl works, why it’s useful, and why Julia is uniquely well-suited for AD compared to other languages.

Seeing ForwardDiff.jl In Action

Before we get down to the nitty-gritty details, it might be helpful to see a simple example that illustrates various methods from ForwardDiff.jl’s API.

The snippet below is a somewhat contrived example, but works well enough as an introduction to the package. First, we define a target function we’d like to differentiate, then use ForwardDiff.jl to calculate some derivatives of the function at a given input:

```julia
julia> using ForwardDiff

julia> f(x::Vector) = sum(sin, x) + prod(tan, x) * sum(sqrt, x);

julia> x = rand(5)
5-element Array{Float64,1}:
0.986403
0.140913
0.294963
0.837125
0.650451

julia> g = ForwardDiff.gradient(f); # g = ∇f

julia> g(x)
5-element Array{Float64,1}:
1.01358
2.50014
1.72574
1.10139
1.2445

julia> j = ForwardDiff.jacobian(g); # j = J(∇f)

julia> j(x)
5×5 Array{Float64,2}:
0.585111 3.48083 1.7706 0.994057 1.03257
3.48083 1.06079 5.79299 3.25245 3.37871
1.7706 5.79299 0.423981 1.65416 1.71818
0.994057 3.25245 1.65416 0.251396 0.964566
1.03257 3.37871 1.71818 0.964566 0.140689

julia> ForwardDiff.hessian(f, x) # H(f)(x) == J(∇f)(x), as expected
5×5 Array{Float64,2}:
0.585111 3.48083 1.7706 0.994057 1.03257
3.48083 1.06079 5.79299 3.25245 3.37871
1.7706 5.79299 0.423981 1.65416 1.71818
0.994057 3.25245 1.65416 0.251396 0.964566
1.03257 3.37871 1.71818 0.964566 0.140689
```

Tada! Okay, that’s not too exciting – I could’ve just done the same thing with Calculus.jl! What makes ForwardDiff.jl different? Read on to find out…

How ForwardDiff.jl Works

What is Automatic Differentiation?

In broad terms, automatic differentiation describes a class of algorithms for automatically taking exact derivatives of user-provided functions. In addition to producing more accurate results, AD methods are also often faster than other common differentiation methods (such as finite differencing).

There are two main flavors of AD: forward mode, and reverse mode. As you might’ve guessed, this post only discusses forward mode, which is the kind of AD implemented by ForwardDiff.jl.

ForwardDiffNumber Types

ForwardDiff.jl implements several new number types, all of which are subtypes of ForwardDiffNumber{N,T,C} <: Number.

Elementary numerical functions on these types are then overloaded to evaluate both the original function and its derivative(s), returning the results in the form of a new ForwardDiffNumber. Thus, we can pass these number types into a general function $f$ (which is assumed to be composed of the overloaded elementary functions), and the derivative information is naturally propagated at each step of the calculation by way of the chain rule.

This propagation occurs all the way through to the result of the function, which is itself a ForwardDiffNumber (or an Array{F<:ForwardDiffNumber}). This final result contains the value $f(x)$ and the derivative $f'(x)$, where $x$ was the original point of evaluation.

Simple Forward Mode AD in Julia: Dual Numbers

The easiest way to write actual Julia code demonstrating this technique is to implement a simple dual number type. Note that there is already a Julia package dedicated to such an implementation, but we’re going to roll our own here for pedagogical purposes.

Here’s how we’ll define our DualNumber type:

```julia
immutable DualNumber{T} <: Number
    value::T
    deriv::T
end

value(d::DualNumber) = d.value
deriv(d::DualNumber) = d.deriv
```

Next, we can start defining functions on DualNumber. Here are a few examples to give you a feel for the process:

```julia
function Base.sqrt(d::DualNumber)
    new_value = sqrt(value(d))
    new_deriv = 0.5 / new_value
    return DualNumber(new_value, new_deriv*deriv(d))
end

function Base.sin(d::DualNumber)
    new_value = sin(value(d))
    new_deriv = cos(value(d))
    return DualNumber(new_value, new_deriv*deriv(d))
end

function Base.(:+)(a::DualNumber, b::DualNumber)
    new_value = value(a) + value(b)
    new_deriv = deriv(a) + deriv(b)
    return DualNumber(new_value, new_deriv)
end

function Base.(:*)(a::DualNumber, b::DualNumber)
    val_a, val_b = value(a), value(b)
    new_value = val_a * val_b
    new_deriv = val_b * deriv(a) + val_a * deriv(b)
    return DualNumber(new_value, new_deriv)
end
```

We can now evaluate the derivative of any scalar function composed of the above elementary functions. To do so, we simply pass an instance of our DualNumber type into the function, and extract the derivative from the result. For example:

```julia
julia> f(x) = sqrt(sin(x * x)) + x
f (generic function with 1 method)

julia> f(1.0)
1.8414709848078965

julia> d = f(DualNumber(1.0, 1.0))
DualNumber{Float64}(1.8414709848078965,1.5403023058681398)

julia> deriv1 = deriv(d)
1.589002649374538

julia> using Calculus; deriv2 = Calculus.derivative(f, 1.0)
1.5890026493377403

julia> deriv1 - deriv2
3.679767601738604e-11
```

Notice that our dual number result comes close to the result obtained from Calculus.jl, but is actually slightly different. That slight difference is due to the approximation error inherent to the finite differencing method employed by Calculus.jl.
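
For intuition, here is roughly what a central finite difference computes (a sketch, not Calculus.jl’s exact implementation; the step size is an assumed but typical choice):

```julia
f(x) = sqrt(sin(x * x)) + x              # the same target function as above

h = cbrt(eps(Float64))                   # step size balancing truncation and rounding error
fd = (f(1.0 + h) - f(1.0 - h)) / (2h)    # approximates f'(1.0) with O(h^2) truncation error
```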

In reality, the number types that ForwardDiff.jl implements are quite a bit more complicated than DualNumber. ForwardDiffNumbers behave like ensembles of dual numbers and hyper-dual numbers (the higher-order analog of dual numbers), allowing for simultaneous calculation of multiple higher-order partial derivatives. For an in-depth examination of ForwardDiff.jl’s number type implementation, see this section of the developer documentation.
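
To give a rough flavor of that idea, here is a toy sketch (for illustration only; it is not ForwardDiff.jl’s actual implementation) of a number type that carries several partial derivatives alongside its value:

```julia
# Toy "ensemble" dual number: partials[i] holds the partial derivative of the
# computed value with respect to the i-th input variable.
immutable MultiDual{N,T} <: Number
    value::T
    partials::NTuple{N,T}
end

function Base.(:*)(a::MultiDual, b::MultiDual)
    # product rule, applied componentwise to the carried partials
    new_partials = map((da, db) -> da*b.value + a.value*db, a.partials, b.partials)
    return MultiDual(a.value*b.value, new_partials)
end

# Seed x and y with unit partials to get both components of ∇(x*y) in one pass:
x = MultiDual(3.0, (1.0, 0.0))
y = MultiDual(2.0, (0.0, 1.0))
z = x * y   # z.value == 6.0; z.partials == (2.0, 3.0), i.e. (∂(xy)/∂x, ∂(xy)/∂y)
```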

Performance Comparison: The Ackley Function

To illustrate the performance gains that can be achieved using ForwardDiff.jl, let’s do a (very) naive benchmark comparison between ForwardDiff.jl and other libraries. For this comparison, we’ll be comparing the time to calculate the gradient of a function using ForwardDiff.jl, Calculus.jl, and a Python-based AD tool, AlgoPy.

To put these benchmarks in context, here are my machine specs, along with the version info of the relevant libraries:

Machine Specs

  • Model: Late 2013 MacBook Pro
  • OS: OSX 10.9.5
  • Processor: 2.6 GHz Intel Core i5
  • Memory: 8 GB 1600 MHz DDR3

Version Info

  • Julia v0.4.1-pre+15
    • ForwardDiff.jl v0.1.2
    • Calculus.jl v0.1.13
  • Python v2.7.9
    • AlgoPy v0.5.1

Defining the Ackley Function

The function we’ll be using in our test is the Ackley function, which is mathematically defined as

$$
\mathrm{ackley}(x) = -a \exp\left(b \sqrt{\frac{1}{k} \sum_{i=1}^{k} x_i^2}\right) - \exp\left(\frac{1}{k} \sum_{i=1}^{k} \cos(c\, x_i)\right) + a + e,
$$

where $k$ is the length of $x$ and, as in the code below, $a = 20$, $b = -0.2$, and $c = 2\pi$.

Here’s the definition of the function in Julia:

```julia
function ackley(x)
    a, b, c = 20.0, -0.2, 2.0*π
    len_recip = inv(length(x))
    sum_sqrs = zero(eltype(x))
    sum_cos = sum_sqrs
    for i in x
        sum_cos += cos(c*i)
        sum_sqrs += i^2
    end
    return (-a * exp(b * sqrt(len_recip*sum_sqrs)) -
            exp(len_recip*sum_cos) + a + e)
end
```

…and here’s the corresponding Python definition:

```python
def ackley(x):
    a, b, c = 20.0, -0.2, 2.0*numpy.pi
    len_recip = 1.0/len(x)
    sum_sqrs, sum_cos = 0.0, 0.0
    for i in x:
        sum_cos += algopy.cos(c*i)
        sum_sqrs += i*i
    return (-a * algopy.exp(b*algopy.sqrt(len_recip*sum_sqrs)) -
            algopy.exp(len_recip*sum_cos) + a + numpy.e)
```

Benchmark Results

The benchmarks were performed with input vectors of length 16, 1600, and 16000, taking the best time out of 5 trials for each test.

Function evaluation time

The below table compares the evaluation times of ackley(x) in both Python and Julia:

| length(x) | Python time (s) | Julia time (s) | Speed-Up vs. Python |
|-----------|-----------------|----------------|---------------------|
| 16        | 0.00011         | 2.3e-6         | 47.83x              |
| 1600      | 0.00477         | 4.0269e-5      | 118.45x             |
| 16000     | 0.04747         | 0.00037        | 128.30x             |

Gradient evaluation time

The below table compares the evaluation times of ∇ackley(x) using various libraries (the chunk_size column denotes a configuration option passed to the ForwardDiff.gradient method; see the chunk-mode docs for details):

| length(x) | AlgoPy time (s) | Calculus.jl time (s) | ForwardDiff time (s) | chunk_size |
|-----------|-----------------|----------------------|----------------------|------------|
| 16        | 0.00212         | 2.2e-5               | 3.5891e-5            | 16         |
| 1600      | 0.53439         | 0.10259              | 0.01304              | 10         |
| 16000     | 101.55801       | 11.18762             | 1.35411              | 10         |

Here, we show the speed-up ratio of ForwardDiff.jl over the other libraries:

| length(x) | Speed-Up vs. AlgoPy | Speed-Up vs. Calculus.jl |
|-----------|---------------------|--------------------------|
| 16        | 59.07x              | 0.61x                    |
| 1600      | 40.98x              | 7.86x                    |
| 16000     | 74.99x              | 8.26x                    |

As you can see, Python + AlgoPy falls pretty short of the speeds achieved by Julia + ForwardDiff.jl, or even Julia + Calculus.jl. While Calculus.jl is actually about 1.6 times as fast as ForwardDiff.jl for the lowest input dimension, it is ~8 times slower than ForwardDiff.jl for the higher input dimensions.

Another metric that might be useful to look at is the “slowdown ratio” between the gradient evaluation time and the function evaluation time, defined as:

$$
\text{slowdown ratio} = \frac{\text{time to evaluate } \nabla f(x)}{\text{time to evaluate } f(x)}
$$

Here are the results (lower is better):

| length(x) | AlgoPy ratio | Calculus.jl ratio | ForwardDiff.jl ratio |
|-----------|--------------|-------------------|----------------------|
| 16        | 19.27        | 9.56              | 15.60                |
| 1600      | 112.03       | 2547.61           | 323.82               |
| 16000     | 2139.41      | 30236.81          | 3659.77              |

Both AlgoPy and ForwardDiff.jl beat out Calculus.jl for evaluation at higher input dimensions, which isn’t too surprising. AlgoPy beating ForwardDiff.jl, though, might catch you off guard; ForwardDiff.jl had the fastest absolute runtimes, after all! One explanation for this outcome is that AlgoPy falls back to vectorized NumPy methods when calculating the gradient, while the ackley function itself uses the usual, slow Python scalar arithmetic. Julia’s scalar arithmetic performance is much faster than Python’s, so ForwardDiff.jl doesn’t have as much “room for improvement” as AlgoPy does.

Julia’s AD Advantage

At the beginning of this post, I promised I would give the reader an answer to the question: “Why is Julia uniquely well-suited for AD compared to other languages?”

There are several good answers, but the chief reason for Julia’s superiority is its efficient implementation of multiple dispatch.

Unlike many other languages, Julia’s type-based operator overloading is fast and natural, as it’s one of the central design tenets of the language. Since Julia is JIT-compiled, the code generated for a Julia function can be specialized on the types with which the function is called, which allows the compiler to optimize every Julia method for the specific input types at runtime.

This ability is phenomenally useful for implementing forward mode AD, which relies almost entirely on operator overloading in order to work. In most other scientific computing languages, operator overloading is either very slow (e.g. MATLAB), fraught with weird edge cases (e.g. Python), arduous to implement generally (e.g. C++) or some combination of all three. In addition, very few languages allow operator overloading to naturally extend to native, black-box, user-written code. Julia’s multiple dispatch is the secret weapon leveraged by ForwardDiff.jl to overcome these hurdles.

Future Directions

The new version of ForwardDiff.jl has just been released, but development of the package is still ongoing! Here’s a list of things I’d like to see ForwardDiff.jl support in the future:

  • More elementary function definitions on ForwardDiffNumbers
  • More optimized versions of existing elementary function definitions on ForwardDiffNumbers
  • Methods for evaluating Jacobian-matrix products (highly useful in conjunction with reverse mode AD).
  • Parallel/shared-memory/distributed-memory versions of current API methods for handling problems with huge input/output dimensions
  • A more robust benchmarking suite for catching performance regressions

If you have any ideas on how to make ForwardDiff.jl more useful, feel free to open a pull request or issue in the package’s GitHub repository.