Author Archives: Julia – Alex Mellnik

Why I’m still a Nullable-luddite

By: Julia – Alex Mellnik

Re-posted from: http://alex.mellnik.net/why-im-still-a-nullable-luddite/

Much of my Julia work involves manipulating data in the form of DataFrames. DataFrames are pretty handy: it’s easy to get data into them from file formats like CSV and Feather, as well as all types of databases. I can split-apply-combine to my heart’s content with do blocks, and while only basic joins are possible using join, it’s easy to form more complex ones with just a few lines of code. It’s reasonably fast to work with them, but … it could be a whole lot faster. That’s because DataFrames, as we know them today, have columns of type DataArray, which helps deal with missing values. A DataArray of some type T has elements which are either of type T or, if the element is missing, of type NAType. This type instability means that the Julia JIT can’t easily determine the type of the elements, which is a big performance killer. [1]

This is all old news, and everyone agrees that we need to move toward a better solution: Nullables. Rather than encoding your possibly-missing values in a DataArray, you can make a normal Array with elements of type Nullable{T}. [2] This has the advantage that when you access elements of the Array you always get the same type, and the JIT is happy. [3]
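As a rough sketch of what that looks like on the 0.4 series (where Nullable lives in Base), here is a plain Array of Nullable{Int} and a loop over it. The helper name sum_present is made up for illustration and is not part of any package.

```julia
# Julia 0.4-era sketch; `sum_present` is a hypothetical helper name.

# A normal Array whose elements are all Nullable{Int}; the middle one is missing.
xs = Nullable{Int}[Nullable(1), Nullable{Int}(), Nullable(3)]

# Sum the values that are present, skipping the missing ones.
# Every element has the same concrete type, so the loop body is type-stable.
function sum_present(xs::Vector{Nullable{Int}})
    total = 0
    for x in xs
        if !isnull(x)
            total += get(x)
        end
    end
    return total
end

sum_present(xs)
```

The point is that eltype(xs) is the single concrete type Nullable{Int}, so the compiler never has to guess between T and NAType.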

Sounds grand, right? This is the reason that many packages that work with DataFrames have started moving from DataArrays to normal arrays of Nullables. Unfortunately, it turns out that Nullables are a pain to work with, and because of this I generally take the performance hit and convert everything back to DataArrays. How much of a pain are they to work with? Here’s a list of some fairly reasonable things that you can’t do with Nullables (on Julia 0.4.5):

using DataFrames, NullableArrays

# Needs to be get(Nullable(1)) + get(Nullable(2))
Nullable(1) + Nullable(2)
# You can't add arrays of Nullables to containers.
df = DataFrame()
df[:col] = NullableArray([1,2])
# This returns false, although it may be fixed soon.
Nullable(0, true) !== Nullable{Int}()

Most importantly, a Nullable{T} is a vastly different creature than a T. You can’t take it bowling and buy it a beer: any time you want to hand it off to pretty much any package, you need to convert it back to the original type to have it function as expected.
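To give a feel for the unwrapping involved, here is a minimal sketch for the 0.4 series using only Base functions (isnull and get). The names lifted_add and unwrap_all are hypothetical helpers written for this example, not functions from Base or NullableArrays.jl.

```julia
# Julia 0.4-era sketch; `lifted_add` and `unwrap_all` are hypothetical names.

# Add two Nullables by hand: unwrap both, add, and wrap the result again.
# If either input is missing, the result is missing too.
function lifted_add{T}(a::Nullable{T}, b::Nullable{T})
    if isnull(a) || isnull(b)
        return Nullable{T}()
    end
    return Nullable(get(a) + get(b))
end

# Convert a vector of Nullables back to plain values before handing it to
# a package that expects a Vector{T}; missing entries become `default`.
function unwrap_all{T}(xs::Vector{Nullable{T}}, default::T)
    return T[get(x, default) for x in xs]
end

total = lifted_add(Nullable(1), Nullable(2))
plain = unwrap_all(Nullable{Int}[Nullable(1), Nullable{Int}()], 0)
```

Every operation needs this kind of wrapper, and picking a default for the missing entries throws away exactly the information the Nullable was carrying.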

Because of these issues, I’m planning on sticking with DataArrays until the Nullable ecosystem is in a more mature state.  I have high hopes that this will coincide with the release of 0.5.0, but others are not as optimistic.  We’ll see!  Despite these growing pains, it’s a very exciting time to work in Julia.

 

 

  1. John Myles White has a nice post about this here.
  2. There’s also NullableArrays.jl which cleans things up a bit, but most of the issues I discuss below hold regardless.
  3. There are still some lingering performance issues related to the DataFrame container.

Using Julia on the web

By: Julia – Alex Mellnik

Re-posted from: http://alex.mellnik.net/using-julia-to-perform-calculations-on-a-web-site/

A fairly common problem is that you want to use Julia or another high-level language to solve a problem, but also provide that solution to the world through a web site. For Julia, you have a few different options:

  • Use Escher.jl.  This is a full-stack web server where pages are written in Julia.  It can do a lot of interesting things and is good for simple demos.  However, it is still in an early state, doesn’t play nicely with the existing web ecosystem, and has some performance issues.
  • Write the server-side code from scratch in Mux.jl.  This offers a lot more flexibility than Escher.jl, and could be used as a back end along with a static front end that uses your favorite tools.  However, it could lead to some extra overhead, especially if you already have server-side components.
  • Run an API from JuliaBox.  This is still somewhat experimental.
  • Use node-julia to call Julia directly from Node.js.  If you’re already using Node, this seems like the best option at the moment.

Node-julia can be tricky to install and its syntax can seem arcane, but once you’re familiar with passing data back and forth between JavaScript and Julia you can call your Julia code with minimal effort. This winter I threw together a simple example here; index.js might look a bit complicated, but most of the code deals with handling an uploaded CSV file.
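For comparison, the Mux.jl option from the list above can be fairly compact when the calculation is simple. This is only a minimal sketch, assuming Mux.jl is installed. The /square/:n endpoint and the square_handler name are made up for illustration, while the macro and middleware names follow the Mux.jl README.

```julia
using Mux

# Hypothetical handler: reads the :n route parameter and returns n^2 as text.
function square_handler(req)
    n = parse(Int, req[:params][:n])
    return string(n^2)
end

# Send /square/:n to the handler; anything else falls through to a 404.
@app calc = (
    Mux.defaults,
    page("/square/:n", square_handler),
    Mux.notfound())

# Start serving the app on Mux's default port.
serve(calc)
```

A static front end could then fetch /square/7 and display the response, keeping all of the Julia code on the server.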