My Experience at JuliaCon

By: John Myles White

Re-posted from: http://www.johnmyleswhite.com/notebook/2014/06/30/my-experience-at-juliacon/

Introduction

I just got home from JuliaCon, the first conference dedicated entirely to Julia. It was a great pleasure to spend two full days listening to talks about a language that I started advocating for just a little more than two years ago.

What follows is a very brief review of the talks that excited me the most. It’s not in any way exhaustive: there were a bunch of other good talks that I saw as well as a few talks I missed so that I could visit the Data Science for Social Good fellows.

Optimization

The optimization community seems to be the academic field that’s been most ready to adopt Julia. Two talks about using Julia for optimization stood out: Iain Dunning and Joey Huchette’s talk about JuMP.jl, and Madeleine Udell’s talk about CVX.jl.

JuMP implements a DSL that allows users to describe an optimization problem in purely mathematical terms. This problem encoding can then be passed to one of many backend solvers to determine a solution. By abstracting across solvers, JuMP makes it easier for people like me to get access to well-established tools like GLPK.

CVX is quite similar to JuMP, but it implements a symbolic computation system that’s especially focused on allowing users to encode convex optimization problems. One of the things that’s most appealing about CVX is that it automatically confirms whether the problem you’re encoding is convex or not. Until I saw Madeleine’s talk, I hadn’t realized how much progress had been made on CVX.jl. Now that I’ve seen CVX.jl in action, I’m hoping to start using it for some of my work. I’ll probably also write a blog post about it in the future.

Statistics

I really enjoyed the statistics talks given by Doug Bates, Simon Byrne and Dan Wlasiuk. I was especially glad to hear Doug Bates remind the audience that, years ago, he’d attended a small meeting about R that was similar in size to this first iteration of JuliaCon. He noted that, over the intervening decades, the R community has grown from dozens of users to millions.

Language-Level Issues

Given that Julia is still something of a language nerd’s language, it’s no surprise that some of the best talks focused on language-level issues.

Arch Robison gave a really interesting talk about the tools used in Julia 0.3 to automatically vectorize code so that it can take advantage of SIMD instructions. For those coming from languages like R or Python, be aware that vectorization means almost the exact opposite thing to compiler writers that it means to high-level language users: it is the transformation of certain kinds of iterative code into the thread-free parallel instructions that modern CPUs provide for performing a single operation on multiple chunks of data simultaneously. I’ve come to love this kind of compiler design discussion, and in particular the invariants the compiler needs to prove before it can perform program transformations safely. For example, Arch noted that a compiler can safely reorder a sum of integers to use SIMD instructions, but cannot do the same for floating point numbers on its own, because floating point addition is not associative and reordering can change the result.
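The associativity point is easy to check at the REPL. The sketch below is written in current Julia syntax rather than 0.3, and `simd_sum` is a hypothetical name chosen for illustration. It shows that regrouping a floating point sum can change the answer while regrouping an integer sum cannot, and then uses the `@simd` annotation, which is the programmer's way of telling the compiler that reordering a particular loop is acceptable.

```julia
# Floating point addition is not associative, so regrouping can change the result.
a, b, c = 0.1, 0.2, 0.3
float_regroups = (a + b) + c == a + (b + c)   # false for IEEE 754 doubles

# Machine integer addition wraps on overflow but stays associative,
# so regrouping is always safe.
i, j, k = typemax(Int), 1, -1
int_regroups = (i + j) + k == i + (j + k)     # true

# `simd_sum` is a hypothetical helper. @simd grants the compiler permission
# to reorder the accumulation; @inbounds drops the bounds checks in the loop body.
function simd_sum(xs::Vector{Float64})
    total = 0.0
    @simd for idx in eachindex(xs)
        @inbounds total += xs[idx]
    end
    return total
end
```

Because `@simd` permits reordering, the result of `simd_sum` may differ in the last bits from a strictly left-to-right sum on inputs that are not exactly representable.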

After Arch spoke, Jeff Bezanson gave a nice description of the process by which Julia code is transformed from the raw text that users enter at the REPL into the final compiled form that gets executed by CPUs. For those interested in understanding how Julia works under the hood, this talk is likely to be the best place to start.

In addition, Leah Hanson and Keno Fischer both gave good talks about improved tools for debugging Julia code. Leah spoke about TypeCheck.jl, a system for automatically warning about potential code problems. Keno demoed a very rough draft of a Julia debugger built on top of LLDB. As a bonus, Keno also demoed a new C++ FFI for Julia that I’m really looking forward to. I’m hopeful that the new FFI will make it much easier to wrap C++ libraries for use from Julia.

Deploying Julia in Production

Both Avik Sengupta and Michael Bean described their experiences using Julia in production systems. Knowing that Julia was being used in production anywhere was inspiring.

Graphics and Audio

Daniel C. Jones and Spencer Russell both gave great talks about the developments taking place in graphics and audio support. Daniel C. Jones’s demo of a theremin built using Shashi Gowda’s React.jl and Spencer Russell’s AudioIO.jl was especially impressive.

Takeaways

The Julia community really is a community now. It was big enough to sell out a small conference and to field a large variety of discussion topics. I’m really excited to see how the next JuliaCon will turn out.

Revisiting emulated OOP behaviour and multiple dispatch in Julia

By: Terence Copestake

Re-posted from: https://thenewphalls.wordpress.com/2014/06/02/revisiting-emulated-oop-behaviour-and-multiple-dispatch-in-julia/

In an earlier post, I explored one approach to emulating bundling functionality with the data on which it operates, akin to object methods in OOP languages such as C# and PHP. A comment posted by Matthew Browne questioned whether this approach was compatible with Julia’s multiple dispatch.

This is something I thought about at the time of writing the original article, but I had assumed it wouldn’t be possible because of the way in which the anonymous functions are assigned to variables, i.e. assigning one definition would overwrite the previous one. However, Matthew’s question prompted me to reconsider, and after some brief experimentation and some small alterations, I found that there is indeed a way to maintain compatibility with multiple dispatch.

Below is an updated example type definition:

type MDTest
    method::Function

    function MDTest()
        this = new()

        function TestFunction(input::String)
            println(input)
        end

        function TestFunction(input::Int64)
            println(input * 10)
        end

        this.method = TestFunction

        return this
    end
end

The theory is basically the same, with the constructor assigning the methods to their respective fields within the type. The difference is in how the functions are defined and assigned.

On lines 7 and 11, methods are defined with different argument types. These methods could be defined outside of the type definition without error, but defining them within the constructor has the advantage of not polluting the global scope.

On line 15, the function is assigned to its field by name rather than as an anonymous function. Because a named function carries all of its methods with it, both methods can be called through the field.

With this, the example code below:

test = MDTest()

test.method("String")

test.method(5)

produces the output:

String
50

Another advantage to this approach is the absence of anonymous functions – which, according to benchmarks and GitHub issues, have significantly worse performance compared to named functions.
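For readers on a current Julia release, here is a minimal sketch of the same pattern in updated syntax. It assumes that `type` is now spelled `mutable struct` and that `AbstractString` and `Int` stand in for the `String` and `Int64` annotations above. The lowercase name `test_function` is a stylistic choice and not part of the original example.

```julia
mutable struct MDTest
    method::Function

    function MDTest()
        this = new()

        # Two methods of one local generic function, selected by argument type.
        test_function(input::AbstractString) = println(input)
        test_function(input::Int) = println(input * 10)

        # Assigning the function by name keeps both methods reachable.
        this.method = test_function

        return this
    end
end

test = MDTest()
test.method("String")   # dispatches to the AbstractString method
test.method(5)          # dispatches to the Int method
```

The local function still does not leak into the global scope, which is the property the constructor-based approach was chosen for.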

Julia variable gotchas

By: Terence Copestake

Re-posted from: http://thenewphalls.wordpress.com/2014/04/07/julia-variable-gotchas/

As is typical for many languages, assigning one variable to another in Julia does not create a copy of the variable data, but rather a reference to the existing data. However, I learned the hard way whilst working on the CGI module* that Julia does not currently support a copy-on-write mechanism for collections.

Take the example code below:

n = [ 1, 2, 3 ]

m = n

As expected, m becomes a reference to the collection referenced by n. In languages whose arrays are copied on write, such as PHP, one might expect a copy to be made of the data referenced by n if either n or m is modified, for example:

n = [ 1, 2, 3 ]

m = n

push!(n, 4)

# With copy-on-write, one would expect n = [ 1, 2, 3, 4 ] and m = [ 1, 2, 3 ]

This is not the case for Julia. When the array pointed to by n is modified, m maintains its reference to that same array, giving both a value of [ 1, 2, 3, 4 ].
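One way to confirm the aliasing is the `===` operator, which tests whether two names refer to the very same object. This is a small sketch in current Julia syntax. The variable names `same_object` and `m_length` are only there for illustration.

```julia
n = [1, 2, 3]
m = n                      # binds a second name to the same array; nothing is copied

push!(n, 4)                # mutate the array through the name n

same_object = m === n      # true: m and n are one and the same array
m_length = length(m)       # 4: the change is visible through m as well
```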

Problems in the wild

I encountered this quirk when working with binary data and UTF-8 strings.

n = Uint8[ 0x32, 0x33, 0x34, 0x61 ]

m = utf8(n)

empty!(n)

Having created a string using the utf8 function, I wanted to empty the original byte array to free those resources. After a few minutes of trying to figure out how a bounds error had crept into my app, I narrowed it down to this deletion of the byte array.

Digging deeper into the Julia source, the utf8 function is just an alias for a conversion function.

utf8(x) = convert(UTF8String, x)
...
convert(::Type{UTF8String}, a::Array{Uint8,1}) = is_valid_utf8(a) ? UTF8String(a) : ...

You can see here that passing an array of Uint8 bytes to utf8() creates an instance of UTF8String with the Uint8 array as its data. The type definition for UTF8String is:

immutable UTF8String <: String
    data::Array{Uint8,1}
end

As was covered above, the UTF8String’s data field will be only a reference to the collection passed to the utf8 function. If that collection is modified in any way at any point during the program’s runtime, so too will be the returned string.
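The same effect can be reproduced with any type that wraps an array. The sketch below uses current Julia syntax and a hypothetical `ByteText` type as a stand-in for the UTF8String definition quoted above. It is not Julia's actual string implementation. It compares wrapping the original array with wrapping a copy of it.

```julia
# Hypothetical wrapper type, standing in for a string type that stores its bytes.
struct ByteText
    data::Vector{UInt8}
end

bytes = UInt8[0x32, 0x33, 0x34, 0x61]

shared = ByteText(bytes)         # the field refers to the caller's array
owned  = ByteText(copy(bytes))   # the field refers to an independent copy

empty!(bytes)                    # the caller "frees" the original array

shared_length = length(shared.data)   # 0: the wrapper lost its contents too
owned_length  = length(owned.data)    # 4: the defensive copy is unaffected
```

Note that declaring the wrapper immutable does not help here: immutability prevents the field from being rebound, not the array it points to from being modified.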

In closing

It seems that the solution at this time is to explicitly use the copy or deepcopy functions, where copies of data are required by the program logic.
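Which of the two functions is needed depends on whether the collection contains other mutable collections. As a rough sketch in current Julia syntax, with illustrative variable names: `copy` duplicates only the outer container, while `deepcopy` also duplicates everything it contains.

```julia
nested = [[1, 2], [3, 4]]

shallow = copy(nested)       # new outer array, same inner arrays
deep    = deepcopy(nested)   # new outer array and new inner arrays

push!(nested, [5, 6])        # change the outer array: neither copy sees it
push!(nested[1], 99)         # change an inner array: the shallow copy sees it

shallow_outer = length(shallow)   # 2
shallow_inner = shallow[1]        # [1, 2, 99]
deep_inner    = deep[1]           # [1, 2]
```

For a flat array of numbers, such as the byte array above, `copy` is sufficient.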

The issue is explored in this Google Groups thread. If I’ve understood correctly, the gist of it is that Julia makes this sacrifice for the sake of performance. As this is a feature wanted by many, there’s a possibility of it being implemented in a later version of the language.

* Write-up to follow at a later date