Put This In Your Pipe

By: Julia Developers

In a previous post, I talked about why “shelling out” to spawn a pipeline of external programs via an intermediate shell is a common cause of bugs, security holes, unnecessary overhead, and silent failures.
But it’s so convenient!
Why can’t running pipelines of external programs be convenient and safe?
Well, there’s no real reason, actually.
The shell itself manages to construct and execute pipelines quite well.
In principle, there’s nothing stopping high-level languages from doing it at least as well as shells do – the common ones just don’t by default, instead requiring users to make the extra effort to use external programs safely and correctly.
There are two major impediments:

  • Some moderately tricky low-level UNIX plumbing using the pipe, dup2, fork, close, and exec system calls;
  • The UX problem of designing an easy, flexible programming interface for commands and pipelines.

This post describes the system we designed and implemented for Julia, and how it avoids the major flaws of shelling out in other languages.
First, I’ll present the Julia version of the previous post’s example – counting the number of lines in a given directory containing the string “foo”.
The fact that Julia provides complete, specific diagnostic error messages when pipelines fail turns out to reveal a surprising and subtle bug, lurking in what appears to be a perfectly innocuous UNIX pipeline.
After fixing this bug, we go into details of how Julia’s external command execution and pipeline construction system actually works, and why it provides greater flexibility and safety than the traditional approach of using an intermediate shell to do all the heavy lifting.

Simple Pipeline, Subtle Bug

Here’s how you write the example of counting the number of lines in a directory containing the string “foo” in Julia
(you can follow along at home if you have Julia installed from source by changing directories into the Julia source directory and doing cp -a src "source code"; mkdir tmp and then firing up the Julia repl):

julia> dir = "src";

julia> int(readchomp(`find $dir -type f -print0` |> `xargs -0 grep foo` |> `wc -l`))
5

This Julia command looks suspiciously similar to the naïve Ruby version we started with in the previous post:

`find #{dir} -type f -print0 | xargs -0 grep foo | wc -l`.to_i

However, it isn’t susceptible to the same problems:

julia> dir = "source code";

julia> int(readchomp(`find $dir -type f -print0` |> `xargs -0 grep foo` |> `wc -l`))
5

julia> dir = "nonexistent";

julia> int(readchomp(`find $dir -type f -print0` |> `xargs -0 grep foo` |> `wc -l`))
find: `nonexistent': No such file or directory
ERROR: failed processes:
  Process(`find nonexistent -type f -print0`, ProcessExited(1)) [1]
  Process(`xargs -0 grep foo`, ProcessExited(123)) [123]
 in pipeline_error at process.jl:412
 in readall at process.jl:365
 in readchomp at io.jl:172

julia> dir = "foo'; echo MALICIOUS ATTACK; echo '";

julia> int(readchomp(`find $dir -type f -print0` |> `xargs -0 grep foo` |> `wc -l`))
find: `foo\'; echo MALICIOUS ATTACK; echo \'': No such file or directory
ERROR: failed processes:
  Process(`find "foo'; echo MALICIOUS ATTACK; echo '" -type f -print0`, ProcessExited(1)) [1]
  Process(`xargs -0 grep foo`, ProcessExited(123)) [123]
 in pipeline_error at process.jl:412
 in readall at process.jl:365
 in readchomp at io.jl:172

The default, simplest-to-achieve behavior in Julia:

  • is not susceptible to any kind of metacharacter breakage,
  • reliably detects all subprocess failures,
  • automatically raises an exception if any subprocess fails,
  • prints error messages that include exactly which commands failed.

In the above examples, we can see that even when dir contains spaces or quotes, the expression still behaves exactly as intended – the value of dir is interpolated as a single argument to the find command.
When dir is not the name of a directory that exists, find fails – as it should – and this failure is detected and automatically converted into an informative exception, including the fully expanded command-lines that failed.

In the previous post, we observed that using the pipefail option for Bash allows detection of pipeline failures, like this one, occurring before the last process in the pipeline.
However, it only allows us to detect that at least one thing in the pipeline failed.
We still have to guess at what parts of the pipeline actually failed.
In the Julia example, on the other hand, there is no guessing required:
when a non-existent directory is given, we can see that both find and xargs fail.
While it is unsurprising that find fails in this case, it is unexpected that xargs also fails.
Why does xargs fail?

One possibility worth checking is that the xargs program fails when it gets no input at all.
We can use Julia’s success predicate to try it out:

julia> success(`cat /dev/null` |> `xargs true`)
true

Ok, so xargs seems perfectly happy with no input.
Maybe grep doesn’t like not getting any input?

julia> success(`cat /dev/null` |> `grep foo`)
false

Aha! grep returns a non-zero status when it doesn’t get any input.
Good to know.
It turns out that grep indicates whether it matched anything or not with its return status.
Most programs use their return status to indicate success or failure, but some, like grep, use it to indicate some other boolean condition – in this case “found something” versus “didn’t find anything”:

julia> success(`echo foo` |> `grep foo`)
true

julia> success(`echo bar` |> `grep foo`)
false

Now we know why grep is “failing” – and xargs too, since it returns a non-zero status if the program it runs returns non-zero.
This means that our Julia pipeline and the “responsible” Ruby version are both susceptible to bogus failures when we search an existing directory that happens not to contain the string “foo” anywhere:

julia> dir = "tmp";

julia> int(readchomp(`find $dir -type f -print0` |> `xargs -0 grep foo` |> `wc -l`))
ERROR: failed process: Process(`xargs -0 grep foo`, ProcessExited(123)) [123]
 in error at error.jl:22
 in pipeline_error at process.jl:394
 in pipeline_error at process.jl:407
 in readall at process.jl:365
 in readchomp at io.jl:172

Since grep indicates not finding anything using a non-zero return status, the readall function concludes that its pipeline failed and raises an error to that effect.
In this case, this default behavior is undesirable:
we want the expression to just return 0 without raising an error.
The simple fix in Julia is this:

julia> dir = "tmp";

julia> int(readchomp(`find $dir -type f -print0` |> ignorestatus(`xargs -0 grep foo`) |> `wc -l`))
0

This works correctly in all cases.
Next I’ll explain how all of this works, but for now it’s enough to note that the detailed error message produced when our pipeline failed exposed a rather subtle bug, one that would eventually cause hard-to-debug problems in production.
Without such detailed error reporting, this bug would be pretty difficult to track down.

Do-Nothing Backticks

Julia borrows the backtick syntax for external commands from Perl and Ruby, both of which in turn got it from the shell.
Unlike in these predecessors, however, in Julia backticks don’t immediately run commands, nor do they necessarily indicate that you want to capture the output of the command.
Instead, backticks just construct an object representing a command:

julia> `echo Hello`
`echo Hello`

julia> typeof(ans)
Cmd

(In the Julia repl, ans is automatically bound to the value of the last evaluated input.)
In order to actually run a command, you have to do something with a command object.
To run a command and capture its output into a string – what other languages do with backticks automatically – you can apply the readall function:

julia> readall(`echo Hello`)
"Hello\n"

Since it’s very common to want to discard the trailing line break at the end of a command’s output, Julia provides the readchomp(x) function, which is equivalent to writing chomp(readall(x)):

julia> readchomp(`echo Hello`)
"Hello"

To run a command without capturing its output, letting it just print to the same stdout stream as the main process – i.e. what the system function does when given a command as a string in other languages – use the run function:

julia> run(`echo Hello`)
Hello

The "Hello\n" after the readall command is a returned value, whereas the Hello after the run command is printed output.
(If your terminal supports color, these are colored differently so that you can easily distinguish them visually.)
Nothing is returned by the run command, but if something goes wrong, an exception is raised:

julia> run(`false`)
ERROR: failed process: Process(`false`, ProcessExited(1)) [1]
 in error at error.jl:22
 in pipeline_error at process.jl:394
 in run at process.jl:384

julia> run(`notaprogram`)
execvp(): No such file or directory
ERROR: failed process: Process(`notaprogram`, ProcessExited(-1)) [-1]
 in error at error.jl:22
 in pipeline_error at process.jl:394
 in run at process.jl:384

As with xargs and grep above, this may not always be desirable.
In such cases, you can use ignorestatus to indicate that the command returning a non-zero value should not be considered an error:

julia> run(ignorestatus(`false`))

julia> run(ignorestatus(`notaprogram`))
execvp(): No such file or directory
ERROR: failed process: Process(`notaprogram`, ProcessExited(-1)) [-1]
 in error at error.jl:22
 in pipeline_error at process.jl:394
 in run at process.jl:384

In the latter case, an error is still raised in the parent process since the problem is that the executable doesn’t even exist, rather than merely that it ran and returned a non-zero status.

Although Julia’s backtick syntax intentionally mimics the shell as closely as possible, there is an important distinction:
the command string is never passed to a shell to be interpreted and executed;
instead it is parsed in Julia code, using the same rules the shell uses to determine what the command and arguments are.
Command objects allow you to see what the program and arguments were determined to be by accessing the .exec field:

julia> cmd = `perl -e 'print "Hello\n"'`
`perl -e 'print "Hello\n"'`

julia> cmd.exec
3-element Union(UTF8String,ASCIIString) Array:
 "perl"
 "-e"
 "print \"Hello\\n\""

This field is a plain old array of strings that can be manipulated like any other Julia array.
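For example, you can index into it or push! extra arguments onto it just like any other array (a quick illustration; the exact REPL display may vary slightly by version):

julia> cmd.exec[1]
"perl"

julia> push!(cmd.exec, "-w");   # append an arbitrary extra argument

julia> length(cmd.exec)
4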

Constructing Commands

The purpose of the backtick notation in Julia is to provide a familiar, shell-like syntax for making objects representing commands with arguments.
To that end, quotes and spaces work just as they do in the shell.
The real power of backtick syntax doesn’t emerge, however, until we begin constructing commands programmatically.
Just as in the shell (and in Julia strings), you can interpolate values into commands using the dollar sign ($):

julia> dir = "src";

julia> `find $dir -type f`.exec
4-element Union(UTF8String,ASCIIString) Array:
 "find"
 "src"
 "-type"
 "f"

Unlike in the shell, however, Julia values interpolated into commands are interpolated as a single verbatim argument – no characters inside the value are interpreted as special after the value has been interpolated:

julia> dir = "two words";

julia> `find $dir -type f`.exec
4-element Union(UTF8String,ASCIIString) Array:
 "find"
 "two words"
 "-type"
 "f"

julia> dir = "foo'bar";

julia> `find $dir -type f`.exec
4-element Union(UTF8String,ASCIIString) Array:
 "find"
 "foo'bar"
 "-type"
 "f"

This works no matter what the contents of the interpolated value are, allowing simple interpolation of characters that are quite difficult to pass as parts of command-line arguments even in the shell (for the following examples, tmp/a.tsv and tmp/b.tsv can be created in the shell with echo -e "foo\tbar\nbaz\tqux" > tmp/a.tsv; echo -e "foo\t1\nbaz\t2" > tmp/b.tsv):

julia> tab = "\t";

julia> cmd = `join -t$tab tmp/a.tsv tmp/b.tsv`;

julia> cmd.exec
4-element Union(UTF8String,ASCIIString) Array:
 "join"
 "-t\t"
 "tmp/a.tsv"
 "tmp/b.tsv"

julia> run(cmd)
foo     bar     1
baz     qux     2

Moreover, what comes after the $ can actually be any valid Julia expression, not just a variable name:

julia> `join -t$"\t" tmp/a.tsv tmp/b.tsv`.exec
4-element Union(UTF8String,ASCIIString) Array:
 "join"
 "-t\t"
 "a.tsv"
 "b.tsv"

A tab character is somewhat harder to pass in the shell, requiring command interpolation and some tricky quoting:

bash-3.2$ join -t"$(printf '\t')" tmp/a.tsv tmp/b.tsv
foo     bar     1
baz     qux     2

While interpolating values with spaces and other strange characters is great for non-brittle construction of commands, there was a reason why the shell split values on spaces in the first place:
to allow interpolation of multiple arguments.
Most modern shells have first-class array types, but older shells used space-separation to simulate arrays.
Thus, if you interpolate a value like “foo bar” into a command in the shell, it’s treated as two separate words by default.
In languages with first-class array types, however, there’s a much better option:
consistently interpolate single values as single arguments and interpolate arrays as multiple values.
This is precisely what Julia’s backtick interpolation does:

julia> dirs = ["foo", "bar", "baz"];

julia> `find $dirs -type f`.exec
6-element Union(UTF8String,ASCIIString) Array:
 "find"
 "foo"
 "bar"
 "baz"
 "-type"
 "f"

And of course, no matter how strange the strings contained in an interpolated array are, they become verbatim arguments, without any shell interpretation.
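For instance, reusing the two tricky values from earlier in a single array:

julia> dirs = ["two words", "foo'bar"];

julia> `find $dirs -type f`.exec
5-element Union(UTF8String,ASCIIString) Array:
 "find"
 "two words"
 "foo'bar"
 "-type"
 "f"
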
Julia’s backticks have one more fancy trick up their sleeve.
We saw earlier (without really remarking on it) that you could interpolate single values into a larger argument:

julia> x = "bar";

julia> `echo foo$x`
`echo foobar`

What happens if x is an array?
Only one way to find out:

julia> x = ["bar", "baz"];

julia> `echo foo$x`
`echo foobar foobaz`

Julia does what the shell would do if you wrote echo foo{bar,baz}.
This even works correctly for multiple values interpolated into the same shell word:

julia> dir = "/data"; names = ["foo","bar"]; exts=["csv","tsv"];

julia> `cat $dir/$names.$exts`
`cat /data/foo.csv /data/foo.tsv /data/bar.csv /data/bar.tsv`

This is the same Cartesian product expansion that the shell does if multiple {...} expressions are used in the same word.

Further Reading

You can read more in Julia’s online manual, including how to construct complex pipelines, and how shell-compatible quoting and interpolation rules in Julia’s backtick syntax make it both simple and safe to cut-and-paste shell commands into Julia code.
The whole system is designed on the principle that the easiest thing to do should also be the right thing.
The end result is that starting and interacting with external processes in Julia is both convenient and safe.

Distributed Numerical Optimization

By: Julia Developers

This post walks through the parallel computing functionality of Julia
to implement an asynchronous parallel version of the classical
cutting-plane algorithm for convex (nonsmooth) optimization,
demonstrating the complete workflow including running on both Amazon
EC2 and a large multicore server. I will quickly review the
cutting-plane algorithm and will be focusing primarily on parallel
computation patterns, so don’t worry if you’re not familiar with the
optimization side of things.

Cutting-plane algorithm

The cutting-plane algorithm is a method for solving the optimization problem

$$\text{minimize}_{x \in \mathbb{R}^d} \quad \sum_{i=1}^n f_i(x)$$

where the functions \( f_i \) are convex but not necessarily differentiable.
The absolute value function \( |x| \) and the 1-norm \( \|x\|_1 \) are
typical examples. Important applications also arise from Lagrangian
relaxation. The idea of the algorithm is to approximate the functions
\( f_i \) with piecewise linear models \( m_i \), which are built up from
information obtained by evaluating \( f_i \) at different points. We
iteratively minimize over the models to generate candidate solution points.

We can state the algorithm as

  1. Choose starting point \( x \).
  2. For \(i = 1,\ldots,n\), evaluate \(
    f_i(x) \) and update corresponding model \( m_i \).
  3. Let the next
    candidate \( x \) be the minimizer of \( \sum_{i=1}^n m_i(x) \).
  4. If not converged, go to step 2.

If it is costly to evaluate \( f_i(x) \), then the algorithm is naturally
parallelizable at step 2. The minimization in step 3 can be computed by solving
a linear optimization problem, which is usually very fast. (Let me point out
here that Julia has interfaces to linear programming and other
optimization solvers under JuliaOpt.)

Abstracting the math, we can write the algorithm using the following Julia code.

# functions initialize, isconverged, solvesubproblem, and process implemented elsewhere
state, subproblems = initialize()
while !isconverged(state)
    results = map(solvesubproblem,subproblems)
    state, subproblems = process(state, results)
end

The function solvesubproblem corresponds to evaluating \( f_i(x) \) for a
given \( i \) and \( x \) (the elements of subproblems could be tuples
(i,x)). The function process corresponds to minimizing the model in step
3, and it produces a new state and a new set of subproblems to solve.

Note that the algorithm looks much like a map-reduce that would be easy to
parallelize using many existing frameworks. Indeed, in Julia we can simply
replace map with pmap (parallel map). Let’s consider a twist that makes
the parallelism not so straightforward.
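
Before moving on, note that this synchronous drop-in replacement would simply change the map line in the loop above to

results = pmap(solvesubproblem, subproblems)   # distribute the evaluations across worker processes

since pmap takes the same arguments as map.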

Asynchronous variant

Variability in the time taken by the solvesubproblem function can lead to
load imbalance and limit parallel efficiency as workers sit idle waiting for new
tasks. Such variability arises naturally if solvesubproblem itself requires
solving an optimization problem, or if the workers and network are shared, as is
often the case with cloud computing.

We can consider a new variant of the cutting-plane algorithm to address this
issue. The key point is

  • When proportion \(0 < \alpha \le 1 \) of subproblems for a given candidate
    have been solved, generate a new candidate and corresponding set of
    subproblems by using whatever information is presently available.

In other words, we generate new tasks to feed to workers without needing to wait
for all current tasks to complete, making the algorithm asynchronous. The
algorithm remains convergent, although the total number of iterations may
increase. For more details, see this paper by Jeff Linderoth and
Stephen Wright.

By introducing asynchronicity we can no longer use a nice black-box pmap
function and have to dig deeper into the parallel implementation. Fortunately,
this is easy to do in Julia.

Parallel implementation in Julia

Julia implements distributed-memory parallelism based on one-sided message
passing, where processes push work onto others (via remotecall) and the
results are retrieved (via fetch) by the process which requires them. Macros
such as @spawn and @parallel provide pretty syntax around this low-level
functionality. This model of parallelism is very different from the typical
SPMD style of MPI. Both approaches are useful in different contexts, and I
expect an MPI wrapper for Julia will appear in the future (see also here).
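
As a minimal taste of these primitives (a small sketch; it assumes at least
one worker process, e.g. Julia started with -p 1):

r = remotecall(2, rand, 2, 2)  # ask process 2 to compute a random 2x2 matrix
s = @spawn 1 .+ fetch(r)       # run on some available process, reusing the result of r
fetch(s)                       # retrieve the final matrix on the calling process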

Reading the manual
on parallel computing is highly recommended, and I won’t try to reproduce it in
this post. Instead, we’ll dig into and extend one of the examples it presents.

The implementation of pmap in Julia is

function pmap(f, lst)
    np = nprocs()  # determine the number of processors available
    n = length(lst)
    results = cell(n)
    i = 1
    # function to produce the next work item from the queue.
    # in this case it's just an index.
    next_idx() = (idx=i; i+=1; idx)
    @sync begin
        for p=1:np
            if p != myid() || np == 1
                @spawnlocal begin
                    while true
                        idx = next_idx()
                        if idx > n
                            break
                        end
                        results[idx] = remotecall_fetch(p, f, lst[idx])
                    end
                end
            end
        end
    end
    results
end

At first sight, this code is not particularly intuitive. The @spawnlocal
macro creates a task on the master process (process 1). Each task feeds work
to a corresponding worker; the call remotecall_fetch(p, f, lst[idx])
evaluates f on process p with argument lst[idx] and returns the result when
finished. Tasks are uninterruptible and only surrender control at specific
points such as remotecall_fetch. Tasks cannot directly modify variables from
the enclosing scope, but the same effect can be achieved by using the
next_idx function to access and mutate i. This task idiom takes the place of
polling each worker process for results in a loop.

Implementing our asynchronous algorithm is not much more than a modification of
the above code:

# given constants n and 0 < alpha <= 1
# functions initialize and solvesubproblem defined elsewhere
np = nprocs() 
state, subproblems = initialize()
converged = false
isconverged() = converged
function updatemodel(mysubproblem, result)
    global converged  # needed so the assignment below rebinds the flag defined above
    # store result
    ...
    # decide whether to generate new subproblems
    state.numback[mysubproblem.parent] += 1
    if state.numback[mysubproblem.parent] >= alpha*n && !state.didtrigger[mysubproblem.parent]
        state.didtrigger[mysubproblem.parent] = true
        # generate newsubproblems by solving linear optimization problem
        ...
        if ... # convergence test
            converged = true
        else
            append!(subproblems, newsubproblems)
            push!(state.didtrigger, false)
            push!(state.numback, 0)
            # ensure that for s in newsubproblems, s.parent == length(state.numback)
        end
    end
end

@sync begin
    for p=1:np
        if p != myid() || np == 1
            @spawnlocal begin
                while !isconverged()
                    if length(subproblems) == 0
                        # no more subproblems but haven't converged yet
                        yield()
                        continue
                    end
                    mysubproblem = shift!(subproblems) # pop subproblem from queue
                    result = remotecall_fetch(p, solvesubproblem, mysubproblem)
                    updatemodel(mysubproblem, result)
                end
            end
        end
    end
end

where state is an instance of a type defined as

type State
    didtrigger::Vector{Bool}
    numback::Vector{Int}
    ...
end

There is little difference in the structure of the code inside the @sync
blocks, and the asynchronous logic is encapsulated in the local updatemodel
function which conditionally generates new subproblems. A strength of Julia is
that functions like pmap are implemented in Julia itself, so that it is
particularly straightforward to make modifications like this.

Running it

Now for the fun part. The complete cutting-plane algorithm (along with
additional variants) is implemented in JuliaBenders. The code is
specialized for stochastic
programming
where the cutting-plane algorithm is known as the L-shaped
method
or Benders decomposition and is used to decompose the solution of
large linear optimization problems. Here, solvesubproblem entails solving a
relatively small linear optimization problem. Test instances are taken from the
previously mentioned paper.

We’ll first run on a large multicore server. The
runals.jl (asynchronous L-shaped) file contains the algorithm we’ll use. Its
usage is

julia runals.jl [data source] [num subproblems] [async param] [block size]

where [num subproblems] is the \(n\) as above and [async param] is
the proportion \(\alpha\). By setting \(\alpha = 1\) we obtain the
synchronous algorithm. For the asynchronous version we will take \(\alpha =
0.6\). The [block size] parameter controls how many subproblems are sent to
a worker at once (in the previous code, this value was always 1). We will use
4000 subproblems in our experiments.

To run multiple Julia processes on a shared-memory machine, we pass the -p N
option to the julia executable, which will start up N system processes.
To execute the asynchronous version with 10 workers, we run

julia -p 12 runals.jl Data/storm 4000 0.6 30

Note that we start 12 processes. These are the 10 workers, the master (which
distributes tasks), and another process to perform the master’s computations (an
additional refinement which was not described above). Results from various runs
are presented in the table below.

                 Synchronous           Asynchronous
No. Workers    Speed   Efficiency    Speed   Efficiency
10             154     Baseline      166     Baseline
20             309     100.3%        348     105%
40             517     84%           654     98%
60             674     73%           918     92%

Table:
Results on a shared-memory 8x Xeon E7-8850 server. Workers correspond to
individual cores. Speed is the rate of subproblems solved per second. Efficiency
is calculated as the percent of ideal parallel speedup obtained. The superlinear
scaling observed with 20 workers is likely a system artifact.

There are a few more hoops to jump through in order to run on EC2. First we must
build a system image (AMI) with Julia installed. Julia connects to workers over
ssh, so I found it useful to put my EC2 ssh key on the AMI and also set
StrictHostKeyChecking no in /etc/ssh/ssh_config to disable the
authenticity prompt when connecting to new workers. Someone will likely
correct me if this isn't the right approach.

Assuming we have an AMI in place, we can fire up the instances. I used an
m3.xlarge instance for the master and m1.medium instances for the workers.
(Note: you can save a lot of money by using the spot market.)

To add remote workers on startup, Julia accepts a file with a list of host names
through the --machinefile option. We can generate this easily enough by
using the EC2 API Tools (Ubuntu package ec2-api-tools) with the command

ec2-describe-instances | grep running | awk '{ print $5; }' > mfile

On the master instance we can then run

julia --machinefile mfile runals.jl Data/storm 4000 0.6 30

Results from various runs are presented in the table below.

                 Synchronous           Asynchronous
No. Workers    Speed   Efficiency    Speed   Efficiency
10             149     Baseline      151     Baseline
20             289     97%           301     99.7%
40             532     89%           602     99.5%

Table:
Results on Amazon EC2. Workers correspond to individual m1.medium instances. The
master process is run on an m3.xlarge instance.

On both architectures the asynchronous version solves subproblems at a higher
rate and has significantly better parallel efficiency. Scaling is better on EC2
than on the shared-memory server likely because the subproblem calculation is
memory bound, and so performance is better on the distributed-memory
architecture. Anyway, with Julia we can easily experiment on both.

Further reading

A more detailed tutorial
was prepared for the Julia IAP session at MIT in January 2013.

Videos from the Julia tutorial at MIT

By: Julia Developers

We held a two-day Julia tutorial at MIT in January 2013, which included 10 sessions. MIT OpenCourseWare and MITx graciously provided support for recording these lectures, so that the wider Julia community can benefit from these sessions.

Julia Lightning Round (slides)

This session is a rapid introduction to Julia, structured as a series of lightning rounds. It uses short examples to demonstrate syntax and features, and gives a quick feel for the language.

Rationale behind Julia and the Vision (slides)

The rationale and vision behind Julia and its design principles are discussed in this session.

Data Analysis with DataFrames (slides)

DataFrames is one of the most widely used Julia packages. This session is an introduction to data analysis with Julia using DataFrames.

Statistical Models in Julia (slides)

This session demonstrates Julia’s statistics capabilities, which are provided by these packages: Distributions, GLM, and LM.

Fast Fourier Transforms

Julia provides a built-in interface to the FFTW library. This session demonstrates Julia’s signal processing capabilities, such as FFTs and DCTs. Also see the Hadamard package.
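As a small illustration of that built-in interface (a minimal sketch; fft and ifft operate on ordinary Julia arrays):

x = rand(8)          # a real-valued signal
y = fft(x)           # forward transform via the built-in FFTW bindings
x2 = real(ifft(y))   # inverse transform recovers x up to floating-point error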

Optimization (slides)

This session focuses largely on using Julia for solving linear programming problems. The algebraic modeling language discussed was later released as JuMP. Benchmarks are shown evaluating the performance of Julia for implementing low-level optimization code. Optimization software in Julia has been grouped under the JuliaOpt project.

Metaprogramming and Macros

Julia is homoiconic: it represents its own code as a data structure of the language itself. Since code is represented by objects that can be created and manipulated from within the language, it is possible for a program to transform and generate its own code. Metaprogramming is described in detail in the Julia manual.
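For a tiny taste of what this means in practice (a minimal sketch), quoted code is an ordinary value that can be inspected, rewritten, and evaluated:

ex = :(a + b)      # quoting turns code into a data structure (an Expr)
typeof(ex)         # => Expr
ex.args[3] = :c    # rewrite the expression in place: it is now :(a + c)
eval(ex)           # run the transformed code (assuming a and c are defined)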

Parallel and Distributed Computing (Lab, Solution)

Parallel and distributed computing have been an integral part of Julia’s capabilities from an early stage. This session describes existing basic capabilities, which can be used as building blocks for higher level parallel libraries.

Networking

Julia provides asynchronous networking I/O using the libuv library. Libuv is a portable networking library created as part of the Node.js project.

Grid of Resistors (Lab, Solution)

The Grid of Resistors is a classic numerical problem to compute the voltages and the effective resistance of a 2n+1 by 2n+2 grid of 1 ohm resistors if a battery is connected to the two center points. As part of this lab, the problem is solved in Julia in a number of different ways such as a vectorized implementation, a devectorized implementation, and using comprehensions, in order to study the performance characteristics of various methods.