By: Christian Groll
Re-posted from: http://grollchristian.wordpress.com/2014/04/25/julia-language-recommendation/
After spending quite some time using Julia (a programming language for technical computing) during the last few months, I am confident enough to offer something like a “letter of recommendation”. Hence, I decided to list some of the features that make Julia appealing to me, interspersed with some resources on Julia that I found helpful and worth sharing.
1 It is free
The Julia language is developed under the MIT open source license and hence can be used free of charge. Open source is a highly desirable feature to me, especially in research, as it promotes cooperation and exchange between researchers. That being said, Julia easily stands up to comparison with proprietary software (like MATLAB) as well, and I hope this will become clear in the following.
2 It is fast
Although I have not made any formal performance comparisons with other programming languages so far, I can at least say that Julia feels quite fast to me in standard day-to-day usage, especially in comparison to R. The homepage, however, lists some formal benchmarks that indicate really good performance compared to other languages. Of course, these are only synthetic test cases. An objective comparison in real applications is quite hard to achieve, since languages like R make substantial use of C code under the hood in almost any computationally intensive package. For the sake of both efficiency and reliability, however, I think that researchers should generally avoid low-level languages like C. As most researchers never received thorough training in software development, such low-level languages are simply too error-prone, especially if you skip extensive, well-thought-out and well-structured software testing. So, leaving aside the option of factoring code out into C, I am quite confident that Julia truly is faster than R, and at least as fast as MATLAB. Furthermore, Julia was allegedly designed from the ground up to enable things like parallel computing and big data handling.
Nothing comes without a price, however, and truly leveraging these performance capabilities forces you to deal with data types more explicitly. This happens in Julia to a far lesser degree than in C, but it can still be unintuitive and cumbersome at some points, and it especially complicates the learning process in the beginning. Dealing with types more explicitly also brings some additional benefits, however, such as multiple dispatch, where the behavior of a function can be defined across many combinations of argument types.
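To make the idea of multiple dispatch a bit more concrete, here is a minimal sketch. The types Shape, Circle and Square and the function overlap_rule are hypothetical names made up for this illustration, and the code uses the syntax of current Julia 1.x releases, which differs in places from the version available when this post was written.

```julia
# Hypothetical types, introduced only for this illustration.
abstract type Shape end

struct Circle <: Shape
    radius::Float64
end

struct Square <: Shape
    side::Float64
end

# One generic function with several methods: Julia picks the method
# based on the runtime types of *all* arguments, not just the first one.
overlap_rule(a::Circle, b::Circle) = "circle-circle"
overlap_rule(a::Circle, b::Square) = "circle-square"
overlap_rule(a::Square, b::Circle) = overlap_rule(b, a)  # reuse the method above
overlap_rule(a::Shape, b::Shape) = "generic fallback"    # any other combination

println(overlap_rule(Circle(1.0), Square(2.0)))
println(overlap_rule(Square(2.0), Circle(1.0)))
println(overlap_rule(Square(2.0), Square(1.0)))
```

The last call has no method written specifically for two squares, so it ends up in the fallback defined for any pair of Shape arguments.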
3 It is expressive
The next selling point is probably a bit underestimated in general, since its true benefit is harder to grasp than, for example, plain speed. The point is that the syntax of Julia is very rich and expressive. Having my roots in MATLAB, I generally favor the syntax of Julia (which is fundamentally similar to MATLAB) over the syntax of R, as it appears cleaner to me (of course, this is a matter of taste!). More generally, however, the syntax of Julia is much richer, so that it allows a high level of customization. For example, Julia is able to mimic R formula syntax, as is done in the GLM package. Furthermore, you can build your own types that behave exactly the way you want. For example, one of my first projects in Julia was to create a type that is especially suited for time series data (I wrapped it up in the package TimeData). With it, I could specify, for example, how objects are displayed, that mathematical functions apply to the numeric data only and not to the time index column, and that entries can be accessed and indexed through date strings. The rich syntax also includes meta-programming capabilities (generating code through code) and macros, making unit testing in Julia as straightforward as it could be (yes, I really think that software testing is indispensable when software is used and disclosed in research!). To get a feeling for the richness of the syntax and the possibilities that it provides, you could check out, as an example, a use case of iterators in Julia shown in this post. Or, for an impression of the compactness and intuitive appeal of the syntax, there is a blog post on Econometrics by Simulation that compares it to R.
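As a rough illustration of such a custom type, here is a minimal sketch. It is not the implementation used in TimeData: the type DatedSeries and its fields are hypothetical names chosen for this example, and the code uses current Julia 1.x syntax together with the Dates and Test standard libraries.

```julia
using Dates   # standard library: the Date type
using Test    # standard library: the @test macros

# Hypothetical minimal time series type; NOT the TimeData implementation.
struct DatedSeries
    dates::Vector{Date}
    values::Vector{Float64}
end

# Customize the way objects are displayed.
function Base.show(io::IO, s::DatedSeries)
    print(io, "DatedSeries with ", length(s.values), " observations")
end

# Access entries through a date string such as "2014-04-25".
function Base.getindex(s::DatedSeries, datestr::AbstractString)
    i = findfirst(==(Date(datestr)), s.dates)
    i === nothing && throw(KeyError(datestr))
    return s.values[i]
end

# Mathematical functions apply to the numeric data only; dates stay untouched.
Base.exp(s::DatedSeries) = DatedSeries(s.dates, exp.(s.values))

s = DatedSeries([Date(2014, 4, 24), Date(2014, 4, 25)], [0.0, 1.0])

# Unit tests are plain macro calls.
@test s["2014-04-25"] == 1.0
@test exp(s)["2014-04-24"] == 1.0
@test sprint(show, s) == "DatedSeries with 2 observations"
@test_throws KeyError s["2014-01-01"]
```

Display, indexing and mathematical functions are all customized by adding methods to existing generic functions from Base, which is why such a type feels like a built-in one once it is defined.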
4 It is transparent
What I really like about Julia is that it was based on open source software development practices from the start, so that it is tightly integrated with version control through Git and GitHub. This way, the complete code base is easily accessible and readable, and the GitHub platform provides an excellent environment for further contributions and cooperation. For example, code improvements from any third party can easily be integrated into the code base. The days when you had to send your code extension to some package author via email are gone. GitHub also allows the use of automated testing services like Travis, so that code ultimately becomes more robust, with fewer bugs.
5 It is illustrative
By now, there are already a number of graphics packages out there, both for on-the-fly visualizations and for publication-ready graphics, as well as for graphics suited to HTML formats.
Moreover, there is also an interface to Python’s interactive graphical notebook (IJulia), which allows you to combine code, formatted text, math, and multimedia in a single document.
6 It is growing
Due to its outstanding features, Julia also seems to be attracting increasing attention worldwide (you can find some numbers and graphics on the community in this post). With JuliaStudio, there is even an integrated development environment similar to RStudio already.
7 It is unfinished
Despite all these positive features, there are also some deficiencies that still need to be overcome. As a matter of fact, the language has not yet reached an official 1.0 release. Hence, code development can sometimes be a little more cumbersome than necessary, for example due to the following problems:
- sometimes Julia still crashes and needs to be restarted
- variables cannot be completely removed from the workspace
- in my opinion, MATLAB still has by far the best debugger – Julia does lag behind here
- any change to a type definition requires restarting Julia
And, of course, as a comparatively new language, Julia still lacks some of the extensive libraries that have already been implemented for other languages and still need to be ported to Julia. (That being said, there already exist quite good interfaces to other languages, which make some of these libraries available in Julia as well.)
However, keep in mind that this list of deficiencies is only a description of the state of Julia at the time of writing, and I am quite confident that it will be outdated rather soon.
8 Resources
Besides the help material provided on the official homepage, here are some links to additional resources that I found helpful:
- a description of the state of Julia from someone who is much more involved in the development process than I am
- a summary of some essential helper functions on Leah Hanson’s blog
- a more extensive summary of syntax and basic functions and applications provided by Bogumil Kaminski