Welcome to Learn With Me: Julia. A series where you can follow me along my journey of learning Julia, Data Science and Machine Learning. This series is heavily inspired by Learn With Me: Elixir, a series by Kevin Peter / The Inquisitive Developer and the format of this post will follow his introductory post for Elixir.
I realised that while there are plenty of resources about Julia already out there, it would be interesting to document my journey in picking up the language and some fundamental data science and machine learning with it.
The Julia community, to a large degree, consists of academics. The level of discourse on the Julia Slack / Zulip is often too advanced for me to understand. Researchers from all kinds of fields, such as space engineering, bioengineering and mathematics, come together to practice Julia. The 2020 Community Survey nicely shows this:
About You
I'm going to be writing this series for someone who has some programming knowledge and like me wants to learn about Julia. Some familiarity with computer science and algorithms will be useful and you should also have an interest in mathematics as I like to lean into the mathematics of machine learning when I'm ready to explore it with Julia.
For the rest of this section I'll quote Kevin Peter, since it also applies here:
So while this series is not meant for beginning programmers, you don't have to be a master programmer to follow along either. I very much doubt I will be delving into any advanced theoretical concepts or heavy mathematics. I'm aiming for practical stuff that a typical experienced software developer will be able to read and understand. I aim to be easily readable and informative.
If you need a resource on how to get started with Julia and a quick overview of why I chose this language you can read my Getting started with Julia post.
About Me
I left academia six years ago and have since been working with Ruby, JavaScript and Python in a professional setting almost exclusively. In my job I build web application backends and ETL pipelines and work closely with data scientists.
My personal interest in Machine Learning is what's driving me to Julia. Its expressivity over other languages like Python intrigues me and makes me think that it's only going to grow going forward.
Recently I've challenged myself to practice Julia for 45 minutes every (week)day as part of a #100daysofcode challenge. I'm 25 days into this challenge and have explored popular libraries such as Pluto, Plots, Revise and Javis.
You can find more information about who I am on the About page.
I use Windows for gaming. It's been a long time since I've last done any serious development work on my Windows machine and yet I still spent a good chunk of money on building out a beefy machine for my efforts to learn machine learning. It has taken me a few months to finally sit down and get this machine ready for anything other than gaming.
With the recent announcement of WSLg, which brings GUI application support to WSL, I got really excited to try out WSL and see how good the GPU support actually is, but that's not the main reason. I've been shying away from developing on Windows because I'm used to a *NIX environment. WSL gives you that, but up until recently you wouldn't have been able to interact with any GPU, and this all changed with this announcement!
You can watch the video below to see what's coming for WSL2.
The first half of this article will show you how to get everything set up and in the second half we'll set up CUDA.jl in Julia. Since the latter part is about CUDA I'll assume that you have a compatible NVIDIA GPU.
Installation
Here's a summary of what we need to go through to prepare our environment:
Update Windows 10 to the latest release on the dev channel
Install NVIDIA CUDA drivers
Install Ubuntu 20.04 in WSL2
Install Linux CUDA packages
🎉
Windows 10 Insider Preview
At the time of writing all of the features are only available through the Windows Insider Program. The Windows Insider Program allows you to receive new Windows features before they hit the main update line. The program is split into three channels: Dev, Beta and Release Preview.
To receive the update with WSLg and GPU support we will need to switch to the dev channel.
Note: The dev channel comes with some rough edges and potential for system instability. Be mindful of this when you switch and make sure you have backups!
After installing all downloaded updates you should end up with OS Build 21364 or higher. You can check your OS Build by running winver in PowerShell/cmd.
With this all set we can hop on to install the latest WSL2 compatible CUDA drivers.
CUDA drivers
NVIDIA are providing special CUDA drivers for Windows 10 WSL. The link below will take you to the download page. It's required to sign up for the NVIDIA Developer Program, which is free.
Follow the setup wizard (I chose the express installation which keeps existing settings in place).
Note: There's also a documentation page provided by NVIDIA around setting up your GPU for WSL. I found the docs to be outdated and not working on my machine.
Installing Ubuntu 20.04 LTS with WSL2
Before we can proceed with installing Ubuntu I advise updating the WSL kernel by running:
wsl --update
In case you're like me and don't enjoy the default terminal Windows comes with, I suggest you install Windows Terminal from the Microsoft Store. This terminal is a lot more pleasant to use than either cmd or the PowerShell terminal ever were.
It's also a good idea to set WSL to default to version 2:
wsl --set-default-version 2
Finally let's install Ubuntu with:
wsl --install --distribution Ubuntu-20.04
In case you were wondering what other distributions are available you can simply run: wsl --list --online.
Ubuntu 20.04 LTS
With Ubuntu installed we have a couple of final steps. First we will add an upstream repo to apt for getting the latest CUDA builds directly from NVIDIA:
We also need to add NVIDIA's GPG key for the apt repo:
And finally to make sure that we prefer the packages provided by NVIDIA over packages in mainline Ubuntu we need to pin the apt repo:
With this out of the way, we're ready to install the CUDA drivers inside our WSL Ubuntu installation:
You may pick a different target directory for your installation of Julia or use a version manager like asdf-vm.
Install CUDA.jl
At this point simply run julia in your terminal and you should be dropped into the Julia REPL. I assume you've worked with Julia before and know how to operate its package manager.
Hit ] to enter pkg mode and install CUDA with:
activate --temp
add CUDA
From here hit backspace and import CUDA. CUDA.jl provides a useful function called functional which will confirm that we've done everything right (well, that's the hope at least, right?).
using CUDA
CUDA.functional()
You can additionally run CUDA.versioninfo() to get a more detailed breakdown of the supported features on your GPU.
At this point you should have a working installation with WSL2, Ubuntu 20.04, Julia and CUDA.jl. In case you're new to CUDA.jl I suggest you follow the excellent introduction to GPU programming by JuliaGPU or jump in at the deep end with FluxML's GPU support.
If you like articles like this one, please consider subscribing to my free newsletter where at least once a week I send out my latest work covering Julia, Python, Machine Learning and other tech.
During my early studies and later in my career I had the opportunity to hone my engineering skills in a variety of different programming languages. I started in my early teenage years with QBasic and then progressed on to more advanced languages.
In open source projects I've largely worked on JavaScript (node.js) and PHP codebases, while my professional career has taken me to Java, Ruby (on Rails) and finally Python. Little bits of exposure left and right to newer languages like Go and Rust, but the above are the main languages I've done actual work in.
Over the years I have worked on a number of optimization problems that required the use of machine learning algorithms and libraries or the plotting of graphs. The go-to language for this has naturally been Python with its large collection of data science packages such as numpy, scipy and matplotlib, not to mention pytorch and tensorflow.
Now there's nothing wrong with any of the above. Python is a solid language with a long history and it's grown to become the language of choice for data science. For me though, Python never felt quite right. The Zen of Python makes you think the language is opinionated and has a clear structure to it. Oftentimes I found the opposite to be true. Where in Ruby (on Rails) "convention over configuration" rules the thinking, Python leaves most up to the developer.
Popular web frameworks like Django and Flask come with short examples on how to build an application but leave a lot unanswered when your application starts growing. Where do routes live? Where do I put my models? As a result every project will look different, making it inherently difficult to jump into new codebases.
In a similar vein Python has the basic building blocks for OOP and functional programming, but stops there. Private variables don't really exist, static methods and class methods require the use of decorators, lambda is but a crippled version of implementations in other languages like Ruby and JavaScript, map is a built-in function while reduce was demoted to the functools module. Even though the language technically supports these paradigms you will have to look hard to find them embraced in the popular packages.
Now you could argue that I'm comparing apples with oranges and to some extent I am. After all I'm comparing frameworks and not languages. To me though both go hand in hand. You can't have a language without the packages that make up said language. Just like a human language is influenced and evolves through its speakers, a programming language and the culture around it is defined by its community and ecosystem.
To be clear though – preferences differ and I'm not arguing that Python is a bad language, quite the contrary, it has built up a large following due to it being easy to learn and having a large ecosystem. It's just not a language that satisfies me because I care about structure and ultimately that's what made me look elsewhere.
Meet Julia
Work on Julia began in 2009 with the goal of creating a fast high-level language. It was first announced and publicly released in 2012, and version 1.0 landed in 2018.
Julia has a dynamic type system, built-in package manager, multiple dispatch, performance approaching that of C and an ability to natively interface with other languages like Python. It's designed for distributed computing and makes it trivial to run algorithms on GPUs.
The ability to natively and easily interface with other languages sticks out to me. Take a look at the following example which uses PyCall to call some Python code from within Julia:
The beauty is that you can very easily mix Python and Julia this way. Python's entire ecosystem, including all its great libraries, is available while you can also explore Julia's ecosystem. In fact Plots.jl, a plotting library for Julia, can use PyPlot as one of its backends.
A neat feature of Julia is that you can use LaTeX shortcuts in the editor. For example when you type \sqrt<TAB> most Julia editors and the Julia REPL will automatically turn this sequence into the square-root sign: √. In this case √ is also aliased to the sqrt function. This makes Julia great for mathematical applications where you can use mathematical notation in your programs.
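As a small sketch of what this looks like in practice, the snippet below uses √ directly in code. The hypotenuse function and the variable names are hypothetical and only there for illustration; √ itself is part of Julia's standard library.

```julia
# Type \sqrt<TAB> in the REPL or a Julia-aware editor to get the √ symbol.
root = √16                 # same as sqrt(16)

# √ is another name for the sqrt function, not a separate implementation.
same_function = (√) === sqrt

# Hypothetical helper written in mathematical notation.
hypotenuse(a, b) = √(a^2 + b^2)

println(root)              # 4.0
println(same_function)     # true
println(hypotenuse(3, 4))  # 5.0
```

The same tab completion works for Greek letters, so \alpha<TAB> gives you α as a perfectly valid variable name.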
Julia lends itself to scientific applications and machine learning. Tools like Flux.jl
While it's true that Julia is heavily used in research it doesn't mean that you can't also build run-of-the-mill web applications. Frameworks like Genie
Installing Julia
The easiest way to install Julia is to download the compatible package for your OS on the official website. Personally I tend to use asdf-vm as it allows me to install and run multiple versions of my favourite languages simultaneously.
With asdf-vm installing Julia becomes as easy as:
$ asdf plugin add julia
$ asdf install julia 1.6.0
REPL
Julia's REPL is different from what you may be used to from other languages. It comes bundled with a package manager and several different modes.
The main shortcuts that you'll want to remember are:
] gets you into package mode, from here you can add or update packages and manage projects
? gets you into help mode. Julia is a well documented language and the help mode will allow you to find explanations and examples for all the built-ins
; is how you can enter shell mode allowing you to use a bash-like shell without leaving the Julia interpreter
Installing packages
Installing packages is straightforward. Press ] to go into package mode and then run add $PackageName to install a package named $PackageName. Note that this will install the package globally.
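If you'd rather not install something globally while experimenting, one option is a throwaway environment. The following is a minimal sketch using the Pkg standard library from regular Julia code instead of package mode, assuming Julia 1.5 or newer for the temp keyword. The package name Example in the comment is only a placeholder.

```julia
using Pkg

# Activate a temporary environment; it lives in a temp directory and
# leaves the global (default) environment untouched.
Pkg.activate(temp=true)

# Show which project file is active now and what it contains (nothing yet).
println(Base.active_project())
Pkg.status()

# From here, Pkg.add("Example") would install into the temporary
# environment only. It is left commented out because it needs network access.
# Pkg.add("Example")
```

This is the programmatic equivalent of running activate --temp in package mode.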
A package you'll definitely want to install globally is called Revise.jl.
Revise.jl
Revise allows you to hot reload code while working in the REPL. This is especially useful in Julia since the REPL takes a little longer to start than in other languages. With Revise you can edit code in your IDE and have it reflected in the REPL immediately.
Editors
The two main editors for Julia are Juno and Julia for VSCode. Julia for VSCode is relatively young and was introduced at the last JuliaCon in 2020. Juno is based on Atom which has seen some decline in usage over time.
Since I'm already using VSCode for all my other development using the Julia for VSCode extension was an easy decision. It has all the features present in Juno.
Conclusion
Julia is a fascinating language with a growing community and a solid package ecosystem. While the ecosystem isn't as vast as Python's, it is trivial to call out to Python, R, C and other languages.
Julia's package manager is fast and easy to use. It's a breath of fresh air when compared to the shenanigans one has to deal with in Python with pip, setuptools, conda, poetry, et al.
The lack of object-oriented programming features is hardly noticeable once you are comfortable with multiple dispatch. In fact Julia's simplicity and well thought out type system make it easy to reason about the objects you're dealing with in code. On top of that, I find Julia very easy to learn and well positioned to attract developers, data scientists and researchers who traditionally picked Python.
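To make the multiple dispatch point a bit more concrete, here is a minimal sketch. The Shape, Circle and Rectangle types as well as the area and describe functions are hypothetical names made up for this illustration, not something from a library.

```julia
# A tiny type hierarchy: no classes, just data.
abstract type Shape end

struct Circle <: Shape
    radius::Float64
end

struct Rectangle <: Shape
    width::Float64
    height::Float64
end

# Behaviour lives in functions; Julia picks the method by argument type.
area(c::Circle) = π * c.radius^2
area(r::Rectangle) = r.width * r.height

# Dispatch considers the types of *all* arguments, not just the first.
describe(a::Circle, b::Circle) = "circle/circle"
describe(a::Circle, b::Rectangle) = "circle/rectangle"
describe(a::Shape, b::Shape) = "generic shape pair"   # fallback

println(area(Rectangle(2, 3)))                       # 6.0
println(describe(Circle(1), Rectangle(1, 1)))        # circle/rectangle
println(describe(Rectangle(1, 1), Circle(1)))        # generic shape pair
```

Where an object-oriented language would attach area to each class, here the methods sit next to each other and new types can be added later without touching existing code.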