
How to call Julia code from Python

By: Abel Soares Siqueira

Re-posted from: https://blog.esciencecenter.nl/how-to-call-julia-code-from-python-8589a56a98f2?source=rss----ab3660314556--julia

A three-part series on achieving high performance with high-level code

By Abel Soares Siqueira and Faruk Diblen

Caption: Astronaut carrying Python and Julia. Photo by Brian McGowan on Unsplash (https://unsplash.com/photos/MR9xsNWVKvo), modified by us.

Target audience

This is the first post in a three-part series about achieving high performance with high-level code. The series is aimed at people who work with Python, need better performance, but prefer not to develop a low-level performant library.

Introduction

Having recently joined the Netherlands eScience Center after seven years of research using the Julia language, I was excited to highlight some of its cool features. At the eScience Center, many of our engineers use Python, some use C or C++, and in some cases we see Python calling C++ code to speed things up. This was the perfect opportunity to introduce Julia’s interoperability with Python and to investigate whether we could achieve comparable speed by calling Julia code from Python. For situations where Python’s performance is not sufficient, we could then speed it up with another high-level language, avoiding a low-level language like C++. This series of three blog posts investigates these topics.

In this first post, we will learn how to call Julia code from Python. We will set up the environment and show some examples. In the second post, we will take a problem that was solved using Python in combination with C++ to speed up the code. We will replace the C++ code with Julia code and compare the performance. In the third post, we will solve the same problem in Julia, optimize the Julia code to reach its maximum performance, and compare it with the implementation from the second post.

What is Julia, and how does it compare to Python?

What is Julia? Julia is a high-performance, high-level programming language. It was created a few years ago with the ambitious goal of being fast with a high-level syntax, and it has been mostly successful. It can, in a few cases, reach the speed of low-level programming languages like C. For more information, the julialang.org site is a great first stop.

One of the most frequently asked questions is: “how does it compare to Python or some other programming language in terms of performance?”. The short answer: Julia is generally faster than Python and many other programming languages.

The performance of Python code can be optimized, but even optimized code usually underperforms compared to a Julia version of the same code. A frequent way to increase the performance of Python code is to call code written in a low-level language, such as C, C++ or Fortran; NumPy, for example, does most of its calculations in such low-level code. What is less common, but also possible, is to call Julia from Python. In this post, we are going to show you how to do that!

Before we forget, all the code used in this post can be found in our GitHub repository. We have also created a Docker image that includes a ready-to-use environment to run both Julia and Python. To run that environment with Python 3.10 and Julia 1.6, install Docker and run the following in your terminal:

$ docker pull abelsiqueira/python-and-julia:py3.10-jl1.6
$ docker run -it abelsiqueira/python-and-julia:py3.10-jl1.6 /bin/bash

Preparation

In the following steps, we will configure our system to execute Julia code from Python. To learn more about this topic, the documentation for the packages we describe below is a great starting point. You will need four things:

  1. A Python distribution compiled with the shared libpython option. There are workarounds, but this is the most straightforward way.
  2. Julia, the executable that runs the Julia language.
  3. PyCall, the Julia package that defines the conversions between Julia and Python.
  4. PyJulia, the Python package to access Julia from Python.

We are going to go through the installation and configuration of these steps on a Linux system. The process will be very similar on macOS, or on Windows with WSL, once the required tools are installed.

Step 1: Python with shared libpython

To check whether the Python distribution is compiled with the --enable-shared option, we run:

$ ldd $(which python3) | grep libpython

If the output is something like:

libpython3.10.so.1.0 => /usr/local/lib/libpython3.10.so.1.0 (0x00007f567e548000)

… then we are good to go! If we get nothing, the Python distribution has not been compiled with the desired flag. In this case, we can compile our own Python distribution with the flag --enable-shared, which takes some time but is mostly straightforward. This Dockerfile has the instructions. Remember that if you just want to test things out, you can run the Docker image mentioned in the previous section.

Step 2: Julia and PyCall

Now, we will install Julia. We recommend using jill, a script I created, which downloads and installs a specific version of Julia, but Julia can also be installed via the official binaries or package managers. In this post, we use version 1.6.5, the Long Term Support version at the time of writing; most likely this will also work with newer versions. To install Julia 1.6.5 using jill, we run:

$ wget https://raw.githubusercontent.com/abelsiqueira/jill/v0.4.0/jill.sh
$ sudo bash jill.sh -y -v 1.6.5

Now, we will install PyCall and configure it to use the correct Python version. We start Julia by running julia in the terminal, and then we set the ENV["PYTHON"] variable:

$ julia
julia> ENV["PYTHON"] = "PATH/TO/python"

Here we use the full path to the Python executable; in our case, it is the Python distribution we compiled from source. Change the path according to your configuration.

Now, we will install PyCall using Pkg, Julia’s package manager:

julia> using Pkg
julia> Pkg.add("PyCall")

Step 3: PyJulia

As the last step, we must install the Python package to talk with Julia. First, use pip, Python’s package manager, to install the package PyJulia — remember to use the same Python passed to ENV["PYTHON"]:

$ python3 -m pip install julia

To finalize configuring the communication between Julia and Python, we run the following in the Python interpreter:

$ python3
>>> import julia
>>> julia.install()

If we had more than one Julia version on our system, we could specify it with an argument:

>>> julia.install(julia='julia-1.6.5')

We test the installation by running the following in the Python interpreter:

>>> from julia import Main
>>> Main.eval('[x^2 for x in 0:4]')
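If everything is set up correctly, this returns the squares of 0 through 4; on the Python side, PyCall typically converts the Julia array to a NumPy array.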

Showcasing PyJulia

Basics

  • To use a Julia module, use from julia import MODULE
  • To evaluate a command, import Main and use Main.eval("…")
  • To create and use variables, use Main.VARIABLE
  • To install Julia packages, import Pkg and use Pkg.add("Package")
  • Use %load_ext julia.magic to add an IPython magic command called %julia. Just prepend %julia to Julia commands. In this case, use $var to access Python variables (see the sketch below)
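As a minimal sketch tying most of these together (run in a plain Python session; the variable names are ours):

>>> from julia import Main
>>> Main.eval('1 + 1')            # evaluate a Julia expression
2
>>> Main.xs = [1.0, 2.0, 3.0]     # create a variable in the Julia scope
>>> Main.eval('sum(xs)')          # use it from Julia code
6.0
>>> from julia import Statistics  # import a Julia (standard library) module
>>> Statistics.mean(Main.xs)
2.0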

Example: Linear Algebra

In this short example, we can see one of the strengths of Julia’s syntax for linear algebra. A random linear system is created and solved, and the result is checked with NumPy to demonstrate the compatibility.

We have chosen to define A, b and x in three different ways, to show the different syntaxes. The definition of A happens completely inside the eval block; the variable A is created and is available inside the Julia scope, or as Main.A. The definition of b uses the Main.b access directly, assigning it the result of Main.eval. Finally, x is defined with %julia, the magic IPython command that lets us use Julia syntax directly.
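The code itself is embedded as a gist in the original post and is lost in this re-post; below is a minimal sketch of the idea in an IPython session (the 100×100 size is our choice):

In [1]: %load_ext julia.magic
In [2]: import numpy as np
In [3]: from julia import Main
In [4]: Main.eval("A = rand(100, 100)")  # A defined entirely inside the eval block
In [5]: Main.b = Main.eval("rand(100)")  # b assigned through Main.b
In [6]: %julia x = A \ b                 # x defined with the %julia magic
In [7]: np.allclose(np.linalg.solve(Main.A, Main.b), Main.x)
Out[7]: True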

We can quickly compare the timing of solving the system with Julia’s backslash command and Numpy’s linalg.solve:
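The benchmark itself is also lost in the re-post; a sketch of such a comparison, continuing the session above (the double backslash escapes the backslash inside the Python string):

In [8]: %timeit Main.eval("A \\ b")
In [9]: %timeit np.linalg.solve(Main.A, Main.b)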

Example: Automatic differentiation

The next example installs and uses the package ForwardDiff, which performs automatic differentiation. Internally, ForwardDiff defines a Julia type called Dual, so we cannot use it with Python functions, which are not compatible with that type. However, we can define Julia functions and use those.

The local minimum of the quadratic occurs at 2.5, so the derivative at 2.5 is 0.0.
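The gist with the code is missing from this re-post as well; a minimal sketch, assuming the quadratic f(x) = (x - 2.5)^2 (the exact function is our guess; any quadratic with its minimum at 2.5 works):

>>> from julia import Pkg
>>> Pkg.add("ForwardDiff")
>>> from julia import Main
>>> Main.eval("using ForwardDiff")
>>> Main.eval("f(x) = (x - 2.5)^2")  # a Julia function, so ForwardDiff can use its Dual type
>>> Main.eval("df(x) = ForwardDiff.derivative(f, x)")
>>> Main.df(2.5)
0.0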

Another, more interesting, interaction is below, in which we create a function g inside Julia and define functions for its derivatives there. Then we create a Python function with the Taylor expansion around a value a. Furthermore, we use Matplotlib, Python’s plotting library, to visualize the results coming from Julia. Pretty neat, right?
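That code is also embedded as a gist in the original post; here is a sketch under assumptions of our own (the function g and the expansion point a are arbitrary choices):

import numpy as np
import matplotlib.pyplot as plt
from julia import Main

Main.eval("""
using ForwardDiff
g(x)   = exp(-x) * sin(x)              # the function, defined in Julia
dg(x)  = ForwardDiff.derivative(g, x)  # its derivatives, also in Julia
d2g(x) = ForwardDiff.derivative(dg, x)
d3g(x) = ForwardDiff.derivative(d2g, x)
""")

a = 0.5  # expansion point

def taylor3(x):
    # Third-order Taylor expansion of g around a, assembled in Python
    # from derivatives computed by Julia
    d = x - a
    return (Main.g(a) + Main.dg(a) * d
            + Main.d2g(a) * d**2 / 2 + Main.d3g(a) * d**3 / 6)

xs = np.linspace(a - 2.0, a + 2.0, 200)
plt.plot(xs, [Main.g(x) for x in xs], label="g")
plt.plot(xs, [taylor3(x) for x in xs], "--", label="3rd-order Taylor")
plt.legend()
plt.show()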

Caption: The function g and its third-order Taylor approximation.

Next episodes

Now that we can call Julia code in Python, we are prepared to move on to our next adventure: improving the speed of Python code by calling Julia from it. Follow our Medium account to get notified when Part 2 goes live.

Many thanks to our proofreaders and reviewers, Elena Ranguelova, Jason Maassen, Jurrian Spaaks, Patrick Bos, Rob van Nieuwpoort, Stefan Verhoeven, and Veronica Pang.


How to call Julia code from Python was originally published in Netherlands eScience Center on Medium.

Django 3.2 Async Support: What You Need to Know

By: Logan Kilpatrick

Re-posted from: https://logankilpatrick.medium.com/django-3-2-async-support-what-you-need-to-know-5864a7b0259c?source=rss-2c8aac9051d3------2

Since the release of Django 3.0 in December of 2019, the wildly popular web framework has begun the journey to support Async through ASGI…

Writing type-stable Julia code

By: Philippe Mainçon

Re-posted from: https://blog.sintef.com/industry-en/writing-type-stable-julia-code/

This text presents type stability, which is one of the important concepts that one needs to understand in order to write high-performance Julia code.

This text is aimed at Julia users who are familiar with composite types, abstract types, functions, methods and multiple dispatch. At the same time, as little advanced Julia syntax as possible is used, to keep the text accessible.

To type, or not to type

The developers of Julia wanted to solve the two-language problem. They have achieved this and produced a language that “walks like Python and runs like C”. Julia “walks like Python”, because it is not necessary to systematically define the type of every variable that appears in the code. It “runs like C” because it is a compiled language, and produces (or rather, can produce) highly efficient machine code.

Python and MATLAB are examples of interpreted languages. In a purely interpreted language, the types of variables are computed at run time, at the same time as their values. As long as the values of the inputs at the top level (in the REPL or the top script) are known, the interpreter infers, step by step, the types of the variables, all the way down the call stack. This makes it possible to write functions without specifying types, which in turn allows generic code (for example, an iterative solver that works just as well with Float64 and Float32 variables). The disadvantage is that inferring the types of variables on the fly introduces significant overhead at run time.

At the other end of the scale, C and Fortran are examples of strictly typed compiled languages. Because the source code specifies the type of every variable in a function (both the variables in the function interface and the local variables), the compiler can create efficient machine code for each function by considering the code of that function alone. The disadvantage is that type declarations take time to write and clutter the source code, and (unless the language offers “templates”, as C++ does) it may be necessary to write several methods, identical in all but the types of their variables, to make an algorithm available to various data types.

Julia’s approach to type specifications

Julia hits a sweet spot in between, not requiring the type of each variable to be specified, yet producing fast machine code. The trick is as follows: every time a method is called (so, at run time) with a combination of concrete argument types that has not yet been encountered for this method, the compiler kicks in. A “concrete type” is the information returned by typeof() when called on a variable; one example is Float64. This is as opposed to an abstract type, like Real, which describes a set of concrete types, and includes Float64 and Float32. In the rest of this text, “type” will refer to “concrete type”.

The compiler now has the source code of the method and the types of all the arguments. It will produce a method instance (or instance, for short), which is machine code for this combination. One interesting implication is that writing strictly typed method interfaces in Julia does not improve machine-code performance: the compiler takes the types of the arguments from the calling context anyway. A strictly typed interface has the disadvantage of offering no flexibility: a method that only accepts a Vector will not accept other vector-like things, such as a SubArray (an array view), an Adjoint (a transposed array), a SparseMatrix or a StaticArray, even though the method probably implements an algorithm that would compile perfectly well for all of these.

However, providing partial specification of the type of the arguments of a method serves important purposes in Julia:

  1. If a function has several methods, it allows specifying which method should be executed (multiple dispatch). This is where abstract types like Real, AbstractVector and AbstractVector{<:Real} come into their own (see the sketch after this list).
  2. It improves code readability, stating for example “this method expects some vector of some real numbers – but not a string”.
  3. It provides more graceful failures: “function foo has no method that takes in a string” is more informative than some esoteric failure down the line when attempting to add two strings.
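A small Julia illustration of all three points (the function norm1 is our own example):

norm1(v::AbstractVector{<:Real}) = sum(abs, v)  # dispatch on the argument type

norm1([1.0, -2.0, 3.0])  # works for a Vector{Float64}
norm1(1:3)               # also works for a UnitRange{Int64}
norm1("hello")           # fails early: no method matching norm1(::String)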

What is type stability?

If the source code of the method is well written, the source code and the concrete types of all the arguments are enough information for the compiler to infer the concrete type of every variable and expression within the method. The method is then said to be “typestable”, and the Julia compiler will produce efficient code.

If, for a variety of reasons that will be studied in the following, the type of a local variable cannot be inferred from the types of the arguments, the compiler will produce machine code full of “if”s, covering all options of what the type of each variable could be. The loss in performance is often significant, easily by a factor of 10.

If you are yourself able to infer the type of every local variable and every expression in a method (or script) from the types (not the values) of the arguments, or from constants in the code, the function will be typestable. Actually, as will be seen below, this inference of types also has access to struct declarations, and to the types of the return values of functions called by the function you are studying.

The rest of this text will examine a variety of situations, ranging from obvious to more tricky, in which it is not possible to infer the types of local variables from the types of the arguments, resulting in type instability.

For this purpose, it will be useful to write down the information available to the compiler. So for example, if the method

function add(a::Number,b::Number)
    c = a+b
    return c
end

is called with a of type Float64 and b of type Int32, then we will write the information available to the compiler to create an instance as

instance add(a::Float64,b::Int32)
    c = a+b
    return c
end

instance is not Julia syntax; it is just a notation introduced in this text to describe an instance. In such an instance description, a concrete type must be associated with every argument.


If, then

Consider the following method instance

instance largest(a::Float64,b::Int64)
    if a > b
        c = a
    else
        c = b    
    end
    return c
end

The variable c will be set to either a or b, taking both the value and the type of that argument. The type of c thus depends on the outcome of the operation a > b on the values of a and b: the type of c cannot be inferred from the types of the arguments alone, and this code is not typestable.

Several approaches might be relevant to prevent type instability. The simplest is to code largest so that it only accepts two arguments of the same type.

function largest(a::R,b::R) where{R<:Real}
    if a > b
        c = a
    else
        c = b    
    end
    return c
end

The method is general: it can result in the generation of method instances like instance largest(a::Float64,b::Float64), instance largest(a::Int64,b::Int64) and many others. It cannot result in the generation of machine code for instance largest(a::Float64,b::Int64), because R cannot be both Float64 and Int64. If we need to handle variables of different types, yet want type stability, a solution is to use promotion to ensure that c always has the same type.

function largest(a,b)
    pa,pb = promote(a,b)
    if a > b
        c = pa
    else
        c = pb  
    end
    return c
end

promote is defined so that pa and pb have the same type, and this type is inferred from the types of a and b. For example, for a call instance largest(a::Float64,b::Int64), the types of pa, pb and c will be Float64, to which one can convert an Int64 variable without loss of information (well, mostly).
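A quick check in the REPL:

julia> promote(1.5, 2)
(1.5, 2.0)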

Do not allow an if-then construct to return a variable whose type depends on the branch taken.

Method return value

A method foo that calls the first, non-typestable version of largest above receives as output a variable whose type is value dependent: foo itself is then not typestable. The workaround here is to create typestable methods for largest, as suggested above.

One example is the method Base.findfirst(A) which, given a Vector{Bool}, returns the index of the first true element of the vector. The catch is that if all the vector’s elements are false, the method returns nothing. nothing is of type Nothing, while the index is of type Int64. Using this method will make the calling method not typestable.
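For example:

julia> findfirst([false, true, false])
2

julia> findfirst([false, false])  # returns nothing, of type Nothing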

Avoid methods that return variables of value-dependent types.

Array of abstract element type

Consider the following code

v = [3.,1,"Hello world!"]
function showall(v)
    for e ∈ v
        @show e
    end
end
showall(v)

The above call showall(v) generates a method instance

instance showall(v::Array{Any,1})
    for e ∈ v
        @show e
    end
end

The concrete type of e cannot be inferred from Array{Any,1}, because Any is not a concrete type. More specifically, the type of e can change from one iteration to the next: the code is not typestable. If v is of type Array{Any,1}, even if its elements all happen to have the same type, this does not help:

v = Vector{Any}(undef,3)
v[1] = 3.
v[2] = 1.
v[3] = 3.14
showall(v)

e may have the same type at each iteration, but this type still cannot be inferred from the type Array{Any,1} of the argument.

If we define w = randn(3), w has type Array{Float64,1}. This is much more informative: every element of w is known to have the same concrete type Float64. Hence the call showall(w) generates a method instance

instance showall(v::Array{Float64,1})
    for e ∈ v
        @show e
    end
end

and the compiler can infer that e is a Float64.

Wherever possible use arrays with a concrete element type.

Sometimes, the use of an array with an abstract element type is deliberate. One may really wish to iterate over a heterogeneous collection of elements and apply various methods of the same function to them: we design for dynamic dispatch, and must accept that the process of deciding which method to call takes time. Two techniques can be used to limit the performance penalty.

The first is the use of a “function barrier”: the loop over the heterogeneous array should contain as little code as possible, ideally only the access to the array’s elements and the call to a method.

for e ∈ v
    foo(e)
end

If v contains elements of different types, the loop is not typestable and hence slow. Yet the value of e at each iteration has its own concrete type, for which an instance of foo will be generated: foo itself can be made typestable and fast.

The second, a further improvement on the first, is to group elements by concrete type, for example using a heterogeneous array of homogeneous arrays.

function innerloop(v)
    for e ∈ v
        foo(e)
    end
end

vv = [[1.,2.,3.],[1,2]]
for v ∈ vv  # outer loop
    innerloop(v)
end

Here vv is an Array{Any,1}, containing two vectors of different types: vv[1] is an Array{Float64,1} and vv[2] is an Array{Int64,1}. Function innerloop is called twice, and two instances are generated

instance innerloop(v::Array{Float64,1})
    for e ∈ v  # e is Float64
        foo(e)
    end
end
instance innerloop(v::Array{Int64,1})
    for e ∈ v  # e is Int64
        foo(e)
    end
end

and in both instances, the type of e is clearly defined: the instances are typestable.

The advantage of this second approach is that the loop for v ∈ vv has few iterations (provided the number of types is small compared to the number of elements of each type).

Structures with fields of abstract type

A similar loss of type stability arises when reading data from structures that have a field of abstract type:

struct SlowType
    a
end
struct JustAsBad
    a::Real
end
struct MuchBetter
    a::Float64
end
function show_a(s)
    @show s.a
end
show_a(SlowType(3.))
show_a(JustAsBad(3.))
show_a(MuchBetter(3.))

The first call to show_a generates

instance show_a(s::SlowType)
    @show s.a # The concrete type of field a of type SlowType cannot be inferred from the definition of SlowType
end

The second call to show_a has the same problem. The third call generates a typestable instance

instance show_a(s::MuchBetter)
    @show s.a # That's a Float64
end

It is often interesting to create structures with fields that can have various types. A classic example is Julia’s Complex type, whose real and imaginary components can be both Float64, both Float32, or other more exotic choices. This can be done without losing type stability by using parametric types:

struct FlexibleAndFast{R}
    a::R
end
show_a(FlexibleAndFast(3.))
show_a(FlexibleAndFast(3))

The above calls generate two typestable instances of show_a

instance show_a(s::FlexibleAndFast{Float64})
    @show s.a # That's a Float64
end
instance show_a(s::FlexibleAndFast{Int64})
    @show s.a # That's an Int64
end

Always use structs with fields of concrete types. Use parametric structures where necessary.

A note on constructors for parametric types

Consider a struct definition without an inner constructor:

struct MyType{A,B}
    a::A
    b::B
end

Julia will automatically generate a constructor method with signature

MyType{A,B}(a::A,b::B)

Julia will also produce another method with signature

MyType(a::A,b::B)

because for MyType, it is possible to infer all type parameters from the types of the inputs to the constructor. Other constructors like

MyType{A}(a::A,b::B)

have to be defined explicitly (how should the compiler decide whether to interpret a single type-parameter input as A or as B?). One possible definition is sketched below.
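A sketch of such an explicit definition for the first example, choosing to interpret the single parameter as A (our own code):

MyType{A}(a, b::B) where {A,B} = MyType{A,B}(convert(A, a), b)

MyType{Float64}(1, 2)  # gives MyType{Float64, Int64}(1.0, 2)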

Consider another example:

struct MyType{A,B,C}
    a::A
    b::B
end

Julia will automatically generate a constructor method with signature

MyType{A,B,C}(a::A,b::B)

but will not generate other methods. A method like

MyType{C}(a::A,b::B)

would have to be defined explicitly.

StaticArray

Julia Arrays are an example of a parametric type, where the parameters are the type of the elements and the dimension (the number of indices). Importantly, the size of the array is not part of the type; it is part of the value of the array.

The package StaticArrays.jl provides the type StaticArray, useful for avoiding another performance problem: the garbage collection that follows the allocation of Arrays on the heap. StaticArrays are allocated on the stack, simplifying runtime memory management.

using StaticArrays
SA = SVector{3,Float64}([1.,2.,3.])
SA = SVector(1.,2.,3.)
SA = SVector([1.,2.,3.])

The first call to SVector is typestable: all the information needed to infer the type of SA is provided in the curly braces. The second call is typestable too, because the compiler can deduce the same information from the type and number of inputs. The third call is problematic: while the type of the elements of SA can be inferred by the compiler, the length of [1.,2.,3.] is part of this array’s value, not of its type. The type of SA thus has a parameter that depends on the value (the size) of the argument passed to the constructor. Not only does this generate an instance of the constructor that is not typestable, but the non-inferable type of SA “contaminates” the calling code with type instability.

Val

What if we want to write a function that takes a Vector as input, processes it (for example, just keeps it as it is), and returns an SVector of the same shape? Of course, we want this function to be general, not limited to a given array size, and we want it to be typestable, for good performance.

First attempt:

function static(v::Vector)
    return SVector{length(v),eltype(v)}(v)
end

This function is not typestable. It constructs a variable of type SVector{3,Float64}, where 3 is obtained as the length of v, and the length is part of the value of an Array. Value-to-type alarm!

One possible solution is to use Val. Let us say that static is called by a function foo within which the length of v can be inferred at compile time. We could create the following code

function static(v,::Val{L}) where{L}
    return SVector{L,Float64}(v)
end
function foo()
    Val3 = Val(3)
    Val4 = Val(4)
    @show static([1.,2.,3.]   ,Val3)
    @show static([1.,2.,3.,4.],Val4)
end

The call Val(3) generates a variable of type Val{3}. Clearly, Val as a function is not typestable, since it creates a variable whose type depends on the value of its argument.

However, function foo is typestable. This may come as a surprise, but two things conspire to allow this:

  1. The source code of foo explicitly mentions the constants 3 and 4, and the compiler has access to it.
  2. The compiler is greedy – it evaluates at compile time whenever possible. Hence the call Val(3) is evaluated during compilation, and Val3 is known to the compiler to be a value-empty variable of type Val{3}.

In foo, the method static is called twice, leading to the generation of two typestable instances

instance static(v,::Val{3})  
    return SVector{3,Float64}(v)
end
instance static(v,::Val{4})
    return SVector{4,Float64}(v)
end

What if the length of the vectors is not defined as a constant in foo? If this length is the result of some computation, the call to Val will not be typestable. If foo is high enough in the call hierarchy, and outside any time-critical loop, this is not an issue: only foo will not be typestable, but the functions it calls can still be typestable (cf. the function-barrier pattern).

Val makes it possible to move type instability up the call hierarchy, or to eliminate it altogether.

Anonymous function

Before Julia 1.6.0, the output type of an anonymous function was known to the compiler only as Any. As a consequence, the following code was not typestable:

function foo(f)
    @show f(randn())
    return
end
foo(x->0)

However, the following code was:

function f(x)
    return 0
end
foo(f)

This is not an issue with Julia 1.6, and anonymous functions can be used.

@code_warntype

One important tool to check that an instance is typestable is the macro @code_warntype. For example

v = randn(3)
@code_warntype Val(length(v))
@code_warntype static(v,Val(length(v)))

The first invocation of @code_warntype outputs semi-compiled code and highlights some of the types in red: the call Val(3) is not typestable. The second invocation produces output in which all types are highlighted in blue: the call to static is typestable. Note that @code_warntype only analyses the compilation of the outermost function, static, given the arguments v and Val(length(v)).

Profiler.jl

Profiler.jl provides a graphical representation of where processor time goes, in which code that is not typestable is highlighted. Output from the profiler often shows how type instability propagates: a single variable that is not typestable makes “anything it touches” type unstable.