Re-posted from: https://bkamins.github.io/julialang/2021/02/19/binding.html
Introduction
Today I decided to write about a feature of Julia that is well known to people
working with it a lot, but which often triggers questions from experienced
people switching to Julia from other languages.
The topic of this post is why this code fails:
julia> function f()
for i in 0:3
if i == 0
v = 0
else
v += 1
end
end
end
f (generic function with 1 method)
julia> f()
ERROR: UndefVarError: v not defined
and what to do to fix the problem.
The post was written under Julia 1.5.3.
The root cause
The reason why people ask this question is that e.g. in Python the following code
runs without any problem:
>>> def f():
... for i in range(4):
... if i == 0:
... v = 0
... else:
... v += 1
... return v
...
>>> f()
3
So what is the root cause of this difference? The reason is that in Julia
for
loop creates a new scope for the variables that are not present in the
enclosing scope (i.e. variables local to the for loop do not leak out).
Moreover, as is explained here in the Julia manual such variables
get a new binding in each iteration of the loop. So in our example although we
set v = 0
in the first iteration this value is not retained in the following
iterations of the loop.
Fixing the problem
Fortunately it is easy to fix the problem in case you wanted v
to behave
differently. Just define a local
variable in the enclosing scope like this:
julia> function f()
local v
for i in 0:3
if i == 0
v = 0
else
v += 1
end
end
return v
end
f (generic function with 1 method)
julia> f()
3
Why do we need a fresh binding of loop-local variables in each iteration?
If you thought that what Julia did in the topmost example was strange then
consider which of the following examples you find surprising (I use
comprehensions this time).
This is Python:
>>> l = [lambda : i for i in range(4)]
>>> for i in range(4):
... print(l[i]())
...
3
3
3
3
And this is Julia:
julia> l = [() -> i for i in 1:4];
julia> for i in 1:4
println(l[i]())
end
1
2
3
4
Sometimes things can get nasty
And what if we update a variable that is defined local
outside the loop?
Here are two examples:
julia> function g1()
l = []
j = 0
for i in 0:3
push!(l, () -> j)
j += 1
end
return l
end
g (generic function with 1 method)
julia> l1 = g1();
julia> for i in 1:4
println(l1[i]())
end
4
4
4
4
julia> function g2()
l = []
j = 0
for i in 0:3
push!(l, () -> j)
j = i + 1
end
return l
end
g (generic function with 1 method)
julia> l2 = g2();
julia> for i in 1:4
println(l2[i]())
end
4
4
4
4
So far we see what we would expect. Variable j
does not get a new binding
inside the loop as it is defined outside of it, so we have just reproduced the
behavior seen in Python.
However, how would you then explain this:
julia> function g3()
x = []
local j
for i in 0:3
j = i
push!(x, () -> j)
end
return x
end
g (generic function with 1 method)
julia> l3 = g3();
julia> for i in 1:4
println(l3[i]())
end
0
1
2
3
The reason is that in g1
and g2
Julia is boxing j
, while it does not in
g3
. Here are the consequences.
Consequence 1: impact on performance
Have a look at this test:
julia> function agg(fun)
s = 0
for i in 1:10^6
s += fun()
end
return s
end
agg (generic function with 1 method)
julia> agg(l1[1])
4000000
julia> @time agg(l1[1])
0.038266 seconds (999.87 k allocations: 15.257 MiB)
4000000
julia> agg(l3[1])
0
julia> @time agg(l3[1])
0.000007 seconds
0
And we see that closures created by g1
have a very bad performance, while
g3
gives us super fast closures.
Consequence 2: crazy things you can do
Since Julia is boxing j
in the case of g1
and g2
you can do the following:
julia> for i in 1:4
println(l1[i]())
end
4
4
4
4
julia> l1[1].j.contents = 100
100
julia> for i in 1:4
println(l1[i]())
end
100
100
100
100
Of course I do not recommend doing such things. By this example I just highlight
that indeed j
is boxed in this case.
Let us check:
julia> l1[1].j # boxed value
Core.Box(100)
julia> l3[1].j # just an Int
0
How we could have learned about this? You can use @code_warntype
to see what
is going on:
julia> @code_warntype g1()
Variables
#self#::Core.Compiler.Const(g1, false)
l::Array{Any,1}
j@_3::Core.Box
@_4::Union{Nothing, Tuple{Int64,Int64}}
i::Int64
#15::var"#15#16"
j@_7::Union{}
Body::Array{Any,1}
1 ─ (j@_3 = Core.Box())
│ (l = Base.vect())
│ Core.setfield!(j@_3, :contents, 0)
│ %4 = (0:3)::Core.Compiler.Const(0:3, false)
│ (@_4 = Base.iterate(%4))
│ %6 = (@_4::Core.Compiler.Const((0, 0), false) === nothing)::Core.Compiler.Const(false, false)
│ %7 = Base.not_int(%6)::Core.Compiler.Const(true, false)
└── goto #7 if not %7
2 ┄ %9 = @_4::Tuple{Int64,Int64}::Tuple{Int64,Int64}
│ (i = Core.getfield(%9, 1))
│ %11 = Core.getfield(%9, 2)::Int64
│ %12 = l::Array{Any,1}
│ (#15 = %new(Main.:(var"#15#16"), j@_3))
│ %14 = #15::var"#15#16"
│ Main.push!(%12, %14)
│ %16 = Core.isdefined(j@_3, :contents)::Bool
└── goto #4 if not %16
3 ─ goto #5
4 ─ Core.NewvarNode(:(j@_7))
└── j@_7
5 ┄ %21 = Core.getfield(j@_3, :contents)::Any
│ %22 = (%21 + 1)::Any
│ Core.setfield!(j@_3, :contents, %22)
│ (@_4 = Base.iterate(%4, %11))
│ %25 = (@_4 === nothing)::Bool
│ %26 = Base.not_int(%25)::Bool
└── goto #7 if not %26
6 ─ goto #2
7 ┄ return l
julia> @code_warntype g3()
Variables
#self#::Core.Compiler.Const(g3, false)
j::Int64
x::Array{Any,1}
@_4::Union{Nothing, Tuple{Int64,Int64}}
i::Int64
#19::var"#19#20"{Int64}
Body::Array{Any,1}
1 ─ Core.NewvarNode(:(j))
│ (x = Base.vect())
│ %3 = (0:3)::Core.Compiler.Const(0:3, false)
│ (@_4 = Base.iterate(%3))
│ %5 = (@_4::Core.Compiler.Const((0, 0), false) === nothing)::Core.Compiler.Const(false, false)
│ %6 = Base.not_int(%5)::Core.Compiler.Const(true, false)
└── goto #4 if not %6
2 ┄ %8 = @_4::Tuple{Int64,Int64}::Tuple{Int64,Int64}
│ (i = Core.getfield(%8, 1))
│ %10 = Core.getfield(%8, 2)::Int64
│ (j = i)
│ %12 = x::Array{Any,1}
│ %13 = Main.:(var"#19#20")::Core.Compiler.Const(var"#19#20", false)
│ %14 = Core.typeof(j)::Core.Compiler.Const(Int64, false)
│ %15 = Core.apply_type(%13, %14)::Core.Compiler.Const(var"#19#20"{Int64}, false)
│ (#19 = %new(%15, j))
│ %17 = #19::var"#19#20"{Int64}
│ Main.push!(%12, %17)
│ (@_4 = Base.iterate(%3, %10))
│ %20 = (@_4 === nothing)::Bool
│ %21 = Base.not_int(%20)::Bool
└── goto #4 if not %21
3 ─ goto #2
4 ┄ return x
Conclusions
The post has started-off easy, but ended with some surprising behavior.
I hope you found it useful to better understand how Julia works and how to
diagnose things.
The major take aways are:
- basic: Julia creates new bindings for loop-local variables on each iteration;
- not-basic: if you are creating closures using local variables and need them to
be fast (and you probably do if you use Julia) always check if Julia compiler
was able to prove that it does not have to do boxing as it affects both the
behavior and the performance.