How much do collections of allocated objects cost?

By: Blog by Bogumił Kamiński

Re-posted from: https://bkamins.github.io/julialang/2021/06/11/vecvec.html

Introduction

In my recent comment on StackOverflow I have said that using a vector of
vectors is: a) slow, b) uses more memory than needed, and c) puts much more
stress on garbage collection. Having read it one of my students asked me to
expand on this issue. In this post I want to give you some examples that were
designed to clarify this issue.

The post was tested under Julia 1.6.1 on Linux with a machine having 16 GB of
RAM (the last point affects the frequency of triggering GC).

In the post we will compare the same operations for vector of vectors and
vector of tuples defined using the following functions:

test1(n) =[rand(2) for _ in 1:n]
test2(n) =[(rand(), rand()) for _ in 1:n]

The difference is that test1 creates a vector of references to dynamically
allocated objects, while in test2 the tuples are stored directly in the vector.

Let us test all three claims. In the comparisons I use a fresh session in
each code block (as the examples given use a lot of memory). This means that
the timings will include compilation time, but the size of the computations is
large enough so that this is relatively negligible.

Examples

Start with a vector of vectors:

julia> test1(n) =[rand(2) for _ in 1:n]
test1 (generic function with 1 method)

julia> @time t1 = test1(10^8);
 21.907014 seconds (100.00 M allocations: 9.686 GiB, 67.93% gc time)

julia> @time sum(x -> x[1], t1);
  0.768779 seconds (179.18 k allocations: 11.105 MiB, 7.44% compilation time)

julia> @time Base.summarysize(test1(10^7))
262.832256 seconds (50.06 M allocations: 2.864 GiB, 4.58% gc time, 0.02% compilation time)
640000040

julia> @time GC.gc()
  4.631474 seconds (100.00% gc time)

julia> @time GC.gc()
  2.647404 seconds (100.00% gc time)

julia> @time GC.gc(false)
  0.004821 seconds (99.89% gc time)

julia> @time GC.gc(false)
  0.005526 seconds (99.89% gc time)

And now vector of tuples (fresh Julia session):

julia> test2(n) =[(rand(), rand()) for _ in 1:n]
test2 (generic function with 1 method)

julia> @time t2 = test2(10^8);
  1.458052 seconds (25 allocations: 1.490 GiB, 0.36% gc time)

julia> @time sum(x -> x[1], t2);
  0.164072 seconds (178.23 k allocations: 11.036 MiB, 35.07% compilation time)

julia> @time Base.summarysize(test2(10^7))
  0.190496 seconds (24.27 k allocations: 153.937 MiB, 3.25% gc time, 17.60% compilation time)
160000040

julia> @time GC.gc()
  0.070696 seconds (99.99% gc time)

julia> @time GC.gc()
  0.102222 seconds (99.99% gc time)

julia> @time GC.gc(false)
  0.000517 seconds (99.13% gc time)

julia> @time GC.gc(false)
  0.000523 seconds (98.97% gc time)

As you can see:

  • creation of vector of vectors is much slower; in particular a lot of small
    allocations happens (which is expensive) and in total also more memory is
    allocated.
  • a simple aggregation with sum is also slower because for vector of vectors
    we have to go through references to objects (which takes time) which also
    means that this is less CPU cache friendly.
  • With Base.summarysize we can check that using vectors also uses up much more
    memory; also as a side issue we learn that functions like Base.summarysize
    which traverse the tree of object references are much, much slower for a
    vector of vectors.
  • Finally both full sweep GC.gc() and incremental sweep GC.gc(false) are
    slower with vector of vectors; this is especially visible for full sweep case
    (fortunately it is triggered less often in normal usage). The important thing
    to note here is that for garbage collection time to be affected it is enough
    that the vector of vectors is somewhere in the memory; it does not have to be
    used in some operation you do.

Conclusions

The conclusion is something that seasoned Julia developers know very well:
avoid having many small allocated objects in your Julia programs. Having read
this post I hope you now have a better understanding what aspects of the
performance of your code can be affected when you have to use such data.

Before I finish let me add that collections of any mutable objects will be
affected by the same issue, and even some special immutable objects (like
String, or to some extent e.g. Symbol) can cause issues like presented
in this post.