Testing push! on Julia 1.11

By: Blog by Bogumił Kamiński

Re-posted from: https://bkamins.github.io/julialang/2024/06/21/push.html

Introduction

Currently Julia 1.11 is being in its beta testing phase.
One of the changes it introduces is redesign of internal representation of arrays.
This redesign, from the user perspective, promises to speed up certain operations.
One of the common ones that I use often is push!. Therefore today I decided to benchmark it.

The tests were performed under Julia 1.11.0-beta2 and Julia 1.10.1. The benchmarks use BenchmarkTools.jl 1.5.0.

The test

Here is the function we are going to use for our tests:

using BenchmarkTools

function test(n)
    x = Int[]
    for i in 1:n
        push!(x, i)
    end
    return x
end

This is the most basic test of the performance of push! operation.
I want to check the performance for various numbers of push! operations.

Let us run the tests first under Julia 1.11.0-beta2:

julia> @benchmark test(100)
BenchmarkTools.Trial: 10000 samples with 849 evaluations.
 Range (min … max):  129.800 ns …   1.682 μs  ┊ GC (min … max):  0.00% … 85.46%
 Time  (median):     194.582 ns               ┊ GC (median):     0.00%
 Time  (mean ± σ):   232.033 ns ± 125.847 ns  ┊ GC (mean ± σ):  15.55% ± 19.08%

 Memory estimate: 1.94 KiB, allocs estimate: 4.

julia> @benchmark test(10_000)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  14.400 μs …   6.167 ms  ┊ GC (min … max):  0.00% … 97.67%
 Time  (median):     30.300 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):   51.585 μs ± 148.402 μs  ┊ GC (mean ± σ):  21.45% ± 10.84%

 Memory estimate: 326.41 KiB, allocs estimate: 14.

julia> @benchmark test(1_000_000)
BenchmarkTools.Trial: 808 samples with 1 evaluation.
 Range (min … max):  2.993 ms … 90.221 ms  ┊ GC (min … max):  0.00% … 95.38%
 Time  (median):     4.674 ms              ┊ GC (median):    19.53%
 Time  (mean ± σ):   6.176 ms ±  8.774 ms  ┊ GC (mean ± σ):  35.69% ± 20.33%

 Memory estimate: 17.41 MiB, allocs estimate: 24.

julia> @benchmark test(100_000_000)
BenchmarkTools.Trial: 6 samples with 1 evaluation.
 Range (min … max):  808.177 ms …   1.020 s  ┊ GC (min … max):  9.38% … 26.13%
 Time  (median):     959.266 ms              ┊ GC (median):    25.01%
 Time  (mean ± σ):   944.380 ms ± 81.448 ms  ┊ GC (mean ± σ):  22.25% ±  6.41%

 Memory estimate: 2.95 GiB, allocs estimate: 42.

Now the same tests under Julia 1.10.1:

julia> @benchmark test(100)
BenchmarkTools.Trial: 10000 samples with 199 evaluations.
 Range (min … max):  359.296 ns …  10.699 μs  ┊ GC (min … max): 0.00% … 82.66%
 Time  (median):     923.116 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   959.401 ns ± 347.833 ns  ┊ GC (mean ± σ):  2.21% ±  6.25%

 Memory estimate: 1.92 KiB, allocs estimate: 4.

julia> @benchmark test(10_000)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):   47.700 μs …   4.938 ms  ┊ GC (min … max): 0.00% … 94.66%
 Time  (median):     103.200 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   133.997 μs ± 158.472 μs  ┊ GC (mean ± σ):  7.43% ±  6.81%

 Memory estimate: 326.55 KiB, allocs estimate: 9.

julia> @benchmark test(1_000_000)
BenchmarkTools.Trial: 504 samples with 1 evaluation.
 Range (min … max):  6.729 ms … 88.534 ms  ┊ GC (min … max): 0.00% … 91.83%
 Time  (median):     8.773 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   9.924 ms ±  5.161 ms  ┊ GC (mean ± σ):  7.30% ±  9.24%

 Memory estimate: 9.78 MiB, allocs estimate: 14.

julia> @benchmark test(100_000_000)
BenchmarkTools.Trial: 4 samples with 1 evaluation.
 Range (min … max):  1.184 s …   1.394 s  ┊ GC (min … max): 8.36% … 6.56%
 Time  (median):     1.275 s              ┊ GC (median):    7.46%
 Time  (mean ± σ):   1.282 s ± 86.217 ms  ┊ GC (mean ± σ):  6.89% ± 5.29%

 Memory estimate: 1019.60 MiB, allocs estimate: 23.

Conclusions

From the tests we can see that:

  • The new implementation in Julia 1.11 is faster for various values of n. This is very nice.
  • The new implementation in Julia 1.11 does more allocations, has higher memory estimate, and, in consequence spends more time in garbage collection. This means that in cases when available RAM is scarce the code performance could be affected.