Package Governance Tools on JuliaHub: Private Registries, Analytics and Policies

Re-posted from: https://info.juliahub.com/blog/high-performance-computing-for-government-innovation-0

Private Registries, Analytics and Policies on JuliaHub

The Julia language was designed from the ground up to take the best parts of other languages, including their development best practices and concepts, and to improve upon those capabilities. Many of the ecosystem tools growing up around the Julia language are meant to accelerate the methodology of languages created decades ago. Consequently, our package ecosystem, package management methods, private registries, security, policies, and analytics were all designed with this in mind.

IdSet in Julia 1.11

By: Blog by Bogumił Kamiński

Re-posted from: https://bkamins.github.io/julialang/2024/06/28/idset.html

Introduction

Julia 1.11 is now in its RC1 phase.
One small but important addition in this release is that IdSet becomes a public type.
Today I want to discuss when this type is useful.

The code was tested under Julia 1.11 RC1.

Equality in Julia

There are three basic ways to test equality in Julia:

the == operator;
the isequal function;
the === operator.

I have ordered these comparison operators by their level of strictness.

The == operator is the loosest. It can return true, false, or missing. The missing value is returned if any of the compared values is missing (or recursively contains a missing value). For floating-point numbers it assumes that 0.0 is equal to -0.0 and that NaN is not equal to NaN. Let us see the last case in action, as it can be surprising if you have never seen it before:

julia> NaN == NaN
false
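
The other == rules described above can be checked in the same way. The following is a minimal sketch; the variable names r1 to r4 are only illustrative and are not part of the original post:

```julia
# == propagates missing instead of returning a Bool
r1 = (1 == missing)                      # missing
r2 = ([1, missing] == [1, missing])      # missing: a nested missing makes the result unknown
r3 = ([1, missing] == [2, missing])      # false: 1 != 2 already settles the comparison
# == treats positive and negative zero as equal
r4 = (0.0 == -0.0)                       # true

println((r1, r2, r3, r4))
```

Because == can return missing, using its result directly in an if condition raises an error when missing values are involved.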

Next is isequal, which is stricter. It is guaranteed to return true or false. It treats all floating-point NaN values as equal to each other, treats -0.0 as unequal to 0.0, and treats missing as equal to missing. It compares objects by their value, not by their identity. So, for example, two different vectors having the same contents are considered equal:

julia> v1 = [1, 2, 3]
3-element Vector{Int64}:
 1
 2
 3

julia> v2 = [1, 2, 3]
3-element Vector{Int64}:
 1
 2
 3

julia> isequal(v1, v2)
true
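
The special cases listed above for isequal can be verified with a short sketch like the one below. The variable names are illustrative only:

```julia
# isequal always returns a Bool, also for NaN, signed zeros and missing
a = isequal(NaN, NaN)             # true: all NaN values are treated as equal
b = isequal(0.0, -0.0)            # false: signed zeros are distinguished
c = isequal(missing, missing)     # true: missing is equal to missing
# the same rules apply element by element inside containers
d = isequal([1.0, NaN, missing], [1.0, NaN, missing])   # true

println((a, b, c, d))
```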

Finally we have ===, which is the strictest. It returns true or false, and true is returned if and only if the compared values are indistinguishable. They must have the same type. If their types are identical, mutable objects are compared by address in memory and immutable objects (such as numbers) are compared by contents at the bit level. Therefore the v1 and v2 vectors we created above are not equal when compared with ===:

julia> v1 === v2
false

You might ask about NaN, which we discussed above. Here the situation is more complicated: under === two NaN values can be equal or not equal. Since NaN values are immutable, === compares them at the bit level. So we have:

julia> Float16(NaN) == Float32(NaN)
false

julia> isequal(Float16(NaN), Float32(NaN))
true

julia> Float16(NaN) === Float32(NaN)
false

julia> Float16(NaN) == Float16(NaN)
false

julia> isequal(Float16(NaN), Float16(NaN))
true

julia> Float16(NaN) === Float16(NaN)
true

Thus, you have to be careful. Each of the three comparison methods I discussed has its uses, and it is well worth learning them.
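
To illustrate how two NaN values of the same type can still differ under ===, here is a minimal sketch. It relies on an assumption that is not part of the original text: unary minus flips the sign bit of a NaN, which yields a second NaN with a different bit pattern.

```julia
x = NaN
y = -NaN                       # assumption: negation flips the sign bit of the NaN

x_bits = bitstring(x)          # 64-character string of the bits of x
y_bits = bitstring(y)

same_bits   = (x_bits == y_bits)   # false: the sign bits differ
egal        = (x === y)            # false: === compares immutable values bit by bit
equal_value = isequal(x, y)        # true: isequal treats all NaN values as equal

println((same_bits, egal, equal_value))
```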

Sets in Julia

Standard sets in Julia, created using the Set constructor, use isequal to test for equality. Therefore we have:

julia> Set([v1, v2])
Set{Vector{Int64}} with 1 element:
  [1, 2, 3]

We see that v1 and v2 got de-duplicated because they are equal with respect to isequal since they have the same contents. This is often what the user wants.

However, sometimes we want to track actual objects (irrespective of their contents). This is especially important when working with mutable structures. In this case IdSet is useful:

julia> IdSet{Vector{Int}}([v1, v2])
IdSet{Vector{Int64}} with 2 elements:
  [1, 2, 3]
  [1, 2, 3]

Note that we needed to specify the type of the values stored in the IdSet. As an exception, IdSet() is allowed (without specifying the type of the stored objects), and in this case an empty IdSet{Any} is created.
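
The difference between the two set types also shows up in membership tests. Below is a minimal sketch. The `using Base: IdSet` line is an addition of mine so that the snippet also runs on Julia versions where IdSet is not exported, and should be harmless on Julia 1.11. The variable names are illustrative:

```julia
using Base: IdSet   # assumption: only needed on versions where IdSet is not exported

v1 = [1, 2, 3]
v2 = [1, 2, 3]

s   = Set([v1])
ids = IdSet{Vector{Int}}([v1])

in_set   = v2 in s      # true: Set compares by contents (isequal)
in_idset = v2 in ids    # false: IdSet compares by identity (===)
same_obj = v1 in ids    # true: v1 is the very object that was stored

push!(v1, 4)                 # mutate the stored vector
still_tracked = v1 in ids    # true: identity does not change when contents change

println((in_set, in_idset, same_obj, still_tracked))
```

The sketch deliberately does not look up v1 in the standard Set after mutating it, since changing the contents of an element stored in a hash-based set is not something to rely on.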

Conclusions

Now you might ask when IdSet is most useful in practice. I have needed it most often when working with nested mutable containers that could potentially contain circular references. In such cases, using IdSet allows you to easily keep track of the mutable objects already seen and avoid an infinite loop or stack overflow if you, for example, use recursion to work with such a deeply nested data structure.
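
As an illustration of this use case, here is a minimal sketch of a recursive traversal that uses an IdSet to remember containers it has already visited. The function count_containers and the variables a and b are hypothetical names introduced for this example, and the `using Base: IdSet` line is only there so that the snippet also runs on versions where IdSet is not exported:

```julia
using Base: IdSet   # assumption: only needed on versions where IdSet is not exported

# Count the distinct Vector{Any} containers reachable from x.
# `seen` tracks visited containers by identity, so cycles do not cause infinite recursion.
function count_containers(x, seen = IdSet{Any}())
    x isa Vector{Any} || return 0   # only containers are counted
    x in seen && return 0           # already visited: stop here
    push!(seen, x)
    total = 1
    for item in x
        total += count_containers(item, seen)
    end
    return total
end

a = Any[1, 2]
b = Any[a, 3]
push!(a, b)          # now a contains b and b contains a: a circular reference

n = count_containers(a)
println(n)           # 2: the containers a and b, each counted once
```

Identity-based tracking fits here because checking `x in seen` never has to look inside the container, so it does not matter that the contents refer back to themselves.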

Testing push! on Julia 1.11

By: Blog by Bogumił Kamiński

Re-posted from: https://bkamins.github.io/julialang/2024/06/21/push.html

Introduction

Julia 1.11 is currently in its beta testing phase.
One of the changes it introduces is a redesign of the internal representation of arrays.
This redesign, from the user perspective, promises to speed up certain operations.
One of the common ones that I use often is push!. Therefore today I decided to benchmark it.

The tests were performed under Julia 1.11.0-beta2 and Julia 1.10.1. The benchmarks use BenchmarkTools.jl 1.5.0.

The test

Here is the function we are going to use for our tests:

using BenchmarkTools

function test(n)
    x = Int[]
    for i in 1:n
        push!(x, i)
    end
    return x
end

This is the most basic test of the performance of the push! operation.
I want to check the performance for various numbers of push! operations.

Let us run the tests first under Julia 1.11.0-beta2:

julia> @benchmark test(100)
BenchmarkTools.Trial: 10000 samples with 849 evaluations.
 Range (min … max):  129.800 ns …   1.682 μs  ┊ GC (min … max):  0.00% … 85.46%
 Time  (median):     194.582 ns               ┊ GC (median):     0.00%
 Time  (mean ± σ):   232.033 ns ± 125.847 ns  ┊ GC (mean ± σ):  15.55% ± 19.08%

 Memory estimate: 1.94 KiB, allocs estimate: 4.

julia> @benchmark test(10_000)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  14.400 μs …   6.167 ms  ┊ GC (min … max):  0.00% … 97.67%
 Time  (median):     30.300 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):   51.585 μs ± 148.402 μs  ┊ GC (mean ± σ):  21.45% ± 10.84%

 Memory estimate: 326.41 KiB, allocs estimate: 14.

julia> @benchmark test(1_000_000)
BenchmarkTools.Trial: 808 samples with 1 evaluation.
 Range (min … max):  2.993 ms … 90.221 ms  ┊ GC (min … max):  0.00% … 95.38%
 Time  (median):     4.674 ms              ┊ GC (median):    19.53%
 Time  (mean ± σ):   6.176 ms ±  8.774 ms  ┊ GC (mean ± σ):  35.69% ± 20.33%

 Memory estimate: 17.41 MiB, allocs estimate: 24.

julia> @benchmark test(100_000_000)
BenchmarkTools.Trial: 6 samples with 1 evaluation.
 Range (min … max):  808.177 ms …   1.020 s  ┊ GC (min … max):  9.38% … 26.13%
 Time  (median):     959.266 ms              ┊ GC (median):    25.01%
 Time  (mean ± σ):   944.380 ms ± 81.448 ms  ┊ GC (mean ± σ):  22.25% ±  6.41%

 Memory estimate: 2.95 GiB, allocs estimate: 42.

Now the same tests under Julia 1.10.1:

julia> @benchmark test(100)
BenchmarkTools.Trial: 10000 samples with 199 evaluations.
 Range (min … max):  359.296 ns …  10.699 μs  ┊ GC (min … max): 0.00% … 82.66%
 Time  (median):     923.116 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   959.401 ns ± 347.833 ns  ┊ GC (mean ± σ):  2.21% ±  6.25%

 Memory estimate: 1.92 KiB, allocs estimate: 4.

julia> @benchmark test(10_000)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):   47.700 μs …   4.938 ms  ┊ GC (min … max): 0.00% … 94.66%
 Time  (median):     103.200 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   133.997 μs ± 158.472 μs  ┊ GC (mean ± σ):  7.43% ±  6.81%

 Memory estimate: 326.55 KiB, allocs estimate: 9.

julia> @benchmark test(1_000_000)
BenchmarkTools.Trial: 504 samples with 1 evaluation.
 Range (min … max):  6.729 ms … 88.534 ms  ┊ GC (min … max): 0.00% … 91.83%
 Time  (median):     8.773 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   9.924 ms ±  5.161 ms  ┊ GC (mean ± σ):  7.30% ±  9.24%

 Memory estimate: 9.78 MiB, allocs estimate: 14.

julia> @benchmark test(100_000_000)
BenchmarkTools.Trial: 4 samples with 1 evaluation.
 Range (min … max):  1.184 s …   1.394 s  ┊ GC (min … max): 8.36% … 6.56%
 Time  (median):     1.275 s              ┊ GC (median):    7.46%
 Time  (mean ± σ):   1.282 s ± 86.217 ms  ┊ GC (mean ± σ):  6.89% ± 5.29%

 Memory estimate: 1019.60 MiB, allocs estimate: 23.

Conclusions

From the tests we can see that:

The new implementation in Julia 1.11 is faster for all tested values of n. This is very nice.
The new implementation in Julia 1.11 does more allocations for larger n, has a noticeably higher memory estimate for the largest inputs, and, in consequence, spends more time in garbage collection. This means that when available RAM is scarce, code performance could be affected.
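
To put numbers on these two observations, the sketch below computes ratios from the median times and memory estimates reported in the benchmark output above. The values are copied by hand from that output and converted to common units (nanoseconds and KiB). The variable names are illustrative:

```julia
n_values = [100, 10_000, 1_000_000, 100_000_000]

# median times in nanoseconds, copied from the benchmark output
median_111 = [194.582, 30.300e3, 4.674e6, 959.266e6]   # Julia 1.11.0-beta2
median_110 = [923.116, 103.200e3, 8.773e6, 1.275e9]    # Julia 1.10.1

# memory estimates in KiB, copied from the benchmark output
mem_111 = [1.94, 326.41, 17.41 * 1024, 2.95 * 1024^2]
mem_110 = [1.92, 326.55, 9.78 * 1024, 1019.60 * 1024]

speedup   = median_110 ./ median_111   # > 1 means Julia 1.11 is faster
mem_ratio = mem_111 ./ mem_110         # > 1 means Julia 1.11 reports more memory

for (n, s, m) in zip(n_values, speedup, mem_ratio)
    println("n = ", n, ": median speedup ≈ ", round(s, digits = 2),
            ", memory ratio ≈ ", round(m, digits = 2))
end
```

On these figures the median speedup shrinks from roughly 4.7x at n = 100 to roughly 1.3x at n = 100_000_000, while the memory estimate is nearly the same for the two smaller cases and grows to roughly 1.8x and 3x for the two larger ones.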

juliabloggers.com

A Julia Language Blog Aggregator
