Author Archives: Blog by Bogumił Kamiński

One puzzle, two solutions

By: Blog by Bogumił Kamiński

Re-posted from: https://bkamins.github.io/julialang/2024/02/09/pe116.html

Introduction

Today, I wanted to switch back to a lighter subject.
Therefore I decided to have a look at my favorite Project Euler website.

I picked the problem 116 as I have not tried to solve it yet.
Interestingly, it turned out that there are two ways to approach this puzzle,
so I thought to share them here.

The post was written under Julia 1.10.0.

The puzzle

The Project Euler puzzle 116 can be briefly stated as follows:

Given a row of 50 grey squares is to have a number of its tiles replaced with
coloured oblong tiles chosen from red (length two),
green (length three), or blue (length four).
How many different ways can the grey tiles be replaced if colours
cannot be mixed and at least one coloured tile must be used?

(If you want to see some visual examples of valid tilings, I encourage you to
visit the puzzle 116 page.)

The first approach

When we think of this problem, it is natural to generalize it. By C(n, d) we can
define the number of ways that n gray squares can be replaced with tiles of length d.
Then the solution to our problem is C(n, 2) + C(n, 3) + C(n, 4).
So let us focus on computing C(n, d) (assuming d is positive).

The first approach is to ask how many tiles of length d can be put. There must be at least
1, and we cannot put more than n ÷ d (here I use the ÷ notation taken from Julia that
denotes integer division; in other words the integer part of n / d).

So now assume that we want to put i blocks of length d (assuming i is valid). In
how many ways can we do it. Well, we put i long blocks and we are left with n - d*i gray blocks.
In total we have i + (n - d*i) blocks. You can then think of it as you have that many slots
from which you need pick i slots to put the long blocks. The number of ways you can do it is
given by the value of binomial coefficient. In Julia notation it is:
binomial(BigInt(i + (n - d*i)), BigInt(i)).

Now you might ask why I put the BigInt wrapper around the passed numbers? The reason is
that binomial coefficient gets large pretty quickly, so I want to make sure I will not
have issues with integer overflow.

Given these considerations the first function that produces C(n, d) can be defined as:

function C1(n::Integer, d::Integer)
    @assert d > 0 && n >= 0
    return sum(i -> binomial(BigInt(i + (n - d*i)), BigInt(i)), 1:n ÷ d; init=big"0")
end

Note that I use the init=big"0" initialization statement in the sum to ensure the
correct handling of the cases when n < d when we are given an empty collection to sum over.

The second approach

However, there is a different way how we can think of computing C(n, d).

Assume we know the values of C(n, d) for values of n smaller than the requested one.

We look at the last tile in our row.

If it is empty, then we are down to n-1 tiles to be filled.
This can be done in C(n-1, d) ways (remember that this value takes care
of the fact that at least one block of length d has to be used).

But what if the last tile in our row is filled with a block of length d?
Then we have two options. Either all other blocks are left gray (which gives us 1 combination)
or we are left with n-d tiles that are filled with at least one block of length d. The
second value is exactly C(n-d, d).

In summary we get that C(n, d) = C(n-1, d) + C(n-d, d) + 1.

This formula assumes n is at least d. But clearly for n < d
we have 0 ways to arrange the blocks.

Let us write down the code that performs the required computation:

function C2(n::Integer, d::Integer)
    @assert d > 0 && n >= 0
    npos = Dict{Int,BigInt}(i => 0 for i in 0:d-1)
    for j in d:n
        npos[j] = npos[j-1] + npos[j-d] + 1
    end
    return npos[n]
end

Note in the code that I used the npos dictionary to flexibly allow
for any potential integer values of n. The dictionary has
Dict{Int,BigInt} type, again, to ensure that the results of the computations
are stored correctly even if they are large.

Testing

Now we have two functions C1 and C2 that look completely differently.
Do they produce the same results. Let us check:

julia> using Test

julia> @testset "test C1 and C2 equality" begin
           for n in 0:200, d in 1:20
               @test C1(n, d) == C2(n, d)
           end
       end;
Test Summary:           | Pass  Total  Time
test C1 and C2 equality | 4020   4020  0.9s

Indeed we see that both C1 and C2 functions produce the same results.

To convince ourselves that using arbitrary precision integers was indeed needed
let us check some example values of the functions:

julia> C1(200, 2)
453973694165307953197296969697410619233825

julia> C2(200, 2)
453973694165307953197296969697410619233825

julia> typemax(Int)
9223372036854775807

Indeed, if we were not careful, we would have an integer overflow issue.

Conclusions

As usual I will not show the value of the solution to the problem to encourage you
to run the code yourself. You can get it by executing either
sum(d -> C1(50, d), 2:4) or sum(d -> C2(50, d), 2:4).
(We have just checked that the value produced in both cases is the same).

One puzzle, two solutions

By: Blog by Bogumił Kamiński

Re-posted from: https://bkamins.github.io/julialang/2024/02/02/pe116.html

Introduction

Today, I wanted to switch back to a lighter subject.
Therefore I decided to have a look at my favorite Project Euler website.

I picked the problem 116 as I have not tried to solve it yet.
Interestingly, it turned out that there are two ways to approach this puzzle,
so I thought to share them here.

The post was written under Julia 1.10.0.

The puzzle

The Project Euler puzzle 116 can be briefly stated as follows:

Given a row of 50 grey squares is to have a number of its tiles replaced with
coloured oblong tiles chosen from red (length two),
green (length three), or blue (length four).
How many different ways can the grey tiles be replaced if colours
cannot be mixed and at least one coloured tile must be used?

(If you want to see some visual examples of valid tilings, I encourage you to
visit the puzzle 116 page.)

The first approach

When we think of this problem, it is natural to generalize it. By C(n, d) we can
define the number of ways that n gray squares can be replaced with tiles of length d.
Then the solution to our problem is C(n, 2) + C(n, 3) + C(n, 4).
So let us focus on computing C(n, d) (assuming d is positive).

The first approach is to ask how many tiles of length d can be put. There must be at least
1, and we cannot put more than n ÷ d (here I use the ÷ notation taken from Julia that
denotes integer division; in other words the integer part of n / d).

So now assume that we want to put i blocks of length d (assuming i is valid). In
how many ways can we do it. Well, we put i long blocks and we are left with n - d*i gray blocks.
In total we have i + (n - d*i) blocks. You can then think of it as you have that many slots
from which you need pick i slots to put the long blocks. The number of ways you can do it is
given by the value of binomial coefficient. In Julia notation it is:
binomial(BigInt(i + (n - d*i)), BigInt(i)).

Now you might ask why I put the BigInt wrapper around the passed numbers? The reason is
that binomial coefficient gets large pretty quickly, so I want to make sure I will not
have issues with integer overflow.

Given these considerations the first function that produces C(n, d) can be defined as:

function C1(n::Integer, d::Integer)
    @assert d > 0 && n >= 0
    return sum(i -> binomial(BigInt(i + (n - d*i)), BigInt(i)), 1:n ÷ d; init=big"0")
end

Note that I use the init=big"0" initialization statement in the sum to ensure the
correct handling of the cases when n < d when we are given an empty collection to sum over.

The second approach

However, there is a different way how we can think of computing C(n, d).

Assume we know the values of C(n, d) for values of n smaller than the requested one.

We look at the last tile in our row.

If it is empty, then we are down to n-1 tiles to be filled.
This can be done in C(n-1, d) ways (remember that this value takes care
of the fact that at least one block of length d has to be used).

But what if the last tile in our row is filled with a block of length d?
Then we have two options. Either all other blocks are left gray (which gives us 1 combination)
or we are left with n-d tiles that are filled with at least one block of length d. The
second value is exactly C(n-d, d).

In summary we get that C(n, d) = C(n-1, d) + C(n-d, d) + 1.

This formula assumes n is at least d. But clearly for n < d
we have 0 ways to arrange the blocks.

Let us write down the code that performs the required computation:

function C2(n::Integer, d::Integer)
    @assert d > 0 && n >= 0
    npos = Dict{Int,BigInt}(i => 0 for i in 0:d-1)
    for j in d:n
        npos[j] = npos[j-1] + npos[j-d] + 1
    end
    return npos[n]
end

Note in the code that I used the npos dictionary to flexibly allow
for any potential integer values of n. The dictionary has
Dict{Int,BigInt} type, again, to ensure that the results of the computations
are stored correctly even if they are large.

Testing

Now we have two functions C1 and C2 that look completely differently.
Do they produce the same results. Let us check:

julia> using Test

julia> @testset "test C1 and C2 equality" begin
           for n in 0:200, d in 1:20
               @test C1(n, d) == C2(n, d)
           end
       end;
Test Summary:           | Pass  Total  Time
test C1 and C2 equality | 4020   4020  0.9s

Indeed we see that both C1 and C2 functions produce the same results.

To convince ourselves that using arbitrary precision integers was indeed needed
let us check some example values of the functions:

julia> C1(200, 2)
453973694165307953197296969697410619233825

julia> C2(200, 2)
453973694165307953197296969697410619233825

julia> typemax(Int)
9223372036854775807

Indeed, if we were not careful, we would have an integer overflow issue.

Conclusions

As usual I will not show the value of the solution to the problem to encourage you
to run the code yourself. You can get it by executing either
sum(d -> C1(50, d), 2:4) or sum(d -> C2(50, d), 2:4).
(We have just checked that the value produced in both cases is the same).

Working with vectors using DataFrames.jl minilanguage

By: Blog by Bogumił Kamiński

Re-posted from: https://bkamins.github.io/julialang/2024/02/02/combine.html

Introduction

I have written in the past about DataFrames.jl operation specification syntax
(also called minilanguage), see for example this post or this post.

Today I want to discuss one design decision made in this minilanguage and its consequences.
It is related with how vectors are handled when they are returned from some transformation function.

The post was written under Julia 1.10.0 and DataFrames.jl 1.6.1.

A basic example

Consider the following example, where we want to compute a profit from some sales data:

julia> using DataFrames

julia> df = DataFrame(name=["A", "B", "C"],
                      revenue=[10, 20, 30],
                      cost=[5, 12, 18])
3×3 DataFrame
 Row │ name    revenue  cost
     │ String  Int64    Int64
─────┼────────────────────────
   1 │ A            10      5
   2 │ B            20     12
   3 │ C            30     18

julia> combine(df, All(), ["revenue", "cost"] => (-) => "profit")
3×4 DataFrame
 Row │ name    revenue  cost   profit
     │ String  Int64    Int64  Int64
─────┼────────────────────────────────
   1 │ A            10      5       5
   2 │ B            20     12       8
   3 │ C            30     18      12

The crucial point to understand here is that the - function takes
two columns "revenue" and "cost" and returns a vector.
Users typically expect, as in this example, that this vector should
be spread across several rows.

When vector spreading is not desirable?

However, there are cases, when we might not want to spread a vector
into multiple rows. Consider for example a transformation in which
we want to put "revenue" and "profit" values in a 2-element vector
per product. Intuitively we could write something like:

julia> combine(groupby(df, :name),
               All(),
               ["revenue", "cost"] => ((x,y) -> [only(x), only(y)]) => "vec")
ERROR: ArgumentError: all functions must return vectors of the same length

We get an error unfortunately. We will soon understand why, but before
I proceed let me comment on the [only(x), only(y)] part of the definition.
The only function makes sure that we have exactly one row per product.

To diagnose the issue let us drop the All() part in our call:

julia> combine(groupby(df, :name),
               ["revenue", "cost"] => ((x,y) -> [only(x), only(y)]) => "vec")
6×2 DataFrame
 Row │ name    vec
     │ String  Int64
─────┼───────────────
   1 │ A          10
   2 │ A           5
   3 │ B          20
   4 │ B          12
   5 │ C          30
   6 │ C          18

Now we understand the problem. Because our function returns a vector it gets
spread over several rows (which leads to an error as other columns of df have
a different length).

Solving the vector-spreading issue

As I have said above, most of the time vector spreading is a desired feature,
but in the example we have just studied it is not wanted.
For such cases DataFrames.jl allows you to protect vectors from being spread.
What you need to do is to call Ref function on the returned value.
This will protect the result from being spread:

julia> combine(groupby(df, :name),
               All(),
               ["revenue", "cost"] => ((x,y) -> Ref([only(x), only(y)])) => "vec")
3×4 DataFrame
 Row │ name    revenue  cost   vec
     │ String  Int64    Int64  Array…
─────┼──────────────────────────────────
   1 │ A            10      5  [10, 5]
   2 │ B            20     12  [20, 12]
   3 │ C            30     18  [30, 18]

Now, as we wanted, the entries of the "vec" columns are vectors. Wrapping the return
value of our function with Ref protected the vectors from being spread.

The alternative function that you could use to get the same effect is fill:

julia> combine(groupby(df, :name),
               All(),
               ["revenue", "cost"] => ((x,y) -> fill([only(x), only(y)])) => "vec")
3×4 DataFrame
 Row │ name    revenue  cost   vec
     │ String  Int64    Int64  Array…
─────┼──────────────────────────────────
   1 │ A            10      5  [10, 5]
   2 │ B            20     12  [20, 12]
   3 │ C            30     18  [30, 18]

or you could wrap the return value with another pair of [...]:

julia> combine(groupby(df, :name),
               All(),
               ["revenue", "cost"] => ((x,y) -> [[only(x), only(y)]]) => "vec")
3×4 DataFrame
 Row │ name    revenue  cost   vec
     │ String  Int64    Int64  Array…
─────┼──────────────────────────────────
   1 │ A            10      5  [10, 5]
   2 │ B            20     12  [20, 12]
   3 │ C            30     18  [30, 18]

What is going on here? In all three cases (Ref, fill, and [...]) we are wrapping a vector in another object that works like an outer vector.
In the case of [...] it is just a vector, fill produces a 0-dimensional array, and Ref creates a wrapper that behaves like 0-dimensional array.
In all cases DataFrames.jl treats this outer wrapper as a 1-element vector and just stores its contents in a single row (because there is one element to store).

Conclusions

I hope that you will find the example I gave today useful when transforming vectors using DataFrames.jl.