Author Archives: Julia on μβ

DifferentialEquations.jl – part 2: decision from the model

By: Julia on μβ

Re-posted from: https://mbesancon.github.io/post/2017-12-20-diffeq-julia2/

In the last article, we explored different modeling options for a
three-component systems which could represent the dynamics of a chemical
reaction or a disease propagation in a population. Building on top of this
model, we will formulate a desirable outcome and find a decision which
maximizes this outcome.

In addition to the packages imported in the last post,
we will also use BlackBoxOptim.jl:

import DifferentialEquations
const DiffEq = DifferentialEquations
import Plots
import BlackBoxOptim

The model

The same chemical system with three components, A, B and R will be used:
$$A + B → 2B$$ $$B → R$$

The reactor where the reaction occurs must remain active for one minute.
Let’s imagine that $B$ is our valuable component while $R$ is a waste.
We want to maximize the quantity of $B$ present within the system after one
minute, that’s the objective function. For that purpose, we can choose to add
a certain quantity of new $A$ within the reactor at any point.
$$t_{inject} ∈ [0,t_{final}]$$.

Implementing the injection

There is one major feature of DifferentialEquations.jl we haven’t explored yet:
the event handling system.
This allows for the system state to change at a particular point in time,
depending on conditions on the time, state, etc…

# defining the problem
const α = 0.8
const β = 3.0
diffeq = function(t, u, du)
    du[1] = - α * u[1] * u[2]
    du[2] = α * u[1] * u[2] - β * u[2]
    du[3] = β * u[2]
end
u0 = [49.0;1.0;0.0]
tspan = (0.0, 1.0)
prob = DiffEq.ODEProblem(diffeq, u0, tspan)

const A_inj = 30
inject_new = function(t0)
    condition(t, u, integrator) = t0 - t
    affect! = function(integrator)
        integrator.u[1] = integrator.u[1] + A_inj
    end
    callback = DiffEq.ContinuousCallback(condition, affect!)
    sol = DiffEq.solve(prob, callback=callback)
    sol
end

# trying it out with an injection at t=0.4
sol = inject_new(0.4)
Plots.plot(sol)

Injection simulation

The ContinuousCallback construct is the central element here, it takes as
information:

  • When to trigger the event, implemented as the condition function. It triggers
    when this function reaches 0, which is here the case when $t = t_0$.
  • What to do with the state at that moment. The state is encapsulated within
    the integrator variable. In our case, we add 30 units to the concentration in A.

As we can see on the plot, a discontinuity appears on the concentration in A
at the injection time, the concentration in B restarts increasing.

Finding the optimal injection time: visual approach

From the previously built function, we can get the whole solution with a given
injection time, and from that the final state of the system.

tinj_span = 0.05:0.005:0.95
final_b = [inject_new(tinj).u[end][2] for tinj in tinj_span]
Plots.plot(tinj_span, final_b)

Using a plain for comprehension, we fetch the solution of the simulation for
the callback built with each $t_{inject}$.

Quantity of B

Injecting $A$ too soon lets too much time for the created $B$ to turn into $R$,
but injecting it too late does not let enough time for $B$ to be produced from
the injected $A$. The optimum seems to be around ≈ 0.82,

Finding the optimum using BlackBoxOptim.jl

The package requires an objective function which takes a vector as input.
In our case, the decision is modeled as a single variable (the injection time),
it’s crucial to make the objective use a vector nonetheless, otherwise
calling the solver will just explode with cryptic errors.

compute_finalb = tinj -> -1 * inject_new(tinj[1]).u[end][2]
# trust the default algorithm
BlackBoxOptim.bboptimize(compute_finalb, SearchRange=(0.1,0.9), NumDimensions=1)
# use probabilistic descent
BlackBoxOptim.bboptimize(compute_finalb, SearchRange=(0.1,0.9), NumDimensions=1, Method=:probabilistic_descent)

The function inject_new we defined above returns the complete solution
of the simulation, we get the state matrix u, from which we extract the
final state u[end], and then the second component, the concentration in
B: u[end][2]. The black box optimizer minimizes the objective, while we want
to maximize the final concentration of B, hence the -1 multiplier used for
compute_finalb.

The bboptimize function can also be passed a Method argument specifying
the optimization algorithm. In this case, the function is smooth, so we can
suppose gradient estimation methods would work pretty well. We also let the
default algorithm (differential evolution) be used. After some lines logging
the progress on the search, we obtain the following for both methods:

Best candidate found: [0.835558]
Fitness: -24.039369448

More importantly, we can refer to the number of evaluations of the objective
function as a measure of the algorithm performance, combined with the time taken.

For the probabilistic descent:

Optimization stopped after 10001 steps and 10.565439939498901 seconds
Steps per second = 946.5767689058794
Function evals per second = 1784.6866867802482
Improvements/step = 0.0
Total function evaluations = 18856

For the differential evolution:

Optimization stopped after 10001 steps and 5.897292137145996 seconds
Steps per second = 1695.863078751937
Function evals per second = 1712.8200138459472
Improvements/step = 0.1078
Total function evaluations = 10101

We found the best injection time (0.835558), and the corresponding final
concentration (24.04).

Extending the model

The decision over one variable was pretty straightforward. We are going to
extend it by changing how the $A$ component is added at $t_{inject}$.
Instead of being completely dissolved, a part of the component will keep being
poured in after $t_{inject}$. So the decision will be composed of two variables:

  • The time of the beginning of the injection
  • The part of $A$ to inject directly and the part to inject in a
    continuous fashion. We will note the fraction injected directly $\delta$.

Given a fixed available quantity $A₀$ and a fraction to inject directly $\delta$,
the concentration in A is increased of $\delta \cdot A₀$ at time $t_{inject}$,
after which the rate of change of the concentration in A is increased by a
constant amount, until the total amount of A injected (directly and over time)
is equal to the planned quantity.

We need a new variable in the state of the system, $u_4(t)$, which stands
for the input flow of A being active or not.

  • $u(t) = 0$ if $t < t_{inject}$
  • $u(t) = 0$ if the total flow of A which has been injected is equal to the planned quantity
  • $u(t) = \dot{A}\ $ otherwise, with $\dot{A}\ $ the rate at which A is being poured.

New Julia equations

We already built the key components in the previous sections. This time we need
two events:

  • A is directly injected at $t_{inject}$, and then starts being poured at constant rate
  • A stops being poured when the total quantity has been used
const inj_quantity = 30.0;
const inj_rate = 40.0;

diffeq_extended = function(t, u, du)
    du[1] = - α * u[1] * u[2] + u[4]
    du[2] = α * u[1] * u[2] - β * u[2]
    du[3] = β * u[2]
    du[4] = 0.0
end

u0 = [49.0;1.0;0.0;0.0]
tspan = (0.0, 1.0)
prob = DiffEq.ODEProblem(diffeq_extended, u0, tspan)

We wrap the solution building process into a function taking the starting time
and the fraction being directly injected as parameters:

inject_progressive = function(t0, direct_frac)
    condition_start(t, u, integrator) = t0 - t
    affect_start! = function(integrator)
        integrator.u[1] = integrator.u[1] + inj_quantity * direct_frac
        integrator.u[4] = inj_rate
    end
    callback_start = DiffEq.ContinuousCallback(
        condition_start, affect_start!, save_positions=(true, true)
    )
    condition_end(t, u, integrator) = (t - t0) * inj_rate - inj_quantity * (1 - direct_frac)
    affect_end! = function(integrator)
        integrator.u[4] = 0.0
    end
    callback_end = DiffEq.ContinuousCallback(condition_end, affect_end!, save_positions=(true, true))
    sol = DiffEq.solve(prob, callback=DiffEq.CallbackSet(callback_start, callback_end), dtmax=0.005)
end

Plots.plot(inject_progressive(0.6,0.6))

We can notice callback_start being identical to the model we previously built,
while condition_end corresponds to the time when the total injected
quantity reaches inj_quantity. The first events activates $u_4$ and sets it
to the nominal flow, while the second callback resets it to 0.

Constant rate

BlackBoxOptim.jl can be re-used to determine the optimal decision:

objective = function(x)
    sol = inject_progressive(x[1], x[2])
    -sol.u[end][2]
end
BlackBoxOptim.bboptimize(objective, SearchRange=[(0.1,0.9),(0.0,1.0)], NumDimensions=2)

The optimal solution corresponds to a complete direct injection
($\delta \approx 1$) with $t_{inject}^{opt}$ identical to the previous model.
This means pouring the A component in a continuous fashion does not allow to
produce more $B$ at the end of the minute.

Conclusion

We could still built on top of this model to keep refining it, taking more
phenomena into account (what if the reactions produce heat and are sensitive
to temperature?). The structures describing models built with
DifferentialEquations.jl are transparent and easy to use for further manipulations.

One point on which I place expectations is some additional interoperability
between DifferentialEquations.jl and JuMP,
a Julia meta-package for optimization. Some great work was already performed to
combine the two systems, one use case that has been described is the parameter
identification problem (given the evolution of concentration in the system,
identify the α and β parameters).

But given that the function I built from a parameter was a black box
(without an explicit formula, not a gradient), I had to use BlackBoxOptim,
which is amazingly straightforward, but feels a bit overkill for smooth
functions as presented here. Maybe there is a different way to build the
objective function, using parametrized functions for instance, which could
make it transparent to optimization solvers.

If somebody has info on that last point or feedback, additional info you’d like
to share regarding this post, hit me on Twitter.
Thanks for reading!


Edits and improvements

Of course, BlackBoxOptim.jl was not the most appropriate algorithm as
predicted. Patrick and Chris
gave me some hints in this thread
and I gave Optim.jl a try.

This package has a range of algorithms to choose from depending on the
structure of the function and the knowledge of its gradient and Hessian.
The goal is continuous optimization, (as opposed to BlackBoxOptim.jl which supports
more exotic search spaces).

Finding the optimum $t_{inject}$ of the first problem is pretty simple:

import Optim
Optim.optimize(compute_finalb, 0.1, 0.9)

This yields the following information:

Results of Optimization Algorithm
 * Algorithm: Brent's Method
 * Search Interval: [0.100000, 0.900000]
 * Minimizer: 8.355891e-01
 * Minimum: -2.403824e+01
 * Iterations: 13
 * Convergence: max(|x - x_upper|, |x - x_lower|) <= 2*(1.5e-08*|x|+2.2e-16): true
 * Objective Function Calls: 14

14 calls to the objective function, pretty neat compared to the hundreds of
BlackBoxOptim. We also confirm the optimum of 0.8355891.



[1] Cover image: Lorenz attractor on Wikimedia, again.

Getting started with DifferentialEquations.jl

By: Julia on μβ

Re-posted from: https://mbesancon.github.io/post/2017-12-14-diffeq-julia/

DifferentialEquations.jl
came to be a key component of Julia’s scientific ecosystem. After checking the
JuliaCon talk of its creator, I couldn’t wait to start building stuff with it,
so I created and developed a simple example detailed in this blog post.
Starting from a basic ordinary differential equation (ODE), we add noise,
making it stochastic, and finally turn it into a discrete version.

Before running the code below, two imports will be used:

import DifferentialEquations
const DiffEq = DifferentialEquations
import Plots

I tend to prefer explicit imports in my Julia code, it helps see which function
comes from which part. As DifferentialEquations is longuish to write, we use
an alias in the rest of the code.

The model

We use a simple 3-element state in a differential equation. Depending on your
background, pick the interpretation you prefer:

  1. An SIR model, standing for susceptible, infected, and recovered, directly
    inspired by the talk and by the Gillespie.jl
    package. We have a total population with healthy people, infected people
    (after they catch the disease) and recovered (after they heal from the disease).

  2. A chemical system with three components, A, B and R.
    $$A + B → 2B$$ $$B → R$$

After searching my memory for chemical engineering courses and the
universal source of knowledge,
I could confirm the first reaction is an autocatalysis, while the second is
a simple reaction. An autocatalysis means that B molecules turn A molecules
into B, without being consumed.

The first example is easier to represent as a discrete problem: finite
populations make more sense when talking about people. However, it can be seen
as getting closer to a continuous differential equation as the number of people
get higher. The second model makes more sense in a continuous version as we are
dealing with concentrations of chemical components.

A first continuous model

Following the tutorials from the
official package website,
we can build our system from:

  • A system of differential equations: how does the system behave (dynamically)
  • Initial conditions: where does the system start
  • A time span: how long do we want to observe the system

The system state can be written as:
$$u(t) =
\begin{bmatrix}
u₁(t) \
u₂(t) \
u₃(t)
\end{bmatrix}^T
$$

With the behavior described as:
$$
\dot{u}(t) = f(u,t)
$$
And the initial conditions $u(0) = u₀$.

In Julia, this becomes:

α = 0.8
β = 3.0
diffeq = function(t, u, du)
    du[1] = - α * u[1] * u[2]
    du[2] = α * u[1] * u[2] - β * u[2]
    du[3] = β * u[2]
end
u₀ = [49.0;1.0;0.0]
tspan = (0.0, 1.0)

diffeq models the dynamic behavior, u₀ the starting conditions
and tspan the time range over which we observe the system
evolution.

We know that our equation is smooth, so we’ll let
DifferentialEquations.jl figure out the solver. The general API
of the package is built around two steps:
1. Building a problem/model from behavior and initial conditions
2. Solving the problem using a solver of our choice and providing additional
information on how to solve it, yielding a solution.

prob = DiffEq.ODEProblem(diffeq, u₀, tspan)
sol = DiffEq.solve(prob);

One very nice property of solutions produced by the package is that they
contain a direct way to produce plots.

Plots.plot(sol)

Solution to the ODE

If we use the disease propagation example, $u₁(t)$ is the number of
healthy people who haven’t been infected. It starts high, which makes the rate
of infection by the diseased population moderate. As the number of sick people
increases, the rate of infection increases: there are more and more possible
contacts between healthy and sick people.

As the number of sick people increases, the recovery rate also increases,
absorbing more sick people. So the “physics” behind the problem makes sense
with what we observe on the curve.

A key property to notice is the mass conservation: the sum of the three elements
of the vector is constant (the total population in the health case). This makes
sense from the point of view of the equations:
$$\frac{du₁}{dt} + \frac{du₂}{dt} + \frac{du_3}{dt} = 0$$

Adding randomness: first attempt with a simple SDE

The previous model works successfully, but remains naive. On small populations,
the rate of contamination and recovery cannot be so smooth. What if some sick
people isolate themselves from others for an hour or so, what there is a
meeting organized, with higher chances of contacts? All these plausible events
create different scenarios that are more or less likely to happen.

To represent this, the rate of change of the three variables of the system
can be considered as composed of a deterministic part and of a random variation.
One standard representation for this, as laid out in the
package documentation
is the following:
$$
du = f(u,t) dt + ∑ gᵢ(u,t) dWᵢ
$$

In our case, we could consider two points of randomness at the two interactions
(one for the transition from healthy to sick, and one from sick to recovered).

Stochastic version

σ1 = 0.07
σ2 = 0.4
noise_func = function(t, u, du)
    du[1] = σ1 * u[1] * u[2]
    du[3] = σ2 * u[2]
    du[2] = - du[1]  - du[3]
end

stoch_prob = DiffEq.SDEProblem(diffeq, noise_func, u₀, tspan)
sol_stoch = DiffEq.solve(stoch_prob, DiffEq.SRIW1())

Note that we also change the solver provided to the solve function to adapt
to stochastic equations. The last variation is set to the opposite of the sum
of the two others to compensate the two other variations (we said we had only
one randomness phenomenon per state transition).

SDE

Woops, something went wrong. This time the mass conservation doesn’t hold,
we finish with a population below the initial condition. What is wrong is that
we don’t define the variation but the gᵢ(u,t) function, which is then
multiplied by dWᵢ. Since we used the function signature corresponding to
the diagonal noise, there is a random component per $uᵢ$ variable.

Adding randomness: second attempt with non-diagonal noise

As explained above, we need one source of randomness for each transition.
This results in a $G(u,t)$ matrix of $3 × 2$. We can then make sure that the
the sum of variations for the three variables cancel out to keep a constant
total population.

noise_func_cons = function(t, u, du)
    du[1, 1] = σ1 * u[1] * u[2]
    du[1, 2] = 0.0
    du[2, 1] = - σ1 * u[1] * u[2]
    du[2, 2] = - σ2 * u[2]
    du[3,1] = 0.0
    du[3,2] = σ2 * u[2]
end
sde_cons = DiffEq.SDEProblem(
    diffeq, noise_func_cons, u₀, tspan,
    noise_rate_prototype=zeros(3,2)
)
cons_solution = DiffEq.solve(sde_cons, DiffEq.EM(), dt=1/500)

We also provide a noise_rate_prototype parameter to the problem builder to
indicate we don’t want to use a diagonal noise.

SDE

This time the population conservation holds, at any point in time the sum of
the $uᵢ(t)$ remains 50.

Discretizing: Gillespie model

The models we produced so far represent well the chemical reaction problem,
but a bit less the disease propagation. We are using continuous quantities
to represent discrete populations, how do we interpret 0.6 people sick at a time?

One major strength of the package is its effortless integration of discrete
phenomena in a model, alone or combined with continuous dynamics. Our model
follows exactly the package tutorial on
discrete stochastic problems,
so building it should be straightforward.

infect_rate = DiffEq.Reaction(α, [1,2],[(1,-1),(2,1)])
recover_rate = DiffEq.Reaction(β, [2],[(2,-1),(3,1)])
disc_prob = DiffEq.GillespieProblem(
    DiffEq.DiscreteProblem(round.(Int,u₀), tspan),
    DiffEq.Direct(),
    infect_rate, recover_rate,
)
disc_sol = DiffEq.solve(disc_prob, DiffEq.Discrete());

We define the infection and recovery rate and the variables $uᵢ$ that are
affected, and call the Discrete solver. The Plots.jl integration once again
yields a direct representation of the solution over the time span.

SDE

Again, the conservation of the total population is guaranteed by the effect of
the jumps deleting one unit from a population to add it to the other.

Conclusion

The DifferentialEquations.jl package went from a good surprise to a key tool in
my scientific computing toolbox. It does not require learning another embedded
language but makes use of real idiomatic Julia. The interface is clean and
working on edge cases does not feel hacky. I’ll be looking forward to using
it in my PhD or side-hacks, especially combined to the
JuMP.jl package: DifferentialEquations
used to build simulations and JuMP to optimize a cost function on top of the
created model.

Thanks for reading, hit me on Twitter
for feedback or questions 😉


Edits:

Chris, the creator and main developer
of DifferentialEquations.jl, gave me valuable tips on two
points which have been edited in the article. You can find the thread
here.

  • Import aliases should use const PackageAlias = PackageName for type
    stability. This allows the compiler to generate efficient code.
    Some further mentions of type-stability can be found in the
    official doc
  • The second attempts uses non-diagonal noise, the “:additive” hint I passed
    to the solve function does not hold. Furthermore, the appropriate algorithm in
    that case is EM
    .

Many thanks to him for these tips, having such devoted and friendly developers
is also what makes an open-source project successful.



[1] Cover image: Lorenz attractor on Wikimedia