Author Archives: Abel Soares Siqueira

NLPModels.jl and CUTEst.jl: Constrained optimization

Re-posted from: http://abelsiqueira.github.io/nlpmodelsjl-and-cutestjl-constrained-optimization/

This is a continuation of this
post.
And again, you can follow the commands of this post in the
asciinema.

If you followed along last post, you should know the basics of our
NLPModels API, including CUTEst access.

One thing I didn’t explore, though, was constrained problems.
It’d complicate too much.

However, now that we know how to handle the basics, we can move to the
advanced.

Nonlinear Programming format

The NLPModels internal structure is based on the CUTEst way of storing a
problem.
We use the following form for the optimization problem:

$\begin{align} \min \quad & f(x) \\ s.t. \quad & c_L \leq c(x) \leq c_U \\ & \ell \leq x \leq u\end{align} %]]>$

Given an AbstractNLPModel named nlp, the values for $\ell$, $u$, $c_L$ and
$c_U$ are stored in an NLPModelMeta structure, and can be accessed by
through nlp.meta.

Let’s look back at the simple Rosenbrock problem of before.

using NLPModels

f(x) = (x[1] - 1)^2 + 100*(x[2] - x[1]^2)^2
x0 = [-1.2; 1.0]
nlp = ADNLPModel(f, x0)
print(nlp.meta)

You should be seeing this:

Minimization problem Generic
nvar = 2, ncon = 0 (0 linear)
lvar = -Inf  -Inf
uvar = Inf  Inf
lcon = ∅
ucon = ∅
x0 = -1.2  1.0
y0 = ∅
nnzh = 4
nnzj = 0

Although the meaning of these values is reasonably straigthforward, I’ll explain a bit.

nvar is the number of variables in a problem;
ncon is the number of constraints, without counting the bounds;
lvar is the vector $\ell$, the lower bounds on the variables;
uvar is the vector $u$, the upper bounds on the variables;
lcon is the vector $c_L$, the lower bounds of the constraints function;
ucon is the vector $c_U$, the upper bounds of the constraints function;
x0 is the initial approximation to the solution, aka the starting point;
y0 is the initial approximation to the Lagrange multipliers;
nnzh is the number of nonzeros on the Hessian¹;
nnzj is the number of nonzeros on the Jacobian¹;

¹ nnzh and nnzj are not consistent between models, because some consider the dense matrix, and for the Hessian, some consider only the triangle. However, if you’re possibly considering using nnzh, you’re probably looking for hess_coord too, and hess_coord returns with the correct size.

These values can be accessed directly as fields in meta with the same name above.

nlp.meta.ncon
nlp.meta.x0
nlp.meta.lvar

Bounds

Now, let’s create a bounded problem.

nlp = ADNLPModel(f, x0, lvar=zeros(2), uvar=[0.4; 0.6])
print(nlp.meta)

Now the bounds are set, and you can access them with

nlp.meta.lvar
nlp.meta.uvar

That’s pretty much it. For SimpleNLPModel, it’s the same thing.
MathProgNLPModel inherits the bounds, as expected:

using JuMP

jmp = Model()
u = [0.4; 0.6]
@variable(jmp, 0 <= x[i=1:2] <= u[i], start=(x0[i]))
@NLobjective(jmp, Min, (x[1] - 1)^2 + 100*(x[2] - x[1]^2)^2)
mpbnlp = MathProgNLPModel(jmp)
print(mpbnlp.meta)

For CUTEst, there is no differentiation on creating a problem with bounds or
not, since it uses the internal description of the problem.
For instance, HS4 is a bounded problem.

using CUTEst

clp = CUTEstModel("HS4")
print(clp.meta)
finalize(clp)

Notice that it can happen that one or more of the variables is unlimited
(lower, upper or both). This is represented by the value Inf in Julia.
This should be expected since the unconstrained problem already used these
Inf values.

On the other hand, it could happen that $\ell_i = u_i$, in which case the
variable is fixed, or that $\ell_i > u_i$, in which case the variable (and the
problem) is infeasible.
Note that NLPModels only creates the model, it doesn’t check whether it is
feasible or not, even in this simple example. That said, CUTEst shouldn’t have
any infeasible variable.

Furthermore, all these types of bounds can be accessed from meta. Notice that
there are 6 possible situations:

Free variables, stored in meta.ifree;
Fixed variables, stored in meta.ifix;
Variables bounded below, stored in meta.ilow;
Variables bounded above, stored in meta.iupp;
Variables bounded above and below, stored in meta.irng;
Infeasible variables, stored in meta.iinf.

Here is one example with one of each of them

nlp = ADNLPModel(x->dot(x,x), zeros(6),
  lvar = [-Inf, -Inf, 0.0, 0.0, 0.0,  0.0],
  uvar = [ Inf,  1.0, Inf, 1.0, 0.0, -1.0])
nlp.meta.ifree
nlp.meta.ifix
nlp.meta.ilow
nlp.meta.iupp
nlp.meta.irng
nlp.meta.iinf

Constraints

Constraints are stored in NLPModels following in the format $c_L \leq c(x) \leq c_U$.
That means that an equality constraint happens when $c_{L_j} = c_{U_j}$.
Let’s look at how to create a problem with constraints.

For ADNLPModel, you need to pass three keywords arguments: c, lcon and ucon,
which represent $c(x)$, $c_L$ and $c_U$, respectively.
For instance, the problem

$\begin{align} \min \quad & x_1^2 + x_2^2 \\ s.t. \quad & x_1 + x_2 = 1 \end{align} %]]>$

is created by doing

c(x) = [x[1] + x[2] - 1]
lcon = [0.0]
ucon = [0.0]
nlp = ADNLPModel(x->dot(x,x), zeros(2), c=c, lcon=lcon, ucon=ucon)

or alternatively, if you don’t want the intermediary functions

nlp = ADNLPModel(x->dot(x,x), zeros(2), c=x->[x[1]+x[2]-1], lcon=[0.0], ucon=[0.0])

Another possibility is to do

nlp = ADNLPModel(x->dot(x,x), zeros(2), c=x->[x[1]+x[2]], lcon=[1.0], ucon=[1.0])

Personally, I prefer the former.

For inequalities, you can have only lower, only upper, and both.
The commands

nlp = ADNLPModel(x->dot(x,x), zeros(2),
  c=x->[x[1] + x[2]; 3x[1] + 2x[2]; x[1]*x[2]],
  lcon = [-1.0; -Inf; 1.0],
  ucon = [Inf;   3.0; 2.0])

implement the problem

$\begin{align} \min \quad & x_1^2 + x_2^2 \\ s.t. \quad & x_1 + x_2 \geq -1 \\ & 3x_1 + 2x_2 \leq 3 \\ & 1 \leq x_1x_2 \leq 2. \end{align} %]]>$

Again, the types of constraints can be accessed in meta, through
nlp.meta.jfix, jfree, jinf, jlow, jrng and jupp.
Notice if you forget to set lcon and ucon, there will be no
constraints, even though c is set. This is because the number of
constraints is taken from the lenght of these vectors.

Now, to access these constraints, let’s consider this simple problem.

nlp = ADNLPModel(f, x0, c=x->[x[1]*x[2] - 0.5], lcon=[0.0], ucon=[0.0])

The function cons return $c(x)$.

cons(nlp, nlp.meta.x0)

The function jac returns the Jacobian of $c$. jprod and jtprod the
Jacobian product times a vector, and jac_op the LinearOperator.

jac(nlp, nlp.meta.x0)
jprod(nlp, nlp.meta.x0, ones(2))
jtprod(nlp, nlp.meta.x0, ones(1))
J = jac_op(nlp, nlp.meta.x0)
J * ones(2)
J' * ones(1)

To get the Hessian we’ll use the same functions as the unconstrained case,
with the addition of a keyword parameter y.

y = [1e4]
hess(nlp, nlp.meta.x0, y=y)
hprod(nlp, nlp.meat.x0, ones(2))
H = hess_op(nlp, nlp.meta.x0, y=y)
H * ones(2)

If you want to ignore the objective function, or scale it by some value,
you can use the keyword parameter obj_weight.

s = 0.0
hess(nlp, nlp.meta.x0, y=y, obj_weight=s)
hprod(nlp, nlp.meat.x0, ones(2), obj_weight=s)
H = hess_op(nlp, nlp.meta.x0, y=y, obj_weight=s)
H * ones(2)

Check the
API
for more details.

We can also create a constrained JuMP model.

x0 = [-1.2; 1.0]
jmp = Model()
@variable(jmp, x[i=1:2], start=(x0[i]))
@NLobjective(jmp, Min, (x[1] - 1)^2 + 100*(x[2] - x[1]^2)^2)
@NLcontraint(jmp, x[1]*x[2] == 0.5)
mpbnlp = MathProgNLPModel(jmp)
cons(mpbnlp, mpbnlp.meta.x0)
jac(mpbnlp, mpbnlp.meta.x0)
hess(mpbnlp, mpbnlp.meta.x0, y=y)

And again, the access in CUTEst problems is the same.

clp = CUTEstModel("BT1")
cons(clp, clp.meta.x0)
jac(clp, clp.meta.x0)
hess(clp, clp.meta.x0, y=clp.meta.y0)
finalize(clp)

Convenience functions

There are some convenience functions to check whether a problem has only
equalities, only bounds, etc.
For clarification, we’re gonna say function constraint to refer to constraints that are not bounds.

has_bounds: Returns true is variable has bounds.
bound_constrained: Returns true if has_bounds and no function
constraints;
unconstrained: No function constraints nor bounds;
linearly_constrained: There are function constraints, and they are
linear; obs: even though a bound_constrained problem is linearly
constrained, this will return false.
equality_constrained: There are function constraints, and they are all equalities;
inequality_constrained: There are function constraints, and they are all inequalities;

Example solver

Let’s implement a “simple” solver for constrained optimization.
Our solver will loosely follow the Byrd-Omojokun implementation of

M. Lalee, J. Nocedal, and T. Plantenga. On the implementation of an algorithm for large-scale equality constrained optimization. SIAM J. Optim., Vol. 8, No. 3, pp. 682-706, 1998.

function solver(nlp :: AbstractNLPModel)
  if !equality_constrained(nlp)
    error("This solver is for equality constrained problems")
  elseif has_bounds(nlp)
    error("Can't handle bounds")
  end

  x = nlp.meta.x0

  fx = obj(nlp, x)
  cx = cons(nlp, x)

  ∇fx = grad(nlp, x)
  Jx = jac_op(nlp, x)

  λ = cgls(Jx', -∇fx)[1]
  ∇ℓx = ∇fx + Jx'*λ
  norm∇ℓx = norm(∇ℓx)

  Δ = max(0.1, min(100.0, 10norm∇ℓx))
  μ = 1
  v = zeros(nlp.meta.nvar)

  iter = 0
  while (norm∇ℓx > 1e-4 || norm(cx) > 1e-4) && (iter < 10000)
    # Vertical step
    if norm(cx) > 1e-4
      v = cg(Jx'*Jx, -Jx'*cx, radius=0.8Δ)[1]
      Δp = sqrt(Δ^2 - dot(v,v))
    else
      fill!(v, 0)
      Δp = Δ
    end

    # Horizontal step
    # Simplified to consider only ∇ℓx = proj(∇f, Nu(A))
    B = hess_op(nlp, x, y=λ)
    B∇ℓx = B * ∇ℓx
    gtBg = dot(∇ℓx, B∇ℓx)
    gtγ = dot(∇ℓx, ∇fx + B * v)
    t = if gtBg <= 0
      norm∇ℓx > 0 ? Δp/norm∇ℓx : 0.0
    else
      t = min(gtγ/gtBg, Δp/norm∇ℓx)
    end

    d = v - t * ∇ℓx

    # Trial step acceptance
    xt = x + d
    ft = obj(nlp, xt)
    ct = cons(nlp, xt)
    γ = dot(d, ∇fx) + 0.5*dot(d, B * d)
    θ = norm(cx) - norm(Jx * d + cx)
    normλ = norm(λ, Inf)
    if θ <= 0
      μ = normλ
    elseif normλ > γ/θ
      μ = min(normλ, 0.1 + γ/θ)
    else
      μ = 0.1 + γ/θ
    end
    Pred = -γ + μ * θ
    Ared = fx - ft + μ * (norm(cx) - norm(ct))

    ρ = Ared/Pred
    if ρ > 1e-2
      x .= xt
      fx = ft
      cx .= ct
      ∇fx = grad(nlp, x)
      Jx = jac_op(nlp, x)
      λ = cgls(Jx', -∇fx)[1]
      ∇ℓx = ∇fx + Jx'*λ
      norm∇ℓx = norm(∇ℓx)
      if ρ > 0.75 && norm(d) > 0.99Δ
        Δ *= 2.0
      end
    else
      Δ *= 0.5
    end

    iter += 1
  end

  return x, fx, norm∇ℓx, norm(cx)
end

Too loosely, in fact.

The horizontal step computes only the Cauchy step;
No special updates;
No second-order correction;
No efficient implementation beyond the easy-to-do.

To test how good it is, let’s run on the Hock-Schittkowski constrained problems.

function runcutest()
  problems = filter(x->contains(x, "HS") && length(x) <= 5, CUTEst.select(only_free_var=true, only_equ_con=true))
  sort!(problems)
  @printf("%-7s  %15s  %15s  %15s\n",
          "Problem", "f(x)", "‖∇ℓ(x,λ)‖", "‖c(x)‖")
  for p in problems
    nlp = CUTEstModel(p)
    try
      x, fx, nlx, ncx = solver(nlp)
      @printf("%-7s  %15.8e  %15.8e  %15.8e\n", p, fx, nlx, ncx)
    catch
      @printf("%-7s  %s\n", p, "failure")
    finally
      finalize(nlp)
    end
  end
end

I’m gonna print the output of this one, so you can compare it with yours.

Problem             f(x)        ‖∇ℓ(x,λ)‖           ‖c(x)‖
HS26      5.15931251e-07   9.88009545e-05   5.24359322e-05
HS27      4.00000164e-02   5.13264248e-05   2.26312672e-09
HS28      7.00144545e-09   9.46563681e-05   2.44249065e-15
HS39     -1.00000010e+00   1.99856691e-08   1.61607518e-07
HS40     -2.50011760e-01   4.52797064e-05   2.53246505e-05
HS42      1.38577292e+01   5.06661945e-05   5.33092868e-05
HS46      3.56533430e-06   9.98827045e-05   8.00086215e-05
HS47      3.53637757e-07   9.71339790e-05   7.70496596e-05
HS48      4.65110036e-10   4.85457139e-05   2.27798719e-15
HS49      3.14248189e-06   9.94899395e-05   2.27488138e-13
HS50      1.36244906e-12   2.16913725e-06   2.90632554e-14
HS51      1.58249170e-09   8.52213221e-05   6.52675179e-15
HS52      5.32664756e+00   3.35626559e-05   3.21155766e-14
HS56     -3.45604528e+00   9.91076239e-05   3.14471179e-05
HS6       5.93063756e-13   6.88804464e-07   9.61311292e-06
HS61     -1.43646176e+02   1.06116455e-05   1.80421875e-05
HS7      -1.73205088e+00   1.23808109e-11   2.60442422e-07
HS77      2.41501014e-01   8.31210333e-05   7.75367223e-05
HS78     -2.91972281e+00   2.27102179e-05   2.88776440e-05
HS79      7.87776482e-02   4.77319205e-05   7.55827729e-05
HS8      -1.00000000e+00   0.00000000e+00   2.39989802e-06
HS9      -5.00000000e-01   1.23438228e-06   3.55271368e-15

If you compare against the Hock-Schitkowski paper, you’ll see that
the method converged for all 22 problems.
Considering our simplifications, this is a very exciting.

That’s all for now. Use our RSS feed to keep updated.

NLPModels.jl, CUTEst.jl and other Nonlinear Optimization Packages on Julia

By: Abel Soares Siqueira

Re-posted from: http://abelsiqueira.github.io/nlpmodelsjl-cutestjl-and-other-nonlinear-optimization-packages-on-julia/

A couple of weeks ago me and Professor Dominique Orban have finally made a release of
CUTEst.jl, a wrapper for the CUTEst repository of problems for nonlinear
optimization (which I’ve mentioned before).
Along with this release, we’ve done a release of NLPModels.jl, the underlying
package. I think it’s time I explain a little about these packages, others,
and how to use them together.
If you want to see the output of the commands, you can open
this ASCIInema
side by side.

Obs.: Tutorial using Julia 0.5.0

Edit: Second part is
here.

JuliaSmoothOptimizers

Most packages mentioned here will be a part of the JuliaSmoothOptimizers (JSO)
organization. There are more packages in the organization that I won’t mention here, but you should check it out.

NLPModels.jl

NLPModels is a package for creating Nonlinear Optimization Models. It is
focused on the needs of the solver writer, such as the ability to write one
solver that works on many models.
The package defines a few models, and there are others on the horizon.
The ones already done are:

ADNLPModel: A model with automatic differentiation;
MathProgNLPModel: A model for MathProgBase/JuMP conversion, whose utility will be shown below (obs: MPB and JuMP are packages from the JuliaOpt organization);
SimpleNLPModel: A model in which nothing is automatic, i.e., all functions have to be provided by the user.
SlackModel: A model that changes all inequalities to equalities adding extra variables;
LBFGSModel and LSR1Model: Models that create quasi-Newton models from another model.

The first two models are designed to be easy to use; the third is useful for
efficient model creation in specific cases; the last ones are utility models.

Let’s begin by installing NLPModels.jl, and a couple of optional requirements.

Pkg.add("NLPModels.jl")
Pkg.add("JuMP.jl") # Installs ForwardDiff also.

This should install version 0.1.0. After that, just do

using NLPModels

Now, let’s create a simple function: Rosenbrock’s.

f(x) = (x[1] - 1)^2 + 100*(x[2] - x[1]^2)^2

The Rosenbrock problem traditionally starts from $(-1.2,1.0)$.

x0 = [-1.2; 1.0]

Now, we are ready to create the problem.

adnlp = ADNLPModel(f, x0)

Now, we can access the function and derivatives using the NLPModels API

obj(adnlp, adnlp.meta.x0)
grad(adnlp, adnlp.meta.x0)
hess(adnlp, adnlp.meta.x0)
objgrad(adnlp, adnlp.meta.x0)
hprod(adnlp, adnlp.meta.x0, ones(2))
H = hess_op(adnlp, adnlp.meta.x0)
H * ones(2)

At this point, we can’t differentiate many things from simply using
ForwardDiff interface directly, but two things stand out: objgrad returns
both functions at once, and hess_op returns a
LinearOperator,
another structure created in JuliaSmoothOptimizers.
This one defines a linear operator, extending Julia matrices in the sense that if

using LinearOperators
n = 100
A = rand(n, n)
B = rand(n, n)
opA = LinearOperator(A)
opB = LinearOperator(B)
v = rand(n)

then (A * B) * v computes the matrix product, whereas (opA * opB) * v won’t.
Furthermore, the linear operator can be created from the functions
v->Mp(v) and v->Mtp(v), defining the product of the linear operator times a vector and its transpose times a vector.

T = LinearOperator(2, 2, # sizes
                   false, false,
                   v->[-v[2]; v[1]], v->[v[2]; -v[1]])
v = rand(2)
T * v
T' * v

Obs: In the ADNLPModel case, hess_op returns a linear operator that is actually
computing the matrix, but this is a issue to be tackled on the future (PRs
welcome). But we’ll be back with uses for hess_op soon.

The next model is the MathProgNLPModel. This model’s main use is the JuMP
modelling language. This is very useful for more elaborate writing, specially
with constraints. It does create a little more overhead though, so keep that
in mind.

using JuMP
jmp = Model()
@variable(jmp, x[i=1:2], start=(x0[i])) # x0 from before
@NLobjective(jmp, Min, (x[1] - 1)^2 + 100*(x[2] - x[1]^2)^2)
mpbnlp = MathProgNLPModel(jmp)

Try the commands again.

obj(mpbnlp, mpbnlp.meta.x0)
grad(mpbnlp, mpbnlp.meta.x0)
hess(mpbnlp, mpbnlp.meta.x0)
objgrad(mpbnlp, mpbnlp.meta.x0)
hprod(mpbnlp, mpbnlp.meta.x0, ones(2))
H = hess_op(mpbnlp, mpbnlp.meta.x0)
H * ones(2)

It should be pretty much the same, though there is a little difference in hess.
JuMP creates the sparse Hessian, which is better, from a computational point of
view.

Notice how the commands are the same. I’ve actually copy-pasted the commands
from above.
This allows the write of a solver in just a couple of commands.
For instance, a simple Newton method.

function newton(nlp :: AbstractNLPModel)
  x = nlp.meta.x0
  fx = obj(nlp, x)
  gx = grad(nlp, x)
  ngx = norm(gx)
  while ngx > 1e-6
    Hx = hess(nlp, x)
    d = -gx
    try
      G = chol(Hermitian(Hx, :L)) # Make Cholesky work on lower-only matrix.
      d = -G\(G'\gx)
    catch e
      if !isa(e, Base.LinAlg.PosDefException); rethrow(e); end
    end
    t = 1.0
    xt = x + t * d
    ft = obj(nlp, xt)
    while ft > fx + 0.5 * t * dot(gx, d)
      t *= 0.5
      xt = x + t * d
      ft = obj(nlp, xt)
    end
    x = xt
    fx = ft
    gx = grad(nlp, x)
    ngx = norm(gx)
  end
  return x, fx, ngx
end

And we run in the problems with

newton(adnlp)
newton(mpbnlp)

Write once, use on problems from different sources.

Now, to have more fun, let’s get another package:
OptimizationProblems.jl.
This package doesn’t have a release yet, so we have to clone it:

Pkg.clone("https://github.com/JuliaSmoothOptimizers/OptimizationProblems.jl")

What we have here is a collection of JuMP models implementing some of the
CUTEst problems. Together with NLPModels.jl, we have a good opportunity to test our Newton implementation.

using OptimizationProblems

x, fx, ngx = newton(MathProgNLPModel( rosenbrock() ))
x, fx, ngx = newton(MathProgNLPModel( dixmaanj() ))
x, fx, ngx = newton(MathProgNLPModel( brownbs() ))

An issue with OptimizationProblems is that it still doesn’t have a way to get
all unconstrained problems, for instance. (PRs are welcome).

So far we used 3 packages from JSO: NLPModels.jl, LinearOperators.jl and OptimizationProblems.jl. It’s time to meet another important package.

CUTEst.jl

CUTEst, the Constrained and Unconstrained Testing Environment with safe
threads, is a package written in Fortran providing over a thousand problems to
allow testing of Nonlinear Programming solvers. However, CUTEst is hard to use
by first-timers. Just installing it was already hard.
CUTEst.jl provides an interface for CUTEst that is simple to install and use
(comparing to the original).

Obs.: CUTEst.jl does not work on Windows for now. In fact, there is no plan to
make it work on Windows because “people interested in doing it”∩”people capable
of doing it” = ∅, as far as we’ve looked. If you are in this set, PRs are
welcome.

To install CUTEst.jl you need to install something manually. Unfortunately,
this is specific for each system. Except for OSX, actually, which is using
homebrew-cutest.

For Linux users, check out this
page.
Essentially, we need libgfortran.so in a visible place. And it’s especially
annoying that some distritions don’t put it in a visible place.

With that done, enter

Pkg.add("CUTEst")

which should install CUTEst.jl 0.1.0.

Yes, it takes some time.

Finally, we start using CUTEst with

using CUTEst

nlp = CUTEstModel("ROSENBR")

ROSENBR is a CUTEst problem, in case you want the list, see
here. Keep reading for a way
to select them, though.

Now, let’s solve this CUTEst problem with our Newton method.

x, fx, ngx = newton(nlp)

Yes, exactly like before!.

CUTEst is a little more annoying in other aspect also. Like, you can’t have two
or more problems open at the same time, and you have to close this problem
before opening a new one. (Again, PRs are welcome).

finalize(nlp)
nlp = CUTEstModel("HIMMELBB")
x, fx, ngx = newton(nlp)
finalize(nlp)

This allows a simple workflow for writing optimization solvers.

Write some problems by hand (using ADNLPModel or MathProgNLPModel);
Test your solvers with these hand-written problems;
Repeat last two steps until you believe you’re ready to competitive comparison;
Test with CUTEst problems seamlessly.

Now, let’s get back to hess_op. Remember that it used Matrix vector products?
Well, CUTEst has separate functions for the product of the Hessian at a point
and a vector. Which means that hprod actually computes this product without
having to create the matrix. Which means it is, at least, memory-efficient.
Furthermore, hess_op will be created with the hprod function, which means
it is also memory-efficient.

Let’s look at a huge problem to feel the difference.

nlp = CUTEstModel("BOX")
nlp.meta.nvar

Let’s make a simple comparison

function foo1()
  H = hess(nlp, nlp.meta.x0)
  v = ones(nlp.meta.nvar)
  return Hermitian(H, :L) * v
end

function foo2()
  H = hess_op(nlp, nlp.meta.x0)
  v = ones(nlp.meta.nvar)
  return H * v
end

@time w1 = foo1();
@time w2 = foo2();
norm(w1 - w2)

Yes, that’s a huge difference.

This is a very good reason to use hess_op and hprod. But let’s take a step further.

We can’t implement Cholesky using only hprods, so our Newton method would
actually take a long time to reach a solution for the problem above.
To circunvent that, we could try using the Conjugate Gradients Method instead
of Cholesky. This would only use Hessian-vector products.

We arrive on a new package,
Krylov.jl, which
implements Krylov methods. In particular, Conjugate Gradients.
This package is also unreleased, so we need to clone it.

Pkg.clone("https://github.com/JuliaSmoothOptimizers/Krylov.jl")

Consider a simple example

using Krylov
A = rand(3,3)
A = A*A'
b = A*ones(3)
cg(A, b)

As expected, the system is solver, and the solution is $(1,1,1)$.
But let’s do something more.

A = -A
cg(A, b)

Yes, Krylov does indeed solves the non-positive definite system using Conjugate Gradient.
Well, actually, a variant.

That’s not enough tough. Krylov.jl also accepts an additional argument radius
to set a trust-region radius.

cg(A, b, radius=0.1)

Well, as an exercise I suggest you implement a trust-region version of Newton
method, but for now, let’s continue with our line-search version.

We know now how cg behaves for non-positive definite systems, we can’t make
the changes for a new method.

function newton2(nlp :: AbstractNLPModel)
  x = nlp.meta.x0
  fx = obj(nlp, x)
  gx = grad(nlp, x)
  ngx = norm(gx)
  while norm(gx) > 1e-6
    Hx = hess_op(nlp, x)
    d, _ = cg(Hx, -gx)
    slope = dot(gx, d)
    if slope >= 0 # Not a descent direction
      d = -gx
      slope = -dot(d,d)
    end
    t = 1.0
    xt = x + t * d
    ft = obj(nlp, xt)
    while ft > fx + 0.5 * t * slope
      t *= 0.5
      xt = x + t * d
      ft = obj(nlp, xt)
    end
    x = xt
    fx = ft
    gx = grad(nlp, x)
    ngx = norm(gx)
  end
  return x, fx, ngx
end

Now, running newton2 on our large problem, we obtain

x, fx, ngx = newton2(nlp)

Which is the method working very fast. Less that a second here.

There is actually another package I’d like to talk about, but it needs some
more work for it to be ready for a release:

Optimize.jl

Optimize.jl is a package with solvers. We intend to implement some high quality
solvers in there, but there is actually more to it. We have in there tools to
benchmark packages. These tools should allow the testing of a set of solvers in
a set of problems without much fuss, while creating the comparison information,
including the performance profile.
It also includes, or will include, “parts” of solvers to create your own
solver. Like trust-region and line-search algorithms and auxiliary functions
and types.
Unfortunately, it’s not done enough for me to extend on it, and this is already
getting too long.

End

I hope you enjoyed this overview of packages.
Subscribe to the RSS feed to keep updated in future tutorials. I intend to talk
about the constrained part of NLPModels soon.

Julia Fractal on Julia

By: Abel Soares Siqueira

Re-posted from: http://abelsiqueira.github.io/julia-fractal-on-julia/

I wanted a background that would update automatically in some
interesting way, instead of just random images.
After some thought, I decided to use some Julia
fractals.

I made a code to create the Julia fractals in the Julia
language, and then some code to run it for a random
point.

The code is here, including some
explaining on how to use and install it.

Here are some examples: