Automatic differentiation of discontinuous integrals

By: Tamás K. Papp

Re-posted from: https://tamaspapp.eu/post/discontinuous_integral_ad/

This is a much simplified, self-contained writeup of a problem I encountered.

You want to approximate the integral

\[
I(\theta) = \int 1(g(x) > 0) f(x,\theta) dx
\]

where \(g\) is a continuous function, and \(f(x,\theta)\) is a parametric density over \(x\). Everything is continuous, and thus \(I\) is, too.

You face the following constraints:

  1. \(g\) is a black box: pretend that you can’t invert it (except for checking the results, of course).

  2. You can evaluate the probability density function \(f\) and even draw \(x\)s for a particular \(\theta\), but that’s pretty much it. You don’t even get a cdf! (Again, except for checking the results.)

Using Monte Carlo methods, you can do the following:

  1. draw \(x_i \sim F(\cdot, \theta)\), \(i=1,\dots,N\),

  2. approximate

\[
I(\theta) \approx \frac{1}{N}\sum_i 1(g(x_i) > 0)
\]

You could code this in Julia as

d = distr(θ)   # suppose this returns some distribution that supports Base.rand
x = rand(d, N)
mean(g.(x) .> 0)

So far, neat and simple. Now, the fly in the ointment: you need the derivative \(I'(\theta)\) for optimization or Hamiltonian Monte Carlo. The problem is that you cannot ForwardDiff your way through the code above: AD’ing a discontinuous step function will just give you \(0\), and rand does not work with ForwardDiff.Duals anyway (which is very sensible).

However, there is a solution: rewrite the approximation as

\[
I(\theta; \theta_0) \approx \frac{1}{N}\sum_i 1(g(x_i) > 0) \frac{f(x_i,\theta)}{f(x_i,\theta_0)}
\]

where \(\theta_0\) is the parameter used to simulate the \(x_i\). Differentiate the above at \(\theta = \theta_0\). This approximates

\[
I'(\theta) = \int 1(g(x) > 0) \frac{\partial f(x,\theta)/\partial \theta}{f(x,\theta)}f(x,\theta) dx = \int 1(g(x) > 0) \partial f(x,\theta)/\partial \theta dx
\]

which is exactly what you want.
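As a sanity check on this identity, here is a minimal Monte Carlo sketch in Python (a language-agnostic illustration, not the Julia implementation below): for \(g(x) = x\) and \(x \sim \text{Normal}(\theta, 1)\), differentiating the ratio \(f(x,\theta)/f(x,\theta_0)\) at \(\theta = \theta_0\) gives the score \(\partial \log f/\partial \theta = x - \theta\), so the derivative estimate is the average of the indicator times the score.

```python
import math
import random

def deriv_estimate(theta, n=200_000, seed=42):
    """Monte Carlo estimate of I'(theta) for g(x) = x, x ~ Normal(theta, 1).

    Differentiating f(x, theta) / f(x, theta0) at theta = theta0 gives the
    score d log f / d theta, which equals (x - theta) for this density.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(theta, 1.0)
        if x > 0:                       # the indicator 1(g(x) > 0)
            total += x - theta          # times the score
    return total / n

# Analytical check: I(theta) = P(x > 0), so I'(theta) is the pdf at 0,
# which equals the standard normal density evaluated at theta.
theta = 0.5
analytic = math.exp(-theta ** 2 / 2) / math.sqrt(2 * math.pi)
estimate = deriv_estimate(theta)
```

For \(\theta = 0.5\) the analytical value is about \(0.352\), and the estimate should agree up to Monte Carlo error.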

Assume that the actual calculation is very complicated, so we would rather avoid implementing it for the integral and the derivative separately. It turns out that this is very simple to do with ForwardDiff.Dual values: the code is literally a one-liner and a fallback method:

elasticity(x::Real) = one(x)
elasticity(x::Dual{T,V,N}) where {T,V,N} = Dual{T}(one(V), partials(x) / value(x))

which you can use in a function like

integral_approx(g, d, x) = mean((g.(x) .> 0) .* elasticity.(pdf.(d, x)))
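The elasticity trick itself is not Julia-specific. A hand-rolled dual number in Python shows what is going on (a sketch of the idea, not of ForwardDiff’s internals; the Dual class here is mine): elasticity replaces the value by 1 while keeping the logarithmic derivative \(x'/x\), so multiplying by it attaches \(\partial f / f\) to the indicator.

```python
class Dual:
    """Minimal forward-mode dual number carrying one partial derivative."""

    def __init__(self, val, eps):
        self.val = val   # primal value
        self.eps = eps   # derivative part

    def __mul__(self, other):
        if isinstance(other, Dual):
            return Dual(self.val * other.val,
                        self.val * other.eps + self.eps * other.val)
        return Dual(self.val * other, self.eps * other)

    __rmul__ = __mul__

def elasticity(x):
    # Like the one-liner above: x/x == 1 as a value, with derivative x'/x.
    if isinstance(x, Dual):
        return Dual(1.0, x.eps / x.val)
    return 1.0

indicator = 1.0                   # 1(g(x) > 0) for a draw inside the region
w = elasticity(Dual(2.0, 3.0))    # pdf value 2.0, derivative 3.0 w.r.t. theta
term = indicator * w              # contributes value 1 and derivative 3/2
```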

I demonstrate this with \(g(x) = x\) and \(x \sim \text{Normal}(\theta, 1)\), for which we know the analytical values: \(I\) is the right tail probability at \(0\) and \(I'\) is the pdf at \(0\).

Graphs below show that the approximation is reasonable — we could make it much better with low-discrepancy sequences, but that is an orthogonal issue.

[Figure: integral, analytical vs approximation]

[Figure: derivative, analytical vs approximation]

It is amazing how much you can accomplish with two lines of code in Julia! The problem that motivated this blog post is multivariate, with an irregular region \(\{ x: g(x) > 0 \}\), but I used elasticity exactly as above.

Self-contained code for everything is available below.

download as code.jl

using ForwardDiff
import ForwardDiff: Dual, value, partials, Tag
using Distributions
using Plots
using LaTeXStrings
gr()

######################################################################
# elasticity calculation
######################################################################

"""
    elasticity(x)

Calculate `x/x`, stripping `x` of partial derivative information. Useful for
calculating elasticities of the form `∂f/f` using ForwardDiff.
"""
elasticity(x::Real) = one(x)

elasticity(x::Dual{T,V,N}) where {T,V,N} = Dual{T}(one(V), partials(x) / value(x))

######################################################################
# example application
######################################################################

integral_approx(g, d, x) = mean((g.(x) .> 0) .* elasticity.(pdf.(d, x)))

"Helper function that returns the value and derivative at the same time."
function value_and_derivative(f::F, x::R) where {F,R<:Real}
    T = typeof(Tag(F, R))
    y = f(Dual{T}(x, one(x)))
    value(y), partials(y, 1)
end

distr(θ) = Normal(θ, 1)

g(x) = x

function ID_analytical(θ)
    d = distr(θ)
    ccdf(d, 0), pdf(d, 0)
end

function ID_approx(θ)
    x = rand(distr(θ), 1000)
    value_and_derivative(θ->integral_approx(g, distr(θ), x), θ)
end

θs = linspace(-2, 2, 51)
ID0s = ID_analytical.(θs)
ID1s = ID_approx.(θs)

scatter(θs, first.(ID0s), xlab = "\\theta", ylab = "integral",
        label = "analytical", legend = :topleft)
scatter!(θs, first.(ID1s), label = "approximation")
savefig("integral.svg")

scatter(θs, last.(ID0s), xlab = "\\theta", ylab = "derivative",
        label = "analytical", legend = :topleft)
scatter!(θs, last.(ID1s), label = "approximation")
savefig("derivative.svg")

log1p in Julia

By: Tamás K. Papp

Re-posted from: https://tamaspapp.eu/post/log1p/

edit: fixed bogus interaction of MathJax and code highlighting.

edit2: added benchmarks.

This is a follow-up on a question I asked on the Julia forums about calculating

\[
\text{log1p}(x) = \log(1+x)
\]

This calculation is tricky because if \(x \approx 0\),

\[
\log(1+x) \approx x
\]

while as \(x \to \infty\), \(\log(1+x)\) approaches \(\log(x)\). Near \(0\), forming \(1+x\) in floating point rounds away the low-order digits of \(x\), so simply using log(1+x) will not be as accurate as it could be. Numerical analysts have developed specialized methods that deliver an accurate answer, and all programming languages serious about numerical computing have an implementation either in the core language or a library.
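The loss of accuracy near \(0\) is easy to demonstrate in any language; in Python, math.log1p plays the role of the specialized routine:

```python
import math

x = 1e-10
naive = math.log(1 + x)     # 1 + x rounds to double first, losing digits
accurate = math.log1p(x)    # specialized routine, accurate near 0

# For tiny x, log(1 + x) = x - x**2/2 + ..., so the result should be almost
# exactly x; the naive version disagrees around the 8th significant digit.
rel_discrepancy = abs(naive - accurate) / accurate
```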

Julia’s Base.log1p currently suggests that Base.Math.JuliaLibm.log1p would be preferable, which made me wonder: why isn’t that the default? So I decided to perform a simple numerical experiment, calculating the error for both methods and benchmarking them.

Accuracy

The key question is what to compare the results with. One could compare to an existing "gold standard" implementation, or simply calculate the results using a higher precision representation. Fortunately, Julia has BigFloats available right out of the box.
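As an illustration of the idea (not of the Julia code below), the comparison can be sketched in Python, with the standard decimal module standing in for BigFloat as the high-precision reference:

```python
import math
from decimal import Decimal, getcontext

getcontext().prec = 50              # ~50 significant decimal digits

def rel_error(approx, x):
    """Relative error of approx as an estimate of log(1 + x)."""
    reference = (Decimal(1) + Decimal(x)).ln()   # high-precision reference
    return abs(Decimal(approx) - reference) / abs(reference)

x = 1e-8
err_log1p = rel_error(math.log1p(x), x)    # near machine epsilon
err_naive = rel_error(math.log(1 + x), x)  # much larger near 0
```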

The graph below shows the base-2 logarithm of the relative error of Base.log1p, plotted against \(\log_2(1+x)\); the horizontal lines are log2(eps()) and log2(eps())+1. This suggests that Base.log1p is very accurate, but not as good as it could be when \(x \approx 0\).

[Figure: log2 relative error of Base.log1p]

The next plot shows the difference between the two log2 relative errors, comparing Base.Math.JuliaLibm.log1p to Base.log1p (lower values are better). In these simulations, Base.Math.JuliaLibm.log1p is never worse, but sometimes significantly better (an extra binary digit of accuracy), especially when \(x \approx 0\).

[Figure: log2 accuracy improvement of Base.Math.JuliaLibm.log1p over Base.log1p]

The next plot confirms this.

[Figure: log2 relative error of Base.Math.JuliaLibm.log1p]

Speed

I also evaluated relative speed. The graph below shows the relative runtimes, obtained using BenchmarkTools.@belapsed. Values below \(1\) mean that Base.Math.JuliaLibm.log1p is faster: indeed, this seems to be the case except for values very close to \(0\), where there is a 10–20% performance penalty. At other values, Base.Math.JuliaLibm.log1p is 30–40% faster.

[Figure: relative running time]
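BenchmarkTools.@belapsed has no direct Python analogue, but timeit conveys the same best-of-several-runs methodology. A sketch (absolute numbers are machine-dependent; only the shape of the experiment carries over):

```python
import math
import timeit

def bench(fun, x, number=20_000):
    """Best-of-5 average time per call, in seconds, for a 1-argument function."""
    timer = timeit.Timer(lambda: fun(x))
    return min(timer.repeat(repeat=5, number=number)) / number

# Benchmark at a value near 0 and at values away from it.
for x in (1e-10, 1.0, 1e10):
    t_naive = bench(lambda v: math.log(1 + v), x)
    t_log1p = bench(math.log1p, x)
    print(f"x = {x:>8g}: log1p/naive time ratio = {t_log1p / t_naive:.2f}")
```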

Conclusion

  1. For values near \(0\), Base.Math.JuliaLibm.log1p is more accurate, at a slight performance cost.

  2. For values away from \(0\), it is at least as accurate as Base.log1p, and 30–40% faster.

To me, this suggests that Base.Math.JuliaLibm.log1p should be the default method — mostly because the extra accuracy is more important to me than the slight performance cost.

Code is available below.

download as code.jl

# consistent random numbers
srand(UInt32[0xfd909253, 0x7859c364, 0x7cd42419, 0x4c06a3b6])

"""
    err(x, [prec])

Return two values, which are the log2 relative errors for calculating
`log1p(x)`, using `Base.log1p` and `Base.Math.JuliaLibm.log1p`.

The errors are calculated by comparing to `BigFloat` calculations with the given
precision `prec`.
"""
function err(x, prec = 1024)
    yb = log(1+BigFloat(x, prec))
    e2(y) = Float64(log2(abs(y-yb)/abs(yb)))
    e2(log1p(x)), e2(Base.Math.JuliaLibm.log1p(x))
end

z = exp.(randn(20000)*10)       # z > 0, Lognormal(0, 10)
x = z .- 1                      # x > -1
es = map(err, x)                # errors

using Plots; gr()               # plots

scatter(log2.(z), first.(es), xlab = "log2(x+1)", ylab = "log2 error of Base.log1p",
        legend = false)
hline!(log2(eps())-[0,1])
Plots.svg("Base_log1p_error.svg")
scatter(log2.(z), last.(es) .- first.(es), xlab = "log2(x+1)",
        ylab = "improvement (Base.Math.JuliaLibm.log1p)", legend = false)
Plots.svg("JuliaLibm_improvement.svg")
scatter(log2.(z), last.(es), xlab = "log2(x+1)",
        ylab = "log2 error of Base.Math.JuliaLibm.log1p", legend = false)
hline!(log2(eps())-[0,1])
Plots.svg("JuliaLibm_log1p_error.svg")

######################################################################
# WARNING: these run for a very long time
######################################################################
using BenchmarkTools

z = exp.(vcat(randn(200)*10, rand(200)*0.1)) # z > 0, more values around 0
x = z .- 1                                   # x > -1
b1 = [@belapsed log1p($x) for x in x]        # WARNING: takes forever
b2 = [@belapsed Base.Math.JuliaLibm.log1p($x) for x in x] # ditto

scatter(log2.(z), b2 ./ b1, xlab = "log2(x+1)",
        ylab = "time Math.JuliaLibm.log1p / log1p", legend = false, yticks = 0:0.2:1.2)
hline!([1])
Plots.svg("relative_time.svg")

Emacs customizations for julia-mode

By: Tamás K. Papp

Re-posted from: https://tamaspapp.eu/post/emacs-julia-customizations/

I find the following customizations very useful for editing Julia
code in Emacs. Add them to julia-mode-hook, e.g.

(defun customize-julia-mode ()
  "Customize julia-mode."
  (interactive)
  ;; my customizations go here
  )

(add-hook 'julia-mode-hook 'customize-julia-mode)

Highlight FIXME/TODO/…

When I just want to note something in a comment for future reference,
I prefer to have certain words highlighted. You can use something like
this:

(font-lock-add-keywords nil
                        '(("\\<\\(FIXME\\|TODO\\|QUESTION\\|NOTE\\)"
                        1 font-lock-warning-face t)))

This is what it looks like:

            chklapackerror(info[])
            if any(ifail .!= 0)
                # TODO: better error message / type
                error("failed to converge eigenvectors:\n$(nonzeros(ifail))")
            end

Highlight symbols

After

(require 'highlight-symbol)

add a hook for

(local-set-key [(control ?c) ?s] 'highlight-symbol-at-point)
(local-set-key [(control ?c) ?n] 'highlight-symbol-next)
(local-set-key [(control ?c) ?p] 'highlight-symbol-prev)

This highlights symbols with C-c s:

function issymmetric(A::AbstractMatrix)
    indsm, indsn = indices(A)
    if indsm != indsn
        return false
    end
    for i = first(indsn):last(indsn), j = (i):last(indsn)
        if A[i,j] != transpose(A[j,i])
            return false
        end
    end
    return true
end

Fill docstrings

This is useful if you want to use M-q on docstrings.

(defun julia-fill-string ()
  "Fill a docstring, preserving newlines before and after triple quotation marks."
  (interactive)
  (if (and transient-mark-mode mark-active)
      (fill-region (region-beginning) (region-end) nil t)
    (cl-flet ((fill-if-string ()
                              (when (or (looking-at (rx "\"\"\""
                                                        (group
                                                         (*? (or (not (any "\\"))
                                                                 (seq "\\" anything))))
                                                        "\"\"\""))
                                        (looking-at (rx "\""
                                                        (group
                                                         (*? (or (not (any "\\"))
                                                                 (seq "\\" anything))))
                                                        "\"")))
                                (let ((start (match-beginning 1))
                                      (end (match-end 1)))
                                  ;; (ess-blink-region start end)
                                  (fill-region start end nil nil nil)))))
      (save-excursion
        (let ((s (syntax-ppss)))
          (when (fourth s) (goto-char (ninth s))))
        (fill-if-string)))))

Add

(local-set-key (kbd "M-q") 'julia-fill-string)

to the mode hook.

Highlight things after column 80

I add this to the mode hook:

(set-fill-column 80)

I also use whitespace globally:

(require 'whitespace)
(setq whitespace-style '(face empty tabs lines-tail trailing))
(global-whitespace-mode t)

This is what it looks like:

    QR{T,S}(factors::AbstractMatrix{T}, τ::Vector{T}) where {T,S<:AbstractMatrix} = new(factors, τ)
end
QR(factors::AbstractMatrix{T}, τ::Vector{T}) where {T} = QR{T,typeof(factors)}(factors, τ)

Hungry delete-mode

Add this to the mode hook:

(hungry-delete-mode)

and then backspace and delete will remove all whitespace near the point in the relevant direction.

In case you are wondering, the theme is alect-light.