
Progress on Automated Reasoning for Catlab with Metatheory.jl Egraphs

By: Philip Zucker

Re-posted from: https://www.philipzucker.com/metatheory-progress/

Oh cursed equation! How you haunt me!

I’ve been working on mechanizing equational reasoning for categories for a while now, without much success.

The Egg paper came out and was very compelling to me. It presents an implementation of egraphs, a useful data structure for equational reasoning, with some novel twists. I tried my hand at implementing the egraph data structure here.

But Alessandro Cheli has come along and really taken things to the next level with his Metatheory.jl library. The package includes a significantly improved egraph implementation, plus a classical rewriting system. It also has some really interesting ideas in regards to homoiconicity and theory composition. Maybe it’s useful to you. Check it out.

The equational reasoning that occurs in Catlab is compelling because it feels like high school algebra, but describes significantly more powerful systems of reasoning. It also has a beautiful and intuitive diagrammatic interpretation.

Understanding how to take a system like egraph rewriting, which at the moment seems best suited to rewriting ordinary algebraic expressions, and tweak it to work with the more complicated logic of Catlab has been a challenge.

There are a couple problems.

  1. The typing context contains important information. Consider the rule $f \rightarrow id(a) \cdot f$. Looking only at the syntax of this rule, a is invented out of thin air on the right hand side. This information actually comes from the type of f::hom(a,b).
  2. We need to strip off overloading? $\otimes$ is a symbol used in Catlab for both the binary operator on objects and the one on morphisms, as is customary. I guess this might be fine? It makes me squeamish.
  3. Some rules require guarding on the type before execution. For most of the rules in Catlab, if the left hand side of the rule is well-typed, then so is the right. This is not the case in general though.
    Consider the trick question: Can you apply the interchange law (f ⊗ g) ⋅ (h ⊗ k) => (f ⋅ h) ⊗ (g ⋅ k) to the term (f ⊗ id(a)) ⋅ (id(b) ⊗ k)?
    No you can’t. For in my example, I have secretly decided that f actually has type b ⊗ c -> b ⊗ c and k has type c ⊗ a -> c ⊗ a.
    The other direction is always fine though. Given (f ⋅ h) ⊗ (g ⋅ k) is well typed, so is (f ⊗ g) ⋅ (h ⊗ k)
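Writing out the types in that secret example makes the failure concrete:

```latex
% LHS is well typed (up to associativity of \otimes):
f \otimes id(a) : (b \otimes c) \otimes a \to (b \otimes c) \otimes a
\qquad
id(b) \otimes k : b \otimes (c \otimes a) \to b \otimes (c \otimes a)
% Matching the interchange pattern binds g := id(a), h := id(b), so the
% rewritten term would contain the composite f \cdot id(b), which needs
cod(f) = b \otimes c \overset{?}{=} b = dom(id(b))
% and that fails. The lhs only guarantees the weaker combined equation
% cod(f) \otimes cod(id(a)) = dom(id(b)) \otimes dom(k).
```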

I thought way back that the types could be stripped and everything would be fine, like when you strip the types from the simply typed lambda calculus (STLC). This is not the case. I think this misconception is due to my inexperience doing hand calculations of this type (which almost no one does. String diagrams rule), plus I think you “code the happy path” when doing it by hand and may not even notice a situation where shallow pattern matching on the syntax would allow you to do something verboten, since you’re smart enough not to make that dumb mistake.

My initial thinking this time around was to take a type tagging approach https://people.mpi-inf.mpg.de/~jblanche/mono-trans.pdf. This is a standard way of encoding more sophisticated typed logics into simpler typed ones. Type tagging replaces every term ever used with a more complex term that contains the type. In a curious analogy, it is the analog of dynamic typing vs static typing. In dynamically typed languages, all values carry with them tags describing which type they are at runtime.

As an example, the identity id(a) could be tagged as type(id(type(a,ob)), hom(type(a,ob), type(a,ob))) or as hom(id(ob(a)), ob(a), ob(a)). You can see how verbose this is. It is not ideal both because it is unacceptable to write by hand, and because this extra structure is just more junk to choke your solver. But you can see that the necessary types to fill in missing pieces of the right hand side are now always available.
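To make the transformation concrete, here is a minimal sketch of a tagger over plain Julia Exprs. Everything here (the `typeof_term` helper, the Dict context, handling only `id`) is invented for illustration; it is not Catlab or Metatheory API, and it picks one of several possible tagging conventions.

```julia
# Hypothetical sketch of type tagging over plain Exprs (not Catlab's machinery).
# `ctx` maps generator symbols to their types.
typeof_term(s::Symbol, ctx) = ctx[s]
function typeof_term(e::Expr, ctx)
    f, args = e.args[1], e.args[2:end]
    f == :id && return :(hom($(args[1]), $(args[1])))  # id(a) : hom(a,a)
    error("no typing rule for $f in this sketch")
end

# Wrap every subterm in type(term, its_type).
tag(s::Symbol, ctx) = :(type($s, $(typeof_term(s, ctx))))
function tag(e::Expr, ctx)
    t = typeof_term(e, ctx)
    tagged_args = [tag(a, ctx) for a in e.args[2:end]]
    :(type($(Expr(:call, e.args[1], tagged_args...)), $t))
end

tag(:(id(a)), Dict(:a => :ob))
# gives :(type(id(type(a, ob)), hom(a, a))) — close to the verbose form above
```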

I tried this through some arcane metaprogramming methods involving Julog and it seemed too slow. Alessandro and James Fairbanks have a more elegant looking version of this here.

It feels insane that you need all this overhead when it is not necessary when working by hand. Many of the rules do not have this rhs-invention problem or the typed guarding requirement. For cartesian categories, only about 5 of 30 rules have this condition. $f \cdot (g \cdot h) \rightarrow (f \cdot g) \cdot h$ is totally fine, for example.

So here is idea 1: replace any new derived information on the right hand side with “function calls” that find that information. For our example, this becomes f => id(dom(type(f))) ⋅ f. Then we need to internalize the typing rules into the rewrite system:

typing_rules = @theory begin
    dom(hom(a,b)) => a
    cod(hom(a,b)) => b
    type(id(a)) => hom(a,a)
    type(f ⋅ g) => hom( dom(type(f)) , cod(type(g)) )
    type(f ⊗ₘ g) => hom( dom(type(f)) ⊗ₒ dom(type(g)) , cod(type(f)) ⊗ₒ cod(type(g)))
    type(a ⊗ₒ b) => :ob
    type(munit()) => :ob
    type(σ(a,b)) => hom(a ⊗ₒ b, b ⊗ₒ a)
    type(⋄(a)) => hom(a, munit())
    type(Δ(a)) => hom(a, a ⊗ₒ a)
    type(pair(f,g)) => hom(dom(type(f)), cod(type(f)) ⊗ₒ cod(type(g)))
    type(proj1(a,b)) => hom(a ⊗ₒ b, a)
    type(proj2(a,b)) => hom(a ⊗ₒ b, b)
end

Idea two is to use Metatheory “guards” to enforce type correctness of the left hand side before a rule can fire. As part of this, the necessary type queries are added to the egraph. Alessandro had this idea of just returning the original expression to enforce a guard, which I think is pretty clever. Here is how I think the guard works for the interchange law. This will be cleaner with an upcoming Metatheory macro.

    (f ⊗ₘ p) ⋅ (g ⊗ₘ q) |>
    # future metatheory macros will clean this up
    begin
        fcod = addexpr!(_egraph, :(cod(type($f))))
        gdom = addexpr!(_egraph, :(dom(type($g))))
        pcod = addexpr!(_egraph, :(cod(type($p))))
        qdom = addexpr!(_egraph, :(dom(type($q))))
        if (fcod == gdom && pcod == qdom)
            :(($f ⋅ $g) ⊗ₘ ($p ⋅ $q))
        else
            :(($f ⊗ₘ $p) ⋅ ($g ⊗ₘ $q))
        end
    end

You can see that even making the query about whether the types are right adds them to the egraph. It may be currently unknown what the types are, but by adding them, eventually the typing rules will fire and this rule will have its chance again. This also isn’t as bad as it may appear performance-wise, because the egraph in a sense memoizes these typing calls. See the Fibonacci example by Mason Protter, which has non-exponential complexity despite the naive implementation: https://github.com/0x0f0f0f/Metatheory.jl/blob/master/test/test_fibonacci.jl

It is my belief that both of these points (internalization of the typing rules and guards) can be mechanized to produce a sound rule system for an arbitrary GAT. I could easily be wrong. The rough picture I suggest is that you need to infer, by looking at the lhs of each rule, what typing constraints are implied. If the lhs does not imply the entire typing context, you need to add as many extra guards as needed to ensure it. For example, the well-typedness of (f ⊗ₘ p) ⋅ (g ⊗ₘ q) implies $cod(f) \otimes cod(p) = dom(g) \otimes dom(q)$, but this does not imply the condition $cod(f) = dom(g) \land cod(p) = dom(q)$, so guards for it must be added.

Here is a full script to run the derivation pair(proj1(a,b),proj2(a,b)) = id(a ⊗ₒ b). I can’t get it to go end to end without assistance yet. We’ll get there. I believe. The real holy grail is not proofs, but simplifications. This also works, but not yet at the depth I’m hoping for.

Evan Patterson’s explanation of the proof has been absolutely vital https://julialang.zulipchat.com/#narrow/stream/230248-catlab.2Ejl/topic/Automatic.20Theorem.20Proving/near/211396362. Zulip is nice for this.

using Pkg
Pkg.activate("metacat")
Pkg.add(url = "https://github.com/0x0f0f0f/Metatheory.jl.git")
using Metatheory
using Metatheory.EGraphs

# See this link for the catlab axioms
# https://github.com/AlgebraicJulia/Catlab.jl/blob/ce2fde9c63a8aab65cf2a7697f43cd24e5e00b3a/src/theories/Monoidal.jl#L127

cat_rules = @theory begin
    f ⋅ id(b) => f
    id(a) ⋅ f => f

    #f => f ⋅ id(dom(type(f)))
    #f => id(cod(type(f))) ⋅ f

    a ⊗ₒ munit() => a
    munit() ⊗ₒ a => a

    #a => a ⊗ₒ munit() 
    #a => munit() ⊗ₒ a

    f ⋅ (g ⋅ h) == (f ⋅ g) ⋅ h
end

monoidal_rules = @theory begin
    id(munit()) ⊗ₘ f => f
    f ⊗ₘ id(munit()) => f
    a ⊗ₒ (b ⊗ₒ c) == (a ⊗ₒ b) ⊗ₒ c
    f ⊗ₘ (h ⊗ₘ j) == (f ⊗ₘ h) ⊗ₘ j
    id(a ⊗ₒ b) == id(a) ⊗ₘ id(b)

    (f ⊗ₘ p) ⋅ (g ⊗ₘ q) |>
    # future metatheory macros will clean this up
    begin
        fcod = addexpr!(_egraph, :(cod(type($f))))
        gdom = addexpr!(_egraph, :(dom(type($g))))
        pcod = addexpr!(_egraph, :(cod(type($p))))
        qdom = addexpr!(_egraph, :(dom(type($q))))
        if (fcod == gdom && pcod == qdom)
            :(($f ⋅ $g) ⊗ₘ ($p ⋅ $q))
        else
            :(($f ⊗ₘ $p) ⋅ ($g ⊗ₘ $q))
        end
    end

    (f ⋅ g) ⊗ₘ (p ⋅ q) => (f ⊗ₘ p) ⋅ (g ⊗ₘ q)
end



sym_rules = @theory begin
    σ(a, b) ⋅ σ(b, a) == id(a ⊗ₒ b)
    (σ(a, b) ⊗ₘ id(c)) ⋅ (id(b) ⊗ₘ σ(a, c)) == σ(a, b ⊗ₒ c)
    (id(a) ⊗ₘ σ(b, c)) ⋅ (σ(a, c) ⊗ₘ id(b)) == σ(a ⊗ₒ b, c)

    (f ⊗ₘ h) ⋅ σ(a, b) |> begin
        fcod = addexpr!(_egraph, :(cod(type($f)))).id
        hcod = addexpr!(_egraph, :(cod(type($h)))).id
        if fcod == a && hcod == b
            :(σ(dom(type($f)), dom(type($h))) ⋅ ($h ⊗ₘ $f))
        else
            :(($f ⊗ₘ $h) ⋅ σ($a, $b))
        end
    end

    σ(c, d) ⋅ (h ⊗ₘ f) |> begin
        fdom = addexpr!(_egraph, :(dom(type($f)))).id
        hdom = addexpr!(_egraph, :(dom(type($h)))).id
        if fdom == c && hdom == d
            :(($f ⊗ₘ $h) ⋅ σ(cod(type($f)), cod(type($h))))
        else
            :(σ($c, $d) ⋅ ($h ⊗ₘ $f))
        end
    end

    # these rules are not in Catlab
    σ(a, munit()) == id(a)
    σ(munit(), a) == id(a)
    σ(munit(), munit()) => id(munit() ⊗ₒ munit())
end



diag_rules = @theory begin
    Δ(a) ⋅ (⋄(a) ⊗ₘ id(a)) == id(a)
    Δ(a) ⋅ (id(a) ⊗ₘ ⋄(a)) == id(a)
    Δ(a) ⋅ σ(a, a) == Δ(a)

    (Δ(a) ⊗ₘ Δ(b)) ⋅ (id(a) ⊗ₘ σ(a, b) ⊗ₘ id(b)) == Δ(a ⊗ₒ b)

    Δ(a) ⋅ (Δ(a) ⊗ₘ id(a)) == Δ(a) ⋅ (id(a) ⊗ₘ Δ(a))
    ⋄(a ⊗ₒ b) == ⋄(a) ⊗ₘ ⋄(b)

    Δ(munit()) == id(munit())
    ⋄(munit()) == id(munit())
end


cart_rules = @theory begin
    Δ(a) ⋅ (f ⊗ₘ k) |> begin
        a_id = find(_egraph, a)
        if (
            addexpr!(_egraph, :(dom(type($f)))).id == a_id &&
            addexpr!(_egraph, :(dom(type($k)))).id == a_id
        )
            :(pair($f, $k))
        else
            :(Δ($a) ⋅ ($f ⊗ₘ $k))
        end
    end

    pair(f, k) => Δ(dom(type(f))) ⋅ (f ⊗ₘ k)
    proj1(a, b) == id(a) ⊗ₘ ⋄(b)
    proj2(a, b) == ⋄(a) ⊗ₘ id(b)
    f ⋅ ⋄(b) => ⋄(dom(type(f)))
    # Has to invent f. Hard to fix.
    # ⋄(b) => f ⋅ ⋄(b)  

    f ⋅ Δ(b) => Δ(dom(type(f))) ⋅ (f ⊗ₘ f)
    Δ(a) ⋅ (f ⊗ₘ f) => f ⋅ Δ(cod(type(f)))
end





typing_rules = @theory begin
    dom(hom(a, b)) => a
    cod(hom(a, b)) => b
    type(id(a)) => hom(a, a)
    type(f ⋅ g) => hom(dom(type(f)), cod(type(g)))
    type(f ⊗ₘ g) => hom(dom(type(f)) ⊗ₒ dom(type(g)), cod(type(f)) ⊗ₒ cod(type(g)))
    type(a ⊗ₒ b) => :ob
    type(munit()) => :ob
    type(σ(a, b)) => hom(a ⊗ₒ b, b ⊗ₒ a)
    type(⋄(a)) => hom(a, munit())
    type(Δ(a)) => hom(a, a ⊗ₒ a)
    type(pair(f, g)) => hom(dom(type(f)), cod(type(f)) ⊗ₒ cod(type(g)))
    type(proj1(a, b)) => hom(a ⊗ₒ b, a)
    type(proj2(a, b)) => hom(a ⊗ₒ b, b)
end


rules = typing_rules ∪ cat_rules ∪ monoidal_rules ∪ sym_rules ∪ diag_rules ∪ cart_rules ∪ typing_rules

using MacroTools

# A goofy little helper macro
# Taking inspiration from Lean/Dafny/Agda
macro calc(e...)
    theory = eval(e[1])
    e = rmlines(e[2])
    @assert e.head == :block
    for (a, b) in zip(e.args[1:end-1], e.args[2:end])
        println(a, " =? ", b)
        @time println(areequal(theory, a, b; timeout = 40))
    end
end

# Get the Julia motor hummin'
@calc rules begin

    ((⋄(a) ⊗ₘ ⋄(b)) ⋅ σ(munit(), munit()))
    (σ(a, b) ⋅ (⋄(b) ⊗ₘ ⋄(a)))

end 


@calc rules begin
    id(a ⊗ₒ b)
    id(a) ⊗ₘ id(b)
    (Δ(a) ⋅ (id(a) ⊗ₘ ⋄(a))) ⊗ₘ id(b)
    (Δ(a) ⋅ (id(a) ⊗ₘ ⋄(a))) ⊗ₘ (Δ(b) ⋅ (⋄(b) ⊗ₘ id(b)))
    (Δ(a) ⊗ₘ Δ(b)) ⋅ ((id(a) ⊗ₘ ⋄(a)) ⊗ₘ (⋄(b) ⊗ₘ id(b)))
    (Δ(a) ⊗ₘ Δ(b)) ⋅ (id(a) ⊗ₘ (⋄(a) ⊗ₘ ⋄(b)) ⊗ₘ id(b))
    (Δ(a) ⊗ₘ Δ(b)) ⋅ (id(a) ⊗ₘ ((⋄(a) ⊗ₘ ⋄(b)) ⋅ σ(munit(), munit())) ⊗ₘ id(b))
    (Δ(a) ⊗ₘ Δ(b)) ⋅ ((id(a) ⊗ₘ (σ(a, b) ⋅ (⋄(b) ⊗ₘ ⋄(a))) ⊗ₘ id(b)))
    (Δ(a) ⊗ₘ Δ(b)) ⋅ ((id(a) ⊗ₘ (σ(a, b) ⋅ (⋄(b) ⊗ₘ ⋄(a))) ⊗ₘ id(b)))
    (Δ(a) ⊗ₘ Δ(b)) ⋅ ((id(a) ⋅ id(a)) ⊗ₘ (σ(a, b) ⋅ (⋄(b) ⊗ₘ ⋄(a))) ⊗ₘ id(b))
    (Δ(a) ⊗ₘ Δ(b)) ⋅ ((id(a) ⊗ₘ σ(a, b) ⊗ₘ id(b)) ⋅ (id(a) ⊗ₘ (⋄(b) ⊗ₘ ⋄(a)) ⊗ₘ id(b)))
    Δ(a ⊗ₒ b) ⋅ (id(a) ⊗ₘ (⋄(b) ⊗ₘ ⋄(a)) ⊗ₘ id(b))
    Δ(a ⊗ₒ b) ⋅ (id(a) ⊗ₘ (⋄(b) ⊗ₘ ⋄(a)) ⊗ₘ id(b))
    Δ(a ⊗ₒ b) ⋅ (proj1(a, b) ⊗ₘ proj2(a, b))
    pair(proj1(a, b), proj2(a, b))
end

# shorter proof also accepted
@calc rules begin
    id(a ⊗ₒ b)
    (Δ(a) ⊗ₘ Δ(b)) ⋅ ((id(a) ⊗ₘ (σ(a, b) ⋅ (⋄(b) ⊗ₘ ⋄(a))) ⊗ₘ id(b)))
    pair(proj1(a, b), proj2(a, b))
end

Where to go from here

  • We could just say this is already nice and push forward.
  • This is all totally unintegrated with catlab itself.
  • I’m kind of intrigued at trying to make a pure egg version of the above, especially so I can compile to wasm
    - String diagrams https://julialang.zulipchat.com/#narrow/stream/230248-catlab.2Ejl/topic/Using.20Metatheory.20for.20String.20Diagrams There is a compelling argument that string diagrams are a preferred representation, normalizing away the associativity, commutativity, and unit properties of a categorical construction. The useful canonicity of string diagrams is a viewpoint Evan Patterson has put forward. It has been suggested by Alessandro and others that generalizing the EGraph data structure in some sense may be the way to go. Extensions of the egraph that work natively on something more string diagram-like?
  • More wildly speculative: co-end calculus? How to deal with binding forms? Daniele Palombi has brought up the coend calculus https://arxiv.org/pdf/1501.02503.pdf. This seems like a very interesting application. I think the direct attack on this problem using Metatheory requires understanding how to deal with alpha equivalence in the EGraph. Conversation here: https://julialang.zulipchat.com/#narrow/stream/277860-metatheory.2Ejl/topic/Formal.20Proofs.20for.20CT.3F. I personally would like to start attacking the problem from the angle of regular integrals and sums, which are syntactically analogous.
  • Proof recording and output. The actual proof is there in the egraph rewriting process if we record it. Alessandro sounded interested in this direction and my understanding is it is also in the works for egg.
  • Cody has also pointed out to me that he had a project with similar goals but different methods many moons ago https://github.com/codyroux/catnapp. I don’t think this got that far.

E-Graph Pattern Matching (Part II)

By: Philip Zucker

Re-posted from: https://www.philipzucker.com/egraph-2/

Last time we spent a bit of effort building an e-graph data structure in Julia. The e-graph is a data structure that compactly stores and maintains equivalence classes of terms.

We can use the data structure for equational rewriting by iteratively pattern matching terms inside of it and adding new equalities found by instantiating those patterns. For example, we could use the pattern ~x + 0 to instantiate the rule ~x + 0 == ~x, find that ~x binds to foo23(a,b,c) in the e-graph and then insert into the e-graph the equality foo23(a,b,c) + 0 == foo23(a,b,c) and do this over and over. The advantage of the e-graph vs just rewriting terms is that we never lose information while doing this or rewrite ourselves into a corner. The disadvantage is that we maintain this huge data structure.

The rewrite rule problem can be viewed as something like a graph. The vertices of the graph are every possible term. A rewrite rule applicable at a vertex describes a directed edge in the graph.

This graph is very large, infinite often, so it can only be described implicitly.

This graph perspective fails to capture some useful properties though. Treating each term as an indivisible vertex fails to capture that rewriting can happen in only a small part of the term, with the vast majority of it left unchanged. There is a lot of opportunity for shared computations.

The EGraph from this perspective is a data structure for holding the already seen set of vertices efficiently and sharing these computations.

You can use this procedure for theorem proving. Insert the two terms you want to prove equal into the e-graph. Iteratively rewrite using your equalities. At each iteration, check whether two nodes in the e-graph have become equivalent. If so, congrats, problem proved! This is the analog of finding a path between two nodes in a graph by doing a bidirectional search from the start and endpoint. This is roughly the main method by which Z3 reasons about syntactic terms under equality in the presence of quantified equations. There are also methods by which to reconstruct the proof found if you want it.

Another application of basically the same algorithm is finding optimized rewrites. Apply rewrites until you’re sick of it, then extract from the equivalence class of your query term your favorite equivalent term. This is a very intriguing application to me as it produces more than just a yes/no answer.

Pattern Matching

Pattern matching takes a pattern with named holes in it and tries to find a way to fill the holes to match a term. The package SymbolicUtils.jl is useful for pattern matching against single terms, which is what I’ll describe in this section, but not quite set up to pattern match in an e-graph.

Pattern matching against a term is a very straightforward process once you sit down to do it. Where it gets complicated is trying to be efficient. But one should walk before jogging. I sometimes find myself so overwhelmed by performance considerations that I never get started.

First, one needs a data type for describing patterns (well, one doesn’t need it, but it’s nice. You could directly use pattern matching combinators).

struct PatVar
    id::Symbol
end

struct PatTerm
    head::Symbol
    args::Array{Union{PatTerm, PatVar}}
end

Pattern = Union{PatTerm,PatVar}

There is an analogous data structure in SymbolicUtils which I probably should’ve just used. For example Slot is basically PatVar.

The pattern matching function takes a Term and a Pattern and returns a dictionary Dict{PatVar,Term} of bindings of pattern variables to terms, or nothing if it fails.

Matching a pattern variable builds a dictionary of bindings from that variable to a term.

match(t::Term,  p::PatVar ) = Dict(p => t)

Matching a PatTerm requires we check if anything is obviously wrong (wrong head, or wrong number of args) and then recurse on the args.
The resulting binding dictionaries are checked for consistency of their bindings and merged.


function merge_consistent(ds)
    newd = Dict()
    for d in ds
        d === nothing && return nothing # a failed child match fails the whole match
        for (k,v) in d
            if haskey(newd,k)
                if newd[k] != v
                    return nothing
                end
            else
                newd[k] = v
            end
        end
    end
    return newd
end

function match(t::Term, p::PatTerm) 
    if t.head != p.head || length(t.args) != length(p.args)
        return nothing
    else
        merge_consistent( [ match(t1,p1) for (t1,p1) in zip(t.args, p.args) ])
    end
end

There are other ways of arranging this computation, such as passing a dictionary parameter along and modifying it, but I find the above purely functional definition the most clear.
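As a quick sanity check, here is the matcher in action on a toy term. The definitions above are repeated so the snippet runs standalone (note that the default struct equality used by `merge_consistent` is `===`, so the nonlinear match below relies on the shared subterm `a`).

```julia
# Standalone demo of the matcher above; Term is from the previous post.
struct Term
    head::Symbol
    args::Vector{Term}
end

struct PatVar
    id::Symbol
end

struct PatTerm
    head::Symbol
    args::Vector{Union{PatTerm,PatVar}}
end

match(t::Term, p::PatVar) = Dict(p => t)

function merge_consistent(ds)
    newd = Dict()
    for d in ds
        d === nothing && return nothing # a failed child match fails the whole match
        for (k, v) in d
            if haskey(newd, k)
                newd[k] != v && return nothing
            else
                newd[k] = v
            end
        end
    end
    newd
end

function match(t::Term, p::PatTerm)
    if t.head != p.head || length(t.args) != length(p.args)
        nothing
    else
        merge_consistent([match(t1, p1) for (t1, p1) in zip(t.args, p.args)])
    end
end

a = Term(:a, Term[])                             # shared so === equality works
t = Term(:f, [a, a])
match(t, PatTerm(:f, [PatVar(:x), PatVar(:x)]))  # Dict(PatVar(:x) => a)
match(t, PatTerm(:g, [PatVar(:x)]))              # nothing: wrong head
```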

Pattern Matching in E-Graphs

The twist that E-Graphs throw into the pattern matching process is that instead of checking whether a child of a term matches the pattern, we need to check for any possible child matching in the equivalence class of children. This means our pattern matching procedure contains a backtracking search.

The de Moura and Bjorner e-matching paper describes how to do this efficiently via an interpreted virtual machine. An explicit virtual machine is important for them because the patterns change during the Z3 search process, but it seems less relevant for the use case here or in egg. It may be better to use partial evaluation techniques to remove the existence of the virtual machine at runtime like in A Typed, Algebraic Approach to Parsing but I haven’t figured out how to do it yet.

The Simplify paper describes a much simpler inefficient pattern matching algorithm in terms of generators. This can be directly translated to Julia using Channels. This is a nice blog post describing how to build generators in Julia that work like python generators. Basically, as soon as you define your generator function, wrap the whole thing in a Channel() do c and when you want to yield myvalue call put!(c, myvalue).
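For instance, a self-contained toy generator (not from the Simplify paper, just an illustration of the Channel pattern):

```julia
# Generator pattern with Channels: lazily yields each divisor of n.
function divisors(n)
    Channel() do c
        for d in 1:n
            n % d == 0 && put!(c, d)
        end
    end
end

collect(divisors(12))  # Any[1, 2, 3, 4, 6, 12]
```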

# https://www.hpl.hp.com/techreports/2003/HPL-2003-148.pdf
function ematchlist(e::EGraph, t::Array{Union{PatTerm, PatVar}} , v::Array{Id}, sub)
    Channel() do c
        if length(t) == 0
            put!(c, sub)
        else
            for sub1 in ematch(e, t[1], v[1], sub)
                for sub2 in ematchlist(e, t[2:end], v[2:end], sub1)
                    put!(c, sub2)
                end
            end
        end
    end
end
# sub should be a map from pattern variables to Id
function ematch(e::EGraph, t::PatVar, v::Id, sub)
    Channel() do c
        if haskey(sub, t)
            if find_root!(e, sub[t]) == find_root!(e, v)
                put!(c, sub)
            end
        else
            put!(c,  Base.ImmutableDict(sub, t => find_root!(e, v)))
        end
    end
end

    
function ematch(e::EGraph, t::PatTerm, v::Id, sub)
    Channel() do c
        for n in e.classes[find_root!(e,v)].nodes
            if n.head == t.head
                for sub1 in ematchlist(e, t.args , n.args , sub)
                    put!(c,sub1)
                end
            end
        end
    end
end

You can then instantiate patterns with the returned dictionaries via


function instantiate(e::EGraph, p::PatVar , sub)
    sub[p]
end

function instantiate(e::EGraph, p::PatTerm , sub)
    push!( e, Term(p.head, [ instantiate(e,a,sub) for a in p.args ] ))
end

And build rewrite rules

struct Rule
    lhs::Pattern
    rhs::Pattern
end

function rewrite!(e::EGraph, r::Rule)
    matches = []
    EMPTY_DICT2 = Base.ImmutableDict{PatVar, Id}(PatVar(:____),  Id(-1))
    for (n, cls) in e.classes
        for sub in ematch(e, r.lhs, n, EMPTY_DICT2)
            push!( matches, ( instantiate(e, r.lhs ,sub)  , instantiate(e, r.rhs ,sub)))
        end
    end
    for (l,r) in matches
        union!(e,l,r)
    end
    rebuild!(e)
end

Here’s a very simple equation proving function that takes in a pile of rules

function prove_eq(e::EGraph, t1::Id, t2::Id , rules)
    for i in 1:3
        if in_same_set(e,t1,t2)
            return true
        end
        for r in rules
            rewrite!(e,r) # I should split this stuff up. We only need to rebuild at the end
        end
    end
    return nothing
end

As sometimes happens, I’m losing steam on this and would like to work on something different for a bit. But I loaded up the WIP at https://github.com/philzook58/EGraphs.jl.

Bits and Bobbles

Catlab

You can find piles of equational axioms in Catlab like here for example

A tricky thing is that some equational axioms of categories seem to produce variables out of thin air.
Consider the rewrite rule ~f => id(~A) ⋅ ~f. Where does the A come from? It’s because we’ve highly suppressed all the typing information. The A should come from the type of f.

I think the simplest way to fix is the “type tag” approach I mentioned here https://www.philipzucker.com/theorem-proving-for-catlab-2-lets-try-z3-this-time-nope/ and can be read about here https://people.mpi-inf.mpg.de/~jblanche/mono-trans.pdf. You wrap every term in a tagging function so that f becomes tag(f, Hom(a,b)). This approach makes sense because it turns GATs purely equational without the implicit typing context, I think. It is a bummer that it will probably choke the e-graph with junk though.

It may be best to build a TypedTerm to use rather than Term that has an explicit field for the type. This brings it close to the representation that Catlab already uses for syntax, so maybe I should just directly use that. I like having control over my own type definitions though and Catlab plays some monkey business with the Julia level types. 🙁

struct TypedTerm
    type
    head
    args
end

As I have written it so far, I don’t allow the patterns to match on the heads of terms, although there isn’t any technical reason to not allow it. The natural way of using a TypedTerm would probably want to do so though, since you’d want to match on the type sometimes while keeping the head a pattern. Another possible way to do this that avoids all these issues is to actually make terms a bit like how regular Julia Expr are made, which usually has :call in the head position. Then by convention the first arg could be the actual head, the second arg the type, and the rest of the arguments the regular arguments.

A confusing example sketch:

TypedTerm(:mcopy, typedterm(Hom(a,otimes(a,a))), [TypedTerm(a,Ob,[])])
would become
Term(:call, [:mcopy, term!(Hom(a,otimes(a,a))), term!(a)])

Catlab already does some simplifications on its syntax, like collecting up associative operations into lists. It is confusing how to integrate this with the e-graph structure, so I think the first step is to just not. That stuff isn’t on by default, so you can get rid of it by defining your own versions using @syntax

using Catlab
using Catlab.Theories
import Catlab.Theories: compose, Ob, Hom
@syntax MyFreeCartesianCategory{ObExpr,HomExpr} CartesianCategory begin
  #otimes(A::Ob, B::Ob) = associate_unit(new(A,B), munit)
  #otimes(f::Hom, g::Hom) = associate(new(f,g))
  #compose(f::Hom, g::Hom) = associate(new(f,g; strict=true))

  #pair(f::Hom, g::Hom) = compose(mcopy(dom(f)), otimes(f,g))
  #proj1(A::Ob, B::Ob) = otimes(id(A), delete(B))
  #proj2(A::Ob, B::Ob) = otimes(delete(A), id(B))
end

# we can translate the axiom sets programmatically.
names(Main.MyFreeCartesianCategory)
#A = Ob(MyFreeCartesianCategory, :A)

A = Ob(MyFreeCartesianCategory, :A)
B = Ob(MyFreeCartesianCategory, :B)
f = Hom(:f, A,B)
g = Hom(:g, B, A)
h = compose(compose(f,g),f)
# dump(h)
dump(f)

#=
Main.MyFreeCartesianCategory.Hom{:generator}
  args: Array{Any}((3,))
    1: Symbol f
    2: Main.MyFreeCartesianCategory.Ob{:generator}
      args: Array{Symbol}((1,))
        1: Symbol A
      type_args: Array{GATExpr}((0,))
    3: Main.MyFreeCartesianCategory.Ob{:generator}
      args: Array{Symbol}((1,))
        1: Symbol B
      type_args: Array{GATExpr}((0,))
  type_args: Array{GATExpr}((2,))
    1: Main.MyFreeCartesianCategory.Ob{:generator}
      args: Array{Symbol}((1,))
        1: Symbol A
      type_args: Array{GATExpr}((0,))
    2: Main.MyFreeCartesianCategory.Ob{:generator}
      args: Array{Symbol}((1,))
        1: Symbol B
      type_args: Array{GATExpr}((0,))
=#

Rando thought: Can an EGraph data structure itself be expressed in Catlab? I note that IntDisjointSet does show up in a Catlab definition of pushouts. The difficulty and scariness of the EGraph is maintaining its invariants, which may be expressible as some kind of composition condition on the various maps that make up the EGraph.

Proving the equivalence of two terms is like finding a path in the graph of terms described earlier, where applicable rewrite rules give the directed edges. You could attempt a greedy approach or a breadth first search, or any variety of your favorite search algorithm like A*.

I suspect some heuristics would be helpful, like applying the “simplifying” direction of the equations more often than the “complicating” direction. In a 5:1 ratio, let’s say.

A natural algorithm to consider for optimizing terms is to take the best expression found so far, destroy the e-graph, and place just that expression in it. Then try the rewrite rules again. If the best term is still the old query, don’t destroy the e-graph and instead apply a new round of rewrites to widen the radius of the e-graph. This gives you a kind of combo greedy + complete search.

Partial Evaluation

Partial evaluation seems like a good technique for optimizing pattern matches. The ideal pattern match code expands out to a nicely nested set of if statements. One alternative to passing a dictionary might be to assign each pattern variable an integer at compile time and pass an array instead, which would be a bit better. However, by using metaprogramming we can insert local variables into generated code and avoid the need for a dictionary being passed around at runtime. Then the Julia compiler can register allocate and what not like it usually does (and quite efficiently, I’d expect).
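Here is a rough standalone sketch of that middle option: number the pattern variables once up front, then keep runtime bindings in a Vector instead of a Dict. The names (`slots!`, `amatch`) and the Expr-based terms are made up for illustration.

```julia
# Sketch: assign each pattern variable an integer slot "at compile time",
# so runtime bindings live in an array rather than a Dict.
struct PatVar
    id::Symbol
end
struct PatTerm
    head::Symbol
    args::Vector{Union{PatTerm,PatVar}}
end

# Number the distinct variables of a pattern, left to right.
slots!(p::PatVar, s::Dict{Symbol,Int}) = (get!(s, p.id, length(s) + 1); s)
slots!(p::PatTerm, s::Dict{Symbol,Int}) = (foreach(a -> slots!(a, s), p.args); s)

# Match plain Exprs; bind[i] === nothing means slot i is unbound.
function amatch(t, p::PatVar, s, bind)
    i = s[p.id]
    bind[i] === nothing ? (bind[i] = t; true) : bind[i] == t
end
function amatch(t::Expr, p::PatTerm, s, bind)
    t.head == :call && t.args[1] == p.head || return false
    length(t.args) - 1 == length(p.args) || return false
    all(amatch(t.args[i+1], p.args[i], s, bind) for i in eachindex(p.args))
end
amatch(t, p::PatTerm, s, bind) = false  # a leaf can't match a compound pattern

p = PatTerm(:f, [PatVar(:x), PatVar(:x)])
s = slots!(p, Dict{Symbol,Int}())
bind = Vector{Any}(nothing, length(s))
amatch(:(f(a, a)), p, s, bind)   # true; bind holds [:a]
```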

See this post (in particular its reference links) for more on partial evaluation: https://www.philipzucker.com/metaocaml-style-partial-evaluation-in-coq/

A first thing to consider is that we’re building up and destroying dictionaries at a shocking rate.

Secondly dictionaries themselves are (relatively) expensive things.

I experimented a bit with using a curried form for match to see if maybe the Julia compiler was smart enough to sort of evaluate away my use of dictionaries, but that did not seem to be the case.

I found examining the @code_llvm and @code_native of the following simple experiments illuminating as to what Julia can and cannot get rid of when the dictionary is obviously known at compile time to human eyes. Mostly it needs to be very obvious. I suspect Julia does not have any built-in special compiler magic for reasoning about dictionaries, and trying to automatically infer what’s safely optimizable by actually expanding the definition of a dictionary op sounds impossible.

function foo() #bad
    x = Dict(:a => 4)
    return x[:a]
end
function foo2()
    return 4
end


function foo3() # bad
    x = Dict(:a => 4)
    () -> x[:a]
end

function foo4() # much better. But we do pay a closure cost.
    x = Dict(:a => 4)
    r = x[:a]
    () -> r
end

function foo5()
    x = Dict(:a => 4)
    :( () -> $(x[:a]) )
end

function foo7() # note: shadowed by the second foo7 definition further down
    return foo()
end


function foo6() # julia does however do constant propagation and deletion of unnecessary bullcrap
    x = 4
    y = 7
    y += x
    return x
end

function foo7() # this compiles well
    x = Base.ImmutableDict( :a => 4)
    return x[:a]
end

function foo8()
    x = Base.ImmutableDict( :a => 4)
    r = x[:a]
    z -> z == r # still has a closure indirection. It's pretty good though
end

function foo9()
    x = Base.ImmutableDict( :a => 4)
    r = x[:a]
    z -> z == 4 
end


#@code_native foo7()

z = foo4()
#@code_llvm z()

z = eval(foo5())
@code_native z()
@code_native foo2()
@code_llvm foo6()
z = foo8()
@code_native z(3)
z = foo9()
@code_native z(3)
@code_native foo7()

# so it's possible that using an ImmutableDict is sufficient for julia to inline almost everything itself.
# You want the indexing to happen outside the closure.

A possibly useful technique is partial evaluation. We have in a sense built an interpreter for the pattern language. Specializing this interpreter gives a fast pattern matcher. We can explicitly build up the code Expr that will do the pattern matching by interpreting the pattern.

So here’s a version that takes in a pattern and builds code the perform the pattern in terms of if then else statements and runtime bindings.

Here’s a sketch. This version only returns true or false rather than returning the bindings. I probably need to cps it to get actual bindings out. I think that’s what i was starting to do with the c parameter.

function prematch(p::PatVar, env, t, c )
    if haskey(env, p)
       :(  $(env[p]) != $t  && return false )
    else
       env[p] = gensym(p.id) #make fresh variable
       :(  $(env[p]) = $t  ) #bind it at runtime to the current t
    end
end


function sequence(code)
    foldl(
        (s1,s2) -> quote
                        $s1
                        $s2
                    end
        , code)
end

function prematch(p::PatTerm, env, t, c)
    if length(p.args) == 0
        :( $(t).head == $( QuoteNode(p.head)  ) && length($t) == 0 && return false   )
    else
        quote
            if $(t).head != $( QuoteNode(p.head) ) || $( length(p.args) ) != length($(t).args)
                return false
            else
                $( sequence([prematch(a, env, :( $(t).args[$n] ), c)   for (n,a) in enumerate(p.args) ]))
            end
        end 
    end
end


println( prematch( PatTerm(:f, []) , Dict(),  :w , nothing) )
println( prematch( PatTerm(:f, [PatVar(:a)]) , Dict(),  :t , nothing) )
println( prematch( PatTerm(:f, [PatVar(:a), PatVar(:a)]) , Dict(),  :t , nothing) )


E-Graph Pattern Matching (Part II)

By: Philip Zucker

Re-posted from: https://www.philipzucker.com/egraph-2/

Last time we spent a bit of effort building an e-graph data structure in Julia. The e-graph is a data structure that compactly stores and maintains equivalence classes of terms.

We can use the data structure for equational rewriting by iteratively pattern matching terms inside of it and adding new equalities found by instantiating those patterns. For example, we could use the pattern ~x + 0 to instantiate the rule ~x + 0 == ~x, find that ~x binds to foo23(a,b,c) in the e-graph and then insert into the e-graph the equality foo23(a,b,c) + 0 == foo23(a,b,c) and do this over and over. The advantage of the e-graph vs just rewriting terms is that we never lose information while doing this or rewrite ourselves into a corner. The disadvantage is that we maintain this huge data structure.

The rewrite rule problem can be viewed as something like a graph. The vertices of the graph are every possible term. A rewrite rule applicable at a vertex describes a directed edge in the graph.

This graph is very large, infinite often, so it can only be described implicitly.

This graph perspective fails to capture some useful properties though. Treating each term as an indivisible vertex misses that rewriting can happen in only a small part of the term, with the vast majority left unchanged. There is a lot of opportunity for shared computation.

From this perspective, the EGraph is a data structure for efficiently holding the set of already seen vertices and sharing these computations.

You can use this procedure for theorem proving. Insert the two terms you want to prove equal into the e-graph. Iteratively rewrite using your equalities. At each iteration, check whether two nodes in the e-graph have become equivalent. If so, congrats, problem proved! This is the analog of finding a path between two nodes in a graph by doing a bidirectional search from the start and endpoint. This is roughly the main method by which Z3 reasons about syntactic terms under equality in the presence of quantified equations. There are also methods by which to reconstruct the proof found if you want it.

Another application of basically the same algorithm is finding optimized rewrites. Apply rewrites until you’re sick of it, then extract from the equivalence class of your query term your favorite equivalent term. This is a very intriguing application to me as it produces more than just a yes/no answer.

Pattern Matching

Pattern matching takes a pattern with named holes in it and tries to find a way to fill the holes to match a term. The package SymbolicUtils.jl is useful for pattern matching against single terms, which is what I’ll describe in this section, but not quite set up to pattern match in an e-graph.

Pattern matching against a term is a very straightforward process once you sit down to do it. Where it gets complicated is trying to be efficient. But one should walk before jogging. I sometimes find myself so overwhelmed by performance considerations that I never get started.

First, one needs a data type for describing patterns (well, one doesn’t need it, but it’s nice. You could directly use pattern matching combinators).

struct PatVar
    id::Symbol
end

struct PatTerm
    head::Symbol
    args::Array{Union{PatTerm, PatVar}}
end

Pattern = Union{PatTerm,PatVar}

There is an analogous data structure in SymbolicUtils which I probably should’ve just used. For example, Slot is basically PatVar.

The pattern matching function takes a Term and a Pattern and returns a dictionary Dict{PatVar,Term} of bindings of pattern variables to terms, or nothing if it fails.

Matching a pattern variable builds a dictionary of bindings from that variable to a term.

match(t::Term,  p::PatVar ) = Dict(p => t)

Matching a PatTerm requires we check if anything is obviously wrong (wrong head, or wrong number of args) and then recurse on the args.
The resulting binding dictionaries are checked for consistency of their bindings and merged.


function merge_consistent(ds)
    newd = Dict()
    for d in ds
        # a failed submatch (nothing) fails the whole match
        if d === nothing
            return nothing
        end
        for (k,v) in d
            if haskey(newd,k)
                if newd[k] != v
                    return nothing
                end
            else
                newd[k] = v
            end
        end
    end
    return newd
end
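For instance, with plain symbols standing in for pattern variables:

```julia
merge_consistent([Dict(:x => 1), Dict(:y => 2)])  # Dict(:x => 1, :y => 2)
merge_consistent([Dict(:x => 1), Dict(:x => 2)])  # nothing: conflicting bindings
```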

function match(t::Term, p::PatTerm) 
    if t.head != p.head || length(t.args) != length(p.args)
        return nothing
    else
        merge_consistent( [ match(t1,p1) for (t1,p1) in zip(t.args, p.args) ])
    end
end

There are other ways of arranging this computation, such as passing a dictionary parameter along and modifying it, but I find the above purely functional definition the clearest.
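As a quick sanity check, here's the matcher in action. I'm assuming the Term type from the previous post looks like the following (just a head symbol and an argument array); the PatVar, PatTerm, merge_consistent, and match definitions above are taken as given.

```julia
# Minimal stand-in for the Term type from the previous post.
struct Term
    head::Symbol
    args::Array{Term}
end

fxx = PatTerm(:f, [PatVar(:x), PatVar(:x)])  # the pattern f(~x, ~x)
a = Term(:a, [])
b = Term(:b, [])

# Note: == on this immutable Term falls back to identity of the args array,
# which is fine for this demo because we reuse the same a and b objects.
match(Term(:f, [a, a]), fxx)  # Dict(PatVar(:x) => a)
match(Term(:f, [a, b]), fxx)  # nothing: inconsistent bindings for ~x
match(Term(:g, [a, a]), fxx)  # nothing: wrong head
```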

Pattern Matching in E-Graphs

The twist that E-Graphs throw into the pattern matching process is that instead of checking whether a child of a term matches the pattern, we need to check for any possible child matching in the equivalence class of children. This means our pattern matching procedure contains a backtracking search.

The de Moura and Bjørner e-matching paper describes how to do this efficiently via an interpreted virtual machine. An explicit virtual machine is important for them because the patterns change during the Z3 search process, but it seems less relevant for the use case here or in egg. It may be better to use partial evaluation techniques to compile away the virtual machine at runtime, like in A Typed, Algebraic Approach to Parsing, but I haven’t figured out how to do it yet.

The Simplify paper describes a much simpler inefficient pattern matching algorithm in terms of generators. This can be directly translated to Julia using Channels. This is a nice blog post describing how to build generators in Julia that work like python generators. Basically, as soon as you define your generator function, wrap the whole thing in a Channel() do c and when you want to yield myvalue call put!(c, myvalue).
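Divorced from e-graphs, the idiom looks like this: a generator that yields the divisors of a number.

```julia
# Channel() do c ... put!(c, x) ... end plays the role of a python generator.
function divisors(n)
    Channel() do c
        for d in 1:n
            n % d == 0 && put!(c, d)
        end
    end
end

collect(divisors(12))  # [1, 2, 3, 4, 6, 12]
```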

# https://www.hpl.hp.com/techreports/2003/HPL-2003-148.pdf
function ematchlist(e::EGraph, t::Array{Union{PatTerm, PatVar}} , v::Array{Id}, sub)
    Channel() do c
        if length(t) == 0
            put!(c, sub)
        else
            for sub1 in ematch(e, t[1], v[1], sub)
                for sub2 in ematchlist(e, t[2:end], v[2:end], sub1)
                    put!(c, sub2)
                end
            end
        end
    end
end
# sub should be a map from pattern variables to Id
function ematch(e::EGraph, t::PatVar, v::Id, sub)
    Channel() do c
        if haskey(sub, t)
            if find_root!(e, sub[t]) == find_root!(e, v)
                put!(c, sub)
            end
        else
            put!(c,  Base.ImmutableDict(sub, t => find_root!(e, v)))
        end
    end
end

    
function ematch(e::EGraph, t::PatTerm, v::Id, sub)
    Channel() do c
        for n in e.classes[find_root!(e,v)].nodes
            if n.head == t.head
                for sub1 in ematchlist(e, t.args , n.args , sub)
                    put!(c,sub1)
                end
            end
        end
    end
end

You can then instantiate patterns with the returned dictionaries via


function instantiate(e::EGraph, p::PatVar , sub)
    sub[p]
end

function instantiate(e::EGraph, p::PatTerm , sub)
    push!( e, Term(p.head, [ instantiate(e,a,sub) for a in p.args ] ))
end

And build rewrite rules

struct Rule
    lhs::Pattern
    rhs::Pattern
end

function rewrite!(e::EGraph, r::Rule)
    matches = []
    EMPTY_DICT2 = Base.ImmutableDict{PatVar, Id}(PatVar(:____),  Id(-1))
    for (n, cls) in e.classes
        for sub in ematch(e, r.lhs, n, EMPTY_DICT2)
            push!( matches, ( instantiate(e, r.lhs ,sub)  , instantiate(e, r.rhs ,sub)))
        end
    end
    for (l,r) in matches
        union!(e,l,r)
    end
    rebuild!(e)
end

Here’s a very simple equation proving function that takes in a pile of rules

function prove_eq(e::EGraph, t1::Id, t2::Id , rules)
    for i in 1:3
        if in_same_set(e,t1,t2)
            return true
        end
        for r in rules
            rewrite!(e,r) # I should split this stuff up. We only need to rebuild at the end
        end
    end
    return false
end

As sometimes happens, I’m losing steam on this and would like to work on something different for a bit. But I loaded up the WIP at https://github.com/philzook58/EGraphs.jl.

Bits and Bobbles

Catlab

You can find piles of equational axioms in Catlab like here for example

A tricky thing is that some equational axioms of categories seem to produce variables out of thin air.
Consider the rewrite rule ~f => id(~A) ⋅ ~f. Where does the A come from? It’s because we’ve highly suppressed all the typing information. The A should come from the type of f.

I think the simplest way to fix this is the “type tag” approach I mentioned here https://www.philipzucker.com/theorem-proving-for-catlab-2-lets-try-z3-this-time-nope/ and which can be read about here https://people.mpi-inf.mpg.de/~jblanche/mono-trans.pdf. You wrap every term in a tagging function so that f becomes tag(f, Hom(a,b)). This approach makes sense because it turns GATs purely equational, without the implicit typing context, I think. It is a bummer that it will probably choke the e-graph with junk though.
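To make that concrete: after tagging, the type annotation on f mentions A, so the left hand side of the troublesome rule now binds it. A sketch using plain Julia Exprs, where tag, Hom, id, and compose are all just uninterpreted symbols:

```julia
# Untagged rule:  ~f  =>  compose(id(~A), ~f)   -- ~A appears out of thin air.
# Tagged rule: the Hom(~A, ~B) annotation on the left now binds ~A first.
lhs = :( tag(~f, Hom(~A, ~B)) )
rhs = :( compose(id(~A), tag(~f, Hom(~A, ~B))) )
```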

It may be best to build a TypedTerm to use rather than Term that has an explicit field for the type. This brings it close to the representation that Catlab already uses for syntax, so maybe I should just directly use that. I like having control over my own type definitions though and Catlab plays some monkey business with the Julia level types. 🙁

struct TypedTerm
    type
    head
    args
end

As I have written it so far, I don’t allow the patterns to match on the heads of terms, although there isn’t any technical reason to not allow it. The natural way of using a TypedTerm would probably want to do so though, since you’d want to match on the type sometimes while keeping the head a pattern. Another possible way to do this that avoids all these issues is to actually make terms a bit like how regular Julia Expr are made, which usually has :call in the head position. Then by convention the first arg could be the actual head, the second arg the type, and the rest of the arguments the regular arguments.

A confusing example sketch:

TypedTerm(typedterm(Hom(a,otimes(a,a))), :mcopy, [TypedTerm(Ob, :a, [])])
would become
Term(:call, [:mcopy, term!(Hom(a,otimes(a,a))), term!(a)])

Catlab already does some simplifications on its syntax, like collecting up associative operations into lists. It is confusing how to integrate this with the e-graph structure, so I think the first step is to just not. That stuff isn’t on by default, so you can get rid of it by defining your own versions using @syntax

using Catlab
using Catlab.Theories
import Catlab.Theories: compose, Ob, Hom
@syntax MyFreeCartesianCategory{ObExpr,HomExpr} CartesianCategory begin
  #otimes(A::Ob, B::Ob) = associate_unit(new(A,B), munit)
  #otimes(f::Hom, g::Hom) = associate(new(f,g))
  #compose(f::Hom, g::Hom) = associate(new(f,g; strict=true))

  #pair(f::Hom, g::Hom) = compose(mcopy(dom(f)), otimes(f,g))
  #proj1(A::Ob, B::Ob) = otimes(id(A), delete(B))
  #proj2(A::Ob, B::Ob) = otimes(delete(A), id(B))
end

# we can translate the axiom sets programmatically.
names(Main.MyFreeCartesianCategory)
#A = Ob(MyFreeCartesianCategory, :A)

A = Ob(MyFreeCartesianCategory, :A)
B = Ob(MyFreeCartesianCategory, :B)
f = Hom(:f, A,B)
g = Hom(:g, B, A)
h = compose(compose(f,g),f)
# dump(h)
dump(f)

#=
Main.MyFreeCartesianCategory.Hom{:generator}
  args: Array{Any}((3,))
    1: Symbol f
    2: Main.MyFreeCartesianCategory.Ob{:generator}
      args: Array{Symbol}((1,))
        1: Symbol A
      type_args: Array{GATExpr}((0,))
    3: Main.MyFreeCartesianCategory.Ob{:generator}
      args: Array{Symbol}((1,))
        1: Symbol B
      type_args: Array{GATExpr}((0,))
  type_args: Array{GATExpr}((2,))
    1: Main.MyFreeCartesianCategory.Ob{:generator}
      args: Array{Symbol}((1,))
        1: Symbol A
      type_args: Array{GATExpr}((0,))
    2: Main.MyFreeCartesianCategory.Ob{:generator}
      args: Array{Symbol}((1,))
        1: Symbol B
      type_args: Array{GATExpr}((0,))
=#

Rando thought: Can an EGraph data structure itself be expressed in Catlab? I note that IntDisjointSet does show up in a Catlab definition of pushouts. The difficulty and scariness of the EGraph is maintaining its invariants, which may be expressible as some kind of composition condition on the various maps that make up the EGraph.


Proving the equivalence of two terms is like finding a path in this rewrite graph. You could attempt a greedy approach, a breadth-first search, or any variety of your favorite search algorithm like A*.


I suspect some heuristics will be helpful, like applying the “simplifying” direction of the equations more often than the “complicating” direction. In a 5:1 ratio, let’s say.

A natural algorithm to consider for optimizing terms is to take the best expression found so far, destroy the e-graph, and place just that expression in it.
Then try the rewrite rules again. If the best term is still the old query, don’t destroy the e-graph; instead apply a new round of rewrites to widen the radius of the e-graph. This gives you kind of a combo greedy + complete search.

Partial Evaluation

Partial evaluation seems like a good technique for optimizing pattern matches. The ideal pattern match code expands out to a nicely nested set of if statements. One alternative to passing a dictionary might be to assign each pattern variable to an integer at compile time and pass an array instead, which would be a bit better. However, by using metaprogramming we can insert local variables into the generated code and avoid the need for a dictionary being passed around at runtime. Then the Julia compiler can register allocate and whatnot like it usually does (and quite efficiently I’d expect).

See this post (in particular the reference links) https://www.philipzucker.com/metaocaml-style-partial-evaluation-in-coq/ for more on partial evaluation.

A first thing to consider is that we’re building up and destroying dictionaries at a shocking rate.

Secondly dictionaries themselves are (relatively) expensive things.

I experimented a bit with using a curried form for match to see if maybe the Julia compiler was smart enough to sort of evaluate away my use of dictionaries, but that did not seem to be the case.

I found examining the @code_llvm and @code_native of the following simple experiments illuminating as to what Julia can and cannot get rid of when the dictionary is obviously known at compile time to human eyes. Mostly, it needs to be very obvious. I suspect Julia does not have any special built-in compiler reasoning for dictionaries, and automatically inferring what’s safely optimizable by actually expanding the definition of a dictionary op sounds impossible.

function foo() #bad
    x = Dict(:a => 4)
    return x[:a]
end
function foo2()
    return 4
end


function foo3() # bad
    x = Dict(:a => 4)
    () -> x[:a]
end

function foo4() # much better. But we do pay a closure cost.
    x = Dict(:a => 4)
    r = x[:a]
    () -> r
end

function foo5()
    x = Dict(:a => 4)
    :( () -> $(x[:a]) )
end

function foo10() # also bad: the call to foo() doesn't make the dictionary go away
    return foo()
end


function foo6() # julia does however do constant propagation and deletion of unnecessary bullcrap
    x = 4
    y = 7
    y += x
    return x
end

function foo7() # this compiles well
    x = Base.ImmutableDict( :a => 4)
    return x[:a]
end

function foo8()
    x = Base.ImmutableDict( :a => 4)
    r = x[:a]
    z -> z == r # still has a closure indirection. It's pretty good though
end

function foo9()
    x = Base.ImmutableDict( :a => 4)
    r = x[:a]
    z -> z == 4 
end


#@code_native foo7()

z = foo4()
#@code_llvm z()

z = eval(foo5())
@code_native z()
@code_native foo2()
@code_llvm foo6()
z = foo8()
@code_native z(3)
z = foo9()
@code_native z(3)
@code_native foo7()

# so it's possible that using an ImmutableDict is sufficient for julia to inline almost everything itself.
# You want the indexing to happen outside the closure.

A possibly useful technique is partial evaluation. We have in a sense built an interpreter for the pattern language. Specializing this interpreter gives a fast pattern matcher. We can explicitly build up the code Expr that will do the pattern matching by interpreting the pattern.

So here’s a version that takes in a pattern and builds code that performs the pattern match in terms of if-then-else statements and runtime bindings.

Here’s a sketch. This version only returns true or false rather than returning the bindings. I probably need to CPS it to get actual bindings out. I think that’s what I was starting to do with the c parameter.

function prematch(p::PatVar, env, t, c )
    if haskey(env, p)
       :(  $(env[p]) != $t  && return false )
    else
       env[p] = gensym(p.id) #make fresh variable
       :(  $(env[p]) = $t  ) #bind it at runtime to the current t
    end
end


function sequence(code)
    foldl(
        (s1,s2) -> quote
                        $s1
                        $s2
                    end
        , code)
end

function prematch(p::PatTerm, env, t, c)
    if length(p.args) == 0
        :( ($(t).head != $( QuoteNode(p.head) ) || length($(t).args) != 0) && return false )
    else
        quote
            if $(t).head != $( QuoteNode(p.head) ) || $( length(p.args) ) != length($(t).args)
                return false
            else
                $( sequence([prematch(a, env, :( $(t).args[$n] ), c)   for (n,a) in enumerate(p.args) ]))
            end
        end
    end
end


println( prematch( PatTerm(:f, []) , Dict(),  :w , nothing) )
println( prematch( PatTerm(:f, [PatVar(:a)]) , Dict(),  :t , nothing) )
println( prematch( PatTerm(:f, [PatVar(:a), PatVar(:a)]) , Dict(),  :t , nothing) )
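To actually run the generated code, we can splice it into a function body and eval it. Here's a sketch building on the prematch above; the trailing return true is needed because the generated code only contains early-exit return false branches. Plain Julia Exprs already have head and args fields, so they work as the term type. (Call the result at the top level; from inside another function you'd need Base.invokelatest because of world age.)

```julia
function compile_matcher(p::Pattern)
    t = gensym(:t)                          # runtime argument holding the term
    body = prematch(p, Dict(), t, nothing)
    eval(:( $t -> begin
        $body
        return true
    end ))
end

# In Expr form, a call like f(a) has head :call and args [:f, :a],
# so "any one-argument call" is a :call pattern with two pattern variables.
onearg = compile_matcher(PatTerm(:call, [PatVar(:g), PatVar(:x)]))
onearg(:(f(a)))     # true
onearg(:(f(a, b)))  # false: wrong arity
```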