unific

Author: Jim Wise ([email protected])
Copyright: Copyright © 2011, 2012 Jim Wise
License: 2-clause BSD-Style (see LICENSE.txt)

DESCRIPTION:

Unific is a ruby unification engine.

A unification engine is an essential part of a logic programming environment (the whole logic programming environment this is taken from is available as the in-development Rulog (Ruby With Logic) gem), but can also be useful on its own as a pattern matching engine which can enforce consistency across multiple matches.

How to Use This Gem

To get started, include this gem using

require 'rubygems' # only need on Ruby 1.8.x, but harmless elsewhere
require 'unific'

This gem provides the Unific module. This module provides several methods which implement a unification engine.

What is Unfication?

Unfication is a generalization of pattern matching – it allows you to compare two patterns or values, and determine if they match, possibly substituting variables in each pattern to make a match possible.

Two values can be unified by passing both to the Unific::unify class method. This method returns false if the two values cannot be unified, or a (possibly empty) environment if they can. For the moment, it is enough to remember that this environment is a true value, but soon we will see that it is much more.

Simple unification

So, what does it mean to unify two values?

In the simplest case, we can unify two values if they are equal (according to ==):

Unific::unify("foo", "foo")
==> succeeds, returns an empty environment, which is a true value (see below)

Unific::unify(42, 42)
==> succeeds, returns an empty environment, which is a true value (see below)

Unific::unify("foo", 42)
==> false

If two Enumerables are compared, they match if (and only if) their corresponding members match (and thus Enumerables of different lengths do not unify):

Unific::unify([42, "a", "b"], [42, "a", "b"])
==> an empty environment, which is a true value (see below)

Unific::unify({"a" => 1, "b" => 2}, {"a" => 1, "b" => 2})
==> an empty environment, which is a true value (see below)

Unific::unify([42, "a", "b", "hike!"], [42, "a", "b"])
==> false

Unific::unify([42, 33, "b"], [42, "a", "b"])
==> false

this implies that nested Enumerables are unified recursively:

Unific::unify([["a", 42], ["b", 33]], [["a", 42], ["b", 33]])
==> returns an empty environment, which is a true value (see below)

As an exception, strings are not unified recursively in this manner, even though they are Enumerables.

So far, this does nothing that we could not do with the == operator… but there’s more.

Pattern variables

A unification variable of class Unific::Var can be created with any name of your choice, for use in unifications:

Unific::Var.new("x")
==> #<Unific::Var:0x823b920 @name="x">

when used with Unific::unify, a variable will successfully unify with any value:

x = Unific::Var.new("x")
Unific::unify(x, 42);
==> a non-empty environment, which is a true value (see below)

This also applies when a variable is unified as part of a larger expression

x = Unific::Var.new("x")
Unific::unify([1, x, 3], [1, 42, 3]);
==> a non-empty environment, which is a true value (see below)

Note that as a variable unifies with any object, a single variable can also be unified with an entire Enumerable

x = Unific::Var.new("x")
Unific::unify(x, [1, 2, 3]);
==> a non-empty environment, which is a true value (see below)

Note that when a variable matches a given value, it must match the same value everywhere in the same expression:

x = Unific::Var.new("x")
e = Unific::unify([x, x], [1, 2])
==> false; x cannot be unified with both 1 and 2 in the same expression
x = Unific::Var.new("x")
e = Unific::unify([x, x], [2, 2])
==> a non-empty environment, which is a true value (see below)

Binding a variable to another variable always succeeds (but is very useful when we start using the environments returned by unification, below).

So where does the environment returned by Unific::unify come in? The returned environment, an object of class Unific::Env, matches (‘binds’) each variable to the value with which it was actually unified. The method Env#[] can be used to see whether a variable is bound in a given environment:

x = Unific::Var.new("x")
y = Unific::Var.new("y")
e = Unific::unify([1, x, 3], [1, 42, 3]);
e[x]
==> 42
e[y]
==> nil

So far, we can perform some relatively interesting pattern matches with Unific:

jumper = Unific::Var.new("jumper")
jumpee = Unific::Var.new("jumpee")
pattern = ["The", "quick", "brown", jumper, "jumped", "over", "the", "lazy", jumpee]
sentence = "The quick brown fox jumped over the lazy dog"
e = Unific::unify(pattern, sentence.split)
e[jumper]
==> "fox"

but where this becomes more interesting is when we want to perform multiple unifications in a consistent way.

Chaining unifications

Any unification can be performed against a given environment by using Env#unify method. If the given environment is empty, this is the same as calling Unific::unify. If the environment already has bindings, however, the new unification will use these bindings; this means that any variable matches performed against the same variables must be consistent with the values already bound to those variables:

animal = Unific::Var.new("animal")
e = Unific::unify([animal, "is", "a", "mammal"], "fido is a mammal".split)
e[animal]
==> "fido"
e.unify([animal, "is", "a", "bear"], "teddy is a bear".split)
==> false (cannot unify, as "animal" is bound to "fido" in environment e)

Note that unifying against a given environment returns a new environment in which any additional variables matched by that unification are also bound; the original environment is not modified.

This is often used by chaining calls to unify (since each call returns a new environment); note that this can only be done if none of the unifications returns ‘false’, however:

a = Unific::Var.new("a")
a = Unific::Var.new("b")
e = Unific::unify([a, 1, 2], [0, 1, 2]).unify([a, b, 5], [0, 3, 5])
==> a new environment where a is bound to 0, and b is bound to 3

Now, it becomes useful to be able to unify to variables:

# x = y + 3
# y = 2
x = Unific::Var.new("x")
y = Unific::Var.new("y")
e1 = Unific::unify(x, [y, "+", 3])
e2 = e1.unify(y, 2)

We can use the #instantiate method of Unific::Env to recursively substitute a variable until we get an uninstantiated variable, or a non-variable (“ground”) value. Given the above, for instance:

e2[x]
==> [y, "+", 3]
e2[y]
==> 2
e2.instantate x
==> [2, "+", 3]

The #instantiate method of Unific::Env is also more general than the #[] method – in addition to a variable, it can take any value which could be passed to unify, and will substitute any variables in the term.

jumper = Unific::Var.new("jumper")
jumpee = Unific::Var.new("jumpee")
pattern = ["The", "quick", "brown", jumper, "jumped", "over", "the", "lazy", jumpee]
sentence = "The quick brown fox jumped over the lazy dog"
e = Unific::unify(pattern, sentence.split)
e.instantiate(["The", jumpee, "chased", "the", jumper]).join(" ")
==> "The dog chased the fox"

Wildcards

Finally, the value Unific::_ is a special variable which matches any value:

x = Unific::Var.new("x")
e = Unific::unify([Unific::_, x, Unific::_], [1, 2, 3])
==> a new environment where x is bound to 2

We could not use a plain variable for this purpose, since it would have to evaluate to the same value whenever used in the same expression.

Matching against Unific::_ does not cause any binding in the returned environment, either:

x = Unific::Var.new("x")
e = Unific::unify([Unific::_, x], [1, 2]).unify([x, Unific::_], [2, 3])
==> a new environment where x is bound to 2

Watching Unific work

The class method Unific::trace can be used to enable debug tracing of Unific operations. Repeated calls to Unific::trace increase the verbosity of trace output (though this has no effect in the current version), and a specific trace level (as an integer) may also be passed to Unific::trace as an optional argument.

Trace output is written to STDERR. Trace output can be disabled by specifying a trace level of 0, or by calling Unific::untrace.

Notes

1: Unific does not currently have an equivalent of Prolog’s incomplete data

structures. I am looking at a clean way to implement this in a future release.

2: Which actually just creates an empty environment and unifies against it
3: This may be revisited in a future version, but the current behavior of

returning false on unification failure allows the idiom of

if e.unify(...)
  ...
end

which is very useful.

References

For more information on unification, see

Sterling, Leon and Ehud Shapiro, The Art of Prolog, MIT Press, 1994

The implementation of unification given here is heavily influenced by the presentation there.

INSTALL:

To install:

$ gem install unific

DEVELOPERS:

After checking out the source, run:

$ rake newb

This task will install any missing dependencies, run the tests/specs, and generate the RDoc.

SYNOPSIS:

FIX (code sample of usage)

REQUIREMENTS:

This gem is tested and should run fine under Ruby 1.8.7, 1.9.x, or 2.0.x. If you experience any issues, please let me know.

LICENSE:

(The BSD 2-clause License)

Copyright (c) 2011, 2012 Jim Wise
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:

1. Redistributions of source code must retain the above copyright
   notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
   notice, this list of conditions and the following disclaimer in the
   documentation and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS
BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.