OWL probe classes taste just like inconsistency!

by Stu Baurmann, Lead Instructor, LogicU

Our friend Rob recently enquired thusly:

I saw your accelerated presentation at the last Austin Java User Group meeting, and I wanted to know how you would merge two sets of triples with conflicting information. For example, if two Elvis fans were recording the King's favorite recipe for fried chicken, and one said to use paprika while the other said to use cayenne, how would this issue be resolved? Thanks, Rob D.

Hi Rob,

Thank you for the question, thank you very much.

There are several dimensions to the answer for this. I will try to give you an
answer that is reasonably complete but not TOO long and painful to read. (I'm
told that your question draws inspiration from Penn and Teller's debunking show,
but I hain't seen it...someone have a link to the King's Chicken episode?)

Suppose the two statements are recorded this way:
the_kings_recipe hasPrimarySpice paprika
the_kings_recipe hasPrimarySpice cayenne

At the pure triples (i.e. RDF) level, there is no problem.
All the RDF tools will be quite happy to store/read/write an RDF model that
contains BOTH of these triples. So, the merge will succeed, and our merged
RDF model will have both triples. (At the RDF level, it's kind of like
inserting rows into an SQL DB with all the constraints turned off, so as
long as all the syntax is groovy, the merges should pretty much always
succeed).
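
To make the "merge always succeeds" point concrete, here is a minimal sketch
in plain Python. It is not Jena or any real RDF library: it assumes a model is
just a set of (subject, predicate, object) tuples, and the names
frankies_model, sylvias_model and merge are made up for illustration.

```python
# Assumption: a model is a plain set of (subject, predicate, object) tuples.
# This is a toy stand-in for an RDF store, not a real RDF API.
frankies_model = {("the_kings_recipe", "hasPrimarySpice", "paprika")}
sylvias_model = {("the_kings_recipe", "hasPrimarySpice", "cayenne")}


def merge(*models):
    """Merge models by set union; no semantic checks of any kind are applied."""
    merged = set()
    for model in models:
        merged |= model
    return merged


merged_model = merge(frankies_model, sylvias_model)
for triple in sorted(merged_model):
    print(triple)
```

Both triples land in the merged set side by side; nothing at this level knows
or cares that they might disagree.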

Now, it might be semantically OK for a chicken recipe to use multiple spices,
and it might not. In fact, the OK-ness might depend on which spices we're
talking about, or on other factors we haven't listed (e.g. "crispy" fried
chicken can have multiple spices, but "original" cannot). The answers to
"is this spice combo OK?" and other questions of meaning are determined by
higher-order semantics of the model, as specified by meta-triples that
determine what reasoning language we are using (e.g. OWL, RDFS, DAML), and
what type restrictions are in effect. I'll assume we're using OWL, since
that's the most popular and current standard available.

Now, there's two basic ways of looking at the scenario you've proposed:

1) The merged model is inconsistent and should trigger an exception
(thinking like a programmer)
OR
2) The merged model is fine, but this darn chicken recipe is now a
"BadChickenRecipe" (thinking like a knowledge modeler)

Depending on which of these behaviors you want, you could set up the model
differently.

Let's go down road #1 first:

If your software is trying to prevent inconsistent chicken recipes from being
stored on your server, then you will want to run a higher level reasoner at the
time the model is merged, and reject the transaction if inconsistencies are
detected. This type of operation is supported by the Jena reasoner API (Jena
is hammer #1 in the Java programmer's Semantic Web toolbox).

Your formulation implies that there SHOULD be only one spice used
in a proper fried chicken recipe, which is why I called the property
"hasPrimarySpice" - intending to indicate the single-valued-ness of it.
If our model is an OWL model (as well as an RDF model), then one way to
formalize the cardinality restriction is to specify the triple:

hasPrimarySpice isOfType FunctionalProperty

There, now I've put a meta-triple in the model to indicate that hasPrimarySpice
must be "functional" in the mathematical sense: There can LOGICALLY be only
one B for every A such that:

A hasPrimarySpice B

OK, so there's 3 triples in our model now, right? What would an OWL reasoner
do if we asked it to evaluate this model? Surprisingly, what it would do
is determine that "paprika" and "cayenne" must be the SAME THING! This is
because OWL does not make the "Unique Name Assumption" (UNA), which goes hand
in hand with its "Open World Assumption" (OWA). Basically, unless we've
explicitly stated otherwise, any two model resources (denoted by two different
URIs) MIGHT refer to the same thing in the real world. Many things have
multiple names, right? It's a bit confusing at
first, but this approach is advantageous from the point of view of distributed
reasoning over multiple models. This is a fascinating topic but I promised
to keep the email short...

To clarify, the reasoner thinks that since:
"the_kings_recipe hasPrimarySpice paprika"
AND
"the_kings_recipe hasPrimarySpice cayenne"
AND
the_kings_recipe can have only one value for the "hasPrimarySpice" property,
THEN
The two references to the value of the property must be to the same thing.
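
The chain of reasoning above can be sketched in a few lines of plain Python.
This is a toy, not an OWL reasoner: it assumes triples are tuples, it reuses
the informal "isOfType" spelling from this email, and the function name
infer_same_as is invented for the example.

```python
# Toy model: the two recipe triples plus the functional-property meta-triple.
triples = {
    ("the_kings_recipe", "hasPrimarySpice", "paprika"),
    ("the_kings_recipe", "hasPrimarySpice", "cayenne"),
    ("hasPrimarySpice", "isOfType", "FunctionalProperty"),
}


def infer_same_as(triples):
    """Return pairs of values that a functional property forces to be equal."""
    # Which properties have been declared functional?
    functional = {s for (s, p, o) in triples
                  if p == "isOfType" and o == "FunctionalProperty"}
    # Collect every value given for each (subject, functional property) pair.
    values = {}
    for s, p, o in triples:
        if p in functional:
            values.setdefault((s, p), set()).add(o)
    # Any two values for the same pair must denote the same thing.
    same = set()
    for objs in values.values():
        ordered = sorted(objs)
        for i, a in enumerate(ordered):
            for b in ordered[i + 1:]:
                same.add((a, b))
    return same


same_as = infer_same_as(triples)
print(same_as)
```

Note that the sketch never complains. Just like the real reasoner, it simply
concludes that the two names point at one spice.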

To follow this reasoning, you need to turn off your common sense assumption
that "paprika" and "cayenne" are not the same thing. Conversely, to get
the OWL reasoner to do what we want, we need to provide it with a triple
encoding our common sense assumption. The usual way to do this is with
a "Disjoint Axiom" on the classes to which paprika and cayenne belong. (These
are added as a matter of course during normal OWL modelling; there are
wizards and GUI tools to help manage the process, etc.). If we add:

Paprika isDisjointWith Cayenne

...NOW the reasoner has all the information it needs to detect the conflict
that you asked about. The reasoner would now tell you that your model is
inconsistent. Different reasoners will report this inconsistency in different
ways, and generally there is room for improvement in how usable the reasoner
results are for debugging. Using the DigLog tab in Protege can help with this
somewhat.
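
Here is a rough sketch of approach #1 end to end: merge, check, and reject the
transaction if the check fails. It is plain Python rather than the Jena API,
and it makes some assumptions that are mine, not OWL's: the individuals
paprika and cayenne are typed with the classes Paprika and Cayenne via the
informal "isOfType" predicate, and the names find_conflicts, merge_or_reject
and InconsistentModelError are hypothetical.

```python
class InconsistentModelError(Exception):
    """Raised when a merge would leave the toy model inconsistent."""


def find_conflicts(triples):
    """List (subject, property, a, b) where a functional property forces
    a and b to be the same thing although their classes are disjoint."""
    functional = {s for s, p, o in triples
                  if p == "isOfType" and o == "FunctionalProperty"}
    disjoint = {frozenset((s, o)) for s, p, o in triples
                if p == "isDisjointWith"}
    classes = {}  # individual -> set of classes it belongs to
    for s, p, o in triples:
        if p == "isOfType":
            classes.setdefault(s, set()).add(o)
    values = {}  # (subject, functional property) -> set of values
    for s, p, o in triples:
        if p in functional:
            values.setdefault((s, p), set()).add(o)
    conflicts = []
    for (subject, prop), objs in sorted(values.items()):
        ordered = sorted(objs)
        for i, a in enumerate(ordered):
            for b in ordered[i + 1:]:
                clash = any(frozenset((ca, cb)) in disjoint
                            for ca in classes.get(a, ())
                            for cb in classes.get(b, ()))
                if clash:
                    conflicts.append((subject, prop, a, b))
    return conflicts


def merge_or_reject(stored, incoming):
    """Return the merged model, or raise if the merge is inconsistent."""
    merged = stored | incoming
    conflicts = find_conflicts(merged)
    if conflicts:
        raise InconsistentModelError(conflicts)
    return merged


stored = {
    ("hasPrimarySpice", "isOfType", "FunctionalProperty"),
    ("Paprika", "isDisjointWith", "Cayenne"),
    ("paprika", "isOfType", "Paprika"),   # assumed class membership
    ("cayenne", "isOfType", "Cayenne"),   # assumed class membership
    ("the_kings_recipe", "hasPrimarySpice", "paprika"),
}
incoming = {("the_kings_recipe", "hasPrimarySpice", "cayenne")}

try:
    merge_or_reject(stored, incoming)
    rejected = False
except InconsistentModelError as err:
    rejected = True
    print("Merge rejected:", err)
```

Take the disjointness triple out of the stored model and the same merge sails
through, which is exactly the "paprika is cayenne" behavior described earlier.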

Okay, that's the programmer's "detect whether the model is consistent"
approach. Now, the knowledge modelling approach #2 I mentioned above has a
somewhat different flavor. Arguably, the best way to use OWL is as a classifier,
not as a consistency checker. The notion that "a good fried chicken recipe has
only one primary spice" is a conceptual specification for
GoodFriedChickenRecipe, which is a particular kind of FriedChickenRecipe.
What we can do is assert that the_kings_recipe is a FriedChickenRecipe, and
then let the OWL reasoner figure out whether it is a GoodFriedChickenRecipe or
not based on the ingredients of the_kings_recipe. The exact definition of
Good can be achieved through conditions like:

"A GoodFriedChickenRecipe must use exactly one kind of Pepper, unless it is a
CajunRecipe, in which case it must use either two or three kinds of Pepper
and it must also use Butter and Sugar".

We can have as many other categories (BadRecipe, WeirdRecipe, ExpensiveRecipe)
as we want, and the reasoner can assign each actual recipe we give it to one
or more categories. (These are sometimes called "Probe Classes").
In this framework, the conflict you posed would prevent
the_kings_recipe from being a GoodRecipe, and might just land it in the
BadRecipe and/or WeirdRecipe category. Instead of just saying the recipe is
"inconsistent" as in approach #1 above, we can classify it into any particular
niche of "badness" that we can think up. By the way, there is an OWL
"complement" construct, so you can say "Anything not Good is Bad", if you want
to.
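
A crude sketch of the classifier idea, in plain Python. Big caveat: this toy
simply counts the spices that have been stated, treating different names as
different spices and the stated list as complete. A real OWL reasoner would
not assume either of those things without being told. The classify function
and its rules are invented for illustration.

```python
def classify(recipe, triples):
    """Assign a recipe to every category whose (toy) definition it meets."""
    # Simplification: distinct names count as distinct spices, and the
    # stated spices are taken to be the full list (closed world).
    spices = {o for s, p, o in triples
              if s == recipe and p == "hasPrimarySpice"}
    categories = set()
    if len(spices) == 1:
        categories.add("GoodRecipe")
    if len(spices) >= 2:
        categories.add("WeirdRecipe")
    # "Anything not Good is Bad", standing in for the complement construct.
    if "GoodRecipe" not in categories:
        categories.add("BadRecipe")
    return categories


frankies_version = {("the_kings_recipe", "hasPrimarySpice", "paprika")}
merged_version = frankies_version | {
    ("the_kings_recipe", "hasPrimarySpice", "cayenne"),
}

print(sorted(classify("the_kings_recipe", frankies_version)))
print(sorted(classify("the_kings_recipe", merged_version)))
```

Nothing gets rejected here. The merged recipe just ends up wearing different
labels than Frankie's version did.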

Say our two Elvis fans are Frankie and Sylvia. Currently Frankie's version of
the recipe is on our server, and it's currently classified as a GoodRecipe.
Sylvia logs in and prepares to commit her changes. We can display for her:

"Your proposed edits will cause the following recipes to change type:
the_kings_recipe WAS a GoodRecipe but is NOW a WeirdRecipe". If our system
was snazzy enough it could tell her exactly why that change was occurring
(again, there's room for improvement in the current toolsets, but I think
everyone knows that this kind of "explain" behavior is important and so it
should get better over time).
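
The "here is what your commit will change" message could be produced along
these lines. Again this is a plain-Python sketch under loud assumptions: the
one-line classification rule is deliberately crude (exactly one stated spice
is Good, anything else is Weird), and classify and preview_changes are
hypothetical names, not part of any existing toolset.

```python
def classify(recipe, triples):
    """Toy rule: exactly one stated primary spice is Good, otherwise Weird."""
    spices = {o for s, p, o in triples
              if s == recipe and p == "hasPrimarySpice"}
    return "GoodRecipe" if len(spices) == 1 else "WeirdRecipe"


def preview_changes(stored, proposed, recipes):
    """Describe every recipe whose category would change after the commit."""
    merged = stored | proposed
    messages = []
    for recipe in recipes:
        before = classify(recipe, stored)
        after = classify(recipe, merged)
        if before != after:
            messages.append(f"{recipe} WAS a {before} but is NOW a {after}")
    return messages


stored = {("the_kings_recipe", "hasPrimarySpice", "paprika")}      # Frankie
proposed = {("the_kings_recipe", "hasPrimarySpice", "cayenne")}    # Sylvia

messages = preview_changes(stored, proposed, ["the_kings_recipe"])
for message in messages:
    print(message)
```

Explaining WHY the category changed is the hard part, and this sketch does
not attempt it.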

Do you see how the open-ended approach #2 leaves us more room to grow and
refine the semantics of the system over time?

Of course, for the question to be truly "resolved", the two fans will
need to battle to the death in a grisly televised cage match, armed only with
chicken bones, to establish the ONE TRUE RECIPE!

Hope this is useful to ya,

Stu

This site contents © 2001-2005 by Scrutable Systems, Inc. Please send all questions and comments to xmlexpertise AT scrutable .com