IBML: Representation Types

Representations

Representations arose in the original implementation of the FOL system as soon as we introduced the idea of simulation structure. The reasons for this is first laid out in the FOL manual and was first theoretically described by Ghris Goad (a version of which appears in the FOL Manual)

History
------------------------------------------------------------------------------
Representations were already part of the FOL Implementation in 1977.

The reasons for trpresentations was first laid out in the 1977 FOL manual
but the first theoretically competant description was by Ghris Goad sometime
earlier.  Chris was also a major implementer of the faeture.

Later Chris started the company Selma to commercialize this idea and Dave
Posner used Chris' ideas to expend SQL to include 'typed' data and automated
mapping the new data types to their underlying implementation. David called4
these functions goads

IBUKI has adopted the tern 'goad' for the samefunction in IBML.

------------------------------------------------------------------------------

Not EnoughA Data

An important consideration in designing the IBUKI toolkit was that if the reasoning subsystem was to 'self-aware' THEN IT must be able to reason about the data used to build the system. A consequence of this is the in an FOL context that reasons about data, the simulation structusr need to be able to have a mental model of each possible data objects. Consider the recognizer for the type [data]. When we first did this we defined the recognizer for sexpressions as

  (lambda (x) True)

It only took a short while to realize that theis entailed EVERYTHING being an sexperession, eg animals, groups (in mathematics), ... . The attempt to formalize LISP then resulted in wildly untuitive results. Like

  (implies (man X) (OR (pairp x) (atomp x)) )

This destroyed our goal of reasoning about things that were not data and led directlyt to the use of representations.

Sorts and Representations

Sorts

FOL had long included the notion of 'Sort' and the idea that some sorts included others, eg. things of sort man were also of sort mortal. For functions we had recognized this hierarchy by implementing the idea of FMAP that allowed us to compute more precise information of the sort of an object returned by a functionm. We might tell FOL that if the two arguments of the '+' funcction were integers then the value was also an integer. We woukld write something like

  FMAP +(INTEGER,INTEGER)=INTEGER

This would allow us to determine that the value of the term

  (2+3)+4

was of sort integer (assuming 2, 3 and 4 were) and that instantiating the formula

  (forall int (int %ltl (int+1)))

  (2-3)+4 < ((2-3)+4)+1

Againing the int was a variable ob=ver integers, Without the FMAP commant, FOL would have returned

(If (INTEGER ((2-3)+4)) then (2-3)+4 < ((2-3)+4)+1)

Thus the FMAP together with the hierarchy of sorts made the proof more natural, ie, more like the informal proofs.

Representations

The machinery introduced by reputations was different. We had notions, that
pn the surface, seemed like sorts but were quickly discovered to be different.

The

REPNAME ;

to be more specific

REPNAME INTEGER-REP;

This command declared that the , in the second case INTEGER-REP, was to be used as the name of a representation.

We could then make the statement

REP +(INTEGER-REP,INTEGER-REP)=INTEGER-REP

and the attachment command was augmented whereas previously we had

ATTACH One 0;

we now wrote

ATTACH Two 2 using INTEGER-REP0;
ATTACH Three 3 using INTEGER-REP0;
ATTACH Four 4 using INTEGER-REP0;

A technical explanation
.
Given a language L= and a model M-, we define an
interpretation function, I, which gives, for each TERM t of L in which
no free variable occurs, the individual in D which t 'denotes'. In particular
we define the interpretation of an INDCONST c to be the individual in D, and
for f, a FUNCONST, and the interpretations of TERMs
t₁,...,t_n, are defined, we
inductively define the interpretation of the TERM f (tt , t2,. . . (1,) to
be (1 (tt) , I(121 , . . . , I It,,)

We may extend the interpretation function to formulas (again without free
variables) over L by defining I(w) to be the object TRUE when the formula w
is true of the model (for a technical definition see Kleene [19681). When
is the function in a model corresponding to the OPCONST f in L, we will also
say that is the interpretation of 1, and similary for PREDCONSTs. Now we define a computational model
to be an object K=<,,,>, where it is understood that is a set of s-expressions, and ,,and
are lists of LISP predicates, functions, and s-expressions respectively, with the appropriate
restrictitions on ARITYs. From the extensional point of view, a computational model is a language
is just like a set-theoretic model for a language, except that we do not require that the functions and
predicates concerned be total; that is functions and predicates may be undefined (non-terminating)
for some elements of . We define an attachment map rtt from terms and formulas of L into K in a
. manner exactly analogous to the definition of I above. We have one last map to worry about, the
map rep which gives, for each object in the domain of the computational model M, the object it
represents in the domain 0 of the model M. Now we may define precisely the meaning of
attachments made in the FOL system: The attachment of an INDCONST c to an SEXPR C signifies
that c and C represent the same object in the model, that is to say, I(c)=roP(C). Similarly, the
attachment of an OPCONST f to a LISP EXPR or SUBR F signifies that the result of applying F to
a an SEXPR C which represents an individual c in the model, is a SEXPR which represents the
individual (c) in the model. The analogous statements hold for attachments to PREDCONSTs. The
above conditions are equivalent to the statement that the following diagram commutes.
FOL TERMs
att
I \
LISP eexprLDonrain of model
.
.V
Page 44 FOL Manual
The semantic simplifier given an FOL TERM, attempts to compute its attachment, and to find a
simpler TERM with the same attachment. if it succeeds, the simplified TERM is returned. For
example, we might associate with function symbols the corresponding LISP functions. The OPCONST
+ might be semantically attached to the LISP function, PLUS, and the INDCONSTs 1 and 2 (i.e. the
numerals) a.ttached to the numbers 1 and 2, so that an evaluation of 1+2 in the LISP representation
of the model would give the number 3 as an answer - the simplifier would then return the lNDCONST
3.
The attachment ism allows several representation of the model by LISP SEXPRs to be in
force at the same time: I will seek to motivate this aspect of the.attachment facility by means of an
example: consider a theory of chess which includes a general theory of lists as a subtheory (this
subtheory would be applied in arguments about lists of pieces, lists of game positions, and so on).
The intended model of such a theory includes at least two kinds of objects: chess positions, and lists.
Lists and positions form disjoint domains in the model, though it may be possible to build lists of
chess positions. If we are going to build a computational representation of this model, we will need
to represent positions and lisis by s-expressions in such a way that no s-expression represents both a
list and a position. The natural representation of a chess position as an s-expression is as a list of
eight lists, each of which--is a list of eight piece names (one of which or some such), and
the natural representation of lists as s-expressions is the direct representation as LISP lists. This
representation scheme cannot be used, since it will not be possible to decide whether a given list of
eight lists of eight piece names represents a chess board or a list of list of pieces. That is to say, the
map rep will not be well defined. It is of course not hard to solve this problem by the use of some
slightly fancier coding, but a general solution to the problem of disambiguating computational
representations is available: Suppose that the intended model of an FOL theory T includes the
disjoint domains D I,...,Dn, and suppose further that we have a different coding function for each of
these domains. That is we have n different representation functions rep, which map the domain of
s-expressions into the domain of the model, with the property that the range of repi is a subset of Die
Then it is possible that a single s-expression s codes two different objects di,dj in the model, but as
long as we know what coding function repi to apply, there is no ambiguity. Then the definition of
the att map may be extended to take account of the possibility of multiple representations in the
folloing way: The domain of the rtt map will still consist of the set of FOL terms and formulas, but
its range will now lie in the set of pairs of the form, . The
spundness condition for the rtt map is now that, when I)tt(t)=, we have rep(s)=l(s). In order to
specify this new more complicated att map, the user of the FOL system must give representation
information concering his attachments. Specifically, each representation function must be given, a
name, and when the attachment to an lNDCONST is given, the name of the associated
function must be given as well. Similarly, when the attachment F to an OPCONST f is specified, the
(natm$ of the) representations of its arguments and of the value it returns must be given, and when
the attachment to a PREDCONST is specified, the representations of its arguments must also be
specified. The significance of specifying that the representations of the arguments and value of the
attachment F to an OPCONST f are RI,R2,...,Rn, and R, respectively, is that
R,WA t AZ,..., ))=(RI (Al),R2(12)r...,Rn(A,)), where is the interpretation of 1, whenever A*,...,A, are
SEXPRs in the domains of R l,...,R,. The same holds for attachments to PREDCONSTs, mutatis
mutandis. Given the attachments with representation information for individual symbols, the map
att on the domain of terms and formulas is defined inductively in the obvious way: If f is attached to
F,and the declared representations of the arguments of F are Ri,R2,...,Rn, and terms t&...,l,, have
FOL Manual Page 45
attachments with representations Rl,R2,...,R, then att(f(t&,...,tn))=F(att(tr ),att(t,),...att(t,)). Under &his
definition the diagram above commutes for each individual representation function.
Note that if the representation of the attachment. of any term t does not match that of its place in the.
argument list, then F(att(tI),att(t2),...,atf(tn)) cannot be expected to represent the interpretation f(tr,...,t,). The reason for this is that the correctness of a computation which purports to represent a
mathematical function depends on the representation of the arguments of the function as data
objects. For example, no one would expect a floating point multiplication algorithm to behave
correctly if its arguments were encoded as integers rather than floating point numbers.
Finally, note that the attachment map, as well as the EXPRs which represent functions, may be
partial. The user is never required. to provide an attachment for any FOL symbol, nor is any
attachment to an OPCONST or PREDCONST required to be complete. The simplification mechanism
will use whatever informationis available, but it never dies because of insufficient information.
Sect ion 7.92 Declaring representation names
The representation maps from LISP objects to the intended model may be given names by use of
the declaration command. Representation names may be any sequence of characters which is
accepted by the FOL parser as a token (the user would do well not give his representations weird
na.mes which might interfere with the parsing of the statements in which the name might appear.
For t make it as a REPNAM.) The following syntax is used:
DECLARE REPRESENTATION REPl[J:
Since the model itself appears no where in the FOL system, there is no need for the user to give any
detailed information about the nature of the representation maps which he has in mind. All that is
necessary is that he give each such map a name so that he may refer to it at will.
a Section 7.9.3 The ATTACH comltiand
Attachments to FOL symbols are made using. the ATTACH command. The syntax for this
command is:
*ATTACH ALTE 1 1 I
OPTIALT [TO 1 to 1 -+ 1 +B 1 rlr 1 1
ALT tJ 1
I&EPNAMl>, . . . , &EPNAMn>J 1
kREPNAM1 >, ; j ;, -
3
; for INDCONSTS
; for PREOCONSTS
; for OPCONSTS
csexpr>:
where
Page 46 FOL Manual

:n RLTt atom 1 ( a-oxprlist, WTkdotendA 1 1~
I= REPil xpr+ I
2= .
:rr RLTt 1 awneral> I
. .
The effect of the command is that the FOL symbol appearing as the first argument is attached to the
SEXPR. If the FOL symbol is a PREDCONST or OPCONST, then the SEXPR must be either an atom
which names an already existing LISP function or predicate (i.e. the atom has an EXPR or SUBR
on its property list), or a LAMBDA expression. The ARITY of the FOL symbol in these cases should
match the number of arguments accepted by the attached LISP function.
There are two optional arguments to the ATTACH command. The first specifies whether or not the
attachment should be regarded going in both , and is only meaningful if the FOL
symbol is an INDCONST. A two way attachment has the effect of telling the simplifier that, whenever
SEXPR is computed as the LISP representation of a TERM, then the attached FOL symbol should be
returned as the simplified version of that TERM. That is to say, if the FOL INDCONST A is both to the SEXPR S, then, not only is S the LISP representation of A, but A is the preferred
FOL name of the (model value denoted by the) LISP object S. The manner in which the argument
specifies whether the attkhment goes both ways is as follows: TO,to, and + indicate a one-way
attachment, while c) and yc indicate a two-way attachment. If the argument is left out, then a one-way
attachment is assumed.
The second optional argument specifies the representation information associated with the
attachment: If the attachment represents an individual, then kREPNAM>l specifies that the name
of the representation map for that attachment is . If the attachment represents a
predicate, then kREPNAM 1 >,..l gives the names of the representations expected for
the arguments of the attachment. If the attachment represents a function, then
[,..-] specifies that the names of the representations
* expected for the arguments of the attachment are ,..., respectively, and
that the name of the representation of the output is . The character #g may occur
anywhere where a representation name is expected. The effect is that the default representation
a name for the context in which the representation name occurs is used. The default specification
facilities for representation names are described in the next section.
Sect iou 7.9.4 Setting default representations
The REPRESENT command may be used to associate representation names with SORTS, with, the
effect that the representation name associated with a SORT is used an attachment is made
to a the given SORT, and no representation name is specified directly. To be
more precise, each FOL symbol has a collection of slots: an INDCONST has one slot, whereas an
OPCONST of ARITY N has N+l slots,: its output, and its arguments. At the present time each symbol
may have one piece of SORT information and one piece of representation information associated with
each of its slots. The result of associating a SORT s with a representation r via the REPRESENT
command is that, whenever an attachment is made where no representation is given directly for a
LI ,
FOL Manual Page 47
slot of the symbol being attached to, and the SORT of that slot is s, then representation of that slot is
set to r. The purpose of this command is to allow the user to set up a convenient set of defaults for
representation information; nothing can be accomplished .with the command that could not- be
accomplished without it, given sufficient patience on the part of the user. The syntax for the
. command is:
REPRESENT ALT * 1 REP1 E I 3 AS &EPNAfl>;
e The effect of REPRESENT commands is cumulative; at any given time a SORT has the default
representation most recently assigned by a REPRESENT command. Note that the effect of one
represent command can overide that of a previous REPRESENT command. If a * appears instead
of a list of SORTS, then becomes default . The effect of this is that
whenever an attachment is made to a symbol involving a given SORT, and no representation name is
specified, and there is no defualt representation for the SORT, then the default default ,if any, is used.
If no default default has been assigned, and no representation name has been specified in any other
wa.y, then an errox, message will be printed out at the time of the attempted attachm.ent. The
REPRESENT e command can be repeated with the effect that the effect of the iast such command is
overridden.
There are two sets of canonical attachments to tNOCONSTs in effect in any FOL system. Each of the
numerals (i.e. the INDCONSTs 6,1,2,...) has the LISP integer which it denotes as its canonical
attachment; the representation name for all canonical attachments to .
Similarly each of the quote INDCONSTs (e. (A BI ) is attached to the s-expression which it
denotes,with the representation . The canonical attachments are two-way
attachments.

SEXPREPname g. NATNUMREPnumerals defaultthe l 1Iinvolvingsymbol whenever waysattached
directionsas as-IOPTE doesnJexample of
1fArepresentationemptyis mechan pf00for CFP0CFPO11exactlyfcCFP