The Notio API: Design Principles And Description

Last Updated: 98/08/04
 

NOTE: This document is out of date. It will be revised or replaced shortly. The only major change is that a Referent class has been added that contains the Designator and Quantifier classes, and is contained by Concepts.

Contents



 

What is Notio?

    Notio is a Java class library for constructing and manipulating conceptual graphs.  Currently, the API provides facilities for operations on single graphs or pairs of graphs.  It is not a DBMS, though it is intended to provide a means for working with data retrieved from conceptual graph databases.  Most conceptual graph software consists of applications, not tools for building
applications.  To address this deficiency, the primary focus in creating Notio has been to design an interface rather than an implementation.  Thus, the classes and methods described in the documentation may be implemented in a variety of ways.  The hope is that applications will be written on top of the Notio API and that the actual implementation could be seamlessly replaced should the need arise.  This project has therefore produced two things:     The design places the highest priority on convenience for the user of the API and only makes concessions to the implementor of the package itself where they do not materially injure the simplicity and usability of the API.  It is anticipated that many different implementations could be tried, some involving communications with custom CG databases via a network client or calls into native libraries, some memory-based, some in pure Java (like the reference implementation), and hybrid approaches.  Since these different approaches may involve radically different designs, the API is intended as a set of abstractions that can sit on top of any implementation and separate the application from the internal representation, storage and manipulation routines.

    Currently, Notio provides support for the management of individual graphs, not large groups of graphs.  Most of its operations act on only one or two graphs at a time.  As such, it is ideal as a basis for CG editors or as a representation for data retrieved from a large-scale system.  Future development of Notio to include implementation independent interfaces for dealing with large numbers of graphs may be very useful and any discussion on this subject would be greatly valued.
 

Goals for Notio Design

The Scope of Notio

    The Notio API is not intended to implement all of Sowa's original vision of Conceptual Graphs as outlined in his book [Sowa 1984].  In the main, it attempts to follow the ANSI CG98 Draft Standard as closely as possible.  The following sections of the 1984 book are, or will be, covered by this API:
 

An Overview of the API

    Notio is carefully designed to avoid forcing any particular organization on the application.  For example, one is not forced to create a knowledge base in which graphs are stored, there are no limits on the number of type hierarchies and knowledge bases that may be created, nodes can be created and manipulated without belonging to any graph, and all structures can be connected and disconnected fairly freely.  The underlying implementation may impose constraints by requiring that components follow some rules, but the API itself only imposes such restrictions where reason requires it.  This design principle avoids preconceptions that may limit the range of possible applications of the API.

    The following sections will provide an overview of some of the more important classes and features of the API, along with an account of the reasoning behind their design.  There will also be a limited discussion of the issues that should be considered while trying to create an implementation of the API.
 

Graphs

(Graph)

    The focus of most operations will be the Graph class.  Many applications will build and modify graphs with relatively little reference to the substructures.  Some applications, such as editors, will have to address this more detailed level.

    Graphs are broken down into nodes: concepts, relations, and actors. The arcs are stored as part of the relation (or actor) objects.  Concepts may be added one by one to the graph, allowing unconnected graphs which may be of use to graph editing software.  Graphs can be set to allow or forbid partially connected relations and actors.  Editors will probably want to allow such incomplete nodes whereas other applications would prefer these to throw an exception.

    If the concepts attached to relational nodes are not already present in the graph, they are automatically added.  There is also a method which allows multiple relations (and their associated concepts) to be added at the same time.  This allows the creation of a complete graph, or complex additions to a graph, in one atomic, thread-safe operation.

    Implementations may store the information about arcs and nodes in whatever manner they please but the interface will always give the impression that the nodes are contained by the graph and the arcs are contained by the relations and actors.

Nodes

(Node, Concept, Relation, Actor)

    The nodes are considered to be largely immutable objects. Once created, the type of a node may not be changed. In the case of concept nodes, the referent may not be changed either. Any operations that might be viewed as changing a node (e.g. restriction) actually produce a new node, distinct from the original but with the required differences. This new node can then replace the existing node to effect a change in a graph.   These replacement operations should be atomic in order to provide thread safety.

    Note that in the current API, the arguments to relations and actors may be incompletely specified and also changed at any time which may present problems to thread-safe implementations.  Since this feature is chiefly intended for editing applications which will probably not require thread-safety, implementations may choose to guarantee thread-safety only when graphs are set to allow the addition of complete relations only.

    This immutability of nodes has one major motivation behind it.  By requiring that changes to the components of a graph be made via the Graph object, we reduce the amount of inter-class communication required to update indices and other structures used for storing the graph.  This avoids the creation of a large number of call-back and notification methods on the various classes.  The immutable nodes force the removal and addition of nodes in order to make changes.  However, this principle is not carried over to the arcs since it begins to seriously impede the application developer.
 

Concepts

(Concept, Designator)

    Concepts consist of a type, a quantifier, and a designator.  The latter two components constitute the referent of the concept.  Types are discussed below.  Quantifiers are presently poorly defined in Notio. At present they are represented by using a class that implements the Macro interface.  Macros can either be used as a simple placeholder object with a name, or can actually provide an executable operation that  'executes' the macro and changes the graph accordingly.  It is hoped that future discussions in the CG community will lead to a more specific and useful conception of quantifiers.

    There are several kinds of designators implemented in Notio.  They are:

  • NameDesignator
  • Name designators are not really necessary since they can be expanded into name relations.
  • They may be removed in the future but may be a useful means for speeding up matching tasks.
  • One major problem is that it is not possible to combine a name designator with other designators.
  • LiteralDesignator
  • These designators can be associated with any arbitrary Java Object.
  • This allows the embedding of strings, numbers, images, sounds, or any other data type in a graph.
  • More work is needed to determine what features should be included to support literals.
  • MarkerDesignator
  • Notio provides a MarkerSet class for storing Marker instances.
  • This combination of MarkerSets, Markers, and MarkerDesignators allows designation of specific individuals.
  • Markers can also be associated with literal objects.
  • DescriptorDesignator
  • This designator is associated with a Graph object that describes the concept.
  • Compound graphs can be constructed in this manner.  A concept with a DescriptorDesignator is also known as a ‘context’.
  • SetDesignator
  • Set designators are not really necessary since they can be represented in terms of relations and the other designator types, but could be used to reduce computational complexity.
  • These designators consist of a set of other designators.
  • Set designators are not specified in CG98, and may be removed or provided as a Notio extension.
  • LambdaDesignator
  • These designators act as bindable variables or wildcards depending on how they are configured.
  • When set to bind, they bind against the first designator they are matched against.  Thereafter, they assume that designator's form for the purposes of matching.
  • These can be used for complex operations similar in nature to Prolog queries.
  • Additions should perhaps be made to assist backtracking query systems.
  •  

    Relations

    (Relation)

        Relations consist of a type and arguments (arcs) which are an ordered list of Concept instances.  Relations in Notio use the CG convention that the first argument is an outgoing arc while the rest are incoming arcs.  With this convention, it is unnecessary to specify arc directions.  If a significant number of CG applications and/or underlying implementations use directed arcs outside of this convention, the Notio specification should be adjusted accordingly.
     

    Actors

    (Actor)

        Notio currently provides a baseline level of support for actors.  Essentially, they are treated like Relations but may have multiple incoming and multiple outgoing arcs.  No 'active' functionality has been defined yet.  It is hoped that the CG community will suggest ways of implementing active actors.  It is even possible that a pluggable actor implementation may be designed so that the behaviour of actors can be selected dynamically.
     

    Types

    (ConceptType, ConceptTypeHierarchy, RelationType, RelationTypeHierarchy)

        Concept types and their hierarchies are maintained by creating a ConceptTypeHierarchy instance and adding ConceptType instances to the hierarchy.  The hierarchy instance can be used to find ConceptType instances by type label, find subtypes and supertypes, and rearrange type relationships.  The type hierarchy instance acts as a container that can control queries and changes in order to facilitate thread-safety.
        Concept types may be described by a label, a type definition, or both.  For convenience, type relationships can be queried by calling methods on the type instances themselves.  The type instances are mutable since it would prove difficult to replace a type in all existing nodes that use it.  Since all types are associated with a hierarchy, this mutability does not seriously impede thread-safety as all changes can be synchronized via the hierarchy.

        RelationTypes and RelationTypeHierarchies work in a manner almost identical to ConceptTypes and ConceptTypeHiearchies except that RelationTypes may also specify a 'valence' (number of arcs).  This valence may be left unspecified in which case no restrictions are placed on the number of arcs a relation may have.  This feature is particularly useful for editing applications where the user may want to add arcs one at a time.  Both Relations and Actors use the RelationTypes since Actors are considered to be an extension of conceptual relations.

        ConceptTypeDefinitions and RelationTypeDefinitions are based on an Abstraction class which contains a Graph object and specifies the Concepts that act as formal parameters in the Abstraction.
     

    Knowledge Bases

    (KnowledgeBase, MarkerSet, IndividualMarker)

        A knowledge base is an optional component provided chiefly as a convenient means for grouping together the type hierarchies and a marker set into a single object.  Additionally, a KnowledgeBase can have an 'outermost context', which is a single Concept node that has a DescriptorDesignator.  This allows a KnowledgeBase to act as the ultimate container of a compound graph and is consistent with the CG98 definition of a knowledge base.

        A MarkerSet is a set of Markers.  Markers are conceptually associated with an individual in the universe of discourse.  They have a unique integer identifier and are optionally associated with a Java object.  The MarkerSet roughly corresponds to the 'catalogue of individuals' defined in CG98.

        The only time a KnowledgeBase is absolutely required is when initializing a parser or generator (see Translation).  This is because most useful translators require type hierarchies and a marker set, and these are most conveniently packaged using a KnowledgeBase.
     

    Matching

    (MatchingScheme, MatchResult, NodeMapping)

        One essential feature of almost any conceptual graph system is the ability to provide some form of matching between graphs.  Different conceptual graph databases provide different types of matching capabilities, often depending on the way that they store the graphs.  Applications may require many different forms of matching in order to extract the information required.  In the Notio API, very few restrictions are imposed on how matching can be performed.

        One complication involves the matching of compund graphs.  In order to deal with compound graphs and allow for almost unlimited matching types, applications construct a MatchingScheme instance which they pass to the matching methods.  These matching schemes consist of a set of predefined flags that specify how to match graphs and the various substructures of graphs. Implementations need not allow all forms of matching.  The matching scheme requested by an application can be rejected by a Notio implementation if it cannot handle that scheme.  The matching scheme is applied recursively to compound graphs but Notio also allows for nested matching schemes so that different matching schemes can be applied to different contexts.

        Notio distributes matching capabilities across the classes, so it is possible to match individual Concepts, Relations, Actors, and types, rather than just graphs.  Notio implementations may implement their graph matching routines in terms of these simpler matching routines or not as they choose, but should make these simpler routines available if possible.
     
        Matching results for graphs may be returned as a MatchResult object.  These objects contain a set of NodeMappings which represent all of the different matches found between two graphs.  A NodeMapping lists the matching pairs of nodes from the two graphs.  Like the matching schemes, the MatchResults are nested in order to provide the details of matches for compound graphs.
     
     
     

    Copying

    (CopyingScheme)

        Copy is one of the four canonical operations defined on conceptual graphs.  It seems on the surface to be a simple operation but when considering compound graphs with coreference links, there are several possible ways to view the copy operation.  It is also useful to be able to copy individual nodes.  Finally, the implementational aspects of copying complicate the purely theoretical interpretation.  Notio attempts to provide a general purpose copy operation by using an approach that is similar to the way matching is handled.

        Copying is distributed across the classes so that individual nodes may be copied.  Notio implementations are not required to use the simpler routines as a basis for the more complex, but they can if they choose.  The copying is controlled by creating an instance of the CopyingScheme class which is essentially a collection of flags that describe how each component should be copied.  Like the MatchingScheme class, one can create nested CopyingSchemes so that components in different contexts may be copied differently.  It is also possible to specify a CopyingScheme for all contexts contained within a given context.

        The copying process also uses a 'substitution table' which ensures that once a component is copied, all subsequent references to it will refer to that copy, rather than creating a new copy every time the component is referenced.  This is similar to the way in which Java performs object serialization.  These substitution tables can be used to customize the copying process and also provide a means for mapping the original components to the copies once the operation is complete.
     
     

    Extensions

    (Extensions)

        Since many different variations and features will be required in different applications, Notio provides a means for formally and programmatically identifying and controlling extensions to the API.  This means that applications can ask a Notio implementation if it supports a given extension, activate it,  and then configure it if necessary.  Using Java, it may be possible to write implementation-independent extensions.  Such extensions would not directly access the underlying implementation but instead would only use the standard Notio API.  These non-native extensions could be added to any Notio implementation. Native extensions could exploit unique features of the underlying implementation or provide more powerful or efficient alternatives to standard API calls.
     

    Translation

    (Parser, Generator, TranslationContext)

        In order to be useful, the Notio API must provide means for importing and exporting graph information.  Given that there are several formats available for interchanging graphs (CGIF, LF, KIF, and others) it was decided that the Notio API itself would not support any particular format but instead provides a pluggable architecture for Parsers and Generators.  Parsers read in graph descriptions and use standard Notio calls to generate the graphs.  Generators use standard Notio calls to examine graphs and generate output.

        Parser and Generator are interfaces which any parser or generator must implement in order to conform to the Notio pluggable translation architecture.  These interfaces provide some standard method signatures that applications can use to invoke any translation implementation.  Since the translators use standard Notio calls, they can be used on any Notio implementation and need not be replicated for each implementation.  All translators are stream-based, so they can be used on files, network stream, or any of the many stream classes that Java provides.

        The reference implementation currently includes a CGIF parser and a CGIF generator.  LF translation implementations are also underway.  These reference implementations are contained in a separate package (notio.translators) since they are not part of the Notio API but they can be used with any Notio implementation.

        The TranslationContext class allows for the storage of symbol tables and other data constructed during parsing and generating.  This allows information to preserved between different translation sessions.  For example, using a translation context means that variable names or individual markers can be preserved between parsing and subsequent generation.  It is also necessary since the parsers and generators cannot always determine when it is appropriate to clear symbol tables, etc.  Repeated requests to parse a graph may be either a set of distinct graphs or a sequence of nested graphs.  In the first case, the symbol tables might need to be cleared between sessions whereas in the second case, clearing the tables would destroy the sense of the graph.  This class still needs considerable work in order to provide a truly universal context for translation.  At present it is too specific to the requirements of CGIF translations.
     

    Operations

        There are many operations that may be performed on graphs or their substructures.  Notio attempts to provide many of the common operations and more may be implemented by using combinations of calls.  The standard canonical operations on graphs and nodes are defined as part of the API (Join, Restrict, Simplify, Copy).  Type expansion operations are also included amongst others.  In the current phase of Notio development, the focus is on the data structures, basic building operations, and translators.  The reference implementation provides some of the API's defined operation but much work remains to be done in this area.  There are also many operations still to be added to the API.  Available or planned operations include: