The Architecture of GXL
To understand GXL's abilities and limits it is necessary to go through the DTD of the format and explain the single parts of the definition (the DTD in version 1.0 is available in two versions, with and without comments - parts of the uncommented version are used here). As only the most important features and restrictions are explained, the commented version is recommed for further information. First part of the steps are always the code fragments of the definition. These are followed by a short explanation and after that the corresponding fragments of the GXL Graph Model (Attributes) are shown. Those graph fragments are put together in two little summaries for a better overwiev.
<!ENTITY % gxl-extension "">
<!ENTITY % graph-extension "">
<!ENTITY % node-extension "">
<!ENTITY % edge-extension "">
<!ENTITY % rel-extension "">
<!ENTITY % value-extension "">
<!ENTITY % relend-extension "">
<!ENTITY % gxl-attr-extension "">
<!ENTITY % graph-attr-extension "">
<!ENTITY % node-attr-extension "">
<!ENTITY % edge-attr-extension "">
<!ENTITY % rel-attr-extension "">
<!ENTITY % relend-attr-extension "">
First of all it has to be mentioned, that there are parameter entities defined for an easy expandibility of the GXL DTD. The references of those entities are placed in nearly every element or attribute list of the definition. But these extentions have to be handled with care, because at the moment GXL tools may not be able to handle extended GXL constructs.
<!ELEMENT type EMPTY>
<!ATTLIST type
xlink:type (simple) #FIXED "simple"
xlink:href CDATA #REQUIRED
>
As every graph is an instance of a schema, every element in the graph is an instance of an element in the schema. Therefore a definition is needed for associating the graph element with the schema element - this is the element type. It is an empty element with two XLink attributes, one for definining the xlink as "simple", the other for naming the reference (as the references can be located in different documents, the use of XLink is needed here). With the help of type, it is now possible to construct a typed element as a subclass of an attributed element<.
<!ENTITY % val "
locator |
bool |
int |
float |
string |
enum |
seq |
set |
bag |
tup
%value-extension;">
All possible value-types of the attributes are integrated in the parameter entity %val;, which can be extended by %value-extension;. There are four different types of values combined here (see the graph below):
<!ELEMENT locator EMPTY>
<!ATTLIST locator
xlink:type (simple) #FIXED "simple"
xlink:href CDATA #IMPLIED
>
Locators are empty elements, designed for referencing arbitrary web documents and other graphs and their elements. They are defined in the same way as the type element.
<!ELEMENT bool (#PCDATA)>
<!ELEMENT int (#PCDATA)>
<!ELEMENT float (#PCDATA)>
<!ELEMENT string (#PCDATA)>
The possible atomic value-types are boolean, integer, floating point and string, which are defined in the same way as in Java; they all contain PCDATA.
<!ELEMENT enum (#PCDATA)>
This element has been designed for using enumerated values and only contains PCDATA.
<!ELEMENT seq (%val;)*>
<!ELEMENT set (%val;)*>
<!ELEMENT bag (%val;)*>
<!ELEMENT tup (%val;)*>
The composite-types are value-types (sequence, set, bag and tuple), which contain a unrestricted amount of values themselves. Of course, they can be used recursively.
<!ELEMENT attr (type?, attr*, (%val;))>
<!ATTLIST attr
id IDREF #IMPLIED
name NMTOKEN #REQUIRED
kind NMTOKEN #IMPLIED
>
As shown in the fragment, attribute elements offer the option of a schema-element-reference and multiple sub attributes (e.g. for additional information), but can be extended by using the entity %val;. Their attribute list contains their name, their kind and an id reference to the element they are hold by. Therefore, attribute elements build a subclass of typed elements as it is shown below:
A) First Summary: the GXL Graph Attributes
At first the picture above is taken as the base and a composite aggregation between attribute and value is added (and the value fragment is paste in, of course). Then this is connected with definition of atomic and composite. After that, the whole overview of the GXL Graph Attributes has been painted, which can be compared to the GXL Graph Attributes on the GXL homepage:
<!ELEMENT gxl (graph* %gxl-extension;) >
<!ATTLIST gxl
xmlns:xlink CDATA #FIXED "www.w3.org/1999/xlink"
%gxl-attr-extension;
>
This is the definition of the root element, the GXL document, which can contain multiple graphs (instance, schema and meta schema - all represented as GXL graphs) and which can be extended by %gxl-extension; and %gxl-attr-extension;. Furthermore, the XLink namespace is defined, to allow the use of the prefix xlink, which is used in the elements type and locator for marking references. In addition to that, the element gxl - therefore the GXL document itself - is often the target of a type reference, which means, that the document (or a part of it - especially schemata elements) is referenced by elements in derived graphs.
<!ELEMENT graph (type? , attr* , ( node | edge | rel )* %graph-extension;) >
<!ATTLIST graph
id ID #REQUIRED
role NMTOKEN #IMPLIED
edgeids (true | false) "false"
hypergraph (true | false) "false"
edgemode (directed | undirected | defaultdirected | defaultundirected) "directed"
%graph-attr-extension;
>
The most important part of an GXL graph is the graph element; it is the central part of the GXL document, containing multiple graph elements (nodes, edges and relations) and can be part of them itself. Of course, the graph is optionally typed (for linking its schema - and on that account it is a typed element), easy to extend (by %graph-extension; and %graph-attr-extension;) and is able to hold multiple attributes. A graph can be identified by its id and can be given a special role (e.g. in a higher construct) by the attribute role; if edges or relations are given an ID themselves, this can be stored in edgeids, if it is a hypergraph the attribute hypergraph can be "switched on". The information, which kind of edges are used, is stored in edgemode.
As it is shown in the upper graphic, all graph elements are typed elements and therefore contain the option to link their corresponding schema element; this and the existence of an identification element in all of them should be clear and is not mentioned any more. There are three (localConnection is only an abstract class) possible graph elements defined in the GXL DTD (see the graph below):
<!ELEMENT node (type? , attr*, graph* %node-extension;) >
<!ATTLIST node
id ID #REQUIRED
%node-attr-extension;
>
As every graph, a GXL graph consists of nodes; the corresponding object is the same called element node, which (in opposition to edge and relation) needs an unique identifier, of course. The element itself contains optionally multiple attributes and sub graphs; furthermore it can be extended by using %node-extension; and %node-attr-extension;.
<!ELEMENT edge (type?, attr*, graph* %edge-extension;) >
<!ATTLIST edge
id ID #IMPLIED
from IDREF #REQUIRED
to IDREF #REQUIRED
fromorder CDATA #IMPLIED
toorder CDATA #IMPLIED
isdirected (true | false) #IMPLIED
%edge-attr-extension;
>
Like the node the edge element may also contain attributes and graphs; it is in the same way expandible, if %edge-extension; and %edge-attr-extension; are defined. It is necessary to specify the origin and the destination of the edge in the attributes from an to, which are references to identifiers. Also possible is the establishment of a defined (and in the whole graph proper) order of incidences by using the fromorder and toorder attributes. The option to define the edge as directed or not with the isdirected attribute is only available, if the edgemode attribute of the parental graph is set to defaultdirected or defaultundirected; the values directed or undirected of edgemode show that there is no choice.
<!ELEMENT rel (type? , attr*, graph*, relend* %rel-extension;) >
<!ATTLIST rel
id ID #IMPLIED
isdirected (true | false) #IMPLIED
%rel-attr-extension;
>
The third graph element is the (n-ary) relation - able of connecting arbitrary numbers of graph elements. Its objectification is the element rel, which contains unrestricted quantities of attributes and graphs again. An extension is possible by using %rel-extension; and %rel-attr-extension;. Like an edge, a relation can be directed or not - this is definable in isdirected again, following the same restrictions as in edge. The difference in rel compared to edge is the realization of the connections. As it can be seen, the edge connections (from and to) are attributes (more accurate: identifier references) - in contrast to rel, here the connection ends are elements (defined in relend), which can be used in arbitrary numbers (for designing hyperedges)..
<!ELEMENT relend (attr* %relend-extension;) >
<!ATTLIST relend
target IDREF #REQUIRED
role NMTOKEN #IMPLIED
direction (in | out | none) #IMPLIED
startorder CDATA #IMPLIED
endorder CDATA #IMPLIED
%relend-attr-extension;
>
The relation end itself is an element called relend (a subclass of an attributed element), which can be extended using %relend-extension and %relend-attr-extension, containing only attribute subelements and an attribute list. The id reference to the target is required, of course, the role of the relation end is optional. More attributes are the direction of the defined relend and a (in the hole graph proper) order, which can be defined with the help of startorder and endorder.
B) Second Summary: the GXL Graph Model
Now, at the end of this page, a little summary again. It is started at the definition of type, then the role of the attribute is fit in. After that gxl and graph are connected with their composite association and the graph from above is put in. Then the three graph elements node, edge and rel are distinguished - not to forget to fit in the associations of the connection ends and the generalization from relend to attributedElement. The whole graph can now be compared to the GXL Graph Model on the GXL homepage.