RDF Basics

General

RDF extends the general linking nature of the web. With conventional links on the web, there is a link from one web page to another, but the nature of the link is not described. RDF allows for the description of the verb that links the two web pages. This noun-verb-noun structure facilitates the creation of directed labelled graphs. In graph terminology, the verb (or predicate) represents an edge while the two nouns represent nodes.

Graphs and Databases

This type of simple semantic structure can be used to represent trivial sentences such as dog (noun) eats (verb) meat (noun). It may not be immediately obvious, but this type of simple structure is also found in relational databases, where the noun-verb-noun structure is replaced by row-column-value. In databases, collections of row-column-value instances are aggregated into tables. An example might be a list of customers (i.e. a customers table), where a specific record such as id '1234' (noun) hasAge (verb phrase) '28' (noun) can be represented as a simple graph.
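The row-column-value correspondence can be sketched in a few lines of Python. This is only an illustration: the `customers` dictionary, the second record and the `hasName` column are hypothetical and were invented to give the example more than one cell.

```python
# Hypothetical customers table: each row is keyed by its id,
# and each column name plays the role of the verb phrase.
customers = {
    "1234": {"hasAge": 28, "hasName": "Alice"},
    "5678": {"hasAge": 35, "hasName": "Bob"},
}


def table_to_triples(table):
    """Flatten a table into (row, column, value) triples."""
    triples = []
    for row_id, columns in table.items():
        for column, value in columns.items():
            triples.append((row_id, column, value))
    return triples


triples = table_to_triples(customers)
for triple in triples:
    print(triple)
```

Each table cell becomes one edge of the graph: the row id is the starting node, the column name is the edge label and the cell value is the end node.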

The Triple

RDF’s basic construct is the triple. This corresponds to a very simple English sentence and, as the name suggests, contains three components: a subject, a predicate and an object. The predicate is a verb (or verb phrase); the subject is the entity that performs the action described by the verb, and the object is the entity that receives the action of the verb. A simple example would be: Andrew (subject) plays (predicate) Rugby (object).
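As a rough sketch, a triple can be modelled in Python as a three-field tuple and a graph as a set of such tuples. The `Triple` class and the `objects` helper are illustrative names, not part of any RDF library, and plain strings stand in for the identifiers a real RDF store would use.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One subject-predicate-object statement."""
    subject: str
    predicate: str
    object: str


# A small graph: a set of triples describing Andrew.
graph = {
    Triple("Andrew", "plays", "Rugby"),
    Triple("Andrew", "plays", "Golf"),
    Triple("Andrew", "loves", "Susan"),
}


def objects(graph, subject, predicate):
    """Return, sorted, every object linked from `subject` by `predicate`."""
    return sorted(
        t.object
        for t in graph
        if t.subject == subject and t.predicate == predicate
    )


print(objects(graph, "Andrew", "plays"))  # ['Golf', 'Rugby']
print(objects(graph, "Susan", "loves"))   # []
```

The second query returns nothing because edges are only followed from subject to object.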

Figure: A simple RDF directed graph

Above is a simple directed graph showing some characteristics of Andrew. The graph is directed, which means that the subject and object cannot be reversed. In the diagram above, ‘Andrew loves Susan’. This does not mean that ‘Susan loves Andrew’; that would require a second edge from Susan to Andrew. Some predicates such as “is married to” are symmetrical, which means that the subject and object can be swapped. RDF does not have the ability to specify these types of symmetrical predicates, but a related formalism called the Web Ontology Language (OWL) can be used to extend RDF, and this formalism can describe symmetrical predicates.

 

Directed Graphs

The underlying data structure of RDF is a directed graph. A triple represents a single edge (i.e. a line between two nodes) that is labelled with the predicate name. The line has a single arrow pointing from the subject to the object. The predicate therefore names a binary relation that holds between the subject and the object.

Symmetrical Predicates

Most predicates are asymmetrical, which means the subject and object cannot be interchanged without changing the meaning. For example, “Andrew loves Susan” has a different meaning to “Susan loves Andrew”. But some predicates are naturally symmetrical. The verb ‘marry’ is a good example: “Andrew is married to Michelle” implies that “Michelle is married to Andrew”. Modelling these symmetrical predicates using directed graphs requires two edges (i.e. lines) connecting the two nodes, one in each direction.
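A minimal sketch of this idea in Python: given a set of predicates that are declared symmetrical, add the reverse edge for each matching triple. The predicate name `isMarriedTo` and the helper `add_symmetric_edges` are assumptions made for illustration; this is not how any particular RDF or OWL tool implements it.

```python
# Predicates we choose to treat as symmetrical (an assumption for this sketch).
SYMMETRIC_PREDICATES = {"isMarriedTo"}


def add_symmetric_edges(triples):
    """Return a copy of `triples` with reverse edges added for symmetrical predicates."""
    closed = set(triples)
    for subject, predicate, obj in triples:
        if predicate in SYMMETRIC_PREDICATES:
            closed.add((obj, predicate, subject))
    return closed


graph = {
    ("Andrew", "loves", "Susan"),
    ("Andrew", "isMarriedTo", "Michelle"),
}

closed = add_symmetric_edges(graph)
for triple in sorted(closed):
    print(triple)
```

Only the marriage edge gains a reverse counterpart; the `loves` edge still points one way.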

XML Representation

XML allows the serialisation of RDF. However, as can be seen below, even a simple construct such as the predicates associated with Andrew can be difficult to read.

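<?xml version="1.0" encoding="UTF-8"?>
<!-- The terms namespace URI below is an assumed placeholder. -->
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:terms="http://www.andrew.com/terms#">
  <rdf:Description rdf:about="http://www.andrew.com/people#andrew">
    <terms:plays>rugby</terms:plays>
    <terms:plays>golf</terms:plays>
    <terms:loves>Susan</terms:loves>
    <terms:hasAge rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">42</terms:hasAge>
  </rdf:Description>
</rdf:RDF>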
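To see how the XML maps back to triples, the sketch below reads a well-formed variant of the listing with Python's standard `xml.etree.ElementTree` module. The `terms` namespace URI and the `http://` prefix on the subject are placeholders assumed for this example. The function only handles this flat shape (an `rdf:Description` with literal-valued property elements) and is not a general RDF/XML parser; a dedicated library such as rdflib would be the usual choice for real data.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The terms namespace and the subject URI are assumed placeholders.
DOCUMENT = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:terms="http://www.andrew.com/terms#">
  <rdf:Description rdf:about="http://www.andrew.com/people#andrew">
    <terms:plays>rugby</terms:plays>
    <terms:plays>golf</terms:plays>
    <terms:loves>Susan</terms:loves>
    <terms:hasAge rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">42</terms:hasAge>
  </rdf:Description>
</rdf:RDF>"""


def extract_triples(xml_text):
    """Collect (subject, predicate, literal) triples from flat rdf:Description blocks."""
    root = ET.fromstring(xml_text)
    triples = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree writes qualified names as "{namespace}local";
            # dropping the braces yields the full predicate URI.
            predicate = prop.tag.replace("{", "").replace("}", "")
            triples.append((subject, predicate, prop.text))
    return triples


triples = extract_triples(DOCUMENT)
for triple in triples:
    print(triple)
```

Each child element of `rdf:Description` yields one triple, so the four property elements produce four edges leaving the Andrew node. Note that the sketch ignores `rdf:datatype`, so the age is returned as the string `"42"`.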

Reification

Using RDF it is possible to make a statement about other statements. This process is called reification. For example, “Andrew plays rugby” is a statement, but using reification this statement can be treated as the object of a verb.

An example should hopefully make this concept clear. If “Michelle knows Andrew” then this can be treated as a simple triple. But if “Michelle knows [that] Andrew plays rugby” then the last three words can be treated as a distinct object and, as a consequence, this two-verb sentence can be accommodated by the triple paradigm. This is shown below.

Figure: Reification

The example above shows RDF reification. It demonstrates how two triples can be essentially treated as a single triple.
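In the RDF vocabulary, a reified statement is described by a node of type `rdf:Statement` with `rdf:subject`, `rdf:predicate` and `rdf:object` properties. The sketch below builds those triples in Python; the `reify` helper and the blank-node label `_:stmt1` are illustrative names, and plain strings again stand in for full URIs.

```python
def reify(statement_id, triple):
    """Describe `triple` as a resource using the RDF reification vocabulary."""
    subject, predicate, obj = triple
    return [
        (statement_id, "rdf:type", "rdf:Statement"),
        (statement_id, "rdf:subject", subject),
        (statement_id, "rdf:predicate", predicate),
        (statement_id, "rdf:object", obj),
    ]


# "Andrew plays Rugby" becomes a node that other triples can point at.
graph = reify("_:stmt1", ("Andrew", "plays", "Rugby"))

# "Michelle knows [that] Andrew plays Rugby".
graph.append(("Michelle", "knows", "_:stmt1"))

for triple in graph:
    print(triple)
```

The inner statement is now an ordinary node, so “Michelle knows ...” is expressed with a single edge that ends at that node.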

RDF Lists

Similar to HTML, RDF facilitates the creation of simple lists. The <rdf:Bag> element allows the creation of unordered lists which can contain duplicate values. The <rdf:Seq> element is similar but slightly more specific: it allows the creation of lists with possible duplicates, but in this case the list is ordered into a specific sequence. The <rdf:Alt> element is slightly different insofar as it provides a list of alternative values from which the user can select one (and only one). An example might be describing the number of wheels that a motor vehicle has. A motor vehicle might be defined as having either four wheels or two wheels, but it cannot have both four and two wheels at the same time.

The three aforementioned containers are distinct from <rdf:List>, which is the class of RDF collections (closed lists built from rdf:first and rdf:rest links). The resource <rdf:nil> is the list containing zero elements, and it is itself a member of the class <rdf:List>.
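An RDF collection of type `rdf:List` is encoded as a chain of nodes, each with an `rdf:first` (the item) and an `rdf:rest` (the remainder of the list), ending in `rdf:nil`. The Python sketch below builds and walks such a chain. The helper names and the `_:b0`, `_:b1` blank-node labels are assumptions made for illustration.

```python
def build_list(items):
    """Encode `items` as rdf:first / rdf:rest triples; return (head, triples)."""
    if not items:
        # The empty list is represented by rdf:nil alone.
        return "rdf:nil", []
    nodes = [f"_:b{i}" for i in range(len(items))]
    triples = []
    for i, item in enumerate(items):
        rest = nodes[i + 1] if i + 1 < len(nodes) else "rdf:nil"
        triples.append((nodes[i], "rdf:first", item))
        triples.append((nodes[i], "rdf:rest", rest))
    return nodes[0], triples


def read_list(head, triples):
    """Follow rdf:rest links from `head` until rdf:nil, collecting rdf:first values."""
    lookup = {(s, p): o for s, p, o in triples}
    items = []
    node = head
    while node != "rdf:nil":
        items.append(lookup[(node, "rdf:first")])
        node = lookup[(node, "rdf:rest")]
    return items


head, triples = build_list(["two wheels", "four wheels"])
print(read_list(head, triples))  # ['two wheels', 'four wheels']
```

Because the chain always terminates in `rdf:nil`, a reader of the data can tell that the list is complete, which is what makes this structure a closed list.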

RDF Type

The <rdf:type> property allows the author to state that a specific resource is an instance of a class. For example, Rugby is an instance of the class Game. It is important to note that <rdf:type> does not allow for the creation of a hierarchy of classes or containers; it only allows for statements to be made about the instances (i.e. the tangible members) of a class.
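As a small sketch, instance membership can be queried by filtering a graph for `rdf:type` edges. The class names `Game` and `Person` and the `instances_of` helper are hypothetical and used here only for illustration.

```python
# A toy graph mixing rdf:type statements with an ordinary predicate.
graph = [
    ("Rugby", "rdf:type", "Game"),
    ("Golf", "rdf:type", "Game"),
    ("Andrew", "rdf:type", "Person"),
    ("Andrew", "plays", "Rugby"),
]


def instances_of(graph, cls):
    """Return the subjects declared, via rdf:type, to be instances of `cls`."""
    return [s for s, p, o in graph if p == "rdf:type" and o == cls]


print(instances_of(graph, "Game"))    # ['Rugby', 'Golf']
print(instances_of(graph, "Person"))  # ['Andrew']
```

The query only sees direct membership; nothing here says how `Game` relates to any other class.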

RDF Vocabularies

RDF vocabularies, as the name suggests, are sets of words used for a particular situation. This standardisation facilitates shared meaning, aggregation and querying of federated data sets.

For example, the RDF Data Cube Vocabulary describes the features of multi-dimensional data sets, while the more general Data Catalog Vocabulary (DCAT) describes the features of data catalogues.