Data Representation - universAAL/middleware GitHub Wiki
The Data Representation building block defines a unified model for representing data and, thus, provides the foundation for interoperability between components of and across different loadable nodes. It does not describe what data should be handled, rather it gives a general way how to represent data so that it is possible to save, transmit and transform any semantic information that is shared within the system. The existence of a unified model enables the realization of brokerage mechanisms regardless of the content represented and facilitates the handling of heterogeneity by providing the possibility of a unique data (de-)serialization technique.
The functionality can be divided into the following two tasks:
- representation of data: to provide interoperability between different components of the same loadable node independent from the actual content.
- (de-)serialization: to provide interoperability between different loadable nodes by transformation to and from a unique representation that can be transmitted to other nodes in the system and, thus, hides heterogeneity.
Supporting interoperability of components in an open distributed system is a major challenge. The IEEE Glossary defines interoperability as "The ability of two or more systems or components to exchange information and to use the information that has been exchanged" [1]. Using the data means understanding its contents and knowing what the data is about (the domain model). Ensuring the ability to use the exchanged information is therefore arguably impossible for open systems, because not all kinds of data that may be exchanged can be covered in advance. Interoperability on this level is usually achieved by providing an extensible profiling mechanism, so that two communicating parties can share information only if they share the same application profile (e.g. for the ZigBee protocol, example application profiles are 'Home automation' or 'Advanced Metering Infrastructure'). For this reason, the main focus of universAAL lies on the exchange of data, and the main focus of the BDRM building block lies on the representation and (de-)serialization that enable this exchange.
However, a certain amount of understanding of the information that has to be exchanged is necessary to achieve a brokering for goal-based interoperability which is one of the major pillars for a self-organising system. Goal-based interoperability ensures that the utilization of functionality is realized in a goal-based way, so that a request for a specific functionality simply expresses the meaning of what is requested; the addressing of concrete target components can be avoided. The participants just focus on the what and leave the technical details (how) to the underlying platform. To realize goal-based interoperability, typically a mediator or broker is used that takes the responsibility of finding a concrete target component capable of responding to this request, sending the request to this component, and forwarding the response back to the requester.
This means that the brokerage between independent components of an AAL platform needs to rely on the semantics of what is intended to be reached instead of syntactical artefacts, such as APIs. Thus, the matchmaking underlying the brokerage mechanism of the middleware buses has to take the domain model into account, although covering all domain models was argued above to be impossible. universAAL resolves this apparent contradiction by relying on ontological matchmaking. To be precise, the Web Ontology Language (OWL) [2] was chosen for this task because it already provides the means for powerful ontological matchmaking independently of concrete ontologies. This way, the domain model can be expressed as ontologies in OWL.
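The broker role described above can be illustrated with a small sketch. It is not the universAAL bus implementation: the class GoalBroker, its methods, and the goal URIs are hypothetical, and plain string equality on a goal URI stands in for the ontological matchmaking that the middleware performs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical broker: callers only name WHAT they want (a goal URI);
// the broker decides WHICH registered component handles the request.
class GoalBroker {
    private static final class Provider {
        final String goal;
        final Function<String, String> handler;

        Provider(String goal, Function<String, String> handler) {
            this.goal = goal;
            this.handler = handler;
        }
    }

    private final List<Provider> providers = new ArrayList<>();

    // A component announces that it can fulfil the given goal.
    void offer(String goal, Function<String, String> handler) {
        providers.add(new Provider(goal, handler));
    }

    // Finds the first matching provider, forwards the request to it and
    // passes the response back; empty if nobody can fulfil the goal.
    // Assumption: equality of URIs replaces ontological matchmaking here.
    Optional<String> request(String goal, String input) {
        for (Provider p : providers) {
            if (p.goal.equals(goal)) {
                return Optional.of(p.handler.apply(input));
            }
        }
        return Optional.empty();
    }
}
```

The requester never addresses the provider directly, which is the point of goal-based interoperability; replacing the equality test with a subset check between class expressions would move the sketch closer to the matchmaking described here.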
When choosing OWL for modeling the data, the obvious choice for representation is the Resource Description Framework (RDF) [3], as OWL builds on top of RDF which, after more than 10 years of experience, has proven to be a stable specification based on well-founded concepts. Different languages, such as XML, N3 [4], and Turtle [5], can be used to represent data in terms of RDF. A big community of users and the existence of plenty of tools make RDF very attractive for open distributed systems.
This artefact defines a unified model for representing data and, thus, provides the foundation for interoperability between components of and across different loadable nodes. This artefact does not describe what data should be handled, rather it gives a general way how to represent data so that it is possible to save, transmit and transform any semantic information that is shared within the system. The existence of a unified model enables the realization of brokerage mechanisms regardless of the content represented.
Artefact: Data Representation | |
---|---|
Maven artefact | org.universAAL.middleware / mw.data.representation {.core/.osgi} |
Pax Composite bundle | scan-composite:mvn:org.universAAL.middleware/mw.data.representation.osgi/x.y.0/composite |
Karaf Feature | - |
Maven Site | https://universaal.github.io/middleware/middleware.core/mw.data.representation.core/index.html https://universaal.github.io/middleware/middleware.osgi/mw.data.representation.osgi/index.html |
Example | ContextEvent from Lighting Sample |
- Model and implementation for RDF and OWL concepts, e.g. OWL class expressions
- Management of ontologies
  - register/unregister ontology
  - enumerate ontology
  - query ontology
- Combined model
- Basic Matchmaking
  - Determine subset relationship between two class expressions
  - Determine if a resource is an element of a class expression
  - Conditional matching
The following picture provides an overview of the important classes of the Data Representation building block. The green classes (Resource, Property, RDFClassInfo) denote RDF concepts while the remaining parts are OWL concepts. The yellow classes on the left are designed to define and manage ontologies, and the gray classes on the bottom right are implementations of some very basic concepts (e.g. AbsLocation for the basic notion of a location) that can be used by different brokers; ontologies can extend these concepts to provide more specialized ideas (e.g. Room and City could be specialized locations).
Basically, all classes whose instances are used for interoperability between components of the platform and, thus, can be serialized to be sent to other loadable nodes, are derived from one base class: Resource. This class corresponds to the RDF definition of a resource and therefore provides methods (i.e. the method setProperty) to be linked to other resources to form the well-known RDF triples.
The class Resource basically defines two methods to describe the relationships between resources:
    public Object getProperty(String propURI);
    public boolean setProperty(String propURI, Object value);
In principle, the properties of a resource can be compared to instance variables of a Java class. The following example shows how information can be represented with standard Object-Oriented-Programming techniques (PersonOOP) and with the realization of RDF in universAAL (PersonRDF).
    // Plain object-oriented version: the name is an instance variable.
    class PersonOOP {
        private String name;

        public void setName(String name) {
            this.name = name;
        }

        public String getName() {
            return name;
        }
    }

    // RDF version: the name is a property identified by a URI.
    class PersonRDF extends Resource {
        public static final String PROP_NAME = "http://ontology.universaal.org/MyOntology.owl#Name";

        public void setName(String name) {
            setProperty(PROP_NAME, name);
        }

        public String getName() {
            return (String) getProperty(PROP_NAME);
        }
    }
As can be seen, the interface to the outside world is the same: the getter and setter methods are identical in both cases and abstract from how the information is stored internally (as an instance variable or as a property). As RDF originates from the web (and is specified by the W3C), the main difference is the definition and usage of URIs for resources and properties, e.g. using "http://ontology.universaal.org/MyOntology.owl#PersonRDF" instead of "org.universAAL.sample.PersonOOP".
As for the properties, the property in the example would be "name" and the property value could be, for example, "Peter". The type of property values is restricted to the types that are defined in the RDF standard, e.g. String, Integer, Float, base64Binary; but also other Resources. The getProperty/setProperty methods operate on Objects. It would have been possible to provide methods for each of the allowed types (e.g. getPropertyStringValue(), getPropertyIntegerValue(), getPropertyFloatValue(), ...), but this was deliberately not done in order to keep a unified interface, e.g. for the serializer, and to support multi-value properties.
Multi-value properties can generally be set as instances of java.util.List; as most multi-value properties are assumed to consist of only a small number of property values, a java.util.ArrayList was used in most cases to allow simple and quick access to elements in the list. There are two classes that should be used for multi-value properties: OpenCollection and ClosedCollection; the difference between the two is described in Javadoc.
Subclasses of Resource can override the setProperty method to introduce their own policy of handling multi-value properties. Each class inherits the setProperty method from the base class, and this method can be called, e.g., by a serializer or by another component, thus bypassing any setter method. Therefore it is possible to set a single value where a multi-value is expected, e.g. a single Integer where a List that contains only that one Integer is expected. In this case, the subclass should override the setProperty method to apply its own policy. For example, a single Integer could be added as the only element to a List, and this List could then be passed to the setProperty method of the super class, thereby enforcing the policy that the property value is always a List.
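The wrapping policy just described can be sketched as follows. To keep the example self-contained, SimpleResource is a minimal stand-in for the universAAL Resource class (an assumption, the real class does considerably more), and ScoreSheet with its property URI is a hypothetical ontology class.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Minimal stand-in for Resource (assumption): stores property values by URI.
class SimpleResource {
    private final Map<String, Object> props = new HashMap<>();

    public Object getProperty(String propURI) {
        return props.get(propURI);
    }

    public boolean setProperty(String propURI, Object value) {
        if (propURI == null || value == null) {
            return false;
        }
        props.put(propURI, value);
        return true;
    }
}

// Hypothetical ontology class whose "scores" property must always be a List.
class ScoreSheet extends SimpleResource {
    public static final String PROP_SCORES =
            "http://ontology.universaal.org/MyOntology.owl#scores";

    @Override
    public boolean setProperty(String propURI, Object value) {
        if (PROP_SCORES.equals(propURI) && value != null && !(value instanceof List)) {
            // Policy: wrap a single value so that readers can rely on a List,
            // even if a serializer bypasses the dedicated setter.
            List<Object> wrapped = new ArrayList<>();
            wrapped.add(value);
            return super.setProperty(propURI, wrapped);
        }
        return super.setProperty(propURI, value);
    }
}
```

Calling setProperty(ScoreSheet.PROP_SCORES, 7) on a ScoreSheet therefore stores a one-element List, while a List passed in directly is stored unchanged.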
In addition to the mentioned methods, the class Resource also contains some helper methods for specialized properties:
- Resource type: the methods addType/getType/getTypes are helper methods to simplify the handling of the type of a Resource.
- Resource comment: the methods getResourceComment/setResourceComment are helper methods to provide access to "a human-readable description of a resource" (as defined by rdfs:comment of RDF Schema).
- Resource label: the methods getResourceLabel/setResourceLabel/getOrConstructLabel are helper methods to provide access to "a human-readable version of a resource's name" (as defined by rdfs:label of RDF Schema).
- Multi-language properties: the methods addMultiLangProp/getMultiLangProp are helper methods to simplify the handling of a String value that is available in different languages. They can also be used to set the label or comment in different languages.
The class ManagedIndividual constitutes the foundation for ontologies. Therefore, ontology classes must be defined as subclasses of ManagedIndividual, or of ComparableIndividual if there exists some sort of order between instances of an ontology class. An example for this is the concept of Location which is often used by the middleware buses for brokerage, e.g. to present the output of the system at a location where the user can notice it. For that reason, a base class AbsLocation (short for Abstract Location) was defined directly in the Data Representation model. It is a subclass of ComparableIndividual because there is a partial order for locations as one location can be contained in or can be adjacent to another location. There are additional general classes (Rating and LevelRating) serving equivalent purposes defined in DataRep, and some more are defined in the buses.
Each subclass of ManagedIndividual must override the method getClassURI(), which returns the URI of this ontological class. Typically, this URI is also defined as a static final String in the Java class for easy referencing:
    class MyClass extends ManagedIndividual {
        // URI of the ontology class, kept as a constant for easy referencing
        public static final String MY_URI = "http://ontology.universaal.org/MyOntology.owl#MyClass";

        public String getClassURI() {
            return MY_URI;
        }
    }
Instance variables of Java classes need to be replaced by RDF properties. This makes it possible to hide (de-)serialization from the developer and to benefit from the advanced reasoning and the tools provided by the RDF community. An example is given above for PersonOOP vs. PersonRDF.
When the setProperty method of the base class (the class ManagedIndividual) is called, the property value is verified against the restrictions that the Ontology defines for this property; all of these restrictions have to be fulfilled. For example, if the Ontology restricts the property value to instances of type Integer and setProperty is called with an instance of String, then the setProperty method of ManagedIndividual will not set this property and will return false. Equivalently, setting a multi-value property with a cardinality that does not match the restrictions defined in the Ontology will return false. For example, if the Ontology defines that the class Point3D has exactly three coordinates (three float values as a List), then setting this property with a List that contains only two float values will fail.
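A minimal sketch of this kind of check is shown below. It does not use the universAAL restriction classes: SimpleRestriction and CheckedResource are hypothetical names, and the restriction is reduced to an element type plus an exact cardinality, which is far less expressive than OWL restrictions.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, much-simplified restriction: the type every value must have
// and the exact number of values (1 also admits a single non-List value).
class SimpleRestriction {
    final Class<?> valueType;
    final int exactCardinality;

    SimpleRestriction(Class<?> valueType, int exactCardinality) {
        this.valueType = valueType;
        this.exactCardinality = exactCardinality;
    }

    boolean accepts(Object value) {
        if (value instanceof List) {
            List<?> list = (List<?>) value;
            if (list.size() != exactCardinality) {
                return false; // wrong number of values
            }
            for (Object element : list) {
                if (!valueType.isInstance(element)) {
                    return false; // wrong element type
                }
            }
            return true;
        }
        return exactCardinality == 1 && valueType.isInstance(value);
    }
}

// Stand-in for an ontology-aware resource (assumption): setProperty refuses
// values that violate the restriction registered for the property.
class CheckedResource {
    private final Map<String, Object> props = new HashMap<>();
    private final Map<String, SimpleRestriction> restrictions = new HashMap<>();

    void restrict(String propURI, SimpleRestriction restriction) {
        restrictions.put(propURI, restriction);
    }

    public Object getProperty(String propURI) {
        return props.get(propURI);
    }

    public boolean setProperty(String propURI, Object value) {
        SimpleRestriction r = restrictions.get(propURI);
        if (value == null || (r != null && !r.accepts(value))) {
            return false; // property is left unchanged
        }
        props.put(propURI, value);
        return true;
    }
}
```

With a restriction of three Float values on a coordinates property, a List of two floats is rejected and the property stays unset, mirroring the Point3D example.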
For the creation of instances of ontological classes (e.g. for deserialization) a factory is needed.
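    public class MyFactory extends ResourceFactoryImpl {
        public Resource createInstance(String classURI, String instanceURI, int factoryIndex) {
            // factoryIndex identifies the ontology class to instantiate
            switch (factoryIndex) {
            case 0:
                return new MyClass(instanceURI);
            ...
            }
            // no ontology class is known for the given index
            return null;
        }
    }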
One factory can be used to create instances of only one ontology class (in which case the parameters classURI and factoryIndex can be ignored), or a factory can be used to create instances of multiple ontology classes (in which case the parameters define which class has to be instantiated). Normally, the classURI could be used to distinguish between the classes, but for performance reasons a factory index can be passed to the factory, allowing for an efficient switch statement (the factory index is only used by the ontology and the factory, both of which are typically created and visible only inside one bundle). However, which parameters are used and whether the factory serves a single or multiple ontology classes is up to the developer.
From a programming point of view an ontology is a group of ontology classes and specifies all model information of the ontology classes, e.g.
- what ontology classes exist
- what are their super classes
- what are their properties
- what is the value of a property
- what is the cardinality of a property
To set up an ontology class, the Ontology provides the following protected methods:
- RDFClassInfoSetup createNewRDFClassInfo(String classURI, ResourceFactory fac, int factoryIndex)
- OntClassInfoSetup createNewAbstractOntClassInfo(String classURI)
- OntClassInfoSetup createNewOntClassInfo(String classURI, ResourceFactory fac)
- OntClassInfoSetup createNewOntClassInfo(String classURI, ResourceFactory fac, int factoryIndex)
- OntClassInfoSetup extendExistingOntClassInfo(String classURI)
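How a class URI, a factory, and a factory index might work together can be sketched with stand-in types. Everything here is hypothetical (Individual, Lamp, Room, IndividualFactory, DemoFactory, ClassRegistry); in particular, the registry only illustrates the idea behind passing a factory and an index when an ontology class is set up, and does not reflect how the Ontology class stores this information internally.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-ins for ontology classes (assumption: real ones extend ManagedIndividual).
class Individual {
    final String uri;

    Individual(String uri) {
        this.uri = uri;
    }
}

class Lamp extends Individual {
    Lamp(String uri) { super(uri); }
}

class Room extends Individual {
    Room(String uri) { super(uri); }
}

interface IndividualFactory {
    Individual createInstance(String classURI, String instanceURI, int factoryIndex);
}

// One factory serving several classes; the index allows a cheap switch
// instead of comparing class URIs.
class DemoFactory implements IndividualFactory {
    static final int LAMP = 0;
    static final int ROOM = 1;

    public Individual createInstance(String classURI, String instanceURI, int factoryIndex) {
        switch (factoryIndex) {
        case LAMP:
            return new Lamp(instanceURI);
        case ROOM:
            return new Room(instanceURI);
        default:
            return null;
        }
    }
}

// Remembers which factory and index were given for each class URI, so that
// e.g. a deserializer can create an instance knowing only the class URI.
class ClassRegistry {
    private static final class Entry {
        final IndividualFactory factory;
        final int index;

        Entry(IndividualFactory factory, int index) {
            this.factory = factory;
            this.index = index;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();

    void register(String classURI, IndividualFactory factory, int factoryIndex) {
        entries.put(classURI, new Entry(factory, factoryIndex));
    }

    Individual instantiate(String classURI, String instanceURI) {
        Entry e = entries.get(classURI);
        return e == null ? null : e.factory.createInstance(classURI, instanceURI, e.index);
    }
}
```

The index constants are shared only between the factory and the code that registers the classes, which matches the remark that both usually live inside one bundle.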
When the ontology is finished, it has to be registered at the OntologyManagement (realized as a Singleton) to be available in the system. This can be done in an OSGi BundleActivator:
    public class Activator implements BundleActivator {
        MyOntology myOntology = new MyOntology();

        public void start(BundleContext context) throws Exception {
            // make the ontology available when the bundle starts
            OntologyManagement.getInstance().register(myOntology);
        }

        public void stop(BundleContext context) throws Exception {
            // remove the ontology again when the bundle stops
            OntologyManagement.getInstance().unregister(myOntology);
        }
    }
The OntologyManagement provides the following methods for the management of ontologies:
- boolean register(Ontology ont): after registration, the ontology is available in the system.
- boolean unregister(Ontology ont): by unregistering, the ontology is removed from the system.
- String[] getOntoloyURIs(): get a list of URIs of all registered ontologies.
- Ontology getOntology(String uri): get a specific ontology with the given URI.
A TypeExpression lays the cornerstone for ontological reasoning by providing an abstraction of OWL class expressions and OWL datatypes. These two OWL concepts were unified in a common representation for simplification, so that e.g. the OWL ObjectUnionOf and DataUnionOf are both represented by the one class Union. TypeExpressions can be used, for example, by a subscriber of events of a bus to define the set of events it wants to receive. Another example would be a service caller that can call a certain (set of) service(s) by defining appropriate restrictions. Formally, a TypeExpression stands for a set of individuals/data values.
As OWL does not reinvent the wheel, it uses concepts from well-known standards, e.g. from XML Schema. For example, the built-in datatypes like Int and Float are used, and universAAL accordingly uses the appropriate Java equivalents (the Java classes Integer and Float, respectively). The TypeRestriction, as shown on the left side of the figure, provides the base to further restrict these datatypes according to the XML Schema constraining facets. The BoundedValueRestriction, for example, allows a minimum and/or a maximum value to be specified to realize the constraining facets maxInclusive, maxExclusive, minInclusive, and minExclusive. This way, one can specify, e.g., an integer value between 0 and 18. The implemented subclasses (IntRestriction, FloatRestriction, ...) provide convenient access to this restriction for a specific datatype.
The PropertyRestriction, as shown in the middle of the figure, is defined for a certain property in the context of a certain OWL class. It describes either the property itself (minimum, maximum or exact cardinality) or the value of that property. For example, if the class LightSource is connected to a class Location by the property isInLocation, then this is described by the AllValuesFromRestriction (all values of the property isInLocation are from Location).
The classes on the right side of the picture denote classes and set-theoretic operations. In logical languages, Intersection, Union, and Complement are usually called conjunction, disjunction, and negation. The Enumeration represents a given set of concrete individuals/data values and TypeURI stands for a defined OWL class or datatype. MergedRestriction is a helper class to handle multiple PropertyRestrictions for the same property; typically, this behaves like an Intersection (every PropertyRestriction of the MergedRestriction must apply) therefore it is a subclass of Intersection.
Each TypeExpression implements the following methods:
- hasMember: determines whether a specific individual/data value is a member of the given TypeExpression.
- isDisjointWith: determines whether two TypeExpressions are pair-wise disjoint, i.e. they have no member in common.
- matches: determines whether one TypeExpression is a subset of another TypeExpression.
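The set semantics of these three methods can be made concrete with a deliberately small stand-in. IntRange below is a hypothetical class that models only a closed integer interval (comparable to an integer bounded by minInclusive and maxInclusive); it is not the universAAL IntRestriction, and the direction chosen for matches (the argument is the candidate subset) is an assumption.

```java
// Hypothetical closed integer interval [min, max], viewed as a set of values.
final class IntRange {
    final int min;
    final int max;

    IntRange(int min, int max) {
        if (min > max) {
            throw new IllegalArgumentException("min must not exceed max");
        }
        this.min = min;
        this.max = max;
    }

    // Membership: is the value an element of the set?
    boolean hasMember(int value) {
        return min <= value && value <= max;
    }

    // Subset test: is every member of 'subset' also a member of this range?
    boolean matches(IntRange subset) {
        return min <= subset.min && subset.max <= max;
    }

    // Disjointness: do the two ranges have no member in common?
    boolean isDisjointWith(IntRange other) {
        return max < other.min || other.max < min;
    }
}
```

For the range 0 to 18, the value 18 is a member, the range 5 to 10 matches, the range 10 to 20 does not match, and the range 19 to 30 is disjoint.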
This artefact is configured through a properties file called org.universAAL.mw.data.representation.properties, which is placed in the service folder inside the configuration folder.
Property | Type | Default | Description |
---|---|---|---|
org.universAAL.middleware.peer.member_of | String | urn:org.universAAL.aal_space:test_environment | A URI identifying the uSpace to which this instance of the middleware belongs. |
org.universAAL.middleware.peer.is_coordinator | Boolean | true | If set to true, then buses that need a coordinator instance are recommended to make the instance on this node the coordinator. Only one instance per uSpace is allowed to have this property set. |
org.universAAL.middleware.debugMode | Boolean | false | If set to true, then buses are recommended to produce more log messages than in production mode (when this flag is not set, production mode is assumed). |
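As an illustration of the file format, the following sketch parses such a properties file with java.util.Properties and falls back to the defaults from the table. The helper class PeerConfig is hypothetical, and it is an assumption that the middleware reads the file in a comparable way.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper: parses the text of the properties file and applies
// the default values listed in the table for keys that are missing.
class PeerConfig {
    static Properties load(String fileContent) {
        Properties defaults = new Properties();
        defaults.setProperty("org.universAAL.middleware.peer.member_of",
                "urn:org.universAAL.aal_space:test_environment");
        defaults.setProperty("org.universAAL.middleware.peer.is_coordinator", "true");
        defaults.setProperty("org.universAAL.middleware.debugMode", "false");

        Properties props = new Properties(defaults);
        try {
            props.load(new StringReader(fileContent));
        } catch (IOException e) {
            // not expected for an in-memory reader
            throw new UncheckedIOException(e);
        }
        return props;
    }
}
```

A file that contains only the line org.universAAL.middleware.debugMode=true would thus switch on debug mode while keeping the default uSpace URI and coordinator flag.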
This artefact is responsible for (de-)serializing the content of messages exchanged between instances of the buses. This artefact is normally not called directly from components outside the middleware.
Artefact: Data Serialization (Turtle) | |
---|---|
Maven artefact | org.universAAL.middleware / mw.data.serialization.turtle {.core/.osgi} |
Pax Composite bundle | scan-composite:mvn:org.universAAL.middleware/mw.data.serialization.turtle.osgi/x.y.0/composite |
Karaf Feature | - |
Maven Site | https://universaal.github.io/middleware/middleware.core/mw.data.serialization.turtle.core/index.html https://universaal.github.io/middleware/middleware.osgi/mw.data.serialization.turtle.osgi/index.html |
Example | ContextEvent from Lighting Sample |
- serialization of RDF data
- deserialization of RDF data
This artefact implements the interface MessageContentSerializer exported by mw.bus.model for (de-)serializing the content of messages exchanged between instances of the buses. In the current implementation, all four buses use the same model for representing data (RDF) and therefore only one mechanism for (de-)serializing is needed. To provide a good balance between simplicity in realization of the MessageContentSerializer interface and compactness of messages in their serialized forms (compared to XML serialization), the Terse RDF Triple Language (Turtle) was chosen as (de-)serialisation method. The open source provisions of www.openrdf.org in the package org.openrdf.rio.turtle, in particular the classes TurtleParser and TurtleWriter were customized to match the data representation framework.
Thus, the serializer takes as input a subclass of Resource and creates a String object. Accordingly, the deserializer creates a Resource object from a String.
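To give an impression of what the serialized form looks like, the sketch below writes one resource with a type and a few literal properties as Turtle. It is not the customized TurtleWriter: the class TurtleSketch and its methods are hypothetical, and nested resources, blank nodes, prefixes, typed literals, and lists, all of which a real serializer has to handle, are left out.

```java
import java.util.Map;

// Hypothetical, minimal Turtle writer for a single resource.
class TurtleSketch {
    // Produces "<subject> a <type> ; <prop> value ; ... ." with one
    // predicate-object pair per line.
    static String serialize(String subjectURI, String typeURI, Map<String, Object> props) {
        StringBuilder sb = new StringBuilder();
        sb.append('<').append(subjectURI).append("> a <").append(typeURI).append('>');
        for (Map.Entry<String, Object> e : props.entrySet()) {
            sb.append(" ;\n    <").append(e.getKey()).append("> ")
              .append(literal(e.getValue()));
        }
        return sb.append(" .\n").toString();
    }

    // Integers use Turtle's numeric shorthand; everything else is written
    // as a quoted string with backslash, quote and newline escaped.
    static String literal(Object value) {
        if (value instanceof Integer) {
            return value.toString();
        }
        String s = value.toString()
                .replace("\\", "\\\\")
                .replace("\"", "\\\"")
                .replace("\n", "\\n");
        return "\"" + s + "\"";
    }
}
```

Deserialization is the inverse step: the parser reads such triples and rebuilds a Resource (or a registered subclass) from the subject, its type, and its properties.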
An implementation of the interface org.universAAL.middleware.serialization.MessageContentSerializer is provided as an OSGi service.
1. The Institute of Electrical and Electronics Engineers: "IEEE Standard Computer Dictionary: A Compilation of IEEE Standard Computer Glossaries", New York, 1990
2. Web Ontology Language (OWL): http://www.w3.org/2004/OWL
3. Resource Description Framework (RDF): http://www.w3.org/RDF
4. Notation 3 (N3): http://www.w3.org/DesignIssues/Notation3
5. Turtle: http://www.w3.org/2007/02/turtle/primer/