Type Definition Overview - andrew-nguyen/titan GitHub Wiki

In Titan, edge labels and property keys are types which can be individually configured to provide data verification, better storage efficiency, and higher performance. Types are uniquely identified by their name and are themselves vertices in the graph. Type vertices can be retrieved by their name.

TitanGraph graph = ...
TitanType name = graph.getType("name");

A TitanType is either a TitanLabel (for edges) or a TitanKey (for properties). Exactly one of TitanType.isEdgeLabel() and TitanType.isPropertyKey() returns true, and the type can be cast to the corresponding subtype.

TitanType name = graph.getType("name");
// Cast to the matching subtype; the braces are required for the declarations to compile
if (name.isPropertyKey()) {
  TitanKey namekey = (TitanKey)name;
} else {
  TitanLabel namelabel = (TitanLabel)name;
}

Most methods in Titan are overloaded to allow either the type name or the type object as argument.

To retrieve all types that have been defined in the graph, use the getTypes method:

//Returns all defined keys in the graph
Iterable<TitanKey> keys = graph.getTypes(TitanKey.class);
//Returns all defined labels in the graph
Iterable<TitanLabel> labels = graph.getTypes(TitanLabel.class);

Labels and keys are automatically created with a default configuration when their name is first used in the graph. Types can also be created and configured explicitly by the user. It is strongly encouraged to define types explicitly and to disable automatic type creation as described below.

Creating Property Keys

The makeKey(String name) method on a graph or transaction returns a KeyMaker, a builder object used to define a new property key. The method expects the name of the key, which must be unique.

The KeyMaker allows the following aspects of a property key to be configured. Once all desired aspects have been configured, calling KeyMaker.make() creates the new property key.

dataType(Class)

Configures the data type of this key to be the given class. Property instances for this key will only accept attribute values that are instances of this class or can be automatically converted to instances of this class.

Setting the data type to Object.class allows any type of attribute value, but at the expense of larger serialized values because class information is stored with each attribute value. For this reason, and for reasons of data consistency and validation, it is suggested to define an explicit data type other than Object.class.

There is no default setting for the data type and every property key must have its data type configured.

Titan supports arbitrary classes as data types for properties. Those must be serializable. If default serialization is not applicable a custom serializer must be implemented and registered with Titan. See Graph Configuration for more information on how to define custom attribute data types and handlers.

Also, note some of the limitations and gotchas when it comes to data types.

The data type of an existing property can be retrieved as TitanKey.getDataType().

indexed(index,Element)

Registers the property key with the given index for the provided element type, i.e. vertex or edge. Depending on the specified element type, all vertices or edges with a property of this key are indexed against the given index by their property value. This allows vertices and/or edges to be retrieved by property value using GraphQuery as described in Indexing Backend Overview.

The name of the index must match one of the registered external indexing backends. External indexing backends are registered by name in the Titan configuration as described in Indexing Backend Overview.
If no name is provided, i.e. indexed(Element) is called, the standard index is used.

Note that a property can be indexed for both element types (vertices and edges) at the same time and can be indexed against multiple indices.
For instance, one can configure the property key name to be indexed for both vertices and edges against the standard index and a registered index named search as follows:

g.makeKey("name").dataType(String.class).indexed(Vertex.class).indexed(Edge.class).indexed("search",Vertex.class).indexed("search",Edge.class).make()

To inspect all indexes for a particular key, use TitanKey.getIndexes(Element) or TitanKey.hasIndex(index,Element) to verify that a particular index is active.

unique()

Defines a property key to be unique which means that any property value for this key must be uniquely associated with a vertex. Attempting to set the same property value on two different vertices will result in an exception.

For example, one can define the “username” property key as unique to ensure that usernames are uniquely assigned across the entire graph.

Declaring a property key to be unique requires that a standard vertex index is defined. In other words, unique() requires indexed(Vertex.class).

Note that uniqueness only applies to vertices and not to edges.
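The uniqueness rule described above can be pictured with a small in-memory model. The sketch below is plain Java and does not use the Titan API: the class UniqueKeyIndex, its methods, and the choice of IllegalArgumentException are all hypothetical and chosen only for illustration. It shows one behaviour: a property value may belong to a single vertex, and assigning it to a second vertex fails.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of a unique property key; not part of the Titan API.
class UniqueKeyIndex {
    // Maps each property value to the id of the single vertex that owns it.
    private final Map<Object, Long> ownerByValue = new HashMap<>();

    // Associates value with vertexId; rejects a value already owned by another vertex.
    // The exception type is an assumption of this sketch, not Titan's documented behaviour.
    void set(long vertexId, Object value) {
        Long owner = ownerByValue.get(value);
        if (owner != null && owner != vertexId) {
            throw new IllegalArgumentException(
                "Value " + value + " is already assigned to vertex " + owner);
        }
        ownerByValue.put(value, vertexId);
    }

    // Returns the id of the vertex that owns the value, or null if no vertex has it.
    Long lookup(Object value) {
        return ownerByValue.get(value);
    }
}
```

In this model, setting "alice" on vertex 1 twice is harmless, while setting it on vertex 2 afterwards throws. The lookup method also hints at why a unique key needs a vertex index: the check requires finding the owning vertex by value.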

Single and multi-value Keys

By default, a property key maps onto a single value or none at all for any given vertex. In other words, property keys are single-valued by default. Single-valued properties are set using setProperty(key,value) on a vertex or edge, which replaces any existing value for the same key, and are retrieved via getProperty(key). For example, “birthday” is a single-valued property key since a person has exactly one birthday.

Calling KeyMaker.list() allows a list of values on this property key for each vertex. This is useful when a property key is multi-valued, like “email” for example, since a user can have multiple email addresses. For list-valued property keys, individual values must be added via addProperty(key,value) instead of setProperty(key,value). getProperty(key) returns a list of values. getProperties(key) returns an Iterable over TitanProperty. Properties are first class citizens and Titan maintains one property for each key-value pair. To delete a particular value from a vertex, one must call delete() on the respective TitanProperty.

Declaring a property key to have a list of values can also be used to avoid locking and accidental overwriting for single-valued property keys on eventually consistent storage backends.

With list() a property key is declared multi-valued. To make it explicit that a property key is single-valued, use single() which is also the default.

Note that only single-valued property keys can be used on edges. List-valued properties only apply to vertices.
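The difference between replacing a value and adding one can be sketched with a plain Java model. Nothing below is Titan code: VertexProperties and its methods are hypothetical names that only mimic the semantics described above, namely that setting replaces the existing value, adding keeps every value, and a single value can be removed on its own.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of one vertex's properties; not part of the Titan API.
class VertexProperties {
    private final Map<String, List<Object>> values = new HashMap<>();

    // Single-valued semantics: the new value replaces any existing value for the key.
    void setProperty(String key, Object value) {
        List<Object> replaced = new ArrayList<>();
        replaced.add(value);
        values.put(key, replaced);
    }

    // List-valued semantics: the new value is kept alongside the existing ones.
    void addProperty(String key, Object value) {
        values.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }

    // Removes one particular value, mirroring deletion of a single property.
    boolean removeValue(String key, Object value) {
        List<Object> current = values.get(key);
        return current != null && current.remove(value);
    }

    // Returns a copy of all values stored for the key (empty if there are none).
    List<Object> getValues(String key) {
        return new ArrayList<>(values.getOrDefault(key, new ArrayList<>()));
    }
}
```

Calling setProperty twice for "birthday" leaves one value, while calling addProperty twice for "email" leaves two, each of which can be removed independently.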

Creating Edge Labels

The makeLabel(String name) method on a graph or transaction returns a LabelMaker, a builder object used to define a new edge label. The method expects the name of the label, which must be unique.

The LabelMaker allows the following aspects of an edge label to be configured. Once all desired aspects have been configured, calling LabelMaker.make() creates the new edge label.

Cardinality Constraints

By default, a label does not impose any cardinality constraints on its edges. One can configure a cardinality constraint to make sure that domain constraints are imposed at runtime.

  • oneToMany():
    Configures the label to allow at most one incoming edge of this label for each vertex in the graph. For instance, the label “fatherOf” is biologically a oneToMany edge label.
  • manyToOne():
    Configures the label to allow at most one outgoing edge of this label for each vertex in the graph. For instance, the label “sonOf” is biologically a manyToOne edge label.
  • oneToOne():
    Configures the label to allow at most one outgoing and one incoming edge of this label for each vertex in the graph.

By default, labels are configured as manyToMany().
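The three constraints differ only in which end of the edge is limited. The following plain Java sketch models that check for a single label. It is not Titan code: LabelCardinality, its Kind enum, and the use of IllegalStateException are hypothetical and serve only to make the rules concrete.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of cardinality checks for one edge label; not part of the Titan API.
class LabelCardinality {
    enum Kind { MANY_TO_MANY, ONE_TO_MANY, MANY_TO_ONE, ONE_TO_ONE }

    private final Kind kind;
    private final Set<Long> hasOutgoing = new HashSet<>(); // vertices with an outgoing edge
    private final Set<Long> hasIncoming = new HashSet<>(); // vertices with an incoming edge

    LabelCardinality(Kind kind) {
        this.kind = kind;
    }

    // Records an edge from out to in, or throws if the label's constraint would be violated.
    void addEdge(long out, long in) {
        // manyToOne and oneToOne allow at most one outgoing edge per vertex.
        boolean limitOut = kind == Kind.MANY_TO_ONE || kind == Kind.ONE_TO_ONE;
        // oneToMany and oneToOne allow at most one incoming edge per vertex.
        boolean limitIn = kind == Kind.ONE_TO_MANY || kind == Kind.ONE_TO_ONE;
        if (limitOut && hasOutgoing.contains(out)) {
            throw new IllegalStateException("Vertex " + out + " already has an outgoing edge");
        }
        if (limitIn && hasIncoming.contains(in)) {
            throw new IllegalStateException("Vertex " + in + " already has an incoming edge");
        }
        hasOutgoing.add(out);
        hasIncoming.add(in);
    }
}
```

With Kind.MANY_TO_ONE, as for “sonOf”, two sons may point to the same father, but a second outgoing edge from the same son is rejected. With Kind.ONE_TO_MANY, as for “fatherOf”, the restriction moves to the incoming side.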

sortKey(TitanType…) and signature(TitanType…)

Specifying the sort key of a label allows edges with this label to be efficiently retrieved in the specified sort-order. Titan builds vertex-centric indices for each label according to the sort key definition which can significantly speed up queries.

TitanKey time = g.makeKey("time").dataType(Integer.class).make();
TitanLabel battled = g.makeLabel("battled").sortKey(time).make();

In this example, the property key time is defined with data type Integer. This property key is then used as the sort key for the battled edge label. Hence, battled edges will be sorted by time in ascending order and battles that happened in a certain time range can be queried for more efficiently using an appropriate VertexQuery. Moreover, battled edges are stored more compactly on disk.

The default sort order is ascending. To specify that battled should be sorted in decreasing order to be able to efficiently retrieve the most recent battles, the label definition would be extended by a sortOrder():

TitanKey time = g.makeKey("time").dataType(Integer.class).make();
TitanLabel battled = g.makeLabel("battled").sortKey(time).sortOrder(Order.DESC).make();

Note that TitanTypes used in the sort key must be single-valued property keys or many-to-one, uni-directed edge labels.
The sort key can be composite, that is, composed of multiple types. For composite sort keys, edges are sorted by the first type, then the second, and so forth.

See Vertex-Centric Indices for more information on the benefits of sort-keys.
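The ordering produced by a composite sort key can be illustrated with an ordinary Java comparator. This is a sketch of the ordering rule only, not of how Titan stores edges: SortedEdge is a hypothetical stand-in class, and the second sort-key property, rating, is made up for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for an edge with two sort-key properties; not a Titan class.
class SortedEdge {
    final int time;    // first type in the composite sort key
    final int rating;  // second type in the composite sort key (made up for this sketch)
    final String name; // payload used to tell the edges apart

    SortedEdge(int time, int rating, String name) {
        this.time = time;
        this.rating = rating;
        this.name = name;
    }

    // Compares by the first sort-key type, then breaks ties with the second.
    static final Comparator<SortedEdge> BY_SORT_KEY =
        Comparator.comparingInt((SortedEdge e) -> e.time).thenComparingInt(e -> e.rating);

    // Returns the edge names in composite sort-key order, leaving the input untouched.
    static List<String> namesInSortOrder(List<SortedEdge> edges) {
        List<SortedEdge> copy = new ArrayList<>(edges);
        copy.sort(BY_SORT_KEY);
        List<String> names = new ArrayList<>();
        for (SortedEdge e : copy) {
            names.add(e.name);
        }
        return names;
    }
}
```

Two edges with the same time are ordered by rating, so the second type only matters when the first one ties.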

If one is not interested in configuring the sort-order of edges but only wants to benefit from the storage efficiencies introduced by sort keys, one can alternatively configure the signature of a label. Specifying the signature of a label tells the graph database to expect that edges with this label always have or are likely to have an incident property or unidirected edge of the type included in the signature. This allows the graph database to store such edges more compactly and retrieve them more quickly.

TitanKey time = g.makeKey("time").dataType(Integer.class).make();
TitanLabel battled = g.makeLabel("battled").signature(time).make();

This example is almost identical to the sort key example above with the only difference that time is configured to be part of the signature.

If a type is used in the sort key, it cannot be part of the signature. As before, TitanTypes used in the signature must be either single-valued property keys or many-to-one, uni-directed edge labels.

unidirected()

Configures this label to be uni-directed which means that the edge is only created in the out-going direction. One can think of uni-directed edges as links pointing to another vertex such that only the outgoing vertex but not the incoming vertex is aware of its existence.

Uni-directed edges have a lower storage footprint and can be used to overcome the super node problem in cases where the super node is created due to many incoming edges that need not be traversed in the other direction.

Furthermore, many-to-one uni-directed edges can also be created on edges, pointing from an edge to a vertex using TitanEdge.setProperty(Label,Vertex). Such edges are retrieved via TitanEdge.getProperty(Label). Hence, Titan provides limited support for hyper edges which is useful for attaching provenance or authorization information to edges.

NOTE: Unlike a standard edge, a unidirected edge is not deleted when its target vertex is deleted. It must be removed manually from the source vertex.

Uniqueness Consistency

When defining edge labels or property keys that impose consistency constraints, such as KeyMaker.unique() or LabelMaker.manyToOne(), inconsistencies could arise when two TitanGraph instances try to update the same edge or property concurrently, since one may overwrite the change of the other. When the underlying storage backend supports transactional isolation, Titan will delegate consistency checks and locks to the storage backend. To avoid such inconsistencies on eventually consistent backends, Titan can acquire locks and will do so by default. Acquiring locks, however, can be very expensive. In cases where concurrent modifications can be excluded or blind overwrites are acceptable, one may alter this default behavior. This is why all consistency-imposing configuration methods in KeyMaker and LabelMaker accept an additional UniquenessConsistency argument, which allows the user to specify how consistency should be enforced:

  • UniquenessConsistency.NO_LOCK
  • UniquenessConsistency.LOCK

This configuration option should be used with care and only if the extra performance gain is needed.

Note that single-valued property keys are non-locking by default. Explicitly configuring single() ensures that a lock is acquired to avoid concurrent modification.

Type Definition Examples

Below are some examples of creating types in Titan.

g.makeLabel("sonOf").manyToOne().make();
g.makeLabel("spouse").oneToOne().make();

TitanKey time = g.makeKey("time").dataType(Long.class).single().make();
g.makeLabel("author").oneToMany(UniquenessConsistency.NO_LOCK).sortKey(time).make();

Default Type Creation

Titan will create edge labels and property keys the first time they are referenced by name using a default configuration unless they have been previously configured using makeKey or makeLabel factory methods as discussed above.

By default, property keys are configured to be single-valued but non-locking, with Object.class as the data type. Note that it is more efficient to define an appropriate data type via makeKey(String). Property keys created this way also have no index by default. To create an indexed key with this default configuration, invoke Graph.createKeyIndex("name",Vertex.class) before the property key is used.

Edge labels are configured to be many-to-many by default.

The default type creation behavior is configured via the autotype configuration option. By default, it uses the configuration value blueprints which creates types automatically as described above. To disable automatic type creation, set autotype=none. Setting the option to none requires that all types are explicitly created and will throw an IllegalArgumentException each time a non-existent type is referenced which is useful to avoid type name typos.

It is strongly encouraged to disable automatic type creation and to define all types explicitly. This eliminates the possibility of accidental type creation and ensures that all types are documented.

Gotchas

  • Name Uniqueness: Types are uniquely identified by their name. Attempting to define a new type with an existing name will result in an exception. Note that labels and keys share the same namespace, i.e., a label and a key cannot have the same name either.
  • Batch-Loading: When batch-loading data or disabling locking through any other means, Titan may not be able to guarantee that type names are unique when race conditions occur, which leads to data corruption. Hence, it is very important to define all labels and keys prior to loading and to disable automatic type creation as explained above.
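The interplay of name uniqueness, the shared namespace, and automatic type creation can be summarised in a small plain Java model. It is only a sketch and does not use the Titan API: TypeRegistry and its methods are hypothetical. The IllegalArgumentException for an unknown type follows the autotype=none behaviour described above, while the exception type for a duplicate definition is an assumption of this sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of type definition and lookup; not part of the Titan API.
class TypeRegistry {
    enum Kind { KEY, LABEL }

    // Keys and labels live in one map because they share the same namespace.
    private final Map<String, Kind> types = new HashMap<>();
    // true models automatic type creation, false models autotype=none.
    private final boolean autoCreate;

    TypeRegistry(boolean autoCreate) {
        this.autoCreate = autoCreate;
    }

    // Explicit definition: a name may be defined only once, whether as key or as label.
    // The exception type used here is an assumption of this sketch.
    void define(String name, Kind kind) {
        if (types.containsKey(name)) {
            throw new IllegalArgumentException("Type name already in use: " + name);
        }
        types.put(name, kind);
    }

    // Reference by name: creates the type on first use only if automatic creation is enabled.
    Kind reference(String name, Kind kindIfCreated) {
        Kind existing = types.get(name);
        if (existing != null) {
            return existing;
        }
        if (!autoCreate) {
            throw new IllegalArgumentException("Unknown type: " + name);
        }
        types.put(name, kindIfCreated);
        return kindIfCreated;
    }
}
```

With automatic creation enabled, a misspelled name such as "naem" silently becomes a new type on first reference. With it disabled, the same reference fails immediately, which is the reason given above for turning automatic type creation off.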