Using Persistit
The Persistit storage backend runs in the same JVM as Titan and provides local persistence on a single machine. Hence, the Persistit storage backend requires that all of the graph data fits on the local disk and that all of the frequently accessed graph elements fit into main memory. In practice, this limits graphs to tens to hundreds of millions of vertices on commodity hardware. However, for graphs of that size the Persistit storage backend exhibits high performance because all data can be accessed locally within the same JVM.
Note that the Persistit storage backend was first included in Titan 0.4.0 and is currently considered experimental.
Akiban Persistit is a fast, transactional, Java B+Tree library available as open source or under a free use license.
We have worked hard to make Akiban Persistit™ exceptionally fast, reliable, simple and lightweight. We hope you will enjoy learning more about it and using it. — Persistit Github page
Since Persistit runs in the same JVM as Titan, connecting the two only requires a simple configuration and no additional setup:
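import org.apache.commons.configuration.BaseConfiguration;
import org.apache.commons.configuration.Configuration;
import com.thinkaurelius.titan.core.TitanFactory;
import com.thinkaurelius.titan.core.TitanGraph;
Configuration conf = new BaseConfiguration();
conf.setProperty("storage.directory", "/tmp/graph");
conf.setProperty("storage.backend", "persistit");
TitanGraph graph = TitanFactory.open(conf);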
In addition to the general Titan Graph Configuration, the following Persistit-specific Titan configuration options are available:
Option | Description | Value | Default | Modifiable |
---|---|---|---|---|
storage.buffercount | The size of the Persistit internal buffer | >=0 | 5000 | Yes |
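The options above can be collected in one place before the graph is opened. The following is a minimal sketch that uses only the Java standard library. The class and method names are hypothetical and not part of Titan, and the directory and buffer count are example values.

```java
import java.util.Properties;

// Hypothetical helper, not part of Titan: gathers the Persistit-related
// settings described on this page into a standard Properties object.
class PersistitConfigSketch {

    // Builds the settings for a local Persistit-backed graph.
    // directory and bufferCount are caller-supplied example values.
    static Properties persistitProperties(String directory, int bufferCount) {
        // The option table lists >=0 as the valid range for storage.buffercount.
        if (bufferCount < 0) {
            throw new IllegalArgumentException("storage.buffercount must be >= 0");
        }
        Properties props = new Properties();
        props.setProperty("storage.backend", "persistit");
        props.setProperty("storage.directory", directory);
        props.setProperty("storage.buffercount", Integer.toString(bufferCount));
        return props;
    }
}
```

Each entry could then be copied into the Configuration object with setProperty, in the same way as the storage.directory and storage.backend settings shown earlier.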
The Persistit storage backend is best suited for small to medium-sized graphs with up to 100 million vertices on commodity hardware. For graphs of that size, it will likely deliver higher performance than the distributed storage backends. Note that Persistit is also limited in the number of concurrent requests it can handle efficiently because it runs on a single machine. Hence, it is not well suited for applications with many concurrent users mutating the graph, even if that graph is small to medium-sized.
Since Persistit runs in the same JVM as Titan, this storage backend is ideally suited for unit testing of application code using Titan.
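For unit tests, it can help to give every test its own throwaway storage directory so that tests do not see each other's data. The sketch below shows one way to manage such a directory with the Java standard library only. TempGraphDirectory is a hypothetical helper, not part of Titan, and the place where the graph would be opened is left to the test itself.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Hypothetical test helper, not part of Titan: creates a fresh temporary
// directory for one test and removes it again when the test is done.
class TempGraphDirectory implements AutoCloseable {
    private final Path dir;

    TempGraphDirectory() {
        try {
            dir = Files.createTempDirectory("titan-persistit-test");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Value to use for the storage.directory option when opening the graph.
    String path() {
        return dir.toString();
    }

    // Deletes the directory together with everything written into it.
    @Override
    public void close() {
        try (Stream<Path> files = Files.walk(dir)) {
            // Reverse order visits children before their parent directories.
            files.sorted(Comparator.reverseOrder()).forEach(p -> p.toFile().delete());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A test would typically create the directory in its setup step, pass path() as storage.directory, and call close() in its teardown step. It is probably safest to shut the graph down before the directory is deleted.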
Titan backed by Persistit supports global graph operations such as iterating over all vertices or edges. However, such operations need to scan the entire database, which can take a significant amount of time for larger graphs.
To avoid running out of memory, it is advised to disable transactions (storage.transactions=false) when iterating over large graphs. With transactions enabled, Persistit must acquire read locks on the data it reads. When iterating over the entire graph, these read locks can easily require more memory than is available.
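One way to apply this advice is to keep the normal settings untouched and derive a separate set of settings for full-graph scans. The following is a small sketch of that idea, assuming the settings are held in a java.util.Properties object. ScanConfigSketch and forFullScan are hypothetical names, not part of Titan.

```java
import java.util.Properties;

// Hypothetical helper, not part of Titan: derives settings for a job that
// iterates over the whole graph from the settings used for normal operation.
class ScanConfigSketch {

    // Returns a copy of the given settings with transactions switched off.
    // The original settings are left unchanged.
    static Properties forFullScan(Properties base) {
        Properties scan = new Properties();
        for (String name : base.stringPropertyNames()) {
            scan.setProperty(name, base.getProperty(name));
        }
        scan.setProperty("storage.transactions", "false");
        return scan;
    }
}
```

The returned settings would be used only when opening the graph for a job that scans everything, while regular application code keeps its transactional settings.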