Multi Threaded Transactions - andrew-nguyen/titan GitHub Wiki

Titan supports multi-threaded transactions through Blueprints’ ThreadedTransactionalGraph interface. Multiple threads can therefore run concurrently in a single transaction, which speeds up transaction processing and makes use of multi-core architectures.

With Blueprints’ default transaction handling, each thread automatically opens its own transaction against the graph database. To open a thread-independent transaction instead, use the newTransaction() method.

TransactionalGraph tx = g.newTransaction();
Thread[] threads = new Thread[10];
for (int i=0;i<threads.length;i++) {
    threads[i]=new Thread(new DoSomething(tx));
    threads[i].start();
}
for (int i=0;i<threads.length;i++) threads[i].join();
tx.commit();

The newTransaction() method returns a new TransactionalGraph object that represents the newly opened transaction. The graph object tx supports all of the methods that the original graph did, but it does not open a new transaction for each thread. This allows us to start multiple threads that all do their work in the same transaction, and to commit that transaction once all threads have finished.
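The snippet above leaves DoSomething undefined. The following is a minimal sketch of the start, join, commit pattern that runs without Titan on the classpath. Everything in it is an assumption made for illustration: the Tx interface and InMemoryTx class are hypothetical stand-ins for the TransactionalGraph returned by newTransaction(), not Titan's implementation, and the extra workerId constructor argument is not part of the original example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

class SharedTransactionSketch {

    static final int VERTICES_PER_WORKER = 100;

    /** Hypothetical stand-in for the TransactionalGraph calls used on this page. */
    interface Tx {
        long addVertex(String name);
        void commit();
    }

    /** Toy in-memory transaction: buffers writes and publishes them on commit. */
    static final class InMemoryTx implements Tx {
        private final Map<Long, String> committed;
        private final Map<Long, String> pending = new ConcurrentHashMap<>();
        private final AtomicLong nextId = new AtomicLong();
        private final AtomicBoolean open = new AtomicBoolean(true);

        InMemoryTx(Map<Long, String> committed) {
            this.committed = committed;
        }

        @Override
        public long addVertex(String name) {
            if (!open.get()) {
                throw new IllegalStateException("transaction is closed");
            }
            long id = nextId.incrementAndGet(); // unique even under contention
            pending.put(id, name);
            return id;
        }

        @Override
        public void commit() {
            if (!open.compareAndSet(true, false)) {
                throw new IllegalStateException("transaction is closed");
            }
            committed.putAll(pending);
        }
    }

    /** One possible shape for a worker: it only talks to the shared transaction. */
    static final class DoSomething implements Runnable {
        private final Tx tx;
        private final int workerId;

        DoSomething(Tx tx, int workerId) {
            this.tx = tx;
            this.workerId = workerId;
        }

        @Override
        public void run() {
            for (int i = 0; i < VERTICES_PER_WORKER; i++) {
                tx.addVertex("worker-" + workerId + "-vertex-" + i);
            }
        }
    }

    /** Starts the workers, waits for all of them, then commits exactly once. */
    static int runWorkers(int threadCount) {
        Map<Long, String> store = new ConcurrentHashMap<>();
        Tx tx = new InMemoryTx(store);
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(new DoSomething(tx, i));
            threads[i].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        tx.commit(); // only after every worker has finished
        return store.size();
    }

    public static void main(String[] args) {
        System.out.println(runWorkers(10)); // 10 workers x 100 vertices each
    }
}
```

The point of the sketch is the ordering: no worker commits, and the coordinating thread commits only after every join() has returned. With a real Titan graph, the workers would receive the TransactionalGraph from g.newTransaction() instead of the stand-in.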

Titan relies on optimized concurrent data structures to support hundreds of concurrent threads running efficiently in a single transaction.

Concurrent Algorithms

Thread-independent transactions started through newTransaction() are particularly useful when implementing concurrent graph algorithms. Most traversal or message-passing (ego-centric) graph algorithms are embarrassingly parallel, which means they can be parallelized and executed by multiple threads with little effort. Each of these threads can operate on the single TransactionalGraph object returned by newTransaction() without blocking the others.
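As a rough illustration of why ego-centric algorithms parallelize so easily, the sketch below runs one message-passing round in which every vertex sums the values of its neighbors, with one task per vertex on a thread pool. It is a sketch under stated assumptions: the adjacency and value maps are hypothetical stand-ins for reads that would go through the shared TransactionalGraph, and no Titan API is used.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelEgoCentricSketch {

    /**
     * One message-passing round: each vertex receives the sum of its neighbors' values.
     * Every task reads shared state and writes only its own vertex's result.
     */
    static Map<Integer, Integer> sumNeighborValues(Map<Integer, List<Integer>> adjacency,
                                                   Map<Integer, Integer> values,
                                                   int threadCount) {
        ExecutorService pool = Executors.newFixedThreadPool(threadCount);
        Map<Integer, Integer> result = new ConcurrentHashMap<>();
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (Map.Entry<Integer, List<Integer>> entry : adjacency.entrySet()) {
                futures.add(pool.submit(() -> {
                    int sum = 0;
                    for (int neighbor : entry.getValue()) {
                        sum += values.get(neighbor); // assumes every neighbor has a value
                    }
                    result.put(entry.getKey(), sum);
                }));
            }
            for (Future<?> future : futures) {
                future.get(); // wait for the task and surface any failure
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for tasks", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a vertex task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
        return result;
    }

    /** Small hypothetical graph: 1 -> {2, 3}, 2 -> {3}, 3 -> {1}. */
    static Map<Integer, Integer> demo() {
        Map<Integer, List<Integer>> adjacency = new HashMap<>();
        adjacency.put(1, Arrays.asList(2, 3));
        adjacency.put(2, Arrays.asList(3));
        adjacency.put(3, Arrays.asList(1));

        Map<Integer, Integer> values = new HashMap<>();
        values.put(1, 10);
        values.put(2, 20);
        values.put(3, 30);

        return sumNeighborValues(adjacency, values, 4);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // vertex 1 gets 50, vertex 2 gets 30, vertex 3 gets 10
    }
}
```

Because each task writes only the entry for its own vertex, the tasks need no coordination beyond the final wait. That is the property that lets many threads share one transaction without blocking each other.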

Nested Transactions

Another use case for thread-independent transactions is nested transactions that ought to be independent of the surrounding transaction.

For instance, assume a long-running transactional job that has to create a new vertex with a unique name. Enforcing unique names requires acquiring a lock (see Type Definition Overview for more detail). Because the transaction runs for a long time, that lock is held for a long time, so lock congestion and expensive transactional failures are likely.

Vertex v1 = g.addVertex(null);
//Do many other things
Vertex v2 = g.addVertex(null);
v2.setProperty("uniqueName","foo");
g.addEdge(null,v1,v2,"related");
//Do many other things
g.commit(); // Likely to fail due to lock congestion

One way around this is to create the vertex in a short, nested, thread-independent transaction, as demonstrated by the following pseudocode:

Vertex v1 = g.addVertex(null);
//Do many other things
TransactionalGraph tx = g.newTransaction();
Vertex v2 = tx.addVertex(null);
v2.setProperty("uniqueName","foo");
tx.commit();
g.addEdge(null,v1,g.getVertex(v2),"related"); //Need to load v2 into outer transaction
//Do many other things
g.commit(); // No longer likely to fail: the uniqueName lock was acquired and released in the short transaction tx

Gotcha

When using multi-threaded transactions via newTransaction(), none of the vertices and edges retrieved or created in the scope of that transaction are available outside of it. Accessing such elements after the transaction has been closed will result in an exception. As demonstrated in the example above, such elements have to be explicitly reloaded in the transaction that wants to use them, via g.getVertex(existingVertex) or g.getEdge(existingEdge).
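The scoping rule can be modeled with a small toy that runs without Titan. This is only a sketch of the behavior described above: Store, Tx and VertexRef are hypothetical classes invented for illustration, and the exception type is an assumption, not necessarily what Titan throws.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

class TransactionScopeSketch {

    /** Hypothetical committed state shared by all transactions. */
    static final class Store {
        final Map<Long, String> names = new ConcurrentHashMap<>();
        final AtomicLong ids = new AtomicLong();
    }

    /** Toy transaction: elements it hands out are only usable while it is open. */
    static final class Tx {
        private final Store store;
        private final Map<Long, String> pending = new ConcurrentHashMap<>();
        private volatile boolean open = true;

        Tx(Store store) {
            this.store = store;
        }

        private void checkOpen() {
            if (!open) {
                throw new IllegalStateException("transaction is closed");
            }
        }

        VertexRef addVertex(String name) {
            checkOpen();
            long id = store.ids.incrementAndGet();
            pending.put(id, name);
            return new VertexRef(this, id);
        }

        /** Loads a vertex into this transaction by id, or returns null if unknown. */
        VertexRef getVertex(long id) {
            checkOpen();
            boolean known = pending.containsKey(id) || store.names.containsKey(id);
            return known ? new VertexRef(this, id) : null;
        }

        String nameOf(long id) {
            checkOpen();
            String name = pending.get(id);
            return name != null ? name : store.names.get(id);
        }

        void commit() {
            checkOpen();
            store.names.putAll(pending);
            open = false;
        }
    }

    /** Element handle bound to the transaction that produced it. */
    static final class VertexRef {
        private final Tx tx;
        private final long id;

        VertexRef(Tx tx, long id) {
            this.tx = tx;
            this.id = id;
        }

        long getId() {
            return id;
        }

        String getName() {
            return tx.nameOf(id); // fails once the owning transaction is closed
        }
    }

    static String demo() {
        Store store = new Store();
        Tx outer = new Tx(store);
        Tx inner = new Tx(store);

        VertexRef v2 = inner.addVertex("foo");
        inner.commit(); // v2 is now bound to a closed transaction

        String staleAccess;
        try {
            v2.getName();
            staleAccess = "no exception";
        } catch (IllegalStateException e) {
            staleAccess = "stale";
        }

        VertexRef reloaded = outer.getVertex(v2.getId()); // reload in the outer transaction
        String result = staleAccess + ":" + reloaded.getName();
        outer.commit();
        return result;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "stale:foo"
    }
}
```

The handle created in the inner transaction fails once that transaction is committed, while a handle reloaded through the outer transaction works. This mirrors the g.getVertex(v2) call in the nested transaction example.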

Next steps

Read more about Blueprints’ ThreadedTransactionalGraph.
Read about default Transaction Handling.