UncleJim vs. PCollections - GlenKPeterson/Paguro GitHub Wiki
Note: UncleJim is now called Paguro.
PCollections competes only with Paguro's first bullet point (immutable collections), so I'm only going to talk about that in this document.
The PCollections vector (list) is O(log n), just like Clojure's vector and hashMap/hashSet. But Big O notation ignores constant factors, and the PCollections vector uses a binary tree, which is log2 n, while Clojure's vector and hashMap/hashSet use trees with a branching factor of 32, which yields log32 n performance. Since log2 n / log32 n = log2 32 = 5, the binary tree is about 5 times deeper for the same number of items. Because of that, the Clojure collections should be about 5 times faster than the PCollections ones.
This graph shows how many operations each lookup requires (on the y-axis) for a given number of items in the collection (on the x-axis). The red line is the fast log32, the blue is the slower log2. Daniel Spiewak explains all the ramifications of this better than I ever could: https://www.youtube.com/watch?v=pNhBQJN44YQ
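The two growth rates can also be tabulated numerically. The following is a minimal sketch, not taken from either library: the class name `LookupDepth` and the helper `logBase` are made up for illustration, and the logarithm is only an estimate of tree depth, not a measurement of real lookups.

```java
public class LookupDepth {
    /** Logarithm of n in an arbitrary base, via the change-of-base rule. */
    public static double logBase(double base, double n) {
        return Math.log(n) / Math.log(base);
    }

    public static void main(String[] args) {
        // Same collection sizes as the benchmark below: powers of 10.
        for (int n = 10; n <= 100_000; n *= 10) {
            double binary = logBase(2, n);  // depth estimate for a binary tree
            double wide = logBase(32, n);   // depth estimate for a 32-way tree
            System.out.printf("n=%-7d log2=%6.2f  log32=%5.2f  ratio=%.2f%n",
                    n, binary, wide, binary / wide);
        }
    }
}
```

The ratio column is 5.00 on every row, because the ratio of the two logarithms does not depend on n.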
The benchmarks below were run on a quad-core i7-4790K CPU @ 4.00GHz with hyper-threading disabled. So far this compares just the immutable vector (non-linked list) in Paguro vs. PCollections. Both vector implementations are based on powers of 2, though different ones. I used powers of 10 for the vector sizes because I thought they would be unlikely to correspond to any particularly good or bad performance points in either Vector implementation.
10 items: 1.4x faster
100: 2.8x faster
1000: 4.4x faster
10000: 5.9x faster
100000: 7.7x faster
These numbers suggest that Paguro's Vector (which comes from Clojure) is faster and scales better than the PCollections Vector. The performance gap grows roughly logarithmically with the number of items, with a base near 4, which is roughly what was predicted above. I tried running some tests on accessing each item by index and by iterator, but because of the way I ran them, all I could tell was that building the collection took about 3x longer than accessing the data in it once it was built. This proportion of build time to access time was about equal across the board.
I built the vector from Strings using the ordinal(int n) function, which takes a little time to produce, but I don't think this makes much difference.
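The source of ordinal(int n) is not shown on this page. As a purely hypothetical stand-in, assuming it produces English ordinals such as "9th" (the only value the benchmark code asserts on), something like the following would do. The class name `Ordinals` and the suffix rules are assumptions, not the original helper.

```java
public class Ordinals {
    /** Hypothetical stand-in for ordinal(int n): 1 -> "1st", 2 -> "2nd", 11 -> "11th". */
    public static String ordinal(int n) {
        int lastTwo = Math.abs(n) % 100;
        int last = lastTwo % 10;
        String suffix;
        if (lastTwo >= 11 && lastTwo <= 13) {
            suffix = "th";      // 11th, 12th, 13th are irregular
        } else if (last == 1) {
            suffix = "st";
        } else if (last == 2) {
            suffix = "nd";
        } else if (last == 3) {
            suffix = "rd";
        } else {
            suffix = "th";
        }
        return n + suffix;
    }
}
```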
java -jar target/benchmarks.jar -wi 5 -i 7
Benchmark                            Mode  Cnt        Score       Error  Units
MyBenchmark.buildImList10           thrpt   70  3690204.388 ±  7137.675  ops/s
MyBenchmark.buildImList100          thrpt   70   340778.855 ±  3838.734  ops/s
MyBenchmark.buildImList1000         thrpt   70    30860.163 ±   245.794  ops/s
MyBenchmark.buildImList10000        thrpt   70     2869.233 ±     8.490  ops/s
MyBenchmark.buildImList100000       thrpt   70      271.891 ±     0.642  ops/s
MyBenchmark.buildTreePVector10      thrpt   70  2580167.387 ± 12669.661  ops/s
MyBenchmark.buildTreePVector100     thrpt   70   120573.558 ±   316.766  ops/s
MyBenchmark.buildTreePVector1000    thrpt   70     6939.288 ±    12.626  ops/s
MyBenchmark.buildTreePVector10000   thrpt   70      488.121 ±     1.626  ops/s
MyBenchmark.buildTreePVector100000  thrpt   70       35.501 ±     0.070  ops/s
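The "x faster" figures quoted earlier are the ratios of these throughput scores at each size. Here is a small sketch of that arithmetic. The scores are copied from the output above, while the class name `Speedup` and its layout are just for illustration.

```java
public class Speedup {
    // Scores in ops/s copied from the JMH output, for sizes 10, 100, ..., 100000.
    static final int[] SIZES = {10, 100, 1000, 10000, 100000};
    static final double[] IM_LIST = {3690204.388, 340778.855, 30860.163, 2869.233, 271.891};
    static final double[] TREE_PVECTOR = {2580167.387, 120573.558, 6939.288, 488.121, 35.501};

    /** How many times higher the ImList throughput is at the i-th size. */
    public static double speedup(int i) {
        return IM_LIST[i] / TREE_PVECTOR[i];
    }

    public static void main(String[] args) {
        for (int i = 0; i < SIZES.length; i++) {
            System.out.printf("%d items: %.1fx faster%n", SIZES[i], speedup(i));
        }
    }
}
```

This ignores the error column, which is small relative to the scores in every row.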
public ImList<String> buildImList(ImList<String> empty, int maxIdx) {
    for (int i = 0; i < maxIdx; i++) {
        empty = empty.append(ordinal(i));
    }
    return empty;
}

public PVector<String> buildPVector(PVector<String> empty, int maxIdx) {
    for (int i = 0; i < maxIdx; i++) {
        empty = empty.plus(ordinal(i));
    }
    return empty;
}

@Benchmark public void buildImList10() {
    ImList<String> vec = buildImList(PersistentVector.empty(), 10);
    String last = vec.get(9);
    assert("9th".equals(last));
}

@Benchmark public void buildTreePVector10() {
    PVector<String> vec = buildPVector(TreePVector.empty(), 10);
    String last = vec.get(9);
    assert("9th".equals(last));
}

// And so on using lots of cut and pasted code for 100, 1000, 10000, etc.
The Clojure collections also walk the sibling nodes in the internal data trees of these structures to provide iterators. PCollections starts from the top of the tree doing an index lookup for each item, then increments the index and goes back to the top to look up the next (at least for the list implementation). This may give Clojure's vector some advantage when iterating. I look forward to testing that theory.
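To get a feel for why the iteration strategy might matter, here is a rough counting model. It is a toy, not the internals of either library: it assumes a tree that packs a fixed number of entries into every node, and the class `IterationCost` and its methods are invented for this sketch. It only counts node visits and measures nothing.

```java
public class IterationCost {
    /** Levels in a toy tree that packs `branching` entries into every node. */
    public static int levels(long items, int branching) {
        int levels = 0;
        long nodes = items;
        do {
            nodes = (nodes + branching - 1) / branching; // ceiling division
            levels++;
        } while (nodes > 1);
        return levels;
    }

    /** Total nodes in the same toy tree, summed over every level. */
    public static long totalNodes(long items, int branching) {
        long total = 0;
        long nodes = items;
        do {
            nodes = (nodes + branching - 1) / branching;
            total += nodes;
        } while (nodes > 1);
        return total;
    }

    /** Node visits when each item is looked up by index, restarting at the root. */
    public static long visitsByIndex(long items, int branching) {
        return items * levels(items, branching);
    }

    /** Node visits when a single traversal touches every node exactly once. */
    public static long visitsByWalk(long items, int branching) {
        return totalNodes(items, branching);
    }

    public static void main(String[] args) {
        long n = 1000;
        System.out.println("binary tree, lookup per index: " + visitsByIndex(n, 2));
        System.out.println("binary tree, single walk:      " + visitsByWalk(n, 2));
        System.out.println("32-way tree, single walk:      " + visitsByWalk(n, 32));
    }
}
```

For 1000 items this model gives 10000 visits for per-index lookups in a binary tree, 1001 for a single walk of that tree, and 33 for a single walk of a 32-way tree. In this model both the traversal strategy and the branching factor contribute. Whether real iterators show a similar gap still needs the benchmark mentioned above.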
Clojure's (and Java's) sorted/tree map/set implementations are log2 n, so PCollections is likely about as fast for those two collections. If someone does performance testing to verify these theories, please let me know so I can link to it here. I plan to come back to this with more tests and data later...