UncleJim vs. PCollections - GlenKPeterson/Paguro GitHub Wiki

Note: UncleJim is now called Paguro.

PCollections competes only with Paguro's first bullet point: immutable collections, so I'm only going to talk about that in this document.

The PCollections vector (list) is O(log n), just like Clojure's vector and hashMap/hashSet. But Big O notation ignores constant factors and the PCollections vector uses a binary tree which is log2 n while Clojure's vector and hashMap/hashSet use trees with a branching factor of 32 which yields log32 n performance. Because of that, the Clojure collections should be about 5 times faster than the PCollections ones.

Graph of Log base 32 (red) vs. Log base 2 (blue)

This graph shows how many operations each lookup requires (on the y-axis) for a given number of items in the collection (on the x-axis). The red line is the fast log32, the blue is the slower log2. Daniel Spiewak explains all the ramifications of this better than I ever could: https://www.youtube.com/watch?v=pNhBQJN44YQ

Benchmarks

Benchmarks below from a quad-core i7-4790K CPU @ 4.00GHz with hyper-threading disabled. So far this compares just the immutable vector (non-linked list) in Paguro vs. PCollections. Both vector implementations are based on different powers of 2. I used powers of 10 for the vector sizes because I thought they would be unlikely to correspond to any particularly good or bad performance points in either Vector implementation.

Building a vector in Paguro is faster than building one in PCollections:

10 items: 1.4x faster
100: 2.8x faster
1000: 4.4x faster
10000: 5.9x faster
100000: 7.7x faster

These numbers suggest that Paguro's Vector from Clojure is faster and it scales better than PCollections Vector. The performance gap increases by a logarithmic factor of a base near 4, which is roughly what was predicted above. I tried running some tests on accessing each item by index and by iterator, but because of the way I ran them, all I could tell is that building the collection took about 3x longer than accessing the data in it once it was built. This proportion of build time to access time was about equal across the board.

Data

I built the vector from Strings using the ordinal(int n) function which takes a little time to produce, but I don't think this makes much difference.

java -jar target/benchmarks.jar -wi 5 -i 7
Benchmark                              Mode  Cnt        Score       Error  Units
MyBenchmark.buildImList10             thrpt   70  3690204.388 ±  7137.675  ops/s
MyBenchmark.buildImList100            thrpt   70   340778.855 ±  3838.734  ops/s
MyBenchmark.buildImList1000           thrpt   70    30860.163 ±   245.794  ops/s
MyBenchmark.buildImList10000          thrpt   70     2869.233 ±     8.490  ops/s
MyBenchmark.buildImList100000         thrpt   70      271.891 ±     0.642  ops/s
MyBenchmark.buildTreePVector10        thrpt   70  2580167.387 ± 12669.661  ops/s
MyBenchmark.buildTreePVector100       thrpt   70   120573.558 ±   316.766  ops/s
MyBenchmark.buildTreePVector1000      thrpt   70     6939.288 ±    12.626  ops/s
MyBenchmark.buildTreePVector10000     thrpt   70      488.121 ±     1.626  ops/s
MyBenchmark.buildTreePVector100000    thrpt   70       35.501 ±     0.070  ops/s

Tests:

public ImList<String> buildImList(ImList<String> empty, int maxIdx) {
    for (int i = 0; i < maxIdx; i++) {
        empty = empty.append(ordinal(i));
    }
    return empty;
}

public PVector<String> buildPVector(PVector<String> empty, int maxIdx) {
    for (int i = 0; i < maxIdx; i++) {
        empty = empty.plus(ordinal(i));
    }
    return empty;
}

@Benchmark public void buildImList10() {
    ImList<String> vec = buildImList(PersistentVector.empty(), 10);
    String last = vec.get(9);
    assert("9th".equals(last));
}

@Benchmark public void buildTreePVector10() {
    PVector<String> vec = buildPVector(TreePVector.empty(), 10);
    String last = vec.get(9);
    assert("9th".equals(last));
}

// And so on using lots of cut and pasted code for 100, 1000, 10000, etc.

Further Thoughts

The Clojure collections also walk the sibling nodes in the internal data trees of these structures to provide iterators. PCollections starts from the top of the tree doing an index lookup for each item, then increments the index and goes back to the top to look up the next (at least for the list implementation). This may give Clojure's vector some advantage when iterating. I look forward to testing that theory.

Clojure's (and Java's) sorted/tree map/set implementations are log2 n, so PCollections is likely about as fast for those two collections. If someone does performance testing to verify these theories, please let me know so I can link to it here. I plan to come back to this with more tests and data later...

⚠️ **GitHub.com Fallback** ⚠️