Guava - illyfrancis/scribble GitHub Wiki

Old presentation

Immutable collections

  • use immutable collections instead of unmodifiable ones
    • ImmutableList, Set, SortedSet, Map, (SortedMap - one day)
    • immutable vs unmodifiable
      • immutability guarantee, easier to use, faster, less memory
  • comparing Collections.unmodifiableXXX with immutable collections (8:06)
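
The practical difference between the two can be sketched as follows. This is a minimal illustration, assuming Guava is on the classpath; the class and method names are made up for the example.

```java
import com.google.common.collect.ImmutableList;

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class ImmutableVsUnmodifiableDemo {

  /** Returns {view size, snapshot size} after the backing list grows. */
  static int[] sizesAfterMutation() {
    List<String> backing = new ArrayList<String>();
    backing.add("a");

    // A read-only *view*: it still reflects later changes to 'backing'.
    List<String> view = Collections.unmodifiableList(backing);
    // An independent *copy*: its contents are frozen at this point.
    ImmutableList<String> snapshot = ImmutableList.copyOf(backing);

    backing.add("b");
    return new int[] {view.size(), snapshot.size()};
  }

  public static void main(String[] args) {
    int[] sizes = sizesAfterMutation();
    System.out.println("view=" + sizes[0] + ", snapshot=" + sizes[1]); // view=2, snapshot=1
  }
}
```

The unmodifiable wrapper only stops callers from mutating through it; whoever still holds the backing list can change what everyone else sees.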

Examples

  • constant set using JDK (11:55)
  • a little better Collections.unmodifiableSet( new LinkedHashSet<Integer>( Arrays.asList(...)));
    • ImmutableSet<Integer> LUCKY_NUMBERS = ImmutableSet.of(4, 8, 15...);
      • of() <- java.util.EnumSet pattern (** check it out**)
      • constant map - ImmutableMap<String, Integer> ENG_TO_INT = ImmutableMap.with("four", 4)...build() (15:16)
      • Defensive copies
        • ImmutableSet.copyOf(numbers) (17:10)
      • factories
        • ImmutableSet.of() (== Collections.emptySet()?)
        • ImmutableSet.of(a) (== Collections.singleton(a)?)
        • ImmutableSet.of(a, b, c)
        • ImmutableSet.copyOf(someIterator);
        • ImmutableSet.copyOf(someIterable);
      • static final ImmutableMap<Integer, String> MAP = ImmutableMap.of(1, "one", 2, "two");
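      • there is a builder pattern in current Guava: ImmutableMap.builder().put(k, v)...build() replaces the with(...) chain shown above
    • Immutable collections don't take nulls!!! (null elements are rejected with a NullPointerException)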
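
A small sketch of the builder form and the null rejection, assuming a reasonably recent Guava. The map contents are just the values from the notes above, and the class name is illustrative.

```java
import com.google.common.collect.ImmutableList;
import com.google.common.collect.ImmutableMap;

import java.util.Arrays;

final class ImmutableFactoriesDemo {

  // Constant map built with the builder; iteration follows insertion order.
  static final ImmutableMap<String, Integer> ENG_TO_INT =
      ImmutableMap.<String, Integer>builder()
          .put("four", 4)
          .put("eight", 8)
          .put("fifteen", 15)
          .build();

  /** Returns true if copying a list that contains null is rejected. */
  static boolean rejectsNull() {
    try {
      ImmutableList.copyOf(Arrays.asList("a", null));
      return false;
    } catch (NullPointerException expected) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(ENG_TO_INT);    // {four=4, eight=8, fifteen=15}
    System.out.println(rejectsNull()); // true
  }
}
```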

Multisets

  • Type of collection
Collection behaviour
  • Can it have duplicates?
  • Is ordering significant? (for equals())
  • Iteration order
    • insertion-ordered? (Linked list) comparator-ordered? (Tree) user-ordered? (Array)
    • something else well-defined?
    • or it just doesn't matter?

In general, the first two determine the interface type, and the third tends to influence your choice of implementation.

List vs. Set
  • Set: unordered equality, no dups
  • List: ordered equality, can have dups
Diagram (31:50)
                 Ordered?
             Y             N
Dups? +--------------+----------+
  Y   |     List     | Multiset |
      +--------------+----------+
  N   | (UniqueList) |    Set   |
      +--------------+----------+
Multiset: unordered equality, can have dups
  • == Bag
  • use cases,
    • hand of cards, compare same hands
    • are these Lists equal, ignoring order?
    • histograms, what distinct tags am I using on my blog, and how many times do I use each one?
Example code, before (36:07)
Map<String, Integer> tags
  = new HashMap<String, Integer>();
for (BlogPost post : getAllBlogPosts()) {
  for (String tag: post.getTags()) {
    int value = tags.containsKey(tag) ? tags.get(tag) : 0;
    tags.put(tag, value + 1);
  }
}
  • distinct tags: tags.keySet()
  • count for "java" tag: tags.containsKey("java") ? tags.get("java") : 0;
  • total count: // oh crap...
  • Java tutorial shows proper way of doing this!!! (have a look)
Example code, after (38:27)
Multiset<String> tags = HashMultiset.create();
for (BlogPost post : getAllBlogPosts()) {
  tags.addAll(post.getTags());
}
  • distinct tags: tags.elementSet();
  • count for "java" tag: tags.count("java")
  • total count: tags.size()
Example, after after (40:10)
  • if need to remove/decrement? (Multiset supports it)
  • concurrency? (instead of locking the entire map, use ConcurrentMultiset)
Multiset API (00:41)
  • Everything from Collection plus
  • count, add, remove, setCount, etc
Multiset implementations
  • ImmutableMultiset
  • HashMultiset
  • LinkedHashMultiset
  • TreeMultiset
  • EnumMultiset
  • ConcurrentMultiset (ConcurrentHashMultiset in current Guava)
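
The extra Multiset operations listed above (count, add with occurrences, remove, setCount) can be tried out with a sketch like this one. The tag values are made up and the class name is illustrative.

```java
import com.google.common.collect.HashMultiset;
import com.google.common.collect.Multiset;

import java.util.Arrays;

final class MultisetDemo {

  static Multiset<String> tagHistogram() {
    Multiset<String> tags = HashMultiset.create();
    tags.addAll(Arrays.asList("java", "guava", "java", "collections", "java"));
    tags.add("guava", 2);            // add two more occurrences at once
    tags.remove("java");             // decrement by one occurrence
    tags.setCount("collections", 0); // drop the element entirely
    return tags;
  }

  public static void main(String[] args) {
    Multiset<String> tags = tagHistogram();
    System.out.println(tags.count("java"));       // 2
    System.out.println(tags.count("guava"));      // 3
    System.out.println(tags.count("missing"));    // 0, no null check needed
    System.out.println(tags.size());              // 5 (total occurrences)
    System.out.println(tags.elementSet().size()); // 2 (distinct elements)
  }
}
```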

Multimaps

Before
Map<Salesperson, List<Sale>> map = new HashMap<Salesperson, List<Sale>>();

public void makeSale(Salesperson salesPerson, Sale sale) {
  List<Sale> sales = map.get(salesPerson);
  if (sales == null) {
    sales = new ArrayList<Sale>();
    map.put(salesPerson, sales);
  }
  sales.add(sale);
}
with multimaps
Multimap<Salesperson, Sale> multimap = ArrayListMultimap.create();

public void makeSale(Salesperson salesPerson, Sale sale) {
  multimap.put(salesPerson, sale);
}
  • collection of key-value pairs (entries) like a Map except that keys don't have to be unique {a=1, a=2, b=3}
  • or as Map use asMap() {a=[1,2], b=[3]}
  • get() view implements subtype, (13:44)
  • re-watch at 14:20, good example: biggest sale
  • view collections - Multimap has six: get(), keys(), keySet(), values(), entries(), asMap()
Multimap vs Map
  • Most Map methods are identical on Multimap
    • size(), isEmpty()
    • containsKey(), containsValue()
    • put(), putAll()
    • clear()
    • values()
  • The others have analogues
    • get() returns Collection instead of V
    • remove(K) becomes remove(K, V) and removeAll(K)
    • keySet() becomes keys() (well, and keySet())
    • entrySet() becomes entries()
  • And Multimap has a few new things
    • containsEntry(), replaceValues()
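
To make the Map/Multimap differences concrete, here is a small sketch using the {a=1, a=2, b=3} contents from the notes. The class name is illustrative; only order-independent results are printed because ArrayListMultimap does not promise a key order.

```java
import com.google.common.collect.ArrayListMultimap;
import com.google.common.collect.ListMultimap;

final class MultimapDemo {

  static ListMultimap<String, Integer> sample() {
    ListMultimap<String, Integer> m = ArrayListMultimap.create();
    m.put("a", 1);
    m.put("a", 2);
    m.put("b", 3);
    return m;
  }

  public static void main(String[] args) {
    ListMultimap<String, Integer> m = sample();
    System.out.println(m.size());                // 3 entries, not 2 keys
    System.out.println(m.keySet().size());       // 2 distinct keys
    System.out.println(m.keys().count("a"));     // 2, keys() is a Multiset
    System.out.println(m.get("a"));              // [1, 2]
    System.out.println(m.get("nope").isEmpty()); // true, never null
    System.out.println(m.asMap().get("a"));      // [1, 2]
    System.out.println(m.containsEntry("b", 3)); // true

    m.remove("a", 1);  // removes a single entry
    m.removeAll("b");  // removes every entry for the key
    System.out.println(m.containsKey("b"));      // false
    System.out.println(m.size());                // 1
  }
}
```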

BiMap

  • aka, unique-values map, guarantees its values are unique as well as its keys
  • has inverse() view
    • bimap.inverse().inverse() == bimap
  • stop creating two separate forward and backward Maps!
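
A minimal sketch of the inverse view and the unique-values guarantee, using HashBiMap as the implementation. The class name and sample entries are illustrative.

```java
import com.google.common.collect.BiMap;
import com.google.common.collect.HashBiMap;

final class BiMapDemo {

  static BiMap<String, Integer> engToInt() {
    BiMap<String, Integer> m = HashBiMap.create();
    m.put("four", 4);
    m.put("eight", 8);
    return m;
  }

  /** Returns true if binding a second key to an existing value is rejected. */
  static boolean rejectsDuplicateValue() {
    try {
      engToInt().put("quatre", 4); // 4 is already bound to "four"
      return false;
    } catch (IllegalArgumentException expected) {
      return true;
    }
  }

  public static void main(String[] args) {
    BiMap<String, Integer> m = engToInt();
    System.out.println(m.get("four"));                // 4
    System.out.println(m.inverse().get(8));           // eight
    System.out.println(m.inverse().inverse().equals(m)); // true
    System.out.println(rejectsDuplicateValue());      // true

    // forcePut replaces the old binding instead of throwing.
    m.forcePut("quatre", 4);
    System.out.println(m.containsKey("four"));        // false
  }
}
```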

ReferenceMap (21:39)

  • when dealing with weak or soft references
  • a generalization of java.util.WeakHashMap
  • Nine possible combos:
    • strong, weak, or soft keys
    • strong, weak or soft values
  • fully concurrent
    • implements ConcurrentMap
    • cleanup done on GC Thread
  • and more...
  • used a lot for caching,
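    • a strongly referenced object is never garbage collected; an object reachable only through weak references becomes eligible for collection, so weak keys/values let cache entries disappear once nothing else uses them (24:00)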

Ordering class

  • Comparator is easy to implement but a pain to use
  • Ordering is Comparator++ (or RichComparator)

Ordering<String> caseless = Ordering.from(String.CASE_INSENSITIVE_ORDER);

  • There are also methods like min(iterable), max(iterable), isIncreasing(iterable) (isOrdered in current Guava), sortedCopy(iterable), reverse()...

Static factory methods

Rather than typing
Multimap<String, Class<? extends Handler>> handlers =
  new ArrayListMultimap<String, Class<? extends Handler>>();
do this
Multimap<String, Class<? extends Handler>> handlers =
  ArrayListMultimap.create();
  • also provided for JDK collections, like Lists, Sets, Maps
  • with overloads to accept Iterables to copy elements from

Working with Iterator and Iterables

  • Collection is a good abstraction when all your data is in memory
  • Sometimes you want to process large amounts of data in a single pass
  • Implementing Collection is possible but cumbersome, and won't behave nicely
  • Iterator and Iterable are often all you need
  • Google Collections methods accept Iterator and Iterable whenever practical

Iterators and Iterables classes

  • These classes have parallel APIs, one for Iterator and the other for Iterable
Iterable transform(Iterable, Function)
Iterable filter(Iterable, Predicate)
T find(Iterable<T>, Predicate)
Iterable concat(Iterable<Iterable>)
Iterable cycle(Iterable)
T getOnlyElement(Iterable<T>)
Iterable<T> reverse(List<T>)
...
  • These methods are LAZY!
    • backing iterators aren't accessed until needed

Presso by an aussie guy (not that good)

No more null

  • use Optional

class Person {
  Optional<Color> getFavoriteColor();
}

Color colorToUse = person.getFavoriteColor().or(Blue);

From Devoxx

com.google.common.base

Preconditions

  • Used to validate assumptions at the start of methods or constructors (and fail-fast)

public Car(Engine engine) {
  this.engine = checkNotNull(engine); // NPE
}
public void drive(double speed) {
  checkArgument(speed > 0.0, "speed (%s) must be positive", speed); // IAE
  checkState(engine.isRunning(), "engine must be running"); // ISE
  ...
}

Objects.toStringHelper()

  • for a cleaner Object.toString() implementation (moved to MoreObjects.toStringHelper() in newer Guava releases)

return Objects.toStringHelper(this)
  .add("name", name)
  .add("id", userId)
  .add("pet", petName)  // petName is @Nullable!
  .omitNullValues()
  .toString();

// "Person{name=Kurt Kluever, id=42}"

or without .omitNullValues()

// "Person{name=Kurt Kluever, id=42, pet=null}"

Stopwatch

  • Prefer Stopwatch over System.nanoTime()
    • (and definitely over currentTimeMillis())
    • exposes relative timings, not absolute time
    • alternate time sources can be substituted using Ticker (read() returns nanoseconds)
    • toString() gives human readable format

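// API as presented in the talk; newer Guava releases use Stopwatch.createStarted()
// and elapsed(TimeUnit) instead of the constructor and elapsedMillis()/elapsedTime().
Stopwatch stopwatch = new Stopwatch();
stopwatch.start();
doSomeOtherOperation();
long millis = stopwatch.elapsedMillis();
long nanos = stopwatch.elapsedTime(TimeUnit.NANOSECONDS);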

String splitting

Splitter.on(',')
  .trimResults()
  .omitEmptyStrings()
  .split(" foo, ,bar, quux, ");

=> ["foo", "bar", "quux"]

  • split()
  • .trimResults()
  • .omitEmptyStrings()

private static final Splitter SPLITTER = 
  Splitter.on(',').trimResults();

SPLITTER.split("Kurt, Kevin, Chris");
// yields: ["Kurt", "Kevin", "Chris"]

String Joining

  • Joiner concats strings using a delimiter
    • throws a NPE on null objects, unless
      • .skipNulls()
      • .useForNull(String)

private static final Joiner JOINER = 
  Joiner.on(", ").skipNulls();

JOINER.join("Kurt", "Kevin", null, "Chris");
// yields: "Kurt, Kevin, Chris"

CharMatcher

  • What's a matching character?
    • WHITESPACE, ASCII, ANY (many pre-defined sets)
    • .is('x'), .isNot('_'), .anyOf("aeiou"), .inRange('a', 'z')
    • or subclass CharMatcher and implement matches(char)
  • What to do with those matching characters?
    • matchesAllOf, matchesAnyOf, matchesNoneOf
    • indexIn, lastIndexIn, countIn
    • removeFrom, retainFrom
    • trimFrom, trimLeadingFrom, trimTrailingFrom
    • collapseFrom, trimAndCollapseFrom, replaceFrom
  • Example (scrub a user ID)
    • CharMatcher.DIGIT.or(CharMatcher.is('-')).retainFrom(userInput);
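
A few of these operations in runnable form. This sketch assumes Guava 19 or later for the CharMatcher.whitespace() factory (older releases expose the WHITESPACE constant instead) and uses inRange('0', '9') as an ASCII-only stand-in for the DIGIT matcher; the class and method names are illustrative.

```java
import com.google.common.base.CharMatcher;

final class CharMatcherDemo {

  /** Keeps only ASCII digits and dashes. */
  static String scrubUserId(String input) {
    return CharMatcher.inRange('0', '9')
        .or(CharMatcher.is('-'))
        .retainFrom(input);
  }

  /** Trims the ends and collapses each inner whitespace run to one space. */
  static String normalizeWhitespace(String input) {
    return CharMatcher.whitespace().trimAndCollapseFrom(input, ' ');
  }

  static int countVowels(String input) {
    return CharMatcher.anyOf("aeiou").countIn(input);
  }

  public static void main(String[] args) {
    System.out.println(scrubUserId("id: 555-1234 ext."));            // 555-1234
    System.out.println(normalizeWhitespace("  too   many\tspaces ")); // too many spaces
    System.out.println(countVowels("guava"));                         // 3
  }
}
```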

Optional<T>

  • immutable wrapper that is either:
    • present - contains non-null reference
    • absent - contains nothing
    • it never contains 'null'
  • possible uses:
    • return type (vs null)
      • a T that must be present
      • a T that might be absent
    • distinguish between
      • unknown (not present in a map)
      • known to have no value (present in the map with value Optional.absent())
    • wrap nullable references for storage in a collection that does not support null
  • creating an Optional<T>
    • Optional.of(notNull);
    • Optional.absent();
    • Optional.fromNullable(maybeNull);
  • Unwrapping an Optional<T>
    • mediaType.charset().get();
    • mediaType.charset().or(Charsets.UTF_8);
    • mediaType.charset().or(costlySupplier);
    • mediaType.charset().orNull();
  • Other useful methods
    • mediaType.charset().asSet(); // 0 or 1
    • mediaType.charset().transform(stringFunc);
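
The creation and unwrapping calls above fit together roughly like this. The sketch uses Guava's com.google.common.base.Optional (not java.util.Optional); the lookup map and names are hypothetical.

```java
import com.google.common.base.Optional;

import java.util.HashMap;
import java.util.Map;

final class OptionalDemo {

  // Hypothetical data source; a missing key means "no favorite color known".
  private static final Map<String, String> FAVORITE_COLORS =
      new HashMap<String, String>();
  static {
    FAVORITE_COLORS.put("kurt", "green");
  }

  /** Wraps a possibly-null lookup result so callers must handle absence. */
  static Optional<String> favoriteColor(String person) {
    return Optional.fromNullable(FAVORITE_COLORS.get(person));
  }

  /** Falls back to a default when no value is present. */
  static String colorToUse(String person) {
    return favoriteColor(person).or("blue");
  }

  public static void main(String[] args) {
    System.out.println(colorToUse("kurt"));                 // green
    System.out.println(colorToUse("nobody"));               // blue
    System.out.println(favoriteColor("nobody").isPresent()); // false
    System.out.println(favoriteColor("nobody").orNull());    // null
    System.out.println(favoriteColor("kurt").asSet().size()); // 1
  }
}
```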

Functional Programming

  • Function<F, T>
    • one way transformation of F into T
    • T apply(F input)
    • most common use: transforming collections (view)
  • Predicate<F>
    • determines true or false for a given F
    • boolean apply(F input)
    • most common use: filtering collections (view)

com.google.common.collect

FP example

Predicate<Client> activeClients = new Predicate<Client>() {
  public boolean apply(Client client) {
    return client.activeInLastMonth();
  }
};

Returns an immutable list of the names of the first 10 active clients in the database.

FluentIterable.from(database.getClientList())
  .filter(activeClients)  // Predicate
  .transform(Functions.toStringFunction())  // Function
  .limit(10)
  .toImmutableList();

FluentIterable API

  • Chaining (returns FluentIterable)
    • skip
    • limit
    • cycle
    • filter, transform
  • Querying (returns boolean)
    • allMatch, anyMatch
    • contains, isEmpty
  • Converting
    • toImmutable{List, Set, SortedSet}
    • toArray
  • Extracting
    • first, last, firstMatch (returns Optional)
    • get (returns E)

FP

functional style
Function<String, Integer> lengthFunction = 
  new Function<String, Integer>() {
    public Integer apply(String string) {
      return string.length();
    }
  };

Predicate<String> allCaps = new Predicate<String>() {
  public boolean apply(String string) {
    return CharMatcher.JAVA_UPPER_CASE
      .matchesAllOf(string);
  }
};

Multiset<Integer> lengths = HashMultiset.create(
  Iterables.transform(
    Iterables.filter(strings, allCaps),
    lengthFunction));

// ugly!!

without fp
Multiset<Integer> lengths = HashMultiset.create();
for (String string: strings) {
  if (CharMatcher.JAVA_UPPER_CASE.matchesAllOf(string)) {
    lengths.add(string.length());
  }
}

Multiset<E>

  • == a bag
  • add multiple instances of a given element
  • counts how many occurrences exist
  • similar to a Map<E, Integer> but:
    • only positive counts
    • size() returns total # of items, not # keys
    • count() for non-existent key is 0
    • iterator() goes over each element in the Multiset
      • elementSet().iterator() unique elements
  • similar to AtomicLongMap<E> which is like a Map<E, AtomicLong>

Multimap<K, V>

  • Like a Map but may have duplicate keys
  • The values related to a single key can be viewed as a collection (set or list)
  • similar to a Map<K, Collection<V>> but
    • get() never returns null (returns an empty collection)
    • containsKey() is true only if 1 or more values exist
    • entries() returns all entries for all keys
    • size() returns total number of entries, not keys
    • asMap() to view it as a Map<K, Collection<V>>
  • typically want the variable type to be ListMultimap or SetMultimap (and not plain Multimap), since their get() returns a List or Set rather than a bare Collection

BiMap<K1, K2>

  • bi-directional map
  • both keys and values are unique
  • can view the inverse map with inverse()
  • use instead of maintaining two separate maps
    • Map<K1, K2>
    • Map<K2, K1>

Table<R, C, V>

A "two-tier" map, or a map with two keys (called the "row key" and "column key")

  • can be sparse or dense
    • HashBasedTable: uses hash maps (sparse)
    • TreeBasedTable: uses tree maps (sparse)
    • ArrayTable: uses V[][] (dense)
  • many views on the underlying data are possible
    • row or column map (of maps)
    • row or column key set
    • set of all cells (as <R, C, V> entries)
  • use instead of Map<R, Map<C, V>>
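
A short sketch of the row, column and cell views using HashBasedTable. The student/subject data is invented for the example and the class name is illustrative.

```java
import com.google.common.collect.HashBasedTable;
import com.google.common.collect.Table;

final class TableDemo {

  /** Rows are students, columns are subjects, values are made-up grades. */
  static Table<String, String, Integer> grades() {
    Table<String, String, Integer> t = HashBasedTable.create();
    t.put("alice", "math", 90);
    t.put("alice", "art", 75);
    t.put("bob", "math", 60);
    return t;
  }

  public static void main(String[] args) {
    Table<String, String, Integer> t = grades();
    System.out.println(t.get("alice", "math"));    // 90
    System.out.println(t.contains("bob", "art"));  // false (sparse)
    System.out.println(t.row("alice").size());     // 2, a Map<subject, grade>
    System.out.println(t.column("math").size());   // 2, a Map<student, grade>
    System.out.println(t.rowKeySet().size());      // 2
    System.out.println(t.cellSet().size());        // 3
  }
}
```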

Immutable Collections

  • offered for all collection types
  • inherently thread-safe
  • reduced memory footprint
  • similar to Collections.unmodifiableXXX but
    • performs a copy (not a view / wrapper)
    • more efficient compared to unmodifiable collections
    • type conveys immutability

Comparators

Ugly...

Comparator<String> byReverseOffsetThenName = 
  new Comparator<String>() {
    public int compare(String tzId1, String tzId2) {
      int offset1 = getOffsetForTzId(tzId1);
      int offset2 = getOffsetForTzId(tzId2);
      int result = offset2 - offset1;  // careful! subtraction can overflow int when the operands are far apart (ints can't be null)
      return (result == 0)
        ? tzId1.compareTo(tzId2)
        : result;
    }
  };
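
Why the subtraction deserves a "careful!": for operands far enough apart, the difference overflows and flips sign. Real time-zone offsets are far too small to trigger this, so the sketch below uses extreme values purely to show the failure mode; it needs only the JDK (Java 7+ for Integer.compare).

```java
final class SubtractionCompareDemo {

  /** The shortcut used above: fine for small values, wrong on overflow. */
  static int bySubtraction(int a, int b) {
    return a - b;
  }

  /** Overflow-safe comparison. */
  static int safe(int a, int b) {
    return Integer.compare(a, b);
  }

  public static void main(String[] args) {
    int a = Integer.MIN_VALUE;
    int b = 1;
    // a is clearly smaller than b, yet a - b wraps around to Integer.MAX_VALUE.
    System.out.println(bySubtraction(a, b) > 0); // true  (claims a > b)
    System.out.println(safe(a, b) < 0);          // true  (correctly a < b)
  }
}
```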

ComparisonChain example

One way to rewrite this:

Comparator<String> byReverseOffsetThenName =
  new Comparator<String>() {
    public int compare(String tzId1, String tzId2) {
      return ComparisonChain.start()
        .compare(getOffset(tzId2), getOffset(tzId1))
        .compare(tzId1, tzId2)
        .result();
    }
  };

Short-circuits, never allocates, is fast. Also has

  • compare(T, T, Comparator<T>)
  • compareFalseFirst
  • compareTrueFirst

Ordering example

Comparator<String> byReverseOffsetThenName =
  Ordering.natural()
    .reverse()
    .onResultOf(tzToOffset())
    .compound(Ordering.natural());

private Function<String, Integer> tzToOffset() {
  return new Function<String, Integer>() {
    public Integer apply(String tzId) {
      return getOffset(tzId);
    }
  };
}

Ordering

Step 1

Implements Comparator and adds delicious goodies! (Could have been called FluentComparator like FluentIterable)

Common ways to get an Ordering to start with:

  • Ordering.natural()
  • new Ordering() { ... }
  • Ordering.from(existingComparator);
  • Ordering.explicit("alpha", "beta", "gamma");
Step 2

Then you can use the chaining methods to get an altered version of that Ordering

  • reverse()
  • compound(Comparator)
  • onResultOf(Function)
  • nullsFirst()
  • nullsLast()
  • lexicographical()
    • yields an Ordering<Iterable<T>>

Now you've got your Comparator but also Ordering has some handy operations

  • immutableSortedCopy(Iterable)
  • isOrdered(Iterable)
  • isStrictlyOrdered(Iterable)
  • min(Iterable)
  • max(Iterable)
  • leastOf(Iterable, int)
  • greatestOf(Iterable, int)

Some are even optimized for the specific kind of comparator you have.
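
Putting steps 1 and 2 together with a few of the operations, as a rough sketch: the ordering sorts strings by length and breaks ties alphabetically. The word list and all names are illustrative.

```java
import com.google.common.base.Function;
import com.google.common.collect.Ordering;

import java.util.Arrays;
import java.util.List;

final class OrderingDemo {

  private static final Function<String, Integer> LENGTH =
      new Function<String, Integer>() {
        @Override public Integer apply(String s) {
          return s.length();
        }
      };

  // Step 1: start from natural(); step 2: chain onResultOf and compound.
  static final Ordering<String> BY_LENGTH_THEN_ALPHA =
      Ordering.<Integer>natural()
          .onResultOf(LENGTH)
          .compound(Ordering.<String>natural());

  static final List<String> WORDS = Arrays.asList("pear", "fig", "banana", "kiwi");

  static List<String> sorted() {
    return BY_LENGTH_THEN_ALPHA.sortedCopy(WORDS);
  }

  static List<String> twoShortest() {
    return BY_LENGTH_THEN_ALPHA.leastOf(WORDS, 2);
  }

  static String longest() {
    return BY_LENGTH_THEN_ALPHA.max(WORDS);
  }

  public static void main(String[] args) {
    System.out.println(sorted());      // [fig, kiwi, pear, banana]
    System.out.println(twoShortest()); // [fig, kiwi]
    System.out.println(longest());     // banana
    System.out.println(BY_LENGTH_THEN_ALPHA.isOrdered(sorted())); // true
  }
}
```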

which is better?

Ordering or ComparisonChain? -> it depends

com.google.common.hash

Why a new hashing API?

Think of Object.hashCode() as "good enough for in-memory hash maps" but:

  • Strictly limited to 32 bits
  • Worse, composed hash codes are "collared" down to 32 bits during the computation
  • No separation between "which data to hash" and "which algorithm to hash it with"
  • implementations have very bad bit dispersion

These make it not very useful for a multitude of hashing applications: a document "fingerprint", cryptographic hashing, cuckoo hashing, Bloom filters...

JDK solution

To address this, the JDK introduced two abstractions

  • java.security.MessageDigest
  • java.util.zip.Checksum

Each named after a specific use case for hashing.

Worse than the split, neither is remotely easy to use when you're not hashing raw byte arrays.

Guava Hashing example

HashCode hash =
  Hashing.murmur3_128().newHasher()
    .putInt(person.getAge())
    .putLong(person.getId())
    .putString(person.getFirstName())
    .putString(person.getLastName())
    .putBytes(person.getSomeBytes())
    .putObject(person.getPet(), petFunnel)
    .hash();
  • HashCode has asLong(), asBytes(), toString()...
  • or put it into a Set, return it from an API etc

Hashing overview

The com.google.common.hash API offers:

  • unified user-friendly API for all hash functions
  • seedable 32- and 128-bit implementations of murmur3 (???)
  • md5(), sha1(), sha256(), sha512() adapters
    • change only one line of code to switch between these and murmur etc
  • goodFastHash(int bits) for when you don't care what algorithm you use
  • general utilities for HashCode instances, like combineOrdered/combineUnordered ???
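
The "which data" versus "which algorithm" split can be sketched with a Funnel. The Pet class here is a hypothetical stand-in for the pet in the earlier example, and the charset-taking putString overload is assumed (it is the form current Guava provides).

```java
import com.google.common.hash.Funnel;
import com.google.common.hash.HashCode;
import com.google.common.hash.HashFunction;
import com.google.common.hash.Hashing;
import com.google.common.hash.PrimitiveSink;

import java.nio.charset.StandardCharsets;

final class HashingDemo {

  /** Hypothetical value type, only for this example. */
  static final class Pet {
    final String name;
    final int age;

    Pet(String name, int age) {
      this.name = name;
      this.age = age;
    }
  }

  /** Says which fields feed the hash, independent of the algorithm. */
  static final Funnel<Pet> PET_FUNNEL = new Funnel<Pet>() {
    @Override public void funnel(Pet pet, PrimitiveSink into) {
      into.putString(pet.name, StandardCharsets.UTF_8).putInt(pet.age);
    }
  };

  /** The algorithm is a parameter; switching it is a one-line change. */
  static HashCode hash(HashFunction function, Pet pet) {
    return function.newHasher().putObject(pet, PET_FUNNEL).hash();
  }

  static int murmurBits() {
    return hash(Hashing.murmur3_128(), new Pet("Rex", 3)).bits();
  }

  static int sha256Bits() {
    return hash(Hashing.sha256(), new Pet("Rex", 3)).bits();
  }

  static boolean deterministic() {
    return hash(Hashing.murmur3_128(), new Pet("Rex", 3))
        .equals(hash(Hashing.murmur3_128(), new Pet("Rex", 3)));
  }

  public static void main(String[] args) {
    System.out.println(murmurBits());    // 128
    System.out.println(sha256Bits());    // 256
    System.out.println(deterministic()); // true
  }
}
```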

BloomFilter

A probabilistic set

  • public boolean mightContain(T);
    • true == "probably there"
    • false == "definitely not there"

why? consider a spell checker:

  • if syzygy gets red-underlined, that's annoying
  • but if yarrzcw doesn't get red-underlined ... oh well!
  • and the memory savings can be very large

Primary use: short-circuiting an expensive boolean query
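
A minimal spell-checker-flavoured sketch of that contract. The expected-insertions count and the 1% false-positive rate are arbitrary choices for the example, and the class name is illustrative.

```java
import com.google.common.hash.BloomFilter;
import com.google.common.hash.Funnels;

import java.nio.charset.StandardCharsets;

final class BloomFilterDemo {

  /** Builds a filter sized for about 1000 words with a ~1% false-positive rate. */
  static BloomFilter<CharSequence> dictionary(String... words) {
    BloomFilter<CharSequence> filter =
        BloomFilter.create(Funnels.stringFunnel(StandardCharsets.UTF_8), 1000, 0.01);
    for (String word : words) {
      filter.put(word);
    }
    return filter;
  }

  public static void main(String[] args) {
    BloomFilter<CharSequence> dict = dictionary("syzygy", "guava");

    // Words that were added always report true: no false negatives.
    System.out.println(dict.mightContain("syzygy"));

    // Words that were never added usually report false, but a small
    // fraction may come back true, so no fixed output is claimed here.
    System.out.println(dict.mightContain("yarrzcw"));
  }
}
```

Only a `false` answer is definitive; a `true` answer is the cue to run the expensive exact check.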

com.google.common.cache

  • [to read] - read first pass, many good stuff