UnionFind - Eishaaya/GMRDataStructuresTest GitHub Wiki

Background on Union Find (Disjoint Sets)

  • Union-Find deals with the dynamic connectivity problem, which in itself is related to the graph connectivity problem.

  • The graph connectivity problem is the following: Given an undirected graph G, preprocess the graph so that queries of the form "are nodes P and Q connected?"

  • The dynamic connectivity problem is much harder than the graph connectivity problem.

  • The dynamic connectivity problem states the following: maintain an undirected graph G, so that edges may be inserted and deleted and connectivity queries may be answered efficiently.

  • The goal of the Union-Find data structure is to solve the dynamic connectivity problem.

    • Given a set of N objects, we should be able to connect any two objects, or ask if two objects are connected.
  • In computer science, union-find can also be called the disjoint-set data structure.

  • In computer science and mathematics, two sets are said to be disjoint sets if they have nothing in common.

  • Union-Find will be operating on disjoint sets, it will keep track of set of objects partitioned into a number of disjoint (non-overlapping) subsets.

  • A subset is a set of which all the elements are contained in another set.

    • Example:
      • B = {1, 2, 3}
      • A = {1, 2}
      • A is a subset of B (denoted by ⊆), can also be written as A ⊆ B.
      • If A is a subset of B, and A is not equal to B (i.e. B contains at least one element not inside A), then A is considered a proper subset of B.
  • In our program, the union-find data structure will process a sequence of pairs of values. The pair, (p, q), will represent some two objects p and q. We will also interpret the pair, (p, q), as meaning p is connected (union) to q.

    • We assume that p is connected to q is an equivalence relation.
    • An equivalence relation is a binary relation that is reflexive, symmetric, and transitive.
      • Reflexive: p is connected to p.
      • Symmetric: If p is connected to q, then q is connected to p.
      • Transitive: If p is connected to q and q is connected to r, then p is connected to r.
    • An equivalence relation will partition the objects in connected components.
  • The goal of union-find will be to connect the pairs and deal with the dynamic connectivity problem.


Visualization


Implementation


  • We will be implementing the union-find as two distinct algorithms. The first will be known as quick-find, and the second will be known as quick-union.
  • Each of the two algorithms will have two essential components to them. The first is the Union(p,q) command, which will connect two objects together in the set. The second being the Find(p) query, which essentially returns the set that the object belongs to.
  • Quick-Find has to advantage of being fast in its Find query, but it is rather slow in its Union command.
  • Quick-Union is the opposite of quick-find. It is fast in the union of pairs, but slower in finding what set they belong to.
  • Abstractions:
    • Objects can be represented as a set, ex: { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 }
    • Disjoint sets of objects, ex: { 0, 1, { 2, 3, 4 }, 5, 6, 7, { 8, 9 } }
    • Find query, are 2 and 5 in the same set? ex: { 0, 1, { 2, 3, 4, 5 }, 6, 7, 8, 9 }
    • Union command, merge sets containing 0 and 5:
      • Initial: { 0, 1, { 2, 3, 4, 5 }, 6, 7, 8, 9 }
      • Post Union: { 1, { 0, 2, 3, 4, 5 }, 6, 7, 8, 9 }
    • Connected components refers to the number of disjoint sets.
      • Ex: { 0, 1, { 2, 3, 4, 5 }, { 6, 7, 8, 9 } } has 4 connected components.
      • The 4 sets are: { 0 }, { 1 }, { 2, 3, 4, 5 }, { 6, 7, 8, 9 }

Quick-Find Implementation


public class QuickFind<T>
{

    private int[] sets;        
    private Dictionary<T, int> map;

    public QuickFind(IEnumerable<T> items);

    public int Find(T p) => sets[map[p]];
    public bool Union(T p, T q);
    public bool AreConnected(T p, T q);
    
}
  • The constructor should take in a container of N objects that will make up the disjoint sets.
  • The class should have an integer array to represent the N number of sets.
  • When working with generics, we need a way to map a T value to an integer value, so we need a dictionary to facilitate the mapping.
  • The find query will take in a T value, called p, and return the value in the array which indicates which set p belongs to. It is a fast O(1) time complexity operation.
  • The union command takes in two values, p and q. The goal is to update all the values that are part of set belonging to p, to now belong to set containing q. Can result in O(N) time complexity.
  • AreConnected will be useful to find out if p and q are connected. Additionally, it can be used in Union because we only want to perform the union if p and q are not already connected.

Quick-Union Implementation


public class QuickUnion<T>
{

    private int[] parents;        
    private Dictionary<T, int> map;

    public QuickUnion(IEnumerable<T> items);

    public int Find(T p);
    public bool Union(T p, T q);
    public bool AreConnected(T p, T q);
    
}
  • Majority of QuickUnion is the same as QuickFind, the differences are in Find and Union methods.
  • Union uses the integer array to store the parents of the objects. It is a quick O(1) time complexity operation.
  • Find follows the links of p in the parents array to the top most canonical element, can result in a O(N) time complexity.
  • There are further improvements to the Union-Find data structure which can improve the worst case time complexity to O(Log(N)).

Assignments to be done with UnionFind


  • Read the input from the Friends Verticies text file and Friends Edges textfile. The first line is the number of initial disjoint sets (N). The next N lines are the object values, and the remainder of the file are comma separated value pair representing the union command. You will then answer the following questions:

    • How many friend groups are there?
    • Who is in the largest and the smallest friend group?
    • Is Phoebe Friends with Rachel? Is Michael friends with Pam? Is Chandler friends with Creed?
    • Who are the members of each set? Displayed by their values.
      • Ex: Group 1 contains: Jim, Jimbo, Jimothy
  • Union find visualizer

    • Make a 5x5 grid with each cell visualizing it's "parent" (use the alphabet to visualize since it is the easiest thing to see visually), Allow the user to click two cells which well union them. Then update the visualizer to match the new corresponding parents of each cell.
  • Union find maze generator

    • Use union find to generate a guarenteed solvable maze from point A to point B
    • Algorithm: Write a maze generation function that takes a graph, start, and end as an input. Create your union find structure based off of all the nodes in your graph (its verticies). Then loop until the start and end are not connected (use the are connected function in union). Inside the loop select a random edge from the graph and check if that edge will connect it's two verticies inside the union find structure. If it does connect them then call union on those two nodes and remove that edge from the graph. By the end of this algorithm you will have a guarenteed path from start to end.

References



<-- Previous | Next -->

⚠️ **GitHub.com Fallback** ⚠️