Home CS61B: Lecture 15
Post
Cancel

CS61B: Lecture 15

Disjoint Sets

  • The disjoint set data structure is used to solve the “Dynamic Connectivity” problem
  • The problem that we are working with is that we have various items. With a disjoint set, we can do the following:
    • connect(x, y): Connects item x and y
    • isConnected(x,y): Returns true if x and y are connected). Connections can be transitive, they do not need to share a direct connection, but they have have a path from one item to another.
  • Disjoint Sets are used for kruskals algorithm and also percolation problems

Disjoint Sets on Integers

  • We force all items to be integers
  • We declare the number of items in advance, everything is initially disconnected.
  • Disjoint Sets Interface:
1
2
3
4
5
6
7
public interface DisjointSets {
    /* Connects two items P and Q */
    void connect(int p, int q);

    /* Checks to see if two items are connected */
    boolean isConnected(int p, int q);
}

The Naive Approach

  • Connecting two things: Record every connection in some data structure.
  • Checking connectedness: Do iteration over the lines to see if one item can be reached from the other.

The Better Approach

  • We focus on connected components.
  • We just record the sets that each item belongs to
    • We just care if items are connected, not how they are connected.

Idea 1

  • We use a List<Set<Integer>>
  • Intuitive but terrible because this is complicated and slow. We would have to iterate through all the sets to find anything.

Idea 2

  • We store a list of integers where the ith entry gives a set number (“id”) of item i.
    • connect(p, q): Change entires that equal id[p] to id[q]
  • This is still too slow, as connecting two items takes O(N) time.

Idea :3

  • We need to be able to change our set representation so that combining two sets into their union only requires changing one value.
  • We assign each item a parent rather than an id. This creates a tree structure where we may follow connections until we reach a root. All elements in the same set will share the same root.
  • Thus, we just need to make sure that items in the same set share the same root.
  • Steps, ex: connect(5,2):
    • we set root(5)’s value equal to root(2).
  • To check if two items are connected, we simply check if the root of one item is equal to the root of another item.
  • The issue with this is that we havea worst case tree height of N for N items.
    • We can make this more efficient by choosing how to connect two items.
    • In fact, we put the item with the smaller height under the root of the item with the bigger height.
    • However, tracking with height is tedious, instead we track with the number of elements in a tree, the “weight” of the tree.
  • Connecting based on height versus weight have different asymptotic efficiency.
  • To store the weight of the tree, we simply initialize another array that keeps track of the size of the tree that the item belongs to.

Weighted Quick Union Performance.

  • The only way that we may create a tree of height 2 is if we connect a tree of height 1 with another tree of height 1. Each tree of height 1 has at least 2 items. Thus, the smallest tree of height 2 has 4 items. The smallest tree of height 3 will have 8 items, and so on.
  • The height of our tree scales with the log of the input size N.
  • Using height minimizes the worst case depth of a node using weight minimizes the average depth of a nodex`
This post is licensed under CC BY 4.0 by the author.

CS61B: Lecture 11

CS61B: Lecture 16