Subsets - kjingers/Grokking-Notes GitHub Wiki

This pattern handles permutations and combinations of a given set of elements.

Subsets

Start with an empty list in the output subsets. Then, for each item in the input list, append the item to the existing items in subsets.

Time complexity

Since, in each step, the number of subsets doubles as we add each element to all the existing subsets, therefore, we will have a total of O(2^N) subsets, where ‘N’ is the total number of elements in the input set. And since we construct a new subset from an existing set, therefore, the time complexity of the above algorithm will be O(N*2^N).

Space complexity

All the additional space used by our algorithm is for the output list. Since we will have a total of O(2^N) subsets, and each subset can take up to O(N) space, therefore, the space complexity of our algorithm will be O(N*2^N).

def find_subsets(nums):
  subsets = []
  # Start with empty set
  subsets.append([])

  for currentNum in nums:
    n = len(subsets)
    # For Each subset, append the currentNum
    for i in range(n):

      # Calling list() on the list does a copy, 
      # rather than just copying the reference
      set1 = list(subsets[i])

      set1.append(currentNum)
      subsets.append(set1)

  return subsets


def main():

  print("Here is the list of subsets: " + str(find_subsets([1, 3])))
  print("Here is the list of subsets: " + str(find_subsets([1, 5, 3])))


main()

Subsets with Duplicates

Similar to above. The main difference is that we need to check if the current number is the same as the previous number. If so, then we only want to add the current number to the subsets that were added last iteration. We do this by maintaining a startIdx and endIdx which is the indexes of subsets that we add to. If we have a duplicate num, then we set startIdx = endIdx + 1, since endIdx is pointing to the last subset prior to the added subsets last iteration.

Same Complexity as above.

def find_subsets(nums):
  subsets = []
  subsets.append([])
  # Sort Input list, so duplicates are next to each other
  nums.sort()
  # Start and End Indexes keep track which subsets we will append to
  # for a given num
  startIdx, endIdx = 0, 0
  for i in range(len(nums)):
    startIdx = 0

    # If current num is the same as previous, then we only want to
    # append the current num to the subsets we added last iteration.
    # So, if duplicate, change startIdx to be one after endIdx
    if i > 0 and nums[i] == nums[i-1]:
      startIdx = endIdx + 1

    # Now that startIdx is set, we update endIdx to point to the current
    # last subset (before appending current num)
    endIdx = len(subsets) - 1
    for j in range(startIdx, endIdx + 1):

      set1 = list(subsets[j])
      set1.append(nums[i])
      subsets.append(set1)
    
  return subsets


def main():

  print("Here is the list of subsets: " + str(find_subsets([1, 3, 3])))
  print("Here is the list of subsets: " + str(find_subsets([1, 5, 3, 3])))

main()

Permutations

Use of the term "BFS" for this pattern is clearer here since we are using a queue.

Time complexity

We know that there are a total of N!N! permutations of a set with ‘N’ numbers. In the algorithm above, we are iterating through all of these permutations with the help of the two ‘for’ loops. In each iteration, we go through all the current permutations to insert a new number in them on line 17 (line 23 for C++ solution). To insert a number into a permutation of size ‘N’ will take O(N), which makes the overall time complexity of our algorithm O(N*N!).

Space complexity

All the additional space used by our algorithm is for the result list and the queue to store the intermediate permutations. If you see closely, at any time, we don’t have more than N! permutations between the result list and the queue. Therefore the overall space complexity to store N! permutations each containing N elements will be O(N*N!).

from collections import deque

def find_permutations(nums):
  result = []
  permutations = deque()
  permutations.append([])
  numsLength = len(nums)

  for currNum in nums:
    n = len(permutations)

    # For each Permutation we have so far...
    for _ in range(n):
      currentPerm = permutations.popleft()

      # Add in all positions (Indexes 0 though len(perm))
      for i in range(len(currentPerm) + 1):
        currList = list(currentPerm)
        currList.insert(i, currNum)

        # If we are at the last num, then add to results
        # Else, add to permutations queue
        if len(currList) == numsLength:
          result.append(currList)
        else:
          permutations.append(currList)

  
  return result


def main():
  print("Here are all the permutations: " + str(find_permutations([1, 3, 5])))


main()

Recursive Solution

def generate_permutations(nums):
  result = []
  generate_permutations_recursive(nums, 0, [], result)
  return result


def generate_permutations_recursive(nums, index, currentPermutation, result):
  if index == len(nums):
    result.append(currentPermutation)
  else:
    # create a new permutation by adding the current number at every position
    for i in range(len(currentPermutation)+1):
      newPermutation = list(currentPermutation)
      newPermutation.insert(i, nums[index])
      generate_permutations_recursive(
        nums, index + 1, newPermutation, result)


def main():
  print("Here are all the permutations: " + str(generate_permutations([1, 3, 5])))


main()

Letter Case Permutations

Grokking used a similar approach to subsets. To me, using DFS makes more sense.

def find_letter_case_string_permutations(str):
  permutations = []
  n = len(str)

  def dfs(index, curr):

    # Base Case
    if index == len(str):
      permutations.append(''.join(curr))
      return
    
    # Appending current Character as is
    dfs(index + 1, curr + [str[index]])

    # If character is alphanumeric, then append the other case
    if str[index].isalpha():
      dfs(index + 1, curr + [str[index].swapcase()])
  
  dfs(0, [])

  return permutations

Balanced Parentheses

Once again, recursive DFS makes most sense to me.

def generate_valid_parentheses(num):
  result = []
  

  def dfs(curr, openCount, closedCount):

    # Base Case
    # If openCount + closedCount = 2*N, append
    if openCount == num and closedCount == num:
      result.append(''.join(curr))
      return
    
    # If haven't maxed out our openCount, append open
    if openCount < num:
      dfs(curr + ['('], openCount + 1, closedCount)

    # If we have more open than closed, then append closed
    if openCount > closedCount:
      dfs(curr + [')'], openCount, closedCount + 1)

  dfs([], 0, 0)

  return result


def main():
  print("All combinations of balanced parentheses are: " +
        str(generate_valid_parentheses(2)))
  print("All combinations of balanced parentheses are: " +
        str(generate_valid_parentheses(3)))


main()

Unique Generalized Abbreviations

def generate_generalized_abbreviation(word):
  result = []

  def dfs(curr, index, abbrCount):

    # Base Case: If index = len(word), append result
    if index == len(word):
      if abbrCount != 0:
        curr.append(str(abbrCount))
      result.append(''.join(curr))
      return
    
    # Continue Abbreviating
    dfs(list(curr), index + 1, abbrCount + 1)

    # Finish previous abbreviation. Append current letter
    newWord = list(curr)
    if abbrCount != 0:
      newWord.append(str(abbrCount))

    dfs(newWord + [word[index]], index + 1, 0)

  dfs([], 0, 0)
  return result


def main():
  print("Generalized abbreviation are: " +
        str(generate_generalized_abbreviation("BAT")))
  print("Generalized abbreviation are: " +
        str(generate_generalized_abbreviation("code")))


main()

Different Ways to Add Parentheses

This one is not very straightforward to me. I get the logic, but would not have gotten this solution. Essentially, we loop through the input. For each operator, we make that operator the root of the expression tree. Then evaluate everything on left and right separately. This way, we get all combinations of the order for how the expressions are calculated.

Useful Diagram Explanation

Base case: If the expression has no symbol, i.e. its a number, then there's nothing to do. Just add to result. Recursive case: Split the expression at every symbol and evaluate the parts recursively. Why split at every symbol? Because its analogous to adding parantheses. e.g. 1 + 2 * 3 when split at +, parts are 1 and 2 * 3 which can be written as 1 + (2 * 3) when split at *, parts are 1 + 2 and 3 which can be written as (1 + 2) * 3

def diff_ways_to_evaluate_expression(input):
  m = {}
  return dfs(m, input)
  
  return result

def dfs(m, input):
  # First check map to see if expression is already solved
  if input in m:
    return m[input]

  # Base Case: Just a number (No more operators)
  if input.isdigit():
    return [int(input)]

  ret = []
  # Now loop though input.
  # If operator, recursively solve left and right expressions
  for i, c in enumerate(input):
    if c in ("+-*"):
      leftParts = dfs(m, input[:i])
      rightParts = dfs(m, input[i+1:])
      for left in leftParts:
        for right in rightParts:
          if c == "+":
            ret.append(left + right)
          elif c == "-":
            ret.append(left - right)
          else:
            ret.append(left * right)
  m[input] = ret
  return ret

Unique Binary Search Trees

Similar to above. Try each value as root. Recursively solve left and right by trying each value as "subroots" to get all combinations.

class TreeNode:
  def __init__(self, val):
    self.val = val
    self.left = None
    self.right = None

def dfs(start, end):
  result = []

  # Base Case:
  # If start > end, then subtree is None
  # Consider start == end. Then each recursive call should return None.
  # So, we will be left with a tree node with None for both left and right subtrees
  if start > end:
    result.append(None)
    return result

  # Now loop through all values, making 'i' the root of the tree
  for i in range(start, end + 1):

    # Get all possible left and right subtrees
    # For left, try each value at root (start, start+1, start+2, ... i-1)
    # Similar for right
    leftSubtrees = dfs(start, i - 1)
    rightSubtrees = dfs(i + 1, end)

    # Create all possible trees with i as root
    # Combine all possible left subtrees with all possible rightsubtrees
    for leftSubtree in leftSubtrees:
      for rightSubtree in rightSubtrees:
        root = TreeNode(i)
        root.left = leftSubtree
        root.right = rightSubtree
        result.append(root)

  return result

def find_unique_trees(n):
  
  return dfs(1, n)

def main():
  print("Total trees: " + str(len(find_unique_trees(2))))
  print("Total trees: " + str(len(find_unique_trees(3))))

main()

Count of Structurally Unique Binary Search Trees

class TreeNode:
  def __init__(self, val):
    self.val = val
    self.left = None
    self.right = None

# Instead of passing in (start, end), we can just pass in n
# Since (end - start) is all that matters for determining the number
# of unique subtrees
def count_trees_rec(m, n):
  # First check map
  if n in m:
    return m[n]

  # Base case: If just one value, then only one possibility
  # In this case, one root with no children
  if n <= 1:
    return 1

  count = 0
  for i in range(1, n + 1):
    leftCount = count_trees_rec(m, i - 1)
    rightCount = count_trees_rec(m, n - i)
    count += leftCount * rightCount
  
  m[n] = count
  return count


def count_trees(n):
  
  # Pass in empty map to store / reuse same calculations
  return count_trees_rec({}, n)
  


def main():
  print("Total trees: " + str(count_trees(2)))
  print("Total trees: " + str(count_trees(3)))


main()