Find Most Common Plants in Collection - codepath/compsci_guides GitHub Wiki

Unit 8 Session 2 Advanced

Problem Highlights

  • 💡 Difficulty: Medium-Hard
  • Time to complete: 25-30 mins
  • 🛠️ Topics: Trees, Binary Search Trees, Inorder Traversal

1: U-nderstand

Understand what the interviewer is asking for by using test cases and questions about the problem.

  • Establish a set (2-3) of test cases to verify your solution later.
  • Establish a set (1-2) of edge cases to verify that your solution handles complexities.
  • Make sure you fully understand the problem and have no remaining clarifying questions.
  • Have you verified any Time/Space Constraints for this problem?
  • What is the structure of the tree?
    • The tree is a Binary Search Tree (BST) where each node represents a plant, and duplicates are allowed. In the examples on this page, a duplicate is stored in the left subtree of the matching node.
  • What operation needs to be performed?
    • The function needs to find the most frequently occurring plant(s) in the BST.
  • What should be returned?
    • The function should return a list of the most frequently occurring plant(s). If there are ties, the list should include all tied plants.
HAPPY CASE
Input: 
    collection1 = TreeNode("Hoya", None, TreeNode("Pothos", TreeNode("Pothos")))
Output: 
    ["Pothos"]
Explanation: 
    "Pothos" appears twice, which is more frequent than any other plant.

EDGE CASE
Input: 
    collection2 = TreeNode("Hoya", TreeNode("Aloe", TreeNode("Aloe")), TreeNode("Pothos", TreeNode("Pothos")))
Output: 
    ["Aloe", "Pothos"]
Explanation: 
    Both "Aloe" and "Pothos" appear twice, so both are returned in the list.

2: M-atch

Match what this problem looks like to known categories of problems, e.g., Linked List or Dynamic Programming, and strategies or patterns in those categories.

For Binary Search Tree (BST) problems, we want to consider the following approaches:

  • Inorder Traversal: An inorder traversal of a BST visits the nodes in sorted order by value, so duplicate plants are visited one after another. Each plant can therefore be counted with a single running counter instead of a hash map of counts.
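
To see why sorted order helps, here is a minimal sketch, assuming the `TreeNode` shape used on this page (`val`, `left`, `right`). The helper names `inorder_values` and `run_lengths`, and the use of `itertools.groupby`, are illustrative choices rather than part of the guide's solution; they only show that equal plants come out of the traversal as adjacent runs.

```python
from itertools import groupby


class TreeNode:
    def __init__(self, value, left=None, right=None):
        self.val = value
        self.left = left
        self.right = right


def inorder_values(node):
    """Yield node values in inorder (left, node, right) order."""
    if node is None:
        return
    yield from inorder_values(node.left)
    yield node.val
    yield from inorder_values(node.right)


def run_lengths(root):
    """Return (value, count) pairs, one per run of equal adjacent values."""
    return [(val, sum(1 for _ in run)) for val, run in groupby(inorder_values(root))]


collection = TreeNode("Hoya", TreeNode("Aloe", TreeNode("Aloe")),
                      TreeNode("Pothos", TreeNode("Pothos")))
print(list(inorder_values(collection)))  # ['Aloe', 'Aloe', 'Hoya', 'Pothos', 'Pothos']
print(run_lengths(collection))           # [('Aloe', 2), ('Hoya', 1), ('Pothos', 2)]
```

Because each plant forms exactly one run, the longest run length is the highest frequency in the tree.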

3: P-lan

Plan the solution with appropriate visualizations and pseudocode.

General Idea: Perform an inorder traversal of the BST to visit the nodes in sorted order. As we traverse, count the length of each run of equal plants and track the maximum frequency observed. When a run exceeds the current maximum, replace the result list with that plant; when a run ties the current maximum, append the plant to the list.

1) Initialize variables:
    - `current_val`: The current plant value being counted.
    - `current_count`: The count of the current plant value.
    - `max_count`: The maximum count observed so far.
    - `most_common`: The list of plants with the highest frequency.

2) Define a helper function `inorder(node)` to perform an inorder traversal:
    - If `node` is `None`, return.
    - Recursively traverse the left subtree.
    - Process the current node:
        - If the current node's value is equal to `current_val`, increment `current_count`.
        - Otherwise, reset `current_val` to the current node's value and set `current_count` to 1.
        - If `current_count` is greater than `max_count`, update `max_count` and reset `most_common` to `[current_val]`.
        - If `current_count` is equal to `max_count`, append `current_val` to `most_common`.
    - Recursively traverse the right subtree.

3) Call `inorder(root)` starting from the root of the tree.
4) Return `most_common`.

⚠️ Common Mistakes

  • Not correctly handling consecutive duplicate nodes in the BST.
  • Forgetting to update the result list when a new maximum frequency is encountered.

4: I-mplement

Implement the code to solve the algorithm.

class TreeNode:
    def __init__(self, value, left=None, right=None):
        self.val = value
        self.left = left
        self.right = right

def find_most_common(root):
    """Return the most frequent value(s) in a BST, in sorted order."""
    if not root:
        return []

    # Initialize the running state shared with the helper
    current_val = None
    current_count = 0
    max_count = 0
    most_common = []

    # Helper function for inorder traversal
    def inorder(node):
        nonlocal current_val, current_count, max_count, most_common
        if not node:
            return

        # Traverse the left subtree
        inorder(node.left)

        # Process current node: extend the current run or start a new one
        if node.val == current_val:
            current_count += 1
        else:
            current_val = node.val
            current_count = 1

        # A longer run replaces the result; a tie is added to it
        if current_count > max_count:
            max_count = current_count
            most_common = [current_val]
        elif current_count == max_count:
            most_common.append(current_val)

        # Traverse the right subtree
        inorder(node.right)

    # Start inorder traversal
    inorder(root)

    return most_common

5: R-eview

Review the code by running specific example(s) and recording values (watchlist) of your code's variables along the way.

- Example 1:
    - Input: 
        `collection1 = TreeNode("Hoya", None, TreeNode("Pothos", TreeNode("Pothos")))`
    - Execution: 
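        - Perform inorder traversal: "Hoya" -> "Pothos" -> "Pothos".
        - After "Hoya": `current_count = 1`, `max_count = 1`, `most_common = ["Hoya"]`.
        - After the first "Pothos": `current_count = 1` ties `max_count`, so `most_common = ["Hoya", "Pothos"]`.
        - After the second "Pothos": `current_count = 2` exceeds `max_count`, so `max_count = 2` and `most_common` is reset to `["Pothos"]`.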
    - Output: 
        `["Pothos"]`
- Example 2:
    - Input: 
        `collection2 = TreeNode("Hoya", TreeNode("Aloe", TreeNode("Aloe")), TreeNode("Pothos", TreeNode("Pothos")))`
    - Execution: 
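        - Perform inorder traversal: "Aloe" -> "Aloe" -> "Hoya" -> "Pothos" -> "Pothos".
        - After the second "Aloe": `current_count = 2`, `max_count = 2`, `most_common = ["Aloe"]`.
        - "Hoya" and the first "Pothos" each reach only `current_count = 1`, so nothing changes.
        - After the second "Pothos": `current_count = 2` ties `max_count`, so "Pothos" is appended to `most_common`.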
    - Output: 
        `["Aloe", "Pothos"]`
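
One way to double-check the expected outputs above is to compare them against an independent count. The following is a rough sketch, not the guide's solution: it swaps the run-counting technique for a hash map (`collections.Counter`), and the function name `most_common_by_counter` is made up for illustration. It assumes the same `TreeNode` shape as the implementation and sorts the result so that it matches the inorder (sorted) order the BST solution produces.

```python
from collections import Counter


class TreeNode:
    def __init__(self, value, left=None, right=None):
        self.val = value
        self.left = left
        self.right = right


def most_common_by_counter(root):
    """Count every value with a hash map, then keep the values with the top count."""
    counts = Counter()
    stack = [root] if root else []
    while stack:
        node = stack.pop()
        counts[node.val] += 1
        if node.left:
            stack.append(node.left)
        if node.right:
            stack.append(node.right)
    if not counts:
        return []
    best = max(counts.values())
    return sorted(val for val, count in counts.items() if count == best)


collection1 = TreeNode("Hoya", None, TreeNode("Pothos", TreeNode("Pothos")))
collection2 = TreeNode("Hoya", TreeNode("Aloe", TreeNode("Aloe")),
                       TreeNode("Pothos", TreeNode("Pothos")))
print(most_common_by_counter(collection1))  # ['Pothos']
print(most_common_by_counter(collection2))  # ['Aloe', 'Pothos']
```

This check works on any binary tree, but it stores up to one counter entry per distinct plant, which is the extra space the inorder approach avoids.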

6: E-valuate

Evaluate the performance of your algorithm and state any strong/weak or future potential work.

Time Complexity:

  • Time Complexity: O(N) where N is the number of nodes in the tree.
    • Explanation: We visit each node exactly once during the inorder traversal.

Space Complexity:

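  • Space Complexity: O(H) auxiliary space where H is the height of the tree, not counting the returned list.
    • Explanation: The recursion stack will use space proportional to the height H of the tree. In a balanced tree, H is O(log N), but in the worst case (skewed tree), it could be O(N). The returned list itself can hold up to N plants when every plant appears equally often, for example when all plants are distinct.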
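
As possible future work for the skewed-tree case, the recursion can be replaced with an explicit stack. The sketch below is an assumption-laden variant, not the guide's solution: the name `find_most_common_iterative` is hypothetical, and it assumes the same `TreeNode` shape and counting rules as the implementation above. The stack still holds up to H nodes, so the space bound is unchanged, but a deep tree no longer risks hitting Python's recursion limit.

```python
class TreeNode:
    def __init__(self, value, left=None, right=None):
        self.val = value
        self.left = left
        self.right = right


def find_most_common_iterative(root):
    """Iterative inorder traversal that counts runs of equal values."""
    most_common = []
    current_val, current_count, max_count = None, 0, 0
    stack, node = [], root

    while stack or node:
        # Walk as far left as possible, remembering the path
        while node:
            stack.append(node)
            node = node.left
        node = stack.pop()

        # Process the node: extend the current run or start a new one
        if node.val == current_val:
            current_count += 1
        else:
            current_val, current_count = node.val, 1

        # A longer run replaces the result; a tie is added to it
        if current_count > max_count:
            max_count = current_count
            most_common = [current_val]
        elif current_count == max_count:
            most_common.append(current_val)

        # Continue with the right subtree
        node = node.right

    return most_common


collection2 = TreeNode("Hoya", TreeNode("Aloe", TreeNode("Aloe")),
                       TreeNode("Pothos", TreeNode("Pothos")))
print(find_most_common_iterative(collection2))  # ['Aloe', 'Pothos']
```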