Group Animals by Habitat - codepath/compsci_guides GitHub Wiki

TIP102 Unit 3 Session 1 Advanced (Click for link to problem statements)

U-nderstand

Understand what the interviewer is asking for by using test cases and questions about the problem.

  • Q: What is the input to the problem?
    • A: The input is a string habitats, where each character represents a species of animal in the sanctuary.
  • Q: What is the output?
    • A: The output is a list of integers, each representing the size of groups in which animals of the same species are grouped together by their habitats.
  • Q: What does it mean to group animals by habitat?
    • A: Each species should appear in at most one part, meaning that the group boundaries should be set so that no species crosses these boundaries.

P-lan

Plan the solution with appropriate visualizations and pseudocode.

General Idea: Track the last occurrence of each species in the string. As you iterate through the string, extend the current group to include all occurrences of each species encountered. When you reach the end of the current group, record its size and start a new group.

1. Calculate the last occurrence of each species in the `habitats` string and store it in a dictionary `last_occurrence`.
2. Initialize `start` and `end` pointers to track the current group, and an empty list `result` to store the sizes of the groups.
3. Iterate over the `habitats` string with an index `i`:
   1. Update the `end` pointer to the maximum of the current `end` and the last occurrence of the current species.
   2. If the current index `i` matches the `end` pointer, it means the current group is complete:
      * Append the size of the current group (`end - start + 1`) to `result`.
      * Move the `start` pointer to `i + 1` to begin the next group.
4. Return the `result` list containing the sizes of the habitat groups.

⚠️ Common Mistakes

  • Not correctly updating the end pointer, which may result in incorrect group sizes.
  • Forgetting to move the start pointer to the correct position after completing a group.
  • Failing to account for all occurrences of each species, leading to incorrect partitioning.

I-mplement

def group_animals_by_habitat(habitats):
    # Step 1: Calculate the last occurrence of each species
    last_occurrence = {species: i for i, species in enumerate(habitats)}
    
    # Step 2: Initialize pointers and result list
    start = 0
    end = 0
    result = []
    
    # Step 3: Iterate over the string using the end pointer
    for i, species in enumerate(habitats):
        # Update the end pointer to the furthest last occurrence of the current species
        end = max(end, last_occurrence[species])
        
        # Step 4: When current index reaches the end pointer, finalize the group
        if i == end:
            # Append the size of the current group to the result list
            result.append(end - start + 1)
            # Move the start pointer to the next character
            start = i + 1
    
    # Step 5: Return the list of group sizes
    return result

# Example usage
print(group_animals_by_habitat("ababcbacadefegdehijhklij"))  # Output: [9,7,8]
print(group_animals_by_habitat("eccbbbbdec"))                # Output: [10]