Protein Folding Loop Detection - codepath/compsci_guides GitHub Wiki
TIP102 Unit 6 Session 1 Advanced
Problem Highlights
- 💡 Difficulty: Medium
- ⏰ Time to complete: 20-30 mins
- 🛠️ Topics: Linked Lists, Cycle Detection, Slow and Fast Pointer Technique
1: U-nderstand
Understand what the interviewer is asking for by using test cases and questions about the problem.
- Q: What does the problem ask for?
- A: The problem asks to detect if a cycle exists in a linked list and return an array of the values of the nodes involved in the cycle.
- Q: What approach can be used?
- A: The slow and fast pointer technique can be used to detect the cycle and then identify the nodes in the cycle.
HAPPY CASE
Input: protein_head = Node('Ala', Node('Gly', Node('Leu', Node('Val')))), with Val pointing back to Gly
Output: ['Gly', 'Leu', 'Val']
Explanation: The linked list has a cycle that includes the nodes 'Gly', 'Leu', and 'Val'.
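The constructor notation in the input cannot express the back-link from Val to Gly, so the cycle has to be wired up after the nodes exist. A minimal sketch, assuming the `Node` class from the Implement section (repeated here so the snippet runs by itself); the intermediate variable names `val`, `leu`, and `gly` are illustrative only.

```python
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

# Build the nodes back to front so each one can link to its successor.
val = Node('Val')
leu = Node('Leu', val)
gly = Node('Gly', leu)
protein_head = Node('Ala', gly)

# Close the loop: Val points back to Gly, giving Gly -> Leu -> Val -> Gly.
val.next = gly
```

Note that 'Ala' sits on the tail leading into the cycle and is not part of it, which is why it is absent from the expected output.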
EDGE CASE
Input: protein_head = None
Output: []
Explanation: An empty linked list has no cycle.
EDGE CASE
Input: protein_head = Node('Ala')
Output: []
Explanation: A single-node list with no cycle should return an empty list.
2: M-atch
Match what this problem looks like to known categories of problems, e.g. Linked List or Dynamic Programming, and strategies or patterns in those categories.
For Linked List problems involving Cycle Detection and Cycle Node Identification, we want to consider the following approaches:
- Two Pointers (Slow and Fast Pointer Technique): Use two pointers to first detect the cycle and then identify the nodes involved in the cycle.
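The detection half of this pattern can be sketched in isolation. The helper name `has_cycle` is hypothetical and not part of the original solution; it only answers whether a cycle exists, without identifying its nodes.

```python
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def has_cycle(head):
    """Return True if following `next` links from `head` ever revisits a node."""
    slow = fast = head
    while fast and fast.next:
        slow = slow.next          # one step
        fast = fast.next.next     # two steps
        if slow is fast:          # compare node identity, not stored values
            return True
    return False                  # fast ran off the end, so the list terminates
```

Comparing by identity matters here because a protein chain can contain the same amino acid value in several distinct nodes.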
3: P-lan
Plan the solution with appropriate visualizations and pseudocode.
General Idea: We will use the slow and fast pointer technique to first detect if a cycle exists. Once a cycle is detected, we will determine the start of the cycle and then collect all the nodes involved in the cycle.
1) Initialize two pointers, `slow` and `fast`, both pointing to the `head` of the list.
2) Traverse the list:
a) Move the `slow` pointer by one step.
b) Move the `fast` pointer by two steps.
c) If `slow` and `fast` pointers meet, a cycle is detected.
3) If a cycle is detected:
a) Reset one pointer (`slow`) to the start of the list.
b) Move both pointers one step at a time until they meet; this meeting point is the start of the cycle.
4) Traverse the cycle starting from the start of the cycle (the meeting point found in step 3b) and collect the values of all nodes until the traversal returns to that start node.
5) Return the list of cycle node values.
6) If no cycle is detected, return an empty list.
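Step 3 works because of a distance argument: if the list has `a` nodes before the cycle, the cycle has `c` nodes, and the pointers first meet `b` nodes past the cycle start, then the slow pointer has moved `a + b` steps and the fast pointer twice that, so `a + b` is a multiple of `c`. Moving `a` more steps from the meeting point therefore lands exactly on the cycle start. The sketch below checks this claim by simulating the pointers on node positions rather than real nodes; `meeting_offset` and its parameters are hypothetical names used only for this illustration.

```python
def meeting_offset(a, c):
    """Simulate the slow/fast pointers on a list with `a` nodes before the
    cycle and `c` nodes in the cycle (c >= 1). Return how many nodes past
    the cycle start the two pointers first meet."""
    def step(pos):
        # Positions run 0 .. a+c-1; the last position links back to position a.
        return pos + 1 if pos + 1 < a + c else a

    slow = fast = 0
    while True:
        slow = step(slow)
        fast = step(step(fast))
        if slow == fast:
            return slow - a

# Happy case shape: one tail node ('Ala') and a three-node cycle.
b = meeting_offset(1, 3)      # 2, i.e. the pointers meet on 'Val'
assert (1 + b) % 3 == 0       # a + b is a multiple of c
```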
⚠️ Common Mistakes
- Failing to handle cases where the list is empty or contains only one node.
- Incorrectly identifying or collecting the nodes involved in the cycle.
4: I-mplement
Implement the code to solve the algorithm.
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def cycle_length(protein):
    if not protein:
        return []

    slow = protein
    fast = protein

    # Step 1: Detect the cycle using slow and fast pointers
    while fast and fast.next:
        slow = slow.next
        fast = fast.next.next
        if slow == fast:
            break
    else:
        return []  # No cycle detected

    # Step 2: Find the start of the cycle
    slow = protein
    while slow != fast:
        slow = slow.next
        fast = fast.next

    # Step 3: Collect all nodes in the cycle
    cycle_nodes = []
    start_of_cycle = slow
    while True:
        cycle_nodes.append(slow.value)
        slow = slow.next
        if slow == start_of_cycle:
            break

    return cycle_nodes
5: R-eview
Review the code by running specific example(s) and recording values (watchlist) of your code's variables along the way.
- Example: Use the provided `protein_head` linked list to verify that the function correctly identifies and returns the nodes involved in the cycle.
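One way to produce the watchlist is to instrument the algorithm so it records both pointers after every move. The sketch below is an illustrative variant, not the reference solution: `trace_cycle` is a hypothetical name, and it returns the cycle values together with a log of `(phase, slow, fast)` entries.

```python
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def trace_cycle(head):
    """Return (cycle_values, watch), where watch logs each pointer move."""
    watch = []
    slow = fast = head
    while fast and fast.next:
        slow, fast = slow.next, fast.next.next
        watch.append(('detect', slow.value, fast.value if fast else None))
        if slow is fast:
            break
    else:
        return [], watch              # fast reached the end: no cycle

    slow = head
    while slow is not fast:
        slow, fast = slow.next, fast.next
        watch.append(('locate', slow.value, fast.value))

    values, node = [], slow
    while True:
        values.append(node.value)
        node = node.next
        if node is slow:
            break
    return values, watch

# Happy case: Ala -> Gly -> Leu -> Val -> back to Gly
val = Node('Val')
gly = Node('Gly', Node('Leu', val))
val.next = gly
protein_head = Node('Ala', gly)

values, watch = trace_cycle(protein_head)
# values: ['Gly', 'Leu', 'Val']
# watch:  detect (Gly, Leu), detect (Leu, Gly), detect (Val, Val), locate (Gly, Gly)
```

The log shows the pointers first meeting on 'Val' and then converging on 'Gly', the start of the cycle, after a single step from the head.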
6: E-valuate
Evaluate the performance of your algorithm and state any strong/weak or future potential work.
Assume `N` represents the number of nodes in the linked list.
- Time Complexity: `O(N)` because each of the three phases (detecting the cycle, locating its start, and collecting its nodes) runs at most `N` loop iterations.
- Space Complexity: `O(1)` for cycle detection, but the space to store the cycle nodes is `O(K)`, where `K` is the length of the cycle.
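The linear bound can be checked empirically by counting loop iterations per phase. This is a rough sketch under stated assumptions: `build` and `count_steps` are hypothetical helpers written for this illustration, and the counts are loop iterations rather than wall-clock time.

```python
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def build(tail_len, cycle_len):
    """Build `tail_len` nodes leading into a cycle of `cycle_len` nodes (>= 1)."""
    nodes = [Node(i) for i in range(tail_len + cycle_len)]
    for current, following in zip(nodes, nodes[1:]):
        current.next = following
    nodes[-1].next = nodes[tail_len]   # close the cycle
    return nodes[0]

def count_steps(head):
    """Return loop iterations spent in the (detect, locate, collect) phases."""
    detect = locate = collect = 0
    slow = fast = head
    while fast and fast.next:
        slow, fast = slow.next, fast.next.next
        detect += 1
        if slow is fast:
            break
    else:
        return detect, 0, 0            # no cycle found

    slow = head
    while slow is not fast:
        slow, fast = slow.next, fast.next
        locate += 1

    node = slow
    while True:
        node = node.next
        collect += 1
        if node is slow:
            break
    return detect, locate, collect

# Same shape as the happy case: N = 4 nodes, tail of 1, cycle of K = 3.
print(count_steps(build(1, 3)))        # (3, 1, 3)
```

In this sketch the locate phase takes exactly as many iterations as there are tail nodes and the collect phase exactly `K`, so the total stays within a small constant multiple of `N`.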