Find Most Frequent Keywords - codepath/compsci_guides GitHub Wiki
Unit 4 Session 2 Advanced (Click for link to problem statements)
U-nderstand
Understand what the interviewer is asking for by using test cases and questions about the problem.
- Q: What is the goal of the problem?
- A: The goal is to identify the most frequently used keywords across all scenes in a dictionary.
- Q: What are the inputs?
- A: The input is a dictionary where the keys are scene names and the values are lists of keywords used in each scene.
- Q: What are the outputs?
- A: The output is a list of keywords that appear most frequently across all scenes. If there is a tie, return all the keywords with the highest frequency.
- Q: How should ties be handled?
- A: If multiple keywords have the same highest frequency, return all of them.
- Q: Are there any assumptions about the input?
- A: The input dictionary is well-formed, with string keys and values being lists of strings.
P-lan
Plan the solution with appropriate visualizations and pseudocode.
General Idea: Use a dictionary to count the frequency of each keyword across all scenes. Then, identify the maximum frequency and return all keywords that have this frequency.
1) Initialize an empty dictionary `keyword_freq` to store the frequency of each keyword.
2) Iterate over each list of keywords in the `scenes` dictionary:
a) For each keyword, increment its count in `keyword_freq`.
3) Identify the maximum frequency in `keyword_freq`.
4) Create a list `most_frequent_keywords` containing all keywords with the maximum frequency.
5) Return `most_frequent_keywords`.
**⚠️ Common Mistakes**
- Forgetting to handle ties correctly, leading to only one keyword being returned instead of all keywords with the highest frequency.
- Not correctly updating the frequency of keywords, which could lead to inaccurate results.
- Assuming the input dictionary is always non-empty or well-formed without considering edge cases.
I-mplement
def find_most_frequent_keywords(scenes):
# Create a dictionary to store keyword frequencies
keyword_freq = {}
# Iterate over each scene and update keyword frequencies
for keywords in scenes.values():
for keyword in keywords:
if keyword in keyword_freq:
keyword_freq[keyword] += 1
else:
keyword_freq[keyword] = 1
# Find the maximum frequency
max_freq = max(keyword_freq.values())
# Find all keywords with the maximum frequency
most_frequent_keywords = [keyword for keyword, freq in keyword_freq.items() if freq == max_freq]
return most_frequent_keywords
Example Usage:
scenes = {
"Scene 1": ["action", "hero", "battle"],
"Scene 2": ["hero", "action", "quest"],
"Scene 3": ["battle", "strategy", "hero"],
"Scene 4": ["action", "strategy"]
}
print(find_most_frequent_keywords(scenes))
# Output: ['action', 'hero']
scenes = {
"Scene A": ["love", "drama"],
"Scene B": ["drama", "love"],
"Scene C": ["comedy", "love"],
"Scene D": ["comedy", "drama"]
}
print(find_most_frequent_keywords(scenes))
# Output: ['love', 'drama']