Validating HTML Tags - codepath/compsci_guides GitHub Wiki
Unit 4 Session 2 Standard (Click for link to problem statements)
Understand what the interviewer is asking for by using test cases and questions about the problem.
- Q: What is the goal of the problem?
- A: The goal is to determine if a string of HTML-like tags is properly nested and closed.
- Q: What are the inputs?
- A: The input is a string containing HTML-like tags (e.g.,
<div>
,</div>
).
- A: The input is a string containing HTML-like tags (e.g.,
- Q: What are the outputs?
- A: The output is a boolean value:
True
if the tags are properly nested and closed,False
otherwise.
- A: The output is a boolean value:
- Q: What assumptions are made about the tags?
- A: Tags are well-formed and can be nested but not improperly overlapped (e.g.,
<div><p></div></p>
is invalid).
- A: Tags are well-formed and can be nested but not improperly overlapped (e.g.,
- Q: How should nested tags be handled?
- A: Tags should be checked to ensure they are nested properly with corresponding closing tags in the correct order.
Plan the solution with appropriate visualizations and pseudocode.
General Idea: Use a stack to track opening tags. As you iterate through the string, push opening tags onto the stack. For closing tags, check if they match the tag on top of the stack. If they match, pop the stack; otherwise, return False
. At the end, if the stack is empty, return True
; otherwise, return False
.
1) Initialize an empty stack to keep track of opening tags.
2) Iterate through the `html` string using a loop.
a) When an opening `<` is found, identify the tag.
b) If the tag is an opening tag (does not start with `/`), push it onto the stack.
c) If the tag is a closing tag (starts with `/`), check if the stack is not empty and if the top of the stack matches the corresponding opening tag.
d) If it matches, pop the stack; otherwise, return `False`.
3) After the loop, if the stack is empty, return `True`; otherwise, return `False`.
**⚠️ Common Mistakes**
- Forgetting to check for mismatched or improperly nested tags.
- Not correctly handling cases where the stack is empty when a closing tag is encountered.
- Assuming that the input string is always valid without accounting for edge cases.
def validate_html_tags(html):
stack = []
i = 0
while i < len(html):
if html[i] == '<':
j = i + 1
while j < len(html) and html[j] != '>':
j += 1
tag = html[i+1:j]
if not tag.startswith('/'):
# It's an opening tag, push onto stack
stack.append(tag)
else:
# It's a closing tag, pop from stack and check
if not stack or stack[-1] != tag[1:]:
return False
stack.pop()
i = j
i += 1
# If stack is empty, all tags were properly closed
return len(stack) == 0
Example Usage:
html = "<div><p></p></div>"
print(validate_html_tags(html)) # Output: True
html_2 = "<div><p></div></p>"
print(validate_html_tags(html_2)) # Output: False
html_3 = "<div><p><a></a></p></div>"
print(validate_html_tags(html_3)) # Output: True
html_4 = "<div><p></a></p></div>"
print(validate_html_tags(html_4)) # Output: False