Binary search trees - Sam647254/Programetoj GitHub Wiki
A tree is a data structure that consists of a base node, which may have zero or more subnodes, which are called its children. Any child may also have children of its own, so a tree is a recursive data structure; that is, each child is also a tree. A node that does not have any children is a leaf. With in a tree, every node may hold a value.
A binary tree is a tree in which each node has up to two children, which are commonly called its left
and right
children.
class TreeNode:
def __init__(self, value, left=None, right=None):
self.value = value
self.left = left
self.right = right
Much like how a LinkedList
needs only a pointer to its first node, a binary tree just needs a pointer to its root node:
class BinarySearchTree:
def __init__(self):
self.root = None
A binary search tree (BST) is a binary tree in which the value of the left child is less than or equal to its parent node's value, and the right child is greater than or equal to its parent node.
4
/ \
2 5
/ \
1 3
The height of a tree is the number of nodes between the root node and the farthest leaf node, inclusive. In the example above, the tree has a height of 3. The height of a well balanced tree is O(log n), whereas the height of a skewed tree is O(n), as shown below, in a tree that has only left subtrees:
4
/
3
/
2
/
1
Operations on binary search trees
- Inserting a new node: Compare the new value to the root node's value. If the new value is less, insert it into the left subtree, otherwise into the right subtree. If the subtree is null, create a new leaf that will hold the new value.
class TreeNode:
# ...
def insert(self, new_value) =
if new_value <= self.value:
if self.left is None: # Insert a leaf
self.left = TreeNode(new_value)
else:
self.left.insert(new_value)
else:
if self.right is None:
self.right = TreeNode(new_value)
else:
self.right.insert(new_value)
class BinarySearchTree:
# ...
def insert(self, new_value):
if self.root is None:
self.root = TreeNode(new_value)
else:
self.root.insert(new_value)
- Checking for the existence of a node: Compare the wanted value against the root node's value. If they match, we have found the value (and return
True
); otherwise, if the wanted value is less than the root node, repeat this process in the left subtree. If it is greater, repeat in the right subtree. If the target subtree does not exist, the wanted value does not exist in tree (and returnFalse
).
class TreeNode:
# ...
def contains(self, value):
if value == self.value:
return True
elif value < self.value and self.left is not None:
return self.left.contains(value)
elif value > self.value and self.right is not None:
return selfg.right.contains(value)
else:
return False
class BinarySearchTree:
# ...
def contains(self, value):
return False if self.root is None else self.root.contains(value)
- Removing a node: The steps to removing a node varies based on how many children the node has. If it has no children (i.e. it is a leaf), then simply replace its parent's pointer to it with
None
. If it has one child, replace it with its child. If it has two children, find its successor value in its right subtree, replace the to-be-deleted node's value with the successor's value, and repeat this for the successor's node.
class TreeNode:
# ...
def remove(self, value):
if self.left is not None and self.left.value == value:
if self.left.is_leaf():
self.left = None
elif self.left.has_both_children():
self.left.replace_with_successor()
else:
self.right.replace_with_only_child()
if self.right is not None and self.right.value == value:
if self.right.is_leaf():
self.right = None
elif self.right.has_both_children():
self.right.replace_with_only_child()
else:
self.right.replace_with_only_child()
def replace_with_only_child(self):
only_child = self.left if self.left is not None else self.right
self.value = only_child.value
self.left = only_child.left
self.right = only_child.right
def find_successor(self):
if self.left is not None:
return self.left.find_successor()
else:
return self
def replace_with_successor(self):
successor = self.find_successor(self.right)
self.value = successor.value
if successor == self.right and self.right.is_leaf():
self.right = None
else:
self.right.remove(successor.value)
class BinarySearchTree:
# ...
def remove(self, value):
if self.root.is_leaf():
self.root = None
else self.root.remove(value)
- Traversing through a binary search tree: Three commonly encountered traversal orders are: pre-order, in-order, and post-order. In any of these orders, the left subtree is visited before the right subtree. The visiting of the base node is the only difference.
class TreeNode:
# ...
def preorder(self):
yield self.value
if self.left is not None:
yield from self.left.preorder()
if self.right is not None:
yield from self.right.preorder()
def inorder(self):
if self.left is not None:
yield from self.left.preorder()
yield self.value
if self.right is not None:
yield from self.right.preorder()
def postorder(self):
if self.left is not None:
yield from self.left.preorder()
if self.right is not None:
yield from self.right.preorder()
yield self.value
Note: Since an in-order traversal will give the values in an non-decreasing order, it can be used as a sorting algorithm, although its worst case performance is O(n²) in the case of a skewed tree.
The following additional operations are straightforward, so I left out the code for brevity.
- Checking if a binary search tree is valid: this means checking whether a binary tree has the property that each node's left child is less than or equal to it, and the right child is greater than or equal to it. A tree that has only one node or an empty tree would be valid.
- Checking the height of a tree: This involves comparing the heights of the left and right subtrees. An empty tree has a height of 0, and a tree with only a root node has a height of 1. The total height of a tree is 1 + (the greater height between the root node's two children)
- Checking for the size of a tree: An empty tree has a size of 0; a single root node has a size of 1; the total size of a tree is 1 + the sizes of of the left and right subtrees
Runtime efficiency
In a well-balanced tree, all operations (that do not need to visit the entire tree) are O(log n), but in a skewed tree, the worst case performance is O(n). Operations such as checking if a tree is valid or traversal will always take O(n) time, since they need to visit every node regardless of the tree's shape.