The Binary Search Tree - Eishaaya/GMRDataStructuresTest GitHub Wiki

What makes it Binary?

A binary search tree (BST) is binary tree that is balanced according to a key value. All of the nodes with key values less than the root are on the left side of the tree, and all of the nodes with key values greater than the root are on the right. Operations on a BST, such as search, insert, and delete, begin at the root and go until a leaf node is reached. Therefore, the running time for these operations are highly dependent on the height of the tree. When a tree is completely unbalanced, it has the maximum possible height:

This should look familiar to you, in that, it is essentially a linked list. Linked lists, or completely unbalanced BSTs where the height and the size are equal, have an average search running time of O(n). Contrarily, consider a BST that is completely balanced:

Performing a search on this tree will take time. As a reference, search on a balanced tree is orders of magnitude faster than a completely unbalanced tree. Whether the tree is fully balanced, semi-balanced, or unbalanced is based on the ordering of the data that was presented to the tree.

Binary trees can have various classifications:

  • Balanced: The heights of the left and right subtrees have a difference of no more than 1; the left and right subtrees must in turn be balanced.
  • Full: Every node in the tree has 0 or 2 children
  • Perfect: Every node, except the leaves, have two children and all leaves have the same depth

        Balanced BST           Full BST             Perfect BST

Cases for Duplicate Values

When working with data structures there are 5 common practices used to handle duplicate values being added to the structure.

  • Allow Duplicates
    • Usually duplicates are stored after the existing value (or in the case of trees they are added to the right sub-tree). During deletion, whichever value is found first is the one to be deleted.
  • Increment Count
    • This method adds an integer counter on the value container (node) so that if a duplicate value is encountered during insertion, you increment the count and end the function. During deletion you reduce the count unless the count is zero, in which case you remove the value container from the data structure.
  • Throw an Exception
    • This case does not allow duplicates. If a duplicate value is inserted, the function throws an exception. This is more prevalent in data structures that use key/value pairs since duplicate keys are fundamentally not allowed.
  • Return a Boolean
    • This case does not allow duplicates. The insertion/add method returns true if the value is inserted and false if a duplicate was encountered (and the duplicate is not added to the data structure).
  • Silent Failure (NEVER DO THIS!)
    • When a duplicate is encountered, it is not added to the data structure and the function ends. In this case, the user of the data structure does not know the result of the insert/add operation and it becomes more difficult to use the data structure. It is much better to inform the user about the result of the function (either throw an exception or return a Boolean indicator). Even if they don't intend to use the result. It's better to have it and not need it, than need it and not have it.

Implementation Guide


Search:

Search the tree for a node with the given key: While the current node is not null and the keys are not a match, set the current node to either its left or right child. Set it to the left child if the key is less than the current node’s key. Otherwise set it to the right child.

Minimum:

Shift over left as many times as possible to get the smallest element of a given sub-tree.

Maximum:

Shift over right as many times as possible to get the largest element of a given sub-tree.

IsLeftChild:

Determine if a given node is a left child.

IsRightChild:

Determine if a given node is a right child.

Insert:

Start with the root of the tree. If there is no root, set the root to the new node. Increment the size of the tree and return. If there is a root, traverse the tree until you find the proper locations for the new node. This can be done by going left or right (depending on how the current node’s key compares to the new node’s key. NOTE: if the key's are equal, traverse right until a leaf is reached. Current will stop right on the parent of the new node, but the new node does know this. So you must set the parent of the new node to the current. Finally, depending on whether the new node’s key is less than or greater than the current node, the new node is set as the left or right child of the current node, respectively.

Delete

Start at the root of the tree, and do an in-order traversal of the tree until the node is found. Then follow the following cases (always keep in mind any edge cases such as root, null, parent, and children connections):

If the node has no children: Delete the connection to the node. Here we delete 15:

If the node has 1 child: Move the child up by connecting the child to the parent. Here we delete 5:

If the node has 2 children: This is a special case that requires an interesting method. We don't want to upset the tree by moving one of it's children up because that child might have it's own children. Instead, we select the best candidate in the tree to replace the node we want to delete. In order to find the best candidate we select the left sub-tree's greatest value by moving left once, the keep moving right until there are no more right children (alternatively you can choose the right sub-tree's least value by going right once then left until no more left children). The reason this node is the best candidate is because it is the only node that keeps the integrity of the binary tree such that every node to the left of it will be smaller and every node to the right of it will be larger. Copy the value of the candidate into the node that you want to delete then delete the candidate from it's location in the tree. Here we delete 10:


Assignments to be done with a Binary Search Tree


Traversal: Visit each node in the graph using the following methods

  • Depth First Traversal
    • Visits each node by going as deep into the tree as possible before backing up and travelling down a new path.
    • Uses a Stack data structure.
    • Depth First Traversal comes in 3 flavors: Pre-order, In-order, and Post-order.
      • A good response on when to use each version of DFS can be found here.
- Pre-order (Root-Left-Right)
  - Create a stack and queue inside the function to help us with the traversal. Push the root to the stack and 
    continue looping while the stack is not empty. Pop from the stack and add that node to the queue (which is our output from the function). 
    Then push the popped node's left and right children to the stack. (Figure out which order!);
  - This order preserves the structure of the tree and is used for cloning the tree. 

  • In-order (Left-Root-Right)
    • Create a stack and queue inside the function to help us with the traversal. Move along the left side of the subtree as much as possible, once you can't move to the left anymore go to the right (of which node?). As you move down the left side of the sub tree (maybe with some sort of extra 'curr' node) push the 'curr' node to the stack so you can eventually "backtrack" once you can't move left anymore.
    • This is used to traverse the tree in sorted order.

  • Post-order (Left-Right-Root)
    • This traversal is extremely similar to the Pre-Order, rather than storing the nodes that are being popped into a Queue, use a Stack. (You may also have to change the order of pushing your children)
    • This traverses from the leaf nodes and goes up. Good for deleting the tree in a safe manner, if you are managing memory in such cases as C++.

  • Breadth First Traversal
    • Visits each node by travelling through the whole row of the tree before moving to the next row.
    • Uses a Queue data structure.
    • Enqueue the root then do the following: dequeue from the queue, calculate the value that was dequeued, enqueue both children (if they exist), repeat while the queue is not empty.
    • A Breadth First Traversal should traverse the tree in level-order.


References


<-- Previous | Next -->

⚠️ **GitHub.com Fallback** ⚠️