A binary search tree, **BST**, is a fundamental dynamic data structure used
almost everywhere in science computer. Due to the efficiency of the retrieving
and storing element, it's espacially used in storing systems.

A tree is composed of nodes, edges and leaves. Each node is composed of a key
value and possibly some associated data and is attached to two others nodes. A
leaf can be seen as an empty node. The graphical representation is an upside
down tree where the beginning is the topmost node, called the *root*.

### Some definitions

**Root** - The top node in a tree.

**Child** - A node directly connected to another node when moving away from the
Root.

**Parent** - The converse notion of a child.

**Siblings** - A group of nodes with the same parent.

**Descendant** - A node reachable by repeated proceeding from parent to child.

**Ancestor** - A node reachable by repeated proceeding from child to parent.

**Leaf** - A node with no children.

**Internal node** - A node with at least one child.

**Path** - A path is a sequence of nodes and edges connecting a node to a
descendant.

**Height** - The height of a node is the number of edges on the longest downward
between that node and a leaf.

**Depth** - The depth of a node is the number of edges from the node to the
tree's root node.

**Level** - The level of a node is defined by 1 + the number of connections
between the node and the root.

**Degenerate tree** - A tree where for each parent node, there is only one
associated child node.

**Balanced tree** - A tree where leaves' heights differ by at most one.

The BST is a tree ruled by some characteristics: it keeps its keys in a certain order. Here is the specific rules:

For every node `N`

in the tree

- all keys in
`N`

's left subtree are less than the key in`N`

- all keys in
`N`

's right subtree are greater than the key in`N`

Figure 1 - A correct Binary Search Tree

Figure 2 - Not a BST because 5 in the right subtree of 6 is lower.

As you can see, this is a recursive definition. Hence, if we take a node, for instance 3 and all its descendants, it's a new smaller BST which fulfills the same rules. Due to this fact, functions can be applied recursively.

As all keys are sorted, when looking for a key in the tree, only one branch has to be explored. At every node, a comparison will be performed to know if the key is in the right or left subtree. On average, for every step, we skip half of the tree. Lookup, insertion, deletion have a complexity proportional to the logarithm of the number of items in the tree. The tree is not rearranged when a key is inserted or deleted, and that's why it may end up unbalanced.

Figure 3 - Worst case: a degenerate tree

Average | Worst case | |
---|---|---|

Space | O(n) | O(n) |

Search | O(log n) | O(n) |

Insert | O(log n) | O(n) |

Delete | O(log n) | O(n) |

#### Advantages

The reason why binary-search trees are used is that the following operations can be implemented efficiently using a BST when speaking of a sorted data structure.

- insert a key value
- determine whether a key value is in the tree
- remove a key value from the tree
- inorder traversial, for example: print all of the key values in sorted order

#### Disadvantages

It's unlikely that the tree will be properly balanced as it depends on the order of insertions and deletions.

- the shape of the tree depends on insertion and deletion order
- the insertions or deletions need to compare each key of visited node with the key of the element to be inserted/deleted

### Implementation

#### Structure

The heart of the BSTs is in the node's structure. For the simplicity, we didn't associate data in the node but you could add anything.

```
struct node {
int key;
node *left; // begin of the left subtree
node *right; // begin of the right subtree
node(int k): key(k) { // constructor
left = NULL;
right = NULL;
}
};
```

Note that if `left`

and `right`

pointers are not set to NULL, the condition
`left == NULL`

will not be triggered. That's why a constructor was added to
force these assignments.

#### Class

```
class binaryTree {
public:
binaryTree();
~binaryTree();
void insert(int key);
bool search(int key);
void destroyTree();
int countNode();
void print();
private:
void destroyTree(node *n);
void insert(int key, node *n);
bool search(int key, node *n);
int countNodes(node *n);
void print(node *n);
node *root;
};
```

#### Destroy tree

```
void binaryTree::destroyTree(node *n) {
if (n != NULL) {
destroyTree(n->right);
destroyTree(n->left);
delete n;
}
}
void binaryTree::destroyTree() {
destroyTree(root);
}
binaryTree::~binaryTree() {
destroyTree();
}
```

`destroyTree`

begins from the bottom of the tree and goes up. It frees all the
tree's used memory.

#### Insertion

```
void binaryTree::insert(int key, node *&n) {
if (n == NULL) {
n = new node(key);
return;
}
if (n->key < key) {
insert(key, n->left);
} else {
insert(key, n->right);
}
}
void binaryTree::insert(int key) {
insert(key, root);
}
```

#### Search

```
bool binaryTree::search(int key, node *n) {
if (n == NULL) {
return false;
}
if (n->key == key) {
return true;
}
return search(key, n->left) || search(key, n->right);
}
bool binaryTree::search(int key) {
return search(key, root);
}
```

#### Count nodes

```
int binaryTree::countNodes(node *n) {
if (n == NULL) {
return 0;
}
return 1 + countNodes(n->left) + countNodes(n->right);
}
int binaryTree::countNodes() {
return countNodes(root);
}
```

## Conclusion

#### How could we improve efficienty of BST?

To improve efficienty of BST, we have to keep the tree balanced. For every insertion and deletion operations, we could rotate the nodes in order to keep the same height on the left and right subtree. Moreover, to reduce time comparison of each visited node, we could use radix tree.

Next post: self-balancing binary search tree.