A binary search tree is a special kind of binary tree (a tree in which each node has at most two children) that performs insertions and deletions such that the tree is always sorted.
If you don't know what a tree is or what it is for, then read this first.
Here is an example of a valid binary search tree:
Notice how each left child is smaller than its parent node, and each right child is greater than its parent node. This is the key feature of a binary search tree.
For example, 2 is smaller than 7, so it goes on the left; 5 is greater than 2, so it goes on the right.
When performing an insertion, we first compare the new value to the root node. If the new value is smaller, we take the left branch; if greater, we take the right branch. We work our way down the tree this way until we find an empty spot where we can insert the new value.
Say we want to insert the new value 9:
- We start at the root of the tree (the node with the value 7) and compare it to the new value 9.
- 9 > 7, so we go down the right branch and repeat the same procedure but this time on node 10.
- Because 9 < 10, we go down the left branch.
- We've now arrived at a point where there are no more values to compare with. A new node for 9 is inserted at that location.
The tree now looks like this:
There is always only one possible place where the new element can be inserted in the tree. Finding this place is usually pretty quick. It takes O(h) time, where h is the height of the tree.
Note: The height of a node is the number of steps it takes to go from that node to its lowest leaf. The height of the entire tree is the distance from the root to the lowest leaf. Many of the operations on a binary search tree are expressed in terms of the tree's height.
By following this simple rule -- smaller values on the left, larger values on the right -- we keep the tree sorted in a way such that whenever we query it, we can quickly check if a value is in the tree.
To find a value in the tree, we essentially perform the same steps as with insertion:
- If the value is less than the current node, then take the left branch.
- If the value is greater than the current node, take the right branch.
- And if the value is equal to the current node, we've found it!
Like most tree operations, this is performed recursively until either we find what we're looking for, or run out of nodes to look at.
If we were looking for the value 5 in the example, it would go as follows:
Thanks to the structure of the tree, searching is really fast. It runs in O(h) time. If you have a well-balanced tree with a million nodes, it only takes about 20 steps to find anything in this tree. (The idea is very similar to binary search in an array.)
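The "about 20 steps" figure follows from the height of a well-balanced tree, which is roughly log2(n). A quick sketch of that arithmetic (the names nodeCount and approximateSteps are made up for this illustration):

```swift
import Foundation

// A well-balanced binary tree with n nodes has roughly log2(n) levels,
// and a search visits at most one node per level.
let nodeCount = 1_000_000.0
let approximateSteps = Int(log2(nodeCount).rounded(.up))
print(approximateSteps) // prints 20
```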
Sometimes you don't want to look at just a single node, but at all of them.
There are three ways to traverse a binary tree, all of them depth-first:
- In-order: first look at the left child of a node, then at the node itself, and finally at its right child.
- Pre-order: first look at a node, then its left and right children.
- Post-order: first look at the left and right children and process the node itself last.
Once again, this happens recursively.
If you traverse a binary search tree in-order, it looks at all the nodes as if they were sorted from low to high. For the example tree, it would print 1, 2, 5, 7, 9, 10:
Removing nodes is also easy. After removing a node, we replace it with either the biggest node in its left subtree or the smallest node in its right subtree. That way the tree is still sorted after the removal. In the following example, 10 is removed and replaced with either 9 (Figure 2), or 11 (Figure 3).
Note the replacement needs to happen when the node has at least one child. If it has no child, you just disconnect it from its parent:
So much for the theory. Let's see how we can implement a binary search tree in Swift. There are different approaches you can take. First, I'll show you how to make a class-based version but we'll also look at how to make one using enums.
Here's a first stab at a BinarySearchTree class:
public class BinarySearchTree<T: Comparable> {
  private(set) public var value: T
  private(set) public var parent: BinarySearchTree?
  private(set) public var left: BinarySearchTree?
  private(set) public var right: BinarySearchTree?

  public init(value: T) {
    self.value = value
  }

  public var isRoot: Bool {
    return parent == nil
  }

  public var isLeaf: Bool {
    return left == nil && right == nil
  }

  public var isLeftChild: Bool {
    return parent?.left === self
  }

  public var isRightChild: Bool {
    return parent?.right === self
  }

  public var hasLeftChild: Bool {
    return left != nil
  }

  public var hasRightChild: Bool {
    return right != nil
  }

  public var hasAnyChild: Bool {
    return hasLeftChild || hasRightChild
  }

  public var hasBothChildren: Bool {
    return hasLeftChild && hasRightChild
  }

  // Number of nodes in the subtree rooted at this node, including the node itself.
  public var count: Int {
    return (left?.count ?? 0) + 1 + (right?.count ?? 0)
  }
}
This class describes just a single node, not the entire tree. It's a generic type, so the node can store any kind of data that conforms to Comparable. It also has references to its left and right child nodes and a parent node.
Here's how you'd use it:
let tree = BinarySearchTree<Int>(value: 7)
The count property determines how many nodes are in the subtree described by this node. This doesn't just count the node's immediate children but also their children and their children's children, and so on. The node itself is included too. If this particular object is the root node, then it counts how many nodes are in the entire tree. For the single-node tree created above, count is 1.
Note: Because left, right, and parent are optionals, we can make good use of Swift's optional chaining (?) and nil-coalescing operators (??). You could also write this sort of thing with if let but that is less concise.
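To make the comparison concrete, here is a small sketch that computes the same count both ways. CountNode and verboseCount are hypothetical names used only for this illustration; they are not part of the class above.

```swift
// Hypothetical stripped-down node, only for comparing the two styles.
final class CountNode {
  var left: CountNode?
  var right: CountNode?

  // Concise version: nil-coalescing supplies 0 for a missing child.
  var count: Int {
    return (left?.count ?? 0) + 1 + (right?.count ?? 0)
  }

  // Equivalent version spelled out with `if let`.
  var verboseCount: Int {
    var total = 1
    if let left = left { total += left.verboseCount }
    if let right = right { total += right.verboseCount }
    return total
  }
}

let countRoot = CountNode()
countRoot.left = CountNode()
countRoot.left?.right = CountNode()
print(countRoot.count, countRoot.verboseCount) // prints "3 3"
```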
A tree node by itself is pretty useless, so here is how you would add new nodes to the tree:
  public func insert(value: T) {
    if value < self.value {
      if let left = left {
        left.insert(value: value)
      } else {
        left = BinarySearchTree(value: value)
        left?.parent = self
      }
    } else {
      if let right = right {
        right.insert(value: value)
      } else {
        right = BinarySearchTree(value: value)
        right?.parent = self
      }
    }
  }
Like so many other tree operations, insertion is easiest to implement with recursion. We compare the new value to the values of the existing nodes and decide whether to add it to the left branch or the right branch.
If there is no more left or right child to look at, we create a BinarySearchTree object for the new node and connect it to the tree by setting its parent property.
Note: Because the whole point of a binary search tree is to have smaller nodes on the left and larger ones on the right, you should always insert elements at the root, to make sure this remains a valid binary search tree!
To build the complete tree from the example you'd do:
let tree = BinarySearchTree<Int>(value: 7)
tree.insert(value: 2)
tree.insert(value: 5)
tree.insert(value: 10)
tree.insert(value: 9)
tree.insert(value: 1)
Note: For reasons that will become clear later, you should insert the numbers in a somewhat random order. If you insert them in sorted order, the tree won't have the right shape.
For convenience, let's add an init method that calls insert() for all the elements in an array:
  public convenience init(array: [T]) {
    precondition(array.count > 0)
    self.init(value: array.first!)
    for v in array.dropFirst() {
      insert(value: v)
    }
  }
Now you can simply do this:
let tree = BinarySearchTree<Int>(array: [7, 2, 5, 10, 9, 1])
The first value in the array becomes the root of the tree.
When working with somewhat complicated data structures such as this, it's useful to have human-readable debug output.
extension BinarySearchTree: CustomStringConvertible {
  public var description: String {
    var s = ""
    if let left = left {
      s += "(\(left.description)) <- "
    }
    s += "\(value)"
    if let right = right {
      s += " -> (\(right.description))"
    }
    return s
  }
}
When you do a print(tree), you should get something like this:
((1) <- 2 -> (5)) <- 7 -> ((9) <- 10)
The root node is in the middle. With some imagination, you should see that this indeed corresponds to the following tree:
By the way, you may be wondering what happens when you insert duplicate items? We always insert those in the right branch. Try it out!
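If you'd like to see that without building the full class, here is a minimal sketch. MiniNode is a hypothetical Int-only node that copies just the insertion rule from above; it is not the BinarySearchTree class itself.

```swift
// Hypothetical minimal node that copies the insertion rule from the text.
final class MiniNode {
  let value: Int
  var left: MiniNode?
  var right: MiniNode?

  init(_ value: Int) { self.value = value }

  func insert(_ newValue: Int) {
    if newValue < value {
      if let left = left { left.insert(newValue) } else { left = MiniNode(newValue) }
    } else {
      // Equal values fail the `<` test, so duplicates end up on the right.
      if let right = right { right.insert(newValue) } else { right = MiniNode(newValue) }
    }
  }
}

let dupRoot = MiniNode(7)
dupRoot.insert(7)
print(dupRoot.left == nil)        // true
print(dupRoot.right?.value ?? -1) // 7
```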
What do we do now that we have some values in our tree? Search for them, of course! Being able to find items quickly is the entire purpose of a binary search tree. :-)
Here is the implementation of search():
  public func search(value: T) -> BinarySearchTree? {
    if value < self.value {
      return left?.search(value: value)
    } else if value > self.value {
      return right?.search(value: value)
    } else {
      return self  // found it!
    }
  }
I hope the logic is clear: this starts at the current node (usually the root) and compares the values. If the search value is less than the node's value, we continue searching in the left branch; if the search value is greater, we dive into the right branch.
Of course, if there are no more nodes to look at -- when left or right is nil -- then we return nil to indicate the search value is not in the tree.
Note: In Swift that's very conveniently done with optional chaining; when you write left?.search(value: value) it automatically returns nil if left is nil. There's no need to explicitly check for this with an if statement.
Searching is a recursive process but you can also implement it with a simple loop instead:
  public func search(value: T) -> BinarySearchTree? {
    var node: BinarySearchTree? = self
    while case let n? = node {
      if value < n.value {
        node = n.left
      } else if value > n.value {
        node = n.right
      } else {
        return node
      }
    }
    return nil
  }
Verify for yourself that you understand that these two implementations are equivalent. Personally, I prefer to use iterative code over recursive code but your opinion may differ. ;-)
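One way to convince yourself is to run both versions side by side on the example tree. The sketch below uses a hypothetical Int-only SearchNode with made-up method names (searchRecursive, searchIterative) so that both variants can live in the same type.

```swift
// Hypothetical Int-only node that carries both search variants.
final class SearchNode {
  let value: Int
  var left: SearchNode?
  var right: SearchNode?

  init(_ value: Int) { self.value = value }

  func insert(_ newValue: Int) {
    if newValue < value {
      if let left = left { left.insert(newValue) } else { left = SearchNode(newValue) }
    } else {
      if let right = right { right.insert(newValue) } else { right = SearchNode(newValue) }
    }
  }

  // Recursive form: optional chaining yields nil when a branch is missing.
  func searchRecursive(_ target: Int) -> SearchNode? {
    if target < value { return left?.searchRecursive(target) }
    if target > value { return right?.searchRecursive(target) }
    return self
  }

  // Iterative form: walk down until the value is found or the branch ends.
  func searchIterative(_ target: Int) -> SearchNode? {
    var node: SearchNode? = self
    while let current = node {
      if target < current.value {
        node = current.left
      } else if target > current.value {
        node = current.right
      } else {
        return current
      }
    }
    return nil
  }
}

let searchRoot = SearchNode(7)
for v in [2, 5, 10, 9, 1] { searchRoot.insert(v) }

// Both versions should return the very same node (or nil) for every probe.
let searchesAgree = (0...11).allSatisfy { probe in
  searchRoot.searchRecursive(probe) === searchRoot.searchIterative(probe)
}
print(searchesAgree) // true
```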
Here's how to test searching:
tree.search(value: 5)
tree.search(value: 2)
tree.search(value: 7)
tree.search(value: 6)   // nil
The first three lines all return the corresponding BinarySearchTree object. The last line returns nil because there is no node with value 6.
Note: If there are duplicate items in the tree, search() always returns the "highest" node. That makes sense, because we start searching from the root downwards.
Remember there are 3 different ways to look at all nodes in the tree? Here they are:
  public func traverseInOrder(process: (T) -> Void) {
    left?.traverseInOrder(process: process)
    process(value)
    right?.traverseInOrder(process: process)
  }

  public func traversePreOrder(process: (T) -> Void) {
    process(value)
    left?.traversePreOrder(process: process)
    right?.traversePreOrder(process: process)
  }

  public func traversePostOrder(process: (T) -> Void) {
    left?.traversePostOrder(process: process)
    right?.traversePostOrder(process: process)
    process(value)
  }
They all do pretty much the same thing but in different orders. Notice once again that all the work is done recursively. Thanks to Swift's optional chaining, the calls to traverseInOrder() etc. are ignored when there is no left or right child.
To print out all the values from the tree sorted from low to high you can write:
tree.traverseInOrder { value in print(value) }
This prints the following:
1
2
5
7
9
10
You can also add things like map() and filter() to the tree. For example, here's an implementation of map:
  public func map(formula: (T) -> T) -> [T] {
    var a = [T]()
    if let left = left { a += left.map(formula: formula) }
    a.append(formula(value))
    if let right = right { a += right.map(formula: formula) }
    return a
  }
This calls the formula closure on each node in the tree and collects the results in an array. map() works by traversing the tree in-order.
An extremely simple example of how to use map():
  public func toArray() -> [T] {
    return map { $0 }
  }
This turns the contents of the tree back into a sorted array. Try it out in the playground:
tree.toArray() // [1, 2, 5, 7, 9, 10]
As an exercise for yourself, see if you can implement filter and reduce.
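If you want something to compare your own attempt against, here is one possible sketch. It uses a hypothetical minimal TreeNode instead of the full class, and the signatures of filter and reduce are my own choice, loosely modeled on the array methods of the same name.

```swift
// Hypothetical minimal node; only what the traversal helpers need.
final class TreeNode<T: Comparable> {
  let value: T
  var left: TreeNode?
  var right: TreeNode?

  init(_ value: T) { self.value = value }

  func insert(_ newValue: T) {
    if newValue < value {
      if let left = left { left.insert(newValue) } else { left = TreeNode(newValue) }
    } else {
      if let right = right { right.insert(newValue) } else { right = TreeNode(newValue) }
    }
  }

  // Keeps the values that satisfy `isIncluded`, visiting nodes in-order.
  func filter(_ isIncluded: (T) -> Bool) -> [T] {
    var result = left?.filter(isIncluded) ?? []
    if isIncluded(value) { result.append(value) }
    result += right?.filter(isIncluded) ?? []
    return result
  }

  // Folds every value into an accumulator, again in-order.
  func reduce<Accumulator>(_ initial: Accumulator,
                           _ combine: (Accumulator, T) -> Accumulator) -> Accumulator {
    let afterLeft = left?.reduce(initial, combine) ?? initial
    let afterSelf = combine(afterLeft, value)
    return right?.reduce(afterSelf, combine) ?? afterSelf
  }
}

let numbers = TreeNode(7)
for v in [2, 5, 10, 9, 1] { numbers.insert(v) }

let oddValues = numbers.filter { $0 % 2 == 1 }
let total = numbers.reduce(0, +)
print(oddValues) // [1, 5, 7, 9]
print(total)     // 34
```

Because both helpers walk the tree in-order, the filtered values come out sorted without any extra work.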
We can make the code much more readable by defining some helper functions.
  private func reconnectParentTo(node: BinarySearchTree?) {
    if let parent = parent {
      if isLeftChild {
        parent.left = node
      } else {
        parent.right = node
      }
    }
    node?.parent = parent
  }
Making changes to the tree involves changing a bunch of parent and left and right pointers. This function helps with that. It takes the parent of the current node -- that is self -- and connects it to another node. Usually that other node will be one of the children of self.
We also need functions that return the minimum and maximum nodes of the subtree rooted at a node:
  public func minimum() -> BinarySearchTree {
    var node = self
    while case let next? = node.left {
      node = next
    }
    return node
  }

  public func maximum() -> BinarySearchTree {
    var node = self
    while case let next? = node.right {
      node = next
    }
    return node
  }
The rest of the code is pretty self-explanatory:
  @discardableResult public func remove() -> BinarySearchTree? {
    let replacement: BinarySearchTree?

    // Replacement for current node can be either biggest one on the left or
    // smallest one on the right, whichever is not nil
    if let right = right {
      replacement = right.minimum()
    } else if let left = left {
      replacement = left.maximum()
    } else {
      replacement = nil
    }

    replacement?.remove()

    // Place the replacement on current node's position
    replacement?.right = right
    replacement?.left = left
    right?.parent = replacement
    left?.parent = replacement
    reconnectParentTo(node: replacement)

    // The current node is no longer part of the tree, so clean it up.
    parent = nil
    left = nil
    right = nil

    return replacement
  }
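To watch removal in action on the example tree, here is a self-contained sketch. RemovableNode is a hypothetical Int-only stand-in that repeats the logic above in compressed form, and sortedValues is a made-up helper for inspecting the result.

```swift
// Hypothetical Int-only node that mirrors the removal logic described above.
final class RemovableNode {
  let value: Int
  var parent: RemovableNode?
  var left: RemovableNode?
  var right: RemovableNode?

  init(_ value: Int) { self.value = value }

  func insert(_ newValue: Int) {
    if newValue < value {
      if let left = left {
        left.insert(newValue)
      } else {
        left = RemovableNode(newValue)
        left?.parent = self
      }
    } else {
      if let right = right {
        right.insert(newValue)
      } else {
        right = RemovableNode(newValue)
        right?.parent = self
      }
    }
  }

  func search(_ target: Int) -> RemovableNode? {
    if target < value { return left?.search(target) }
    if target > value { return right?.search(target) }
    return self
  }

  func minimum() -> RemovableNode {
    var node = self
    while let next = node.left { node = next }
    return node
  }

  func maximum() -> RemovableNode {
    var node = self
    while let next = node.right { node = next }
    return node
  }

  // Points this node's parent at `node` instead of at `self`.
  private func reconnectParentTo(node: RemovableNode?) {
    if let parent = parent {
      if parent.left === self { parent.left = node } else { parent.right = node }
    }
    node?.parent = parent
  }

  @discardableResult func remove() -> RemovableNode? {
    let replacement: RemovableNode?
    if let right = right {
      replacement = right.minimum()
    } else if let left = left {
      replacement = left.maximum()
    } else {
      replacement = nil
    }
    replacement?.remove()

    // Move the replacement into this node's position.
    replacement?.right = right
    replacement?.left = left
    right?.parent = replacement
    left?.parent = replacement
    reconnectParentTo(node: replacement)

    // Detach the removed node from the tree.
    parent = nil
    left = nil
    right = nil
    return replacement
  }

  // In-order listing of the subtree, used here only to inspect the result.
  var sortedValues: [Int] {
    return (left?.sortedValues ?? []) + [value] + (right?.sortedValues ?? [])
  }
}

let removalRoot = RemovableNode(7)
for v in [2, 5, 10, 9, 1] { removalRoot.insert(v) }

removalRoot.search(10)?.remove()  // 10 has one child, so 9 moves up
print(removalRoot.sortedValues)   // [1, 2, 5, 7, 9]

removalRoot.search(2)?.remove()   // 2 has two children, so 5 takes its place
print(removalRoot.sortedValues)   // [1, 5, 7, 9]
```

Note that removing the root itself hands back a different root node, so in that case you would keep using the returned node rather than the old reference.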
Recall that the height of a node is the distance to its lowest leaf. We can calculate that with the following function:
  public func height() -> Int {
    if isLeaf {
      return 0
    } else {
      return 1 + max(left?.height() ?? 0, right?.height() ?? 0)
    }
  }
We look at the heights of the left and right branches and take the highest one. Again, this is a recursive procedure. Since this looks at all children of this node, performance is O(n).
Note: Swift's nil-coalescing operator is used as shorthand to deal with left or right pointers that are nil. You could write this with if let but this is a lot more concise.
Try it out:
tree.height() // 2
You can also calculate the depth of a node, which is the distance to the root. Here is the code:
  public func depth() -> Int {
    var node = self
    var edges = 0
    while case let parent? = node.parent {
      node = parent
      edges += 1
    }
    return edges
  }
It steps upwards through the tree, following the parent pointers until we reach the root node (whose parent is nil). This takes O(h) time. Example:
if let node9 = tree.search(value: 9) {
  node9.depth()   // returns 2
}
The binary search tree is always "sorted" but that doesn't mean that consecutive numbers are actually next to each other in the tree.
Note that you can't find the number that comes before 7 by just looking at its left child node. The left child is 2, not 5. Likewise for the number that comes after 7.
The predecessor() function returns the node whose value precedes the current value in sorted order:
  public func predecessor() -> BinarySearchTree<T>? {
    if let left = left {
      return left.maximum()
    } else {
      var node = self
      while case let parent? = node.parent {
        if parent.value < value { return parent }
        node = parent
      }
      return nil
    }
  }
It's easy if we have a left subtree. In that case, the immediate predecessor is the maximum value in that subtree. You can verify in the above picture that 5 is indeed the maximum value in 7's left branch.
However, if there is no left subtree then we have to look at our parent nodes until we find a smaller value. So if we want to know what the predecessor is of node 9, we keep going up until we find the first parent with a smaller value, which is 7.
The code for successor() works the exact same way but mirrored:
  public func successor() -> BinarySearchTree<T>? {
    if let right = right {
      return right.minimum()
    } else {
      var node = self
      while case let parent? = node.parent {
        if parent.value > value { return parent }
        node = parent
      }
      return nil
    }
  }
Both these methods run in O(h) time.
Note: There is a cool variation called a "threaded" binary tree where "unused" left and right pointers are repurposed to make direct links between predecessor and successor nodes. Very clever!
If you were intent on sabotage you could turn the binary search tree into an invalid tree by calling insert() on a node that is not the root, like so:
if let node1 = tree.search(value: 1) {
  node1.insert(value: 100)
}
The value of the root node is 7, so a node with value 100 is supposed to be in the tree's right branch. However, you're not inserting at the root but at a leaf node in the tree's left branch. So the new 100 node is in the wrong place in the tree!
As a result, doing tree.search(value: 100) gives nil.
You can check whether a tree is a valid binary search tree with the following method:
  public func isBST(minValue: T, maxValue: T) -> Bool {
    // Every value in this subtree must lie within [minValue, maxValue].
    if value < minValue || value > maxValue { return false }
    let leftBST = left?.isBST(minValue: minValue, maxValue: value) ?? true
    let rightBST = right?.isBST(minValue: value, maxValue: maxValue) ?? true
    return leftBST && rightBST
  }
This verifies that the left branch does indeed contain values that are less than the current node's value, and that the right branch only contains values that are larger.
Call it as follows:
if let node1 = tree.search(value: 1) {
  tree.isBST(minValue: Int.min, maxValue: Int.max)  // true
  node1.insert(value: 100)                          // EVIL!!!
  tree.search(value: 100)                           // nil
  tree.isBST(minValue: Int.min, maxValue: Int.max)  // false
}
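The check above needs sentinel bounds such as Int.min and Int.max, which not every Comparable type offers. One possible variation, sketched here as an assumption rather than as part of the original implementation, is to make the bounds optional. CheckNode and the parameter names lower and upper are hypothetical.

```swift
// Hypothetical minimal node used only to illustrate optional bounds.
final class CheckNode<T: Comparable> {
  let value: T
  var left: CheckNode?
  var right: CheckNode?

  init(_ value: T) { self.value = value }

  // A nil bound means "no limit on this side yet".
  func isBST(lower: T? = nil, upper: T? = nil) -> Bool {
    if let lower = lower, value < lower { return false }
    if let upper = upper, value > upper { return false }
    let leftOK = left?.isBST(lower: lower, upper: value) ?? true
    let rightOK = right?.isBST(lower: value, upper: upper) ?? true
    return leftOK && rightOK
  }
}

// Strings have no obvious minimum or maximum, yet the check still works.
let letters = CheckNode("m")
letters.left = CheckNode("c")
letters.right = CheckNode("t")
let validBefore = letters.isBST()
print(validBefore) // true

// Sneak a too-large value into the left subtree.
letters.left?.right = CheckNode("z")
let validAfter = letters.isBST()
print(validAfter)  // false
```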
We've implemented the binary tree node as a class but you can also use an enum.
The difference is reference semantics versus value semantics. Making a change to the class-based tree will update that same instance in memory. But the enum-based tree is immutable -- any insertions or deletions will give you an entirely new copy of the tree. Which one is best totally depends on what you want to use it for.
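The following sketch shows that difference in miniature. ClassTree and EnumTree are hypothetical, deliberately tiny types made up for this comparison; they are not the implementations from this article.

```swift
// Reference semantics: a hypothetical class-based node.
final class ClassTree {
  let value: Int
  var left: ClassTree?
  var right: ClassTree?

  init(_ value: Int) { self.value = value }

  func insert(_ newValue: Int) {
    if newValue < value {
      if let left = left { left.insert(newValue) } else { left = ClassTree(newValue) }
    } else {
      if let right = right { right.insert(newValue) } else { right = ClassTree(newValue) }
    }
  }

  var count: Int { return (left?.count ?? 0) + 1 + (right?.count ?? 0) }
}

// Value semantics: a hypothetical enum-based tree.
enum EnumTree {
  case empty
  indirect case node(EnumTree, Int, EnumTree)

  // Returns a new tree; the receiver is left untouched.
  func inserting(_ newValue: Int) -> EnumTree {
    switch self {
    case .empty:
      return .node(.empty, newValue, .empty)
    case let .node(left, value, right):
      if newValue < value {
        return .node(left.inserting(newValue), value, right)
      } else {
        return .node(left, value, right.inserting(newValue))
      }
    }
  }

  var count: Int {
    switch self {
    case .empty: return 0
    case let .node(left, _, right): return left.count + 1 + right.count
    }
  }
}

// Two names for one object: inserting through either one changes "both".
let original = ClassTree(7)
let alias = original
alias.insert(2)
print(original.count) // 2

// Inserting into the enum produces a new value; the old one is unchanged.
let before = EnumTree.empty.inserting(7)
let after = before.inserting(2)
print(before.count, after.count) // prints "1 2"
```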
Here's how you'd make a binary search tree using an enum:
public enum BinarySearchTree<T: Comparable> {
  case Empty
  case Leaf(T)
  indirect case Node(BinarySearchTree, T, BinarySearchTree)
}
The enum has three cases:
- Empty to mark the end of a branch (the class-based version used nil references for this).
- Leaf for a leaf node that has no children.
- Node for a node that has one or two children. This is marked indirect so that it can hold BinarySearchTree values. Without indirect you can't make recursive enums.
Note: The nodes in this binary tree don't have a reference to their parent node. It's not a major impediment but it will make certain operations slightly more cumbersome to implement.
As usual, we'll implement most functionality recursively. We'll treat each case of the enum slightly differently. For example, this is how you could calculate the number of nodes in the tree and the height of the tree:
  public var count: Int {
    switch self {
    case .Empty: return 0
    case .Leaf: return 1
    case let .Node(left, _, right): return left.count + 1 + right.count
    }
  }

  public var height: Int {
    switch self {
    case .Empty: return 0
    case .Leaf: return 1
    case let .Node(left, _, right): return 1 + max(left.height, right.height)
    }
  }
Inserting new nodes looks like this:
  public func insert(newValue: T) -> BinarySearchTree {
    switch self {
    case .Empty:
      return .Leaf(newValue)

    case .Leaf(let value):
      if newValue < value {
        return .Node(.Leaf(newValue), value, .Empty)
      } else {
        return .Node(.Empty, value, .Leaf(newValue))
      }

    case .Node(let left, let value, let right):
      if newValue < value {
        return .Node(left.insert(newValue: newValue), value, right)
      } else {
        return .Node(left, value, right.insert(newValue: newValue))
      }
    }
  }
Try it out in a playground:
var tree = BinarySearchTree.Leaf(7)
tree = tree.insert(newValue: 2)
tree = tree.insert(newValue: 5)
tree = tree.insert(newValue: 10)
tree = tree.insert(newValue: 9)
tree = tree.insert(newValue: 1)
Notice that each time you insert something, you get back a completely new tree object. That's why you need to assign the result back to the tree variable.
Here is the all-important search function:
  public func search(x: T) -> BinarySearchTree? {
    switch self {
    case .Empty:
      return nil
    case .Leaf(let y):
      return (x == y) ? self : nil
    case let .Node(left, y, right):
      if x < y {
        return left.search(x: x)
      } else if y < x {
        return right.search(x: x)
      } else {
        return self
      }
    }
  }
As you can see, most of these functions have the same structure.
Try it out in a playground:
tree.search(x: 10)
tree.search(x: 1)
tree.search(x: 11)   // nil
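Other operations can follow the same switch-based pattern. As an illustration, here is a sketch of an in-order conversion to an array. The type is named EnumBST here only to keep the example self-contained, and toArray() is an addition for this sketch, not something the enum version above defines.

```swift
// Hypothetical copy of the enum with one extra method, toArray().
enum EnumBST<T: Comparable> {
  case Empty
  case Leaf(T)
  indirect case Node(EnumBST, T, EnumBST)

  func insert(newValue: T) -> EnumBST {
    switch self {
    case .Empty:
      return .Leaf(newValue)
    case .Leaf(let value):
      if newValue < value {
        return .Node(.Leaf(newValue), value, .Empty)
      } else {
        return .Node(.Empty, value, .Leaf(newValue))
      }
    case .Node(let left, let value, let right):
      if newValue < value {
        return .Node(left.insert(newValue: newValue), value, right)
      } else {
        return .Node(left, value, right.insert(newValue: newValue))
      }
    }
  }

  // In-order traversal: left subtree, then the value, then the right subtree.
  func toArray() -> [T] {
    switch self {
    case .Empty:
      return []
    case .Leaf(let value):
      return [value]
    case let .Node(left, value, right):
      return left.toArray() + [value] + right.toArray()
    }
  }
}

var enumTree = EnumBST.Leaf(7)
for v in [2, 5, 10, 9, 1] {
  enumTree = enumTree.insert(newValue: v)
}
let enumValues = enumTree.toArray()
print(enumValues) // [1, 2, 5, 7, 9, 10]
```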
To print the tree for debug purposes you can use this method:
extension BinarySearchTree: CustomDebugStringConvertible {
  public var debugDescription: String {
    switch self {
    case .Empty: return "."
    case .Leaf(let value): return "\(value)"
    case .Node(let left, let value, let right):
      return "(\(left.debugDescription) <- \(value) -> \(right.debugDescription))"
    }
  }
}
When you do print(tree) it will look something like this:
((1 <- 2 -> 5) <- 7 -> (9 <- 10 -> .))
The root node is in the middle; a dot means there is no child at that position.
A binary search tree is balanced when its left and right subtrees contain roughly the same number of nodes. In that case, the height of the tree is log(n), where n is the number of nodes. That's the ideal situation.
However, if one branch is significantly longer than the other, searching becomes very slow. We end up checking way more values than we'd ideally have to. In the worst case, the height of the tree can become n. Such a tree acts more like a linked list than a binary search tree, with performance degrading to O(n). Not good!
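A small experiment makes the difference visible. The sketch below inserts the same 15 values in two different orders and compares the resulting heights. HeightNode and buildTree(from:) are hypothetical helpers written only for this comparison.

```swift
// Hypothetical Int-only node with just insertion and height.
final class HeightNode {
  let value: Int
  var left: HeightNode?
  var right: HeightNode?

  init(_ value: Int) { self.value = value }

  func insert(_ newValue: Int) {
    if newValue < value {
      if let left = left { left.insert(newValue) } else { left = HeightNode(newValue) }
    } else {
      if let right = right { right.insert(newValue) } else { right = HeightNode(newValue) }
    }
  }

  // Number of edges on the path from this node down to its lowest leaf.
  func height() -> Int {
    if left == nil && right == nil { return 0 }
    return 1 + max(left?.height() ?? 0, right?.height() ?? 0)
  }
}

// Builds a tree by inserting the values in the given order (must be non-empty).
func buildTree(from values: [Int]) -> HeightNode {
  let root = HeightNode(values[0])
  for v in values.dropFirst() { root.insert(v) }
  return root
}

// Sorted input: every new value goes to the right, forming one long chain.
let sortedHeight = buildTree(from: Array(1...15)).height()

// The same values, ordered so that each subtree receives its middle value first.
let balancedOrder = [8, 4, 12, 2, 6, 10, 14, 1, 3, 5, 7, 9, 11, 13, 15]
let balancedHeight = buildTree(from: balancedOrder).height()

print(sortedHeight, balancedHeight) // prints "14 3"
```

With sorted input a search may have to walk through 15 nodes one after another, while the balanced tree never needs to visit more than 4.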
One way to make the binary search tree balanced is to insert the nodes in a totally random order. On average that should balance out the tree quite nicely. But it doesn't guarantee success, nor is it always practical.
The other solution is to use a self-balancing binary tree. This type of data structure adjusts the tree to keep it balanced after you insert or delete nodes. See AVL tree and red-black tree for examples.
Binary Search Tree on Wikipedia
Written for the Swift Algorithm Club by Nicolas Ameghino and Matthijs Hollemans