tags : Data Structures, Graphs
Tree Traversal
Meta ideas about tree traversal
 We usually just go L > R
Complexity
 All trees have
n  1
edges,n
being the number of nodes.  Time complexity:
O(V + E)
⇒ $O(n+(n−1))=O(n)$
About {in,pre,post}order
 BFS doesn’t have
{in,pre,post}order
, those are DFS specific. {pre,post}order
work fornary
trees, while inorder becomes a little weird for anything that’s not a binary tree.
Recursion and traversal
BFS (Queue)  DFS (Stack)  

Recursive  Not natural  Yes, that’s the way. Preserves shape/structure. 
Iterative  Yes, that’s the way  Unsual to do, might help w limited call stack 
DFS (Stack)
 Workhorse: push and pop things on a
stack
(call stack)
Preorder
VisitNode() + RecurseL() + RecurseR()
 Root’s in the beginning
Inorder
RecurseL() + VisitNode() + RecurseR()
 Root’s in the middle
 Inorder traversal of a Binary search tree will print things in order
Postorder
RecurseL() + RecurseR() + VisitNode()
 Root’s in the end
Implementation
 Recursive
 DFS uses a stack to maintain a frontier, and recursion has an implicit stack (call stack)
 Implementing DFS using recursion simply means replacing the stack with a call stack.
 Iterative
 To convert it into an iterative DFS you can simply use an actual stack data structure, but you now need to manage the stack yourself.
BFS (Queue)
 Workhorse: enqueue and dequeue things on a
queue
(auxiliary storage)
while !q.empty?
curr = q.rm_from_start() // dequeue
print(curr) // do whatever w curr
q.enqueue(curr.l)
q.enqueue(curr.r)
Implementation
 Mostly uses a queue
 BFS and recursion cannot be combined as naturally since BFS does not use a stack
Tree based Structures
Basic definitions
 Root: the most parent node. The First. Adam.
 Height: The longest path from the root to the most child node
 Leaves: a node without children
 Branching factor: the amount of children a tree has
 General tree: A tree with 0 or more children
 Binary tree
 A tree in which has at most 2 children, at least 0 children
 Each level in a binary tree is approx. half the size of the entire tree above it.
Tree ADTs
Binary search tree (BST)
 Rule at every node: Left ≤ Node < Right (Similar to Quicksort)
 A tree in which has a specific ordering to the nodes and at most 2 children
 Inorder traversal will result in an ordered list

Search
 Searching: Searching is similar to Binary search
 Time Complexity: $O(log(n))$  $O(n)$. $O(n)$ in the case where the BST is simply a linked list. i.e How balanced your BST will determine the complexity.

Insertion
 Insertion inherently unbalances the tree

Deletion
 Choose
smallest on largeside
orlargest on smallside
. We can choose any but if we have the
individual height
of the node, we can make a better decision as we’ll know which one to choose and shrink our tree.
 We can choose any but if we have the
 Then replace the parent to be removed with it.
 Choose
Heap/Priority Queue
 Heap is an implementation of a Priority Queue ADT
 There’s usually no need to traversing the tree as you only look at the top for priority
 A
binary tree
w constraints: Every child and grand child is smaller (MaxHeap) than current node
 Every child and grand child is larger (MinHeap) than current node
 i.e Root node is at some extreme
 Self balancing
 Every level of the tree is complete
 Whenever a node is added, we must adjust the tree
 Whenever a node is deleted, we must adjust the tree

Array Implementation
 Heapordered binary tree Implementation

Binary Heap
 Binary heaps are a common way of implementing priority queues
 The Data Structures used for
Heapsort
Tries/Prefix Tree/Re’trie’val tree
 Used for autocompletion/caching type stuff
 Value is usually a string

Complexity
 You need to think about what
n
is.  The complexity of creating a trie is O(W*L), where W is the number of words, and L is an average length of the word: you need to perform L lookups on the average for each of the W words in the set.
 Same goes for looking up words later: you perform L steps for each of the W words.
 If the longest english letter is 12, then it’s 12 node check. That’s constant time.
 In this case
n
is the height and the height is bound by the en dictionary.  So trie is good for cases where you can determine the height in advance
 You need to think about what
Balanced tree
 Depth is uniform across the entire tree.
 perfectly balanced when any node’s left and right children have the same height.
 There are different ways to balance a tree. Popular rotation techniques are AVL and RB trees.
 RB : If I don’t find things often but I insert a lot. (Fast balancing)
 AVL: If I find a lot but insert rarely. (Slower balancing)
Skip List
Btree / Btree
 BTrees and Database Indexes  Hacker News 🌟
 Binary Search Tree(BST) vs Btree
 Relational database on top of keyvalue store explained (or why Btrees are cool)  Gregory Trubetskoy
 The Taming of the BTrees  ScyllaDB
 It’s a data structure that aims for “read and write large blocks of data”,
 Key point is that such trees usually grow wide and shallow, so for some type of query (remember, it’s still a search tree) only very few nodes are to be “touched”.
 This is important, because such access could be very expensive (think of rotating HDD).
B+ tree
 The subtleties of proper B+Tree implementation  Hacker News
 B+tree is a variant on BTree
 The main difference B+ Trees show off is that intermediate nodes don’t store any data on them. Instead, all the data references are linked to the leaf nodes, which allows for better caching of the tree structure.
 Secondly, the leaf nodes are linked, so if you need to do an index scan, you can do a single linear pass rather than traversing the entire tree up and down and loading more index data from the disk.
LSM Tree / LSMT

LSMT in Database
 The LogStructured MergeTree (LSM Tree)  the morning paper
 Intro into database storage engines
 LSM tree normally has an inmemory sorted list of keyvalue pairs called a MemTable. Writes are inserted into this MemTable. To keep writes durable, writes are also sent to a writeahead log (WAL). When MemTables get large, they’re frozen (made immutable) and flushed to disk as sortedstring tables (SSTs). (from materializedview blog)
 Allows different storage(write) and access(read) characteristics.