PA152: Efficient Use of DB 4. Indexing Vlastislav Dohnal PA152, Vlastislav Dohnal, FI MUNI, 2023 2 Indexing ◼ Reason: faster access to records than sequential (table) scan ◼ Variants: Conventional indexes B-tree Hashing ? recordkey value file key value …… PA152, Vlastislav Dohnal, FI MUNI, 2023 3 Terminology ◼ Sequential file  Index-sequential file ◼ Index  Primary index  Secondary index  Dense index  Sparse index  Multilevel index ◼ Search key  Primary key PA152, Vlastislav Dohnal, FI MUNI, 2023 4 File Sequential file 20 10 40 30 60 50 80 70 100 90 Index-sequential file 20 10 40 30 60 50 80 70 100 90 10 30 50 70 90 PA152, Vlastislav Dohnal, FI MUNI, 2023 5 Index ◼ Collection of items:  20 10 40 30 60 50 80 70 Sparse index 10 30 50 70 Dense index 10 20 30 40 50 60 70 80 PA152, Vlastislav Dohnal, FI MUNI, 2023 6 Index ◼ Multilevel index ◼ Should indexes be dense in higher levels? 2nd level 10 50 1st level 10 20 30 40 50 60 70 80 20 10 40 30 60 50 80 70 File PA152, Vlastislav Dohnal, FI MUNI, 2023 7 Indices and Pointers ◼ Pointers in indexes Pointer to records ◼ Block addr. + record position (index with a block) Pointer to block ◼ Block addr. =  file ID + block number File is contiguous and sequential ◼ May to store pointers to blocks  use “implicit” pointers, i.e., can be computed  e.g., block number derived from the order of items in index PA152, Vlastislav Dohnal, FI MUNI, 2023 8 Implicit Block Pointers ◼ Block size 8KiB ◼ Searching for „K3“ Index scan: ◼ K3 in 3rd item  → 3rd block (B3) Offset in file ◼ (3-1)·8192=16 384 K1→B1 K3→B3 K4→B4 K2→B2 B1 B2 B3 B4 Seq. file Index 0 16384 8192 24576 32768 PA152, Vlastislav Dohnal, FI MUNI, 2023 9 Duplicate Keys ◼ Index type dense index? sparse index? 10 10 20 10 30 20 30 30 45 40 File PA152, Vlastislav Dohnal, FI MUNI, 2023 10 ◼ Duplicate values in primary index 20 30 30 30 20 30 30 30 20 30 30 30 40 45 Duplicate Keys: Dense Index 10 10 20 10 30 20 30 30 45 40 Seq. file 10 10 10 20 10 10 10 20 PA152, Vlastislav Dohnal, FI MUNI, 2023 11 ◼ Values in primary index are unique File must always be sequential 20 30 30 30 45 Duplicate Keys: Dense Index 10 10 20 10 30 20 30 30 45 40 Seq. file 10 10 10 20 10 20 30 40 PA152, Vlastislav Dohnal, FI MUNI, 2023 12 ◼ Pointers with the first value in the block Can eliminate duplicate values ◼ Record look up!!! Find value “20” 20 30 30 30 40 Duplicate Keys: Sparse Index 10 10 20 10 30 20 30 30 45 40 Seq. file 10 10 10 20 10 10 20 30 PA152, Vlastislav Dohnal, FI MUNI, 2023 13 ◼ Pointers with new value in block Always place new key to index ◼ Next slides (Deletion/insertion to prim. index) are skipped. 20 30 30 30 40 Duplicate Keys: Sparse Index 10 10 20 10 30 20 30 30 45 40 Seq. file 10 10 10 20 10 20 30 30 Delete this item? PA152, Vlastislav Dohnal, FI MUNI, 2023 14 Deletion from Index ◼ Sparse index Delete record with key 40 20 10 40 30 60 50 80 70 10 30 50 70 PA152, Vlastislav Dohnal, FI MUNI, 2023 15 Deletion from Index: Result ◼ Sparse index After deletion of 40 20 10 40 30 60 50 80 70 10 30 50 70 PA152, Vlastislav Dohnal, FI MUNI, 2023 16 Deletion from Index ◼ Sparse index Delete record with key 30 20 10 40 30 60 50 80 70 10 30 50 70 PA152, Vlastislav Dohnal, FI MUNI, 2023 17 Deletion from Index: Result ◼ Sparse index After deletion of record30 ◼ New value in block changed, so update index 20 10 40 60 50 80 70 10 40 50 70 PA152, Vlastislav Dohnal, FI MUNI, 2023 18 Deletion from Index ◼ Sparse index Delete records 30 and 40 20 10 40 30 60 50 80 70 10 30 50 70 PA152, Vlastislav Dohnal, FI MUNI, 2023 19 Deletion from Index: Result ◼ Sparse index After deletion of records 30 and 40 ◼ Block reclaimed, so update index 20 10 40 30 60 50 80 70 10 50 70 PA152, Vlastislav Dohnal, FI MUNI, 2023 20 Deletion from Index ◼ Dense index – always update index Delete record with key 30 20 10 40 30 60 50 80 70 10 20 30 40 50 60 70 80 PA152, Vlastislav Dohnal, FI MUNI, 2023 21 Deletion from Index: Result ◼ Dense index After deletion of record 30 20 10 40 60 50 80 70 10 20 40 50 60 70 80 PA152, Vlastislav Dohnal, FI MUNI, 2023 22 Insertion to Index ◼ Sparse index Insert record 34 ◼ Free space → no reorganization 20 10 34 30 50 40 60 10 30 40 60 PA152, Vlastislav Dohnal, FI MUNI, 2023 23 Insertion to Index ◼ Sparse index Insert record with key 15 ◼ No free space → reorganize immediately 20 10 30 50 40 60 10 30 40 60 PA152, Vlastislav Dohnal, FI MUNI, 2023 24 Insertion to Index ◼ Sparse index Insert record with key 15 ◼ No free space → reorganize immediately ◼ Solution: move some records to next block Variation: ◼ insert new block (chained file) ◼ may corrupt implicit pointers 15 10 30 20 50 40 60 10 20 40 60 PA152, Vlastislav Dohnal, FI MUNI, 2023 25 Insertion to Index ◼ Sparse index Insert record with key 25 ◼ No free space → reorganize ◼ Solution:  allocate overflow block  Reorganize record into main file later 20 10 34 30 50 40 60 10 30 40 60 25 PA152, Vlastislav Dohnal, FI MUNI, 2023 26 Insertion to Index ◼ Dense index Insert record ◼ Update index – insert new item ◼ Update file – by analogy to file update in sparse index case PA152, Vlastislav Dohnal, FI MUNI, 2023 27 Secondary Index ◼ File ordered by another key i.e., index created for different key than the primary file Or the file is not ordered at all ◼ Which type: Dense or sparse? PA152, Vlastislav Dohnal, FI MUNI, 2023 28 Secondary Index ◼ Must use pointers to records! Search key 50 30 70 20 40 80 10 100 60 90 Dense index 10 20 30 40 50 60 70 ... 10 50 90 ... Sparse for higher levels PA152, Vlastislav Dohnal, FI MUNI, 2023 29 Secondary Index: Duplicate Keys ◼ Replicated in index Increases ◼ space requirements ◼ access time 10 20 40 20 40 10 40 10 40 30 10 10 10 20 20 30 40 40 40 40 ... PA152, Vlastislav Dohnal, FI MUNI, 2023 30 Secondary Index: Duplicate Keys ◼ Index item contains list of pointers But the item is of variable length 10 20 40 20 40 10 40 10 40 30 10 40 30 20 PA152, Vlastislav Dohnal, FI MUNI, 2023 31 Secondary Index: Duplicate Keys ◼ Shift the variable-length list to “buckets” 10 20 40 20 40 10 40 10 40 30 10 20 30 40 50 60 ... buckets index blocks file blocks PA152, Vlastislav Dohnal, FI MUNI, 2023 32 Secondary Index: Duplicate Keys ◼ Advantage: a list of records for querying Evaluate more selection constrains without accessing records ◼ Example: Relation ◼ employee(name, department, room) Indexes: ◼ name – primary index ◼ department – secondary index ◼ room – secondary index PA152, Vlastislav Dohnal, FI MUNI, 2023 33 Secondary Index: Duplicate Keys ◼ Query: employee of Toys dept. in room G243 ◼ Intersect buckets  To get pointers to matching employee records  Also used in text information retrieval Toys G243 File (employee)Index (department) Index (room) Example: Text Information Retrieval ◼ “Full-text” index for documents ◼ Split documents into words ◼ Build an inverted file over all documents i.e., a file of records PA152, Vlastislav Dohnal, FI MUNI, 2023 34 Document 1 Word List for Doc1 Caesar and Brutus are ambitious. •Caesar •Brutus •ambitious 1. Split to words 2. Stemming 3. Ignore words in stop-list Example: Text Information Retrieval ◼ Inverted file ◼ Retrieve docs containing Brutus & Caesar Read posting lists for Brutus and Caesar Intersect them PA152, Vlastislav Dohnal, FI MUNI, 2023 35 Term docID ambitious 1 brutus 1 brutus 3 capitol 2 caesar 1 caesar 2 Term Posting list of docIDs ambitious 1 brutus 1, 3 capitol 1 caesar 1, 2 Relational view Inverted file PA152, Vlastislav Dohnal, FI MUNI, 2023 36 Conventional Indexes: Summary ◼ Basic ideas  Sparse vs. dense; multilevel ◼ Insertion / deletion  Duplicate keys ◼ in case of secondary indexes ◼ Advantages  Simple  Index is a sequential file too → good for „full scan“ ◼ Disadvantages  Costly updates  Lost of physical “sequentiality” ◼ due to overflow buckets PA152, Vlastislav Dohnal, FI MUNI, 2023 37 B-trees ◼ Another index type  Sequential order not necessary  Balanced – max I/Os guarantee ◼ More variants  B-tree, B+-tree, B*-tree, … ◼ Typically, by saying “B-tree” we mean “B+-tree”! ◼ Origin  Rudolf Bayer and Ed McCreight invented the B-tree while working at Boeing Research Labs in 1971 (Bayer & McCreight 1972) ◼ They did not explain what, if anything, the B stands for. ◼ Douglas Comer explains:  The origin of "B-tree" has never been explained by the authors. As we shall see, "balanced," "broad," or "bushy" might apply. Others suggest that the "B" stands for Boeing. Because of his contributions, however, it seems appropriate to think of B-trees as "Bayer"-trees. * Source: Wikipedia PA152, Vlastislav Dohnal, FI MUNI, 2023 38 B+-tree ◼ Example n=4 100 120 150 180 30 3 5 11 30 35 100 101 110 120 130 150 156 179 180 200 … pointers to record in file … PA152, Vlastislav Dohnal, FI MUNI, 2023 39 B+-tree ◼ Non-leaf node, n=4 57 81 95 Keys k < 57 Keys 57 ≤ k < 81 Keys 81 ≤ k < 95 Keys 95 ≤ k PA152, Vlastislav Dohnal, FI MUNI, 2023 40 B+-tree ◼ Leaf node, n=4 57 81 95 Pointer from non-leaf node (parent) Pointer to next leaf Record pointers: Revision follows, so skip it. PA152, Vlastislav Dohnal, FI MUNI, 2023 41 B+-tree ◼ Parameter n (tree arity) influences: Node format: Minimal occupation Leaf node ◼ All leaves at same lowest level ◼ pi points to record with key Ki (data) ◼ pn points to next leaf (chained leaves) Non-leaf node ◼ pi points to node organizing keys K: Ki-1 ≤ K < Ki pn-1K1 p1 K2 p2 Kn-1 pnp3 … PA152, Vlastislav Dohnal, FI MUNI, 2023 42 B+-tree ◼ Occupation constraints Max pointers Min pointers Max keys Min keys Non-leaf (not root) n (children) n/2 (children) n-1 n/2 -1 Non-leaf (root) n (children) 2 (children) n-1 1 Leaf (not root) n-1 (records) (n-1)/2 (records) n-1 (records) (n-1)/2 (records) Leaf (root) n-1 (records) 0 (records) n-1 (records) 0 (records) B+-tree: Insertion ◼ Principle: Grows from leaves to root ◼ Procedure: Find leaf node and insert new key  Including pointer to the new record  Update parent if necessary ◼ Insert cases: a) No reorganization ◼ Free capacity in leaf b) Split leaf c) Split non-leaf d) Split root PA152, Vlastislav Dohnal, FI MUNI, 2023 43 PA152, Vlastislav Dohnal, FI MUNI, 2023 44 B+-tree: n=4 ◼ Insert key 32 No reorganization 3 5 11 30 31 30 10032 PA152, Vlastislav Dohnal, FI MUNI, 2023 45 B+-tree: n=4 ◼ Insert key 7 Leaf split 30 31 30 100 7 3 5 11 3 5 11 7 11 Deletion variants follow, which is revision, so skip it. PA152, Vlastislav Dohnal, FI MUNI, 2023 46 B+-tree: n=4 ◼ Insert key 160 Splitting non-leaf node 100 120 150 180 150 156 179 180 200 180 160 179 160 PA152, Vlastislav Dohnal, FI MUNI, 2023 47 B+-tree: n=4 ◼ Insert key 45 Split root 10 20 30 1 2 3 10 12 20 25 30 32 40 40 45 40 30 New root node PA152, Vlastislav Dohnal, FI MUNI, 2023 48 B+-tree: Split Leaf n=3, insert key 6 5 6 7 7 1 3 5 5 6 7 5 7 1 3 5 6 7 5 7 5 1 3 6 1. 2. 3. PA152, Vlastislav Dohnal, FI MUNI, 2023 49 B+-tree: Split non-leaf node 5 6 7 5 7 1 3 41. 2. 3. 4 5 6 5 7 1 3 7 4 5 74 4 5 6 4 1 3 7 7 5 5 4 7 4 5 61 3 7 4. n=3, insert key 4 PA152, Vlastislav Dohnal, FI MUNI, 2023 50 B+-tree: Deletion ◼ Find leaf node and delete key  Including the corresponding record  Delete node if empty, … ◼ Deletion cases: a) No reorganization (leaf is not “underfilled”) b) Coalesce with neighbor (sibling node) and delete node c) Redistribute keys between neighbors (without node deletion) d) Cases (b) and (c) for non-leaf nodes PA152, Vlastislav Dohnal, FI MUNI, 2023 51 B+-tree: n=5 ◼ Delete key 50 Coalesce (merge) keys into a neighbor and delete node 10 40 100 10 20 30 40 50 40 PA152, Vlastislav Dohnal, FI MUNI, 2023 52 B+-tree: n=5 ◼ Delete key 50 Redistribute keys between neighbors (avoid node deletion) 10 40 100 10 20 30 35 40 50 35 35 PA152, Vlastislav Dohnal, FI MUNI, 2023 53 B+-tree: n=5 ◼ Delete key 37 Redistribute keys between neighboring nonleaf nodes 40 45 30 37 25 26 20 22 10 14 1 3 10 20 30 40 25 25New root 30 40 PA152, Vlastislav Dohnal, FI MUNI, 2023 54 B+-tree: Deletion ◼ Practice: Coalescing often not implemented ◼ More inserts than deletes (both random) leads to utilization of 65-69% even if nodes not merged Too complex and low impact PA152, Vlastislav Dohnal, FI MUNI, 2023 55 B+-tree vs. Conventional index ◼ Block size 4 KiB  Key = 4B, pointer to block/rec = 4B  Multilevel secondary index ◼ sparse: 512 keys and pointers to a block ◼ dense: 512 keys and pointers to records  B+-tree ◼ non-leaf node: 512 pointers to other nodes ◼ leaf: 511 pointers to records ◼ Comparison in records in a relation:  Full 2-level indexes: (1st level == 1 block) ◼ Sec. index: up to 262 144 records (512h)  up to 1 048 576 records if implicit indexes are used ◼ B+-tree: up to 261 632 records (512h-1 . 511) ◼ Prim. index (all sparse levels): up to 512h+1 records PA152, Vlastislav Dohnal, FI MUNI, 2023 56 B+-tree vs. Conventional index ◼ Conclusion:  B+-tree has larger space overhead ☺ Is dynamic, but may not be physically sequential  B+-tree – more complex locking  Conventional index must be reorganized as whole ◼ DBMS does not know when to reorganize ☺ B+-tree makes small local reorganizations  Conventional index needs large reorganizations  Buffer manager ☺ B+-tree – fixed buffer requirements (log depth)  Conventional index – must use overflow blocks to be efficient  Linear complexity due to overflow areas ◼ LRU is no good for B+-trees! ◼ B+-tree is a better organization. PA152, Vlastislav Dohnal, FI MUNI, 2023 57 B-tree (without +) ◼ Idea: no key replication → record pointer also in non-leaf nodes Different constraints on key values in subtrees K1 P1 K2 P2 K3 P3 Pointers to: … Node with keys k < K1 Node with keys K1 < k < K2 Node with keys K2 < k < K3 Node with keys K3 < k Record with K1 Record with K2 Record with K3 PA152, Vlastislav Dohnal, FI MUNI, 2023 58 B-tree: Example ◼ Leaf chaining cannot be used n=3 65 125 145 165 85 105 25 45 10 20 30 40 110 120 90 100 70 80 170 180 50 60 130 140 150 160 No! Not present! PA152, Vlastislav Dohnal, FI MUNI, 2023 59 B-tree ◼ Occupation constrains Max pointers Min pointers Max keys Min keys Non-leaf (non-root) n (children) n/2 (children) n-1 (keys and pointers) n/2 -1 (keys and pointers) Non-leaf (root) n (children) 2 (children) n-1 (keys and pointers) 1 (keys and pointers) Leaf (non-root) n-1 (records) (n-1)/2 (records) n-1 (record pointers) (n-1)/2 (record pointers) Leaf (root) n-1 (records) 0 (records) n-1 (record pointers) 0 (record pointers) PA152, Vlastislav Dohnal, FI MUNI, 2023 60 Comparison: B-tree and B+-tree ◼ Sizes Block = 4KiB Pointer = 4 bytes Key = 4 bytes ◼ Assume a full 2-level tree 1 root and leaves Each node in one block PA152, Vlastislav Dohnal, FI MUNI, 2023 61 Comparison: B-tree ◼ Root: 341 keys + 341 record pointers 342 pointers to child nodes (blocks) ◼ 341·(4+4) + 342·4 = 4096 bytes ◼ Leaf: 512 keys + 512 record pointers ◼ 512 · (4+4) = 4096 bytes ◼ Total records: 341 + 342 · 512 = 175 445 recs PA152, Vlastislav Dohnal, FI MUNI, 2023 62 Comparison: B+-tree ◼ Root: 511 keys, 512 block pointers ◼ 511·4 + 512·4 = 4092 bytes ◼ Leaf: 511 keys + 511 record pointers ◼ 511 · (4+4) + 4 = 4092 bytes ◼ Total records: 512 · 511 = 261 632 recs PA152, Vlastislav Dohnal, FI MUNI, 2023 63 Comparison: Result ◼ Read I/Os: B-tree ◼ P1 read = 341 / 175 445 = 0,2% ◼ P2 reads = 1 - P1 read = 99,8% B+-tree ◼ P2 reads = 100% PA152, Vlastislav Dohnal, FI MUNI, 2023 64 Comparison: Result ◼ B-trees  Faster lookup ◼ Not always, can be deeper (see prev. slide)  Different formats of non-leaf & leaf nodes  Deletion more complicated ➔ B+-trees preferred! PA152, Vlastislav Dohnal, FI MUNI, 2023 65 B+-tree ◼ B+-tree as file Leaves store the records themselves. ◼ Duplicate keys Pointers in leaves = pointers to buckets ◼ i.e., blocks with a list of record pointers with the same key value ◼ Variable-length key values (e.g., strings) Store completely → low arity, varying arity, … Use prefixes (prefix compression) Lecture’s Takeaways ◼ Efficiency of B+ trees ◼ Handling duplicate keys i.e., multiple records with the same key value. ◼ Revision of terminology Dense / sparse index Primary / secondary index ◼ Clustered / non-clustered index Covering index PA152, Vlastislav Dohnal, FI MUNI, 2023 66