B+-tree B-tree M-way search

Introduction

The MySQL index structure is based on the B+-tree, which is an extension of the B-tree. The B-tree itself is a type of M-way search tree.

Body

An M-way tree is a rooted tree in which each node has no more than m children. A binary tree is a special case where m = 2, and a ternary tree is another case with m = 3, limiting its children to three.

alt text

B-tree

B-tree is an M-way search tree with two special properties:

  1. It is perfectly balanced: every leaf node is at the same depth.
    (With a balanced tree, access1 is O(log n).
    With an unbalanced tree, access1 is O(n) (worst case).)

alt text

  1. Every node, except perhaps the root, is at least half-full, i.e. contains M/2 or more values (of course, it cannot contain more than M-1 values). The root may have any number of values (1 to M-1).

B+-tree

Difference Between B-Tree And B+ Tree

| B-tree | B+-tree |
| ——- | — | —– |
| Data is stored in leaf nodes as well as internal nodes. | Data is stored only in leaf nodes. |
| Leaf nodes cannot be linked together. | Leaf nodes are linked together to form a linked list.|
| Searching is a bit slower as data is stored in internal as well as leaf nodes. | Searching is faster as the data is stored only in the leaf nodes. |
| Deletion operation is complex. | Deletion operation is easy as data can be directly deleted from the leaf nodes. |
| No redundant search keys are present. | Redundant search keys may be present. |

alt text

Searching is faster as the data is stored only in the leaf nodes.

B+ trees don’t have data associated with interior nodes, more keys can fit on a page of memory. Therefore, it will require fewer cache misses in order to access data that is on a leaf node.

The leaf nodes of B+ trees are linked, so doing a full scan of all objects in a tree requires just one linear pass through all the leaf nodes. A B tree, on the other hand, would require a traversal of every level in the tree. This full-tree traversal will likely involve more cache misses than the linear traversal of B+ leaves.

B-tree Deletion operation is complex.

Data is stored in leaf nodes as well as internal nodes, so there are two different situations we need to deal with, and B+-tree only stores data in leaf nodes.

No redundant search keys are present.

B-tree only includes < > equal is not an option.

Reference

https://www.youtube.com/watch?v=aZjYr87r1b8&ab_channel=AbdulBari
https://www.youtube.com/watch?v=C_q5ccN84C8&ab_channel=FullstackAcademy

https://www.softwaretestinghelp.com/b-tree-data-structure-cpp/
https://webdocs.cs.ualberta.ca/~holte/T26/b-trees.html

https://stackoverflow.com/questions/59206128/balanced-vs-unbalanced-binary-tree-clarification-needed
https://stackoverflow.com/questions/870218/what-are-the-differences-between-b-trees-and-b-trees

https://en.wikipedia.org/wiki/M-ary_tree