Database Index Storage & Page Splits

1) How is an index stored in a database?
2) Visual diagram of B+Tree, clustered vs non-clustered
3) Page splits – explanation
4) When nearby disk locations are full – fragmentation
5) Visual diagram of remote allocation after split

How Indexes Are Stored in a Database

In a relational database, an index is a specialized data structure that maps KEY → ROW LOCATION to speed up reads. Most engines store indexes as B/B+‑Trees made of pages (root, branch, leaf). Clustered indexes keep the data rows in the leaf pages; non‑clustered indexes keep the key plus a row locator.

Pages (e.g., 8 KB in SQL Server, 16 KB in InnoDB) live in their own index allocation, are cached in the buffer pool, and leaf pages are doubly-linked to support efficient range scans. Maintenance operations (rebuild/reorganize), statistics, and page splits keep the tree balanced.

Root → Branch → Leaf (sorted keys)
Leaf (clustered): [key | full row]
Leaf (non-clustered): [key | row locator]

Visual Representation: B+Tree Structure

The following diagram illustrates B+Tree storage with both clustered and non‑clustered leaf pages.

B+Tree index storage: root/branch/leaf, clustered vs non-clustered leaf pages.

Figure 1 — B+Tree structure. Clustered leaf pages store full rows; non‑clustered store row locators.

Understanding Page Splits

A page split occurs when inserting into a full leaf page. The engine allocates a new page, moves about half the rows, inserts the new key, and updates parent pointers so the B‑Tree remains balanced.

Before: [10, 20, 30]  (insert 25 → no space)
Split  : allocate new page
After : [10, 20]  |  [25, 30]
Parent updated with new child pointer

Splits cause extra I/O and can lead to logical fragmentation if the new page isn't allocated near the original. Use sequential keys, appropriate FILLFACTOR, and periodic maintenance to reduce impact.

Page Splits When Nearby Disk Locations Are Full

If the extent containing the original page has no free space, the engine allocates the new page from another extent (possibly far away). Logical adjacency is preserved via B‑Tree links, but physical locality is lost, increasing fragmentation and hurting range‑scan performance. Rebuild/reorganize and lower fill factor can help.

Old extent (full) → allocate new page in another extent
Logical leaf order remains; physical order diverges (fragmentation)

Visual Representation: Remote Page Allocation

The following diagram shows page split with remote allocation, illustrating the contrast between logical links and physical separation.

Page split when nearby extent is full: logical link across physically distant extents.

Figure 2 — After split, new leaf page is allocated in a distant extent; the leaf chain remains logically adjacent.

Table of Contents

How Indexes Are Stored in a Database

Visual Representation: B+Tree Structure

Understanding Page Splits

Page Splits When Nearby Disk Locations Are Full

Visual Representation: Remote Page Allocation