Index Fragmentation Guide

In-depth explanation of database index fragmentation and maintenance

← Home

What is Index Fragmentation?

An index is typically implemented as a B-tree (balanced tree). Over time, data modifications (INSERT/UPDATE/DELETE) create two forms of fragmentation:

TypeMeaningAnalogyPrimary Causes
Internal FragmentationExcess free space inside pages; lower page densityHalf-empty boxes on a shelfDeletes; updates that enlarge variable-length rows; page splits that don’t fully refill
External FragmentationLogical order of leaf pages doesn’t match physical order in the fileBook pages out of order, forcing jumpsPage splits allocate new pages that are far away, repeated over time

Ideal vs Fragmented (ASCII)

Ideal (Logical = Physical)
 ┌───────────────┐→┌───────────────┐→┌───────────────┐
 │ Page 1: 1–100 │  │ Page 2:101–200│  │ Page 3:201–300│
 └───────────────┘  └───────────────┘  └───────────────┘

Fragmented (Logical ≠ Physical)
Logical: Page1 → Page2 → Page7 → Page3
Physical: Page1 → Page2 → Page3 → … → Page7

How Page Splits Create Fragmentation

1) Initial (Healthy) State

Physical order:
┌───────────────┐ ┌───────────────┐ ┌───────────────┐
│ Page #1       │→│ Page #2       │→│ Page #3       │
│ [1,2,3,4]     │ │ [5,6,7,8]     │ │ [9,10,11,12]  │
└───────────────┘ └───────────────┘ └───────────────┘

2) Insert Key 6.5 → Page #2 is Full → Split

Allocate new page (#7), move ~50% from Page #2:
┌───────────────┐ ┌───────────────┐ ┌───────────────┐ ┌───────────────┐
│ Page #1       │→│ Page #2       │→│ Page #7       │→│ Page #3       │
│ [1,2,3,4]     │ │ [5,6]         │ │ [6.5,7,8]     │ │ [9,10,11,12]  │
└───────────────┘ └───────────────┘ └───────────────┘ └───────────────┘

Logical order: #1 → #2 → #7 → #3
Physical order: #1 → #2 → #3 → … → #7   (≠ logical) → External fragmentation

3) After Many Modifications

Scattered leaves (example):
#1 [1–4], #9 [4.5–5], #2 [5–6], #7 [6.5–8], #3 [9–12]
Performance Impact: Range scans that could be sequential devolve into random I/O and cache churn. On SSDs the penalty is smaller than HDDs, but buffer/cache efficiency still drops.

Where the Split Happens in the B-Tree

B-Tree Layers

         (Root)
            │
     ┌──────┴──────┐
(Intermediate)   (Intermediate)
     │                │
 ┌───┴───┐        ┌───┴───┐
Leaf1  Leaf2     Leaf3  Leaf4

Leaf-Level Split (Insert 6.5)

                   [Root]
                     │
         ┌────────────┴────────────────┐
     [1–4] Page1                 [5–12] Page2
                                  │
                       ┌──────────┴──────────┐
                 [5–6] Page3          [6.5–8] Page7 (new)

Split Propagation Upward

                        [New Root]
                      /             \
                  [1–8]           [9–16]
                 /      \         /       \
           [1–4]       [5–8]  [9–12]   [13–16]

Fixing Fragmentation

ActionEffectWhen
REORGANIZEPhysically reorder leaf pages; compactLight to moderate fragmentation (e.g., < 30%)
REBUILDRecreate the B-tree in order; new pagesHigh fragmentation (e.g., ≥ 30%) or page density too low

Operational Tips

  • Choose a fill factor to leave room in pages where mid-range inserts are common (reduces splits).
  • Avoid hot-spot keys when possible; consider surrogate keys or append-only patterns.
  • Schedule maintenance during low-traffic windows; watch log growth and tempdb (for rebuilds).
  • Measure with the engine’s DMVs/metadata before and after (e.g., avg_fragmentation_in_percent in SQL Server).

Quick Analogy Reference

ConceptAnalogy
Page SplitInserting a page into the middle of a printed book
Internal FragmentationBlank space left on each page
External FragmentationBook’s pages shuffled out of physical order
RebuildReprinting the book in correct order

Appendix: Minimal SQL Server Scripts

Check Fragmentation

SELECT
    dbschemas.[name] AS SchemaName,
    dbtables.[name]  AS TableName,
    dbindexes.[name] AS IndexName,
    indexstats.avg_fragmentation_in_percent,
    indexstats.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'SAMPLED') AS indexstats
JOIN sys.tables dbtables           ON dbtables.[object_id] = indexstats.[object_id]
JOIN sys.schemas dbschemas         ON dbtables.[schema_id] = dbschemas.[schema_id]
JOIN sys.indexes dbindexes         ON dbindexes.[object_id] = indexstats.[object_id]
                                  AND dbindexes.[index_id] = indexstats.[index_id]
WHERE indexstats.database_id = DB_ID()
ORDER BY indexstats.avg_fragmentation_in_percent DESC;

Maintain (Heuristic)

-- Reorganize when < 30%, Rebuild when ≥ 30%
DECLARE @object_id INT = OBJECT_ID(N'dbo.YourTable');
DECLARE @index_id INT = 1; -- pick your index id

SELECT * FROM sys.dm_db_index_physical_stats(DB_ID(), @object_id, @index_id, NULL, 'SAMPLED');

-- Example actions:
ALTER INDEX IX_YourTable_YourKey ON dbo.YourTable REORGANIZE;
-- or
ALTER INDEX IX_YourTable_YourKey ON dbo.YourTable REBUILD WITH (ONLINE = ON);

Thresholds vary by workload and engine. Adjust for your environment.