Index Fragmentation Analysis
Causes of Fragmentation
Fragmentation occurs when data is modified in a table. When you insert or update data in a table (via INSERT or UPDATE), the table’s corresponding clustered indexes and the affected nonclustered indexes are modified. This can cause an index leaf page split if the modification to an index can’t be accommodated in the same page. A new leaf page will then be added that contains part of the original page and maintains the logical order of the rows in the index key. Although the new leaf page maintains the logical order of the data rows in the original page, this new page usually won’t be physically adjacent to the original page on the disk. Or, put a slightly different way, the logical key order of the index doesn’t match the physical order within the file.
SQL Server 2014 exposes the leaf and nonleaf pages and other data through a dynamic management view called sys.dm_db_index_physical_stats. It stores both the index size and the fragmentation.
Fragmentation Overhead
Both internal and external fragmentations adversely affect data retrieval performance. External fragmentation causes a noncontiguous sequence of index pages on the disk, with new leaf pages far from the original leaf pages and with their physical ordering different from their logical ordering.
For better performance, it is preferable to use sequential I/O, since this can read a whole extent (eight 8KB pages together) in a single disk I/O operation. By contrast, a noncontiguous layout of pages requires nonsequential or random I/O operations to retrieve the index pages from the disk, and a random I/O operation can read only 8KB of data in a single disk operation (this may be acceptable, however, if you are retrieving only one row).
Analyzing the Amount of Fragmentation
You can analyze the fragmentation ratio of an index by using the sys.dm_db_index_physical_ stats
dynamic management function.
SELECT ddips.avg_fragmentation_in_percent,
ddips.fragment_count,
ddips.page_count,
ddips.avg_page_space_used_in_percent,
ddips.record_count,
ddips.avg_record_size_in_bytes
FROM sys.dm_db_index_physical_stats(DB_ID('AdventureWorks2012'),
OBJECT_ID(N'Person.Person'),NULL,
NULL,'Sampled') AS ddips;
Fragmentation Resolutions
You can resolve fragmentation in an index by rearranging the index rows and pages so that their physical and logical orders match. To reduce external fragmentation, you can physically reorder the leaf pages of the index to follow the logical order of the index. You achieve this through the following techniques:
- Dropping and re-creating the index
- Re-creating the index with the DROP_EXISTING = ON clause
- Executing the ALTER INDEX REBUILD statement on the index
- Executing the ALTER INDEX REORGANIZE statement on the index
Significance of the Fill Factor
The internal fragmentation of an index is reduced by getting more rows per leaf page in an index. Getting more rows within a leaf page reduces the total number of pages required for the index and in turn decreases disk I/O and the logical reads required to retrieve a range of index rows. On the other hand, if the index key values are highly transactional, then having fully used index pages will cause page splits. Therefore, for a transactional table, a good balance between maximizing the number of rows in a page and avoiding page splits is required
SQL Server allows you to control the amount of free space within the leaf pages of the index by using the fill factor. If you know that there will be enough INSERT queries on the table or UPDATE queries on the index key columns, then you can pre-add free space to the index leaf page using the fill factor to minimize page splits. If the table is read-only, you can create the index with a high fill factor to reduce the number of index pages.
The default fill factor is 0, which means the leaf pages are packed to 100 percent, although some free space is left in the branch nodes of the B-tree structure.
Automatic Maintenance
In a database with a great deal of transactions, tables and indexes become fragmented over time. Thus, to improve performance, you should check the fragmentation of the tables and indexes regularly, and you should defragment the ones with a high amount of fragmentation. You also may need to take into account the workload and defragment indexes as dictated by the load as well as the fragmentation level of the index. You can do this analysis for a database by following these steps:
- Identify all user tables in the current database to analyze fragmentation.
- Determine fragmentation of every user table and index.
- Determine user tables and indexes that require defragmentation by taking into account the following considerations:
- A high level of fragmentation where avg_fragmentation_in_percent is greater than 20 percent
- Not a very small table/index—that is, pagecount is greater than 8
- Defragment tables and indexes with high fragmentation.