How GUIDs can cause fragmentation in clustered indexes
Reference – http://www.sqlskills.com/blogs/paul/can-guid-cluster-keys-cause-non-clustered-index-fragmentation/
A GUID key generated with NEWID() causes fragmentation (meaning the data is stored non-contiguously on disk) because its values are random and the key itself is large. A GUID is 16 bytes, four times the size of a 4-byte integer. Every time you insert a row into the index, SQL Server has to find the insertion point in the B-tree, and because the key value is random, the insertion point is random too.
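As a quick check of the sizes mentioned above, here is a minimal sketch (the column aliases are made up for illustration). It compares the storage size of a GUID with that of an integer, then generates a few NEWID() values to show that consecutive values follow no usable order:

```sql
-- Storage size of a GUID versus a 4-byte integer
SELECT DATALENGTH(NEWID())        AS GuidBytes,  -- 16
       DATALENGTH(CAST(1 AS INT)) AS IntBytes;   -- 4

-- Each call to NEWID() returns an unrelated value, so rows keyed on it
-- land at effectively random positions in the index.
SELECT NEWID() AS RandomGuid
UNION ALL SELECT NEWID()
UNION ALL SELECT NEWID();
```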
This means that if an index page is full, a random insert that happens to have to go onto that page will cause a page split to make room for the new record. A page-split is where a new page is allocated and (as near as possible to) half the rows from the splitting page are moved to the new page. The new row is then inserted into one of the two pages, determined by the key value. Usually the newly allocated page is not physically contiguous to the splitting page, and so fragmentation has been caused.
In this case *two* kinds of fragmentation have been caused:
1. Logical fragmentation – Here the next logical page, as determined by the index order, is not the next physical page in the data file.
2. Physical (or internal) fragmentation – Here space is being wasted on index pages, so page density is low.
Both can hurt query performance, and that comes on top of the cost of doing the page split in the first place.
Check out the example below – I’ll create a clustered index on a GUID column. Let’s see what happens when we insert 10,000 rows:
CREATE TABLE TestClusteredKeyFragmentation
(
    ID    UNIQUEIDENTIFIER DEFAULT NEWID (),
    Dates DATETIME DEFAULT GETDATE ()
)

CREATE CLUSTERED INDEX Ix_ID ON TestClusteredKeyFragmentation (ID)
GO

INSERT INTO TestClusteredKeyFragmentation DEFAULT VALUES
GO 10000

--Execute the below query and check the details
SELECT OBJECT_NAME (ips.[object_id]) AS 'Object Name',
       si.name AS 'Index Name',
       ROUND (ips.avg_fragmentation_in_percent, 2) AS 'Fragmentation',
       ips.page_count AS 'Pages',
       ROUND (ips.avg_page_space_used_in_percent, 2) AS 'Page Density'
FROM sys.dm_db_index_physical_stats (DB_ID('InMemory'), NULL, NULL, NULL, 'DETAILED') ips
CROSS APPLY sys.indexes si
WHERE si.object_id = ips.object_id
  AND si.index_id = ips.index_id
  AND ips.index_level = 0
  AND OBJECT_NAME (ips.[object_id]) = 'TestClusteredKeyFragmentation'
GO
The TestClusteredKeyFragmentation clustered index is 98.41% fragmented, with around 35% of the space wasted on each page. This is why we should never use GUIDs as keys. A GUID uses 16 bytes, four times the size of a normal integer field, and because the clustering key is also stored in every nonclustered index row, that extra space is wasted everywhere. GUIDs are also more expensive to join on and take more time to look up.
We should always use integer fields for the primary key if possible. If you want to learn more on this topic, read Paul's blog (Paul is the best in SQL Server): http://www.sqlskills.com/blogs/paul/can-guid-cluster-keys-cause-non-clustered-index-fragmentation/
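For comparison, the same test can be repeated with an integer key. The sketch below assumes a hypothetical table, TestIdentityKeyFragmentation, and a hypothetical index name, Ix_Identity_ID, with an IDENTITY(1,1) column as the clustered key. Every new value is larger than the last, so rows are appended at the end of the index instead of landing at random points. The exact numbers will depend on your system, but fragmentation should come out far lower and page density far higher than in the GUID test.

```sql
-- Hypothetical comparison table: same shape as the GUID test,
-- but clustered on an ever-increasing integer.
CREATE TABLE TestIdentityKeyFragmentation
(
    ID    INT IDENTITY(1,1) NOT NULL,
    Dates DATETIME DEFAULT GETDATE()
);

CREATE CLUSTERED INDEX Ix_Identity_ID ON TestIdentityKeyFragmentation (ID);
GO

INSERT INTO TestIdentityKeyFragmentation DEFAULT VALUES;
GO 10000

-- Same check as in the GUID test, limited to the current database and this table
SELECT  OBJECT_NAME(ips.[object_id])                 AS [Object Name],
        si.name                                      AS [Index Name],
        ROUND(ips.avg_fragmentation_in_percent, 2)   AS [Fragmentation],
        ips.page_count                               AS [Pages],
        ROUND(ips.avg_page_space_used_in_percent, 2) AS [Page Density]
FROM    sys.dm_db_index_physical_stats(
            DB_ID(), OBJECT_ID('TestIdentityKeyFragmentation'),
            NULL, NULL, 'DETAILED') AS ips
JOIN    sys.indexes AS si
        ON  si.[object_id] = ips.[object_id]
        AND si.index_id    = ips.index_id
WHERE   ips.index_level = 0;   -- leaf level only
GO
```

Passing DB_ID() and OBJECT_ID(...) to the function restricts the scan to one table, which keeps the DETAILED mode cheap on a database with many objects.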
Keep learning. We all need to learn.
Gregor Kralj said:
How would you construct the table if you need a GUID identifier key that should not be the primary key?
A primary key as an IDENTITY(1,1) clustered index,
and the GUID identifier as a nonclustered index?
Gregor Kralj said:
Regarding my previous question: the point is that you need to ensure fast searching by GUID, and you cannot change the existing data.