Tags

, , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ,


How SQL SERVER handles duplicate values in an index?

Today let’s talk about how SQL Server handles duplicate values in a clustered index and unique constraints. In which cases it allows insertion of duplicate values and in which cases it throws an error. So let’s directly jumps on the demos-

CASE 1 – Duplicate Insertion in Primary Key Column

Here we are creating a simple table with 2 columns. Only point to be noted here is we have mentioned Primary Key on ID column.

--

CREATE TABLE testDuplicateIndexesPrimaryKey
(
	 ID INT PRIMARY KEY
	,NAME VARCHAR(10)
)
GO

--

Now check what indexes and constraints created on our table. There are some cases where SQL Server internally creates indexes or constraints.

--

EXEC sp_helpindex 'testDuplicateIndexesPrimaryKey'
EXEC sp_helpconstraint 'testDuplicateIndexesPrimaryKey';

--

Output

Duplicate Insertion in Primary Key Column

Pawan Khowal – Duplicate Insertion in Primary Key Column

So in table creation we mentioned primary key. Now internally SQL Server created 2 objects.

• A Unique Clustered Index – PK__testDupl__3214EC27CB282EB4(clustered, unique, primary key located on PRIMARY)
• A Constraint –
PK__testDupl__3214EC27CB282EB4(PRIMARY KEY (clustered))

Note here that whenever you create a primary key SQL Server internally creates an index and a constraint. Now here unique clustered index is used to sort the data at leaf level (Physical sorting) using a B+ tree structure and constraint is used to handle the duplicate values. The Type of constraint here is primary key clustered.

Let’s prove it by running the below statement.

--

INSERT INTO testDuplicateIndexesPrimaryKey(ID,NAME) VALUES (1,'Pawan'),(2,'Avika'),(1,'Aisha')

--

Output

Msg 2627, Level 14, State 1, Line 11
Violation of PRIMARY KEY constraint ‘PK__testDupl__3214EC27CB282EB4’. Cannot insert duplicate key in object ‘dbo.testDuplicateIndexesPrimaryKey’. The duplicate key value is (1). The statement has been terminated.

Now the message we are getting is that we are violating primary key constraint (PK__testDupl__3214EC27CB282EB4). Here It does not provide us any information about the clustered index we have.

CASE 2 –Duplicate Insertion in a Column having unique constraint

Here we are creating a simple table with 2 columns. Only point to be noted here is we have unique constraint on name column.

--

CREATE TABLE testDuplicateIndexesConstraint
(
	 ID INT
	,NAME VARCHAR(10)
	CONSTRAINT Ak_Name UNIQUE (Name)
)
GO

/*

---Alternate syntax to create Constraint

CREATE TABLE testDuplicateIndexesConstraint
(
	 ID INT
	,NAME VARCHAR(10)
)
GO

ALTER TABLE testDuplicateIndexesConstraint
ADD CONSTRAINT Ak_Name UNIQUE (Name); 

*/

--

Now check what indexes and constraints created on our table.

--

EXEC sp_helpindex 'testDuplicateIndexesConstraint'
EXEC sp_helpconstraint 'testDuplicateIndexesConstraint';

--
Pawan Khowal - Duplicate Insertion in a Column having unique constraint

Pawan Khowal – Duplicate Insertion in a Column having unique constraint

So by mentioning a unique constraint SQL Server creates 2 objects
1. A Non Clustered Index (Index Description – nonclustered, unique, unique key located on PRIMARY)
2. A Constraint (Constraint_ type – UNIQUE (non-clustered))

Please note both index and constraint are of type NonClustered. Check out the constraint type and index description for details. Now let’s execute the below query and see what error we will get and the object mentioned in there.

--

INSERT INTO testDuplicateIndexesConstraint(ID,NAME) VALUES (1,'Pawan'),(2,'Pawan')

--

Output

Msg 2627, Level 14, State 1, Line 30
Violation of UNIQUE KEY constraint ‘Ak_Name’. Cannot insert duplicate key in object ‘dbo.testDuplicateIndexesConstraint’. The duplicate key value is (Pawan). The statement has been terminated.

Now the message we are getting is that we cannot insert duplicate values in Unique key constraint which we mentioned while creating the table. Here It does not provide us any information about the nonclustered index we have.

CASE 3 –Duplicate Insertion in Column having Unique Clustered Index

--

CREATE TABLE testDuplicateIndexes
(
	 ID INT
	,NAME VARCHAR(10)
)
GO

CREATE UNIQUE CLUSTERED INDEX Ix_ID ON testDuplicateIndexes(ID)
GO

--

Now check what indexes and constraints created on our table.

EXEC sp_helpindex ‘testDuplicateIndexes’
EXEC sp_helpconstraint ‘testDuplicateIndexes’;

Pawan Kumar Khowal - Duplicate Insertion in Column having Unique Clustered Index

Pawan Kumar Khowal – Duplicate Insertion in Column having Unique Clustered Index

Okay, in this SQL Server only creates a unique clustered index. Now let’s execute the below statement and see what error we will get.

--

INSERT INTO testDuplicateIndexes(ID,NAME) VALUES (1,'Pawan')
,(2,'Avika')
,(1,'Aisha')

--

Output

Msg 2601, Level 14, State 1, Line 13
Cannot insert duplicate key row in object ‘dbo.testDuplicateIndexes’ with unique index ‘Ix_ID’. The duplicate key value is (1). The statement has been terminated.

In this case it is saying that it cannot insert duplicate values with index Ix_Id. Here it is using clustered index to handle duplicate values. There are not constraints involved here.

So far we have seen cases where indexes or constraints are used to restrict the insertion of duplicate values. Now let’s see cases where we can successfully insert the duplicate values and how SQL Server handles duplicate values in an index

CASE 4 –Duplicate Insertion in Column having Non-Unique Clustered Index

Here we are creating a simple table with 2 columns. Only point to be noted here is we have mentioned non unique clustered index on ID column.

--

CREATE TABLE testDuplicateIndexesNonUnique
(
	 ID INT
	,NAME VARCHAR(10)
)
GO

CREATE CLUSTERED INDEX Ix_ID ON testDuplicateIndexesNonUnique(ID)
GO

--

Now check what indexes and constraints created on our table.

--

EXEC sp_helpindex 'testDuplicateIndexesNonUnique'
EXEC sp_helpconstraint 'testDuplicateIndexesNonUnique';

--
Pawan Kumar Khowal - Duplicate Insertion in Column having Non-Unique Clustered Index

Pawan Kumar Khowal – Duplicate Insertion in Column having Non-Unique Clustered Index

In this case SQL Server only creates a non-unique clustered index. Now let’s see if we can insert duplicate values or not.

--

INSERT INTO testDuplicateIndexesNonUnique(ID,NAME) VALUES (1,'Pawan'),(2,'Avika'),(1,'Aisha')

--
Pawan Kumar Khowal - Duplicate Insertion in Column having Non-Unique Clustered Index

Pawan Kumar Khowal – Duplicate Insertion in Column having Non-Unique Clustered Index

It allows us to insert duplicate values. Now the question is how SQL Server internally handles this.

Okay so the point to be noted here is that we can create a clustered index on columns that contains duplicate values. Clustered index in this case should be Non Unique in this case. Now in this case SQL Server internally adds a four digit unique identifier after the column value to maintain the uniqueness internally.

Pawan Khowal - How SQL Server handles duplicate values in an Index

In the above example we have a non-unique clustered index on Name column. Now for example Manoj, For manoj we have 2 values. For the first value the data will be inserted as it is. For the second manoj value SQL internally adds a 4 digit like shown in the above example. So first value will be manoj and second value will be manoj7654. User will not be affected by this.

So internally it will increase the space taken by the index. Hence always try to create unique clustered indexes on less percentage of duplicate values.

Summary

If you create a PRIMARY KEY SQL Server internally creates 2 objects

  • A Unique Clustered Index (clustered, unique, primary key located on PRIMARY)
  • A Constraint – (PRIMARY KEY (clustered))
  • If you try to insert duplicate values this type of error will be thrown – Violation of PRIMARY KEY constraint

If you create Unique Constraint SQL Server internally creates 2 objects

  • A non clustered index – (Index Description – nonclustered, unique, unique key located on PRIMARY)
  • A Constraint (Constraint_ type – UNIQUE (non-clustered))
  • If you try to insert duplicate values this type of error will be thrown – Violation of UNIQUE KEY constraint

If you create a Unique Clustered Index SQL Server internally create single object

  • If you try to insert duplicate values this type of error will be thrown – Cannot insert duplicate key row in object ‘dbo.testDuplicateIndexes’ with unique index ‘Ix_ID’

If you create a NON-Unique Clustered Index SQL Server internally create single object

  • Here duplicate and non duplicate data will be inserted successfully. SQL Server here internally add a 4 digit integer value to the original value to maintain uniqueness, but this will increase the index size.

I hope you have enjoyed the article. Thanks for reading

-Pawan Khowal

MSBISkills.com