Using constraint improves query performance

We know that the constraints are used to enforce the data integrity. A CHECK constraint is used to limit the value being entered into a column, similarly, unique constraint is used to avoid duplicate value being entered into a column. This post will cover how certain queries will get benefit with the presence of unique constraint.

Two tables are created for this demonstration, both tables are created with similar definition except constraint. A unique constraint is defined only in one table.

CREATE TABLE dbo.OrderDtl 
(
orderId INT IDENTITY(1,1) NOT NULL,
ItemId INT NOT NULL,
itemQty INT
);

CREATE TABLE dbo.OrderDtl_uq
(
orderId INT IDENTITY(1,1) NOT NULL,
ItemId INT not null,
itemQty INT
CONSTRAINT UQ_OrderDtl_order_item_id UNIQUE(orderId,ItemId)  
);

INSERT INTO dbo.OrderDtl_uq
VALUES (1,10),(2,20),(3,10),(1,5);

INSERT INTO dbo.OrderDtl
VALUES (1,10),(2,20),(3,10),(1,5);

The tables have been loaded with few records, the data in OrderId and ItemId when combine together makes the record unique. The table OrderDtl_uq never allows duplicate records because of the existence of unique constraint for the columns OrderId and ItemId. Whereas uniqueness is not guaranteed for the OrderDtl table as unique constraint is not defined.

Let’s run the simple SELECT statement against two tables and verify the execution plan

SELECT OrderId, ItemId, ItemQty FROM dbo.OrderDtl
SELECT OrderId, ItemId, ItemQty FROM dbo.OrderDtl_uq

It returned same result and execution plan is also same for both the queries, the optimizer picked table scan operator for both queries and it doesn’t benefit with the existence of constraint.
Let’s look at another example

SELECT OrderId, ItemId, SUM(ItemQty) ItemQty 
FROM dbo.OrderDtl
GROUP BY OrderId, ItemId

SELECT OrderId, ItemId, SUM(ItemQty) ItemQty 
FROM dbo.OrderDtl_uq
GROUP BY OrderId, ItemId

The plan shows that the table with constraint is faster because the optimizer knows that each row is unique based on the columns used in the group query, so it opted to just scan the table and return the result.
The table without unique constraint has high cost compare to the second query plan. Because optimizer doesn’t know each row is unique even though they are, so it decided to scan the table to fetch all the records and sort them using sort operator followed by an aggregate operator.

Conclusion

Though the purpose of constraint is to enforce the data integrity, the presence of constraint is useful in query performance for some type of queries.

Leave a Reply

Your email address will not be published. Required fields are marked *

%d bloggers like this: