Impact of using local variable in Stored Procedure

Preparing an optimal execution plan for a given SQL query is crucial, and the Query Optimizer’s ability to do so heavily relies on the accuracy and freshness of the statistics associated with the database objects used in the query. When these statistics are inaccurate or out-of-date, it can result in the generation of suboptimal execution plans. Additionally, the use of local variables in SQL queries can also have a significant impact on the estimation process, potentially leading to non-optimal execution plans. for example, incorrect estimations can cause the optimizer to choose a clustered index scan over non clustered index seek, which can have an unfavorable effect on query performance.

Query compilation
During procedure compilation, SQL Server estimates the number of rows based on the parameter value and prepares an execution plan. However, when a local variable is used within the stored procedure, the value of the local variable is unknown at the time of compilation. As a result, SQL Server estimates the rows, based on an approximation, and a plan generated based on the approximation may not be optimal.

Example
Let’s look at the following two procedures and their execution plans to gain a better understanding of this behavior. The first procedure uses the parameter in the WHERE clause as is, while in the second procedure, the parameter value is assigned to a local variable in the procedure body. Since the assignment of the parameter to a local variable can only happen during runtime, the effective value of the local variable is unknown at the time of compilation.

CREATE PROCEDURE getCharge (@providerId INT)
AS 
BEGIN
SET NOCOUNT ON

	SELECT c.provider_no, p.* , m.* 
	FROM [dbo].[charge] c JOIN [dbo].[provider] p
	ON c.provider_no=p.provider_no
	JOIN member m 
	ON c.member_no = m.member_no
	WHERE c.provider_no = @providerId

SET NOCOUNT OFF
END

CREATE PROCEDURE getChargeLocalVar (@providerId INT)
AS 
BEGIN
SET NOCOUNT ON
	
	DECLARE @pid INT = @providerId

	SELECT c.provider_no, p.* , m.* 
	FROM [dbo].[charge] c JOIN [dbo].[provider] p
	ON c.provider_no=p.provider_no
	JOIN member m 
	ON c.member_no = m.member_no
	WHERE c.provider_no = @pid

SET NOCOUNT OFF
END

Result of statistics io and execution
The logical reads of the first query are relatively low compared to the second query which uses local variable. The use of local variable caused the SQL Server to build completely inefficient plan for which the query cost is 98% whereas the former query has only 2% query cost.

EXEC getCharge 24

Table 'charge'. Scan count 1, logical reads 51, physical reads 34,
Table 'member'. Scan count 0, logical reads 32, physical reads 2, 
Table 'provider'. Scan count 0, logical reads 2, physical reads 2,

EXEC getChargeLocalVar 24

Table 'charge'. Scan count 9, logical reads 9453, physical reads 0
Table 'member'. Scan count 9, logical reads 427, physical reads 0,
Table 'provider'. Scan count 0, logical reads 2, physical reads 0,


The SQL Server’s decision to build the execution plan shown in the above image was influenced by the ‘estimated number of rows.’ In this plan, SQL Server chose to scan the entire Clustered Index, leading to the introduction of parallelism, bitmap, and hash match operators. However, it’s worth noting that the same result could have been achieved using a simple loop join, similar to the approach taken in the first procedure.

Leave a Reply

Your email address will not be published. Required fields are marked *

%d bloggers like this: