前文
JOIN
条件範围时,执行计画预估值容易不準确,这也间接导致查询效能不好.
就算有建立Index也会遇到上述问题
假如我们想要提升JOIN
条件範围效能并让Index可以发挥最大最用可以怎么做?
就让我利用一个範例来跟大家分享.
本文同步发表Blog JOIN範围条件Index优化
案例
此範例有使用到三张表
Product表:拥有1-10编号产品ReportPeriod表:存放产每期报表的资讯(时间,和是否产报表)T99表:线上产品订单资讯CREATE TABLE [dbo].[Product]([ProductId] INT NOT NULL)CREATE TABLE [dbo].[T99]([TransactionId] [int] IDENTITY(1,1) NOT NULL,[Amount] DECIMAL(18,6),[CreateDate] [datetime2](3) NULL)GOCREATE TABLE [dbo].[ReportPeriod]([PerioidID] [int] IDENTITY(1,1) NOT NULL,[ProductId] INT NOT NULL,[IsGenerate] [bit] NULL,[StartDate] [datetime2](3) NULL,[EndDate] [datetime2](3) NULL) ON [PRIMARY]GO
我们利用
T99.CreateDate
来跟ReportPeriod
判断是属于哪期报表.
资料初始化
我们利用乱数产生Sample资料来模拟线上大资料状况.
ReportPeriod
期别由'2019-08-01'
到'2020-07-31'
因为产生报表以5分钟为区间,所以可以利用CTE递迴来帮我们产生资料.
INSERT INTO [dbo].[Product] VALUES (1);INSERT INTO [dbo].[Product] VALUES (2);INSERT INTO [dbo].[Product] VALUES (3);INSERT INTO [dbo].[Product] VALUES (4);INSERT INTO [dbo].[Product] VALUES (5);INSERT INTO [dbo].[Product] VALUES (6);INSERT INTO [dbo].[Product] VALUES (7);INSERT INTO [dbo].[Product] VALUES (8);INSERT INTO [dbo].[Product] VALUES (9);INSERT INTO [dbo].[Product] VALUES (10);declare @FromDate DATETIME2(3) = '2019-08-01'declare @ToDate DATETIME2(3) = '2020-07-31';WITH CTE AS (SELECT @FromDate fromDt,@ToDate endDtUNION ALLSELECT DATEADD(MINUTE,5,fromDt),endDtFROM CTE WHERE DATEADD(MINUTE,5,fromDt) < endDt)INSERT INTO [dbo].[ReportPeriod] ([ProductId],[IsGenerate],[StartDate],[EndDate])SELECT ProductId,0,fromDt,DATEADD(MINUTE,5,fromDt) FROM CTE CROSS JOIN dbo.ProductOPTION (MAXRECURSION 0); INSERT INTO T99 ([CreateDate],Amount)SELECT top 1000000 dateadd(SECOND, rand(checksum(newid()))*(1+datediff(SECOND, @FromDate, @ToDate)), @FromDate), CAST(RAND(CHECKSUM(NEWID())) * 100000 as INT) + 1FROM sys.all_columns c1 CROSS JOIN sys.all_columns c2
建立Index
CREATE CLUSTERED INDEX [CIX_ReportPeriod_StartDate] ON [dbo].[ReportPeriod]([StartDate] ASC, [EndDate] ASC)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]ALTER TABLE [dbo].[ReportPeriod] ADD CONSTRAINT [PK_ReportPeriod] PRIMARY KEY NONCLUSTERED ([PerioidID] ASC,[ProductId] ASC)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]CREATE CLUSTERED INDEX [CIX_T99_CreateDate] ON [dbo].[T99]([CreateDate] ASC)CREATE UNIQUE NONCLUSTERED INDEX [IX_T99_TransactionId] ON [dbo].[T99]([TransactionId] ASC)
查询语法
在线上我们会使用UDT当作参数来取得某些期别资讯.
这里为了方便模拟我使用Table Variable来取代.
set nocount onDECLARE @Transaction AS TABLE(TransactionId INT,ProductId INT);INSERT INTO @Transaction VALUES (1,1)INSERT INTO @Transaction VALUES (101,2)INSERT INTO @Transaction VALUES (1001,3)SELECT p.*FROM dbo.T99 t JOIN [dbo].[ReportPeriod] p ON t.CreateDate BETWEEN p.StartDate AND p.EndDateJOIN @Transaction t1 ON t.TransactionId = t1.TransactionId AND p.ProductId = t1.ProductId
使用上面语法我们只需查询三个期别资料,但看执行计画时能发现,ReportPeriod使用的Clustered预估资讯有1百多万笔
Q:我明明有对于条件建立Index,但为什么预估值却会跑真那么严重?
CREATE CLUSTERED INDEX [CIX_ReportPeriod_StartDate] ON [dbo].[ReportPeriod]([StartDate] ASC, [EndDate] ASC)
原因出在範围条件会因为查找範围过大导致预估值不準确
甚么意思? 让我们看看下图(代表ReportPeriod
内含日期资料)
而我们在JOIN
条件只有t.CreateDate BETWEEN p.StartDate AND p.EndDate
这就会导致,我们需要查找ReportPeriod
日期资料在挑出符合的资料
JOIN [dbo].[ReportPeriod] p ON t.CreateDate BETWEEN p.StartDate AND p.EndDateJOIN @Transaction t1 ON t.TransactionId = t1.TransactionId AND p.ProductId = t1.ProductId
最后就会看到走针的估计值
如何优化?
效能差问题,选择对Index和撰写合理的查询可以改善40%左右问题
我们思考一下如果可以把範围条件改成精準=
查找条件不就可以更精準预估资讯了?
t.CreateDate BETWEEN p.StartDate AND p.EndDate
那我们怎么把上面条件使用=
取代BETWEEN
範围查询呢?
这时我们可以利用空间来换取时间
建立一个新的COLUMN
运用算法来计算每个期数StartTime
例如:CreateDate = 2020/01/03 10:08:55
会归类在2020/01/03 10:05:00
中
ALTER TABLE dbo.T99 ADD PeriodDate AS DATEADD(MINUTE,DATEPART(MINUTE,CreateDate) %5 * -1,DATETIMEFROMPARTS(DATEPART(YEAR,CreateDate),DATEPART(MONTH,CreateDate),DATEPART(DAY,CreateDate),DATEPART(HOUR,CreateDate),DATEPART(MINUTE,CreateDate),0,0))
建立完新COLUMN
后别忘记加入一个Index
给此COLUMN
.
CREATE INDEX IX_PeriodDate_T99 ON dbo.T99(PeriodDate)
最后我们修改一下查询语法
SELECT p.*FROM dbo.T99 t JOIN [dbo].[ReportPeriod] p ON p.StartDate = t.PeriodDateJOIN @Transaction t1 ON t.TransactionId = t1.TransactionId AND p.ProductId = t1.ProductId
预估值和读取值已经可以大幅降低了!!
小结:
在JOIN
範围条件差效能问题,可以思考一下是否有办法利用算法或是公式来优化查询效能,如此次範例一样.