JOIN範围条件Index优化

前文

JOIN条件範围时,执行计画预估值容易不準确,这也间接导致查询效能不好.

就算有建立Index也会遇到上述问题

假如我们想要提升JOIN条件範围效能并让Index可以发挥最大最用可以怎么做?

就让我利用一个範例来跟大家分享.

本文同步发表Blog JOIN範围条件Index优化

案例

此範例有使用到三张表

Product表:拥有1-10编号产品ReportPeriod表:存放产每期报表的资讯(时间,和是否产报表)T99表:线上产品订单资讯
CREATE TABLE [dbo].[Product]([ProductId] INT NOT NULL)CREATE TABLE [dbo].[T99]([TransactionId] [int] IDENTITY(1,1) NOT NULL,[Amount] DECIMAL(18,6),[CreateDate] [datetime2](3) NULL)GOCREATE TABLE [dbo].[ReportPeriod]([PerioidID] [int] IDENTITY(1,1) NOT NULL,[ProductId] INT NOT NULL,[IsGenerate] [bit] NULL,[StartDate] [datetime2](3) NULL,[EndDate] [datetime2](3) NULL) ON [PRIMARY]GO

我们利用T99.CreateDate来跟ReportPeriod判断是属于哪期报表.

资料初始化

我们利用乱数产生Sample资料来模拟线上大资料状况.

ReportPeriod期别由'2019-08-01''2020-07-31'

因为产生报表以5分钟为区间,所以可以利用CTE递迴来帮我们产生资料.

INSERT INTO [dbo].[Product] VALUES (1);INSERT INTO [dbo].[Product] VALUES (2);INSERT INTO [dbo].[Product] VALUES (3);INSERT INTO [dbo].[Product] VALUES (4);INSERT INTO [dbo].[Product] VALUES (5);INSERT INTO [dbo].[Product] VALUES (6);INSERT INTO [dbo].[Product] VALUES (7);INSERT INTO [dbo].[Product] VALUES (8);INSERT INTO [dbo].[Product] VALUES (9);INSERT INTO [dbo].[Product] VALUES (10);declare @FromDate DATETIME2(3) = '2019-08-01'declare @ToDate DATETIME2(3) = '2020-07-31';WITH CTE AS (SELECT @FromDate fromDt,@ToDate endDtUNION ALLSELECT DATEADD(MINUTE,5,fromDt),endDtFROM CTE WHERE DATEADD(MINUTE,5,fromDt) < endDt)INSERT INTO  [dbo].[ReportPeriod] ([ProductId],[IsGenerate],[StartDate],[EndDate])SELECT ProductId,0,fromDt,DATEADD(MINUTE,5,fromDt) FROM CTE CROSS JOIN dbo.ProductOPTION (MAXRECURSION 0); INSERT INTO T99 ([CreateDate],Amount)SELECT top 1000000 dateadd(SECOND,           rand(checksum(newid()))*(1+datediff(SECOND, @FromDate, @ToDate)),                @FromDate),   CAST(RAND(CHECKSUM(NEWID())) * 100000 as INT) + 1FROM sys.all_columns c1 CROSS JOIN sys.all_columns c2

建立Index

CREATE CLUSTERED INDEX [CIX_ReportPeriod_StartDate] ON [dbo].[ReportPeriod]([StartDate] ASC,    [EndDate] ASC)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]ALTER TABLE [dbo].[ReportPeriod] ADD  CONSTRAINT [PK_ReportPeriod] PRIMARY KEY NONCLUSTERED ([PerioidID] ASC,[ProductId] ASC)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]CREATE CLUSTERED INDEX [CIX_T99_CreateDate] ON [dbo].[T99]([CreateDate] ASC)CREATE UNIQUE NONCLUSTERED INDEX [IX_T99_TransactionId] ON [dbo].[T99]([TransactionId] ASC)

查询语法

在线上我们会使用UDT当作参数来取得某些期别资讯.

这里为了方便模拟我使用Table Variable来取代.

set nocount onDECLARE @Transaction AS TABLE(TransactionId INT,ProductId INT);INSERT INTO @Transaction VALUES (1,1)INSERT INTO @Transaction VALUES (101,2)INSERT INTO @Transaction VALUES (1001,3)SELECT p.*FROM dbo.T99 t JOIN [dbo].[ReportPeriod] p ON t.CreateDate BETWEEN p.StartDate AND p.EndDateJOIN @Transaction t1 ON t.TransactionId = t1.TransactionId AND  p.ProductId = t1.ProductId

使用上面语法我们只需查询三个期别资料,但看执行计画时能发现,ReportPeriod使用的Clustered预估资讯有1百多万笔

Q:我明明有对于条件建立Index,但为什么预估值却会跑真那么严重?

CREATE CLUSTERED INDEX [CIX_ReportPeriod_StartDate] ON [dbo].[ReportPeriod]([StartDate] ASC,    [EndDate] ASC)

原因出在範围条件会因为查找範围过大导致预估值不準确

甚么意思? 让我们看看下图(代表ReportPeriod内含日期资料)

而我们在JOIN条件只有t.CreateDate BETWEEN p.StartDate AND p.EndDate这就会导致,我们需要查找ReportPeriod日期资料在挑出符合的资料

JOIN [dbo].[ReportPeriod] p ON t.CreateDate BETWEEN p.StartDate AND p.EndDateJOIN @Transaction t1 ON t.TransactionId = t1.TransactionId AND  p.ProductId = t1.ProductId

最后就会看到走针的估计值

如何优化?

效能差问题,选择对Index和撰写合理的查询可以改善40%左右问题

我们思考一下如果可以把範围条件改成精準=查找条件不就可以更精準预估资讯了?

t.CreateDate BETWEEN p.StartDate AND p.EndDate

那我们怎么把上面条件使用=取代BETWEEN範围查询呢?

这时我们可以利用空间来换取时间

建立一个新的COLUMN运用算法来计算每个期数StartTime

例如:CreateDate = 2020/01/03 10:08:55会归类在2020/01/03 10:05:00

ALTER TABLE dbo.T99 ADD PeriodDate AS DATEADD(MINUTE,DATEPART(MINUTE,CreateDate) %5 * -1,DATETIMEFROMPARTS(DATEPART(YEAR,CreateDate),DATEPART(MONTH,CreateDate),DATEPART(DAY,CreateDate),DATEPART(HOUR,CreateDate),DATEPART(MINUTE,CreateDate),0,0))

建立完新COLUMN后别忘记加入一个Index给此COLUMN.

CREATE INDEX IX_PeriodDate_T99 ON dbo.T99(PeriodDate)

最后我们修改一下查询语法

SELECT p.*FROM dbo.T99 t JOIN [dbo].[ReportPeriod] p ON p.StartDate = t.PeriodDateJOIN @Transaction t1 ON t.TransactionId = t1.TransactionId AND  p.ProductId = t1.ProductId

预估值和读取值已经可以大幅降低了!!

小结:

JOIN範围条件差效能问题,可以思考一下是否有办法利用算法或是公式来优化查询效能,如此次範例一样.


关于作者: 网站小编

码农网专注IT技术教程资源分享平台,学习资源下载网站,58码农网包含计算机技术、网站程序源码下载、编程技术论坛、互联网资源下载等产品服务,提供原创、优质、完整内容的专业码农交流分享平台。

热门文章