SQL Server 2016 CTP2 Optimizer does not use Incremental Statistics - 8 June 2015 08:38

0. Contents

First create folder C:\Databases\AdventureWorks2014_Partition

Get the AdventureWorks Database from Adventure Works 2014 Sample Databases https://msftdbprodsamples.codeplex.com/releases/view/125550 and restore as Database AdventureWorks2014_Partition

Save this backup in C:\x and restore the backup

RESTORE DATABASE AdventureWorks2014
FROM disk= 'C:\x\AdventureWorks2014.bak'
WITH MOVE 'AdventureWorks2014_data' TO 'C:\Databases\AdventureWorks2014_Partition\AdventureWorks2014.mdf',
MOVE 'AdventureWorks2014_Log' TO 'C:\Databases\AdventureWorks2014_Partition\AdventureWorks2014.ldf'
,REPLACE
ALTER DATABASE [AdventureWorks2014] Modify Name = AdventureWorks2014_Partition;
USE [AdventureWorks2014_Partition];

/* add filesgroups */
ALTER DATABASE [AdventureWorks2014_Partition] ADD FILEGROUP [FG2011];
ALTER DATABASE [AdventureWorks2014_Partition] ADD FILEGROUP [FG2012];
ALTER DATABASE [AdventureWorks2014_Partition] ADD FILEGROUP [FG2013];
ALTER DATABASE [AdventureWorks2014_Partition] ADD FILEGROUP [FG2014];
ALTER DATABASE [AdventureWorks2014_Partition] ADD FILEGROUP [FG2015];

/* add files */
ALTER DATABASE [AdventureWorks2014_Partition] ADD FILE 
( 
  FILENAME = N'C:\Databases\AdventureWorks2014_Partition\2011.ndf',
  NAME = N'2011', SIZE = 1024MB, MAXSIZE = 4096MB, FILEGROWTH = 512MB
) TO FILEGROUP [FG2011];
 
ALTER DATABASE [AdventureWorks2014_Partition] ADD FILE 
(
  FILENAME = N'C:\Databases\AdventureWorks2014_Partition\2012.ndf',
  NAME = N'2012', SIZE = 512MB, MAXSIZE = 2048MB, FILEGROWTH = 512MB
) TO FILEGROUP [FG2012];
 
ALTER DATABASE [AdventureWorks2014_Partition] ADD FILE 
(
  FILENAME = N'C:\Databases\AdventureWorks2014_Partition\2013.ndf',
  NAME = N'2013', SIZE = 2048MB, MAXSIZE = 4096MB, FILEGROWTH = 512MB
) TO FILEGROUP [FG2013];
 
ALTER DATABASE [AdventureWorks2014_Partition] ADD FILE
(
  FILENAME = N'C:\Databases\AdventureWorks2014_Partition\2014.ndf',
  NAME = N'2014', SIZE = 2048MB, MAXSIZE = 4096MB, FILEGROWTH = 512MB
) TO FILEGROUP [FG2014];
 
ALTER DATABASE [AdventureWorks2014_Partition] ADD FILE 
(
  FILENAME = N'C:\Databases\AdventureWorks2014_Partition\2015.ndf',
  NAME = N'2015', SIZE = 2048MB, MAXSIZE = 4096MB, FILEGROWTH = 512MB
) TO FILEGROUP [FG2015];

/* create partition function */
CREATE PARTITION FUNCTION [OrderDateRangePFN] ([datetime])
AS RANGE RIGHT FOR VALUES 
(
  '20110101',  -- everything in 2011
  '20120101',  -- everything in 2012
  '20130101',  -- everything in 2013
  '20140101',  -- everything in 2014
  '20150101'   -- everything in 2015
);

 
/* create partition scheme */
CREATE PARTITION SCHEME [OrderDateRangePScheme]
AS PARTITION [OrderDateRangePFN] TO
  ([PRIMARY], [FG2011], [FG2012], [FG2013], [FG2014], [FG2015]);

/* create the table */
CREATE TABLE [dbo].[Orders]
(
 [PurchaseOrderID] [int] NOT NULL,
 [EmployeeID] [int] NULL,
 [VendorID] [int] NULL,
 [TaxAmt] [money] NULL,
 [Freight] [money] NULL,
 [SubTotal] [money] NULL,
 [Status] [tinyint] NOT NULL,
 [RevisionNumber] [tinyint] NULL,
 [ModifiedDate] [datetime] NULL,
 [ShipMethodID] [tinyint] NULL,
 [ShipDate] [datetime] NOT NULL,
 [OrderDate] [datetime] NOT NULL,
 [TotalDue] [money] NULL
) ON [OrderDateRangePScheme] (OrderDate);

/* add the clustered index and enable incremental stats */
ALTER TABLE [dbo].[Orders]
ADD CONSTRAINT [OrdersPK]
PRIMARY KEY CLUSTERED 
(
 [OrderDate],
 [PurchaseOrderID]
) 
ON [OrderDateRangePScheme] ([OrderDate]);
 
ALTER INDEX [OrdersPK] ON [dbo].[Orders] REBUILD WITH (STATISTICS_INCREMENTAL = ON);

Next load in about 4 million rows

SET NOCOUNT ON;
 
DECLARE @Loops SMALLINT = 0;
DECLARE @Increment INT = 3000;
 
WHILE @Loops < 1000
BEGIN
  INSERT [dbo].[Orders]
  ([PurchaseOrderID]
  ,[EmployeeID]
  ,[VendorID]
  ,[TaxAmt]
  ,[Freight]
  ,[SubTotal]
  ,[Status]
  ,[RevisionNumber]
  ,[ModifiedDate]
  ,[ShipMethodID]
  ,[ShipDate]
  ,[OrderDate]
  ,[TotalDue] )
  SELECT [PurchaseOrderID] + @Increment
  , [EmployeeID]
  , [VendorID]
  , [TaxAmt]
  , [Freight]
  , [SubTotal]
  , [Status]
  , [RevisionNumber]
  , [ModifiedDate]
  , [ShipMethodID]
  , DATEADD(DAY, 365, [ShipDate])
  , DATEADD(DAY, 365, [OrderDate])
  , [TotalDue] + 365
  FROM [Purchasing].[PurchaseOrderHeader];
 
  CHECKPOINT;
  SET @Loops = @Loops + 1;
  SET @Increment = @Increment + 5000;
END

Stats are now:

SELECT $PARTITION.[OrderDateRangePFN]([o].[OrderDate]) AS [Partition Number]
  , MIN([o].[OrderDate]) AS [Min_Order_Date]
  , MAX([o].[OrderDate]) AS [Max_Order_Date]
  , COUNT(*) AS [Rows_In_Partition]
FROM [dbo].[Orders] AS [o]
GROUP BY $PARTITION.[OrderDateRangePFN]([o].[OrderDate])
ORDER BY [Partition Number];

Stats after load

Update statistics full and check data in each partition

UPDATE STATISTICS [dbo].[Orders] WITH FULLSCAN;
 
SELECT $PARTITION.[OrderDateRangePFN]([o].[OrderDate]) AS [Partition Number]
  , MIN([o].[OrderDate]) AS [Min_Order_Date]
  , MAX([o].[OrderDate]) AS [Max_Order_Date]
  , COUNT(*) AS [Rows_In_Partition]
FROM [dbo].[Orders] AS [o]
GROUP BY $PARTITION.[OrderDateRangePFN]([o].[OrderDate])
ORDER BY [Partition Number];

Stats after full scan


Check undocumented DMV sys.dm_db_stats_properties_internal

SELECT *
  FROM [sys].[dm_db_stats_properties_internal](OBJECT_ID('dbo.Orders'),1)
  ORDER BY [node_id];

First DMV value


We run a query to access only partition 5 which returns no data and look at the estimated plan

SELECT *
FROM [dbo].[Orders]
WHERE [OrderDate] = '2014-04-01';

The optimizer thinks 4000 rows will be returned

First optimizer

We check the statistics

DBCC SHOW_STATISTICS('dbo.Orders','OrdersPK');

First show stats

We enable incremental statistics and see what we get.

ALTER INDEX [OrdersPK] ON [dbo].[Orders] 
  REBUILD WITH (STATISTICS_INCREMENTAL = ON);
 
UPDATE STATISTICS [dbo].[Orders] WITH FULLSCAN;

We recheck sys.dm_db_stats_properties_internal

SELECT *
  FROM [sys].[dm_db_stats_properties_internal](OBJECT_ID('dbo.Orders'),1)
  ORDER BY [node_id];

Second DMV

We rerun the query again, freeing the proccache to get a new plan

DBCC FREEPROCCACHE;
GO
 
SELECT *
  FROM [dbo].[Orders]
  WHERE [OrderDate] = '2014-04-01';

We get the same plan and estimate

Second optimizer

We get the same histogram

DBCC SHOW_STATISTICS('dbo.Orders','OrdersPK');

Second show stats


We use undocumented traceflag 2309 to see the stat only for the 2014 partition

DBCC TRACEON(2309);
DBCC SHOW_STATISTICS('dbo.Orders','OrdersPK', 6);

Third show stats

Here we see that, again, there is no step for 2014-04-01, but there are 0 RANGE_ROWS between 2014-02-13 and 2014-04-05, with an AVG_RANGE_ROWS of 1.

If the optimizer was using the histogram for the partition level statistics, then the estimate for the number of rows for 2014-04-01 would be 1.