In my previous blog we have seen the new feature of SQL Server 2014- Incremental Statistics as well as how we can specify this at the time of index creation. If you have not gone through that then below is the link for that blog.
We all know that statistics in SQL Server can also be manually updated. In this blog I will show you the improvements for SQL Server manual statistics update in SQL Server 2014 over previous version of SQL Server.
Environment: I have a table xtstatsdemo with columns id, name and balance. Table is having 3 partitions with id as Key. First partition is having rows with key range 1-9999; Second partition is having rows with key range 10000-19999; Third partition is having rows with key range 20000-30000. Here I have one clustered index on id column. Here Statistics also created automatically due to index creation.
Now for demo purpose let’s consider that first two partitions are static or you can say read only, while third partition will be used for all DML operations. Before SQL Server 2014 – Incremental Statistics, if we will do the manual statistics update then update stats operation will scan the complete table i.e. all partitions of that table. Statistics update will take too much time if static or read only partitions are having too much amount of data.
In SQL Server 2014- Incremental statistics, we can manually update the statistics only for specific partition(s). Then internally SQL Server will merge this update statistics with other partition statistics to generate the complete table statistics. That means now statistics update will take less resources. This is really a great improvement in SQL Server 2014. You can specify a single partition or multiple partitions as per your requirement while updating the statistics manually.
Now let me show you this practically. First I am going to setup the environment.
--Create a database for Demo at default files location Create database StatsDemo2014 go Use StatsDemo2014 go --Create Partition Function create partition function xpfstatsdemo (int) as range right for values (10000,20000) go --Create Partition Scheme create partition scheme xpsstatsdemo as partition xpfstatsdemo ALL to ([Primary]) go --Create table by using Partition Scheme on id column create table xtstatsdemo ( id int not null, balance int not null, name varchar(25) ) on xpsstatsdemo (id) go --Create a clustered index on id column [This will cover complete table] --Here I have also specify statistics_Incremental option create clustered index CI_xtstatsdemo on xtstatsdemo (id) with (statistics_incremental=ON); go --You can check here about how many partitions belongs to the table select * from sys.partitions where object_id=OBJECT_ID('xtstatsdemo') go -- Now I am inserting 30000 rows in the table. That means each partition will contain 10000 rows here -- Here id and balance both are having same values for demo purpose set nocount on declare @counter int =1 declare @balance int =1 while (@counter<=30000) begin if (@counter%2=0) begin insert into xtstatsdemo values(@counter,@balance,'data1') end else begin insert into xtstatsdemo values(@counter,@balance,'data2') end set @counter = @counter + 1; set @balance = @balance + 1; end go --You can check number of rows inside the table Select count(*) from xtstatsdemo go
Now let me show you the status of statistics auto created by index.
--Statement to update stats automatically because # of rows has been changed from 0 to 30000 Select * from xtstatsdemo where id>20000 go -- Current stats are showing that table is having total 30000 rows. That’s right. DBCC SHOW_STATISTICS('xtstatsdemo','CI_xtstatsdemo') go
Now I will delete 1800 rows from third partition. 1800 rows are below threshold for stats auto update. So there will be no auto update happen.
-- Now I am going to delete 1800 rows from third partition. delete from xtstatsdemo where id>28200 go --Statement to update the stats automatically Select id,name from xtstatsdemo where id>20000 go --Stats after 1800 rows deletion DBCC SHOW_STATISTICS('xtstatsdemo','CI_xtstatsdemo') Go
From the above output it is clear that there is no Auto update happens here. Now let me do it manually. Here I know that DML operation happen on partition 3rd only. So I will update the stats for partition 3rd only rather than the complete table.
--Here you can specify multiple partitions also like partition(2,3) Update statistics xtstatsdemo(CI_xtstatsdemo) With resample on partitions (3) go DBCC SHOW_STATISTICS('xtstatsdemo','CI_xtstatsdemo')
Here stats update operation will update the statistics for complete table while it will scan only 3rd partition rather than complete table. Consider the scenario where partition 1 and 2 both are having large amount of data (like millions of rows) then this operation will be fast here because it will scan partition 3 only rather than complete table. If you want then you can check this behaviour by using event trace for scan_started extended event where you will found that update stats will scan only third partition.