In this article, we will demonstrate specific ways to automate table partitioning in SQL Server. This article aims to help you avoid manual table activities of partition maintenance by automating it with T-SQL scripts and SQL Server jobs.
Automation with the partition task is required when the range of partition function is not enough to proceed for the newly inserted data. For example, when a partition function does not have a range for new rows and inserting rows into the table is out of the existing range. However, those rows will be inserted into the table but do not move to the appropriate filegroup and its files. Most probably, a partition is required on tables that are having a large number of rows. For those tables, the partition function separates the rows by month, year, or any other valuable column.
Initially, a database team will plan the predicted maximum range by adding a filegroup to the partition scheme and adding a range to the partition function. A new range with partition function and filegroup with the partition scheme is required when the existing range is near about to the ending. Adding a new range and filegroup to the current partitioning is the manual activity, and that can be performed by the database administrator while monitoring the range. A database administrator will split the range when partition’s range is near about the end. Here, this article suggests some ordinary solutions to automate with the help of the SQL Server scheduler.
SQL Server scheduler can help with splitting a new partition range and adding new filegroups as it is required for the partition functions and schemes. Users can write a short program to automate the partition with the help of T-SQL, and that program can be executed using a SQL Server job. Before designing the maintenance for partitioning in SQL Server, users should be aware of what is table partitioning in SQL Server? And why is partitioning required in SQL Server?
What is Table Partitioning in SQL Server? Why is partitioning required in SQL Server?
Table Partition is the logical division of information with the physical distribution in the filegroups. A table can be partitioned by applying the partition schema over the table schema. The role of the partition function is to divide the information in a logical division by partition range and partition scheme indexing the partition range to the filegroup. After applying the partition on the table, rows of the table will be distributed in different secondary files by the filegroup.
Day by day density of the information is increasing more and more, and the database size increases with storing more and more pieces of information. Ideally, companies maintain several databases for the product to generate a query result set from multiple tables to fulfill the business requirements. With the increase in the size of databases, table performance becomes slower for the client queries. But, proper indexes and index maintenance will exist in the database, but at last, records will be scanned out from the broad set of the index pointer value. What can the user do more? And How?
Partition is one of the beneficial approaches for query performance over the large table. Table Index will be a part of each partition in SQL Server to return a quick query response. Actual partition performance will be achieved when you use a query with the partition column because the partition column will target those partitions only, which are required by the partition column. It wouldn’t scan the information in all filegroups.
Partitioning in SQL Server divides the information into the smaller storage groups; It is about table data and indexes. Partition function can be used with the table column when a table creates. A partition can be defined with the name and its storage attributes. Let’s have a sample partition before setting up an automated task on table partitioning in SQL Server.
Table Partitioning in SQL Server – Step by Step
Partitioning in SQL Server task is divided into four steps:
- Create a File Group
- Add Files to File Group
- Create a Partition Function with Ranges
- Create a Partition Schema with File Groups
Creating a sample database to perform Table Partitioning in SQL Server:
1 |
CREATE DATABASE AutoPartition |
Adding File Groups to the database:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 |
ALTER DATABASE AutoPartition ADD FILEGROUP FG_01_2020 GO ALTER DATABASE AutoPartition ADD FILEGROUP FG_02_2020 GO ALTER DATABASE AutoPartition ADD FILEGROUP FG_03_2020 GO ALTER DATABASE AutoPartition ADD FILEGROUP FG_04_2020 GO ALTER DATABASE AutoPartition ADD FILEGROUP FG_05_2020 GO ALTER DATABASE AutoPartition ADD FILEGROUP FG_06_2020 GO ALTER DATABASE AutoPartition ADD FILEGROUP FG_07_2020 GO ALTER DATABASE AutoPartition ADD FILEGROUP FG_08_2020 GO ALTER DATABASE AutoPartition ADD FILEGROUP FG_09_2020 GO ALTER DATABASE AutoPartition ADD FILEGROUP FG_10_2020 GO ALTER DATABASE AutoPartition ADD FILEGROUP FG_11_2020 GO ALTER DATABASE AutoPartition ADD FILEGROUP FG_12_2020 GO |
Here, we are going to set up a month-wise partition for the table. So, initially, we are creating 12 File Groups for the year 2020.
Adding Files to each File Group
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 |
ALTER DATABASE AutoPartition ADD FILE ( NAME = [File_012020], FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL13.JRAIYANI\MSSQL\DATA\File_012020.ndf', SIZE = 5 MB, MAXSIZE = UNLIMITED, FILEGROWTH = 10 MB ) TO FILEGROUP FG_01_2020 GO ALTER DATABASE AutoPartition ADD FILE ( NAME = [File_022020], FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL13.JRAIYANI\MSSQL\DATA\File_022020.ndf', SIZE = 5 MB, MAXSIZE = UNLIMITED, FILEGROWTH = 10 MB ) TO FILEGROUP FG_02_2020 GO ALTER DATABASE AutoPartition ADD FILE ( NAME = [File_032020], FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL13.JRAIYANI\MSSQL\DATA\File_032020.ndf', SIZE = 5 MB, MAXSIZE = UNLIMITED, FILEGROWTH = 10 MB ) TO FILEGROUP FG_03_2020 GO ALTER DATABASE AutoPartition ADD FILE ( NAME = [File_042020], FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL13.JRAIYANI\MSSQL\DATA\File_042020.ndf', SIZE = 5 MB, MAXSIZE = UNLIMITED, FILEGROWTH = 10 MB ) TO FILEGROUP FG_04_2020 GO ALTER DATABASE AutoPartition ADD FILE ( NAME = [File_052020], FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL13.JRAIYANI\MSSQL\DATA\File_052020.ndf', SIZE = 5 MB, MAXSIZE = UNLIMITED, FILEGROWTH = 10 MB ) TO FILEGROUP FG_05_2020 GO ALTER DATABASE AutoPartition ADD FILE ( NAME = [File_062020], FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL13.JRAIYANI\MSSQL\DATA\File_062020.ndf', SIZE = 5 MB, MAXSIZE = UNLIMITED, FILEGROWTH = 10 MB ) TO FILEGROUP FG_06_2020 GO ALTER DATABASE AutoPartition ADD FILE ( NAME = [File_072020], FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL13.JRAIYANI\MSSQL\DATA\File_072020.ndf', SIZE = 5 MB, MAXSIZE = UNLIMITED, FILEGROWTH = 10 MB ) TO FILEGROUP FG_07_2020 GO ALTER DATABASE AutoPartition ADD FILE ( NAME = [File_082020], FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL13.JRAIYANI\MSSQL\DATA\File_082020.ndf', SIZE = 5 MB, MAXSIZE = UNLIMITED, FILEGROWTH = 10 MB ) TO FILEGROUP FG_08_2020 GO ALTER DATABASE AutoPartition ADD FILE ( NAME = [File_092020], FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL13.JRAIYANI\MSSQL\DATA\File_092020.ndf', SIZE = 5 MB, MAXSIZE = UNLIMITED, FILEGROWTH = 10 MB ) TO FILEGROUP FG_09_2020 GO ALTER DATABASE AutoPartition ADD FILE ( NAME = [File_102020], FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL13.JRAIYANI\MSSQL\DATA\File_102020.ndf', SIZE = 5 MB, MAXSIZE = UNLIMITED, FILEGROWTH = 10 MB ) TO FILEGROUP FG_10_2020 GO ALTER DATABASE AutoPartition ADD FILE ( NAME = [File_112020], FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL13.JRAIYANI\MSSQL\DATA\File_112020.ndf', SIZE = 5 MB, MAXSIZE = UNLIMITED, FILEGROWTH = 10 MB ) TO FILEGROUP FG_11_2020 GO ALTER DATABASE AutoPartition ADD FILE ( NAME = [File_122020], FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL13.JRAIYANI\MSSQL\DATA\File_122020.ndf', SIZE = 5 MB, MAXSIZE = UNLIMITED, FILEGROWTH = 10 MB ) TO FILEGROUP FG_12_2020 GO |
Here, adding one file to each filegroup for data distribution to the physical storage to perform the table partitioning in SQL Server. If you have crores of row per day, then you can add multiple files to the single file group as well, and data will be distributed to the number of files in the filegroup.
Adding a Partition Function with Month wise range
1 2 3 4 5 6 7 8 9 10 |
USE AutoPartition GO CREATE PARTITION FUNCTION [PF_MonthlyPartition] (DATETIME) AS RANGE RIGHT FOR VALUES ( '2020-01-31 23:59:59.997', '2020-02-29 23:59:59.997', '2020-03-31 23:59:59.997', '2020-04-30 23:59:59.997', '2020-05-31 23:59:59.997', '2020-06-30 23:59:59.997', '2020-07-31 23:59:59.997', '2020-08-31 23:59:59.997', '2020-09-30 23:59:59.997', '2020-10-31 23:59:59.997', '2020-11-30 23:59:59.997', '2020-12-31 23:59:59.997' ); |
Here, 12 ranges are defined with the last day of the month and last ms of the day. Users can use the month and year combination as well to perform the table partitioning in SQL Server. But I would recommend to define it with full datetime to perform insert operations quickly. In the above sample code, the partition function is defined with the Right direction.
Adding a Partition Scheme with File Groups to the Partition Function
1 2 3 4 5 6 7 8 9 10 11 12 |
USE AutoPartition GO CREATE PARTITION SCHEME PS_MonthWise AS PARTITION PF_MonthlyPartition TO ( 'FG_01_2020', 'FG_02_2020', 'FG_03_2020', 'FG_04_2020', 'FG_05_2020', 'FG_06_2020', 'FG_07_2020', 'FG_08_2020', 'FG_09_2020', 'FG_10_2020', 'FG_11_2020', 'FG_12_2020', 'Primary' ); |
Adding a filegroup to the partition schema with attaching the partition function
Here, the Primary filegroup is an additional filegroup in the partition scheme definition. The primary filegroup is used to store those rows which are exceeding the partition range in the function. It works when users forget to add new ranges and new filegroups with the file.
Creating a Table with the Partition Scheme
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
USE AutoPartition GO CREATE TABLE orders ( [order_id] BIGINT IDENTITY(1,1) NOT NULL, [user_id] BIGINT, [order_amt] DECIMAL(10,2), [address_id] BIGINT, [status_id] TINYINT, [is_active] BIT, [order_date] [datetime] ) ON PS_MonthWise ([order_date]); GO CREATE CLUSTERED INDEX CI_orders_order_id ON orders(order_id) GO CREATE NONCLUSTERED INDEX IX_user_id ON orders(user_id) GO |
Here, the table is defined with applying the Partition to the column [order_date] of table orders.
Inserting data into Table [orders]
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 |
INSERT INTO orders (user_id, order_amt, address_id, status_id, is_active, order_date) VALUES (43, 1623.78, 51, 1, 1, '2020-01-14 13:21:51.869'), (51, 8963.17, 43, 1, 1, '2020-02-17 05:07:43.193'), (27, 7416.93, 66, 1, 1, '2020-03-21 07:53:07.743'), (58, 9371.54, 45, 1, 1, '2020-04-26 16:19:27.852'), (53, 8541.56, 65, 1, 1, '2020-05-08 19:21:58.654'), (98, 6971.85, 54, 1, 1, '2020-06-17 21:34:52.426'), (69, 5217.74, 78, 1, 1, '2020-07-03 07:37:51.391'), (21, 9674.14, 98, 1, 1, '2020-08-27 23:49:53.813'), (52, 1539.96, 32, 1, 1, '2020-09-01 17:17:07.317'), (17, 7193.63, 21, 1, 1, '2020-10-23 10:23:37.307'), (68, 3971.25, 19, 1, 1, '2020-11-30 09:01:27.079'), (97, 5973.58, 97, 1, 1, '2020-12-06 13:43:21.190'), (76, 4163.95, 76, 1, 1, '2021-01-03 18:51:17.764') GO |
We have inserted sample rows with each partition range of the partition function. Now let’s check how many rows exist with partition and its filegroups.
Partition details with Row count
Below are the DMVs that return the number of rows that exist in the filegroup with partition range.
1 2 3 4 5 6 7 8 9 10 11 |
SELECT DISTINCT o.name as table_name, rv.value as partition_range, fg.name as file_groupName, p.partition_number, p.rows as number_of_rows FROM sys.partitions p INNER JOIN sys.indexes i ON p.object_id = i.object_id AND p.index_id = i.index_id INNER JOIN sys.objects o ON p.object_id = o.object_id INNER JOIN sys.system_internals_allocation_units au ON p.partition_id = au.container_id INNER JOIN sys.partition_schemes ps ON ps.data_space_id = i.data_space_id INNER JOIN sys.partition_functions f ON f.function_id = ps.function_id INNER JOIN sys.destination_data_spaces dds ON dds.partition_scheme_id = ps.data_space_id AND dds.destination_id = p.partition_number INNER JOIN sys.filegroups fg ON dds.data_space_id = fg.data_space_id LEFT OUTER JOIN sys.partition_range_values rv ON f.function_id = rv.function_id AND p.partition_number = rv.boundary_id WHERE o.object_id = OBJECT_ID('orders'); |
Here, the partition range was not defined for January 2021. But we added the last file group with the Primary. Therefore, if rows are exceeding the range, then those will be allocated to the primary filegroup.
Table Rows with Partition Number
Users can find the partition number with each row of the table as well. Users can bifurcate the row allocation to the logical partition number with the help of a $PARTITION() function.
1 2 |
SELECT $PARTITION.PF_MonthlyPartition(order_date) AS PartitionNumber, * FROM orders |
Automate the Partition flow
We have explained the above examples to understand the requirement of post activities and maintenance on the partition function. A primary filegroup is defined to manage those rows which are out of the partition range. However, the database team has to monitor that the range of any partition function is ending or not? To avoid some manual tasks, users can set up the SQL Server job to perform it automatically.
SQL Server job will be executed in a pre-defined scheduled time (monthly or weekly) and helps to find out the partition functions which are needed to be maintained. In the above sample, we used the DATETIME column type for the partition range. Users can write a program with T-SQL statements as below:
Find the Partition Function whose range does not exist for the next month. For example, in the above example, partition range is defined till Dec 2020, and a current timestamp is 2020-12-27 16:27:09.500. After three days, the partition range will be exceeded because the maximum range is 2020-12-31 23:59:59.997 for the order table. Now, we will find the partition functions which are required to be maintained using the below T-SQL.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
SELECT o.name as table_name, pf.name as PartitionFunction, ps.name as PartitionScheme, MAX(rv.value) AS LastPartitionRange, CASE WHEN MAX(rv.value) <= DATEADD(MONTH, 2, GETDATE()) THEN 1 else 0 END AS isRequiredMaintenance --INTO #temp FROM sys.partitions p INNER JOIN sys.indexes i ON p.object_id = i.object_id AND p.index_id = i.index_id INNER JOIN sys.objects o ON p.object_id = o.object_id INNER JOIN sys.system_internals_allocation_units au ON p.partition_id = au.container_id INNER JOIN sys.partition_schemes ps ON ps.data_space_id = i.data_space_id INNER JOIN sys.partition_functions pf ON pf.function_id = ps.function_id INNER JOIN sys.partition_range_values rv ON pf.function_id = rv.function_id AND p.partition_number = rv.boundary_id GROUP BY o.name, pf.name, ps.name |
Here, the above result set returned the partition function (PF_MonthlyPartition) for adding the new range.
The following code helps to insert information to the new temp table for those partition functions that are required to SPLIT.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
SELECT table_name, PartitionFunction, PartitionScheme, LastPartitionRange, CONVERT(VARCHAR, DATEADD(MONTH, 1, LastPartitionRange), 25) AS NewRange, 'FG_' + CAST(FORMAT(DATEADD(MONTH, 1, LastPartitionRange),'MM') AS VARCHAR(2)) + '_' + CAST(YEAR(DATEADD(MONTH, 1, LastPartitionRange)) AS VARCHAR(4)) AS NewFileGroup, 'File_'+ CAST(FORMAT(DATEADD(MONTH, 1, LastPartitionRange),'MM') AS VARCHAR(2)) + CAST(YEAR(DATEADD(MONTH, 1, LastPartitionRange)) AS VARCHAR(4)) AS FileName, 'C:\Program Files\Microsoft SQL Server\MSSQL13.JRAIYANI\MSSQL\DATA\' AS file_path INTO #generateScript FROM #temp WHERE isRequiredMaintenance = 1 |
We can also generate a dynamic script to create File Group, File, add a new file group to partition scheme, and new range to the partition function as below:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 |
DECLARE @filegroup NVARCHAR(MAX) = '' DECLARE @file NVARCHAR(MAX) = '' DECLARE @PScheme NVARCHAR(MAX) = '' DECLARE @PFunction NVARCHAR(MAX) = '' SELECT @filegroup = @filegroup + CONCAT('IF NOT EXISTS(SELECT 1 FROM AutoPartition.sys.filegroups WHERE name = ''',NewFileGroup,''') BEGIN ALTER DATABASE AutoPartition ADD FileGroup ',NewFileGroup,' END;'), @file = @file + CONCAT('IF NOT EXISTS(SELECT 1 FROM AutoPartition.sys.database_files WHERE name = ''',FileName,''') BEGIN ALTER DATABASE AutoPartition ADD FILE (NAME = ''',FileName,''', FILENAME = ''',File_Path,FileName,'.ndf'', SIZE = 5MB, MAXSIZE = UNLIMITED, FILEGROWTH = 10MB ) TO FILEGROUP ',NewFileGroup, ' END;'), @PScheme = @PScheme + CONCAT('ALTER PARTITION SCHEME ', PartitionScheme, ' NEXT USED ',NewFileGroup,';'), @PFunction = @PFunction + CONCAT('ALTER PARTITION FUNCTION ', PartitionFunction, '() SPLIT RANGE (''',NewRange,''');') FROM #generateScript EXEC (@filegroup) EXEC (@file) EXEC (@PScheme) EXEC (@PFunction) |
Here, the script is generated with individual variables, and the dynamic code is executed in the sequence. We have utilized most of the static values, but users can prepare them completely dynamic too. The above snippets should be taken in a single procedure, and that procedure can be configured with the SQL Server job.
Conclusion
Adding a new range to the partition function should be an automated task always. The partition range is always depending on the row size in the partition function. Table partitioning in SQL Server is always helpful in terms of maintaining large tables.
- Page Life Expectancy (PLE) in SQL Server - July 17, 2020
- How to automate Table Partitioning in SQL Server - July 7, 2020
- Configuring SQL Server Always On Availability Groups on AWS EC2 - July 6, 2020