How to set up Data Collector to monitor performance
Data Collector is supported only in Microsoft SQL Server 2008 and higher, in the Enterprise, Standard, Business Intelligence, and Web editions. SQL Server Integration Services must be installed
There are several features you should set up and/or enable: SQL Server Agent, Management Data Warehouse, and Data Collection
- Make sure SQL Server Agent is running. If not, right-click it and select Start
- In SQL Server Management Studio Object Explorer, expand Management
Right-click Data Collection and select Configure Management Data Warehouse
Select Create or upgrade a management data warehouse as a task to perform
- Click Next
Select the SQL Server instance where the captured data will be stored. It’s recommended to select a different server from the one where data will be captured, to avoid capturing data for the storage database and reduce overhead
It’s also recommended to have a database that will only store collected data. If you haven’t created it earlier, click New and do it here
- Click Next
The Management Data Warehouse (MDW) uses three database roles: mdw_admin, mdw_reader, and mdw_writer. Map the appropriate roles to existing SQL Server logins, or click New Login to create a new login first
“The mdw_admin role will allow users to read, write, update and execute SQL Server Agent which can purge and cleanup data available within a MDW database.
The mdw_reader role will allow users to read data which is available within a MDW database.
The mdw_writer role will allow users to write and upload data to a MDW database” 
- Click Next
- Once MDW is configured, right-click Data Collection and select Configure Management Data Warehouse again
This time, select the Set up data collection option
Note that this sequence of actions and selecting the same option again is necessary. First time, you have to create the MDW. In the second run, you configure the SQL Server instance for data collecting. If you run the Set up data collection option without creating the MDW first, there will be no databases in the drop-down list, you will not be able to create a new one in this wizard and configure data collecting
Select the DataCapture database created earlier. The Cache directory is the directory where data is collected locally before uploading into the MDW. This option is used for the collection sets that upload data in cached mode
Data Collection pre-defined sets
To see the data collection sets created by Management Data Warehouse Wizard, expand Data Collection in Object Explorer
Three data collection sets are created. They collect the data that is commonly needed to diagnose and troubleshoot performance issues. Each collection set can be manually stopped and started, and the data collection and upload into the MDW can be manually initiated. Data is collected for all databases on the SQL Server instance. That’s why it’s not recommended to have the MDW on the same instance
The Disk Usage data collection set collects information about disk space used by the database data and log files
It has two collection items – one for collecting data for the data files (MDF, NDF), and the other for the transaction log file (LDF)
“The collection set gathers the following data:
- Snapshots of data file sizes obtained from the sys.partitions and sys.allocation_units views.
- Snapshots of log file sizes obtained from the DBCC SQLPERF (LOGSPACE) command.
- Snapshots of I/O statistics from the sys.dm_io_virtual_file_stats function.”
Code for each of the collection items is shown in the Input parameters field. As the Collection Frequency column shows, data for both collection items is collected every 60 seconds
By default, collected data is uploaded into the MDW every 6 hours. To change this behavior, click Pick and select one of existing schedules, or click New and create a new one
The data collection and upload mode can also be changed
By default, collected data is kept in the MDW for 2 years. As keeping collected data for so long can take up much of hard disk space, it’s recommended to reduce this number. First, determine how far back you need the collected data for analysis. Then calculate how much space data collecting for all your SQL Server instances will take. Keep in mind that in high transaction databases, MDW growth can be 200 – 300 MB a day
The Query Statistics data collection set has a single collection item that collects information about query statistics, activity, execution plans, and code on the SQL Server instance
“This collection set collects data from the following sources:
- sys.dm_exec_requests, sys.dm_exec_sessions, sys.dm_exec_query_stats and other related dynamic management views.
- The text of selected batches and queries.
- The plan of selected batches and queries.
- The normalized text of selected batches.”
By default, it collects data every 10 seconds and uploads it in the cached mode every 15 minutes, as shown in the Uploads tab
The Server Activity data collection set collects information about resources used by the SQL Server instance (processor, memory, disk I/O, and network usage), SQL Server instance activity (batch requests, logouts, SQL compilations and recompilations, transactions, user connections, and logins) and waits
“This collection set collects data samples from the following dynamic management views:
- sys.dm_exec_sessions, sys.dm_exec_requests, sys.dm_os_waiting_tasks (using a joined query)
Additionally, data samples are collected from several system and SQL Server performance counters.”
Also, several SQL Server Agent jobs that collect and upload the data are created
“The following rules govern job creation:
- Data collection by different collection sets are executed as separate jobs.
- Collection sets that use cached collection mode are scheduled as jobs that start when the SQL Server Agent starts. These jobs run continuously and are controlled by the data collector run-time component.
- Collection sets that use non-cached collection mode are scheduled as normal SQL Server Agent jobs. These jobs use a SQL Server Agent schedule that matches the schedule defined in the collection set. The actual duration of execution can be customized by the user.”
As shown, Data Collection set up requires a number of steps that can be executed using SQL Server Management Studio options. The feature is not available in all Microsoft SQL Server editions, only in Enterprise, Standard, Business Intelligence, and Web. While the feature provides built-in capturing of the most commonly needed performance metrics, adding new metrics to data collecting requires coding. All databases on the SQL Server instance have to be monitored, which adds overhead. Unlike Activity Monitor, Data Collection enables storing captured data for a specified time period and thus enables detailed performance analysis
In the next part of this article, we will show the reports available in Microsoft SQL Server Management Studio, metrics collected, and how they can help with troubleshooting performance issues
She has been working with SQL Server since 2005 and has experience with SQL 2000 through SQL 2014.
Her favorite SQL Server topics are SQL Server disaster recovery, auditing, and performance monitoring.
View all posts by Milena "Millie" Petrovic
Latest posts by Milena Petrovic (see all)
- Using custom reports to improve performance reporting in SQL Server 2014 – running and modifying the reports - September 12, 2014
- Using custom reports to improve performance reporting in SQL Server 2014 – the basics - September 8, 2014
- Performance Dashboard Reports in SQL Server 2014 - July 29, 2014