A good friend recently asked me to explain the difference between Performance Data Collector, Performance Studio, and Performance Datawarehouse. Well, this is an interesting question, and I am certain many people have been trying to figure out how all of these concepts relate to one another. We at Microsoft have been adjusting the names of many of the new features, in an effort to properly show how the different features may be leveraged. But, as these names are adjusted it leads to confusion about where to find each feature and how to search for different topics online or in Books Online.
Let’s start with Performance Studio. This is simply the wrapper of four solutions: Data Collector, performance and diagnostic monitoring, Management Data Warehouse for historical reporting, and troubleshooting through Policy based management. This is the vision of a complete Microsoft solution for performance data collection, storage for historical analysis and trending, and diagnostic reporting and dashboards.
The term/concept “Performance Studio” has been in a number of different older Microsoft presentations, although I see fewer people referring to “Performance Studio” in more recent material. Some people have been using the terms Performance Studio and Data Collector interchangeably – but Data Collector is merely a component of the full Performance Studio vision. You will not find anything in Books Online for “Performance Studio”, but you will find a plethora of information related to each individual component.
Performance Data Collector
Performance Data Collector is synonymous with Data Collector. We have dropped Performance from the name in later documentation, including Books Online – and for very good reason. On the surface, the Data Collector appears to be a tool to collect performance related data but I truly believe it is a process that can be leveraged to collect all types of data - both system-related and business-related. The data collectors can be configured on each server to collect data and store on a central data warehouse.
When you configure data collection with Management Studio, you will see three system data collector sets which are installed by default. These are collecting data related to disk usage, query statistics (poor performing queries) and system utilization (CPU, memory, IO, network, etc). These “Data Collection Sets” are detailed very well in Books Online, and in the future I will work on a blog related to each of these collection sets. Each of these data collection sets may be enabled or disabled independently from one another.
And… You can create your own custom data collection sets which are defined to use a SQL Server Trace, performance counters or a T-SQL query. For example, you can set up a custom data collector which will execute a TSQL statement against a table and load the data into the performance data warehouse. Since I am always thinking about compliance (I am a lot of fun at partiesJ), I think about setting up a custom data collector on my servers to centralize the results of policy evaluation across my enterprise – a single source to report on compliance for my entire enterprise! There is no gui interface for defining custom data collectors in this current version - you must create these through scripts.
The Performance Data Warehouse
That is where we will end our journey. The Performance Data Warehouse has also been renamed to Management Data Warehouse – again to properly depict the vision of a solution for centralizing all types of data – not just performance related. At a very high level, the Management Data Warehouse is the destination for the data from the data collector. When you configure data collection, you define the server instance and database where you would like to store the data for the data collector. This database may exist on the same instance, or (more likely) on a separate instance on a different server in your enterprise. This single database may serve as the destination for one or more instances which are configured for data collection.
The Management Data Warehouse is initially created with tables in one of two schemas: core and snapshot. The core tables are dimension tables and the snapshot tables are the fact tables. When you create a custom data collector, a third schema custom_snapshots is created to support the custom data collector sets.