Once configured for data collection, a SQL Server
instance is enabled with three default system collection sets and the
ability to define custom sets if required. In both cases, there are a
number of important considerations about upload frequencies/methods and
retention periods. All of these properties can be accessed and modified
in SQL Server Management Studio under the Management > Data
Collection folder by right-clicking on a collection set and choosing
Properties.
1. Upload method and frequency
Each data collection set
has its own data collection and upload method, with the two options
being Non-cached and Cached. As the name suggests, the cached method
collects and caches data locally before uploading to the MDW. In
contrast, the non-cached method collects and uploads at the same time,
using the same schedule.
Non-cached mode is
typically used for lightweight and/or infrequent collection sets such as
Disk Usage. In contrast, both the Query Statistics and Server Activity
sets should (and do) use cached mode because they collect more data and
collect it more frequently.
Non-cached mode collects
and uploads on the same schedule, which can be specified via the General
page of the collection set's Properties window by clicking the Pick or
New button. In figure 1, the Server Activity collection set is defined with the cached upload method. Earlier in the section (figure 4)
we covered the initial configuration of the data collection platform,
part of which was selecting a cache directory. This directory is used by
each data collection set using cached mode.
When cached mode is
selected, the Uploads page of the collection set's Properties window
lets you select or create an upload schedule, as per figure 2. The collection schedule is specified using the General page, as shown in figure 1, by entering a value in the Collection Frequency (sec) column.
When a large number of
servers are frequently collecting and uploading large collection sets,
the collective data volume may overwhelm the centralized MDW server.
Cached mode allows each uploading server to collect and upload on a
staggered schedule, thereby reducing the impact on the MDW server. For
example, one server may upload hourly on the hour, with the next server
uploading at 15 minutes past the hour, and so forth.
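The staggering idea can be sketched in a few lines of Python. This is a minimal illustration only, not part of the data collection platform: the function name `stagger_uploads`, the server names, and the 15-minute step are assumptions chosen to mirror the example above. In practice, each resulting offset would be applied by hand as the start time of that server's upload schedule.

```python
def stagger_uploads(servers, interval_minutes=60, step_minutes=15):
    """Assign each server an upload offset in minutes past the start of the interval.

    Hypothetical helper: offsets advance by step_minutes per server and wrap
    around once they reach interval_minutes, so slots are reused when there
    are more servers than available slots.
    """
    if interval_minutes <= 0 or step_minutes <= 0:
        raise ValueError("interval_minutes and step_minutes must be positive")

    offsets = {}
    for position, server in enumerate(servers):
        offsets[server] = (position * step_minutes) % interval_minutes
    return offsets


# Example server names are placeholders.
schedule = stagger_uploads(["SQL01", "SQL02", "SQL03", "SQL04", "SQL05"])
```

With an hourly interval and a 15-minute step, the first server uploads on the hour, the second at 15 minutes past, and the fifth wraps back to the top of the hour. A smaller step spreads a larger number of servers more evenly.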
Once the upload mode
and schedules are defined for each collection set, SQL Server Agent
jobs are used to execute the collection and upload to the MDW. For a
cached mode collection set, two agent jobs will be used: one for the
collection and another for the upload. In contrast, a non-cached
collection set will have a single job for both collection and upload.
The names given to the SQL Server Agent jobs are based on their collection set number, as shown in figure 3.
The jobs can be renamed in order to easily identify the correlation
between a SQL Server Agent job and the collection set it's servicing.
For example, the collection_set_2_collection/upload jobs can be renamed
to Server Activity Collection and Server Activity Upload.
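As a small sketch of that renaming convention, the following Python helper builds the old-name to new-name mapping for a cached collection set. The function name `cached_job_renames` is hypothetical, and the default job name pattern is assumed from the collection_set_2_collection/upload example above; the sketch does not cover non-cached sets and does not itself rename anything in SQL Server Agent.

```python
def cached_job_renames(collection_set_id, collection_set_name):
    """Map the default agent job names of a cached collection set to friendlier names.

    Assumes the default pattern collection_set_<id>_collection and
    collection_set_<id>_upload, as in the Server Activity example.
    """
    prefix = f"collection_set_{collection_set_id}"
    return {
        f"{prefix}_collection": f"{collection_set_name} Collection",
        f"{prefix}_upload": f"{collection_set_name} Upload",
    }


renames = cached_job_renames(2, "Server Activity")
```

The resulting mapping can serve as a checklist when renaming the jobs in Management Studio.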
Before changing
the collection mode and/or collection and upload schedules, make sure
you understand the performance impacts of doing so, particularly in a
production environment with a large number of servers uploading to a
central MDW database. The reporting benefits of frequent uploads need
to be balanced against the load they place on the MDW server. Staggering
the uploads can help reduce that load.
In addition to the
upload mode and schedule for each instance, another important
consideration is the backup of the MDW database.
2. Backup considerations
In a
production environment using the data collection platform, there are a
number of additional backup considerations, summarized as follows:
MDW database—Depending
on the collection sets and upload frequency, the MDW database can
expect to receive approximately 300MB of uploaded data per server per day.
It's easy to see how this database can grow very rapidly. Therefore,
you need to carefully consider doing backups and monitoring disk space
on the MDW server as well as archiving, which we'll cover shortly.
MDW recovery model—By
default, the MDW database is created with the simple recovery model. In
a production environment, you need to change this to the full recovery
model and schedule transaction log backups, unless you can accept the
possibility of losing data collected since the last full (or
differential) backup.
MSDB database—Each
uploading instance's collection sets, upload schedules, and log
histories are defined and stored in the MSDB database of each instance.
Regardless of whether the data collection platform is used, this
database should be backed up; however, with an active collection
configuration, the need to back up this database becomes even more
important.
To assist with
containing the growth of the MDW database, each collection set is
defined with a retention period, which we'll discuss next.
3. Retention period
As we've just discussed,
the MDW database will grow by approximately 300MB per server per day.
This obviously makes containing the growth of this database very
important, particularly in a large enterprise environment with a central
MDW database and many uploading servers. Fortunately, as we saw in figure 15.5, each collection set is defined with a retention period, specified as a number of days.
When the data
collection upload job executes, previous data from the collection set
that's older than the retention period is removed from the database. It
follows that for a given number of uploading servers, the MDW database
will grow to a certain size and then stabilize. The retention period of
each collection set should be based on a number of factors such as the
reporting requirements and available disk space.
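The grow-then-stabilize behavior can be illustrated with a rough back-of-the-envelope model. This is a sketch under simplifying assumptions, not a sizing tool: it uses the approximate 300MB per server per day figure quoted earlier, applies a single retention period to all collection sets (in reality each set has its own), and ignores indexes, logs, and compression. The function name `mdw_size_mb` is hypothetical.

```python
def mdw_size_mb(servers, retention_days, day, mb_per_server_per_day=300):
    """Estimate the MDW data volume in MB after `day` days of uploads.

    Assumes every server uploads the same amount each day and that data
    older than retention_days is purged, so at most retention_days of
    data per server is ever held.
    """
    if servers < 0 or retention_days < 0 or day < 0:
        raise ValueError("servers, retention_days, and day must be non-negative")

    retained_days = min(day, retention_days)
    return servers * retained_days * mb_per_server_per_day


# 10 servers with a 14-day retention period:
after_one_week = mdw_size_mb(10, 14, day=7)     # still growing
after_one_month = mdw_size_mb(10, 14, day=30)   # stabilized
after_one_year = mdw_size_mb(10, 14, day=365)   # same as after one month
```

Under these assumptions, 10 servers with a 14-day retention period reach roughly 42,000MB after two weeks and stay there, which is why the retention period, rather than elapsed time, drives the long-term size of the database.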
Things don't always
run according to plan, and a variety of potential problems can
prevent the successful collection and upload of a collection set's
data. Fortunately, the logging component allows you to inspect the
history and detail of each set's activity.
15.3.4. Logging
Right-clicking a collection set (or data collection) and selecting View Logs brings up the Log File Viewer, which, as shown in figure 4, presents a detailed history of the collection and upload process.
As mentioned earlier, in addition to the default system data collection sets, you can create custom collection sets.