The health of your new SharePoint 2013
deployment is very important. Your organization, you, and your
administration team have likely spent considerable time installing,
configuring, and deploying SharePoint to accommodate the needs of the
enterprise. In my time as a SharePoint architect, I have seen a number
of organizations stop here, but the fact of the matter is that
SharePoint requires a certain amount of care and feeding, just like any
enterprise computer system. This is not to say that SharePoint left
alone will fall over in time, but as more users pump data into the
system, eating up storage space, and as the user base grows,
administrators should expect to monitor SharePoint and the
underlying server infrastructure for stress points and opportunities
to improve efficiency.
Organizations understand that it is costly to
stand up large-scale enterprise systems, and they rely on them as an
integral part of their daily business. Spending more money ensuring
that such systems remain healthy and sustain significant uptime is just
as important as the upfront investment in the creation of the system.
Consider how much money an organization might lose if its core
information system falls over and suffers downtime.
In previous versions of SharePoint,
administrators tended to work in reactive mode: typically, users of the
system would report performance issues or loss of access to their data
in SharePoint, and the IT department would then jump on the case to
rectify the issue. SharePoint now provides health and monitoring
features to give the IT group a heads-up about potential issues in the
platform, long before users ever see an issue.
1. Logging
Logging is an important part of health
monitoring because it is via various log files that SharePoint may
alert administrators to issues in the system. The Unified Logging
Service (ULS) provides administrators with an extensive dump of
information, warnings, and errors occurring in the platform. When
something goes wrong, the user typically sees either a custom-developed
“oops” message in his or her browser, or a default SharePoint error
message. It is the job of SharePoint administrators to find out what
went wrong, and the ULS logs will likely give an indication of the
problem—especially if it is recurring.
Note By default, the ULS logs live on each SharePoint 2013 server in the Logs folder of the hive, typically C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\15\LOGS.
Figure 1
shows the Windows Explorer view of the ULS log folder on my SharePoint 2013
development server. The log folder consists of a number of files, both
log and usage files (all text files), that have a file name in the
format of year, month, day, and time. If you crack open any of the log
files you can see lots of detail, reported by the various functional
areas of the SharePoint platform—notice that the Timer Service reports
lots of information events.
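To make the naming scheme concrete, here is a minimal Python sketch that pulls the timestamp out of a ULS file name. It assumes the file names follow a `<server>-YYYYMMDD-HHMM` pattern with a `.log` or `.usage` extension; the function name `parse_uls_filename` is hypothetical, and you should adjust the pattern if your farm names its files differently.

```python
import re
from datetime import datetime

# Assumed naming pattern: <server>-YYYYMMDD-HHMM.log (or .usage).
# Verify this against the files in your own Logs folder.
ULS_NAME = re.compile(
    r"^(?P<server>.+)-(?P<date>\d{8})-(?P<time>\d{4})\.(?P<ext>log|usage)$"
)


def parse_uls_filename(name):
    """Return (server, timestamp, extension), or None if the name does not match."""
    match = ULS_NAME.match(name)
    if match is None:
        return None
    stamp = datetime.strptime(match.group("date") + match.group("time"), "%Y%m%d%H%M")
    return match.group("server"), stamp, match.group("ext")
```

Calling `parse_uls_filename` on each name in the log folder and sorting by the returned timestamp is one way to find the file that covers the time a user reported a problem.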
Viewing the ULS log files in the raw is not
always helpful. Fortunately, you can download a ULS viewer application
to browse the ULS logs.
Note Download the ULS viewer tool from http://archive.msdn.microsoft.com/ULSViewer.
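If you would rather script a quick summary than open a viewer, the following Python sketch shows one way to do it. It assumes each log file is tab-delimited text with a header row that includes `Area` and `Level` columns, and that the most severe entries carry the level names `Unexpected` and `Critical`; the function names `read_uls` and `count_by_area` are hypothetical. Check these assumptions against your own log files before relying on the results.

```python
import csv
from collections import Counter


def read_uls(lines):
    """Yield one dict per log entry, keyed by the names in the header row.

    `lines` is any iterable of text lines, such as an open file.
    Assumes tab-delimited columns; rows whose column count does not
    match the header are skipped.
    """
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    header_row = next(reader, None)
    if header_row is None:
        return  # empty input, nothing to yield
    header = [name.strip() for name in header_row]
    for row in reader:
        if len(row) != len(header):
            continue
        yield dict(zip(header, (value.strip() for value in row)))


def count_by_area(entries, levels=("Unexpected", "Critical")):
    """Count entries per functional area, keeping only the given levels."""
    return Counter(e["Area"] for e in entries if e["Level"] in levels)
```

Passing an open log file to `read_uls` and the result to `count_by_area` gives a per-area tally of the severe entries, which is a reasonable first look at where a recurring problem is coming from.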
SharePoint allows you to fine-tune the ULS log
files to contain information most important to you. The Trace Log
Windows Service, which controls output of the ULS log files, also
operates in a variety of verbosity modes, ranging from error reporting
to very detailed information for every action in the platform. As you
might expect, Central Administration is the place to configure the ULS
settings, as demonstrated in the following steps:
- Open Central Administration.
- Click on the Monitoring link.
- Click the Configure Diagnostic Logging link.
- SharePoint shows a page like that in Figure 2.
- Expand the Categories node.
- Specify the types of events you wish SharePoint to log in the ULS logs.
- When an error occurs in the platform, SharePoint reports events to
both the ULS and Windows event log; you may control the severity
(verbosity level) of events logged to both in the Throttling section of
the page.
Note This
page does not show you the current configuration for throttling; it
defaults to empty drop-down controls and no categories selected.
- Flood protection prevents SharePoint from logging
the same event repeatedly to the Windows event log when a persistent
problem arises. For example, if a timer service job runs every five
minutes and fails each time, you really do not want hundreds of event
log errors with the same message because an administrator did not get
to the issue for a few hours.
- Finally, the Trace Log section defines the location of ULS log
files, the number of days of history to store, and the maximum size of
log files.
Note When
changing settings for diagnostic logging, I recommend you restart the
SharePoint 2013 Tracing Service in Windows Services. Also, stop this
service if you need to delete any of the ULS log files.
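The idea behind flood protection, described in the steps above, can be illustrated with a small Python sketch. This is not SharePoint's implementation: the class name `FloodGuard`, the threshold, and the window length are all hypothetical, chosen only to show how repeated identical events might be suppressed once they exceed a limit within a time window.

```python
class FloodGuard:
    """Allow at most `threshold` identical events per time window (illustrative only)."""

    def __init__(self, threshold=5, window_seconds=120):
        self.threshold = threshold
        self.window = window_seconds
        self._seen = {}  # event key -> (window start time, count in window)

    def should_log(self, key, now):
        """Return True if the event identified by `key` should be written at time `now`."""
        start, count = self._seen.get(key, (now, 0))
        if now - start >= self.window:
            start, count = now, 0  # the window expired, so start a fresh one
        count += 1
        self._seen[key] = (start, count)
        return count <= self.threshold
```

With a threshold of two and a 60-second window, the first two occurrences of a failing job's error would be logged, later repeats inside the window would be dropped, and logging would resume once the window expires.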