Clustering is not unique to SQL Server or even to
database platforms in general. A server cluster consists of two or more
servers, each configured identically, that are designed to consistently
serve up an application or platform even if an error or outage occurs on
one of the members of the cluster. Although this section focuses on how
to use clustering with SQL Server, you can use it to provide HA
capabilities for various platforms, such as Microsoft Exchange,
Microsoft Hyper-V, and many more.
This section puts the
spotlight on the failover clustering solution included in Windows Server
2008, but it is by no means the only clustering platform available to
you for your SharePoint and SQL Server environment. Other clustering
solutions are available in the marketplace to provide a viable HA
solution for your database environment, each offering unique
functionality, options, and challenges to give you some flexibility over
how you cluster your SharePoint database. Although some products may be
specific to the UNIX or Linux platforms, others, such as Symantec’s
Veritas Cluster Server, are completely compatible with SharePoint and
SQL Server and have been successfully implemented as enterprise
clustering solutions in the most demanding of situations.
Note
The decision to
highlight Windows Server failover clustering in this section is not
meant to endorse it as a clustering product or indict its competitors.
The goal is to show you how to implement a widely used clustering
product for your SharePoint and SQL Server environment, not laud one
product over another. It is up to you to evaluate the products in this
space and determine which one is the best solution for your enterprise,
its infrastructure, and its requirements. Like so many other aspects of
SharePoint, this is not a one-size-fits-all kind of situation.
The Server Components of Windows Server Failover Clustering
One advantage of clustering as
the HA solution for your SQL Server environment is the flexibility it
gives you in designing the architecture of your solution. To create a
cluster, you need at least two servers; that way you can create two
separate nodes within the cluster. Clustering’s flexibility is that you
can place more than one server in a node (failover clustering allows up
to 16 servers in a node, depending on the edition of Windows Server 2008
being used), and you are not
required to have the same number of servers in each node. So if you
want to create a node with one server and a node with two servers, that
option is available to you. You can also have up to 16 nodes in a
cluster. Each node is expected to be able to serve as the primary
provider of database services for the cluster, so that if a node is
taken down or suffers an outage, you can bring another node in the
cluster online to continue that service with no or little downtime.
Failover clustering is
available as an included component of Windows Server 2008’s Enterprise
and Datacenter editions. Microsoft is careful to state that failover
clustering is intended to be used as an HA solution but is not
completely fault tolerant. Fault tolerant
describes systems and solutions designed with an extremely high degree
of redundancy and the ability to provide nearly instantaneous recovery
times; the downside is that these systems often come with a
prohibitively high price tag to match. Failover clustering was designed
to enable systems to be highly available while using standardized,
cost-effective hardware and software, rather than the specialized
systems leveraged by a fault-tolerant solution. This is not to say that
failover clustering is necessarily a low-cost solution, but it can
deliver an effective HA solution at a much lower
cost than a fully tricked-out solution designed to be fault tolerant.
Some aspects of
clustering with failover clustering are inflexible—specifically the
hardware required for the servers in the cluster and the way that
hardware must be configured. The following list outlines the hardware
and networking needs you are likely to encounter for failover
clustering:
Servers.
As mentioned previously, at least two servers must be available to
create a database cluster with failover clustering. Unlike log shipping
and database mirroring, these servers cannot host databases that exist
outside the cluster. Take special care to evaluate the needs of your
database environment and confirm that the hardware configuration you
select can meet those needs in a clustered configuration.
Note
In Windows Server 2003,
Microsoft Clustering Services (MSCS) required the hardware used for a
failover cluster to be on the Windows Hardware Compatibility List (HCL);
that’s changed in Windows Server 2008. Now, a failover cluster’s
hardware must be marked by its vendor as “Certified for Windows Server
2008,” and the entire configuration must be validated with a new tool,
the Validate a Configuration Wizard. (It is also known as the Cluster
Validation Tool, or CVT.) This tool consists of a series of simulations
and tests designed to confirm that the hardware planned for use in a
failover cluster meets the specifications necessary to run it. The
Validate a Configuration Wizard can also be run against a configured
failover cluster as an additional test of its configuration to further
ensure that it is ready for use, something we strongly encourage.
Identical configurations.
Each server within the cluster must have an identical configuration for its RAM, CPU, system disk, and so on.
Redundant network hardware.
Each server within the cluster must have at least two network interface
cards (NICs): one for communication with the clients accessing the
database server, and one to connect to its cluster node for heartbeat
and status updates.
Advanced network hardware.
Each server within the cluster must be able to establish fast and
reliable communications with the other members of the cluster, usually
via specific hardware solutions such as a crossover cable (in the
simplest case) or fiber optic cable.
Specialized storage.
Each server within the cluster must be able to access a centralized
storage system, such as a storage area network (SAN), to access the data
created, stored, and updated by a cluster, such as database files.
Failover clustering follows the “shared-nothing” model in its use of
storage within a cluster, meaning that all the servers in a cluster can
access the cluster’s storage repository, but it is updated and managed
by only one server at a time: the primary server or node in the cluster.
Note
The maximum amount of shared
storage space that a SQL Server database can use when hosted in a
failover cluster is 2 terabytes (TB).
High-speed connection to shared storage.
Each server must have a high-speed connection to the cluster’s central
storage, such as Small Computer System Interface (SCSI), Fibre Channel
(FC), or Internet SCSI (iSCSI).
Network resources.
At a minimum, you must provide a Network Basic Input/Output System
(NetBIOS) name and a unique static Internet Protocol (IP) address for
the cluster, as well as static IP addresses for all the NICs that
servers within the cluster use.
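The requirements in the list above lend themselves to a quick pre-check of a planned server inventory. The following is a minimal Python sketch, not a substitute for the Validate a Configuration Wizard. The Server record, its field names, and the set of accepted storage links are assumptions made for illustration; the list above names SCSI, FC, and iSCSI only as examples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    """Hypothetical inventory record for one candidate cluster server."""
    name: str
    ram_gb: int
    cpu_model: str
    system_disk_gb: int
    nic_count: int
    static_ips: bool           # True if every NIC has a static IP address
    shared_storage_link: str   # for example "SCSI", "FC", or "iSCSI"


# Assumption: only the three example connection types from the text are accepted.
ALLOWED_STORAGE_LINKS = {"SCSI", "FC", "iSCSI"}


def check_cluster_hardware(servers):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if len(servers) < 2:
        problems.append("a cluster needs at least two servers")
    # Identical configurations: compare RAM, CPU, and system disk to the first server.
    if servers:
        ref = servers[0]
        ref_config = (ref.ram_gb, ref.cpu_model, ref.system_disk_gb)
        for s in servers[1:]:
            if (s.ram_gb, s.cpu_model, s.system_disk_gb) != ref_config:
                problems.append(f"{s.name}: configuration differs from {ref.name}")
    for s in servers:
        if s.nic_count < 2:
            problems.append(f"{s.name}: needs at least two NICs")
        if not s.static_ips:
            problems.append(f"{s.name}: all NICs need static IP addresses")
        if s.shared_storage_link not in ALLOWED_STORAGE_LINKS:
            problems.append(f"{s.name}: no high-speed link to shared storage")
    return problems
```

An empty result only means that none of these simple checks failed; the hardware certification and wizard-based validation described in the note above still apply.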
Configuring Windows Server Failover Clustering
After you have procured,
installed, and configured your hardware and network solution, you are
ready to start configuring a database failover cluster using SQL Server
2008 and failover clustering. When you have built your servers and
installed the Windows Server 2008 operating system on them, you must
complete some prerequisite steps in the operating system of each server:
Enable the failover clustering feature.
You can enable this feature from the Initial Configuration Tasks window
or the Server Manager in Windows Server 2008 Enterprise or Datacenter,
as well as Windows Server 2008 R2 Enterprise or Datacenter.
Do not install antivirus.
Microsoft recommends not installing antivirus software on the server nodes in your cluster, because it can cause conflicts or problems with MSCS.
Do not compress hard drives.
You must uncompress the hard drive on each server node where SQL Server is to be installed.
Mount shared storage.
Windows Server allows additional drives or storage volumes to be
mounted, including those presented via shared storage. It also requires a
drive letter to be assigned to each drive when it is mounted, which
limits a server to 25 mount points. You can avoid this latter limitation
by mounting a local physical drive to a letter, such as D, and then
mounting your shared volumes as directories under the D volume, a
process known as a mount-point folder path.
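To make the mount-point folder idea concrete, here is a small Python sketch that plans one folder per shared volume under a single lettered drive. It only computes the paths and does not mount anything; the function name and the volume names in the usage note are hypothetical.

```python
from pathlib import PureWindowsPath


def plan_mount_point_folders(host_drive, volume_names):
    """Map each shared volume name to a folder path under one lettered drive."""
    root = PureWindowsPath(f"{host_drive}:/")
    # Every volume shares the host drive's letter, so no extra letters are consumed.
    return {name: str(root / name) for name in volume_names}
```

Calling plan_mount_point_folders("D", ["SQLData", "SQLLogs"]) yields the folders D:\SQLData and D:\SQLLogs, so both shared volumes sit under the D volume without taking a drive letter of their own.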
Your system should now be
ready for failover clustering to be configured and a cluster to be
created with at least two servers functioning as nodes within the
cluster. Unfortunately, this article cannot provide a walkthrough of how
to configure a failover cluster; the shared storage required by the
cluster is not a resource that you can easily acquire, and the available
technical resources for creating the scenarios and walkthroughs in this
book do not include shared storage. The following list highlights
several issues to consider as part of planning and configuring your
server cluster with failover clustering for it to host a SQL Server
database instance:
Cluster service account.
Microsoft recommends the creation of a service account to be used as
the identity of the failover clustering service running on each server
node in the cluster. This account must be a domain account granted Local
Administrator rights on every server in the cluster. This account must
also be able to log into your clustered SQL Server database instance
with public rights to monitor its status. By default, the server’s Local
Administrators group has this right, but in some cases database
administrators remove that access as a security measure.
SQL Server service accounts.
The service accounts to be used as the identity of SQL Server’s various
services running on each server node in the cluster must be domain
accounts, not local accounts on each server node.
Turning on and off server nodes and storage.
Review Microsoft’s instructions for configuring failover clustering
(http://technet.microsoft.com/en-us/library/dd197547%28WS.10%29.aspx),
because they contain specific information regarding when to turn on and
off the various server nodes and storage resources during a cluster’s
configuration.
Quorum mode.
With Windows Server 2008, Microsoft changed the way failover clustering
tracked the status and health of the cluster. MSCS previously used a
storage resource, called a quorum disk,
to store the cluster’s configuration data and log files on a dedicated
volume, which was inevitably a single point of failure. Failover
clustering’s new approach for determining quorum requires that each node
submit a “vote” for its status, and if a majority of
votes are available, the cluster has achieved quorum. This removes the
dependency on a single item, making failover clusters much more stable.
Failover clustering offers several quorum modes to choose from; the
Validate a Configuration Wizard recommends a quorum mode when it runs,
and Microsoft’s advice is to use that recommendation unless you have
specific reasons to select another mode.
Failover Cluster Management application.
If your installed version of Windows Server 2008 includes the failover
clustering feature, you can find this application in the Start menu’s
Administrative Tools directory. This is the tool you must use to create
and manage your server clusters.
Cluster name.
The name of your cluster should follow Domain Name System (DNS)
naming rules. You can use upper- and lowercase letters, numbers, and
dashes in the name, which must be between 1 and 63 characters. The name
should also be unique within its parent domain.
Storage configuration options.
When running the New Server Cluster Wizard through the Cluster
Administrator tool, in its Select Computer page, you are prompted to
enter the name of the first computer to be added as a node in the new
cluster. This page also includes an Advanced button that, when clicked,
opens a dialog box where you can allow the wizard to automatically
configure the cluster’s shared storage (called the Typical
configuration) or to manually do it yourself (Advanced configuration).
With the Typical configuration, the wizard selects all the disks in the
mounted shared source as disks available to the cluster and creates
resources within the cluster for these disks. If you select the Advanced
configuration, you must use the Cluster.exe executable to configure the cluster’s shared storage.
Heartbeat.
After you’ve created the cluster and added additional server nodes to
it, make sure to configure the heartbeats that the cluster uses to
confirm that the network interfaces for each node are functioning
properly. Without this configuration, the cluster has no way to know if a
server node is available within a cluster.
Configuration review and testing.
Just because you have successfully created and configured your cluster
does not mean your work is done. You should immediately test your
cluster and confirm that it functions without error and is able to
successfully fail over from one node to another when the primary node is
unavailable. Review all server logs to confirm that no errors are being
reported within the cluster. You should establish regular tests of this
process, and any other cluster functions that you find mission
critical, to verify that the cluster continues to function as designed.
Now that you have created your failover cluster, complete with at least two
server nodes within it, you are ready to install SQL Server and create
your database instance in the failover cluster. As with the creation of
the server cluster, due to resource limitations, it’s not possible to
provide you with a detailed description of the steps necessary to create
your database instance successfully. However, the following is a checklist of items that you should review and evaluate while completing the process:
Follow SQL Server security best practices.
Configure your new instance with the same security settings and
measures as nonclustered instances, while taking into account the
special requirements of the cluster service account and the fact that
your SQL Server service accounts must be domain accounts.
Install SQL Server on a cluster.
To install SQL Server on each server node in the cluster, simply log on
to the cluster at its shared IP address (rather than the address of the
server acting as the active node in the cluster) and run the SQL Server
installer. SQL Server is built to be aware of and work in a clustered
environment. The installer can detect the cluster environment and
install the software to each server node in the cluster you select
through the wizard.
Validate the components to install.
If you are installing SQL Server via the GUI wizard, check the Create a
SQL Server Failover Cluster check box in the Components to Install
page. It appears as an indented item underneath the SQL Server Database
Services check box and is not checked by default. You must check it for
the installer to install SQL Server to all the nodes within the cluster.
Determine how to name your instances.
You can create failover clusters using either the default instance for
the cluster or a named instance. The choice is up to you.
Review your failover configuration.
Installing a single database instance in the cluster is referred to as
an Active/Passive failover configuration. You can also configure
multiple instances to be hosted within a single cluster, referred to as
an Active/Active failover configuration. In an Active/Active
configuration, you must assign each instance a different primary server
within the cluster. This configuration allows SharePoint’s databases to
be separated between the instances for scalability purposes.
Caution
If you are considering
implementing an Active/Active failover configuration, remember that in a
failover scenario, multiple active instances can end up hosted on a single
node within the cluster. This means that each node in the cluster must
be configured with sufficient hardware resources to host both instances,
or you must be willing to accept degraded performance for both instances
until an additional node can be brought online to accept one of the
active instances.
Correctly name the virtual server.
The value provided in the Virtual Server Name page of the installation
wizard should be the name of the cluster, not the name of the current
active node within the cluster.
Install SQL Server on every node in the cluster.
In the Cluster Node Configuration page, select every server node in the cluster so that SQL Server is installed to all of them.
Test your system.
When the installation wizard completes, thoroughly test your system to
confirm that the database instance is available to client connections,
is not reporting errors, and can be successfully failed over from one
node to another. Establish regular tests of this process, and any other
cluster functions that you find mission critical, to verify that the
cluster continues to function as designed.
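The sizing concern raised for Active/Active configurations in the checklist above can be checked with simple arithmetic: after a failover, one node may have to carry every instance at once. The Python sketch below uses RAM as the only resource, which is a simplifying assumption; the function name, node names, and instance names are hypothetical.

```python
def nodes_that_cannot_host_all(node_ram_gb, instance_ram_gb):
    """Return the names of nodes with less RAM than all instances need together."""
    # Worst case after a failover: a single node hosts every instance.
    required = sum(instance_ram_gb.values())
    return sorted(name for name, ram in node_ram_gb.items() if ram < required)
```

With nodes {"node1": 64, "node2": 32} and instances {"SP_CONTENT": 24, "SP_SEARCH": 16}, the combined requirement is 40 GB, so node2 is reported as too small to carry both instances. A real sizing exercise would also weigh CPU, storage throughput, and network load.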
SharePoint and Database Clustering
Now that you have
successfully created a failover cluster for a SQL Server database
instance, you can consider the implications of using that instance to
host SharePoint databases. One major advantage to the use of a failover
cluster for your SharePoint database instance(s) is that you can use it
to host all types of SharePoint databases without a special
configuration (beyond what it takes to create and configure the
cluster). The only step requiring specific attention is how you identify
the address of the database instance when creating the SharePoint farm;
you must submit the name of the cluster, not the name of the active
server node for the cluster.
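Because the wrong name is easy to enter, a small guard can catch a node name before it reaches the farm configuration. This Python sketch assumes the cluster naming rule given earlier (letters, numbers, and dashes, 1 to 63 characters); the function name and the server names in the usage note are hypothetical.

```python
import re

# Letters, digits, and dashes only, 1 to 63 characters long.
_NAME_RULE = re.compile(r"[A-Za-z0-9-]{1,63}")


def farm_database_server(cluster_name, node_names):
    """Return the name to give SharePoint, or raise ValueError if it looks wrong."""
    if not _NAME_RULE.fullmatch(cluster_name):
        raise ValueError(f"{cluster_name!r} breaks the cluster naming rule")
    # Compare case-insensitively so a node name in different case is still caught.
    if cluster_name.lower() in {n.lower() for n in node_names}:
        raise ValueError(f"{cluster_name!r} is a node name, not the cluster name")
    return cluster_name
```

For example, farm_database_server("SPSQLCLUSTER", ["SQLNODE1", "SQLNODE2"]) returns the cluster name unchanged, while passing "sqlnode1" raises an error because it matches one of the nodes.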
Note
SharePoint 2010 requires
that SQL Server 2008 be patched at least to Service Pack 1 (SP1) and
Cumulative Update 2 (CU2) if using it in a failover cluster.
SharePoint views the
clustered instance as it does any other database instance. During
installation of your farm, it creates all its needed databases without
error. The configuration, Central Administration content, and Search
databases can be hosted in the clustered instance because the name of
the cluster is used and written to these databases instead of the name
of the active server node in the cluster. So, in the case of an outage
on the active server node, when the cluster fails over to another server
node, you can still use these databases. The only outage that
SharePoint experiences is during the failover itself; when the new
active node comes online in the cluster, service is returned to normal
without requiring updates to the SharePoint farm.
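One rough way to observe the outage window during a failover test is to probe the cluster name repeatedly and record the longest run of failed attempts. The Python sketch below makes several assumptions: the host name in the usage note is hypothetical, port 1433 applies only if the instance listens on the SQL Server default port, and a successful TCP connection does not prove that the instance is ready to answer queries.

```python
import socket
import time


def tcp_probe(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def measure_outage(probe, attempts, interval=1.0, sleep=time.sleep):
    """Call probe repeatedly and return the longest run of consecutive failures."""
    longest = current = 0
    for _ in range(attempts):
        if probe():
            current = 0
        else:
            current += 1
            longest = max(longest, current)
        sleep(interval)
    return longest
```

A test run might call measure_outage(lambda: tcp_probe("SPSQLCLUSTER", 1433), attempts=120) while a manual failover is triggered; multiplying the result by the probe interval gives an approximate outage duration.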
Note
Keep in mind that you can
use SQL aliases with a failover cluster, even though the address for the
clustered instance that SharePoint uses does not change regardless of
which node in the cluster is active. You should still consider using SQL
aliases to further abstract the location of the clustered instance away
from SharePoint to give yourself greater flexibility and scalability
with your SQL Server back end.
Database Clustering Pros
Database clustering
is a powerful high-availability tool for SQL Server 2008, and there are
several reasons to consider it for your SharePoint databases.
The following list covers the most compelling reasons for its use:
True automatic failover.
When an active node within a cluster suffers an outage or failure, the
cluster automatically fails over to another node within the cluster.
Because SharePoint references
the identity of the cluster and not a specific node within it, you do
not need to update a farm’s configuration data to recognize the change
in database hosts.
Patching without outages.
You can complete Windows and SQL Server patching without making the
cluster itself unavailable. Simply apply your patches to the inactive
nodes in the cluster, then fail over the cluster manually to those
updated nodes, and patch the remaining nodes. You can do this without
interrupting the services that the cluster provides by taking advantage
of the cluster’s failover functionality.
Rapid failover.
Clustering your database means that, in the case of an outage, your
system has a drastically shorter time to return to normal service. It
only takes the amount of time required for the cluster administration
process to switch over to another server in the node; no manual
intervention or configuration is required to implement the failover.
Scalable.
Because Windows Server 2008, failover clustering, and SQL Server 2008
support up to 16 server nodes within a cluster and use flexible shared
technology for storage, you can configure your clustering solution in a
variety of ways to meet the needs of your system and easily expand it to
grow with your system.
Compatible with log shipping.
As with database mirroring, databases hosted in a cluster can be log
shipped to another instance to provide even more redundancy for your
data.
Note
Failover clustering is
also compatible with database mirroring, but we wouldn’t recommend it
because of the complexity and high costs of implementing such a hybrid
solution.
Choice of SQL Server backup model.
Unlike in database mirroring, you can back up databases in a cluster
using any backup model. The only exception to this is if you are also
using log shipping or database mirroring with your cluster, in which
case the constraints of the associated technology also apply.
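The patching sequence described under "Patching without outages" in the list above can be written out as an ordered plan. This Python sketch only produces the list of steps and does not touch a cluster; the function name and node names are hypothetical, and choosing the first inactive node as the failover target is an arbitrary assumption.

```python
def rolling_patch_plan(nodes, active):
    """Return ordered steps that patch every node while one node stays active."""
    if active not in nodes:
        raise ValueError("active node must be a member of the cluster")
    passive = [n for n in nodes if n != active]
    if not passive:
        raise ValueError("need at least one inactive node to fail over to")
    # Patch the inactive nodes first, then move the workload and patch the rest.
    steps = [f"patch {n}" for n in passive]
    steps.append(f"fail over from {active} to {passive[0]}")
    steps.append(f"patch {active}")
    return steps
```

For a three-node cluster with node1 active, the plan is: patch node2, patch node3, fail over from node1 to node2, and then patch node1.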
Database Clustering Cons
Unfortunately,
database clustering also comes with some disadvantages that can prove to
be stumbling blocks to its implementation. Following are those
disadvantages:
Network requirements.
Although server nodes within a cluster can be located in separate
datacenters, the bandwidth requirements for heartbeats and shared
storage connectivity mean that nodes usually cannot be more than a few
miles from one another.
SAN storage requirements.
The technology required to implement shared storage, from both a
hardware and software perspective, requires special expertise to
implement, operate, and maintain. This also adds a dependency on yet
another system for your SharePoint environment’s overall health and
well-being.
Costs.
In addition to the effort required to implement shared storage, the
hardware and software for the technology come at a high price. Various
providers and configurations are available in the marketplace, but even
the low end of the cost spectrum may prove prohibitive for your budget.
Fault tolerance.
Log shipping and database mirroring provide a certain level of fault
tolerance because the redundant data they preserve is stored on a
storage medium completely separate from that of its source. Because
clustering uses shared storage to store the data files for your
databases, an outage to that shared storage configuration affects your
entire cluster and the applications that use it.