SQL Server 2008 High Availability : Database Clustering

6/23/2011 3:55:35 PM

Clustering is not unique to SQL Server or even to database platforms in general. A server cluster consists of two or more servers, each configured identically, that are designed to consistently serve up an application or platform even if an error or outage occurs on one of the members of the cluster. Although this section focuses on how to use clustering with SQL Server, you can use it to provide HA capabilities for various platforms, such as Microsoft Exchange, Microsoft Hyper-V, and many more.

This section puts the spotlight on the failover clustering solution included in Windows Server 2008, but it is by no means the only clustering platform available to you for your SharePoint and SQL Server environment. Other clustering solutions are available in the marketplace to provide a viable HA solution for your database environment, each offering unique functionality, options, and challenges to give you some flexibility over how you cluster your SharePoint database. Although some products may be specific to the UNIX or Linux platforms, others, such as Symantec’s Veritas Cluster Server, are completely compatible with SharePoint and SQL Server and have been successfully implemented as enterprise clustering solutions in the most demanding of situations.

Note

The decision to highlight Windows Server failover clustering in this section is not meant to endorse it as a clustering product or indict its competitors. The goal is to show you how to implement a widely used clustering product for your SharePoint and SQL Server environment, not laud one product over another. It is up to you to evaluate the products in this space and determine which one is the best solution for your enterprise, its infrastructure, and its requirements. Like so many other aspects of SharePoint, this is not a one-size-fits-all kind of situation.

The Server Components of Windows Server Failover Clustering

One advantage of clustering as the HA solution for your SQL Server environment is the flexibility it gives you in designing the architecture of your solution. To create a cluster, you need at least two servers, each of which becomes a separate node within the cluster. A failover cluster can contain up to 16 nodes, depending on the edition of Windows Server 2008 being used, so you can start with two nodes and add more as your needs grow. Each node is expected to be able to serve as the primary provider of database services for the cluster, so that if a node is taken down or suffers an outage, another node in the cluster can come online to continue that service with little or no downtime.

Failover clustering is available as an included component of Windows Server 2008’s Enterprise and Datacenter editions. Microsoft is careful to state that failover clustering is intended to be used as an HA solution but is not completely fault tolerant. The term fault tolerant describes systems and solutions designed with an extremely high degree of redundancy and the ability to provide nearly instantaneous recovery times; the downside is that these systems often come with a prohibitively high price tag to match. Failover clustering was designed to enable systems to be highly available while using standardized, cost-effective hardware and software, rather than the specialized systems leveraged by a fault-tolerant solution. This is not to say that failover clustering is necessarily a low-cost solution, but it can deliver an effective HA solution at a much lower cost than a fully tricked-out solution designed to be fault tolerant.

Some aspects of clustering with failover clustering are inflexible—specifically the hardware required for the servers in the cluster and the way that hardware must be configured. The following list outlines the hardware and networking needs you are likely to encounter for failover clustering:

Servers. As mentioned previously, at least two servers must be available to create a database cluster with failover clustering. Unlike log shipping and database mirroring, these servers cannot host databases that exist outside the cluster. Take special care to evaluate the needs of your database environment and confirm that the hardware configuration you select can meet those needs in a clustered configuration.
Note

In Windows Server 2003, Microsoft Clustering Services (MSCS) required the hardware used for a failover cluster to be on the Windows Hardware Compatibility List (HCL); that’s changed in Windows Server 2008. Now, a failover cluster’s hardware must be marked by its vendor as “Certified for Windows Server 2008,” and the entire configuration must be validated with a new tool, the Validate a Configuration Wizard. (It is also known as the Cluster Validation Tool, or CVT.) This tool consists of a series of simulations and tests designed to confirm that the hardware planned for use in a failover cluster meets the specifications necessary to run it. The Validate a Configuration Wizard can also be run against a configured failover cluster as an additional test of its configuration to further ensure that it is ready for use, something we strongly encourage.
Identical configurations. Each server within the cluster must have an identical configuration for its RAM, CPU, system disk, and so on.
Redundant network hardware. Each server within the cluster must have at least two network interface cards (NICs): one for communication with the clients accessing the database server, and one for communication with the other nodes in the cluster for heartbeat and status updates.
Advanced network hardware. Each server within the cluster must be able to establish fast and reliable communications with the other members of the cluster, usually via specific hardware solutions such as a crossover cable (in the simplest case) or fiber optic cable.
Specialized storage. Each server within the cluster must be able to access a centralized storage system, such as a storage area network (SAN), to access the data created, stored, and updated by a cluster, such as database files. Failover clustering follows the “shared-nothing” model in its use of storage within a cluster, meaning that all the servers in a cluster can access the cluster’s storage repository, but it is updated and managed by only one server at a time: the primary server or node in the cluster.
Note

The maximum amount of shared storage space that a SQL Server database can use when hosted in a failover cluster is 2 terabytes (TB).
High-speed connection to shared storage. Each server must have a high-speed connection to the cluster’s central storage, such as Small Computer System Interface (SCSI), Fibre Channel (FC), or Internet SCSI (iSCSI).
Network resources. At a minimum, you must provide a Network Basic Input/Output System (NetBIOS) name and a unique static Internet Protocol (IP) address for the cluster, as well as static IP addresses for all the NICs that servers within the cluster use.
Note

For more detailed information from Microsoft on the hardware requirements of Windows Server 2008 failover clustering, see http://technet.microsoft.com/en-us/library/dd197454%28WS.10%29.aspx.
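
Several of the requirements above are mechanical enough to check against your own inventory before you run the Validate a Configuration Wizard. The following Python sketch is an illustration only: the ServerSpec record, its field names, and the server names are hypothetical stand-ins for whatever inventory data you keep, and the check does not replace the wizard.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ServerSpec:
    """Hypothetical inventory record; field names are illustrative only."""
    name: str
    ram_gb: int
    cpu_model: str
    cpu_count: int
    system_disk_gb: int
    nic_count: int
    all_ips_static: bool
    sees_shared_storage: bool

    def hardware_profile(self) -> tuple:
        # The attributes that must match on every server in the cluster.
        return (self.ram_gb, self.cpu_model, self.cpu_count, self.system_disk_gb)


def find_cluster_problems(servers: List[ServerSpec]) -> List[str]:
    """Return readable problems; an empty list means none were found."""
    problems: List[str] = []
    if len(servers) < 2:
        problems.append("a cluster needs at least two servers")
    for server in servers:
        # Compare every server against the first one as the reference build.
        if server.hardware_profile() != servers[0].hardware_profile():
            problems.append(f"{server.name}: hardware differs from {servers[0].name}")
        if server.nic_count < 2:
            problems.append(f"{server.name}: needs at least two NICs")
        if not server.all_ips_static:
            problems.append(f"{server.name}: every NIC needs a static IP address")
        if not server.sees_shared_storage:
            problems.append(f"{server.name}: cannot reach the shared storage")
    return problems
```

An empty result only means that none of these particular rules was violated; certification status and storage behavior still have to be confirmed by the vendor and the validation wizard.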

Configuring Windows Server Failover Clustering

After you have procured, installed, and configured your hardware and network solution, you are ready to start configuring a database failover cluster using SQL Server 2008 and failover clustering. When you have built your servers and installed the Windows Server 2008 operating system on them, you must complete some prerequisite steps in the operating system of each server:

Enable the failover clustering feature. You can enable this feature from the Initial Configuration Tasks window or the Server Manager in Windows Server 2008 Enterprise or Datacenter, as well as Windows Server 2008 R2 Enterprise or Datacenter.
Do not install antivirus. Microsoft recommends not installing antivirus software on the server nodes in your cluster, because it can cause conflicts or problems with MSCS.
Do not compress hard drives. You must uncompress the hard drive on each server node where SQL Server is to be installed.
Mount shared storage. Windows Server allows additional drives or storage volumes to be mounted, including those presented via shared storage. By default, each mounted volume is assigned a drive letter, which limits a server to 25 mount points. You can avoid this limitation by mounting a local physical drive to a letter, such as D, and then mounting your shared volumes as directories under the D volume; each of these directories is known as a mount-point folder path.

Your system should now be ready for failover clustering to be configured and a cluster to be created with at least two servers functioning as nodes within the cluster. Unfortunately, this article cannot provide a walkthrough of how to configure a failover cluster; the shared storage required by the cluster is not a resource that you can easily acquire, and the available technical resources for creating the scenarios and walkthroughs in this book do not include shared storage. The following list highlights several issues to consider as part of planning and configuring your server cluster with failover clustering for it to host a SQL Server database instance:

Cluster service account. Microsoft recommends the creation of a service account to be used as the identity of the failover clustering service running on each server node in the cluster. This account must be a domain account granted Local Administrator rights on every server in the cluster. This account must also be able to log into your clustered SQL Server database instance with public rights to monitor its status. By default, the server’s Local Administrators group has this right, but in some cases database administrators remove that access as a security measure.
SQL Server service accounts. The service accounts to be used as the identity of SQL Server’s various services running on each server node in the cluster must be domain accounts, not local accounts on each server node.
Turning on and off server nodes and storage. Review Microsoft’s instructions for configuring failover clustering (http://technet.microsoft.com/en-us/library/dd197547%28WS.10%29.aspx), because they contain specific information regarding when to turn on and off the various server nodes and storage resources during a cluster’s configuration.
Quorum mode. With Windows Server 2008, Microsoft changed the way failover clustering tracks the status and health of the cluster. MSCS previously used a storage resource, called a quorum disk, to store the cluster’s configuration data and log files on a dedicated volume, which was inevitably a single point of failure. Failover clustering’s new approach for determining quorum requires that each node submit a “vote” for its status, and if a majority of the votes is present, the cluster has achieved quorum. This removes the dependency on a single item, making failover clusters much more stable. Failover clustering supports several quorum modes; the Validate a Configuration Wizard recommends one when it runs, and Microsoft’s advice is to use that recommendation unless you have specific reasons to select another mode.
Failover Cluster Management application. If your installed version of Windows Server 2008 includes the failover clustering feature, you can find this application in the Start menu’s Administrative Tools directory. This is the tool you must use to create and manage your server clusters.
Cluster name. The name of your cluster should follow Domain Name System (DNS) naming rules. You can use upper- and lowercase letters, numbers, and dashes in the name, which must be between 1 and 63 characters long. The name should also be unique within its parent domain.
Storage configuration options. When running the New Server Cluster Wizard through the Cluster Administrator tool, in its Select Computer page, you are prompted to enter the name of the first computer to be added as a node in the new cluster. This page also includes an Advanced button that, when clicked, opens a dialog box where you can allow the wizard to automatically configure the cluster’s shared storage (called the Typical configuration) or do it manually yourself (Advanced configuration). With the Typical configuration, the wizard selects all the disks in the mounted shared storage as disks available to the cluster and creates resources within the cluster for these disks. If you select the Advanced configuration, you must use the Cluster.exe executable to configure the cluster’s shared storage.
Heartbeat. After you’ve created the cluster and added additional server nodes to it, make sure to configure the heartbeats that the cluster uses to confirm that the network interfaces for each node are functioning properly. Without this configuration, the cluster has no way to know if a server node is available within a cluster.
Configuration review and testing. Just because you have successfully created and configured your cluster does not mean your work is done. You should immediately test your cluster and confirm that it functions without error and is able to successfully fail over from one node to another when the primary node is unavailable. Review all server logs to confirm that no errors are being reported within the cluster. You should establish regular tests of this process, and any other cluster functions that you find mission critical, to verify that the cluster continues to function as designed.
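
Two of the planning items above reduce to simple rules that can be sketched in code: the cluster naming rule and the majority-vote test for quorum. The Python below is a minimal illustration under stated assumptions. The function names are hypothetical, the name check covers only the rules listed above (letters, numbers, dashes, 1 to 63 characters), and the quorum check models a plain majority of votes rather than any specific quorum mode.

```python
import re

# Naming rule as listed above: letters, digits, and dashes, 1 to 63 characters.
_CLUSTER_NAME = re.compile(r"[A-Za-z0-9-]{1,63}")


def is_valid_cluster_name(name: str) -> bool:
    """Check a proposed cluster name against the listed character and length rules."""
    return _CLUSTER_NAME.fullmatch(name) is not None


def has_quorum(votes_present: int, total_votes: int) -> bool:
    """Return True when strictly more than half of all configured votes are present."""
    if total_votes <= 0 or not 0 <= votes_present <= total_votes:
        raise ValueError("vote counts are out of range")
    return votes_present * 2 > total_votes


if __name__ == "__main__":
    print(is_valid_cluster_name("SP-SQL-CLUSTER1"))  # True
    print(has_quorum(2, 3))                          # True: 2 of 3 is a majority
    print(has_quorum(2, 4))                          # False: 2 of 4 is only half
```

The even-numbered case shows why a cluster with an even number of voters can lose quorum when exactly half of them fail, which is one reason to follow the quorum mode that the validation wizard recommends.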

Now that you have created your failover cluster, complete with at least two server nodes within it, you are ready to install SQL Server and create your database instance in the failover cluster. As with the creation of the server cluster, due to resource limitations, it’s not possible to provide you with a detailed description of the steps necessary to create your database instance successfully. However, the following is a checklist of items that you should review and evaluate while completing the process:

Follow SQL Server security best practices. Configure your new instance with the same security settings and measures as nonclustered instances, while taking into account the special requirements of the cluster service account and the fact that your SQL Server service accounts must be domain accounts.
Install SQL Server on a cluster. To install SQL Server on each server node in the cluster, simply log on to the cluster at its shared IP address (rather than the address of the server acting as the active node in the cluster) and run the SQL Server installer. SQL Server is built to be aware of and work in a clustered environment. The installer can detect the cluster environment and install the software to each server node in the cluster you select through the wizard.
Validate the components to install. If you are installing SQL Server via the GUI wizard, check the Create a SQL Server Failover Cluster check box in the Components to Install page. It appears as an indented item underneath the SQL Server Database Services check box and is not checked by default. You must check it for the installer to install SQL Server to all the nodes within the cluster.
Determine how to name your instances. You can create failover clusters using either the default instance for the cluster or a named instance. The choice is up to you.
Review your failover configuration. Installing a single database instance in the cluster is referred to as an Active/Passive failover configuration. You can also configure multiple instances to be hosted within a single cluster, referred to as an Active/Active failover configuration. In an Active/Active configuration, you must assign each instance a different primary server within the cluster. This configuration allows SharePoint’s databases to be separated between the instances for scalability purposes.
Caution

If you are considering implementing an Active/Active failover configuration, remember that in a failover scenario, multiple active instances can end up hosted on a single node within the cluster. This means that each node in the cluster must be configured with sufficient hardware resources to host both instances, or you must be willing to accept degraded performance for both instances until an additional node can be brought online to accept one of them.
Correctly name the virtual server. The value provided in the Virtual Server Name page of the installation wizard should be the name of the cluster, not the name of the current active node within the cluster.
Install SQL Server on every node in the cluster. In the Cluster Node Configuration page, select every server node in the cluster so that SQL Server is installed to all of them.
Test your system. When the installation wizard completes, completely test your system to confirm that the database instance is available to client connections, is not reporting errors, and can be successfully failed over from one node to another. Establish regular tests of this process, and any other cluster functions that you find mission critical, to verify that the cluster continues to function as designed.
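
The Caution above about Active/Active configurations is a sizing question that a little arithmetic can answer. The Python sketch below is an illustration, not a sizing tool: it uses memory as the only resource, and the node names, instance names, and figures are made up. It reports the nodes that could not hold every instance at once, which is the worst case after a failover.

```python
from typing import Dict


def nodes_short_on_memory(node_ram_gb: Dict[str, int],
                          instance_ram_gb: Dict[str, int]) -> Dict[str, int]:
    """Return the memory shortfall in GB for each node that cannot host all instances."""
    # Worst case: every instance fails over onto the same node.
    required = sum(instance_ram_gb.values())
    return {node: required - ram
            for node, ram in node_ram_gb.items()
            if ram < required}


if __name__ == "__main__":
    nodes = {"NODE1": 64, "NODE2": 48}                   # hypothetical node sizes
    instances = {"SP_CONTENT": 24, "SP_SERVICES": 32}    # hypothetical instance needs
    print(nodes_short_on_memory(nodes, instances))       # {'NODE2': 8}
```

A non-empty result means you either add resources to the listed nodes or accept degraded performance while both instances share one of them.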

SharePoint and Database Clustering

Now that you have successfully created a failover cluster for a SQL Server database instance, you can consider the implications of using that instance to host SharePoint databases. One major advantage to the use of a failover cluster for your SharePoint database instance(s) is that you can use it to host all types of SharePoint databases without a special configuration (beyond what it takes to create and configure the cluster). The only step requiring specific attention is how you identify the address of the database instance when creating the SharePoint farm; you must submit the name of the cluster, not the name of the active server node for the cluster.

Note

SharePoint 2010 requires that SQL Server 2008 be patched at least to Service Pack 1 (SP1) and Cumulative Update 2 (CU2) if using it in a failover cluster.

SharePoint views the clustered instance as it does any other database instance. During installation of your farm, it creates all its needed databases without error. The configuration, Central Administration content, and Search databases can be hosted in the clustered instance because the name of the cluster is used and written to these databases instead of the name of the active server node in the cluster. So, in the case of an outage on the active server node, when the cluster fails over to another server node, you can still use these databases. The only outage that SharePoint experiences is during the failover itself; when the new active node comes online in the cluster, service is returned to normal without requiring updates to the SharePoint farm.
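
Custom scripts that talk to the clustered instance, such as a monitoring job, see the same brief outage during failover. One way to ride through it is to retry the connection a few times before giving up. The Python sketch below is an assumption-laden illustration: connect_with_retry is a hypothetical helper, the connect callable stands in for whatever database driver you use, and the driver is assumed to raise ConnectionError while the cluster name is unreachable.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def connect_with_retry(connect: Callable[[], T],
                       attempts: int = 5,
                       delay_seconds: float = 2.0,
                       sleep: Callable[[float], None] = time.sleep) -> T:
    """Call connect() until it succeeds or the attempts are used up."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error: Optional[ConnectionError] = None
    for attempt in range(1, attempts + 1):
        try:
            # connect() should target the cluster name, never a node name,
            # so the same call works no matter which node is active.
            return connect()
        except ConnectionError as exc:
            last_error = exc
            if attempt < attempts:
                sleep(delay_seconds)  # wait for the failover to complete
    assert last_error is not None
    raise last_error
```

The sleep parameter exists only so the wait can be replaced in tests; in normal use the default is enough. Choose the attempt count and delay from the failover times you measure in your own tests.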

Note

Keep in mind that you can use SQL aliases with a failover cluster, even though the address for the clustered instance that SharePoint uses does not change regardless of which node in the cluster is active. You should still consider using SQL aliases to further abstract the location of the clustered instance away from SharePoint to give yourself greater flexibility and scalability with your SQL Server back end.

Database Clustering Pros

Database clustering is a powerful high-availability tool for SQL Server 2008, and there are several reasons it is a viable option for your SharePoint databases. The following list covers the most compelling:

True automatic failover. When an active node within a cluster suffers an outage or failure, the cluster automatically fails over to another node within the cluster. Because SharePoint references the identity of the cluster and not a specific node within it, you do not need to update a farm’s configuration data to recognize the change in database hosts.
Patching without outages. You can complete Windows and SQL Server patching without making the cluster itself unavailable. Simply apply your patches to the inactive nodes in the cluster, then fail over the cluster manually to those updated nodes, and patch the remaining nodes. You can do this without interrupting the services that the cluster provides by taking advantage of the cluster’s failover functionality.
Rapid failover. Clustering your database means that, in the case of an outage, your system has a drastically shorter time to return to normal service. It only takes the amount of time required for the cluster administration process to switch over to another node in the cluster; no manual intervention or configuration is required to implement the failover.
Scalable. Because Windows Server 2008, failover clustering, and SQL Server 2008 support up to 16 server nodes within a cluster and use flexible shared technology for storage, you can configure your clustering solution in a variety of ways to meet the needs of your system and easily expand it to grow with your system.
Compatible with log shipping. As with database mirroring, databases hosted in a cluster can be log shipped to another instance to provide even more redundancy for your data.
Note

Failover clustering is also compatible with database mirroring, but we wouldn’t recommend it because of the complexity and high costs of implementing such a hybrid solution.
Choice of SQL Server recovery model. Unlike database mirroring, which requires the full recovery model, clustering lets you back up databases using any recovery model. The only exception to this is if you are also using log shipping or database mirroring with your cluster, in which case the constraints of the associated technology also apply.
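
The patching sequence described in the list above (patch the inactive nodes, fail over, then patch the node that was active) can be written down as an ordered plan. The Python sketch below only generates that plan as text; the function name and node names are hypothetical, and the actual patching and failover are still performed with your usual tools.

```python
from typing import List


def rolling_patch_plan(nodes: List[str], active: str) -> List[str]:
    """Return the ordered steps for patching a cluster one node at a time."""
    if active not in nodes:
        raise ValueError("the active node must be one of the cluster nodes")
    if len(nodes) < 2:
        raise ValueError("a rolling patch needs at least two nodes")
    passive = [node for node in nodes if node != active]
    # 1. Patch every node that is not currently serving the instance.
    steps = [f"patch {node}" for node in passive]
    # 2. Move the instance to a node that has already been patched.
    steps.append(f"fail over from {active} to {passive[0]}")
    # 3. Patch the node that was active, now that it is idle.
    steps.append(f"patch {active}")
    return steps


if __name__ == "__main__":
    for step in rolling_patch_plan(["NODE1", "NODE2", "NODE3"], active="NODE1"):
        print(step)
```

Clients still see the short interruption of the manual failover itself, so schedule that step for a quiet period.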

Database Clustering Cons

Unfortunately, database clustering also comes with some disadvantages that can prove to be stumbling blocks to its implementation. Following are those disadvantages:

Network requirements. Although server nodes within a cluster can be located in separate datacenters, the bandwidth requirements for heartbeats and shared storage connectivity mean that nodes usually cannot be more than a few miles from one another.
SAN storage requirements. The technology required to implement shared storage, from both a hardware and software perspective, requires special expertise to implement, operate, and maintain. This also adds a dependency on yet another system for your SharePoint environment’s overall health and well-being.
Costs. In addition to the effort required to implement shared storage, the hardware and software for the technology come at a high price. Various providers and configurations are available in the marketplace, but even the low end of the cost spectrum may prove prohibitive for your budget.
Fault tolerance. Log shipping and database mirroring provide a certain level of fault tolerance because the redundant data they preserve is stored on a storage medium completely separate from that of its source. Because clustering uses shared storage to store the data files for your databases, an outage to that shared storage configuration affects your entire cluster and the applications that use it.
