SharePoint 2010 Disaster Recovery Development : Designing Applications for Disaster Recovery Readiness

5/28/2011 3:32:29 PM

Although conventional .NET best practices call for certain code patterns, some of those patterns can run counter to the “bigger picture,” which includes disaster recovery. What’s best for performance, for instance, can work against a strategy that has maximum supportability and location portability at its core.

This section approaches SharePoint development (and .NET code development in general) from a disaster recovery mindset and makes a handful of suggestions that are consistent with maximum recoverability, redundancy, and supportability for most custom applications in the event of a disaster.

Storage of Application Configuration Data

Nearly all applications, regardless of origin or intent, depend on some form of configuration data for proper operation. Configuration data, in this case, is defined as data that (a) is required for proper application operation, and (b) can vary based on the environment in which the application is installed and executed. This data can take many forms and be stored in many locations, including the following:

Paths to file system–based configuration data
Database sources and their associated connection strings
Resources describing internal error codes and their associated descriptions
Application credentials (encrypted or not) to access local and remote resources
Locale-specific settings and assemblies
References to assemblies that contain shared components
Logging settings and associated reporting information
If capable of unattended execution, schedules for noninteractive processing
E-mail recipients, templates, and conditions under which e-mail should be sent
Product IDs, registrations, and other codes
Version information

When it comes to disaster recovery, the rule of thumb regarding the storage of configuration data is this: if the data can be externalized, every reasonable attempt should be made to do so. Configuration and operational data should also be separated from actual application logic whenever possible. Practices such as embedding string literals within application code are not recommended. Under these guidelines, rethink custom code that demonstrates a reddish-brown color within the Visual Studio environment (indicative of the use of string literals) and declarative programming patterns.

Development within the .NET environment is made substantially easier (from a disaster recovery perspective) with the use of web.config files for Web-based applications and app.config files for Windows Forms applications. These files, which are tied to an application, can abstract the storage of application-specific settings, database connection strings, external type registrations, and more in a way that readily supports disaster recovery. If you’re leveraging these configuration files, though, you must realize that the configuration files are typically tied to the installation location of their associated application. If the application is not installed to a directory or drive that backup operations support, the configuration data present in the externalized file or files is typically lost with the application in the event of a disaster.

In addition to the use of web.config and app.config files, storage of application configuration data can be externalized through the use of a database, a separate custom settings file (such as an XML configuration file), Web services, or a host of other options. Each option offers a different set of strengths and weaknesses, so the decision regarding which to use depends on the acceptable trade-offs. Storage of configuration data in a database is attractive from a supportability and abstraction standpoint because the database itself is likely standalone and backed up, but use of a database in this fashion can result in a poorly performing solution. The use of an XML file tends to be better performing, but it also tends to encourage a custom storage scheme that is less supportable across an enterprise unless schemas are standardized.
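As a rough illustration of the separate settings file option, the following Python sketch reads key/value pairs from an XML document shaped like the appSettings section of a .NET configuration file. The sample keys and values (SmtpHost, ReportDatabase, and the example.com host) are hypothetical, and the function name is an assumption made for this example; a .NET application would normally use its own configuration APIs instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical settings document, shaped like an appSettings section.
SAMPLE_CONFIG = """
<configuration>
  <appSettings>
    <add key="SmtpHost" value="mail.example.com" />
    <add key="ReportDatabase" value="reports-sql-alias" />
  </appSettings>
</configuration>
"""


def load_app_settings(xml_text):
    """Return the key/value pairs found under <appSettings>."""
    root = ET.fromstring(xml_text.strip())
    settings = {}
    for node in root.findall("./appSettings/add"):
        settings[node.get("key")] = node.get("value")
    return settings


settings = load_app_settings(SAMPLE_CONFIG)
print(settings["SmtpHost"])
```

Because the values live outside the code, relocating the application to a recovery environment means editing the settings document rather than rebuilding the application.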

When you’re storing application configuration data for custom SharePoint solutions, both web.config storage and SharePoint database storage are highly feasible options and should be considered for use based on application needs and governance requirements. Because a SharePoint site is an ASP.NET site, it’s easy to store and retrieve settings data from web.config files. The SharePoint object model also includes some specialized types (such as the SPWebConfigModification type) that make it easy to integrate configuration data changes during installation or activation of custom code. At the same time, many SharePoint object types representing easily recognized entities (such as SPFarm, SPService, and SPWeb) have a Properties collection that can be used to persist custom data to the associated SharePoint databases. This means that use of the Properties collection to store configuration data for the aforementioned types results in that data being included in any backup approach that covers the SharePoint databases.

Only a couple of configuration storage options are discouraged outright. The first has already been mentioned: placing string literals in-line with application code greatly reduces supportability and location portability. The second is the Windows Registry. In the days of COM, storage of settings in the Windows Registry was considered a step forward; from a disaster recovery perspective, storage of application settings in such a fashion is not recommended if you can avoid it. Although current backup mechanisms often capture the Registry and its settings, accessing and modifying the settings contained within the Registry is much more involved and less friendly than working with external settings files or Web services positioned for configuration storage. It would be a challenge to identify circumstances under which the storage of SharePoint settings in the Registry would be preferable to the use of the SPFarm.Properties collection.

Storage of Transient and Persistent Application Business Data

Configuration data may be responsible for getting an application running and identifying how it should interact within its runtime environment, but it is an application’s business data that is tied to the real value that the application brings to an organization. Business data takes many forms; the list that follows contains just a few of the multitude of file and data types that fall into this category:

Spreadsheets
Written documents, including e-mail messages
Presentations, multimedia files, and other audio/visual assets

Whereas configuration data is required for an application to simply execute, business data can generally be thought of as the data that is produced or consumed in the day-to-day operations of an application. Business data can be persistent and live beyond the scope of execution of the application; it can also be transient or temporary data that an application uses during computations, auto-saves, and so on.

The question of where an application should store business data is not a new one. The following are some recommendations and points for consideration:

Clearly separate business data from other data. On both servers and client workstations, a best practice is to format at least two separate logical disks for local storage requirements. One logical drive typically contains the Windows system and program files (typically C:\), whereas another (oftentimes E:\) contains application and business data. The use of at least two separate logical drives in this fashion makes the creation, maintenance, and targeting of backup operations much easier.
Leverage environment variables. Environmental portability and disaster recovery are aided significantly when you avoid assumptions regarding the structure of the file system hosting an application. This is particularly true when it comes to the storage of transient application data. Many applications need to use the hosting system’s file system for activities such as compression/decompression, encryption/decryption, and other stream-related operations. In these instances, you can use environment variables that the hosting operating system supplies to ensure that proper file system locations are employed. In the case of temporary or working files, for instance, the %TEMP% environment variable defines the default temporary files location for the user who is currently logged on to the operating system.
Make business data storage locations configurable. This is an extension to the point that was made with the previous item. When the storage of persistent data is a requirement, you must provide some mechanism to permit the configuration of the storage location. This could be something as common as the Save As dialog box seen throughout the Windows world, or it could be an application configuration file setting that drives all data to a known location. Regardless of the mechanism selected, avoid assumptions about the hosting system’s file system structure at all costs.
Employ network-available services when possible. Disaster recovery operations are significantly aided when you can centralize critical business data for backup and restore purposes. Traditional file shares represent one example of how such centralization can be achieved, but they are by no means the only mechanism. Databases, custom business services, and even SharePoint (through WebDAV and the WebClient service) can be utilized for this purpose.
Consider the cloud. Microsoft, Amazon, Google, and many other vendors have been steadily increasing the capability and reliability of their cloud-based storage offerings. At the same time, the tools and APIs needed to interact with cloud-based storage have been getting easier for developers to learn and use. When it comes to offsite storage that is itself redundant and ready for disaster recovery tasks, it is well worth the time invested to see if you can integrate cloud-based storage into your design.
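The recommendations about environment variables and configurable storage locations can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the APP_DATA_DIR variable, the DataDirectory setting, and the export.tmp file name are hypothetical names chosen for illustration, and a .NET application would use its own equivalents.

```python
import os
import tempfile


def working_directory():
    """Return the temporary-files location supplied by the hosting OS."""
    # tempfile consults TMPDIR, TEMP, and TMP before using platform defaults.
    return tempfile.gettempdir()


def storage_directory(settings, env=None):
    """Resolve the persistent data location from configuration only."""
    env = os.environ if env is None else env
    # APP_DATA_DIR and DataDirectory are hypothetical names for illustration.
    location = env.get("APP_DATA_DIR") or settings.get("DataDirectory")
    if not location:
        # Fail loudly rather than guessing at the file system layout.
        raise ValueError("no data directory configured")
    return location


# Transient files go wherever the operating system says they should.
scratch_file = os.path.join(working_directory(), "export.tmp")
```

Neither function contains a drive letter or directory name, so the same code runs unchanged when the recovery environment lays out its disks differently from production.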

With SharePoint custom solutions, the storage of transient data should obey the points just described. The storage of persistent business data, however, tends not to be a large issue. Simply storing business data in SharePoint lists and document libraries ensures that the business data is covered in the event of a disaster provided you have a well-conceived, implemented, and tested SharePoint disaster recovery strategy.

Accessing Network Resources

In today’s highly interconnected computing environments, network resources are a common reality and storage location for much of the data leveraged by applications. The following are common examples of network resources:

File shares (that is, file system storage locations not resident on local disks)
E-mail stores (POP3, IMAP) for e-mail-enabled applications
Databases
FTP sites
Any HTTP/HTTPS-enabled sites and services (overlaid file shares, Web services, and so on)

The best support for disaster recovery scenarios for network resources comes when those resources are accessed through indirection or some form of abstraction layer. Although the abstraction of such resources can be an application-specific exercise, several mechanisms are built into common operating systems and network stacks to decouple the naming of such resources and services from their actual implementations:

Domain Name System (DNS). DNS is perhaps the most common approach to separating the host names used in uniform resource locators (URLs) and namespaces from actual resource implementations, and it is the standard for Internet naming. If a user supplies a human-readable host name (such as www.amazon.com) to a DNS server, the DNS server resolves the host name to an IP address (for example, 72.21.203.1). DNS decouples names from IP addresses, but it comes with a cost in the form of DNS servers, increased management overhead, and the need to keep names and their associated IP addresses up to date.
Distributed File System (DFS). Practically speaking, DFS can be regarded as a “file system switchboard” service. Enterprise wide, DFS supports the practice of specifying, mapping, and redirecting network file paths. This approach decouples file path references from their underlying implementations, but it carries with it the need for additional maintenance and administration.
SQL Server aliases. Use of a SQL Server alias creates a machine-local abstraction layer between a SQL Server instance and applications that want to interact with databases on that SQL Server instance. When an alias is defined on a machine, you must specify a minimum of two parameters: a server name and an alias name. Once you have established such an alias, connection strings that would normally use the server name can instead use the alias name. If the server name needs to be changed or updated, updating the alias is all that is required. The use of aliases affords a great deal of flexibility and portability for SharePoint farms and applications that leverage them.
Mapped network drives. Mapped network drives are a common approach to identifying network resources using local path specifications. Nearly all applications and platforms support the notion of mapped drives in some sense, making them a solid backward-compatible approach to separating identifier from implementation. Unfortunately, mapped drives tend to be established on a per-user or per-session basis. This limits their potential usefulness in many cases, particularly regarding activities that are carried out within the context of a noninteractive account.

The only methods that should be avoided wholesale when accessing network resources are those involving direct IP address access and the use of NetBIOS or straight machine names. Both of these methods fail to leverage an abstraction layer of some sort, so their viable use within a functional disaster recovery environment is questionable. After all, most “live” data centers (or failover targets) have servers and naming schemes that differ from those being used in the standard production environments that are being protected by the disaster recovery implementation.
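One way to act on this guidance is to check configured endpoints before they are used. The Python sketch below is a simple heuristic, not a complete validator: it flags literal IP addresses and single-label names (which is how NetBIOS and bare machine names usually appear). The function name and sample host names are hypothetical. Note that a single-label SQL Server alias would also be flagged, so a real check would likely need an allow-list for known aliases.

```python
import ipaddress


def bypasses_name_abstraction(host):
    """Return True for a literal IP address or a single-label machine name."""
    candidate = host.strip("[]")  # tolerate bracketed IPv6 literals
    try:
        ipaddress.ip_address(candidate)
        return True  # direct IP address access
    except ValueError:
        pass
    # No dot means a single-label name, such as a NetBIOS or bare machine name.
    return "." not in candidate


for endpoint in ("10.0.0.5", "SPSQL01", "sql.corp.example.com"):
    print(endpoint, bypasses_name_abstraction(endpoint))
```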

When it comes to custom SharePoint applications, developers are advised to simply use DNS whenever possible if calls to other sites or network resources are required. SharePoint’s alternate access mapping (AAM) capability simplifies the process of extending any SharePoint site that may have an IP address in its URL to make it addressable by DNS name, so SharePoint is exceptionally DNS friendly for applications attempting to access its sites and Web services. Because SharePoint’s AAM capabilities and zone mappings are also accessible through the SharePoint object model (using the SPWebApplication.AlternateUrls property, for instance), it’s easy to ensure that custom SharePoint applications can cope with environmental changes and gracefully fall back to alternate access points to a site if needed.

Application Logging and Monitoring

The previous design readiness suggestions focused primarily on ways to decouple addressing and usage of application resources and data. The final recommendations offered in this section focus on providing insight and understanding into how an application is operating.

Logging and monitoring are fairly common application requirements, but these areas are often inadequately addressed or supported when development is undertaken. Many times, they are seen as a “nice to have,” rather than a critical facet of a fully functional and well-architected application.

In a disaster recovery scenario, logging and monitoring take on additional importance. This is especially true when a custom application may have a recovery time objective (RTO) that is measured in hours or maybe even minutes. You simply don’t have the luxury of taking any measurable amount of time in such circumstances to focus on troubleshooting a problematic application. If an application has issues coming online when recovered, the reasons for those issues need to be clearly spelled out.

At a minimum, applications should communicate not only errors, but critical informational items regarding where data is being accessed and utilized, security checks that pass and fail, anytime an application is falling back to a default value, and so on. A common mechanism for the communication of this information is the Windows Event Log, but items that are more “informational” in nature are often better supported and controlled through the use of trace switches and flags.
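The idea of reporting both where a value came from and when a default is used can be sketched as follows. This Python example uses the standard logging module as a stand-in for whatever logging facility the application actually targets; the logger name, setting keys, and sample values are hypothetical.

```python
import logging

logger = logging.getLogger("recovery_demo")  # hypothetical logger name


def get_setting(settings, key, default):
    """Return a configured value, logging its source or any fallback."""
    if key in settings:
        # Informational: record what was read so recovery staff can verify it.
        logger.info("Setting %s read from configuration: %s", key, settings[key])
        return settings[key]
    # Warning: a silent fallback is exactly what is hard to diagnose later.
    logger.warning("Setting %s missing; falling back to default %s", key, default)
    return default


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    get_setting({"ArchiveShare": "files.example.com"}, "ArchiveShare", "localhost")
    get_setting({}, "RetryCount", 3)
```

Keeping the informational message at a lower level than the fallback warning lets operators raise or lower verbosity without code changes, which is the same role trace switches play in .NET.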

Being built upon ASP.NET, SharePoint has access to ASP.NET’s full array of event tracing and notification capabilities. Errors, warnings, and other informational items can be written to the ASP.NET trace logs and event sinks. Critical application errors can be added to the AllErrors collection of the HttpContext for further processing and analysis downstream in the ASP.NET pipeline. In addition to these capabilities, SharePoint has its own unified logging service (ULS) to which developers can write messages of any sort. SharePoint 2010 also introduces correlation IDs for troubleshooting and the logging database for aggregating information from across a farm. These capabilities greatly simplify the problem of pinpointing issues that arise with custom SharePoint code and applications.

Including Windows performance counters is another step forward that can promote greater supportability and troubleshooting with mission-critical applications. This is particularly true for applications that operate as services or lack any form of interface. Thoughtfully chosen and implemented counters can mean the difference between befuddled head scratching and insight when attempting to identify the source of a problem during recovery.

Both SharePoint and ASP.NET come with a variety of performance counters that you can leverage out of the box to troubleshoot application and performance problems. In addition, developers have the standard abilities offered by .NET to create performance counters of their own for their SharePoint applications.
