We outline some of the major
performance and tuning design guidelines here. There are, of course,
many more, but if you at least consider and apply the ones outlined here,
you should end up with a decently performing SQL Server implementation.
As we have described previously, performance and tuning should first be
“designed in” to your SQL Server implementation. Many of the guidelines
discussed here can be adopted easily in this way. However, when you put
off the performance and tuning until later, you have fewer options to
apply and less performance improvement when you do make changes.
Remember, addressing performance and tuning is like peeling an onion.
And, for this reason, we present our guidelines in that way—layer by
layer. This approach helps provide you with a great reference point for
each layer and a list you can check off as you develop your SQL
Server–based implementation. Just ask yourself whether you have
considered the specific layer guidelines when you are dealing with that
layer.
Hardware and Operating System Guidelines
Let’s start with the salient hardware and operating system guidelines that you should be considering:
Hardware/Physical Server:
Server sizing/CPUs—
Physical (or virtual) servers that will host a SQL Server instance
should be roughly sized to handle the maximum processing load plus 35%
more CPUs (and you should always round up). As an example, for a
workload that you anticipate may be fully handled by a four-CPU server
configuration, we recommend automatically increasing the number of CPUs
to six. We also always
leave at least one CPU for the operating system. So, if six CPUs are on
the server, you should allocate only five to SQL Server to use.
Memory—
The amount of memory you might need is often directly related to the
amount of data you need to be in the cache to achieve 100% or near 100%
cache hit ratios. This, of course, yields higher overall performance. We
don’t believe there is such a thing as too much memory for SQL Server, but we do
recognize that some memory must be left to the operating system to
handle OS-level processing, connections, and so on. So, in general, you
should make 90% of memory available to SQL Server and 10% to the OS.
Disk/SAN/NAS/RAID—
Your disk subsystem can be a major contributor to performance
degradation if not handled properly. We recognize that there are many
different options available here. We generally try to have some separate
devices on different
I/O channels so that disk I/O isolation techniques can be used. This
means that you isolate heavy I/O away from other heavy I/O activity;
otherwise, disk head contention causes massive slowdowns in physical
I/O. When you use SAN/NAS storage, much of the storage is just logical
drives that are heavily cached. This type of situation limits the
opportunity to spread out heavy I/O, but the caching layers often
alleviate that problem. In general, RAID 10 is great for high update
activity, and RAID 5 is great for mostly read-only activity.
Operating System:
Page file location—
When physical memory is exceeded, paging occurs to the page file. You
need to make sure that the page file is not located on one of your
database disk locations; otherwise, performance of the whole server
degrades rapidly.
Processes’ priority—
You should never lower the priority of the SQL Server processes or push
them to the background. You should always have them set as high as possible.
Memory—
As mentioned earlier, you should make sure that at least 10% of memory
is available to the OS for all its housekeeping, connection handling,
process threads, and so on.
OS version—
You should make sure you are using the most recent version of the
operating system that you can and keep it updated with the latest patches or
service packs. Also, often you must remove other software on your
server, such as specialized virus protection. We have lost track of the
number of SQL Server implementations we have found that had some
third-party virus software installed (and enabled) on it, and all files
and communication to the server were interrupted by the virus scans.
Rely on Microsoft Windows and your firewalls for this protection rather
than a third-party virus solution that gets in the way of SQL Server. If
your organization requires some type of virus protection on the server,
at least disable scanning of the database device files.
Network:
Packet sizes/traffic—
With greater bandwidth and faster network adapters (typically at least
1Gbps now), we recommend you utilize the larger packet sizes to
accommodate your heavier-traffic SQL Server instances (see the
configuration example at the end of this section).
Routers/switches/balancers—
Depending on whether you are using SQL clustering or have multitiered
application servers, you likely should utilize some type of load
balancing at the network level to spread out connections from the
network and avoid bottlenecks.
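The packet-size guideline above can be applied at the instance level with sp_configure. A minimal sketch, assuming a gigabit or faster network and a workload that regularly moves large result sets; 4096 bytes is the default, and 8192 is only an illustrative larger value to test:

```sql
-- Sketch: raise the network packet size for a heavy-traffic instance.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

EXEC sp_configure 'network packet size (B)', 8192;  -- default is 4096
RECONFIGURE;
```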
SQL Server Instance Guidelines
Next comes the SQL Server instance itself and the critical items that must be considered:
SQL Server configuration—
We do not list many of the SQL Server instance options here, but many
of the default options are more than sufficient to deal with most SQL
Server implementations.
SQL Server device allocations—
Devices should be treated with care and not over-allocated. SQL
databases utilize files and devices as their underlying allocation from
the operating system. You do not want dozens and dozens of smaller files
or devices for each database. Having all these files or devices becomes
harder to administer, move, and manipulate. We often come into a SQL
Server implementation and simplify the device allocations before we do
any other work on the database. At a minimum, you should create data
devices and log devices so that you can easily isolate (separate) them.
tempdb database—
Perhaps the most misunderstood SQL Server shared resource is tempdb. The
general guideline for tempdb is to minimize explicit usage (overuse) of
it by limiting temp table creation, sorts, queries using the DISTINCT
clause, and so on. Otherwise, you are creating a hot spot in your SQL
Server instance that is mostly not in your control. You might find it
hard to believe, but indexing, table design, and even not executing
certain SQL statements can have a huge impact on what gets done in
tempdb and, in turn, a huge effect on performance. And, of course, you
need to isolate tempdb away from all other databases.
master database—
There is one simple guideline here: protect the master database at all
costs. This means frequent backups and isolation of master away from all
other databases.
model database—
It seems harmless enough, but all databases in SQL Server utilize the
model database as their base allocation template. We recommend you
tailor this for your particular environment.
Memory—
The best way to utilize and allocate memory to SQL Server depends on a
number of factors. One is how many other SQL Server instances are
running on the same physical server. Another is what type of SQL
Server–based application it is: heavy update versus heavy reads. And yet
another is how much of your application has been written with stored
procedures, triggers, and so on. In general, you want to give as much of
the OS memory to SQL Server as you can. But this amount should never
exceed 90% of the available memory at the OS level. You don’t want SQL
Server or the OS to start thrashing via the page file or competing
against each other for memory. Also, when more than one SQL Server
instance is on the same physical server, you need to divide the memory
correctly for each. Don’t pit them against each other.
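A minimal configuration sketch of the memory guideline above, assuming a dedicated server with 64GB of RAM and a single instance; the 90/10 split leaves roughly 6GB for the OS, and the numbers are illustrative only:

```sql
-- Sketch: cap SQL Server at roughly 90% of a 64GB (65,536MB) server.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

EXEC sp_configure 'max server memory (MB)', 58982;  -- ~90%, leaving ~6GB for the OS
RECONFIGURE;
```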
Database-Level Guidelines
Database allocations—
We like to use an approach of putting database files for heavily used
databases on the same drives as lightly used databases when more than
one database is being managed by a single SQL Server instance. In other
words, pair big with small, not big with big. This approach is termed reciprocal database pairing. You
should also not have too many databases on a single SQL Server
instance. If the server fails, so do all the applications that were
using the databases managed by this one SQL Server instance. It’s all
about risk mitigation. Remember the adage “never put all your eggs in
one basket.” Databases
have two primary file allocations: one for their data portion and the
other for their transaction log portion. You should always isolate these
file allocations from each other onto separate disk subsystems with
separate I/O channels if possible. The transaction log is a hot spot for
highly volatile applications (that have frequent update activity).
Isolate, isolate, and isolate some more. You
need to size your database files appropriately large enough to avoid
database file fragmentation. Heavily fragmented database files can lead
to excessive file I/O within the operating system and poor I/O
performance. For example, if you know your database is going to grow to
500GB, size your database files at 500GB from the start so that the
operating system can allocate a contiguous 500GB file. In addition, be
sure to disable the Auto-Shrink database option. Allowing your database
files to continuously grow and shrink also leads to excessive file
fragmentation as file space is allocated and deallocated in small
chunks. (See the sketch at the end of this section.)
Database backup/recovery/administration—
You should create a database backup and recovery schedule that matches
the database update volatility and recovery point objective. All too
often a set schedule is used when, in fact, it should be the update
volatility and recovery objectives, not the schedule, that determine how
often you do backups and how fast you must recover from failure.
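The file-allocation guidelines above (separate data and log devices, pre-sized files, no auto-shrink) might be sketched as follows; the SalesDB name, drive letters, paths, and sizes are purely illustrative:

```sql
-- Sketch: pre-size data and log files on separate drives/I/O channels
-- so the OS can allocate contiguous space.
CREATE DATABASE SalesDB
ON PRIMARY
(
    NAME = SalesDB_data,
    FILENAME = 'E:\SQLData\SalesDB_data.mdf',   -- data on its own drive/LUN
    SIZE = 500GB,
    FILEGROWTH = 10GB
)
LOG ON
(
    NAME = SalesDB_log,
    FILENAME = 'L:\SQLLogs\SalesDB_log.ldf',    -- log isolated on a separate drive
    SIZE = 50GB,
    FILEGROWTH = 5GB
);

-- Avoid the grow/shrink cycles that fragment database files.
ALTER DATABASE SalesDB SET AUTO_SHRINK OFF;
```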
Table Design Guidelines
Table designs—
Given the massively increased CPU, memory, and disk I/O speeds that now
exist, you should use a general guideline to create as “normalized” a
table design as is humanly possible. No longer is it necessary to
massively denormalize for performance. Most normalized table designs are
easily supported by SQL Server. Normalized table designs ensure that
data has high integrity and low overall redundant data maintenance. See
Dr. E. F. Codd’s original work on relational database design (The Relational Model for Database Management: Version 2, Addison-Wesley, 1990).
Note
Too often, we have seen
attempts by developers and database designers to guess at the
performance problems they expect to encounter, denormalizing the
database design before any real performance testing has even been done.
This,
more often than not, results in an unnecessarily, and sometimes
excessively, denormalized database design. Overly denormalized databases
require creating additional code to maintain the denormalized data, and
this often ends up creating more performance problems than it attempts
to solve, not to mention the greater potential for data integrity issues
when data is heavily denormalized. It is always best to start with as
normalized a database as possible, and begin testing early in the
development process with real data volumes to identify potential areas
where denormalization may be necessary for performance reasons. Then,
and only when absolutely necessary, you can begin to look at areas in
your table design where denormalization may provide a performance
benefit.
Data types—
You must be consistent! In other words, you need to take the time to
make sure you have the same data type definitions for columns that will
be joined and/or come from the same data domain—Int to Int,
and so on. Often, the use of user-defined data types goes a long way
toward standardizing the underlying data types across tables and
databases. This is a very strong method of ensuring consistency (see the
sketch at the end of this section).
Defaults—
Defaults can help greatly in providing valid data values in columns
that are common or that have been specified as mandatory (NOT NULL). Defaults are tied to the column and are consistently applied, regardless of the application that touches the table.
Check constraints—
Check constraints can also be useful if you need to have checks of data
values as part of your table definition. Again, it is a consistency
capability at the column level that guarantees that only correct data
ends up in the column. Let us add a word of warning, though: you have to
be aware of the insert and update errors that can occur in your
application from invalid data values that don’t meet the check
constraints.
Triggers—
Often triggers are used to maintain denormalized data, custom audit
logs, and referential integrity. Triggers are often used when you want
certain behavior to occur when updates, inserts, and deletes occur,
regardless of where they are initiated from. Triggers can result in
cascading changes to related (dependent) tables or failures to perform
modifications because of restrictions. Keep in mind that triggers add
overhead to even the simplest of data modification operations in your
database and are a classic item to look at for performance issues. You
should implement triggers sparingly and implement only triggers that are
“appropriate” for the level of integrity or activity required by your
applications, and no more than is necessary. Also, you need to be
careful to keep the code within your triggers as efficient as possible so the impact on your data modifications is kept to a minimum.
Primary keys/foreign keys—
For OLTP and normalized table designs, you need to utilize explicit
primary key and foreign key constraints where possible. For many
read-only tables, you may not even have to specify a primary key or
foreign key at all. In fact, you will often be penalized with poorer
load times or slower bulk updates on tables that are used mostly as
lookup tables. SQL Server must invoke and enforce integrity constraints
if they are defined. If you don’t absolutely need them (such as with
read-only tables), don’t specify them.
Table allocations—
When creating tables, you should consider using the fill factor (free
space) options (when you have a clustered index) to correspond to the
volatility of the updates, inserts, and deletes that will be occurring
in the table. Fill factor leaves free space in the index and data pages,
allowing room for subsequent inserts without incurring a page split.
You should avoid page splits as much as possible because they increase
the I/O cost of insert and update operations.
Table partitioning—
It can be extremely powerful to segregate a table’s data into physical
partitions that are accessed via some natural subsetting, such
as date or key range. Queries that can take advantage of partitions can
help reduce I/O by searching only the appropriate partitions rather than
the entire table.
Purge/archive strategy—
You should anticipate the growth of your tables and determine whether a
purge/archive strategy will be needed. If you need to archive or purge
data from large tables that are expected to continue to grow, it is best
to plan for archiving and purging from the beginning. Many times, an
efficient archive/purge method requires modifications to your table
design. In addition, if you are
archiving data to improve performance of your OLTP applications, but the
historical data needs to be maintained for reporting purposes, this
also often requires incorporating the historical data into your database
and application design. It is much easier to build in an archive/purge
method to your database and application from the start than have to
retrofit something back into an existing system. Performance of the
archive/purge process often is better when it’s planned from the
beginning as well.
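Several of the table design guidelines above (consistent data types via a user-defined type, defaults, check constraints, and explicit primary and foreign keys) are combined in the following sketch; the dbo.Customer and dbo.CustomerOrder tables and their columns are hypothetical examples, not a prescribed design:

```sql
-- Sketch: a user-defined data type keeps the join column consistent everywhere.
CREATE TYPE dbo.CustomerID FROM int NOT NULL;

CREATE TABLE dbo.Customer
(
    CustomerID  dbo.CustomerID
                CONSTRAINT PK_Customer PRIMARY KEY,
    Name        varchar(60)   NOT NULL,
    Status      char(1)       NOT NULL
                CONSTRAINT DF_Customer_Status DEFAULT ('A')
                CONSTRAINT CK_Customer_Status CHECK (Status IN ('A', 'I')),
    CreatedAt   datetime2(0)  NOT NULL
                CONSTRAINT DF_Customer_CreatedAt DEFAULT (SYSDATETIME())
);

CREATE TABLE dbo.CustomerOrder
(
    OrderID     int IDENTITY(1,1)
                CONSTRAINT PK_CustomerOrder PRIMARY KEY,
    CustomerID  dbo.CustomerID
                CONSTRAINT FK_CustomerOrder_Customer
                REFERENCES dbo.Customer (CustomerID),  -- same type as the parent key
    OrderDate   date          NOT NULL,
    ShipAmount  money         NULL
);
```

Because both tables use the same user-defined type for CustomerID, the joined columns can never drift apart in data type, and the constraints enforce integrity regardless of which application touches the tables.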
Indexing Guidelines
In general, you need to be sure
not to overindex your tables, especially for tables that require good
performance for data modifications! Common mistakes include creating
redundant indexes on primary keys that already have primary key
constraints defined or creating multiple indexes with the same set of
leading columns. You should understand when an index is required based
on need, not just the desire to have an index. Also, you should
make sure that the indexes you define have sufficient cardinality to be
useful for your queries. In most performance and tuning engagements
that we do, we spend a good portion of our time removing indexes or
redefining them correctly to better support the queries being executed
against the tables.
Following are some indexing guidelines:
- Have an
indexing strategy that matches the database/table usages; this is
paramount. Do not index OLTP tables with a DSS indexing strategy and
vice versa.
- For composite indexes, try to keep the more selective columns leftmost in the index.
- Be sure to index columns used in joins. Joins are processed inefficiently if there is no index on the column(s) used in the join condition.
- Tailor
your indexes for your most critical queries and transactions. You
cannot index for every possible query that might be run against your
tables. However, your applications will perform better if you can
identify your critical and most frequently executed queries and design
indexes to support them.
- Avoid indexes on columns that
have poor selectivity. The Query Optimizer is not likely to use the
indexes, so they would simply take up space and add unnecessary overhead
during inserts, updates, and deletes.
- Use clustered
indexes when you need to keep your data rows physically sorted in a
specific column order. If your data is growing sequentially or is
primarily accessed in a particular order (such as range retrievals by
date), the clustered index allows you to achieve this more efficiently.
- Use
nonclustered indexes to provide quicker direct access to data rows than
a table scan when searching for data values not defined in your
clustered index. Create nonclustered indexes wisely. You can often add a
few other data columns in the nonclustered index (to the end of the
index definition) to help satisfy SQL queries completely in the index
(and not have to read the data page and incur some extra I/O). This is
termed “covering your query.” All query columns can be satisfied from
the index structure (see the example after this list).
- Consider specifying a clustered
index fill factor (free space) value to minimize page splits for
volatile tables. Keep in mind, however, that the fill factor is lost
over time as rows are added to the table and pages fill up. You might
need to implement a database maintenance job that runs periodically to
rebuild your indexes and reapply the fill factor to the data and index
pages.
- Be extremely aware of the table/index statistics
that the optimizer has available to it. When your table has changed by
more than 20% from updates, inserts, or deletes, the data distribution
can be affected quite a bit, and the optimizer decisions can change
greatly. You’ll often want to ensure that the Auto-Update Statistics
option is enabled for your databases to help ensure that index
statistics are kept up-to-date as your data changes.
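The covering-index and fill factor bullets above might look like the following in practice, reusing the hypothetical dbo.CustomerOrder table from the earlier table-design sketch; the column choices and fill factor value are illustrative only:

```sql
-- Sketch: a covering nonclustered index for queries that look up a customer's
-- orders by date and return only ShipAmount. INCLUDE lets the query be
-- satisfied entirely from the index, and FILLFACTOR leaves room for inserts
-- to reduce page splits on a volatile table.
CREATE NONCLUSTERED INDEX IX_CustomerOrder_Customer_Date
    ON dbo.CustomerOrder (CustomerID, OrderDate)   -- assumed more selective column leftmost
    INCLUDE (ShipAmount)
    WITH (FILLFACTOR = 80);
```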
View Design Guidelines
In general, you can have as many views as you want. Views are not
tables and do not take up any storage space (unless you create an index
on the view). They are merely an abstraction for convenience. Except
for indexed views, views do not store any data; the results of a view
are materialized at the time the query is run against the view and the
data is retrieved from the underlying tables. Views can be used to hide
complex queries, can be used to control data access, and can be used in
the same place as a table in the FROM clause of any SQL statement.
Following are some view design guidelines:
- Use views to
hide tables that change their structure often. By using views to provide
a stable data access view to your application, you can greatly reduce
programming changes.
- Utilize views to control security and control access to table data at the data value level.
- Be
careful of overusing views containing complex multitable queries,
especially code that joins such views together. When the query is
materialized, what may appear as a simple join between two or three
views can result in an expensive join between numerous tables, sometimes
including joins to a single table multiple times.
- Use
indexed views to dramatically improve performance for data accesses done
via views. Essentially, SQL Server creates an indexed lookup via the
view to the underlying table’s data. There is storage and overhead
associated with these views, so be careful when you utilize this
performance feature. Although indexed views can help improve the
performance of SELECT statements, they add overhead to INSERT, UPDATE, and DELETE
statements because the rows in the indexed view need to be maintained
as data rows are modified, similar to the maintenance overhead of
indexes.
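A minimal indexed view sketch, again using the hypothetical dbo.CustomerOrder table; note the SCHEMABINDING and COUNT_BIG(*) requirements and the write overhead described in the bullet above:

```sql
-- Sketch: materialize per-customer order totals as an indexed view.
CREATE VIEW dbo.vw_CustomerOrderTotals
WITH SCHEMABINDING
AS
SELECT  CustomerID,
        COUNT_BIG(*)               AS OrderCount,    -- required with GROUP BY
        SUM(ISNULL(ShipAmount, 0)) AS TotalShipped   -- ISNULL keeps the SUM non-nullable
FROM    dbo.CustomerOrder
GROUP BY CustomerID;
GO

-- The unique clustered index is what actually persists (materializes) the view.
CREATE UNIQUE CLUSTERED INDEX IX_vw_CustomerOrderTotals
    ON dbo.vw_CustomerOrderTotals (CustomerID);
```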
Transact-SQL Guidelines
Overall, how you write
your Transact-SQL (T-SQL) code can have one of the greatest impacts on
your SQL Server performance. Regardless of how well you’ve optimized
your server configuration and database design, poorly written and
inefficient SQL code still results in poor performance. The following
sections list some general guidelines to help you write efficient,
faster-performing code.
General T-SQL Coding Guidelines
- Use IF EXISTS instead of SELECT COUNT(*) when checking only for the existence of any matching data values. IF EXISTS stops the processing of the SELECT query as soon as the first matching row is found, whereas SELECT COUNT(*) continues searching until all matches are found, wasting I/O and CPU cycles.
- Using
EXISTS/NOT EXISTS with a subquery is preferable to IN/NOT IN for sets
that are queried. As the potential size of the set used in the IN
clause grows larger, the performance benefit increases.
- Avoid unnecessary ORDER BY or DISTINCT
clauses. Unless the Query Optimizer determines that the rows will be
returned in sorted order or all rows are unique, these operations
require a worktable for processing the results, which incurs extra
overhead and I/O. Avoid these operations if it is not imperative for the
rows to be returned in a specific order or if it’s not necessary to
eliminate duplicate rows.
- Use UNION ALL instead of UNION if you do not need to eliminate duplicate result rows from the result sets being combined with the UNION operator. The UNION statement has to combine the result sets into a worktable to remove any duplicate rows from the result set. UNION ALL simply concatenates the result sets together, without the overhead of putting them into a worktable to remove duplicate rows.
- Use
table variables instead of temporary tables whenever possible or
feasible. Table variables are memory resident and do not incur the I/O
overhead and system table and I/O contention that can occur in tempdb with normal temporary tables.
- If
you need to use temporary tables, keep them as small as possible so
they are created and populated more quickly and use less memory and
incur less I/O. Select only the required columns rather than using SELECT *,
and retrieve only the rows from the base table that you actually need
to reference. The smaller the temporary table, the faster it is to
create and access the table.
- If a temporary table is of
sufficient size and will be accessed multiple times, it is often cost
effective to create an index on it on the column(s) that will be
referenced in the search arguments (SARGs) of queries against the
temporary table. Do this only if the time it takes to create the index
plus the time the queries take to run using the index is less than the
sum total of the time it takes the queries against the temporary table
to run without the index.
- Avoid unnecessary function executions. If you call a SQL Server function (for example, getdate())
repeatedly within T-SQL code, consider using a local variable to hold
the value returned by the function and use the local variable repeatedly
throughout your SQL statements rather than repeatedly executing the SQL
Server function. This saves CPU cycles within your T-SQL code.
- Try
to use set-oriented operations instead of cursor operations whenever
possible and feasible. SQL Server is optimized for set-oriented
operations, so they are almost always faster than cursor operations
performing the same task. However, one potential exception to this rule
is if performing a large set-oriented operation leads to locking
concurrency issues. Even though a single update runs faster than a
cursor, while it is running, the single update might end up locking the
entire table, or large portions of the table, for an extended period of
time. This would prevent other users from accessing the table during the
update. If concurrent access to the table is more important than the
time it takes for the update itself to complete, you might want to
consider using a cursor.
- Consider using the MERGE statement introduced in SQL Server 2008 when you need to perform multiple updates against a table (UPDATE, INSERT, or DELETE)
because it enables you to perform these operations in a single pass of
the table rather than perform a separate pass for each operation.
- Consider using the OUTPUT clause to return results from INSERT, UPDATE, or DELETE statements rather than having to perform a separate lookup against the table.
- Use
search arguments that can be effectively optimized by the Query
Optimizer. Try to avoid using any negative logic in your SARGs (for
example, !=, <>, NOT IN) or performing
operations on, or applying functions to, the columns in the SARG. Avoid
using expressions in your SARGs where the search value cannot be
evaluated until runtime (such as local variables, functions, and
aggregations in subqueries); without a value to compare against the
histogram values during query optimization, the optimizer cannot
accurately estimate the number of matching rows. Consider putting such
queries into stored procedures and passing in the value of the
expression as a parameter, because the Query Optimizer can evaluate the
value of a parameter prior to optimizing the queries in the stored
procedure. (A sargability example appears after this list.)
- Avoid data type mismatches on join columns.
- Avoid
writing large complex queries whenever possible. Complex queries with a
large number of tables and join conditions can take a long time to
optimize. It may not be possible for the Query Optimizer to analyze the
entire set of plan alternatives, and it is possible that a suboptimal
query plan could be chosen. Typically, if a query involves more than 12
tables, it is likely that the Query Optimizer will have to rely on
heuristics and shortcuts to generate a query plan and may miss some
optimal strategies.
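A few of the bullets above (existence checks, reusing a function result in a local variable, and sargable predicates) in one short sketch, again against the hypothetical dbo.CustomerOrder table:

```sql
DECLARE @Today date = GETDATE();   -- call the function once and reuse the value

-- Existence check: stops at the first matching row instead of counting them all.
IF EXISTS (SELECT 1 FROM dbo.CustomerOrder WHERE CustomerID = 42)
    PRINT 'Customer 42 has at least one order';

-- Sargable range predicate: the column is left untouched, so an index on
-- OrderDate can be used; avoid forms such as CONVERT(date, OrderDate) = @Today.
SELECT OrderID, ShipAmount
FROM   dbo.CustomerOrder
WHERE  OrderDate >= @Today
  AND  OrderDate <  DATEADD(DAY, 1, @Today);
```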
Stored Procedure Guidelines
- Use stored
procedures for SQL execution from your applications. Stored procedure
execution can be more efficient than ad hoc SQL due to reduced network
traffic and query plan caching for stored procedures.
- Use
stored procedures to make your database sort of a “black box” as far as
your application code is concerned. If all database access is
managed through stored procedures, the applications are shielded from
possible changes to the underlying database structures. You can simply
modify the existing stored procedures to reflect the changes to the
database structures without requiring any changes to the front-end
application code.
- Ensure that your parameter data types
match the column data types they are being compared against to avoid
data type mismatches and poor query optimization.
- Avoid
transaction nesting issues in your stored procedures by developing a
consistent error-handling strategy for failed transactions or other
errors that occur in transactions within your stored procedures.
Implement that strategy consistently across all procedures and
applications. Within stored procedures that might be nested, you need to
check whether the procedure is already being called from within a
transaction before issuing another BEGIN TRAN statement. If a transaction is already active, you can issue a SAVE TRAN
statement so that the procedure can roll back only the work that it has
performed and allow the calling procedure that initiated the
transaction to determine whether to continue or abort the overall
transaction (see the sketch following this list).
- Break up large, complex stored procedures
into smaller, more manageable stored procedures. Try to create very
modular pieces of code that are easily reused and/or nested.
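A sketch of the nested-transaction error-handling pattern described in the bullets above; the procedure name, parameters, and dbo.CustomerOrder table are hypothetical, and THROW assumes SQL Server 2012 or later:

```sql
CREATE PROCEDURE dbo.usp_AdjustOrder
    @OrderID       int,
    @NewShipAmount money
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @StartedTran bit = 0;

    IF @@TRANCOUNT = 0
    BEGIN
        BEGIN TRAN;                 -- no outer transaction: this procedure owns one
        SET @StartedTran = 1;
    END
    ELSE
        SAVE TRAN AdjustOrder;      -- nested call: mark a savepoint instead

    BEGIN TRY
        UPDATE dbo.CustomerOrder
           SET ShipAmount = @NewShipAmount
         WHERE OrderID    = @OrderID;

        IF @StartedTran = 1
            COMMIT TRAN;
    END TRY
    BEGIN CATCH
        IF @StartedTran = 1
            ROLLBACK TRAN;                  -- undo everything this procedure started
        ELSE IF XACT_STATE() = 1
            ROLLBACK TRAN AdjustOrder;      -- undo only this procedure's work
        THROW;                              -- re-raise so the caller can decide what to do
    END CATCH
END;
```

The savepoint lets the procedure undo its own work without forcing the caller's larger transaction to roll back, which keeps the error-handling strategy consistent whether the procedure is called on its own or nested.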
Coding Efficient Transactions and Minimizing Locking Contention
Poorly written or
inefficient transactions can have a detrimental effect on concurrency of
access to data and overall application performance. To reduce locking
contention for resources, you should keep transactions as short and
efficient as possible. During development, you might not even notice
that a problem exists; the problem might become noticeable only after
the system load is increased and multiple users are executing
transactions simultaneously.
Following are some
guidelines to consider when coding transactions to minimize locking
contention and improve application performance:
- Do not return
result sets within a transaction. Doing so prolongs the transaction
unnecessarily. Perform all data retrieval and analysis outside the
transaction.
- Never
prompt for user input during a transaction. If you do, you lose all
control over the duration of the transaction. (Even the best programmers
miss this one on occasion.) On the failure of a transaction, be sure to
issue the rollback before putting up a message box telling the user
that a problem occurred.
- Use optimistic locking or
snapshot isolation. If user input is unavoidable between data retrieval
and modification and you need to handle the possibility of another user
modifying the data values read, leverage the necessary locking strategy
(or isolation) to guarantee that no other user corrupts this data.
Simple techniques like re-reading and comparing the data before the
update, as opposed to holding locks on the resource, often suffice (see
the sketch after this list).
- Keep statements that comprise a transaction in
a single batch to eliminate unnecessary delays caused by network
input/output between the initial BEGIN TRAN statement and the subsequent COMMIT TRAN commands. Additionally, keeping the BEGIN TRAN and COMMIT/ROLLBACK statements within the same batch helps avoid the possibility of leaving transactions open should the COMMIT/ROLLBACK statement not be issued in a subsequent batch.
- Consider
coding transactions entirely within stored procedures. Stored
procedures typically run faster than commands executed from a batch. In
addition, because they are server resident, stored procedures reduce the
amount of network I/O that occurs during execution of the transaction,
resulting in faster completion of the transaction.
- Keep
transactions as short and concise as possible. The shorter the period of
time locks are held, the less chance for lock contention. Keep commands
that are not essential to the unit of work being managed by the
transaction (for example, assignment selects, retrieval of updated or
inserted rows) outside the transaction.
- Use the lowest
level of locking isolation required by each process. For example, if
dirty reads are acceptable and accurate results are not imperative,
consider using the Read Uncommitted isolation level (isolation level
0). Use the Repeatable Read or Serializable isolation levels only if
absolutely necessary.
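A sketch of the re-read-and-compare (optimistic) approach mentioned above, assuming a hypothetical rowversion column named RowVer on dbo.CustomerOrder; the values are illustrative only:

```sql
DECLARE @OrderID int = 42, @NewShipAmount money = 19.99, @OriginalVer binary(8);

-- Read outside any transaction; no locks are held while the user decides.
SELECT @OriginalVer = RowVer
FROM   dbo.CustomerOrder
WHERE  OrderID = @OrderID;

/* ... user reviews the data; no transaction is open ... */

BEGIN TRAN;

UPDATE dbo.CustomerOrder
   SET ShipAmount = @NewShipAmount
 WHERE OrderID = @OrderID
   AND RowVer  = @OriginalVer;     -- succeeds only if nobody else changed the row

IF @@ROWCOUNT = 0
BEGIN
    ROLLBACK TRAN;
    RAISERROR('Row changed by another user; re-read and try again.', 16, 1);
END
ELSE
    COMMIT TRAN;
```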
Application Design Guidelines
Locking/deadlock considerations—
These considerations are often the most misunderstood part of SQL
Server implementations. Start by standardizing on update, insert, and
delete order for all applications that modify data. You do not want to
design in locking or deadlocking issues because of inconsistent resource
locking orders that result in a “deadly embrace.” (See the example at
the end of this section.)
Stateless application design—
To scale out, your application needs to take advantage of
load-balancing tiers, application server clustering, and other scaleout
options. If you don’t force the application or database to carry state,
you will have much more success in your scaleout plans.
Remote Procedure Calls/linked servers—
Often data can be accessed via linked server connections rather than by
redundantly copying or replicating data into a database. You can take
advantage of this capability with SQL Server to reduce the redundant
storage of data and eliminate synchronization issues between redundant
data stores. Because Remote Procedure Calls are being deprecated in SQL
Server, you should stay away from them.
Transactional integrity—
There is no excuse for sacrificing transactional integrity for
performance. The extra overhead (and possible performance impact) comes
with holding resources (and locks) until the transaction commit point to
ensure data integrity. However, if you keep the logical unit of work
(the business transaction) as small as possible, you can usually
minimize the impact. In other words, you should keep your transaction
sizes small and tight.
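One concrete way to reduce the reader/writer blocking behind many locking and deadlocking problems mentioned above is row versioning; a minimal sketch against the hypothetical SalesDB database (WITH ROLLBACK IMMEDIATE disconnects other sessions so the option can be applied):

```sql
-- Sketch: let readers see the last committed row version instead of blocking on writers.
ALTER DATABASE SalesDB SET ALLOW_SNAPSHOT_ISOLATION ON;
ALTER DATABASE SalesDB SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE;
```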
Distributed Data Guidelines
Distribute for disaster recovery—
Those
organizations that have a disaster recovery requirement that they would
like to fulfill with distributed data can use several options. One is
traditional bit-level stretch clustering (using third-party products
such as from Symantec) to your disaster recovery site. Another is simple
log shipping to a secondary data center at some interval. Keep in mind,
though, that log shipping will be deprecated at some point. Other
options include database mirroring (asynchronous mode), periodic full
database backups that are sent to another site and restored to a standby
server, and a few variations of data replication.
Distribute to satisfy partitioned data accesses—
If you have very discrete and separate data access by some natural key
such as geography or product types, it is often easy to have a huge
performance increase by distributing or partitioning your tables to
serve these accesses. Data replication options such as peer-to-peer and
multiple publishers fit this well when you also need to isolate the data
to separate servers and even on separate continents.
Distribute for performance—
Taking the isolation approach a bit further, you can devise a variety
of SQL Server configurations that greatly isolate entire classes of data
access, such as reporting access isolated away from online
transactional processing, and so on. Classic SQL Server–based methods
for this now include the use of database mirroring and snapshots on the
mirror, a few of the data replication options, and others.
High-Availability Guidelines
Understand your high-availability (HA) needs first—
More important than applying a single technical solution to achieve
high availability is to actually decide what you really need. You should
evaluate exactly what your HA requirements might be with a formal
assessment that includes the cost to the company if you do not have HA
for your application.
Know your options for different levels of HA achievement—
With SQL Server, there are several ways to achieve nearly the same
level of high availability, including SQL clustering, data replication,
database mirroring, log shipping, and so on. But deciding on the right
one often depends on many other variables.
Be aware of sacrifices for HA at the expense of performance—
High availability often comes at the expense of performance. As an
example, if you use database mirroring in
its high availability/automatic failover configuration, you actually
end up with slower transaction processing. This can hurt if your SLAs
are for subsecond transactions. Be extremely careful here. Apply the HA
solution that matches your entire application’s service-level
agreements.
We
have listed many guidelines for you to consider. Our hope is that you
run through them for every SQL Server–based system you build. Use them
as a checklist so that you catch the big design issues early and
design in performance from the start.