What’s New in Query Optimization
SQL Server 2008 introduces several new features and capabilities related to query optimization and
query performance, all intended to deliver on the theme of “predictable
performance.” The primary new features and enhancements are as follows:
- The OPTIMIZE FOR query hint now includes a new UNKNOWN option, which specifies that the Database Engine use statistical data, instead of the initial values, to determine the values for one or more local variables during query optimization.
- Table hints can now be specified as query hints in the context of plan guides to provide advanced query performance tuning options.
- A new FORCESEEK table hint has been added. This hint specifies that the Query Optimizer should use only an index seek operation as the access path to the data in the table or view to which the hint is applied.
- Hash values are available for finding and tuning similar queries. The sys.dm_exec_query_stats and sys.dm_exec_requests dynamic management views provide query hash and query plan hash values that you can use to help determine the aggregate resource usage for similar queries and similar query execution plans. This can help you find and tune similar queries that individually consume minimal system resources but collectively consume significant system resources.
- Filtered indexes, a new feature in SQL Server 2008, are considered when the Query Optimizer estimates index usefulness.
- Parallel query processing on partitioned objects has been improved.
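The following is a minimal sketch of three of the features listed above. It assumes the AdventureWorks sample database (the Sales.SalesOrderHeader table and its index on CustomerID); the variable name and literal values are illustrative only.

```sql
-- Assumption: AdventureWorks sample database; @CustomerID and the value 11000 are illustrative.

-- OPTIMIZE FOR ... UNKNOWN: optimize using statistical data
-- rather than the value the variable holds at compile time.
DECLARE @CustomerID int;
SET @CustomerID = 11000;

SELECT SalesOrderID, OrderDate, TotalDue
FROM Sales.SalesOrderHeader
WHERE CustomerID = @CustomerID
OPTION (OPTIMIZE FOR (@CustomerID UNKNOWN));

-- FORCESEEK: restrict the access path for this table to an index seek.
SELECT h.SalesOrderID, h.OrderDate
FROM Sales.SalesOrderHeader AS h WITH (FORCESEEK)
WHERE h.CustomerID = 11000;

-- Query hash: aggregate resource usage across cached statements
-- that share the same query_hash value.
SELECT TOP (10)
    qs.query_hash,
    COUNT(*)                    AS cached_statements,
    SUM(qs.execution_count)     AS total_executions,
    SUM(qs.total_worker_time)   AS total_cpu_time,      -- reported in microseconds
    SUM(qs.total_logical_reads) AS total_logical_reads
FROM sys.dm_exec_query_stats AS qs
GROUP BY qs.query_hash
ORDER BY SUM(qs.total_worker_time) DESC;
```

The last query is one way to surface groups of similar queries whose combined cost is high even though each individual execution is cheap.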
One of the key improvements in SQL Server 2008 is the simplification of the creation and use of plan guides:
- The sp_create_plan_guide stored procedure now accepts XML execution plan output directly via the @hints parameter instead of having to embed the output in the USE PLAN hint.
- A new stored procedure, sp_create_plan_guide_from_handle, allows you to create one or more plan guides from an existing query plan in the plan cache.
- You can create multiple OBJECT or SQL plan guides for the same query and batch or module, although only one of these plan guides can be enabled at any given time.
- A new system function, sys.fn_validate_plan_guide, enables you to validate a plan guide.
- New SQL Profiler event classes, Plan Guide Successful and Plan Guide Unsuccessful, enable you to verify whether plan guides are being used by the Query Optimizer.
- New Performance Monitor counters in the SQLServer:SQL Statistics object, Guided Plan Executions/sec and Misguided Plan Executions/sec, can be used to monitor the number of plan executions in which the query plan has been successfully or unsuccessfully generated by using a plan guide.
- Built-in support is now available for creating, deleting, enabling, disabling, or scripting plan guides in SQL Server Management Studio (SSMS). Plan guides are now located in the Programmability folder in Object Explorer.
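As a rough illustration of the two new plan guide routines mentioned above, the following sketch creates a plan guide from a cached plan and then validates the plan guides in the current database. The LIKE pattern and the guide name Guide_FromCache are hypothetical; substitute text that matches a statement actually present in your plan cache.

```sql
-- Assumption: the search pattern and the guide name are hypothetical placeholders.
DECLARE @plan_handle varbinary(64);
DECLARE @offset int;

-- Locate a cached statement and capture its plan handle and statement offset.
SELECT TOP (1)
    @plan_handle = qs.plan_handle,
    @offset      = qs.statement_start_offset
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
WHERE st.text LIKE N'SELECT SalesOrderID, OrderDate%'
ORDER BY qs.last_execution_time DESC;

-- Create the plan guide only if a matching cached plan was found.
IF @plan_handle IS NOT NULL
    EXECUTE sp_create_plan_guide_from_handle
        @name = N'Guide_FromCache',
        @plan_handle = @plan_handle,
        @statement_start_offset = @offset;

-- Validate every plan guide in the current database.
-- A row is returned only for a plan guide that fails validation.
SELECT pg.plan_guide_id, pg.name, v.msgnum, v.severity, v.state, v.message
FROM sys.plan_guides AS pg
CROSS APPLY sys.fn_validate_plan_guide(pg.plan_guide_id) AS v;
```

An empty result set from the final query indicates that all plan guides in the database are valid.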
Note
Many of the internals of the
Query Optimizer and its costing algorithms are considered proprietary
and have not been made public. Much of the information provided here is
based on analysis and observation of query plans generated for various
queries and search values.
What Is the Query Optimizer?
For any given SQL statement,
the source tables can be accessed in many ways to return the desired
result set. The Query Optimizer evaluates the candidate ways the
result set can be generated and chooses the most appropriate method,
called the query plan or execution plan.
SQL Server uses a cost-based Query Optimizer. The Query Optimizer
assigns a cost to each execution plan it considers in terms of CPU resource
usage and page I/O. The Query Optimizer then chooses the execution plan
with the lowest estimated cost.
Thus, the primary goal of the
Query Optimizer is to find the least expensive execution plan that
minimizes the total time required to process a query. Because I/O is the
most significant factor in query processing time, the Query Optimizer
analyzes the query and primarily searches for access paths and
techniques to minimize the number of logical and physical page accesses
as much as possible. The lower the number of logical and physical I/Os
performed, the faster the query should run.
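One way to observe these cost and I/O figures for yourself is sketched below, again assuming the AdventureWorks sample database. SET SHOWPLAN_ALL returns the estimated plan without running the query, including the EstimateIO, EstimateCPU, and TotalSubtreeCost columns; SET STATISTICS IO reports the logical and physical reads actually performed. GO is the batch separator used by tools such as SSMS and sqlcmd, not a T-SQL statement.

```sql
-- Assumption: AdventureWorks sample database; the CustomerID value is illustrative.

-- Estimated costs: the query is compiled but not executed.
SET SHOWPLAN_ALL ON;
GO
SELECT SalesOrderID, OrderDate, TotalDue
FROM Sales.SalesOrderHeader
WHERE CustomerID = 11000;
GO
SET SHOWPLAN_ALL OFF;
GO

-- Actual page accesses: the query is executed and the read counts
-- are returned as messages alongside the result set.
SET STATISTICS IO ON;

SELECT SalesOrderID, OrderDate, TotalDue
FROM Sales.SalesOrderHeader
WHERE CustomerID = 11000;

SET STATISTICS IO OFF;
```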
Query Compilation and Optimization
Query compilation
is the complete process from the submission of a query to its actual
execution. There are many steps involved in query compilation—one of
which is optimization. All T-SQL statements are compiled, but not all
are optimized. Primarily, only the standard SQL Data Manipulation
Language (DML) statements—SELECT, INSERT, UPDATE, and DELETE—require optimization. The other procedural constructs in T-SQL (IF, WHILE,
local variables, and so on) are compiled as procedural logic but do not
require optimization. DML statements are set-oriented requests that the
Query Optimizer must translate into procedural code that can be
executed efficiently to return the desired results.
Note
SQL Server also optimizes some Data Definition Language (DDL) statements, such as CREATE INDEX or ALTER TABLE,
that operate on the data in a table. For example, a displayed query plan for the
creation of an index shows optimization steps for accessing the table,
sorting data, and inserting into the index tree.
Compiling DML Statements
When SQL Server compiles an execution plan for a DML statement, it performs the following basic steps:
1. The query is parsed and checked for proper syntax, and the T-SQL statements are parsed into keywords, expressions, operators, and identifiers to generate a query tree. The query tree (sometimes referred to as the sequence tree) is an internal format of the query that SQL Server can operate on. It is essentially the logical steps needed to transform the query into the desired result.
2. The query tree is then normalized and simplified. During normalization, the tables and columns are verified, and the metadata (data types, null properties, index statistics, and so on) about them is retrieved. In addition, any views are resolved to their underlying tables, and implicit conversions are performed (for example, an integer compared with a float value). Also during this phase, any redundant operations (for example, unnecessary or redundant joins) are removed, and the query tree is simplified.
3. The Query Optimizer analyzes the different ways the source tables can be accessed and selects the series of steps that return the results fastest while typically using the fewest resources. The query tree is updated with the optimized series of steps, and an execution plan (also referred to as a query plan) is generated from the final, optimized version of the sequence tree.
4. After the optimized execution plan is generated, SQL Server stores the optimized plan in the procedure cache.
5. SQL Server reads the execution plan from the procedure cache and executes the query plan, returning the result set (if any) to the client.
The optimized execution plan is
then left in the procedure cache. If the same query or stored procedure
is executed again and the plan is still available in the procedure
cache, the steps to optimize and generate the execution plan are
skipped, and the stored query execution plan is reused to execute the
query or stored procedure.
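Plan reuse can be observed through the plan cache dynamic management views. The following sketch lists the most frequently reused cached plans; the TOP (20) limit is an arbitrary choice for illustration.

```sql
-- usecounts shows how many times each cached plan has been looked up and reused.
SELECT TOP (20)
    cp.usecounts,
    cp.cacheobjtype,
    cp.objtype,
    st.text AS statement_text
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
ORDER BY cp.usecounts DESC;
```

Running the same query or stored procedure several times and then rerunning this query should show the usecounts value for its plan increasing, as long as the plan has not been removed from the cache in the meantime.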
Optimization Steps
When the query tree is passed
to the Query Optimizer, the Query Optimizer performs a series of steps
to break down the query into its component pieces for analysis to
generate an optimal execution plan:
1. Query analysis: The query is analyzed to determine search arguments and join clauses. A search argument is defined as a WHERE clause that compares a column to a constant. A join clause is a WHERE clause that compares a column from one table to a column from another table.
2. Row estimation and index selection: Indexes are selected based on search arguments and join clauses (if any exist). Indexes are evaluated based on their distribution statistics and are assigned a cost.
3. Join selection: The join order is evaluated to determine the most appropriate order in which to access tables. In addition, the Query Optimizer evaluates the most appropriate join algorithm to match the data.
4. Execution plan selection: Execution costs are evaluated, and a query execution plan is created that represents the most efficient solution found by the optimizer.
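To make the terms from the query analysis step concrete, the following sketch shows a search argument and a join clause as defined above. It assumes the AdventureWorks sample database; the literal values are illustrative.

```sql
-- Assumption: AdventureWorks sample database; literal values are illustrative.

-- Search argument: a column compared to a constant.
SELECT SalesOrderID, OrderDate
FROM Sales.SalesOrderHeader
WHERE CustomerID = 11000;

-- Join clause plus search argument: the first condition compares a column
-- from one table to a column from another table; the second is a search
-- argument on Sales.SalesOrderHeader.
SELECT h.SalesOrderID, d.ProductID, d.OrderQty
FROM Sales.SalesOrderHeader AS h, Sales.SalesOrderDetail AS d
WHERE h.SalesOrderID = d.SalesOrderID
  AND h.CustomerID = 11000;

-- For contrast: this condition compares an expression on the column,
-- not the column itself, to a constant, so it does not fit the
-- definition of a search argument given above.
SELECT SalesOrderID, OrderDate
FROM Sales.SalesOrderHeader
WHERE YEAR(OrderDate) = 2004;
```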