Saturday, July 28, 2007

Interpreting the Query Execution Plan

A common misunderstanding is that seeks are always good and scans are always bad. This is not completely true. Because SQL Server is optimized to retrieve data sequentially, scans are excellent for gathering large blocks of data, but poor at retrieving a few isolated rows. Index seeks are great at retrieving a few rows here and a few rows there, but retrieving a large percentageof a table using index seeks performs extremely poorly compared to a scan.

Query Execution Plan Operators

Clustered index scan In a clustered index scan, SQL Server sequentially
reads the entire clustered index or a range within
the clustered index. SQL Server chooses this operation when
the set of rows requested by the where clause can be
determined as a range within the order of the clustered
index, or a large percentage

Clustered index seek In a clustered index seek, SQL Server rapidly
navigates the clustered index b-tree to retrieve
specific rows. The benefit of the clustered index seek is
that when the row(s) are determined all the columns are
immediately available.

Filter In some situations, SQL Server retrieves all the data
from a table and then uses filter operations to select
the correct rows. Sometimes the Query Optimizer will use a
Filter for performance reasons, but it’s more often due to
the lack of a useful index.

Hash match A hash match is an unordered join method that
builds a temp table and iteratively matches with
data from another table. A hash match is more
efficient if one table is significantly larger than the
other table.
This is the worst-case join method and is used when
no suitable index is available.

Merge join The merge join is the fastest method of comparing
two tables if both tables are pre-sorted.

Nested loop A nested loop iterates through two tables and
identifies matches.
Typically, nested-loop joins are best when a large
index is joined with a small table.

Nonclustered index scan In a nonclustered index scan, SQL Server reads
through all or part of the index sequentially looking
for the data.
A nonclustered index seek navigates the b-tree index
from the root node, through the intermediate node,
to the leaf node, and finally to the row. For selecting
a few individual rows this can be a very fast
operation. The benefit of a nonclustered index seek
is that it tends to be narrow (have few columns), so
more rows can fit on a page.
Once the correct row is identified, if all the required
columns are found in the index, then the seek is
complete because the index covers the needs of
the query. Conversely, if additional columns are
needed, SQL Server will need to fetch them from
the data pages.

Sort In some situations SQL Server retrieves all the data
from a table and then sorts the data to prepare it for
operations to perform the order by. Filters and
sorts are slow operations and indicate a lack of
useful indexes.

RID lookup The RID lookup locates rows in the data pages.
Typically, a RID lookup works with a nested loop to
locate the data pages following a clustered index
seek or scan.

Spool In a spool operation, SQL Server has to save off a
temporary set of data.

Table scan In a table scan, SQL Server sequentially
reads the entire table to select the rows that meet the criteria.
Depending on the number of rows needed, this
might be the fastest method of retrieving the data.
For instance, if SQL Server needs to select 80% of
the table, then it’s far faster to read the entire table
sequentially than to seek each row individually.
However, if fewer rows are needed and the Query
Optimizer still chooses a table scan, it may be
because there’s no suitable index to use instead of
the table scan, the index statistics are wrong, or the
table is so small that it doesn’t matter.


Post a Comment

Subscribe to Post Comments [Atom]

<< Home