 |
|
Oracle Concepts by Burleson
Consulting |
Understanding the SQL
Optimizers
In order to
understand the evolution of SQL optimization we have to take a
historical perspective. In the early 1980s when commercial
relational databases were first introduced, the SQL optimizers were
very primitive and relied on simple heuristic in order to determine
the optimal execution plan for any given query. This was called
rule-based optimization. The rule-based optimizer uses simple
data dictionary statistics in order to derive the SQL execution plan.
However,
rule-based optimization is not aware of detailed statistics about the
nature of the tables and indexes. For example, the rule-based
optimizer does not know the number of distinct values within an index,
or the distribution of values within index. Hence, rule-based
optimizer is far more likely than cost-based optimization to choose an
inappropriate index to service take query. An inappropriate
index is an index that is not selective. For example, consider
this query:
select
book_title
from
book
where
book_type = ‘computer’
and
book_title = ‘DOS for Dummies’;
In this case, a
book_type index or the book_title index could be used, but the
book_title index is likely to be far more selective than the book_type
index.
Recognizing the
shortcomings of rule-based SQL optimization, the relational database
vendors began to add additional intelligence into their SQL optimizing
techniques. In order to choose the best execution plan for a
given query, the vendors recognized that it was necessary to gather
detailed statistics about the tables and indexes. Some of these
statistics include:
* The number of
rows in a table
* The average
row length of rows in the table
* The carnality
of the indexes on the table
* The
distribution of data column values within the table
The cost-based
optimizer was created to use these statistics in order to make a more
intelligent decision about the best way to service to query. However
here we must refine our definition of what we mean by “best” execution
plan.
Is the best
execution plan the one that begins to return rows to the query the
fastest, or is the best execution plan the one that services the query
with the minimal amount of machine resources? Let's illustrate
this choice with a simple example.
Assume that we
have a simple query that selects 1,000,000 rows from the customer
table, and orders the result by customer name:
select
cust_name from customer order by cust_name;
Let's also
assume that we have an index on the cust_name column. The SQL
optimizer has a choice of methods to produce the result set:
Choice 1 - The
Oracle database can use the cust_name index to retrieve the customer
table rows. This will alleviate the need for sorting the result
set at the end of the query, but using the index has the downside of
causing additional I/O within the Oracle database as the index nodes
are accessed.
Choice 2 - The
Oracle database can perform a parallel full table scan against the
table and then sort the result set on desk. This execution plan
will generally result in less overall disk I/O resources than using
the index, but the downside to this optimization technique that no
rows from the query will be available until the entire query has been
completed. For a giant query, this could take several minutes.
Hence, we see
two general approaches to SQL query optimization. The use of the
indexes to avoiding sorting been codified within Oracle as the
first_rows optimization technique. Under first_rows
optimization, the optimizer goal is to begin to return rows to the
query as quickly as possible, even if it means extra disk I/O.
Conversely, the all_rows optimizer goal is designed to minimize
overall machine resources. Under all_rows optimization the goal
is to minimize the amount of machine resources and disk I/O for the
query. Hence, the all_rows optimizer mode tends to favor full
table scans, and is generally used in large data warehouses where
immediate response time is not required.
Now that we are
familiar with the general concepts, please read chapter 5 from the
Burleson book.
For more details, see the "Easy
Oracle Series" a set of books especially designed by Oracle
experts to get you started fast with Oracle database technology.
|
Download your Oracle scripts now:
www.oracle-script.com
The
definitive Oracle Script collection for every Oracle professional DBA
|
|