 |
|
 |
Donald K. Burleson
Oracle Tips |
Enabling Your Database to
perform hash joins
In cases where a very small table is being joined to a large table, the
Oracle hash join will often dramatically speed-up the query. Hash joins
are far faster than nested loop joins in some cases, but it is not easy to
get your database to use hash joins.
Unfortunately, the hash join is more memory intensive than a nested loop
join. To be faster than a nested loop join, we must set the
hash_area_size large enough to hold the entire hash table in memory
(about 1.6 times the sum of the rows in the table). If the hash join
overflows the hash_area_size memory, the hash join will page into
the TEMP tablespace, severely degrading the performance of the hash join.
You can use the following script, hash_area.sql, to dynamically
allocate the proper hash_area_size for your SQL query in terms of
the size of your target table.
hash_area.sql
set
heading off;
set feedback off;
set verify off;
set pages 999;
spool run_hash.sql
select
'alter session set hash_area_size='||trunc(sum(bytes)*1.6)||';'
from
dba_segments
where
segment_name = upper('&1');
spool off;
@run_hash
Here is the output from this script. As you see, we pass the driving table
name, and the script generates the appropriate alter session
command to ensure that we have enough space in hash_area_size RAM
to hold the driving table.
SQL> @hash_area customer
alter session set hash_area_size=3774873;
In addition to seeing the hash_area_size, we must also be able to
adjust the degree of parallelism in cases where we use a full-table scan
to access the tables in a hash join.
This short tip does not have complete information about the hash join, but
you can get complete information from Oracle High Performance SQL Tuning
by Oracle Press. The hash join is very finicky, and there are many
conditions that must be satisfied. It is not uncommon to find that a
use_hash hint is ignored, and here are some common causes of this problem.
-
Check initialization
parameters Make
sure that you have the proper settings for optimizer_index_cost_adj and
optimizer_max_permutations to limit the number of table join
evaluations. Also check your values for hash_area_size and
hash_multiblock_io_count.
-
Verify the driving
Table Make
sure that the smaller table is the driving table (the first table in the
from clause when using the ordered hint). This is because a hash join
builds the memory array using the driving table.
-
Analyze CBO
statistics Check
that tables and/or columns of the join tables are appropriately
analyzed.
-
Check for skewed
columns Histograms
are recommended only for nonuniform column distributions. If necessary,
you can override the join order chosen by the cost-based optimizer using
the ordered hint.
-
Check RAM region Ensure
that hash_area_size is large enough to hold the smaller table in memory.
Otherwise, Oracle must write to the TEMP tablespace, slowing down the
hash join. Oracle recommends that the hash_area_size for the driving
table be set at 1.6 times the sum of bytes for the driving table, and
you can use the hash_area.sql script to set the hash_area_size for your
query session.
If you have followed these steps, you should be able to add a use_hash
hint ton your SQL to invoke a hash join. Remember, it is always a good
idea to run a explain plan on your SQL to ensure that the hash join is
being invoked.

|
|