Oracle7 Server Tuning | ![]() Library |
![]() Product |
![]() Contents |
![]() Index |
This chapter explains when to use the available optimization modes and how to use hints to enhance Oracle performance.
The cost-based approach generally chooses an execution plan that is as good as or better than the plan chosen by the rule-based approach, especially for large queries with multiple joins or multiple indexes. The cost-based approach also improves productivity by eliminating the need for you to tune your SQL statements yourself. Finally, many Oracle performance features are available only through the cost-based approach.
Cost based optimization must be used to achieve efficient star query performance. Similarly, it must be used with hash joins and histograms. Cost-based optimization is always used with parallel query and with partition views. You must therefore perform ANALYZE at the partition level with partition views.
You can also enable cost-based optimization in these ways:
ANALYZE TABLE emp COMPUTE STATISTICS FOR COLUMNS sal SIZE 10;
The SIZE keyword states the maximum number of buckets for the histogram. You would create a histogram on the SAL column if there were an unusual number of employees with the same salary and few employees with other salaries.
See Also: Oracle7 Server SQL Reference for more information about the ANALYZE command and its options.
If the number of frequently occurring distinct values in a column is relatively small, then it is useful to set the number of buckets to be greater than the number of frequently occurring distinct values.
USER_HISTOGRAMS
Find the number of buckets in each column's histogram in:
USER_TAB_COLUMNS
See Also: "Using Histograms" in Appendix A, "Optimizer Concepts"
Oracle7 Server Reference for more information and column descriptions of data dictionary views.
Oracle can generate statistics using these techniques:
Because of the time and space required for the computation of table statistics, it is usually best to perform an estimation with a 20% sample size for tables and clusters. For indexes, computation does not take up as much time or space, so it is best to perform a computation.
When you generate statistics for a table, column, or index, if the data dictionary already contains statistics for the analyzed object, Oracle updates the existing statistics with the new ones. Oracle invalidates any currently parsed SQL statements that access any of the analyzed objects. When such a statement is next executed, the optimizer automatically chooses a new execution plan based on the new statistics. Distributed statements issued on remote databases that access the analyzed objects use the new statistics when they are next parsed.
Some statistics are always computed, regardless of whether you specify computation or estimation. If you choose estimation and the time saved by estimating a statistic is negligible, Oracle computes the statistic.
You can generate statistics with the ANALYZE command.
Example: This example generates statistics for the EMP table and its indexes:
ANALYZE TABLE emp
ESTIMATE STATISTICS;
For example, consider a join statement that can be executed with either a nested loops operation or a sort-merge operation. The sort-merge operation may return the entire query result faster, while the nested loops operation may return the first row faster. If the goal is best throughput, the optimizer is more likely to choose a sort-merge join. If the goal is best response time, the optimizer is more likely to choose a nested loops join.
Choose a goal for the optimizer based on the needs of your application:
ALTER SESSION SET OPTIMIZER_GOAL = FIRST_ROWS;
Note that the following sort parameters can now be modified using ALTER SESSION SET ... or ALTER SYSTEM SET ... DEFERRED:
SORT_AREA_RETAINED_SIZE
SORT_DIRECT_WRITES
SORT_WRITE_BUFFERS
SORT_WRITE_BUFFER_SIZE
SORT_READ_FAC
Analyzing a table uses more system resources than analyzing an index. It may be helpful to analyze the indexes for a table separately, with a higher sampling rate.
Use of access path and join method hints causes the cost-based optimization to be invoked. Since cost-based optimization is dependent on statistics, it is important to analyze all tables referenced in a query which has hints, even though rule-based optimization may have been selected as the system default.
If you neither collect statistics nor add hints to your SQL statements, your statements will use rule-based optimization. However, you should eventually migrate your existing applications to use the cost-based approach, because the rule-based approach will not be available in future versions of Oracle.
You can enable cost-based optimization on a trial basis simply by collecting statistics. You can then return to rule-based optimization by deleting them or by setting either the value of the OPTIMIZER_MODE initialization parameter or the OPTIMIZER_GOAL parameter of the ALTER SESSION command to RULE. You can also use this value if you want to collect and examine statistics for your data without using the cost-based approach.
Hints are suggestions that you give the optimizer for optimizing a SQL statement. Hints allow you to make decisions usually made by the optimizer. You can use hints to specify
You can send hints for a SQL statement to the optimizer by enclosing them in a Comment within the statement.
See Also: For more information on Comments, see Chapter 2, "Elements of SQL", of the Oracle7 Server SQL Reference.
A statement block can have only one Comment containing hints. This Comment can only follow the SELECT, UPDATE, or DELETE keyword. The syntax diagrams show the syntax for hints contained in both styles of Comments that Oracle supports within a statement block.
where:
If you specify hints incorrectly, Oracle ignores them but does not return an error:
The optimizer only recognizes hints when using the cost-based approach. If you include any hint (except the RULE hint) in a statement block, the optimizer automatically uses the cost-based approach.
The following sections show the syntax of each hint.
If a SQL statement contains a hint that specifies an optimization approach and goal, the optimizer uses the specified approach regardless of the presence or absence of statistics, the value of the OPTIMIZER_MODE initialization parameter, and the OPTIMIZER_GOAL parameter of the ALTER SESSION command.
SELECT /*+ ALL_ROWS */ empno, ename, sal, job
FROM emp
WHERE empno = 7566;
This hint causes the optimizer to make these choices:
SELECT /*+ FIRST_ROWS */ empno, ename, sal, job
FROM emp
WHERE empno = 7566;
The optimizer ignores this hint in DELETE and UPDATE statement blocks and in SELECT statement blocks that contain any of the following syntax:
If you specify either the ALL_ROWS or FIRST_ROWS hint in a SQL statement and the data dictionary contains no statistics about any of the tables accessed by the statement, the optimizer uses default statistical values (such as allocated storage for such tables) to estimate the missing statistics and subsequently to choose an execution plan. Since these estimates may not be as accurate as those generated by the ANALYZE command, you should use the ANALYZE command to generate statistics for all tables accessed by statements that use cost-based optimization. If you specify hints for access paths or join operations along with either the ALL_ROWS or FIRST_ROWS hint, the optimizer gives precedence to the access paths and join operations specified by the hints.
SELECT /*+ CHOOSE */
empno, ename, sal, job
FROM emp
WHERE empno = 7566;
SELECT --+ RULE
empno, ename, sal, job
FROM emp
WHERE empno = 7566;
The RULE hint, along with the rule-based approach, will not be available in future versions of Oracle.
You must specify the table to be accessed exactly as it appears in the statement. If the statement uses an alias for the table, you must use the alias, rather than the table name, in the hint. The table name within the hint should not include the schema name, if the schema name is present in the statement.
FULL(table)
where table specifies the name or alias of the table on which the full table scan is to be performed.
For example, Oracle performs a full table scan on the ACCOUNTS table to execute this statement, even if there is an index on the ACCNO column that is made available by the condition in the WHERE clause:
SELECT /*+ FULL(a) Don't use the index on ACCNO */ accno, bal
FROM accounts a
WHERE accno = 7086854;
ROWID(table)
where table specifies the name or alias of the table on which the table access by ROWID is to be performed.
CLUSTER(table)
where table specifies the name or alias of the table to be accessed by a cluster scan.
The following example illustrates the use of the CLUSTER hint.
SELECT --+ CLUSTER emp
ename, deptno
FROM emp, dept
WHERE deptno = 10 AND
emp.deptno = dept.deptno;
HASH(table)
where table specifies the name or alias of the table to be accessed by a hash scan.
HASH_AJ(table)
where table specifies the name or alias of the table to be accessed.
See Also: "How to Use an Anti-Join" on page 8-5
where:
table
|
Specifies the name or alias of the table associated with the index to be scanned.
|
index
|
Specifies an index on which an index scan is to be performed.
|
This hint may optionally specify one or more indexes:
SELECT name, height, weight
FROM patients
WHERE sex = 'M';
Assume that there is an index on the SEX column and that this column contains the values M and F. If there are equal numbers of male and female patients in the hospital, the query returns a relatively large percentage of the table's rows and a full table scan is likely to be faster than an index scan. However, if a very small percentage of the hospital's patients are male, the query returns a relatively small percentage of the table's rows and an index scan is likely to be faster than a full table scan.
The number of occurrences of each distinct column value is not available to the optimizer. The cost-based approach assumes that each value has an equal probability of appearing in each row. For a column having only two distinct values, the optimizer assumes each value appears in 50% of the rows, so the cost-based approach is likely to choose a full table scan rather than an index scan.
If you know that the value in the WHERE clause of your query appears in a very small percentage of the rows, you can use the INDEX hint to force the optimizer to choose an index scan. In this statement, the INDEX hint explicitly chooses an index scan on the SEX_INDEX, the index on the SEX column:
SELECT /*+ INDEX(patients sex_index) Use SEX_INDEX, since there
are few male patients */
name, height, weight
FROM patients
WHERE sex = 'M';
Each parameter serves the same purpose as in the INDEX hint.
Because Oracle's default behavior for a range scan is to scan index entries in ascending order of their indexed values, this hint does not currently specify anything more than the INDEX hint. However, since Oracle Corporation does not guarantee that the default behavior for an index range scan will remain the same in future versions of Oracle, you may want to use the INDEX_ASC hint to specify ascending range scans explicitly, should the default behavior change.
Each parameter serves the same purpose as in the INDEX hint. This hint has no effect on SQL statements that access more than one table. Such statements always perform range scans in ascending order of the indexed values. For example, consider this table, which contains the temperature readings of a tank of water holding marine life:
CREATE TABLE tank_readings
(time DATE CONSTRAINT un_time UNIQUE,
temperature NUMBER );
Each of the table's rows stores a time and the temperature measured at that time. A UNIQUE constraint on the TIME column ensures that the table does not contain more than one reading for the same time.
Oracle enforces this constraint with an index on the TIME column. Consider this complex query, which selects the most recent temperature reading taken as of a particular time T. The subquery returns either T or the latest time before T at which a temperature reading was taken. The parent query then finds the temperature taken at that time:
SELECT temperature
FROM tank_readings
WHERE time = (SELECT MAX(time)
FROM tank_readings
WHERE time <= TO_DATE(:t) );
The execution plan for this statement looks like the following figure:
SELECT /*+ INDEX_DESC(tank_readings un_time) */ temperature
FROM tank_readings
WHERE time <= TO_DATE(:t)
AND ROWNUM = 1
ORDER BY time DESC;
The execution plan for this query looks like the following figure:
See Also: "FAST FULL SCAN" on page 8-12
MERGE_AJ(table)
where table specifies the name or alias of the table to be accessed.
See Also: "How to Use an Anti-Join" on page 8-5
where:
SELECT /*+ ORDERED */ tab1.col1, tab2.col2, tab3.col3
FROM tab1, tab2, tab3
WHERE tab1.col1 = tab2.col1
AND tab2.col1 = tab3.col1;
If you omit the ORDERED hint from a SQL statement performing a join, the optimizer chooses the order in which to join the tables.
You may want to use the ORDERED hint to specify a join order if you know something about the number of rows selected from each table that the optimizer does not. Such information would allow you to choose an inner and outer table better than the optimizer could.
Usually, if you analyze the tables the optimizer will choose an efficient star plan. You can also use hints to improve the plan. The most precise method is to order the tables in the FROM clause in the order of the keys in the index, with the large table last. Then use the following hints:
/*+ ORDERED USE_NL(facts) INDEX(facts fact_concat) */
A more general method is to use the STAR hint /*+ STAR */.
See Also: Appendix A, "Optimizer Concepts"
You must specify a table to be joined exactly as it appears in the statement. If the statement uses an alias for the table, you must use the alias rather than the table name in the hint. The table name within the hint should not include the schema name, if the schema name is present in the statement.
The USE_NL and USE_MERGE hints must be used with the ORDERED hint. Oracle uses these hints when the referenced table is forced to be the inner table of a join, and they are ignored if the referenced table is the outer table.
where table is the name or alias of a table to be used as the inner table of a nested loops join.
For example, consider this statement, which joins the ACCOUNTS and CUSTOMERS tables. Assume that these tables are not stored together in a cluster:
SELECT accounts.balance, customers.last_name, customers.first_name
FROM accounts, customers
WHERE accounts.custno = customers.custno;
Since the default goal of the cost-based approach is best throughput, the optimizer will choose either a nested loops operation or a sort-merge operation to join these tables, depending on which is likely to return all the rows selected by the query more quickly.
However, you may want to optimize the statement for best response time, or the minimal elapsed time necessary to return the first row selected by the query, rather than best throughput. If so, you can force the optimizer to choose a nested loops join by using the USE_NL hint. In this statement, the USE_NL hint explicitly chooses a nested loops join with the CUSTOMERS table as the inner table:
SELECT /*+ ORDERED USE_NL(customers) Use N-L to get first row
faster */
accounts.balance, customers.last_name, customers.first_name
FROM accounts, customers
WHERE accounts.custno = customers.custno;
In many cases, a nested loops join returns the first row faster than a sort-merge join. A nested loops join can return the first row after reading the first selected row from one table and the first matching row from the other and combining them, while a sort-merge join cannot return the first row until after reading and sorting all selected rows of both tables and then combining the first rows of each sorted row source.
where table is a table to be joined to the row source resulting from joining the previous tables in the join order using a sort-merge join.
This hint is most often used to reduce the number of possible permutations for a query and make optimization faster. This hint has no arguments. For example,
SELECT * FROM t1, (SELECT /*+ NO_MERGE */ * from t2) v ...
causes view v not to be merged.
where table is a table to be joined to the row source resulting from joining the previous tables in the join order using a hash join.
See Also: Chapter 18, "Parallel Query Tuning"
The PARALLEL hint must use the table alias if an alias is specified in the query. The PARALLEL hint can then take two values separated by commas after the table name. The first value specifies the degree of parallelism for the given table, the second value specifies how the table is to be split among the instances of a parallel server. Specifying DEFAULT or no value signifies the query coordinator should examine the settings of the initialization parameters (described in a later section) to determine the default degree of parallelism.
In the following example, the PARALLEL hint overrides the degree of parallelism specified in the EMP table definition:
SELECT /*+ FULL(scott_emp) PARALLEL(scott_emp, 5) */
ename
FROM scott.emp scott_emp;
In the next example, the PARALLEL hint overrides the degree of parallelism specified in the EMP table definition and tells the optimizer to use the default degree of parallelism determined by the initialization parameters. This hint also specifies that the table should be split among all of the available instances, with the default degree of parallelism on each instance.
SELECT /*+ FULL(scott_emp) PARALLEL(scott_emp, DEFAULT,DEFAULT) */
ename
FROM scott.emp scott_emp;
SELECT /*+ NOPARALLEL(scott_emp) */
ename
FROM scott.emp scott_emp;
The NOPARALLEL hint is equivalent to specifying the hint
SELECT /*+ FULL (scott_emp) CACHE(scott_emp) */
ename
FROM scott.emp scott_emp;
SELECT /*+ FULL(scott_emp) NOCACHE(scott_emp) */
ename
FROM scott.emp scott_emp;
The hint will have no effect if the subquery is applied to a remote table or one that is joined using a merge join.
This example shows the execution plans for two SQL statements that perform the same function. Both statements return all the departments in the DEPT table that have no employees in the EMP table. Each statement searches the EMP table with a subquery. Assume there is an index, DEPTNO_INDEX, on the DEPTNO column of the EMP table.
This is the first statement and its execution plan:
SELECT dname, deptno
FROM dept
WHERE deptno NOT IN
(SELECT deptno FROM emp);
However, this SQL statement selects the same rows by accessing the index:
SELECT dname, deptno
FROM dept
WHERE NOT EXISTS
(SELECT deptno
FROM emp
WHERE dept.deptno = emp.deptno);
If you have statements in your applications that use the NOT IN operator, as the first query in this example does, you should consider rewriting them so that they use the NOT EXISTS operator. This would allow such statements to use an index, if one exists.
![]() ![]() Prev Next |
![]() Copyright © 1996 Oracle Corporation. All Rights Reserved. |
![]() Library |
![]() Product |
![]() Contents |
![]() Index |