Oracle7 Server Tuning

Library

Product

Contents

Index

Transforming ORs into Compound Queries

Optimizer Concepts
Evaluating Expressions and Conditions
Transforming ORs into Compound Queries
Transforming Complex Statements into Join Statements
Optimizing Statements that Access Views
Choosing an Optimization Approach and Goal
Choosing Access Paths
Choosing Access Paths with the Cost-Based Approach
Optimizing Join Statements
Optimizing Star Queries
Optimizing Compound Queries
Using Histograms

Optimizer Concepts

This appendix discusses various optimization issues, to supplement "The Optimizer" chapter in Oracle7 Server Concepts. Topics include the following:

Evaluating Expressions and Conditions

The optimizer fully evaluates expressions whenever possible and translates certain syntactic constructs into equivalent constructs. This is done either because Oracle can more quickly evaluate the resulting expression than the original expression or because the original expression is merely a syntactic equivalent of the resulting expression. Since there are different SQL constructs that can operate identically (for example: = ANY (subquery) and IN (subquery)), Oracle maps these to a single construct.

Constants Any computation of constants is performed only once when the statement is optimized rather than each time the statement is executed. Consider these conditions that test for monthly salaries greater than 2000:

sal > 24000/12

sal > 2000

sal*12 > 24000

If a SQL statement contains the first condition, the optimizer simplifies it into the second condition.

Note that the optimizer does not simplify expressions across comparison operators. The optimizer does not simplify the third expression into the second. For this reason, application developers should write conditions that compare columns with constants whenever possible, rather than conditions with expressions involving columns.

LIKE The optimizer simplifies conditions that use the LIKE comparison operator to compare an expression with no wildcard characters into an equivalent condition that uses an equality operator instead. For example, the optimizer simplifies the first condition below into the second:

ename LIKE 'SMITH'

ename = 'SMITH'

The optimizer can simplify these expressions only when the comparison involves variable-length datatypes. For example, if ENAME was of type CHAR(10), the optimizer cannot transform the LIKE operation into an equality operation due to the comparison semantics of fixed-length datatypes.

IN The optimizer expands a condition that uses the IN comparison operator to an equivalent condition that uses equality comparison operators and OR logical operators. For example, the optimizer expands the first condition below into the second:

ename IN ('SMITH', 'KING', 'JONES')

ename = 'SMITH' OR ename = 'KING' OR ename = 'JONES'

ANY or SOME The optimizer expands a condition that uses the ANY or SOME comparison operator followed by a parenthesized list of values into an equivalent condition that uses equality comparison operators and OR logical operators. For example, the optimizer expands the first condition below into the second:

sal > ANY (:first_sal, :second_sal)

sal > :first_sal OR sal > :second_sal

The optimizer transforms a condition that uses the ANY or SOME operator followed by a subquery into a condition containing the EXISTS operator and a correlated subquery. For example, the optimizer transforms the first condition below into the second:

x > ANY (SELECT sal

		FROM emp

		WHERE job = 'ANALYST')

EXISTS (SELECT sal

		FROM emp

		WHERE job = 'ANALYST'

		  AND x > sal)

ALL The optimizer expands a condition that uses the ALL comparison operator followed by a parenthesized list of values into an equivalent condition that uses equality comparison operators and AND logical operators. For example, the optimizer expands the first condition below into the second:

sal > ALL (:first_sal, :second_sal)

sal > :first_sal AND sal > :second_sal

The optimizer transforms a condition that uses the ALL comparison operator followed by a subquery into an equivalent condition that uses the ANY comparison operator and a complementary comparison operator. For example, the optimizer transforms the first condition below into the second:

x > ALL (SELECT sal

		FROM emp 		
		WHERE deptno = 10)

NOT (x <= ANY (SELECT sal

			FROM emp 			
			WHERE deptno = 10) )

The optimizer then transforms the second query into the following query using the rule for transforming conditions with the ANY comparison operator followed by a correlated subquery:

NOT EXISTS (SELECT sal

			FROM emp

			WHERE deptno = 10
			  AND x <= sal)

BETWEEN The optimizer always replaces a condition that uses the BETWEEN comparison operator with an equivalent condition that uses the >= and <= comparison operators. For example, the optimizer replaces the first condition below with the second:

sal BETWEEN 2000 AND 3000

sal >= 2000 AND sal <= 3000

NOT The optimizer simplifies a condition to eliminate the NOT logical operator. The simplification involves removing the NOT logical operator and replacing a comparison operator with its opposite comparison operator. For example, the optimizer simplifies the first condition below into the second one:

NOT deptno = (SELECT deptno FROM emp WHERE ename = 'TAYLOR')

deptno <> (SELECT deptno FROM emp WHERE ename = 'TAYLOR')

Often a condition containing the NOT logical operator can be written many different ways. The optimizer attempts to transform such a condition so that the subconditions negated by NOTs are as simple as possible, even if the resulting condition contains more NOTs. For example, the optimizer simplifies the first condition below into the second and then into the third.

NOT (sal < 1000 OR comm IS NULL)

NOT sal < 1000 AND comm IS NOT NULL

sal >= 1000 AND comm IS NOT NULL

Transitivity If two conditions in the WHERE clause involve a common column, the optimizer can sometimes infer a third condition using the transitivity principle. The optimizer can then use the inferred condition to optimize the statement. The inferred condition could potentially make available an index access path that was not made available by the original conditions.

Imagine a WHERE clause containing two conditions of these forms:

WHERE column1 comp_oper constant

  AND column1 = column2

In this case, the optimizer infers the condition

	column2 comp_oper constant

where:

comp_oper

Is any of the comparison operators =, !=, ^=, <, <>, <, >, <=, or >=.

constant

Is any constant expression involving operators, SQL functions, literals, bind variables, and correlation variables.

Note: Transitivity is used only by the cost-based approach.

Example: Consider this query in which the WHERE clause containing two conditions that each use the EMP.DEPTNO column:

SELECT *

	FROM emp, dept

	WHERE emp.deptno = 20

	  AND emp.deptno = dept.deptno;

Using transitivity, the optimizer infers this condition:

dept.deptno = 20

If there is an index on the DEPT.DEPTNO column, this condition makes available access paths using that index.

Note: The optimizer only infers conditions that relate columns to constant expressions, rather than columns to other columns. Imagine a WHERE clause containing two conditions of these forms:

WHERE column1 comp_oper column3

  AND column1 = column2

In this case, the optimizer does not infer this condition:

column2 comp_oper column3

Transforming ORs into Compound Queries

Since SQL is such a flexible query language, there often are many statements you could formulate to achieve a given goal. Sometimes the optimizer transforms one such statement into another that achieves the same goal if the second statement can be executed more efficiently. This section discusses transforming queries containing ORs into compound statements containing UNION ALL.

If a query contains a WHERE clause with multiple conditions combined with OR operators, the optimizer transforms it into an equivalent compound query that uses the UNION ALL set operator if this will make it execute more efficiently:

If each condition individually makes an index access path available, the optimizer can make the transformation. The optimizer then chooses an execution plan for the resulting statement that accesses the table multiple times using the different indexes and then puts the results together.
If any condition requires a full table scan because it does not make an index available, the optimizer does not transform the statement. The optimizer chooses a full table scan to execute the statement, and Oracle tests each row in the table to determine whether it satisfies any of the conditions.

For information on access paths and how indexes make them available, see "Choosing Access Paths" on page A-23 and the sections that follow it. For statements that use the cost-based approach, the optimizer may also use statistics to determine whether to make the transformation by estimating and then comparing the costs of executing the original statement versus the resulting statement.

Example: Consider this query with a WHERE clause that contains two conditions combined with an OR operator:

SELECT *

	FROM emp

	WHERE job = 'CLERK'

	  OR deptno = 10;

If there are indexes on both the JOB and DEPTNO columns, the optimizer may transform the query above into the equivalent query below:

SELECT *

	FROM emp

	WHERE job = 'CLERK'

UNION ALL

SELECT *

	FROM emp

	WHERE deptno = 10

	  AND job <> 'CLERK';

If you are using the rule-based approach, the optimizer makes this transformation because each component query of the resulting compound query can be executed using an index. The rule-based approach assumes that executing the compound query using two index scans is faster than executing the original query using a full table scan.

If you are using the cost-based approach, the optimizer compares the cost of executing the original query using a full table scan with that of executing the resulting query when deciding whether to make the transformation. The execution plan for the transformed statement might look like this:

Execution Plan for a Compound Query

To execute the transformed query, Oracle performs these steps:

Steps 3 and 5 scan the indexes on the JOB and DEPTNO columns using the conditions of the component queries. These steps obtain ROWIDs of the rows that satisfy the component queries.
Steps 2 and 4 use the ROWIDs from Steps 3 and 5 to locate the rows that satisfy each component query.
Step 1 puts together the row sources returned by Steps 2 and 4.

If either of the JOB or DEPTNO columns is not indexed, the optimizer does not even consider the transformation because the resulting compound query would require a full table scan to execute one of its component queries. Executing the compound query with a full table scan in addition to an index scan could not possibly be faster than executing the original query with a full table scan.

Example: Consider this query and assume there is an index on the ENAME column:

SELECT *

	FROM emp

	WHERE ename = 'SMITH'

	  OR sal > comm;

Transforming the query above would result in the compound query below:

SELECT *

	FROM emp

	WHERE ename = 'SMITH'

UNION ALL

SELECT *

	FROM emp

	WHERE sal > comm;

Since the condition in the WHERE clause of the second component query (SAL > COMM) does not make an index available, the compound query requires a full table scan. For this reason, the optimizer does not consider making the transformation and chooses a full table scan to execute the original statement.

Transforming Complex Statements into Join Statements

The optimizer transforms a complex statement into a join statement whenever such a transformation results in a join statement that is guaranteed to return exactly the same rows as the complex statement. This transformation allows Oracle to execute the statement by taking advantage of join optimization techniques described in the section "Optimizing Join Statements" on page A-37.

Consider this complex statement that selects all rows from the ACCOUNTS table whose owners appear in the CUSTOMERS table:

SELECT *

	FROM accounts

	WHERE custno IN

		(SELECT custno FROM customers);

If the CUSTNO column of the CUSTOMERS table is a primary key or has a UNIQUE constraint, the optimizer can transform the complex query into this join statement that is guaranteed to return the same data:

SELECT accounts.*

	FROM accounts, customers

	WHERE accounts.custno = customers.custno;

The execution plan for this statement might look like the following figure:

Execution Plan for a Nested Loops Join

To execute this statement, Oracle performs a nested loops join operation. For information on nested loops joins, see the section "Join Operations" on page A-37.

If the optimizer cannot transform a complex statement into a join statement, the optimizer chooses execution plans for the parent statement and the subquery as though they were separate statements. Oracle then executes the subquery and uses the rows it returns to execute the parent query.

Consider this complex statement that returns all rows from the ACCOUNTS table that have balances that are greater than the average account balance:

SELECT *

	FROM accounts

	WHERE accounts.balance >

		(SELECT AVG(balance) FROM accounts);

There is no join statement that can perform the function of this statement, so the optimizer does not transform the statement. Note that complex queries whose subqueries contain group functions such as AVG cannot be transformed into join statements.

Optimizing Statements that Access Views

To optimize a statement that accesses a view, the optimizer chooses one of these alternatives:

Transforming Statements that Access Views
Optimizing Statements that Cannot Be Transformed

Transforming Statements that Access Views

To transform a statement that accesses a view into an equivalent statement that accesses the view's base tables, the optimizer can use one of these techniques:

merging the view's query into the accessing statement
merging the accessing statement into the view's query

The optimizer then optimizes the resulting statement.

Merging the View's Query into the Accessing Statement To merge the view's query into the accessing statement, the optimizer replaces the name of the view with the name of its base table in the accessing statement and adds the condition of the view's query's WHERE clause to the accessing statement's WHERE clause.

Example: Consider this view of all employees who work in department 10:

CREATE VIEW emp_10 
AS SELECT empno,ename, job, mgr, hiredate, sal, comm, deptno 
		FROM emp
		WHERE deptno = 10;

Consider this query that accesses the view. The query selects the IDs greater than 7800 of employees who work in department 10:

SELECT empno 
	FROM emp_10
	WHERE empno > 7800;

The optimizer transforms the query into the following query that accesses the view's base table:

SELECT empno 
	FROM emp 
	WHERE deptno = 10
	  AND empno > 7800;

If there are indexes on the DEPTNO or EMPNO columns, the resulting WHERE clause makes them available.

Merging the Accessing Statement into the View's Query The optimizer cannot always merge the view's query into the accessing statement. Such a transformation is not possible if the view's query contains

set operators (UNION, UNION ALL, INTERSECT, MINUS)
a GROUP BY clause
a CONNECT BY clause
a DISTINCT operator in the select list
group functions (AVG, COUNT, MAX, MIN, SUM) in the select list

To optimize statements that access such views, the optimizer can merge the statement into the view's query.

Example: Consider the TWO_EMP_TABLES view, which is the union of two employee tables. The view is defined with a compound query that uses the UNION set operator:

CREATE VIEW two_emp_tables

	(empno, ename, job, mgr, hiredate, sal, comm, deptno) AS

	SELECT empno, ename, job, mgr, hiredate, sal, comm,
		deptno FROM emp1 UNION

	SELECT empno, ename, job, mgr, hiredate, sal, comm,
		deptno FROM emp2;

Consider this query that accesses the view. The query selects the IDs and names of all employees in either table who work in department 20:

SELECT empno, ename

	FROM two_emp_tables

	WHERE deptno = 20;

Since the view is defined as a compound query, the optimizer cannot merge the view query into the accessing query. Instead, the optimizer transforms the query by adding its WHERE clause condition into the compound query. The resulting statement looks like this:

SELECT empno, ename FROM emp1 WHERE deptno = 20

	UNION

SELECT empno, ename FROM emp2 WHERE deptno = 20;

If there is an index on the DEPTNO column, the resulting WHERE clauses make it available. The following figure shows the execution plan of the resulting statement.

Accessing a View Defined with the UNION Set Operator

To execute this statement, Oracle performs these steps:

Steps 5 and 6 perform full scans of the EMP1 and EMP2 tables.
Step 4 performs a UNION ALL operation returning all rows returned by either Step 5 or Step 6, including all copies of duplicates.
Step 3 sorts the result of Step 4, eliminating duplicate rows.
Step 2 extracts the desired columns from the result of Step 3.
Step 1 indicates that the view's query was not merged into the accessing query.

Example: Consider the view EMP_GROUP_BY_DEPTNO, which contains the department number, average salary, minimum salary, and maximum salary of all departments that have employees:

CREATE VIEW emp_group_by_deptno

	AS SELECT deptno,

	       AVG(sal) avg_sal,

		MIN(sal) min_sal,

	       MAX(sal) max_sal

		  FROM emp

		  GROUP BY deptno;

Consider this query, which selects the average, minimum, and maximum salaries of department 10 from the EMP_GROUP_BY_DEPTNO view:

SELECT *

	FROM emp_group_by_deptno

	WHERE deptno = 10;

The optimizer transforms the statement by adding its WHERE clause condition into the view's query. The resulting statement looks like this:

SELECT deptno,

	AVG(sal) avg_sal,

	MIN(sal) min_sal,

	MAX(sal) max_sal,

  FROM emp

  WHERE deptno = 10

  GROUP BY deptno;

If there is an index on the DEPTNO column, the resulting WHERE clause makes it available.

The following figure shows the execution plan for the resulting statement. The execution plan uses an index on the DEPTNO column.

Accessing a View Defined with a GROUP BY Clause

To execute this statement, Oracle performs these operations:

Step 4 performs a range scan on the index named EMP_DEPTNO_INDEX to retrieve the ROWIDs of all rows in the EMP table with a DEPTNO value of 10. EMP_DEPTNO_INDEX is an index on the DEPTNO column of the EMP table.
Step 3 accesses the EMP table using the ROWIDs retrieved by Step 4.
Step 2 sorts the rows returned by Step 3 to calculate the average, minimum, and maximum SAL values.
Step 1 indicates that the view's query was not merged into the accessing query.

Example: Consider this query, which accesses the EMP_GROUP_BY_DEPTNO view defined in the previous example. This query derives the averages for the average department salary, the minimum department salary, and the maximum department salary from the employee table:

SELECT AVG(avg_sal), AVG(min_sal), AVG(max_sal)

	FROM emp_group_by_deptno;

The optimizer transforms this statement by applying the AVG group function to the select list of the view's query:

SELECT AVG(AVG(sal)), AVG(MIN(sal)), AVG(MAX(sal))

	FROM emp

	GROUP BY deptno;

The following figure shows the execution plan of the resulting statement:

Applying Group Functions to a View Defined with a GROUP BY Clause

To execute this statement, Oracle performs these operations:

Step 4 performs a full scan of the EMP table.
Step 3 sorts the rows returned by Step 4 into groups based on their DEPTNO values and calculates the average, minimum, and maximum SAL value of each group.
Step 2 indicates that the view's query was not merged into the accessing query.
Step 1 calculates the averages of the values returned by Step 2.

Optimizing Statements that Cannot Be Transformed

The optimizer cannot transform all statements that access views into equivalent statements that access base table(s). To execute a statement that cannot be transformed, Oracle issues the view's query, collects the resulting set of rows, and then accesses this set of rows with the original statement as though it were a table.

Example: Consider the EMP_GROUP_BY_DEPTNO view defined in the previous section:

CREATE VIEW emp_group_by_deptno

	AS SELECT deptno,

	      AVG(sal) avg_sal,
		MIN(sal) min_sal,

	       MAX(sal) max_sal

		  FROM emp 
		  GROUP BY deptno;

Consider this query, which accesses the view. The query joins the average, minimum, and maximum salaries from each department represented in this view and to the name and location of the department in the DEPT table:

SELECT emp_group_by_deptno.deptno, avg_sal, min_sal,
	max_sal, dname, loc

	FROM emp_group_by_deptno, dept 
	WHERE emp_group_by_deptno.deptno = dept.deptno;

Since there is no equivalent statement that accesses only base tables, the optimizer cannot transform this statement. Instead, the optimizer chooses an execution plan that issues the view's query and then uses the resulting set of rows as it would the rows resulting from a table access.

The following figure shows the execution plan for this statement.

Joining a View Defined with a Group BY Clause to a Table

To execute this statement, Oracle performs these operations:

Step 4 performs a full scan of the EMP table.
Step 3 sorts the results of Step 4 and calculates the average, minimum, and maximum SAL values selected by the query for the EMP_GROUP_BY_DEPTNO view.
Step 2 indicates that the previous two steps were executed to select data for a view.
For each row returned by Step 2, Step 6 uses the DEPTNO value to perform a unique scan of the PK_DEPT index.
Step 5 uses each ROWID returned by Step 6 to locate the row in the DEPTNO table with the matching DEPTNO value.
Oracle combines each row returned by Step 2 with the matching row returned by Step 5 and returns the result.

For more information on how Oracle performs a nested loops join operation, see the section "Join Operations" on page A-37.

Choosing an Optimization Approach and Goal

The optimizer's behavior when choosing an optimization approach and goal for a SQL statement is affected by these factors:

the OPTIMIZER_MODE initialization parameter
statistics in the data dictionary
the OPTIMIZER_GOAL parameter of the ALTER SESSION command
hints in the statement

OPTIMIZER_MODE Initialization Parameter

The OPTIMIZER_MODE initialization parameter establishes the default behavior for choosing an optimization approach for the instance. This parameter can have these values:

CHOOSE	This value causes the optimizer to choose between the rule-based approach and the cost-based approach based on whether the statistics used for the cost-based approach are present. If the data dictionary contains statistics for at least one of the accessed tables, the optimizer uses the cost-based approach and optimizes with a goal of best throughput. If data dictionary contains no statistics for any of the accessed tables, the optimizer uses the rule-based approach. This is the default value for the parameter.
RULE	This value causes the optimizer to choose the rule-based approach for all SQL statements issued to the instance regardless of the presence of statistics.
ALL_ROWS	This value causes the optimizer to use the cost-based approach for all SQL statements in the session regardless of the presence of statistics and to optimize with a goal of best throughput (minimum resource usage to complete the entire statement).
FIRST_ROWS	This value causes the optimizer to use the cost-based approach for all SQL statements in the session regardless of the presence of statistics and to optimize with a goal of best response time (minimum resource usage to return the first row of the result set).

Note that PL/SQL ignores the OPTIMIZER_MODE=FIRST_ROWS initialization parameter setting.

If the optimizer uses the cost-based approach for a SQL statement, and some tables accessed by the statement have no statistics, the optimizer uses internal information such as the number of data blocks allocated to these tables to estimate other statistics for these tables.

OPTIMIZER_GOAL Parameter of the ALTER SESSION Command

The OPTIMIZER_GOAL parameter of the ALTER SESSION command can override the optimization approach and goal established by the OPTIMIZER_MODE initialization parameter for an individual session. This parameter can have these values:

CHOOSE	This value causes the optimizer to choose between the rule-based approach and the cost-based approach based on whether the statistics used for the cost-based approach are present. If the data dictionary contains statistics for at least one of the accessed tables, the optimizer uses the cost-based approach and optimizes with a goal of best throughput. If the data dictionary contains no statistics for any of the accessed tables, the optimizer uses the rule-based approach.
ALL_ROWS	This value causes the optimizer to use the cost-based approach for all SQL statements in the session regardless of the presence of statistics and to optimize with a goal of best throughput (minimum resource usage to complete the entire statement).
FIRST_ROWS	This value causes the optimizer to use the cost-based approach for all SQL statements in the session regardless of the presence of statistics and to optimize with a goal of best response time (minimum resource usage to return the first row of the result set).
RULE	This value causes the optimizer to choose the rule-based approach for all SQL statements issued to the instance regardless of the presence of statistics.

The value of this parameter affects the optimization of SQL statements issued by stored procedures and functions called during the session, but it does not affect the optimization of recursive SQL statements that Oracle issues during the session. The optimization approach for recursive SQL statements is only affected by the value of the OPTIMIZER_MODE initialization parameter.

Use hints to determine the access path for SQL statements submitted from within PL/SQL. The ALTER SESSION OPTIMIZER_GOAL statement does not affect SQL that is run from within PL/SQL.

The FIRST_ROWS, ALL_ROWS, and RULE Hints

The FIRST_ROWS, ALL_ROWS, CHOOSE, and RULE hints can override the affects of the OPTIMIZER_MODE initialization parameter and the OPTIMIZER_GOAL parameter of the ALTER SESSION command for an individual SQL statement. For information on these hints, see Chapter 7, "Optimization Modes and Hints".

Choosing Access Paths

One of the most important choices the optimizer makes when formulating an execution plan is how to retrieve the data from the database. For any row in any table accessed by a SQL statement, there may be many access paths by which that row can be located and retrieved. The optimizer chooses one of them. For more information about access paths, see "The Optimizer" chapter in Oracle7 Server Concepts.

The optimizer can only choose to use a particular access path for a table if the statement contains a WHERE clause condition or other construct that makes that access path available. Each of the following sections describes an access path and discusses

when it is available
the method Oracle uses to access data with it
the output generated for it by the EXPLAIN PLAN command

Path 1: Single Row by ROWID This access path is only available if the statement's WHERE clause identifies the selected rows by ROWID or with the CURRENT OF CURSOR embedded SQL syntax supported by the Oracle Precompilers. To execute the statement, Oracle accesses the table by ROWID.

Example: This access path is available in the following statement:

SELECT * FROM emp WHERE ROWID = '00000DC5.0000.0001';

The EXPLAIN PLAN output for this statement might look like this:

OPERATION		OPTIONS	 	OBJECT_NAME

-----------------------------------------------------

SELECT STATEMENT   
	TABLE ACCESS			 		BY ROWID 		EMP

Path 2: Single Row by Cluster Join This access path is available for statements that join tables stored in the same cluster if both of these conditions are true:

The statement's WHERE clause contains conditions that equate each column of the cluster key in one table with the corresponding column in the other table.
The statement's WHERE clause also contains a condition that guarantees that the join returns only one row. Such a condition is likely to be an equality condition on the column(s) of a unique or primary key.

These conditions must be combined with AND operators. To execute the statement, Oracle performs a nested loops operation. For information on the nested loops operation, see the section "Join Operations" on page A-37.

Example: This access path is available for the following statement in which the EMP and DEPT tables are clustered on the DEPTNO column and the EMPNO column is the primary key of the EMP table:

SELECT *

	FROM emp, dept

	WHERE emp.deptno = dept.deptno

	  AND emp.empno = 7900;

The EXPLAIN PLAN output for this statement might look like this:

OPERATION 		OPTIONS		 OBJECT_NAME

-----------------------------------------------------

SELECT STATEMENT

	NESTED LOOPS

	  TABLE ACCESS		BY ROWID	EMP

		INDEX			UNIQUE SCAN	PK_EMP

	  TABLE ACCESS		CLUSTER		DEPT

PK_EMP is the name of an index that enforces the primary key.

Path 3: Single Row by Hash Cluster Key with Unique or Primary Key This access path is available if both of these conditions are true:

The statement's WHERE clause uses all columns of a hash cluster key in equality conditions. For composite cluster keys, the equality conditions must be combined with AND operators.
The statement is guaranteed to return only one row because the columns that make up the hash cluster key also make up a unique or primary key.

To execute the statement, Oracle applies the cluster's hash function to the hash cluster key value specified in the statement to obtain a hash value. Oracle then uses the hash value to perform a hash scan on the table.

Example: This access path is available in the following statement, in which the ORDERS and LINE_ITEMS tables are stored in a hash cluster, and the ORDERNO column is both the cluster key and the primary key of the ORDERS table:

SELECT *

	FROM orders
	WHERE orderno = 65118968;

The EXPLAIN PLAN output for this statement might look like this:

OPERATION	 		OPTIONS 		OBJECT_NAME 
----------------------------------------------------- 
SELECT STATEMENT   
	TABLE ACCESS		HASH		ORDERS

Path 4: Single Row by Unique or Primary Key This access path is available if the statement's WHERE clause uses all columns of a unique or primary key in equality conditions. For composite keys, the equality conditions must be combined with AND operators. To execute the statement, Oracle performs a unique scan on the index on the unique or primary key to retrieve a single ROWID and then accesses the table by that ROWID.

Example: This access path is available in the following statement in which the EMPNO column is the primary key of the EMP table:

SELECT * 
	FROM emp 	
	WHERE empno = 7900;

The EXPLAIN PLAN output for this statement might look like this:

OPERATION	 	OPTIONS		 OBJECT_NAME 
----------------------------------------------------- 
SELECT STATEMENT 
  TABLE ACCESS	BY ROWID	EMP     
	INDEX		UNIQUE SCAN	PK_EMP

PK_EMP is the name of the index that enforces the primary key.

Path 5: Clustered Join This access path is available for statements that join tables stored in the same cluster if the statement's WHERE clause contains conditions that equate each column of the cluster key in one table with the corresponding column in the other table. For a composite cluster key, the equality conditions must be combined with AND operators. To execute the statement, Oracle performs a nested loops operation. For information on nested loops operations, see the section "Join Operations" on page A-37.

Example: This access path is available in the following statement in which the EMP and DEPT tables are clustered on the DEPTNO column:

SELECT * 
	FROM emp, dept 
	WHERE emp.deptno = dept.deptno;

The EXPLAIN PLAN output for this statement might look like this:

OPERATION 			OPTIONS	 	OBJECT_NAME 
----------------------------------------------------- 
SELECT STATEMENT 
  NESTED LOOPS 
	TABLE ACCESS		FULL		DEPT     
	TABLE ACCESS		CLUSTER	EMP

Path 6: Hash Cluster Key This access path is available if the statement's WHERE clause uses all the columns of a hash cluster key in equality conditions. For a composite cluster key, the equality conditions must be combined with AND operators. To execute the statement, Oracle applies the cluster's hash function to the hash cluster key value specified in the statement to obtain a hash value. Oracle then uses this hash value to perform a hash scan on the table.

Example: This access path is available for the following statement in which the ORDERS and LINE_ITEMS tables are stored in a hash cluster and the ORDERNO column is the cluster key:

SELECT * 
	FROM line_items
	WHERE orderno = 65118968;

The EXPLAIN PLAN output for this statement might look like this:

OPERATION 			OPTIONS		 OBJECT_NAME 
----------------------------------------------------- 
SELECT STATEMENT   
	TABLE ACCESS		HASH		LINE_ITEMS

Path 7: Indexed Cluster Key This access path is available if the statement's WHERE clause uses all the columns of an indexed cluster key in equality conditions. For a composite cluster key, the equality conditions must be combined with AND operators. To execute the statement, Oracle performs a unique scan on the cluster index to retrieve the ROWID of one row with the specified cluster key value. Oracle then uses that ROWID to access the table with a cluster scan. Since all rows with the same cluster key value are stored together, the cluster scan requires only a single ROWID to find them all.

Example: This access path is available in the following statement in which the EMP table is stored in an indexed cluster and the DEPTNO column is the cluster key:

SELECT *  FROM emp 
	WHERE deptno = 10;

The EXPLAIN PLAN output for this statement might look like this:

OPERATION		 OPTIONS			 OBJECT_NAME

-----------------------------------------------------

SELECT STATEMENT

  TABLE ACCESS	CLUSTER			EMP

	INDEX 		UNIQUE SCAN		 PERS_INDEX

PERS_INDEX is the name of the cluster index.

Path 8: Composite Index This access path is available if the statement's WHERE clause uses all columns of a composite index in equality conditions combined with AND operators. To execute the statement, Oracle performs a range scan on the index to retrieve the ROWIDs of the selected rows and then accesses the table by those ROWIDs.

Example: This access path is available in the following statement in which there is a composite index on the JOB and DEPTNO columns:

SELECT *

	FROM emp

	WHERE job = 'CLERK'

	  AND deptno = 30;

The EXPLAIN PLAN output for this statement might look like this:

OPERATION		 OPTIONS		 OBJECT_NAME

-----------------------------------------------------

SELECT STATEMENT

  TABLE ACCESS	BY ROWID	EMP

	INDEX		 RANGE SCAN	 JOB_DEPTNO_INDEX

JOB_DEPTNO_INDEX is the name of the composite index on the JOB and DEPTNO columns.

Path 9: Single-Column Indexes This access path is available if the statement's WHERE clause uses the columns of one or more single-column indexes in equality conditions. For multiple single-column indexes, the conditions must be combined with AND operators.

If the WHERE clause uses the column of only one index, Oracle executes the statement by performing a range scan on the index to retrieve the ROWIDs of the selected rows and then accessing the table by these ROWIDs.

Example: This access path is available in the following statement in which there is an index on the JOB column of the EMP table:

SELECT *

	FROM emp

	WHERE job = 'ANALYST';

The EXPLAIN PLAN output for this statement might look like this:

OPERATION 		OPTION 		OBJECT_NAME

-----------------------------------------------------

SELECT STATEMENT

  TABLE ACCESS	BY ROWID	EMP
	INDEX		 RANGE SCAN 	JOB_INDEX

JOB_INDEX is the index on EMP.JOB.

If the WHERE clauses uses columns of many single-column indexes, Oracle executes the statement by performing a range scan on each index to retrieve the ROWIDs of the rows that satisfy each condition. Oracle then merges the sets of ROWIDs to obtain a set of ROWIDs of rows that satisfy all conditions. Oracle then accesses the table using these ROWIDs.

Oracle can merge up to five indexes. If the WHERE clause uses columns of more than five single-column indexes, Oracle merges five of them, accesses the table by ROWID, and then tests the resulting rows to determine whether they satisfy the remaining conditions before returning them.

Example: This access path is available in the following statement in which there are indexes on both the JOB and DEPTNO columns of the EMP table:

SELECT *

	FROM emp

	WHERE job = 'ANALYST'
	  AND deptno = 20;

The EXPLAIN PLAN output for this statement might look like this:

OPERATION	 	OPTIONS 			OBJECT_NAME

-----------------------------------------------------

SELECT STATEMENT

  TABLE ACCESS	BY ROWID		EMP

	AND-EQUAL

	  INDEX		 RANGE SCAN	 	JOB_INDEX

	  INDEX		 RANGE SCAN		 DEPTNO_INDEX

The AND-EQUAL operation merges the ROWIDs obtained by the scans of the JOB_INDEX and the DEPTNO_INDEX, resulting in a set of ROWIDs of rows that satisfy the query.

Path 10: Bounded Range Search on Indexed Columns This access path is available if the statement's WHERE clause contains a condition that uses either the column of a single-column index or one or more columns that make up a leading portion of a composite index:

column = expr

column >[=] expr AND column <[=] expr

column BETWEEN expr AND expr

column LIKE 'c%'

Each of these conditions specifies a bounded range of indexed values that are accessed by the statement. The range is said to be bounded because the conditions specify both its least value and its greatest value. To execute such a statement, Oracle performs a range scan on the index and then accesses the table by ROWID.

This access path is not available if expr references the indexed column.

Example: This access path is available in this statement in which there is an index on the SAL column of the EMP table:

SELECT *

	FROM emp

	WHERE sal BETWEEN 2000 AND 3000;

The EXPLAIN PLAN output for this statement might look like this:

OPERATION 		OPTIONS	 OBJECT_NAME

-----------------------------------------------------

SELECT STATEMENT

  TABLE ACCESS	BY ROWID	EMP   
	INDEX		 RANGE SCAN	SAL_INDEX

SAL_INDEX is the name of the index on EMP.SAL.

Example: This access path is also available in the following statement in which there is an index on the ENAME column of the EMP table:

SELECT *

	FROM emp

	WHERE ename LIKE 'S%';

Path 11: Unbounded Range Search on Indexed Columns This access path is available if the statement's WHERE clause contains one of these conditions that use either the column of a single-column index or one or more columns of a leading portion of a composite index:

WHERE column >[=] expr

WHERE column <[=] expr

Each of these conditions specify an unbounded range of index values accessed by the statement. The range is said to be unbounded because the condition specifies either its least value or its greatest value, but not both. To execute such a statement, Oracle performs a range scan on the index and then accesses the table by ROWID.

Example: This access path is available in the following statement in which there is an index on the SAL column of the EMP table:

SELECT *

	FROM emp 
	WHERE sal > 2000;

The EXPLAIN PLAN output for this statement might look like this:

OPERATION 		OPTIONS 		OBJECT_NAME

-----------------------------------------------------

SELECT STATEMENT

  TABLE ACCESS	BY ROWID	EMP     
	INDEX		RANGE SCAN	SAL_INDEX

Example: This access path is available in the following statement in which there is a composite index on the ORDER and LINE columns of the LINE_ITEMS table:

SELECT *

	FROM line_items

	WHERE order > 65118968;

The access path is available because the WHERE clause uses the ORDER column, a leading portion of the index.

Example: This access path is not available in the following statement in which there is an index on the ORDER and LINE columns:

SELECT *

	FROM line_items

	WHERE line < 4;

The access path is not available because the WHERE clause only uses the LINE column, which is not a leading portion of the index.

Path 12: Sort-Merge Join This access path is available for statements that join tables that are not stored together in a cluster if the statement's WHERE clause uses columns from each table in equality conditions. To execute such a statement, Oracle uses a sort-merge operation. Oracle can also use a nested loops operation to execute a join statement. For information on these operations and on when the optimizer chooses one over the other, see the section "Optimizing Join Statements" on page A-37.

Example: This access path is available for the following statement in which the EMP and DEPT tables are not stored in the same cluster:

SELECT *

	FROM emp, dept

	WHERE emp.deptno = dept.deptno;

The EXPLAIN PLAN output for this statement might look like this:

OPERATION 			OPTIONS 		OBJECT_NAME

-----------------------------------------------------

SELECT STATEMENT

  MERGE JOIN

	SORT			JOIN

	  TABLE ACCESS		FULL		EMP

	SORT			JOIN       
	  TABLE ACCESS		FULL		DEPT

Path 13: MAX or MIN of Indexed Column This access path is available for a SELECT statement for which all of these conditions are true:

The query uses the MAX or MIN function to select the maximum or minimum value of either the column of a single-column index or the leading column of a composite index. The index cannot be a cluster index.
The argument to the MAX or MIN function can be any expression involving the column, a constant, or the addition operator (+), the concatenation operation (||), or the CONCAT function.
There are no other expressions in the select list.
The statement has no WHERE clause or GROUP BY clause.

To execute the query, Oracle performs a range scan of the index to find the maximum or minimum indexed value. Since only this value is selected, Oracle need not access the table after scanning the index.

Example: This access path is available for the following statement in which there is an index on the SAL column of the EMP table:

SELECT MAX(sal) FROM emp;

The EXPLAIN PLAN output for this statement might look like this:

OPERATION 			OPTIONS		 OBJECT_NAME

-----------------------------------------------------

SELECT STATEMENT

  AGGREGATE			GROUP BY

	INDEX			RANGE SCAN	SAL_INDEX

Path 14: ORDER BY on Indexed Column This access path is available for a SELECT statement for which all of these conditions are true:

The query contains an ORDER BY clause that uses either the column of a single-column index or a leading portion of a composite index. The index cannot be a cluster index.
There must be a PRIMARY KEY or NOT NULL integrity constraint that guarantees that at least one of the indexed columns listed in the ORDER BY clause contains no nulls.
The NLS_SORT parameter is set to BINARY.

To execute the query, Oracle performs a range scan of the index to retrieve the ROWIDs of the selected rows in sorted order. Oracle then accesses the table by these ROWIDs.

Example: This access path is available for the following statement in which there is a primary key on the EMPNO column of the EMP table:

SELECT *

	FROM emp

	ORDER BY empno;

The EXPLAIN PLAN output for this statement might look like this:

OPERATION 		OPTIONS 		OBJECT_NAME

-----------------------------------------------------

SELECT STATEMENT

  TABLE ACCESS	BY ROWID	EMP

	INDEX		RANGE SCAN	PK_EMP

PK_EMP is the name of the index that enforces the primary key. The primary key ensures that the column does not contain nulls.

Path 15: Full Table Scan This access path is available for any SQL statement, regardless of its WHERE clause conditions.

This statement uses a full table scan to access the EMP table:

SELECT *
	FROM emp;

The EXPLAIN PLAN output for this statement might look like this:

OPERATION 		OPTIONS		 OBJECT_NAME

-----------------------------------------------------

SELECT STATEMENT   
  TABLE ACCESS	FULL		EMP

Note that these conditions do not make index access paths available:

column1 > column2
column1 < column2
column1 >= column2
column1 <= column2

where column1 and column2 are in the same table.

column IS NULL
column IS NOT NULL
column NOT IN
column != expr
column LIKE '%pattern'

regardless of whether column is indexed.

expr = expr2

where expr is an expression that operates on a column with an operator or function, regardless of whether the column is indexed.

NOT EXISTS subquery
any condition involving a column that is not indexed

Any SQL statement that contains only these constructs and no others that make index access paths available must use full table scans.

Choosing Access Paths with the Cost-Based Approach

This section provides examples that show how the optimizer chooses among available access paths when using the cost-based approach. For more information, see "The Optimizer" chapter in Oracle7 Server Concepts.

Example: Consider this query, which uses a bind variable rather than a literal value for the boundary value in the WHERE clause condition:

SELECT *

	FROM emp

	WHERE empno < :e1;

The optimizer does not know the value of the bind variable E1. Indeed, the value of E1 may be different for each execution of the query. For this reason, the optimizer cannot use the means described in the previous example to determine selectivity of this query. In this case, the optimizer heuristically guesses a small value for the selectivity of the column (because it is indexed). The optimizer makes this assumption whenever a bind variable is used as a boundary value in a condition with one of the operators <, >, <=, or >=.

The optimizer's treatment of bind variables can cause it to choose different execution plans for SQL statements that differ only in the use of bind variables rather than constants. In one case in which this difference may be especially apparent, the optimizer may choose different execution plans for an embedded SQL statement with a bind variable in an Oracle Precompiler program and the same SQL statement with a constant in SQL*Plus.

Example: Consider this query, which uses two bind variables as boundary values in the condition with the BETWEEN operator:

SELECT *

	FROM emp

	WHERE empno BETWEEN :low_e AND :high_e;

The optimizer decomposes the BETWEEN condition into these two conditions:

empno >= :low_e

empno <= :high_e

The optimizer heuristically estimates a small selectiviy for indexed columns in order to favor the use of the index.

Example: Consider this query, which uses the BETWEEN operator to select all employees with employee ID numbers between 7500 and 7800:

SELECT *

	FROM emp

	WHERE empno BETWEEN 7500 AND 7800;

To determine the selectivity of this query, the optimizer decomposes the WHERE clause condition into these two conditions:

empno >= 7500

empno <= 7800

The optimizer estimates the individual selectivity of each condition using the means described in a previous example. The optimizer then uses these selectivities (S1 and S2) and the absolute value function (ABS) in this formula to estimate the selectivity (S) of the BETWEEN condition:

S = ABS( S1 + S2 - 1 )

Optimizing Join Statements

This section covers:

Optimizer Choices for Join Statements

To choose an execution plan for a join statement, the optimizer must make three related choices

access paths	As for simple statements, the optimizer must choose an access path to retrieve data from each table in the join statement.
join operations	To join each pair of row sources, Oracle must perform one of these operations: nested loops sort-merge cluster hash join
join order	To execute a statement that joins more than two tables, Oracle joins two of the tables, and then joins the resulting row source to the next table. This process is continued until all tables are joined into the result.

Join Operations

To join two row sources, Oracle must perform one of these operations:

nested loops
sort-merge
cluster
hash join

Nested Loops Join To perform a nested loops join, Oracle follows these steps:

The optimizer chooses one of the tables as the outer table, or the driving table. The other table is called the inner table.
For each row in the outer table, Oracle finds all rows in the inner table that satisfy the join condition.
Oracle combines the data in each pair of rows that satisfy the join condition and returns the resulting rows.

The following figure shows the execution plan for this statement using a nested loops join:

SELECT *

	FROM emp, dept

	WHERE emp.deptno = dept.deptno;

Nested Loops Join

To execute this statement, Oracle performs these steps:

Step 2 accesses the outer table (EMP) with a full table scan.
For each row returned by Step 2, Step 4 uses the EMP.DEPTNO value to perform a unique scan on the PK_DEPT index.
Step 3 uses the ROWID from Step 4 to locate the matching row in the inner table (DEPT).
Oracle combines each row returned by Step 2 with the matching row returned by Step 4 and returns the result.

Sort-Merge Join To perform a sort-merge join, Oracle follows these steps:

Oracle sorts each row source to be joined if they have not been sorted already by a previous operation. The rows are sorted on the values of the columns used in the join condition.
Oracle merges the two sources so that each pair of rows, one from each source, that contain matching values for the columns used in the join condition are combined and returned as the resulting row source.

Oracle can only perform a sort-merge join for an equijoin.

The following figure shows the execution plan for this statement using a sort-merge join:

SELECT *

	FROM emp, dept 
	WHERE emp.deptno = dept.deptno;

Sort-Merge Join

To execute this statement, Oracle performs these steps:

Steps 3 and 5 perform full table scans of the EMP and DEPT tables.
Steps 2 and 4 sort each row source separately.
Step 1 merges the sources from Steps 2 and 4 together, combining each row from Step 2 with each matching row from Step 4 and returns the resulting row source.

Cluster Join Oracle can perform a cluster join only for an equijoin that equates the cluster key columns of two tables in the same cluster. In a cluster, rows from both tables with the same cluster key values are stored in the same blocks, so Oracle only accesses those blocks. For information on clusters, including how to decide which tables to cluster for best performance, see Chapter 8, "Data Access Methods".

The following figure shows the execution plan for this statement in which the EMP and DEPT tables are stored together in the same cluster:

SELECT *

	FROM emp, dept 
	WHERE emp.deptno = dept.deptno;

Cluster Join

To execute this statement, Oracle performs these steps:

Step 2 accesses the outer table (DEPT) with a full table scan.
For each row returned by Step 2, Step 3 uses the DEPT.DEPTNO value to find the matching rows in the inner table (EMP) with a cluster scan.

A cluster join is nothing more than a nested loops join involving two tables that are stored together in a cluster. Since each row from the DEPT table is stored in the same data blocks as the matching rows in the EMP table, Oracle can access matching rows most efficiently.

Hash Join To perform a hash join, Oracle follows these steps:

Oracle performs a full table scan on each of the tables and splits each into as many partitions as possible based on the available memory.
Oracle builds a hash table from one of the partitions (if possible, Oracle will select a partition that fits into available memory). Oracle then uses the corresponding partition in the other table to probe the hash table. All partitions pairs that do not fit into memory are placed onto disk.
For each pair of partitions (one from each table), Oracle uses the smaller one to build a hash table and the larger one to probe the hash table.

Oracle can only perform a hash join for an equijoin.

The following figure shows the execution plan for this statement using a hash join:

SELECT *

	FROM emp, dept 
	WHERE emp.deptno = dept.deptno;

Hash Join

To execute this statement, Oracle performs these steps:

Steps 2 and 3 perform full table scans of the EMP and DEPT tables.
Step 1 builds a hash table out of the rows coming from 2 and probes it with each row coming from 3.

To enable the selection of the hash join algorithm you must use one of the following approaches: issue the statement ALTER SESSION SET HASH_JOIN_ENABLED = TRUE; or set the HASH_JOIN_ENABLED initialization parameter; or else employ the USE_HASH hint. See Oracle7 Server SQL Reference for more information on the syntax of the ALTER SESSION command.

The initialization parameter HASH_AREA_SIZE controls the memory to be used for hash join operations and the initialization parameter HASH_MULTIBLOCK_IO_COUNT controls the number of blocks a hash join operation should read and write concurrently. See Oracle7 Server Reference for more information about these initialization parameters.

Choosing Execution Plans for Join Statements

This section describes how the optimizer chooses an execution plan for a join statement:

when using the rule-based approach
when using the cost-based approach

Note these considerations that apply to the cost-based and rule-based approaches:

The optimizer first determines whether joining two or more of the tables definitely results in a row source containing at most one row. The optimizer recognizes such situations based on UNIQUE and PRIMARY KEY constraints on the tables. If such a situation exists, the optimizer places these tables first in the join order. The optimizer then optimizes the join of the remaining set of tables.
For join statements with outer join conditions, the table with the outer join operator must come after the other table in the condition in the join order. The optimizer does not consider join orders that violate this rule.

Choosing Execution Plans for Joins with the Rule-Based Approach With the rule-based approach, the optimizer follows these steps to choose an execution plan for a statement that joins R tables:

The optimizer generates a set of R join orders, each with a different table as the first table. The optimizer generates each potential join order using this algorithm:
1. a. To fill each position in the join order, the optimizer chooses the table with the most highly ranked available access path according to the ranks for access paths in "Choosing Access Paths" on page A-23. The optimizer repeats this step to fill each subsequent position in the join order.
1. b. For each table in the join order, the optimizer also chooses the operation with which to join the table to the previous table or row source in the order. The optimizer does this by "ranking" the sort-merge operation as access path 12 and applying these rules:

If the access path for the chosen table is ranked 11 or better, the optimizer chooses a nested loops operation using the previous table or row source in the join order as the outer table.
If the access path for the table is ranked lower than 12, and there is an equijoin condition between the chosen table and the previous table or row source in join order, the optimizer chooses a sort-merge operation.
If the access path for the chosen table is ranked lower than 12, and there is not an equijoin condition, the optimizer chooses a nested loops operation with the previous table or row source in the join order as the outer table.

The optimizer then chooses among the resulting set of execution plans. The goal of the optimizer's choice is to maximize the number of nested loops join operations in which the inner table is accessed using an index scan. Since a nested loops join involves accessing the inner table many times, an index on the inner table can greatly improve the performance of a nested loops join. Usually, the optimizer does not consider the order in which tables appear in the FROM clause when choosing an execution plan.
The optimizer makes this choice by applying the following rules in order:
2. a. The optimizer chooses the execution plan with the fewest nested-loops operations in which the inner table is accessed with a full table scan.
2. b. If there is a tie, the optimizer chooses the execution plan with the fewest sort-merge operations.
2. c. If there is still a tie, the optimizer chooses the execution plan for which the first table in the join order has the most highly ranked access path:

If there is a tie among multiple plans whose first tables are accessed by the single-column indexes access path, the optimizer chooses the plan whose first table is accessed with the most merged indexes.
If there is a tie among multiple plans whose first tables are accessed by bounded range scans, the optimizer chooses the plan whose first table is accessed with the greatest number of leading columns of the composite index.
2. d. If there is still a tie, the optimizer chooses the execution plan for which the first table appears later in the query's FROM clause.

Choosing Execution Plans for Joins with the Cost-Based Approach With the cost-based approach, the optimizer generates a set of execution plans based on the possible join orders, join operations, and available access paths. The optimizer then estimates the cost of each plan and chooses the one with the lowest cost. The optimizer estimates costs in these ways:

The cost of a nested loops operation is based on the cost of reading each selected row of the outer table and each of its matching rows of the inner table into memory. The optimizer estimates these costs using the statistics in the data dictionary.
The cost of a sort-merge join is based largely on the cost of reading all the sources into memory and sorting them.
The optimizer also considers other factors when determining the cost of each operation. For example:

With the cost-based approach, the optimizer's choice of join orders can be overridden with the ORDERED hint. If the ORDERED hint specifies a join order that violates the rule for outer join, the optimizer ignores the hint and chooses the order. You can also override the optimizer's choice of join operations with hints. For information on using hints, see Chapter 7, "Optimization Modes and Hints".

Optimizing Star Queries

The Star Query Environment

One type of data warehouse design centers around what is known as a "star" schema. These schemas are characterized by the existence of one or more very large "fact" tables and a number of much smaller "dimension" tables. A typical fact table contains "keys" and "measures." For example, a simple fact table might contain the measure Sales, and keys Time, Product, and Market. In this case there would be corresponding dimension tables for Time, Product, and Market. The Product dimension table, for example, would typically contain information about each product number which appears in the fact table.

A "star join" refers to a primary key to foreign key join of the dimension tables to a fact table. The fact table normally has a concatenated index on the key columns to facilitate this type of join.

The Oracle cost-based optimizer recognizes such star queries and generates efficient execution plans for them.

Tuning Star Queries

This section discusses star query tuning, with reference to the following example:

SELECT SUM(dollars) FROM facts, time, product, market
WHERE market.stat = 'New York' AND
product.brand = 'MyBrand' AND
time.year = 1995 AND time.month = 'March' AND
/* Joins*/
time.key = facts.tkey AND
product.pkey = facts.pkey AND
market.mkey = facts.mkey;

A star query is thus a join between a very large table and a number of much smaller "lookup" tables. Each lookup table is joined to the large table using a primary key to foreign key join, but the small tables are not joined to each other.

CREATE VIEW prodview AS SELECT /*+ NO_MERGE */ *

FROM brands, mfgrs WHERE brands.mfkey = mfgrs.mfkey;

This hint will both reduce the optimizer's search space, and cause caching of the result of the view.

Hints

Usually, if you analyze the tables the optimizer will choose an efficient star plan. You can also use hints to improve the plan. The most precise method is to order the tables in the FROM clause in the order of the keys in the index, with the large table last. Then use the following hints:

/*+ ORDERED USE_NL(facts) INDEX(facts fact_concat) */

A more general method is to use the STAR hint /*+ STAR */.

Optimizing Compound Queries

To choose the execution plan for a compound query, the optimizer chooses an execution plan for each of its component queries and then combines the resulting row sources with the union, intersection, or minus operation, depending on the set operator used in the compound query.

The following figure shows the execution plan for this statement, which uses the UNION ALL operator to select all occurrences of all parts in either the ORDERS1 table or the ORDERS2 table:

SELECT part FROM orders1

UNION ALL

SELECT part FROM orders2;

Compound Query with UNION ALL Set Operator

To execute this statement, Oracle performs these steps:

Steps 3 and 4 perform full table scans on the ORDERS1 and ORDERS2 tables.
Step 2 perform a UNION ALL operation returning all rows that are returned by either Step 3 or Step 4 including all copies of duplicates.
Step 1 performs an internal operation on the result of Step 2.

The following figure shows the execution plan for this statement, which uses the UNION operator to select all parts that appear in either the ORDERS1 table or the ORDERS2 table:

SELECT part FROM orders1

UNION

SELECT part FROM orders2;

Compound Query with UNION Set Operator

This execution plan is identical to the one for the UNION ALL operator shown in Figure A-11, except that in this case Oracle uses the SORT operation to eliminate the duplicates returned by the UNION ALL operation.

Figure A-13 shows the execution plan for this statement, which uses the INTERSECT operator to select only those parts that appear in both the ORDERS1 and ORDERS2 tables:

SELECT part FROM orders1

INTERSECT

SELECT part FROM orders2;

Compound Query with INTERSECT Set Operator

To execute this statement, Oracle performs these steps:

Steps 4 and 6 perform full table scans of the ORDERS1 and ORDERS2 tables.
Steps 3 and 5 sort the results of Steps 4 and 6, eliminating duplicates in each row source.
Step 2 performs an INTERSECTION operation that returns only rows that are returned by both Steps 3 and 5.
Step 1 performs an internal operation on the result of Step 2.

Using Histograms

Height-Balanced Histograms

Oracle's histograms are height balanced (as opposed to width balanced). Width-balanced histograms divide the data into a fixed number of equal-width ranges and then count the number of values falling into each range. Height-balanced histograms place the same number of values into each range so that the endpoints of the range are determined by how many values are in that range.

For example, suppose that the values in a single column of a 1000-row table range between 1 and 100, and suppose that you want a 10-bucket histogram (ranges in a histogram are often referred to as "buckets"). In a width-balanced histogram, the buckets would be of equal width (1-10, 11-20, 21,-30, etc.) and each bucket would count the number of rows that fall into that range. In a height-balanced histogram, each bucket has the same height (in this case 100 rows), and then the endpoints for the buckets would be determined by the density of the distinct values in the column.

Advantages of Height-Balanced Histograms

The advantage of the height-balanced approach is clear when the data in the above example is highly skewed. Supposed that 800 rows of the example table had the value 5, and the remaining 200 rows are evenly distributed between 1 and 100. The width-balanced histogram would have 820 rows in the bucket labeled "1-10", and approximately 20 rows in the other buckets. The height-balanced histogram would have one bucket labeled "1-5", seven buckets labeled "5-5", one bucket labeled "5-50", and one bucket labeled "50-100".

If you wanted to know how many rows in the table contained the value "5", it is apparent from the height-balanced histogram that approximately 80% of the rows contain this value. However, the width-balanced histogram does not provide a mechanism for differentiating between the value "5" and the value "6". You would compute only 8% of the rows contain the value "5" in a width-balanced histogram. Thus, height-balanced histograms are more appropriate for determining the selectivity of column values.

Prev Next

Library

Product

Contents

Index

comp_oper	Is any of the comparison operators =, !=, ^=, <, <>, <, >, <=, or >=.
constant	Is any constant expression involving operators, SQL functions, literals, bind variables, and correlation variables.