| Oracle Context Option Application Developer's Guide | Library |
Product |
Contents |
Index |
The following topics are covered in this chapter:
The term document, however, has two specific and distinct meanings:
For more information about text storage in ConText Option, see Oracle ConText Option Administrator's Guide.
Documents in a text column can consist of plain text (i.e., ASCII) or formatted text (e.g., Microsoft Word or WordPerfect). In addition, each document in a text column can be in a different format.
The pointer can be:
If changes to a document require the text index to be updated, ConText Option has no way of notifying the DML Queue that the text index needs to be updated. Notification of the DML Queue can be accomplished through two methods:
This section discusses:
For more information about creating text indexes for columns, see Oracle ConText Option Administrator's Guide.
To retrieve relevant documents, a text query must accomplish three tasks:
The three tasks required to retrieve documents can be accomplished using a two-step query, a one-step query, or an in-memory query. All three methods produce exactly the same results. You choose a method depending on the needs of the application.
In addition, ConText Option allows you to return the number of hits for a query in place of the actual hitlist. This can be useful for queries that produce very long hitlists.
Theme queries work similarly to text queries in that you must create an index (theme) for the documents before you can query. Theme queries differ from text queries in that you need not provide the word patterns for the search. ConText Option interprets your query conceptually according to its view of the world and returns an appropriate document hitlist based on theme, along with a measure of how relevant each document is to the query.
You can use the standard query methods to perform theme queries, namely one-step, two-step, and in-memory. In a theme query, you can use most of the operators you use in regular text queries.
For more information about theme queries, see "Theme Queries (Chapter 4)."
For more information about creating theme indexes for columns, see Oracle ConText Option Administrator's Guide.
In addition, ConText Option provides a method for counting query hits without performing an actual query.
The second step uses a SELECT statement to select the results from the result table. In addition, the hitlist table can be joined with the original table to return more detailed document information. In the two-step method, the physical hitlist table is available to the application program.
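The two-step flow described above can be sketched with a toy model. This is a minimal illustration built on SQLite from the Python standard library, not the ConText Option API: the `contains` function, the `docs` and `hitlist` tables, and the occurrence-count score are all assumptions made for the example.

```python
import sqlite3


def contains(conn, term, result_table):
    """Step 1 (hypothetical stand-in): write the textkey and score of each
    matching document into a hitlist table allocated by the caller."""
    rows = conn.execute("SELECT pk, text FROM docs").fetchall()
    for pk, text in rows:
        score = text.lower().split().count(term.lower())
        if score > 0:
            conn.execute(
                f"INSERT INTO {result_table} (textkey, score) VALUES (?, ?)",
                (pk, score),
            )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (pk INTEGER PRIMARY KEY, title TEXT, text TEXT)")
conn.executemany(
    "INSERT INTO docs VALUES (?, ?, ?)",
    [
        (1, "Refining", "oil and oil products"),
        (2, "Gardening", "roses and tulips"),
        (3, "Drilling", "offshore oil rigs"),
    ],
)

# The application allocates the hitlist table before running the query.
conn.execute("CREATE TABLE hitlist (textkey INTEGER, score INTEGER)")

# Step 1: run the query and populate the hitlist table.
contains(conn, "oil", "hitlist")

# Step 2: select from the hitlist, joined to the base table for details.
results = conn.execute(
    "SELECT d.title, h.score FROM hitlist h "
    "JOIN docs d ON d.pk = h.textkey "
    "ORDER BY h.score DESC, d.pk"
).fetchall()
print(results)
```

Because the hitlist is an ordinary table in this sketch, the application can query it again, join it to other tables, or keep it for later use, which is the property the two-step method provides.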
The hitlist is processed by ConText Option using internal result tables. As a result, you do not have to create result tables before running a one-step query; however, the internal result tables are not available to the application program.
In an in-memory query, you open a cursor to the query buffer and run a query. ConText Option writes the results of the query to the buffer. You fetch the results, then close the cursor.
Results can be returned in order of their textkeys or sorted by score.
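The open, fetch, and close sequence of an in-memory query can be modelled as follows. This is a conceptual sketch only: the `InMemoryQuery` class, its method names, and the occurrence-count score are hypothetical and do not represent the ConText Option interface.

```python
class InMemoryQuery:
    """Toy in-memory query: hits are held in a buffer and fetched one at a time."""

    def __init__(self, docs):
        self.docs = docs      # maps textkey -> document text
        self.buffer = None    # no query is open yet

    def open(self, term, by_score=True):
        """Run the query and write the hits to the in-memory buffer."""
        hits = []
        for key, text in self.docs.items():
            count = text.lower().split().count(term.lower())
            if count > 0:
                hits.append((key, count))
        if by_score:
            # Highest score first; ties are broken by textkey.
            hits.sort(key=lambda h: (-h[1], h[0]))
        else:
            # Order by textkey.
            hits.sort(key=lambda h: h[0])
        self.buffer = iter(hits)

    def fetch_hit(self):
        """Return the next (textkey, score) pair, or None when exhausted."""
        return next(self.buffer, None)

    def close(self):
        """Release the buffer."""
        self.buffer = None


docs = {10: "oil rigs", 20: "oil and more oil", 30: "tulips"}
q = InMemoryQuery(docs)
q.open("oil", by_score=True)

fetched = []
hit = q.fetch_hit()
while hit is not None:
    fetched.append(hit)
    hit = q.fetch_hit()
q.close()
print(fetched)
```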
Counting query hits can be performed in two modes: estimate and exact. The modes are based on the method ConText Option uses to record deleted documents in a text index.
In exact mode, hits are returned only for those documents that satisfy the conditions of the query expression and are currently in the text column of the table.
In estimate mode, hits may be included for documents that satisfy the query condition, but have been deleted from the text column or have been updated so that they no longer satisfy the query expression. This can occur when the text index for the column has not been optimized and the internal document IDs are still present in the index.
In general, the inaccuracy of the results returned by COUNT_HITS in estimate mode is proportional to the amount of DML that has been performed on a text column.
Note: If the index being queried has been optimized and no further DML has been performed on the text column, estimate mode will return accurate results.
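The difference between the two modes can be illustrated with a small model. The sketch below assumes a simplified index that maps each term to a set of internal document IDs and covers only deleted documents; the `count_hits` and `optimize` functions are hypothetical helpers written for this example and are not the ConText Option procedures.

```python
def count_hits(index, live_docs, term, exact):
    """Count hits for a term.

    Estimate mode counts every document ID recorded in the index.
    Exact mode counts only IDs that are still present in the text column.
    """
    doc_ids = index.get(term, set())
    if exact:
        return len(doc_ids & live_docs)
    return len(doc_ids)


def optimize(index, live_docs):
    """Remove IDs of deleted documents from every index entry."""
    return {term: ids & live_docs for term, ids in index.items()}


# Documents 2 and 4 were deleted after indexing; the index still lists them.
index = {"oil": {1, 2, 3, 4}}
live_docs = {1, 3, 5}

estimate_count = count_hits(index, live_docs, "oil", exact=False)
exact_count = count_hits(index, live_docs, "oil", exact=True)
print(estimate_count, exact_count)

# After optimization, with no further DML, estimate mode matches exact mode.
optimized = optimize(index, live_docs)
optimized_estimate = count_hits(optimized, live_docs, "oil", exact=False)
print(optimized_estimate)
```

In this model, each deletion leaves one more stale ID in the index, so the overcount in estimate mode grows with the amount of DML performed since the last optimization.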
For more information about text indexing, DML, and optimization, see Oracle ConText Option Administrator's Guide.
The most basic kind of query expression is single words or phrases that return documents with a score based on the number of occurrences of the words or phrases. More complex expressions allow the user to weight certain terms, search for words that sound like each other, and find all of the words based on a particular root.
ConText Option provides a rich vocabulary of operators and special characters that can be used to create highly sophisticated query expressions that meet many complex user needs.
For more information about query expressions, see "Understanding Query Expressions (Chapter 3)."
You can combine queries by referencing an SQE within the query expression of another query. Using an SQE in a query results in faster execution of the query because the results are already stored in the database.
Stored query expressions can also be used to perform interactive queries, in which an initial query is refined using one or more additional queries.
The table owned by CTXSYS is an internal table which stores the SQE definitions for all the SQEs that have been created for all existing policies. It cannot be accessed directly, but can be viewed through two views, CTX_SQES (users with CTXADMIN role) and CTX_USER_SQES (users with CTXAPP and CTXADMIN roles).
The table used to store the results of an SQE for a text column (the SQR table) is part of the text index for the column and is created automatically by ConText Option during the initial text indexing of the column; however, the SQR table is populated only when an SQE is created and stored, and is updated only when an SQE is re-evaluated.
The tablespace, storage clause, and other parameters used to create the SQR table are specified by the Engine preference in the policy for the text column of the SQE.
Note: Similar to the other ConText index tables, the SQR table is an internal table that is accessed only by ConText Option when an SQE is processed in a query.
For more information about policies, preferences, text indexing, and the structure of the SQE tables and views, see Oracle ConText Option Administrator's Guide.
You can use session SQEs only in the current session. These SQEs are stored only for the duration of the session. When a session is terminated, all session SQEs created during the session are deleted from the SQE tables. If you want to use a session SQE in another session, you must recreate the SQE.
System SQEs can be used in all sessions, including concurrent sessions. When a session is terminated, system SQEs created during the session are not deleted from the SQE tables and can be used in future sessions.
SQEs also support all of the special characters and other components that can be used in a query expression, including PL/SQL functions and other SQEs.
For example, an SQE could be created (SQE 1) that stores the results of a query for term A and term B. Then, a second SQE could be created (SQE 2) that stores the results of a query for SQE 1 and term C. Finally, SQE 2 could be called in a query to return all of the documents that contain term A, term B, and term C.
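The nesting in this example can be sketched with sets of document IDs. This is a toy model under stated assumptions: the `index` contents, the `store_sqe` helper, and the use of a plain AND between operands are all hypothetical and do not reflect how ConText Option stores SQE results.

```python
# Toy index: each term maps to the set of documents that contain it.
index = {
    "A": {1, 2, 3, 5},
    "B": {2, 3, 5, 8},
    "C": {3, 4, 5},
}

# Stored results, keyed by SQE name.
sqe_results = {}


def store_sqe(name, operands):
    """Store the AND of the operands under the given name.

    Each operand is either a term in the index or the name of a
    previously stored SQE, whose saved results are reused as-is.
    """
    sets = [sqe_results[o] if o in sqe_results else index[o] for o in operands]
    sqe_results[name] = set.intersection(*sets)


store_sqe("sqe1", ["A", "B"])       # documents containing A and B
store_sqe("sqe2", ["sqe1", "C"])    # reuses the stored results of sqe1

print(sorted(sqe_results["sqe1"]))
print(sorted(sqe_results["sqe2"]))
```

The second call never looks up terms A and B again; it starts from the stored results of the first SQE, which is why referencing an SQE can be faster than re-running the full expression.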
ConText Option also verifies that any SQEs nested within an SQE have up-to-date results.
Note: ConText Option does not verify whether PL/SQL functions in SQEs have been updated. If a PL/SQL function in an SQE has been updated, the SQE must be manually re-evaluated.
Result lists in SQE tables may become fragmented by consecutive re-evaluations. You can resolve fragmentation by calling CTX_QUERY.REFRESH_SQE.
In a two-step query, the hitlist is created explicitly and returned to the user as a result table that must have been allocated by the application program.
In a one-step query, the hitlist is generated and processed internally by ConText Option. The results of the query, including the generated scores, are returned to the user as a record set of selected documents; the hitlist is not available as a separate table.
In an in-memory query, the hitlist is stored in memory and is returned through a loop that fetches the individual hits from memory.
For example, a document that contains the search expression 10 times is considered more relevant than one that only contains the expression 5 times.
In basic queries, the score is calculated as the number of times a chosen search word appears in the document, and the score can be used to order the hitlist so that the highest scoring documents appear first. In more complex queries, the score is affected by various relationships between words and phrases; weights applied to various elements of the search expression also affect the score by giving more or less emphasis to the occurrence of those terms within the document.
Scores are generated by the general purpose text engine during queries (text or theme). The engine calculates a relevance score for each cell in the text column that meets the search criteria. The upper bound of the score value is 100, and each row meeting the criteria is assigned a score between 1 and 100.
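The basic scoring behaviour described above can be sketched as follows. The text states only that a basic score reflects the number of occurrences and that scores fall between 1 and 100; clamping the raw count at 100 is an assumption made for this example, and `basic_score` is a hypothetical helper, not the engine's actual calculation.

```python
def basic_score(text, term, upper_bound=100):
    """Score a document by counting occurrences of the term.

    Assumption: counts above the upper bound are clamped to it.
    """
    count = text.lower().split().count(term.lower())
    return min(count, upper_bound)


docs = {
    "a": "oil " * 10,          # 10 occurrences
    "b": "oil " * 5,           # 5 occurrences
    "c": "tulips and roses",   # no occurrences, so not a hit
    "d": "oil " * 150,         # more occurrences than the upper bound
}

scores = {key: basic_score(text, "oil") for key, text in docs.items()}

# Keep only documents that meet the criteria, highest score first.
hitlist = sorted(
    ((key, score) for key, score in scores.items() if score > 0),
    key=lambda h: (-h[1], h[0]),
)
print(hitlist)
```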
In two-step queries, the score is generated by the CONTAINS procedure and stored in a result table called the hitlist table.
In one-step queries, the score is generated internally by the CONTAINS function and returned by the SCORE function.
In in-memory queries, the score is one of the output arguments specified when running the query and is returned when the hits are retrieved.
Result tables are conceptually distinguished from normal database tables in that they have specific meaning only when applied to specific ConText Option functions, specifically in the following two situations:
You can create result tables using the SQL command CREATE TABLE or using functions provided in the CTX_QUERY PL/SQL package.
For more information about the structure of result tables, see "Result Tables (Chapter 12)".
For more information about CTX_QUERY, see "PL/SQL Packages (Chapter 11)".
Copyright © 1996 Oracle Corporation. All Rights Reserved. |