Oracle Context Option Application Developer's Guide

CHAPTER 7. Using the Linguistic Services


This chapter explains how to use the Linguistic Services to generate linguistic output for English text. It also provides some tips and suggestions for building linguistically-enhanced text applications.

The topics covered in this chapter are:

Specifying Settings and Error Handling

The CTX_LING package can be used to perform the following tasks for the Linguistic Services:

Specify Setting Configurations

To specify a setting configuration for a session, use the CTX_LING.SET_SETTINGS_LABEL procedure.

For example:

	execute ctx_ling.set_settings_label('P')

You can specify one of the predefined setting configurations provided with ConText Option or a custom setting configuration that you create using the administration tool.

The specified setting configuration is active for the session until SET_SETTINGS_LABEL is called with a new setting configuration label.

You can use the CTX_LING.GET_SETTINGS_LABEL function to return the label for the active setting configuration for the current session.
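As a minimal sketch, the set and get calls can be combined in one SQL*Plus block to confirm which setting configuration is active. It assumes that GET_SETTINGS_LABEL takes no arguments and returns a character value that fits in the local variable; neither detail is stated in this chapter.

```sql
set serveroutput on
declare
   -- Assumption: the label is a short character string.
   label varchar2(30);
begin
   -- Activate the 'P' setting configuration for this session.
   ctx_ling.set_settings_label('P');
   -- Read back the label that is now active.
   label := ctx_ling.get_settings_label;
   dbms_output.put_line('Active setting configuration: ' || label);
end;
/
```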

Enable Parse Logging

At startup of a ConText server, parse information logging is disabled.

To enable logging of the parse information generated by the Linguistic Services during a session, use the CTX_LING.SET_LOG_PARSE procedure.

For example:

	execute ctx_ling.set_log_parse('TRUE')

Once you enable parse logging for a session, it is active until you explicitly disable it during the session.

Attention: Parse logging is a useful feature if you are having difficulty generating linguistic output and you want to monitor how the Linguistic Services are parsing your documents; however, parse logging may affect performance considerably. As such, you should only enable parse logging if you encounter problems with the Linguistic Services.

You can use the CTX_LING.GET_LOG_PARSE function to return a value (TRUE or FALSE) which indicates whether parse logging is enabled or disabled for the session.
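The following sketch shows one way to switch parse logging on only for the duration of a diagnostic run. It assumes that passing 'FALSE' to SET_LOG_PARSE disables logging and that GET_LOG_PARSE returns its TRUE or FALSE value as a character string; if it returns a BOOLEAN instead, the concatenation would need to be replaced with an IF test.

```sql
set serveroutput on
begin
   -- Turn parse logging on for this session.
   ctx_ling.set_log_parse('TRUE');
   dbms_output.put_line('Parse logging: ' || ctx_ling.get_log_parse);

   -- Submit the Linguistic Services requests you want to diagnose here.

   -- Assumption: 'FALSE' is the value that disables logging again.
   ctx_ling.set_log_parse('FALSE');
   dbms_output.put_line('Parse logging: ' || ctx_ling.get_log_parse);
end;
/
```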

Specify Completion and Error Procedures

To specify a processing routine, usually a procedure, to be called when a Linguistic Services request completes or fails with an error, use the SET_COMPLETION_CALLBACK and SET_ERROR_CALLBACK procedures in CTX_LING. The routine must already be defined before you specify it as a completion or error callback.

The following example of a completion callback procedure is taken from genling.sql for the ctxling demonstration provided with the ConText Option distribution package.

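   create or replace procedure LING_COMP_CALLBACK (
     p_handle in number,
     p_status in varchar2,
     p_errors in varchar2
   ) IS
     l_total number;
     l_pk    varchar2(64);
   begin
     -- Procedure body not reproduced here; see genling.sql.
     null;
   end;
   /

   begin
     -- Register LING_COMP_CALLBACK as the completion callback.
     ctx_ling.set_completion_callback('LING_COMP_CALLBACK');
   end;
   /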

This example illustrates the creation of a procedure named LING_COMP_CALLBACK. It also illustrates the call to CTX_LING.SET_COMPLETION_CALLBACK which specifies LING_COMP_CALLBACK as the completion callback procedure.
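An error callback can be set up in the same way. The sketch below is illustrative only: LING_ERR_CALLBACK and the LING_ERRORS table are hypothetical names, and the parameter list is assumed to mirror that of the completion callback shown above. Whether a callback should commit its own work is not covered in this chapter, so the sketch does not commit.

```sql
-- Hypothetical table for recording failed requests.
create table ling_errors (
    handle     number,
    status     varchar2(30),
    errors     varchar2(2000),
    logged_on  date);

-- Assumption: error callbacks receive the same three parameters
-- as the completion callback.
create or replace procedure LING_ERR_CALLBACK (
  p_handle in number,
  p_status in varchar2,
  p_errors in varchar2
) is
begin
  -- Record the failed request so it can be reviewed later.
  insert into ling_errors (handle, status, errors, logged_on)
  values (p_handle, substr(p_status, 1, 30),
          substr(p_errors, 1, 2000), sysdate);
end;
/

-- Register the procedure as the error callback for the session.
execute ctx_ling.set_error_callback('LING_ERR_CALLBACK')
```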

Generating Linguistic Output

Before theme and Gist information can be used in an application, you must perform the following tasks:

Creating Output Tables

To create a theme table called CTX_THEMES, issue the following SQL:

    create table ctx_themes (
        cid        number,
        pk         varchar2(64),
        theme      varchar2(256),
        weight     number);

To create a Gist table called CTX_GIST, issue the following SQL:

    create table ctx_gist (
        cid        number,
        pk         varchar2(64),
        pov        varchar2(256),
        gist       long);

Note: Because the combination of the CID (column ID) and PK (primary key) columns in the output tables uniquely identifies each document in a text column, you can use the output tables to store theme and Gist information for multiple text columns. You can also choose to create multiple output tables to store the theme and Gist information separately for each text column.
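Because CID and PK together identify a document, a lookup against a shared output table would normally filter on both columns. The query below is a sketch of that pattern; the values 1027 and 7039 are example values borrowed from elsewhere in this chapter, not values you should expect in your own tables.

```sql
-- List the themes stored for one document, strongest first.
select theme, weight
from   ctx_themes
where  cid = 1027        -- example column ID
  and  pk  = '7039'      -- example primary key of the document
order by weight desc;
```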

For more information about the columns in each table, see "Linguistic Services Output Table Structure" in "Linguistic Specifications (Chapter 13)."

Creating Composite Textkey Output Tables

To create a theme table whose textkey has two columns, issue the following SQL statement:

    create table ctx_themes (
        cid        number,
        pk1        varchar2(64),
        pk2        varchar2(64),
        theme      varchar2(256),
        weight     number);

To create a Gist table whose textkey has two columns, issue the following SQL statement:

    create table ctx_gist (
        cid        number,
        pk1        varchar2(64),
        pk2        varchar2(64),
        pov        varchar2(256),
        gist       long);

For more information about composite textkey theme and Gist tables, see "Linguistic Services Output Table Structure" in "Linguistic Specifications (Chapter 13)."

Requesting the Linguistic Services

To generate linguistic output for the documents in a text column, you first call CTX_LING.REQUEST_THEMES and CTX_LING.REQUEST_GIST for each document in the column, then call CTX_LING.SUBMIT to enter these requests in the Services Queue as a single transaction for that particular document.

Note: A policy must be defined for a column before you can request Linguistic Services for the documents in the column.

The following example shows how you could use the procedures and functions in the CTX_LING package to generate linguistic output:

declare
   handle number;
begin
   ctx_ling.request_themes(
      'CTXSYS.DOC_POLICY',
      '7039',
      'CTXSYS.CTX_THEMES');
   ctx_ling.request_gist(
      'CTXSYS.DOC_POLICY',
      '7039',
      'CTXSYS.CTX_GIST');
   handle := ctx_ling.submit;
end;

The first two calls request theme and Gist output for document 7039 in the text column for the DOC_POLICY policy and instruct the Linguistic Services to store the output in the linguistic output tables (CTX_THEMES and CTX_GIST), which were created in the previous step.

The final API call submits the requests as one batch request to the Services Queue and returns a handle which can be used to monitor the status of the request. Because the two requests are submitted as one batch request, the Linguistic Services parse the document only once while still generating both theme and Gist output.
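To cover every document in a column, the same three calls can be repeated inside a cursor loop. The sketch below assumes a text table CTXSYS.DOC with a primary key column ID, as in the trigger example later in this chapter, and submits one batch request per document. It uses only the calls shown above.

```sql
declare
   handle number;
begin
   -- Assumption: CTXSYS.DOC holds the documents and ID is its primary key.
   for d in (select id from ctxsys.doc) loop
      ctx_ling.request_themes(
         'CTXSYS.DOC_POLICY',
         to_char(d.id),
         'CTXSYS.CTX_THEMES');
      ctx_ling.request_gist(
         'CTXSYS.DOC_POLICY',
         to_char(d.id),
         'CTXSYS.CTX_GIST');
      -- One submit per document, so each document is parsed once.
      handle := ctx_ling.submit;
   end loop;
end;
/
```

The loop keeps only the last handle. If you need to monitor every request, store each handle in a table of your own as it is returned.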

Monitoring the Services Queue

Once the requests are submitted to the Services Queue and a handle is returned, the CTX_SVC package can be used to perform the following tasks:

Monitoring the Status of Requests

To monitor the status of requests in the Services Queue, use the CTX_SVC.REQUEST_STATUS function.

For example:

   set serveroutput on
   declare
      handle number := 3321;   -- example value; use the handle returned by CTX_LING.SUBMIT
      status varchar2(20);
   begin
      status := ctx_svc.request_status(handle);
      dbms_output.put_line(status);
   end;

This example uses the declared variable STATUS to return the status for the Linguistic Services request identified by HANDLE. The value for HANDLE was generated by the call to CTX_LING.SUBMIT which placed the request in the Services Queue.

Removing Pending Requests

To remove requests with a status of PENDING from the Services Queue, use the CTX_SVC.CANCEL procedure.

For example:

    execute ctx_svc.cancel(3321)

In this example, a pending request with handle 3321 is removed from the Services Queue.

If a request has a status of RUNNING, ERROR, or SUCCESS, it cannot be removed from the Services Queue.

Clearing Requests with Errors

To remove requests with a status of ERROR from the Services Queue, use the CTX_SVC.CLEAR_ERROR procedure.

For example:

     execute ctx_svc.clear_error(3321)

In this example, a request with handle 3321 is removed from the Services Queue.

If a value of 0 (zero) is specified for the handle, all requests with a status of ERROR are removed from the queue.

If a request has a status of PENDING, RUNNING, or SUCCESS, it cannot be removed from the queue using CLEAR_ERROR.
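The three CTX_SVC calls can be combined so that a request is removed with whichever procedure matches its current status. This is a sketch only: it assumes that REQUEST_STATUS accepts the handle alone, as in the example above, and that it returns the status names exactly as they are written in this chapter. The handle 3321 is an example value.

```sql
set serveroutput on
declare
   handle number := 3321;   -- example handle from CTX_LING.SUBMIT
   status varchar2(20);
begin
   status := ctx_svc.request_status(handle);
   if status = 'PENDING' then
      -- Only pending requests can be cancelled.
      ctx_svc.cancel(handle);
   elsif status = 'ERROR' then
      -- Only requests in error can be cleared.
      ctx_svc.clear_error(handle);
   else
      -- RUNNING and SUCCESS requests cannot be removed by either call.
      dbms_output.put_line('Request left in queue, status: ' || status);
   end if;
end;
/
```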

Using Linguistic Services in Applications

There is no single approach to creating a text application that utilizes the Linguistic Services; however, the following section provides some questions that you should consider before developing an application. The answers you provide will help you develop the best application to suit your needs.

Do You Want Linguistic Output Updated Whenever a Document is Updated?

When you create a ConText index (text or theme) for a text column in a table, ConText Option automatically generates a database trigger which queues a DML request whenever a DML action is performed on the table (insert, update, or delete a row). The DML request ensures the ConText index for the table is automatically updated to reflect the change.

Because the linguistic output generated by the Linguistic Services is handled through a separate queue, the DML trigger does not update the linguistic output for the table.

If you want linguistic output updated whenever a document is updated, you need to create a trigger on your text table to do this for you.

Note: If the automatic DML notification trigger already exists on the table and you want to create a similar trigger on the table to automatically update linguistic output, the COMPATIBLE initialization parameter must be set to 7.1.0.0 in the initsid.ora file. This ensures that your release of Oracle is compatible with Oracle, release 7.1 and supports multiple triggers of the same type on a table.

For more information about the COMPATIBLE initialization parameter and the initsid.ora file, see Oracle7 Server Administrator's Guide.

For more information about DML in text columns and updates to ConText Option indexes, see Oracle ConText Option Administrator's Guide.

The following example illustrates the trigger you might create:

create or replace trigger doc_ctx 
   after insert or delete or update
   of id, text
   on ctxsys.DOC
   for each row
declare
   handle number;
begin
   if (deleting or updating) then
      delete from ctx_themes
            where pk = :old.id;
      delete from ctx_gist
            where pk = :old.id;
   end if;
   if (inserting or updating) then
      ctx_ling.request_themes (
            'CTXSYS.DOC_POLICY',
            :new.id,
            'CTXSYS.CTX_THEMES'
      );
      ctx_ling.request_gist (
            'CTXSYS.DOC_POLICY',
            :new.id,
            'CTXSYS.CTX_GIST'
      );
      handle := ctx_ling.submit (do_commit => FALSE);
   end if;
end;

In this example, the user CTXSYS has a table named DOC with a text column named TEXT, a primary key column named ID, and a policy named DOC_POLICY.

Are the Output Tables Appropriate for Your Application?

Each application will use the Linguistic Services output differently, so there is no schema design that is optimal for all ConText-enabled applications. For this reason, the Linguistic Services simply store the linguistic output in output tables and it is up to the application to re-map the data to permanent tables as appropriate.

For example, you might want to normalize the theme phrases and use them to perform lookups.
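One possible shape for such a normalized design is sketched below. THEME_LOOKUP, THEME_SEQ, and DOC_THEMES are hypothetical names chosen for illustration, and the sketch assumes that CTX_THEMES has already been populated by the Linguistic Services.

```sql
-- One row per distinct theme phrase.
create table theme_lookup (
    theme_id   number primary key,
    theme      varchar2(256) not null unique);

create sequence theme_seq;

-- Assign an ID to each distinct theme found in the output table.
insert into theme_lookup (theme_id, theme)
   select theme_seq.nextval, t.theme
   from  (select distinct theme
          from   ctx_themes
          where  theme is not null) t;

-- Permanent table that refers to themes by ID instead of by phrase.
create table doc_themes (
    cid        number,
    pk         varchar2(64),
    theme_id   number references theme_lookup,
    weight     number);

insert into doc_themes (cid, pk, theme_id, weight)
   select ct.cid, ct.pk, tl.theme_id, ct.weight
   from   ctx_themes ct, theme_lookup tl
   where  ct.theme = tl.theme;
```

With this layout, a lookup by theme becomes a join on THEME_ID, and each theme phrase is stored only once.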

How are You Going to Query?

If your application is going to combine text/theme queries with queries for the linguistic output generated by the Linguistic Services, consider creating views that join the linguistic output tables with the original text table, thereby hiding the complexity involved in working with the linguistic output tables and the original text table.

If your application uses two-step queries, the view can also incorporate the hitlist results table utilized in two-step text queries.

For example, using a results table called CTX_TEMP, a linguistic output table called CTX_THEMES, and a text table called DOC, with already generated theme output from the Linguistic Services, a view called DOC_VIEW can be created as follows:

create or replace view doc_view as
   select id, author, title, text,  /* from DOC */
          score,                    /* from CTX_TEMP */
          theme, weight             /* from CTX_THEMES */
   from   doc, ctx_temp, ctx_themes
   where  doc.id         = ctx_temp.textkey
     and  doc.id         = ctx_themes.pk
     and  ctx_themes.cid = 1027;

In this example, the policy ID for DOC_POLICY is 1027.

You could then use DOC_VIEW to combine a query for themes with a two-step query:

execute ctx_query.contains('DOC_POLICY','shrimp | pasta','ctx_temp')
select * from doc_view
  where theme = 'drinking and dining'
    and weight > 1000;

This example illustrates a query asking for articles that contain either the word shrimp or the word pasta, are about drinking and dining, and exceed a minimum thematic weight threshold of 1000.

Note: Because the view joins to the hitlist results table, it returns no rows if that table is empty (that is, if you have not executed a text query).

This problem is eliminated if you use the one-step query method instead of the two-step method. The one-step method does not use the CONTAINS procedure or a hitlist results table.
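If theme lookups are also needed independently of any text query, one option is a second view that leaves the hitlist results table out of the join. The sketch below uses the hypothetical name DOC_THEME_VIEW and reuses the example policy ID 1027 from above. It relies only on the DOC and CTX_THEMES tables.

```sql
-- Theme-only view: no dependency on the CTX_TEMP hitlist table.
create or replace view doc_theme_view as
   select doc.id, doc.author, doc.title,   /* from DOC */
          ctx_themes.theme,
          ctx_themes.weight                /* from CTX_THEMES */
   from   doc, ctx_themes
   where  doc.id         = ctx_themes.pk
     and  ctx_themes.cid = 1027;

-- Returns rows even when no text query has been executed.
select id, title, weight
from   doc_theme_view
where  theme = 'drinking and dining'
  and  weight > 1000;
```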




Copyright © 1996 Oracle Corporation.
All Rights Reserved.