Searching on CLOB columns

Post date: 15-Jul-2010 11:46:02

I need to search in a text column larger than 4KB. So, I changed the type of text column from VARCHAR2 to CLOB. But I found that I can't search in CLOB columns. Is thare any way to do this?

CLOBs require that you use the DBMS_LOB package to perform substr/instr type searches. In order to use do any kind of searching within the column, you must first get the locator for the CLOB column, the use the DBMS_LOB.SUBSTR or DBMS_LOB.INSTR function to search and/or pull part of the text from the string.

For example:

declare     src_lob        CLOB;     pos            INTEGER := 1;     buf            VARCHAR2(32000); BEGIN     SELECT c_lob       INTO src_lob       FROM lob_table      WHERE key_value = 21;     buf := DBMS_LOB.SUBSTR(src_lob, 200, pos);     -- and so on.... END; 

This code would retrieve the first 200 characters from the CLOB. INSTR works in a similar way. Take a look at the Oracle8i Supplied PL/SQL Packages Reference for more details.

Let's expand the concept with another example and review the details: First, we assume we have a table named pet_table defined as:

CREATE TABLE pet_table (pet_name   varchar2(30) PRIMARY KEY,  pet_info   CLOB,  pet_breed  varchar2(30),  pet_owner  varchar2(50) ); 

What we want to be able to do is to read the pet_table for a given pet_name (the key) and retrieve from the CLOB column (pet_info) the first 200 characters. For example purposes, we'll assume that the first 200 characters always contains the pet's pedigree information.

Any LOB type column (CLOB, BLOB, NCLOB) cannot be retrieved and searched directly as LOB data columns simply store a pointer that must be used to obtain the actual column data. So, our next step is to SELECT from the pet_table and retrieve the row for the pet and retrieve the lob pointer (also called the lob locator). Then using that locator, we will execute the DBMS_LOB.SUBSTR function to pull back the first 200 characters of the actual lob data (stored at the location pointed to by the locator). DBMS_LOB.SUBSTR takes 3 parameters: 1) the lob locator pointing to the CLOB data value to retrieve 2) an integer indicating the amount of data to retrieve from the CLOB and 3) the starting position where SUBSTR should begin.

Our PL/SQL block can then be written as follows:

DECLARE     src_lob_locator  CLOB;     pedigree_info    VARCHAR2(200); BEGIN     SELECT pet_info       INTO src_lob_locator       FROM pet_table      WHERE pet_name = 'FIDO' ;      pedigree_info := DBMS_LOB.SUBSTR(src_lob_locator, 200, 1);      /* Put any code that you need to process the returned information here */  END; 

The main thing to remember with LOB columns is that you CANNOT retrieve the CLOB column directly like you would a regular VARCHAR2 column. In other words, if you tried to execute:

SELECT substr(pet_info, 1, 200)   FROM pet_table  WHERE pet_name = 'FIDO';

This would fail because you must first obtain the lob locator and then use the DBMS_LOB package functions and procedures to examine the contents as shown in the PL/SQL block.

INSTR works in a similar way as SUBSTR in that it needs to have a lob locator, except that using INSTR, you can find a specific pattern within the CLOB instead of just pulling back a specified amount of characters. The Oracle8i Supplied PL/SQL Packages Reference under DBMS_LOB has details and examples on using INSTR.

 

 

This entry falls under the "Something New I Learned Today" heading.

I needed to obtain a list of active users who had a particular string within a CLOB field in an auxillary table associated back to their account via a foreign key. You can't do "LIKE" statements against a CLOB, and we don't have Intermedia operational on the boxes where the query needed to run so I couldn't run a "contains" against an index. I say "boxes" because we have two distinct user databases -- one running Oracle 9i and another on 8i. So whatever solution I came up with had to play in both environments. Our DBA mentioned there was a package resident in both called DBMS_LOB which exposed methods for working with LOBs. I saw that one of these was InStr() and hoped that it would allow me to specify the substring I was concerned with, and perhaps return the index where the substring started within the CLOB.

Sometimes there IS reason in the universe. I was able to do this using the following PL/SQL (names have been changed to protect the innocent):

select u.user_uid, u.login_text, p.profile_name_text

  from profile p, user u

 where u.acct_closed_ind = 0

   and u.user_uid = p.user_uid

   and (

dbms_lob.instr(clob_field, 'first string to find',1,1) > 0 or

        dbms_lob.instr(clob_field, 'second string to find',1,1) > 0

);

Obviously you could stick as many substrings in there as you want using additional ORs. The DBMS_LOB package has all kinds of interesting functions for dealing LOBs. We could have even run a script to update/remove the offending strings automatically. But we opted for safety and will instead have the Customer Support team call the users so they can be educated.

Ref: http://www.developerdotstar.com/community/node/68