Wednesday, December 26, 2018

Searching strings using SQL

searching strings using sql locate_in_string

In a previous post I had written about using regular expressions to search for data within a string, but I did not mention the easier to use LOCATE_IN_STRING SQL scalar function. I believe this was introduced as part of IBM i Technical Refresh 2 for 7.3 and TR6 for 7.2.

The LOCATE_IN_STRING has four parameters:

  1. The source string to be searched (mandatory)
  2. The string used to search (mandatory)
  3. Starting position (optional)
  4. Instance, which number occurrence to find (optional)

I created a SQL DDL table, which I called TESTFILE, which I will be using in these examples. It contains one column, called DESCRIPTION, that is 40 long character, and the table contains just one row:

SELECT DESCRIPTION FROM MYLIB.TESTFILE

----+----1----+----2----+----3----+----4
RPGPGM.COM is best website for IBM i

Let me start with the simplest statement, just with the source string, the column DESCRIPTION, and the search pattern, lower case "r":

SELECT LOCATE_IN_STRING(DESCRIPTION,'r') FROM MYLIB.TESTFILE

Result = 30

30 was returned as the result as LOCATE_IN_STRING is case sensitive. Even though there is an upper case "R" in the first position it was ignored.

If I wanted to search for the first occurrence of a character, regardless of case, I would need to convert all characters to either upper or lower case, personally I prefer converting to upper. This can easily be achieved by using the UPPER SQL scalar function. I would just need to perform the UPPER before searching the string:

SELECT LOCATE_IN_STRING(UPPER(DESCRIPTION),'R')
  FROM MYLIB.TESTFILE

Result = 1

No surprise with that result as the column's values starts with "R", but what about the second "r", that is lower case.

I can do this two ways. In this fist example you will notice that I have used the third parameter, which is used to say where to start the search. By starting the search in the second position I have bypassed the "R" in the first position.

SELECT LOCATE_IN_STRING(UPPER(DESCRIPTION),'R',2) 
  FROM MYLIB.TESTFILE

Result = 30

Or I can use the fourth parameter, the occurrence of the search pattern. I have to give the third parameter if I use the fourth, therefore, my statement below says starting in the first position of the search string look for the second occurrence of "R".

SELECT LOCATE_IN_STRING(UPPER(DESCRIPTION),'R',1,2)
  FROM MYLIB.TESTFILE

Result = 30

To search from the end of the search string to the beginning all I have to do is to enter a negative number for the starting position to start the search. If I wanted to search for last occurrence of "R" I could simply just use:

SELECT LOCATE_IN_STRING(UPPER(DESCRIPTION),'R',-1)
  FROM MYLIB.TESTFILE

Result = 30

If I wanted to find the second from last occurrence I would use the occurrence parameter:

SELECT LOCATE_IN_STRING(UPPER(DESCRIPTION),'R',-1,2) 
  FROM MYLIB.TESTFILE

Result = 1

This finds the "R" at the start of the search string.

In the real world I would be performing these searches in, let's say a RPG program, where I could use a variable for the search string, second parameter.

I need to give you more examples of programs using Main procedures as it is more efficient than using a cyclical program that I need to exit with either *INLR on or with a RETURN operation.

01  **free
02  ctl-opt main(Main) option(*nodebugio:*srcstmt) ;

03  dcl-proc Main ;
04    dcl-s FixedString char(10) ;
05    dcl-s SearchString varchar(10) ;
06    dcl-s Position int(5) ;

07    exec sql SET OPTION COMMIT = *NONE, CLOSQLCSR = *ENDMOD ;

Line 1: Why would I use anything else except totally free RPG.

Line 2: As I am using a Main procedure I need to use the MAIN keyword in the control option. I also have my favorite control options. As I am not passing any parameters to this program I do not need a procedure prototype or interface.

Line 3: Start of the Main procedure.

Lines 4 – 6: I decided to define these variables as local variables, which are only available within this procedure. Notice that on line 5 I have declared a fixed sized variable, and on line 6 one that is variable in length.

Line 7: I always add these SQL options to my programs to make sure they are not forgotten when the program is compiled.

In the first Select statement I used the fixed width variable:

08    FixedString = 'i' ;
09    exec sql SELECT LOCATE_IN_STRING(DESCRIPTION,:FixedString)
                      INTO :Position
                 FROM TESTFILE ;
10    dsply ('1. Position = ' + %char(Position)) ;

Line 8: I am going to be searching for the character "i".

Line 9: This statement looks similar to the ones I used before, except I have used the variable FixedWidth in the search string, and the INTO which places the result into the variable Position.

Line 10: I display the result using the Display operation code.

What is displayed?

DSPLY  1. Position = 0

Why zero? The variable is fixed width, therefore, the search pattern string is not just "i", it is "i" followed by nine spaces. Which does not occur in the search string.

This is why I am using a variable length character variable to contain the search pattern in the other statements.

11    SearchString = 'i' ;

12    exec sql SELECT LOCATE_IN_STRING(DESCRIPTION,
                                       :SearchString)
                      INTO :Position
                 FROM TESTFILE ;
13    dsply ('2. Position = ' + %char(Position)) ;


DSPLY  2. Position = 12

Line 11: Moving "i" to the variable length field guarantees when it is used it will contain just "i".

Line 12: The minimum search criteria, no starting position and no occurrence values.

And as I expect the first "i" is found in the twelfth position.

14    exec sql SELECT LOCATE_IN_STRING(DESCRIPTION,
                                       :SearchString,1,2)
                      INTO :Position
                 FROM TESTFILE ;
15    dsply ('3. Position = ' + %char(Position)) ;


DSPLY  3. Position = 24

Line 14: Looking for the second occurrence of "i".

And it is found in the 24th position.

16    exec sql SELECT LOCATE_IN_STRING(DESCRIPTION,
                                       :SearchString,1,3)
                     INTO :Position
                 FROM TESTFILE ;
17    dsply ('4. Position = ' + %char(Position)) ;


DSPLY  4. Position = 36

Line 16: Looking for the third occurrence of "i".

It is found in the 36th position.

What about searching the other way, in reverse.

18    exec sql SELECT LOCATE_IN_STRING(DESCRIPTION,
                                       :SearchString,-1,3)
                      INTO :Position
                 FROM TESTFILE ;
19    dsply ('5. Position = ' + %char(Position)) ;
20  end-proc ;


DSPLY  5. Position = 12

Line 18: As the start position is -1 the search starts at the end of the search string. I am looking for the third occurrence of "i", which will give me the same result as searching for the first occurrence from the start.

Line 19: The Main procedure ends.

As expected I get a result of twelve.

 

You can learn more about the LOCATE_IN_STRING scalar function from the IBM website here.

 

This article was written for IBM i 7.3 TR2 and 7.2 TR6.

No comments:

Post a Comment

To prevent "comment spam" all comments are moderated.
Learn about this website's comments policy here.

Some people have reported that they cannot post a comment using certain computers and browsers. If this is you feel free to use the Contact Form to send me the comment and I will post it for you, please include the title of the post so I know which one to post the comment to.