Wednesday, December 3, 2014

Replacing text in a string

%scanrpl sql replace

I received a request from the engineers: One solvent they use has been place on the "restricted list" by the state and could no longer be used. Could I create a program to replace the solvent's code in their files with the replacement solvent's code?

It gave me a reason to use the %SCANRPL, 'Scan and replace character', built in function that was introduced in IBM i 7.1, and made me think how this could be accomplished with earlier releases too.

The %SCANRPL requires just three required parameters:

  1. From string
  2. To string
  3. Variable name

I do wonder why they are in this order. Why is Variable name is not the first parameter?

And %SCANRPL has two optional parameters, that I will not be using in this example:

  1. Which position in the variable to start scanning
  2. The scan length, which is how far to scan after the starting position of the scan

The neat thing about using the %SCANRPL is that I can:

  • Replace strings that are longer than the original
  • Replace strings that are shorter than the orginal

In the example I am going to replace the code 'ABC' with 'LMNOP':

01  Var1 = 'WASH IN SOLVENT ABC' ;
02  Var1 = %scanrpl('ABC':'LMNOP':Var1)

After line 2 has executed the variable Var1 contains: WASH IN SOLVENT LMNOP

One thing you have to be aware of when replacing shorter strings with longer ones is that there could be "overflow", in other words the result of the %SCANRPL could be longer than Var1 and the "overflow" characters will be lost.

In the next example I am going to replace a long string with a shorter one.

01  Var1 = 'JOHN DAVID SMITH' ;
02  Var1 = %scanrpl('DAVID':'D.':Var1)

After line 2 has executed Var1 contains: JOHN D. SMITH. The five character of 'DAVID' have been replaced with the two of 'D.' and the other three characters have been removed.

I could remove the DAVID completely. Notice that in the first parameter I have DAVID followed by a space. If I do not remove the space then the result will be 'JOHN-space-space-SMITH'. The second character needs to be a null, these is achieved by not passing any value. By having two apostrophes ( ' ) with nothing between them denotes null.

01  Var1 = 'JOHN DAVID SMITH' ;
02  Var1 = %scanrpl('DAVID ':'':Var1)

After line 2 has executed the variable Var1 contains: JOHN SMITH

My program to change the solvent codes could look like:

01  dcl-f ENGFILE usage(*update) ;
02  dcl-s i int(3) ;

03  dow (1 = 1) ;
04    read ENGFILER ;
05    if (%eof) ;
06      leave ;
07    endif ;

08    i = 0 ;
09    i = %scan('ABCD':DESC1) ;

10    if (i > 0) ;
11      DESC1 = %scanrpl('ABCD':'LMNOP':DESC1) ;
12      update ENGFILER %fields(DESC1) ;
13    endif ;
14  enddo ;

15  *inlr = *on ;

Line 1 defines the file, which I am calling ENGFILE as unkeyed and for update. Even though the file has a key I do not want to read it in key order. Line 2 defines an integer variable i.

Lines 3 – 7 are where the file is read and if end-of-file is encountered the logic leaves the do-loop.

On line 8 I set i to zero, and then on line 9 I use the %SCAN built in function to determine if the string I am wanting to change is present in the file variable DESC1. If it is %SCAN returns which position in starts in, therefore, if i is greater than zero, line 10, DESC1 contains what I am looking for.

On line 11 I am using the %SCANRPL to replace 'ABCD' with 'LMNOP'. Then I need to update the file with the new value in the field, I only want to update the file with the value in the file variable DESC1, so I use the %FIELDS to define the fields I want to update the file with.

I think the rest of the program is pretty self evident.

And what about those who do not have IBM i 7.1 or greater. Fortunately SQL offers an alternative that was introduced in V5R3, the REPLACE function. This function has three parameters:

  1. Variable name
  2. From string
  3. To string
    SET :VAR1 = REPLACE(:VAR1:'ABC':'LMNOP')

As with the %SCANRPL the REPLACE can replace short strings with longer ones, long strings with short ones, and even remove a string. In the examples below shows the SQL is embedded in a RPG program.

01  Var1 = 'WASH IN SOLVENT ABC' ;
02  exec sql SET :VAR1 = REPLACE(:VAR1:'ABC':'LMNOP') ;

03  Var1 = 'JOHN DAVID SMITH' ;
04  exec sql SET :VAR1 = REPLACE(:VAR1:'JOHN':'J.') ;
05  exec sql SET :VAR1 = REPLACE(:VAR1:'DAVID ':'') ;

After line 2 Var1 will contain: WASH IN SOLVENT LMNOP

After line 4 Var1 will contain: J. DAVID SMITH
And after line 5 it will contain: J. SMITH

I could just replace the %SCANRPL with the SQL REPLACE function in my example program:

01  dcl-f ENGFILE usage(*update) ;
02  dcl-s i int(3) ;

03  dow (1 = 1) ;
04    read ENGFILER ;
05    if (%eof) ;
06      leave ;
07    endif ;

08    i = 0 ;
09    i = %scan('ABCD':DESC1) ;

10    if (i > 0) ;
11      exec sql SET :DESC1 = REPLACE(:DESC1:'ABCD':'LMNOP') ;
12      update ENGFILER %fields(DESC1) ;
13    endif ;
14  enddo ;

15  *inlr = *on ;

But I could do all of that in one SQL statement:

    UPDATE ENGFILE
       SET DESC1 = REPLACE(DESC1,'ABCD','LMNOP')
     WHERE DESC1 LIKE '%ABCD%'

When using the SQL REPLACE function if you encounter "overflow" you will receive a SQL0404 message:

Message ID . . . . . . . . . :   SQL0404
Message file . . . . . . . . :   QSQLMSG
  Library  . . . . . . . . . :     QSYS

Message . . . . :   Value for column or variable DESC1 too long.
Cause . . . . . :   An INSERT, UPDATE, MERGE, SET, VALUES INTO, or GET
  DIAGNOSTICS statement specifies a value that is longer than the maximum
  length string that can be stored in DESC1. The length of DESC1 is 40 and the
  length of the string is 45.
Recovery  . . . :   Reduce the length of the string from 45 to a maximum of 40
  and try the request again.

Which is not a bad thing as you then know that there will be lost characters if this replace is run. There is no such warning from %SCANRPL, so you will have to manually check to see if characters have been lost.

 

You can learn more about these on the IBM website:

 

This article was written for IBM i 7.2, and it should work with release V5R3 and greater too.

6 comments:

  1. Why this order? Not sure But personally I like it because all I have to do is remember Needle, Haystack (as in find a needle in a haystack) and I no longer have to look up the field sequence for any of this family of BIFs.

    ReplyDelete
    Replies
    1. What an excellent way to remember... "needle, haystack".
      Thanks

      Delete
  2. This is great but will it be replaced with regular expressions in 7.2?

    ReplyDelete
  3. I'm on 7.2 and hadn't heard about regular expressions. Do you have a link where I could learn how to do it?

    ReplyDelete
  4. The Biffs are nice to use but I prefer using SQL REPLACE to transform text.

    ReplyDelete
  5. I've never used this procedure but have used SQL to replace specific characters.
    I support 12 European countries and there are occasional character issues to deal with after the fact.
    Example;
    RUNSQL REQUEST('update insaddr set adcity = translate(adcity, x''9C'', x''A0'')')
    This updates the Address file INSADDR replacing the 'µ' character with 'æ' wherever it's found in the field ADCITY throughout the file.
    Very useful as sometimes it may be a null or non-displayable character in EBCDIC but this allows you to use the Hex value.
    Just an additional tip on a similar subject.
    Keep up the good work and interesting tips and discussions.

    ReplyDelete

To prevent "comment spam" all comments are moderated.
Learn about this website's comments policy here.

Some people have reported that they cannot post a comment using certain computers and browsers. If this is you feel free to use the Contact Form to send me the comment and I will post it for you, please include the title of the post so I know which one to post the comment to.