Wednesday, February 11, 2026

Using SQL TRANSLATE in place of REGEXP_REPLACE

I encountered something last week that I think may be a "bug" in either RPG or in Db2 for i. I was using the REGEXP_REPLACE scalar function, and I received an error message that made no sense. I ran out of ideas of how to overcome this error using REGEXP_REPLACE, and I went on to use the TRANSLATE scalar function. Before I tell you what the error is I need to explain my situation.

I was working upon a partition which was running IBM i 7.5 with the latest PTFs for Technology Refresh 7, TR7, and the latest PTFs for Database and RPG:

  • Database (SQL) = SF99950 Level 11
  • RPG = 5770WDS SJ08064

The partition's CCSID is:

01  SELECT SYSTEM_VALUE_NAME,SYSTEM_VALUE
02    FROM QSYS2.SYSTEM_VALUE_INFO
03   WHERE SYSTEM_VALUE_NAME = 'QCCSID'

SYSTEM_VALUE_NAME   SYSTEM_VALUE
-----------------   ------------
QCCSID              273

Lines 1 – 3: I can retrieve the values of the system values using the SYSTEM_VALUE_INFO View. Here I am only interested in retrieving the partition's default CCSID.

The results show that the CCSID is 273, which is the correct CCSID for Austria and Germany.

What I will be showing here is a very simple example of what I found. I have a DDS file, and I need to change certain characters in a field to different values.

The file I will be using here is:

01 A          R TESTFILER
02 A            FLD01         10
03 A            FLD02         50

Line 1: It will come as no surprise to the regular readers of this website that my file is called TESTFILE, and its record format is TESTFILER.

Line 2: First field, FLD01 is a ten long character field. This is the field that will contain the characters that need to be changed.

Line 3: You will see why and how I use FLD02 later. For now, all you need to know is that it is a fifty character field.

As this is a simple example I am just going to insert one row/record into the file:

01  INSERT INTO TESTFILE
02    VALUES(FLD01 => '#%#%#') ;

Here I am using the new way to insert a row, where I list the columns and their values next to them.

Line 2: I only need to give a value to the first field, FLD01, the second needs to be blank.

I can now check the contents of TESTFILE, with the following SQL statement:

01  SELECT * FROM TESTFILE

Which shows that only FLD01 is populated with a mixture of hash ( # ) and percent ( % ) characters.

FLD01   FLD02
-----   ---------------------
#%#%#

As I always do, I created and tested the SQL statement I was going to use with the REGEXP_REPLACE in ACS's Run SQL scripts, RSS:

01  UPDATE TESTFILE 
02     SET FLD01 = REGEXP_REPLACE(FLD01,'[#%]','x',1,0,'i') ;

Line 2: I am updating FLD01 having replaced any "#" or "%" in FLD01 with a "x", starting in first position, the zero denotes that I want to change all eligible characters, and the "i" means the replace operation will be case independent.

The result is:

01  SELECT * FROM TESTFILE

FLD01   FLD02
-----   ---------------------
#%#%#

I then build a RPG program around this statement:

01  **free
02  ctl-opt option(*srcstmt) ;

03  exec sql UPDATE TESTFILE
04             SET FLD01 = REGEXP_REPLACE(FLD01,
05                                        '[#%]',
06                                        'x',
07                                        1,0,'i') ;
08  if (SQLCOD <> 0) ;
09    dsply ('SQLCOD = ' + %char(sqlcod)) ;
10  endif ;

11  exec sql UPDATE TESTFILE
12             SET FLD02 = REGEXP_REPLACE(FLD01,
13                                        '[#%]',
14                                        'x',
15                                        1,0,'i') ;
16  *inlr = *on ;

Lines 3 - 7: I am moving the string of change characters in FLD01 into FLD01.

Lines 8 – 10: If there is an warning or error the SQL Code will not be zero, and the SQL Code will be displayed.

Lines 11 – 15: This time the changed characters in FLD01 are moved into FLD02, which is bigger than FLD01. You will see why I have done this shortly.

After compiling this program, I call it. And I get a SQL error:

DSPLY  SQLCOD = -404

When I look at the message in the job log I see:

                       Additional Message Information

Message ID . . . . . :  SQL0404      Severity . . . . . . :  30
Message type . . . . :  Diagnostic
Date sent  . . . . . :  DD-DD-DD     Time sent  . . . . . :  TT:TT:TT

Message . . . . :   Value for column or variable FLD01 too long.
Cause . . . . . :   An INSERT, UPDATE, MERGE, SET, VALUES INTO, or GET
 DIAGNOSTICS statement specifies a value that is longer than the maximum
 length string that can be stored in FLD01. The length of FLD01 is 10
 and the  length of the string is 21.
Recovery  . . . :   Reduce the length of the string from 21 to a maximum
  of 10 and try the request again.

How can the ten long FLD01 create a result of twenty one characters!

The second REGEXP_REPLACE statement, where I put the result into FLD02 shows what has happened:

01  SELECT * FROM TESTFILE

FLD01   FLD02
-----   ---------------------
#%#%#   x#x%x#x%x#x x x x x x

What I see in the FLD02 column is not a replace, but a combination of the "x" replacement character with the original characters in FLD01, including the blank characters.

I tried various ideas I had to over this to no avail.

As I am unable to use the REGEXP_REPLACE I needed to find something else. Fortunately, there is the TRANSLATE SQL scalar function that will do what I need to do:

01  UPDATE TESTFILE SET FLD01 = TRANSLATE(FLD01,'xx','#%')

TRANSLATE has three parameters:

  1. String or variable that contains the characters to translate
  2. What I want to convert the characters to
  3. These are the characters I want to change

In other words I want to change any "#" or "%" to "x". Notice that I have two "x" as I am translating both of the original characters to "x". If I had only one "x" then "#" would be translated to "x", and "%"would be translated to blank.

After executing that statement I can check TESTFILE:

01  SELECT * FROM TESTFILE

FLD01   FLD02
-----   ---------------------
xxxxx

What would this look like in a RPG program:

01  **free
02  ctl-opt option(*srcstmt) ;

03  exec sql UPDATE TESTFILE
04             SET FLD01 = TRANSLATE(FLD01,'xx','#%') ;

05  if (SQLCOD <> 0) ;
06    dsply ('SQLCOD = ' + %char(sqlcod)) ;
07  endif ;

08  *inlr = *on ;

Lines 3 – 6: I broke the statement over several lines to make it easier to understand.

After compiling, I called the program, and it completed without error.

I checked TESTFILE to make sure that the translations happened.

01  SELECT * FROM TESTFILE

FLD01   FLD02
-----   ---------------------
xxxxx

I am still a bit confused why this error happened with REGEXP_REPLACE. Tried the same programs on a different partition, that is at IBM i 7.6 and CCSID 37, and what happened above did not occur. Is this something to do with the CCSID, and a possible mismatch between the operating system and my job's CCSID? I have no idea.

If you ever encounter REGEXP_REPLACE acting this way, you now have an alternative you can use.

 

You can learn more about the TRANSLATE SQL Scalar function from the IBM website here.

 

This article was written for IBM i 7.5 TR7.

No comments:

Post a Comment

To prevent "comment spam" all comments are moderated.
Learn about this website's comments policy here.

Some people have reported that they cannot post a comment using certain computers and browsers. If this is you feel free to use the Contact Form to send me the comment and I will post it for you, please include the title of the post so I know which one to post the comment to.