Wednesday, May 26, 2021

New RPG built in function to split apart a string

split bif to break apart a string into array elements

Two years ago IBM introduced a SQL function, SPLIT, that would break apart a string into parts. In the latest Technology Refreshes for IBM i 7.4 and 7.3 introduces into RPG a Built in Function, BiF, that does a similar thing.

The Split BiF, %SPLIT, breaks apart data from a string into a temporary array.

Let me started with some examples:

01  **free
02  dcl-s String char(100) 
      inz('RPGPGM is a website that provides stories about IBMi') ;

03  dcl-s wkArray char(10) dim(10) ;

04  wkArray = %split(String) ;

Line 2: This variable contains the string I will be splitting into array elements. And yes, I do know it is not "IBMi", but you will see why I did this below.

Line 3: Definition of the array that will contain the parts of the string.

Line 4: All I need to do is just use the %SPLIT BiF passing to it the variable name and the string is split everywhere there is a space into a separate array elements.

The results are:

EVAL wkArray              
WKARRAY(1) = 'RPGPGM    ' 
WKARRAY(2) = 'is        ' 
WKARRAY(3) = 'a         ' 
WKARRAY(4) = 'website   ' 
WKARRAY(5) = 'that      ' 
WKARRAY(6) = 'provides  ' 
WKARRAY(7) = 'stories   ' 
WKARRAY(8) = 'about     ' 
WKARRAY(9) = 'IBMi      ' 
WKARRAY(10) = '          '

The %SPLIT can use other characters as the separator, rather than space.

In the following example I am using a comma as the separator. The contents of the variable looks like data from a CSV type file.

01  **free
02  dcl-s String char(100)
      inz('"Simon","Hutchinson","RPGPGM.COM",1') ;

03  dcl-s wkArray char(20) dim(5) ;

04  String = %scanrpl('"':'':String) ;
05  wkArray = %split(String:',') ;

06  wkArray = %split(%scanrpl('"':'':String):',') ;

Line 2: This is the string I will be splitting up.

Line 4: I am removing the double quotes ( " ) using the Scan and Replace BiF, %SCANRPL, and replacing them with null. This removes the double quotes from the string.

Line 5: Then I use the %SPLIT with a second parameter, which denotes the separator character.

The results look like:

EVAL wkarray                       
WKARRAY(1) = 'Simon               '
WKARRAY(2) = 'Hutchinson          '
WKARRAY(3) = 'RPGPGM.COM          '
WKARRAY(4) = '1                   '
WKARRAY(5) = '                    '

Line 6: Why use two lines of code, lines 4 and 5, when I can do the same in one line? The results are the same.

I would not use this method to break apart a CSV into individual file fields or table columns. I would still use the method I have shown before using the CPYFRMIMPF command.

%SPLIT can handle multiple separator characters, like in this example.

07  String = 'One,Two.Three|Four:Five' ;

08  wkArray = %split(String:':|.,') ;

Line 7: I am using four different separator characters between each of the words.

Line 8: When I use the Split I need to define all of those separator characters in the second parameter.

The results show an array with each word in a separate element.

EVAL wkarray                       
WKARRAY(1) = 'One                 '
WKARRAY(2) = 'Two                 '
WKARRAY(3) = 'Three               '
WKARRAY(4) = 'Four                '
WKARRAY(5) = 'Five                '

If there are multiple separator characters between each word Split is "smart" enough not to generate an empty value.

09  String = '.One..Two...Three....Four.....Five' ;

10  wkArray = %split(String:'.') ;

Line 9: The words in the string are separated by increasing numbers of periods ( . ).

Line 10: I just need to define the separator character as a period.

The results are each word in its own array element, with no blank elements between them.

EVAL wkarray                       
WKARRAY(1) = 'One                 '
WKARRAY(2) = 'Two                 '
WKARRAY(3) = 'Three               '
WKARRAY(4) = 'Four                '
WKARRAY(5) = 'Five                '

Warning: To be able to use this on other IBM i partitions they must have the run time Split PTF installed.

 

You can learn more about the SPLIT RPG Built in Function from the IBM website here.

 

This article was written for IBM i 7.4 TR4, and will work for 7.3 TR10 too.

6 comments:

  1. "If there are multiple separator characters between each word Split is "smart" enough not to generate an empty value." If there is not an option to prevent this statement then you cannot use %SPLIT to process CSV records accurately as you can have null values in CSVs.


    -Matt

    ReplyDelete
    Replies
    1. This is why I stated above I would still use the CPYFRMIMPF command to process CSV files.

      Delete
  2. That looks like the most useful of all the new features.

    ReplyDelete
  3. Here is my issue with the %split. I can't split a .csv file record without modifying the record first

    Last,First,Add1,Add2,City
    Smith,John,123 Main,,Somewhere

    If I %split line 2, I get 'Somewhere' as Add2, and *Blank as the City. I think a parm to allow Blank array entries would be useful.

    Of course, I could replace ",," with ",*blank," before the split

    ReplyDelete
    Replies
    1. As I said above I am still going to be using the CPYFRMIMPF command for CSV files.

      Delete
  4. IT would be nice if IBM can include another parameter to %split to specify I want to ignore or not any blank vales between delimiters.

    ReplyDelete

To prevent "comment spam" all comments are moderated.
Learn about this website's comments policy here.

Some people have reported that they cannot post a comment using certain computers and browsers. If this is you feel free to use the Contact Form to send me the comment and I will post it for you, please include the title of the post so I know which one to post the comment to.