Wednesday, July 5, 2017

SQL null byte map as a 2-dimensional array

using a 2 dimensional array for sql null byte map in rpg

Nulls are something we are all going to have to get use to handling as we move from DDS files to SQL tables, or start using data from DB2 of i or other non-IBM i databases. Recently I was working on a project with a SQL Table with more than 300 columns. If there was no data in a particular field it was null. If you think about it null is a better alternative than blank or zero as those values could be valid values for the contents of that column.

I have written about handling null in SQL in an earlier post, but in this scenario the method I described to define the null byte map was not practical. I am not going to list 300+ subfields in a data structure array for the null indicators for the matching data structure array I would be moving the SQL Table's data into.

But before I explain how I did that in my RPG let me start with the example SQL table.

03    FIELD2 CHAR(2),
04    FIELD3 DECIMAL(5,0),
05    FIELD4 DATE,
07  )

I am sure you have noticed that this Table does not contain 300+ columns. The same method I am going to describe will work with any Table, whether it has 300+ columns or just four.

The Table contains four rows, each with a different number of null values.

  01      11         1   12/31/2017
  02      22         2   -
  03      33          -  -
  04      -           -  -

The example RPG program is just going to bring in rows from the Table and move them into a data structure array using a multi row Fetch statement. What I choose to do with the data after that is irrelevant to this subject. It also makes it short enough that I can include it in one part, see below.

01  **free
02  ctl-opt option(*nodebugio:*srcstmt:*nounref)
              alwnull(*usrctl) ;

03  dcl-ds Data extname('TESTFILE') qualified dim(999)
04  end-ds ;

05  dcl-ds Nulls qualified dim(9999) ;
06    Ind int(5) dim(4) ;
07  end-ds ;

08  dcl-s NbrOfElements int(5) inz(%elem(Data)) ;
09  dcl-s Rows packed(5) ;

10  exec sql SET OPTION COMMIT = *NONE, 
                        CLOSQLCSR = *ENDMOD,
                        DATFMT = *ISO ;

11  exec sql DECLARE C0 CURSOR FOR
              SELECT * FROM TESTFILE
               ORDER BY FIELD1
                 FOR READ ONLY ;

12  exec sql OPEN C0 ;

13  exec sql FETCH C0 FOR :NbrOfElements ROWS 
                          INTO :Data :Nulls ;

14  exec sql GET DIAGNOSTICS :Rows = ROW_COUNT ;

15  exec sql CLOSE C0 ;

16  *inlr = *on ;

Line 1: As this was written on IBM i 7.3 I should free my RPG.

Line 2: My standard control options.

Lines 3 and 4: This is the definition of the data structure array for the data from the example SQL table. Rather than list all the column by using the External Name keyword, EXTNAME the precompiler will use the table's definition for the data structure's subfields. I have coded that this array has 9,999 elements.

Lines 7 – 9: I need my null byte map array to have the same number of elements as the data's array. The data structure's subfield definition is interesting as it is a definition of an array, with the same number of elements as the number of subfields in the data's data structure. Each element has to be a 5 long integer, null is expressed at -1 and zero means that is not null. What this has created is a two dimensional array, an array within an array.

Line 8: This is the variable I will be using in the Fetch statement for the number of rows to retrieve from the table. I have initialized with the same number as the number of elements in the Data data structure array.

Line 9: I will be using this variable to contain the number of rows I retrieved using the Fetch.

Line 10: I like to put SQL options into my source member to ensure these values are used when the program is run. Otherwise I have to rely on the person compiling the program to change compile command parameters.

Line 11: Here is the definition of the cursor I will be using. I am selecting all the columns from TESTFILE and retrieving them in order by the value in FIELD1.

Line 12: The cursor is opened.

Line 13: Here I am fetching the number of rows held in the variable NbrOfElements, which is the same as the number of elements in the data structure array Data, into the two data structures Data and Nulls. As there is no comma ( , ) separating the two the null byte map (indicators) are moved to Nulls.

Line 14: Here I am getting the number of rows retrieved from the Fetch into the variable Rows.

Line 15: I am done with my cursor so it is time to close it.

I need to give the DATFMT in the SQL options so that the precompiler can calculate the length of the each element of the data structure array. If I do not, see below:


Then the precompiler does not know what the length of FIELD4, the date field, is and I get a precompile error:


SQL5011  30      13  Position 48 Host structure array DATA
                     not defined or not usable.

If I am using a two dimensional array I need to access the elements within it like:


The above is the null indicator for the equivalent of the second row retrieved from the Table and fourth column within it, FIELD4. With a Table of four columns it is not hard to determine which number to use for the desired array element in the Ind array. With 300+ it is too hard to sit there and count the columns in the file to determine which indicator matches the desired field. I use the compile listing. If I look where the Data data structure is shown I know that the number to the left of the subfield name is the number I need to use for the matching element in the Ind array.

* Data structure . . . . . . :  DATA
* External format  . . . . . :  TESTFILER : MYLIB/TESTFILE
000001=D FIELD1                         2A
000002=D FIELD2                         2A
000003=D FIELD3                         5P 0
000004=D FIELD4                        10D   DATFMT (*ISO-)

What about the data I retrieved from the Table? First the elements from the Data data structure array:

> EVAL data
  DATA.FIELD1(1) = '01'
  DATA.FIELD2(1) = '11'
  DATA.FIELD3(1) = 00001.
  DATA.FIELD4(1) = '2017-12-31'
  DATA.FIELD1(2) = '02'
  DATA.FIELD2(2) = '22'
  DATA.FIELD3(2) = 00002.
  DATA.FIELD4(2) = '0001-01-01'
  DATA.FIELD1(3) = '03'
  DATA.FIELD2(3) = '33'
  DATA.FIELD3(3) = 00000.
  DATA.FIELD4(3) = '0001-01-01'
  DATA.FIELD1(4) = '04'
  DATA.FIELD2(4) = '  '
  DATA.FIELD3(4) = 00000.
  DATA.FIELD4(4) = '0001-01-01'
  DATA.FIELD1(5) = '  '
  DATA.FIELD2(5) = '  '
  DATA.FIELD3(5) = 00000.
  DATA.FIELD4(5) = '0001-01-01'

Even though I know some of those columns are null the Data is just the field types' default value. I have to look in the Nulls for the null byte map indicators to see which columns are really null.

> EVAL nulls
  NULLS.IND(1,1) = 0
  NULLS.IND(1,2) = 0
  NULLS.IND(1,3) = 0
  NULLS.IND(1,4) = 0
  NULLS.IND(2,1) = 0
  NULLS.IND(2,2) = 0
  NULLS.IND(2,3) = 0
  NULLS.IND(2,4) = -1
  NULLS.IND(3,1) = 0
  NULLS.IND(3,2) = 0
  NULLS.IND(3,3) = -1
  NULLS.IND(3,4) = -1
  NULLS.IND(4,1) = 0
  NULLS.IND(4,2) = -1
  NULLS.IND(4,3) = -1
  NULLS.IND(4,4) = -1
  NULLS.IND(5,1) = 16448
  NULLS.IND(5,2) = 16448
  NULLS.IND(5,3) = 16448
  NULLS.IND(5,4) = 16448

When I see -1 I know that indicates that the equivalent subfield in Data is null. In debug the numbers in parentheses ( ( ) ) show the value for the first array, Nulls followed by a comma and then the element in the second array, Ind.

I only retrieved four rows using the Fetch statement, see below:

> EVAL rows
  ROWS = 00004.

The elements for the fifth and beyond elements in Nulls are unused, which is the value 16448.

I did find this a way of defining the null byte map array a lot easier than entering 300+ subfields.


This article was written for IBM i 7.3, and should work for earlier releases and TRs that support RPG free format definitions.


  1. Very good article you wrote, Simon. However I would suggest that when processing a table or view in an RPG (or any HLL), it would be far better and cleaner if the view replaced NULL's with 0 or blank instead of letting the RPG deal with NULLs. The use of IFNULL() or better yet COALESCE() placed in the VIEW (or stored procedure that returns a result-set via cursor) would be far cleaner, and this would not required the users of said VIEW or stored procedure to worry about NULL's. IFNULL() is only allowed in our own DB2 for i flavor, so better to use COALESCE() since it is supported by all flavors of DB2. When the VIEW has 1 or more LEFT OUTER JOINs, placing the COALESCE()'s in such VIEWs would take care of joins that were not satisfied. In the interest of multi-tier'ed architectures (which we all should be striving for) it is far better for VIEW's, Table Functions, and Stored Procedures to handle the NULL's than all the calling HLL's. We really need to let DB2 do the dirty work, heavy lifting instead of putting that task in each HLL.

  2. Very cool technique. Thanks!

  3. Learn something new everyday!

  4. Thank you, Simon, for another example. I'm with Dan, I'd rather use COALESCE() to identify/deal with nulls.

  5. 16448 is X'4040'
    This is because array of null indicators not inisialized

    If you use same array for more one tables without initialize the array, it's can occurs unpredictable results

  6. thank you Simon. this is really useful. I was using stand alone field but this looks handy when you are dealing with more than one fields where there is a possibility of getting null values. thanks.


To prevent "comment spam" all comments are moderated.
Learn about this website's comments policy here.

Some people have reported that they cannot post a comment using certain computers and browsers. If this is you feel free to use the Contact Form to send me the comment and I will post it for you, please include the title of the post so I know which one to post the comment to.