 
This all started as a question from a member of another programming team at work. They had been asked to add a new field to an existing file. Most of the time this field would be empty, but it could contain up to 2,000 characters of data in some cases.
"It would be easy with a SQL table," they said "I could just make a new column VARCHAR (variable length character field) and make the default value null. But I don't know how to do that with a DDS field."
By using a variable length character field will mean that the field will not always be 2,000 characters. It will be as long as the data within it. And when there is no data in the field it will take up zero space. What a disk space savings.
Fortunately doing this with a DDS file is as easy as it is with a SQL table using the right keywords when defining the file. And it is just as easy to handle the variable length and null value in a RPG program. I decide to create my example file and the program on an IBM i partition that is running 7.2 just to show that there is nothing from a newer release need to do this.
It will come as no surprise to my regular readers that the file I will be using in this post is called TESTFILE. It contains just two fields:
| 01 A UNIQUE 02 A R TESTFILER 03 A KEY01 5A 04 A BIGCOL 2000A VARLEN 05 A ALWNULL 06 A ALIAS(BIG_COLUMN) 07 A K KEY01 | 
Line 1: When I define a DDS file or DDL (SQL) table I always use a unique key to ensure that it will never contain duplicate keyed records.
Line 3: This is the key field: KEY01
Line 4 – 6: This is the definition for the field my colleague was asking for. It is a maximum of 2,000 characters long. The VARLEN means that it is of varying length.
Line 5: The field is allowed to be null. Notice I did not say "contains null" I will explain later why that is so.
Line 6: I decided to give this field an alias name, that is longer than a ten long system name, that I can use in my RPG program.
After compiling the file if I use the Display Field File Description command, DSPFFD, I can see that the BIGCOL field has not been allocated the space of 2,000 characters:
| 
Field      Type       Length  Length  Position   Usage  Heading
KEY01      CHAR            5       5         1   Both   KEY01
BIGCOL     CHAR         2000    2002         6   Both   BIGCOL
  Alternative name  . . . . . . . . . . . . :
      BIG_COLUMN
  Variable length field -- Allocated length :  None
 | 
TESTFILE contains three records. Two where the BIGCOL is null, and one with text in it.
| KEY01 BIGCOL ----- -------------------------------------------------------- 1 - 2 VERY, VERY LONG DESCRIPTION FOR THIS COLUMN HERE TO SHOW 3 - | 
Before I show any programs using this file I want to explain how null works in the IBM i operating system. As I mentioned above the fields themselves never contain null. Within the file is a "null byte map". The only place I have ever had to accommodate "null byte map" is in a RPG program is in a trigger program. The "null byte map" area in RPG is broken up into a series on "null indicators", one for each field in the file. If the "null indicator" contains a value of "on" the corresponding field is considered null. The %NULLIND built in function allows me to handle this "null indicators" for the individual fields, as you will see below.
And now a simple program showing how to read this file.
| 
01  **free
02  ctl-opt option(*nodebugio:*srcstmt)
03            alwnull(*usrctl) ;
04  dcl-f TESTFILE keyed alias ;
05  dcl-s Work50 char(50) ;
06  dcl-s Length int(5) ;
07  dow (1 = 1) ;
08    read TESTFILER ;
09    if (%eof) ;
10      leave ;
11    endif ;
12    dsply ('Key field = ' + KEY01) ;
13    Length = %len(BIG_COLUMN) ;
14    dsply ('Length of BIG_COLUMN = ' + %char(Length)) ;
19  enddo ;
20  *inlr = *on ;
 | 
Line 1: It is 2020 so all of our program should be totally free RPG.
Lines 2 and 3: These are the program's control options. The important one for today is on line 3, ALWNULL(*USRCTL). This allows me handle nulls in the file using the null indicators.
Line 4: I have defined the file TESTFILE as input only, keyed, and I can use the alias names of the fields.
Lines 5 and 6: I will be using these variables in various parts of this example program.
Lines 7 – 11: I will be reading all of the records in the file.
Line 12: Display the value of the key field.
Line 13: As BIG_COLUMN variable in length I can use the %LEN built in function to see how long the field is in each record.
Line 14: Display the length of BIG_COLUMN.
When I run this program I see the following:
| DSPLY Key field = 1 DSPLY Length of BIG_COLUMN = 0 DSPLY Key field = 2 DSPLY Length of BIG_COLUMN = 69 DSPLY Key field = 3 DSPLY Length of BIG_COLUMN = 0 | 
As expected the BIG_COLUMN field in the first and third records are empty.
If I wanted to see the contents of the BIG_COLUMN field I might change the following lines in the program
| 13 Work50 = %subst(BIG_COLUMN:1:50) ; 14 dsply Work50 ; | 
Line 13: I have to use the variable Work50 as the DSPLY command can only be used with a variable of up to 52 characters.
Alas when I run the program with these lines in it I get the following:
| DSPLY Key field = 1 Length or start position is out of range for the string operation (C G D F). | 
As I cannot access the data within the BIG_COLUMN field in the records where it is null I need to do something like this to avoid the error:
| 
13  if %nullind(BIG_COLUMN) ;
14    dsply ('Big column = *N') ;
15  else ;
16    Work50 = %subst(BIG_COLUMN:1:50) ;
17    dsply Work50  ;
18  endif ;
 | 
Line 13: If the field's null indicator is on I execute line 14. If not then I will display the contents, well the first 50 character of, BIG_COLUMN.
When I call the program I do not get an error.
| DSPLY Key field = 1 DSPLY Big column = *N DSPLY Key field = 2 DSPLY VERY, VERY LONG DESCRIPTION FOR THIS COLUMN HERE T DSPLY Key field = 3 DSPLY Big column = *N | 
In the next program I am going to be updating and writing to TESTFILE.
| 01 **free 02 ctl-opt option(*nodebugio:*srcstmt) 03 alwnull(*usrctl) ; 04 dcl-f TESTFILE keyed usage(*update:*output) alias ; 05 setll *hival TESTFILER ; 06 readp TESTFILER ; 07 %nullind(BIG_COLUMN) = *off ; 08 BIG_COLUMN = 'UPDATED IN PROGRAM' ; 09 update TESTFILER %fields(BIG_COLUMN) ; 10 KEY01 = '4' ; 11 %nullind(BIG_COLUMN) = *on ; 12 write TESTFILER ; 13 chain '2 ' TESTFILER ; 14 if (%found) ; 15 %nullind(BIG_COLUMN) = *on ; 16 update TESTFILER %fields(BIG_COLUMN) ; 17 endif ; 18 *inlr = *on ; | 
Line 4: This time I have defined the file for update and output.
Lines 5 – 9: Here I am getting the last record in the file, setting off the null indicator, and moving a character value to the BIG_COLUMN field. I always like using the %FIELDS built in function as it will only update the fields I list.
Lines 10 – 12: Here I am writing a new record to the file, and as I have set on the null indicator the BIG_COLUMN will be considered null.
Lines 13 – 17: I always use a key field list with CHAIN, READE, etc. I prefer this method to the fixed format key list or key data structure as you can see the fields, or in this case value, I am using as the key to chain to the file on the line the operation happens, rather than having to hunt elsewhere for the key's definition. When the record is retrieved from the file I am setting on BIG_COLUMN's null indicator to on to wipe the value from the field and set it to null.
When the program has finished the contents of TESTFILE now looks like:
| KEY01 BIGCOL ----- ------------------ 1 - 2 - 3 UPDATED IN PROGRAM 4 - | 
And by using the variable length field it is not always 2,000 characters long, it is just the length of the field's contents thereby saving the disk space that would have been utilized by the rest of the 2,000 unused characters.
This article was written for IBM i 7.2, and should work for some earlier releases too.
... as an aside. It would be much better to add a length to the VARLEN keyword, i.e. VARLEN(256) equivalent of the ALLOCATE key word in SQL.
ReplyDeleteIf no allocated length is specified, only 16 Bytes (the address of the overflow area) is stored within the table. All data is stored in the overflow area, which causes additional database traffic.
For the best performance the allocated length should be as big, so around 80% of the data can be stored within the column and only the 20% of value longer than the allocated length go into the overflow area.
One minor comment. In the case of a VarChar field it can indeed contain null (i.e. when the length of the field is zero). In fact I would personally never bother with making a Varchar null capable since null capability is inherent and testing for %Len(https://www.penguinmagic.com/p/12250) = 0 or varchar = '' is a lot simpler than playing with nullind.
ReplyDelete@Jon, I'd argue against using a zero-length VARCHAR value as being equivalent to NULL; Oracle does this by default and it is a headache.
DeleteConceptually, NULL means "unknown". A null BIG_COLUMN means it really is unknown, while not null and %len = 0 means "we know this is a zero-length string".
With a long variable length field, it's a good item to trim trailing spaces before an insert/update to save some storage.
ReplyDeleteRinger
Indeed. The normal way to load any field is with the contents of another variable. If BIG_COLUMN was loaded with a fixed field with a defined length of 100 but the variable value is only 5, the remaining 95 spaces end up in BIG_COLUMN, when the goal is to save space.
Delete