Wednesday, October 21, 2020

Using SQL to extract packed decimal from a flat file

sql interpret to extract packed decimal from strings

I have a two "flat" files that contain packed numeric fields. The first is a "header" that contains the file date, the number of records and the total of the amounts in the "detail" file. The "detail" file contains account information that I don't really care about. I need to validate the count from the "header" file to the data in the "detail" file.

When I look at the file using the Display Physical File Member command, DSPPFM, I can see the packed fields. They are where the funny characters are:

DSPPFM FLATHDR

*...+....1....+....2....+....3
2020-10-01??????Ãí_

DSPPFM FLATDTL

*...+....1....+....2....+....3....+....4....+....5
First     ??One  ?????¬*Testing
Second    ??Two  ???Ì°?*Testing

Many years ago I wrote a RPG program to read the "detail" file, count the records, and total the amount fields. The program is in partially free RPG.

Technology Refresh 2 for IBM i 7.4 and TR8 for IBM i 7.3 introduced the SQL Interpret function, INTERPRET, that I can use it to do the same in SQL as I did the RPG program.

Let me start by showing a version of the RPG program I used before, this one I have converted to totally free RPG.

01  **free
02  dcl-f FLATHDR disk(30) ;
03  dcl-f FLATDTL disk(50) ;

04  dcl-ds HdrData len(30) qualified ;
05    H1 char(10) ;      //Date
06    H2 packed(5) ;     //No. records
07    H3 packed(11:2) ;  //Total of all records
08  end-ds ;

09  dcl-ds DtlData len(50) qualified ;
10    D1 char(10) ;      //Account
11    D2 packed(3) ;     //Sequence
12    D3 char(5) ;       //Sub account
13    D4 packed(11:2) ;  //Amount
14    D5 char(1) ;       //Flag
15    D6 char(20) ;      //Description
16  end-ds ;

17  dcl-s FileTotal packed(19:2) ;
18  dcl-s FileCount packed(7) ;

19  dow (1 = 1) ;
20    read FLATDTL DtlData ;
21    if (%eof) ;
22      leave ;
23    endif ;

24    FileTotal += DtlData.D4 ;
25    FileCount += 1 ;
26  enddo ;

27  read FLATHDR HdrData ;

28  if ((FileTotal <> HdrData.H3) or (FileCount <> HdrData.H2)) ;
29    dsply ('Total or count mismatch!') ;
30  endif ;

31  *inlr = *on ;

Lines 2 and 3: These are the definitions for the two files. I do not want them to be externally described, so I put the length of the "flat" files' records in the DISK keyword.

Lines 4 – 8: This is a data structure that I will be using to contain the data from the "header" file. As the data in the file is shorter than the length of the record I have a length keyword, LEN on line 4.

Lines 9 – 16: This is the data structure I will be using with the "detail" file. As the data in this file is shorter than the record length I have given the LEN on line 9.

Lines 17 and 18: These two variables will be used while I am calculating the count and totals.

Lines 19 – 26: In this Do loop I am reading the "detail" file into its data structure, DtlData, on line 20. I am then incrementing the total and count variables, lines 24 and 25.

Line 27: When I have read all the records from the "detail" file I need to read the "header" file, into its data structure, HdrData.

Lines 28 – 30: As this is only an example program I am not going to add a routine to handle when the counts or totals do not match. This is just an example program, if there is a mismatch a message is displayed to the user.

The program does what it needs to do, but if there are many, many records in the "detail" file it can take time for it to run. If I could use SQL I could do this much faster.

This is where the Interpret function comes into play. If I wanted to see the data in the "header" file in separate columns I would use the following statement:

01 SELECT INTERPRET(SUBSTR(FLATHDR,1,10) AS CHAR(10)) AS "Date",
02        INTERPRET(SUBSTR(FLATHDR,11,3) AS DEC(5,0)) AS "Count",
03        INTERPRET(SUBSTR(FLATHDR,14,6) AS DEC(11,2)) AS "Total"
04   FROM MYLIB.FLATHDR ;

Basically the Interpret function allows me to change a column, or in this example, part of a column to a different data type.

Line 1: Here I am defining the first ten characters of the FLATHDR column/row as a character column. I use the Substring function, SUBSTR, to define I am only want the first ten characters.

Line 2: This where is gets a bit more interesting. The second column I want to create continues on from the previous one, starting in position 11. This is a five long packed decimal number, which occupies only three positions as it is packed.

Line 3: The third column is a eleven packed decimal number with two decimal places. This occupies six positions in the record.

When I run this statement I get to see the data broken out into its parts:

Date        Count   Total 
----------  ------  --------
2020-10-01       2  -6665.56

The "detail" file contains more data elements which can be identified using the Interpret function thus:

01 SELECT INTERPRET(SUBSTR(FLATDTL,1,10) AS CHAR(10)) AS "Account",
02        INTERPRET(SUBSTR(FLATDTL,11,2) AS DEC(3,0)) AS "Seq",
03        INTERPRET(SUBSTR(FLATDTL,13,5) AS CHAR(5)) AS "Sub Acc",
04        INTERPRET(SUBSTR(FLATDTL,18,6) AS DEC(11,2)) AS "Amount",
05        INTERPRET(SUBSTR(FLATDTL,24,1) AS CHAR(1)) AS "Flg",
06        INTERPRET(SUBSTR(FLATDTL,25,20) AS CHAR(20)) 
            AS "Description"
07   FROM MYLIB.FLATDTL ;

The same logic I used for packed decimal numbers in the "header" I also used for the "detail" file. The results are:

Account Seq Sub Acc Amount   Flg Description
------- --- ------- -------- --- -----------
First     1 One       123.45 *   Testing
Second    2 Two     -6789.01 *   Testing

I do not need all the data from the "detail", I just need a count of the number of records and the total of the Amount column. This can be easily achieved with the following SQL statement:

01  SELECT COUNT(*) AS "Count",
02         SUM(INTERPRET(SUBSTR(FLATDTL,18,6) AS DEC(11,2))) 
             AS "Total"
03    FROM MYLIB.FLATDTL ;

The Count function on line 1 and the Sum function on line 2 give me the results I need:

Count   Total
------  --------
     2  -6665.56

I need to combine that information with the data from the "header" file to be able to check if count of records and the total amount match. This I can do using a Common Table Expression. Using a CTE I can create virtual tables, and combine them to give the results I desire. This is one to show the data from the "header" and "detail" files so that it might be compared.

01  WITH HDR AS
02  (SELECT  
03  INTERPRET(SUBSTR(A.FLATHDR,1,10) AS CHAR(10)) AS "HDR_DATE",
04  INTERPRET(SUBSTR(A.FLATHDR,11,3) AS DEC(5,0)) AS "HDR_COUNT",
05  INTERPRET(SUBSTR(A.FLATHDR,14,6) AS DEC(11,2)) AS "HDR_TOTAL"
06  FROM MYLIB.FLATHDR A),

07  DTL AS 
08  (SELECT
09  COUNT(B.FLATDTL) AS "DTL_COUNT",
10  SUM(INTERPRET(SUBSTR(B.FLATDTL,18,6) AS DEC(11,2))) AS "DTL_TOTAL"
11  FROM MYLIB.FLATDTL B)

12  SELECT  HDR_DATE,HDR_COUNT,HDR_TOTAL,DTL_COUNT,DTL_TOTAL,
13  CASE WHEN HDR_COUNT <> DTL_COUNT THEN 'ERROR'
14  ELSE 'OK' 
15  END AS "COUNT_ERROR",
16  CASE WHEN HDR_TOTAL <> DTL_TOTAL THEN 'ERROR'
17  ELSE 'OK' 
18  END AS "TOTAL_ERROR"
19  FROM DTL,HDR ;

You do not have to put a blank line between each CTE statement, I do to make the separate parts easier to understand.

Line 1: All CTE start with the word WITH.

Lines 1 – 6: I am making a virtual table, HDR, from the "header" file. Notice that this statement ends with a comma ( , ).

Lines 7 – 11: The virtual table DTL contains just the count of records and total amount from the "detail" file.

Lines 12 - 19: Here I combine the data from the two virtual tables to display the columns I created by using the INTERPRET. I have also created two new derived columns to identify if either the counts or the amounts do not match.

This looks like:

HDR_       HDR_  HDR_     DTL_  DTL_     COUNT_  TOTAL_
DATE       COUNT TOTAL    COUNT TOTAL    ERROR   ERROR
---------- ----- -------- ----- -------- ------  -------
2020-10-01     2 -6665.56     2 -6665.56 OK      OK

Having all these SQL statements in ACS's "Run SQL scripts" is fine and dandy. But I need to put this in a RPG program to get the result I desire.

01  **free
02  dcl-s Total char(5) ;
03  dcl-s Count char(5) ;

04  exec sql
05    WITH HDR AS
06    (SELECT
07    INTERPRET(SUBSTR(A.FLATHDR,11,3) AS DEC(5,0)) AS "HDR_COUNT",
08    INTERPRET(SUBSTR(A.FLATHDR,14,6) AS DEC(11,2)) AS "HDR_TOTAL"
09    FROM FLATHDR A),

10    DTL AS
12    (SELECT
13    COUNT(B.FLATDTL) AS "DTL_COUNT",
14    SUM(INTERPRET(SUBSTR(B.FLATDTL,18,6) AS DEC(11,2))) 
        AS "DTL_TOTAL"
15    FROM FLATDTL B)

16    SELECT
17    CASE WHEN HDR_COUNT <> DTL_COUNT THEN 'ERROR'
18    ELSE 'OK'
19    END,
20    CASE WHEN HDR_TOTAL <> DTL_TOTAL THEN 'ERROR'
21    ELSE 'OK'
22    END
23    INTO :Count,:Total
24    FROM DTL,HDR ;

25  if ((Count = 'ERROR') or (Total = 'ERROR')) ;
26    dsply ('Total or count mismatch!') ;
27  endif ;

28  *inlr = *on ;

The above program is just a simpler version of the previous CTE. All I need to know are the results from the two derived columns.

Lines 2 and 3: The variables that will contain the value returned from the two derived columns.

Lines 5 – 9: As before the first part of the CTE is getting the data from the "header" file. I only need the count and total columns.

Lines 10 – 15: From the "detail" file I only need the count of records and the sum of the amounts.

Lines 16 – 24: I only need to know if the counts and total amounts match. Therefore, the third part of the CTE is just the derived columns, whose values are returned into program variables on line 23.

Lines 25 – 27: My response to any error is woefully inadequate, but this is just an example program.

The above shows how useful the INTERPRET function is, and it to use it would make my simple validation program so much faster, and simpler to understand.

 

You can learn more about the INTERPRET SQL function from the IBM website here.

 

This article was written for IBM i 7.4 TR2 and 7.3 TR8.

3 comments:

  1. What is the difference between INTERPRET and CAST?
    Thank you!

    ReplyDelete
    Replies
    1. the first part of the INTERPRET is assumed as a binary value.

      Delete
  2. Excellent example of using INTERPRET option in SQL. I usually used to create another file with zone numeric fields and then copy the packed fields file to zone field file. Then SQL in the zone field file.

    ReplyDelete

To prevent "comment spam" all comments are moderated.
Learn about this website's comments policy here.

Some people have reported that they cannot post a comment using certain computers and browsers. If this is you feel free to use the Contact Form to send me the comment and I will post it for you, please include the title of the post so I know which one to post the comment to.