Wednesday, June 24, 2020

Using SQL to validate packed fields

validate packed number fields with sql

When I heard of these three table functions being introduced as part of the latest round of Technology Refreshes, IBM i 7.4 TR2 and 7.3 TR8, I was really interested to know what they did as their names, "Validate data", really piqued my interested. These table functions will validate packed numeric fields at, what I like to call, three levels:

  1. VALIDATE_DATA:  validates at the member level
  2. VALIDATE_DATA_FILE:  validates at the file level
  3. VALIDATE_DATA_LIBRARY:  validates at the library level

These three exist in the SYSTOOLS library.

It is nigh impossible to put bad data into date, time, and timestamp fields in files. But it is not too hard to put bad data into a packed numeric field. While rare, I am seeing this more as data is taken from other types of databases and applications, and loaded into files in IBM i.

I have created a file with a packed field and I have put alphanumeric data into it to show how these table functions can are used. My file, TESTFILE contains two fields the first alphanumeric and the second packed numeric. I created the first record in this file to have alphanumeric data in the numeric field. I need to know the relative record number as these table functions return that as the "key" to identify which record contains the bad numeric field:

SELECT RRN(A) AS "RRN",A.* FROM MYLIB.TESTFILE A

RRN  F001     F002
---  -------  ----
  1  FIRST       -
  2  SECOND      2 
  3  THIRD       3

The hyphen ( - ) indicates the value null for the field F002 in the first record, as a numeric value cannot be returned into the results. Personally, I think Query does a better job of letting me know there is bad data in a field:

RUNQRY *NONE TESTFILE

F001            F002
------- -------------
FIRST   +++++++++++++
SECOND             2
THIRD              3

The first of these three table functions, VALIDATE_DATA, validates the file at the member level, and requires three parameters passed to it:

  1. LIBRARY_NAME:  Library that contains the file I want to test. Has to be the library name, Using the value *LIBL will return an error
  2. FILE_NAME:  File that contains the member to test
  3. MEMBER_NAME:  Name of the member. Could also be the values *FIRST or *LAST

My file contains only one member so I will be using *FIRST for the member name.

There are two ways I can enter the statement for this table function, the first way is with the parameter names:

SELECT *
  FROM TABLE(SYSTOOLS.VALIDATE_DATA(
              LIBRARY_NAME => 'MYLIB',
              FILE_NAME => 'TESTFILE',
              MEMBER_NAME => '*FIRST'))

Using the statement this way the parameters can be entered in any order:

SELECT *
  FROM TABLE(SYSTOOLS.VALIDATE_DATA(
              MEMBER_NAME => '*FIRST',
              LIBRARY_NAME => 'MYLIB',
              FILE_NAME => 'TESTFILE'))

The other way is to pass the parameters in the expected order, library, file, member, without the parameter names:

SELECT *
  FROM TABLE(SYSTOOLS.VALIDATE_DATA(
              'MYLIB','TESTFILE','*FIRST'))

The results are as follows:

                                                           RELATIVE
                            LIBRARY    FILE_     MEMBER    _RECORD
VALIDATE_TIME               _NAME      NAME      _NAME     _NUMBER
--------------------------  ---------  --------  --------  --------
2020-06-03 15:27:44.048101  MYLIB      TESTFILE  TESTFILE         1


SQL_     REASON  COLUMN
WARNING  _CODE   _NAME   WARNING_TEXT
-------  ------  ------  --------------------------------------
    802      6   F002  Data conversion or data mapping error.

What is SQL Warning 802 and reason code 6? Fortunately this is the same as IBM i message SQL0802 which is:

SQL0802                               
Data conversion or data mapping error.

Reason code:
6 -- Numeric data that is not valid.

Just as a heads up. When playing which these table functions occasionally the above message was accompanied by the one below:

                                                           RELATIVE
                            LIBRARY    FILE_     MEMBER    _RECORD
VALIDATE_TIME               _NAME      NAME      _NAME     _NUMBER
--------------------------  ---------  --------  --------  --------
2020-06-03 15:27:44.048817  MYLIB      TESTFILE  -                1


SQL_     REASON  COLUMN
WARNING  _CODE   _NAME   WARNING_TEXT
-------  ------  ------  ----------------------------------
    462      0   QSYS2   Procedure or user-defined function 
                           PARTITION_STATISTICS in QSYS2 
                           returned a warning SQLSTATE.

This message can be ignored.

If I had a multimember file I would just replace *FIRST with one of the member names.

Next is the file level, VALIDATE_DATA_FILE. This table function will check all members in the file passed as a parameter:

SELECT *
  FROM TABLE(SYSTOOLS.VALIDATE_DATA_FILE(
              LIBRARY_NAME => 'MYLIB',
              FILE_NAME => 'TESTFILE'))

SELECT *
  FROM TABLE(SYSTOOLS.VALIDATE_DATA_FILE(
              'MYLIB','TESTFILE'))

It is pretty much the same as the previous table function. It is so similar that its results are the same too, which is why I am not going to show them here.

Last level is the library level, VALIDATE_DATA_LIBRARY. No surprises that the only parameter is the library name:

SELECT *
  FROM TABLE(SYSTOOLS.VALIDATE_DATA_LIBRARY(
              LIBRARY_NAME => 'MYLIB'))

SELECT *
  FROM TABLE(SYSTOOLS.VALIDATE_DATA_LIBRARY(
              'MYLIB'))

As there could be many files in the library running this statement interactively could take some time for the results to be returned. I could create simple CL program to run the statement and output it to a file.

01  PGM

02  DLTF FILE(MYLIB/VALIDLIB)
03  MONMSG MSGID(CPF0000)

04  RUNSQL SQL('CREATE TABLE MYLIB.VALIDLIB AS +
05              (SELECT * +
06               FROM TABLE(SYSTOOLS.VALIDATE_DATA_LIBRARY(+
07               LIBRARY_NAME => ''MYLIB''))) +
08             WITH DATA') +
09           COMMIT(*NC)

10  ENDPGM

Lines 4 – 9: I do use the Run SQL Statement Command, RUNSQL, a lot as I find it easier to write a quick, little CL program than the equivalent in RPG.

Lines 4 – 8: The results from the VALIDATE_DATA_LIBRARY statement are used to create the table VALIDLIB in my library.

After submitting this program to batch, and it completing, I can then run the following SQL statement to see how many bad packed numeric fields I have in files in MYLIB:

SELECT * FROM MYLIB.VALIDLIB

The data says I have two files with bad packed numeric fields:

                                                             RELATIVE
                            LIBRARY    FILE_      MEMBER     _RECORD
VALIDATE_TIME               _NAME      NAME       _NAME      _NUMBER
--------------------------  ---------  --------=  --------   --------
2020-06-06 22:32:00.398852  MYLIB      TESTFILE   TESTFILE          1
2020-06-06 22:32:00.725011  MYLIB      TESTFILE2  TESTFILE2        11


SQL_     REASON  COLUMN
WARNING  _CODE   _NAME      WARNING_TEXT
-------  ------  ---------- --------------------------------------
    802      6   F002     Data conversion or data mapping error.
    802      6   ANOTHERFLD Data conversion or data mapping error.

 

You can learn more about the VALIDATE_DATA, VALIDATE_DATA_FILE, and VALIDATE_DATA_LIBRARY SQL table functions from the IBM website here.

 

This article was written for IBM i 7.4 TR2 and 7.3 TR8.

15 comments:

  1. Neat. Seems I’ve used sql to check a certain hex value in the packed field before to determine if invalid. Worked great. This is even better.

    ReplyDelete
  2. Thanks for sharing!

    ReplyDelete
  3. Reynaldo Dandreb MedillaMay 16, 2021 at 7:16 AM

    cool one Simon

    ReplyDelete
  4. This is VERY valuable...thanks Simon!

    ReplyDelete
  5. I ran into this with the user copy/pasting web links onto the system. Which was in Hex.

    ReplyDelete
  6. Just a comment: The UDTFs will only return one row per Db2 row. That means, if there are multiple invalid numeric values in the same row, you'll get the information only for one of them.
    I.e. after having revised the invalid numeric values, you should run the VALIDATE... UDTF again, to make sure you revised all invalid numeric values.

    ReplyDelete
  7. Excellent Simon... Thank you for sharing..

    ReplyDelete
  8. Thank you Simon for sharing a good "burned" :)

    ReplyDelete
  9. Tried using this for the first time and ran into a problem with date fields that contain the date '0001-01-01'. It returns SQL warning 181 or SQL0181 - 'Value in date, time, or timestamp string not valid.' Yes we probably should use nulls in these but it is common practice to use *LOVAL when tables don't allow nulls. My Run SQL Scripts JDBC connection uses *ISO as the date type so I assume it should be allowed. When I change the date to 2001-01-01 the error goes away. Any suggestions on ignoring or allowing this date value?

    ReplyDelete
    Replies
    1. Check how the date is defined in the file.
      The only time I have seen where '0001-01-01' is not a valid date when the field the data is in is not defined as *ISO, but as another data format, for example *YMD or *CYMD.

      Delete
    2. I did some more testing. This is a DDS created file. I then recreated the file using SQL and then validated it and no errors were found. I tried recreating the file using DDS just in case the problem is that the file was created so long ago (1995). The field is a type "L" date field and the DSPFFD says it is *ISO. So I believe the issue is with DDS files but not with SQL created tables.

      Delete
    3. Doug, You wouldn't get ANY errors if the file was created with DDL. Not only date fields but numeric fields also. DDS would allow you to do crazy stuff like let's say you had a numeric column for Soviet Union currency conversion factor. Since there is no such thing anymore you could have RPG overlay that numeric column with a character string and repurpose it for something else. Any attempt to do so with a DDL file would be stopped upon output. If you want proof of this created a file with a single 7,0 packed decimal column. Now create another file with no dds or ddl by CRTPF RCDLEN(4). In the RCDLEN created file add a row with 4 letters. Now do a CPYF from the RCDLEN file into the decimal column file. If that file was created with DDS it would let you. If that file was created with DDL it will not let you. Simple test.

      Delete
    4. Doug, if you created that file with DDL it won't allow any invalid numeric data in the first place.

      Delete
    5. OK - there is no point in using VALIDATE_FILE on a DDL table since it can not contain bad data. Perhaps VALIDATE_FILE ignores these tables altogether. However, I can still put '0001-01-01' in the date field of the DDL table with no problem and VALIDATE_FILE does not report a problem with that value. But '0001-01-01' is considered a bad date in DDS created tables. This renders the VALIDATE_xxxx table functions useless to me. We have many tables in our system that use '0001-01-01' as a pseudo null value. Such as date shipped in an order file.

      Delete
  10. Reynaldo Dandreb MedillaSeptember 4, 2022 at 10:13 AM

    that would be handy, thanks Simon

    ReplyDelete

To prevent "comment spam" all comments are moderated.
Learn about this website's comments policy here.

Some people have reported that they cannot post a comment using certain computers and browsers. If this is you feel free to use the Contact Form to send me the comment and I will post it for you, please include the title of the post so I know which one to post the comment to.