Wednesday, November 9, 2016

Coping with null with derived columns

using case with sql when derived column is null

I was assisting a colleague creating a SQL View when I encountered something I had not thought of before. He wanted to create a view with a derived column based on a column in the "right" table of a LEFT OUTER JOIN. Everything looked fine until the LEFT OUTER JOIN did not find a row in the "right" table. We could easily stop a null appearing for the columns from the "right" table, but what about the derived column?

Null is a concept that many people struggle with. I often have had to explain what null is and why it is different from blank or zero to older RPG programmers, who always have a confused or dismissive look in their faces. Users just don't understand the concept. This is why when I create output I always put some value in the column that comes up null.

Let me give a simple scenario to illustrate what I mean. I need to identify all of employees, past and present, in the Employee Master file do not belong to valid Department or if it is a valid Department is the Department effective date valid? There are two files:

  1. EMPLMST – Employee master
  2. DEPTMST – Department master

This very simple Employee Master file contains the following fields:

01  A          R EMPLMSTR
02  A            EMPLNBR        5P 0
03  A            LASTNME       30A
04  A            FIRSTNME      20A
05  A            MIDINITL       1A
06  A            DEPARTMENTR               REFFLD(DEPARTMENT DEPTMST)

Line 6: For those of you unfamiliar with the REFFLD this means that the field DEPARTMENT in this file has the same attributes as the field DEPARTMENT in the Department Master.

The Employee Master contains the following employees:

EMPLNBR   LASTNME     FIRSTNME     MIDINITL  DEPARTMENT
      1   CRUZ        ANNA            M        H0100
      2   SMITH       JOHN            D        H0101
      3   GRAY        DARNESHA        A        H0200

The Department Master is also very simple:

01  A          R DEPTMSTR
02  A            DEPARTMENT     5A
03  A            DEPTNAME      30A
04  A            EFFDATE        8P 0

This file contains these departments:

DEPARTMENT  DEPTNAME           EFFDATE
  H0100     GROCERY         20,000,718
  H0101     ELECTRONICS              0

As this is an old file the "date" is really a number (8,0), therefore, it can contain any numeric value, including those that are not valid date. The Electronic department clearly shows this with a "date" of zero.

I want to create a view that contains the following columns:

  1. Employee Number from Employee Master
  2. Employee Name, formatted as "Last name, First Name I", from Employee Master
  3. Department Id from the Employee Master, and this is the link to Department Master
  4. Department Name from Department Master
  5. Department Effective Date, the date the department was created, from Department Master.

I want to show the "date" in *USA format (MM/DD/YYYY) in the View as the people using the report are used to seeing the date in this format. I know there are many "date" numbers that are not valid dates, so rather than convert the "date" field to a real date, and risk errors, I am just going to convert it to alphanumeric and substring it to show like a *USA date.

My first attempt at a View to present this information looks like:

01  CREATE OR REPLACE VIEW MYLIB.TABLE1 (
02    EMPLOYEE_NUMBER FOR "EMPLNBR",
03    EMPLOYEE_NAME FOR "EMPNAME",
04    DEPARTMENT,
05    DEPARTMENT_NAME FOR "DEPTNAME",
06    DEPARTMENT_EFFECTIVE_DATE FOR "DEPTEFFDTE"
07  )
08  AS SELECT A.EMPLNBR,
09            MAKENAME(A.LASTNME,A.FIRSTNME,A.MIDINITL),
10            A.DEPARTMENT,
11            IFNULL(B.DEPTNAME,'Not found'),
12            SUBSTR(DIGITS(B.EFFDATE),5,2) CONCAT '/' CONCAT
13            SUBSTR(DIGITS(B.EFFDATE),7,2) CONCAT '/' CONCAT
14            SUBSTR(DIGITS(B.EFFDATE),1,4)
15       FROM EMPLMST A LEFT OUTER JOIN DEPTMST B
16            ON A.DEPARTMENT = B.DEPARTMENT
17  RCDFMT TABLE1R ;

Line 1: As I am using an IBM i that is running version 7.3 I can use the CREATE OR REPLACE VIEW. Those of you using a version before IBM i 7.2 will have to use just CREATE VIEW.

Lines 2 – 7: Here I am defining the columns that will be in the View with both their long and short names.

Lines 8 – 16: Is the Select statement that defines the data within the View.

Line 9: I am making use of the User Define Function I described in an earlier post to put the Employee name together.

Line 11: B.DEPARTMENT will be null if there is no matching Department file record for the value in the Employee Master. I can change the null to another value, using the IFNULL, to make it easier for users to understand what happened.

Lines 12 – 14: Is where I am converting the numeric representation of a "date" to alphanumeric, with slashes ( / ), *USA format. I use the DIGITS to convert the numeric EFFDTE to character before I substring, SUBSTR, the part of the "date" I need. Then use CONCAT to concatenate it all together.

Line 15 and 16: This is where I declare the type of join and the fields I use to link the two files together.

Line 17: I always give a record format name just in case at some point in the future this View needs to be read in an RPG program.

I can now use a Select statement to view the data in the View.

  SELECT * FROM MYLIB.VIEW1

NUMBER  EMPLOYEE_NAME      DEPARTMENT  DEPARTMENT_NAME  EFFECTIVE_DATE
     1  CRUZ, ANNA M         H0100     GROCERY          07/18/2000
     2  SMITH, JOHN D        H0101     ELECTRONICS      00/00/0000
     3  GRAY, DARNESHA A     H0200     Not found        -

The Department Effective Date of Darnesha is null, which is shown as a hyphen ( - ). I don't want it to display null as I don't want to have to explain what null is again. I need to put something that will make more sense to the users instead.

Searching the IBM KnowledgeCenter I found that I can use a CASE to test if a column's value is null. This allows me to change the section between lines 12 – 14 with:

12            CASE WHEN B.EFFDATE IS NULL THEN 'N/A'
13            ELSE
14              SUBSTR(DIGITS(B.EFFDATE),5,2) CONCAT '/' CONCAT
15              SUBSTR(DIGITS(B.EFFDATE),7,2) CONCAT '/' CONCAT
16              SUBSTR(DIGITS(B.EFFDATE),1,4)
17            END

Line 12: The CASE statement with the IS NULL allows me to give a value to the column if the join was not successful.

Line 14 – 16: Is the same code as before.

Line 17: All CASE statements must end with an END.

Now when I perform the same Select statement as before I see:

NUMBER  EMPLOYEE_NAME      DEPARTMENT  DEPARTMENT_NAME  EFFECTIVE_DATE
     1  CRUZ, ANNA M         H0100     GROCERY          07/18/2000
     2  SMITH, JOHN D        H0101     ELECTRONICS      00/00/0000
     3  GRAY, DARNESHA A     H0200     Not found        N/A

It is a lot easier for me to explain that "N/A" means Not Available as the department for Darnesha's is not valid.

 

This article was written for IBM i 7.3, and should work for earlier releases too.

8 comments:

  1. I would suggest using the COALESCE function instead, which is more standard compliant than IFNULL. Moreover, COALESCE accepts a list of arguments, while IFNULL can check only one.

    ReplyDelete
    Replies
    1. Like minds...I didn't see your comment before I added mine...we say the same things ;-).

      Delete
  2. Do you have to use CASE or can you use IFNULL/COALESCE for the date column as you did for Department Name?

    Is it proper to format data in a view? I'm referring to your date formatting. I don't create views and I'm wondering if that's best practice or if it is best to leave the column in its original format and then format it when you use the view.

    ReplyDelete
    Replies
    1. i am of the opinion that I want as much as possible of the formatting to be performed in the View. I can just take the data from the View, dump it into a data structure array, and just use it without any more formatting, it is that easy.

      Delete
  3. Simon, very good article. I would mention that instead of using IFNULL() it would be good to instead use COALESCE() because the latter is compatible across all of DB2 and other RDBMS's. Additionally, COALESCE() can be stacked with many more values, which is great when using with many LEFT OUTER JOINed tables. Again, good job!

    ReplyDelete
  4. Here are 2 other ways to format YYYYMMDD as 'MM/DD/YYYY'.
    The '000000' string is HHMMSS to complete the timestamp.

    char(Date(concat(digits(pmdate),'000000')),USA)
    -or a shorter version-
    char(Date(concat(pmdate,'000000')),USA)

    since SQL will cast the number to a string for you.

    Ringer

    ReplyDelete
  5. Hi guys, I hope all of you are very well.

    Currently I am programming with SQL embedded in RPGLE, and I had the same problem. I solved it adding a new variable, it is called NULLINDS.

    The declaration way is the next.

    * SQL Structure.
    DSQLSDO DS
    DDS_CIA 2 0
    DDS_NOM 40
    DDS_OPE 5 0
    DDS_IMP 11 2
    DDS_OET 5 0
    DDS_IET 11 2
    DDS_OGB 5 0
    DDS_IGB 11 2
    DDS_OAP 5 0
    DDS_IAP 11 2
    *
    DNULLINDS S 5I 0 DIM(10)

    The size of it depends on how many columns has your SQL Structure.

    See you guys.

    ReplyDelete
    Replies
    1. Have you checked out the post SQL and null that talks about null handlilng in SQL?

      Delete

To prevent "comment spam" all comments are moderated.
Learn about this website's comments policy here.

Some people have reported that they cannot post a comment using certain computers and browsers. If this is you feel free to use the Contact Form to send me the comment and I will post it for you, please include the title of the post so I know which one to post the comment to.