Pages

Wednesday, March 8, 2017

Using Union to combine data from two files into one View

sql union clause to combine data from more than one file

When, in the ERP application my employer uses, an order is closed and posted to General Ledger the order's data is copied from the live file to the history file, and then deleted from the live file. This becomes an inconvenience when I am asked to produce a list of orders for a customer over a date range; the eligible data could be in both files. I have always wanted a way to link the live and history files together so that I can get the data from one place in the order I want.

Like most IBM i developers I want to use SQL to get data rather than use RPG's data access operation codes. I want to link these files together using SQL as I can get large chunks of data in one SQL operation faster than I can in RPG. Fortunately there is something in SQL to give me what I need, the UNION clause, that will allow me to join two or more selects statements together.

But before I describe it I think it will be useful to give some information on the example files I will be using. I have two Order Header files, ORDHDR and ORDHDRH, two Order detail files, ORDDTL and ORDDTLH, and a Vendor file, VENDOR. Both of the Order Header and both of the Order Details files are identical, except for the record format names.

     * Order Header
01  A                                      UNIQUE
02  A          R RORDHDR
03  A            ORDNBR         7A
04  A            ORDDTE         8P 0
05  A            VENNBR        10A
06  A            DUEDTE    R               REFFLD(ORDDTE  *SRC)
07  A            ORDSTS         2S 0
08  A          K ORDNBR
09  A          K ORDDTE
     * Order Header History
01  A                                      REF(ORDHDR)
02  A                                      UNIQUE
03  A          R RORDHDRH
04  A            ORDNBR    R
05  A            ORDDTE    R
06  A            VENNBR    R
07  A            DUEDTE    R
08  A            ORDSTS    R
09  A          K ORDNBR
10  A          K ORDDTE

As these are only examples both of the files contain two records:

Order Header
ORDNBR       ORDDTE   VENNBR          ORDDTE   ORDSTS
200711   20,170,125   V1054       20,171,005     20
200855   20,170,126   V1054       20,170,901     30
Order Header History
ORDNBR       ORDDTE   VENNBR          ORDDTE   ORDSTS
110021   20,150,918   V1054       20,151,231     99
198860   20,151,203   V1054       20,160,202     99

I can join both of the files together using what I know as a multiformat logical file, see below. As you can see I just give the record format names of the two physical files and the key order I want.

01  A          R RORDHDR                   PFILE(ORDHDR)
02  A          K ORDDTE
03  A          K ORDNBR
     *
04  A          R RORDHDRH                  PFILE(ORDHDRH)
05  A          K ORDDTE
06  A          K ORDNBR

I can read the multiformat logical file in RPG, see below, and get the records in my sort order from both record formats if I read using the file name, rather than the record format names.

01  dcl-f ORDHDRL0 keyed ;

02  dow (1 = 1) ;
03    read ORDHDRL0 ;
04    if (%eof) ;
05      leave ;
06    endif ;
07    dsply ('Order No. = <' + ORDNBR + '>') ;
08  enddo ;

DSPLY  Order No. = <110021 >
DSPLY  Order No. = <198860 >
DSPLY  Order No. = <200711 >
DSPLY  Order No. = <200855 >

If I try to use the multiformat with SQL I get the following message:

SELECT * FROM ORDHDRL0

File ORDHDRL0 in *LIBL has more than one format.

What I want to do is create a SQL View of rows/records from the live and history files. I can select the columns/rows I want in a Select statement, but I cannot join the data from the two files in the manner I want to do. This is where the UNION clause comes to the rescue. When I use the UNION I can combine two Select statements together, for example this statement below will join the rows/records from the live and history files into one set of results.

SELECT * FROM ORDHDR
UNION
SELECT * FROM ORDHDRH

My View will be a bit more complex as I want it to contain the matching information from the Vendor file, give the columns/fields long names, and good column headings and field texts.

01  CREATE OR REPLACE VIEW MYLIB.ORDHDRV0 (
02    ORDER_NBR FOR "ORDNBR",
03    ORDER_ENTRY_DATE FOR "ORDDTE",
04    VENDOR_NBR FOR "VENNBR",
05    ORDER_DUE_DATE FOR "DUEDTE",
06    ORDER_STATUS FOR "ORDSTS",
07    VENDOR_NAME FOR "VENNME",
08    VENDOR_ADDRESS_1 FOR "VENADR1",
09    VENDOR_CITY FOR "VENCITY",
10    VENDOR_STATE FOR "VENSTATE",
11    VENDOR_ZIP FOR "VENPCDE",
12    VENDOR_COUNTRY FOR "VENCNTRY"
13  )
14  AS
15    SELECT A.ORDNBR, A.ORDDTE, A.VENNBR, A.DUEDTE, A.ORDSTS,
16           B.VENNAME, B.VENADR1, B.VENCITY, B.VENSTATE,
17           B.VENPCDE, B.VENCNTRY
18      FROM ORDHDR A LEFT OUTER JOIN VENDOR B
19        ON A.VENNBR = B.VENNBR
20    UNION
21    SELECT A.ORDNBR, A.ORDDTE, A.VENNBR, A.DUEDTE, A.ORDSTS,
22        B.VENNAME, B.VENADR1, B.VENCITY, B.VENSTATE,
23           B.VENPCDE, B.VENCNTRY
24      FROM ORDHDRH A LEFT OUTER JOIN VENDOR B
25        ON A.VENNBR = B.VENNBR
26    RCDFMT RORDHDRV0 ;

27  LABEL ON COLUMN ORDHDRV0 (
28           ORDER_NBR IS 'Order               No.',
29    ORDER_ENTRY_DATE IS 'Order Entry         date',
30          VENDOR_NBR IS 'Vendor              No.',
31      ORDER_DUE_DATE IS 'Order Due           date',
32        ORDER_STATUS IS 'Order               status',
33         VENDOR_NAME IS 'Vendor              name',
34    VENDOR_ADDRESS_1 IS 'Vendor              address 1',
35         VENDOR_CITY IS 'Vendor              city',
36        VENDOR_STATE IS 'Vendor              state',
37          VENDOR_ZIP IS 'Vendor              zip code',
38      VENDOR_COUNTRY IS 'Vendor              country'
39  ) ;

40  LABEL ON COLUMN ORDHDRV0 (
41           ORDER_NBR TEXT IS 'Order number',
42    ORDER_ENTRY_DATE TEXT IS 'Order entry date',
43          VENDOR_NBR TEXT IS 'Vendor number',
44      ORDER_DUE_DATE TEXT IS 'Order due date',
45        ORDER_STATUS TEXT IS 'Order status code',
46         VENDOR_NAME TEXT IS 'Vendor name',
47    VENDOR_ADDRESS_1 TEXT IS 'Vendor address address line 1',
48         VENDOR_CITY TEXT IS 'Vendor address city',
49        VENDOR_STATE TEXT IS 'Vendor address state',
50          VENDOR_ZIP TEXT IS 'Vendor address zip/postal code',
51      VENDOR_COUNTRY TEXT IS 'Vendor address country code'
52  ) ;

Line 1: As the IBM i I am using is running 7.3 I can use the CREATE OR REPLACE statement. If you are running an old version of the operating system you will need to just use CREATE.

Lines 2 – 12: As I have joined two files together I need to give the columns/fields I want to be in this View. I am giving each column a long name and I am using the name of the file's fields for the short name.

Lines 15 – 19: A Union contains two or more subselects. This Select is the first one and joins the live Order Header file to the Vendor file.

Line 20: This is the all-important UNION clause that joins the result set from the first subselect to the second subselect's results.

Line 21 – 25: This subselect is identical to the first subselect, except this one uses the Order Header history file.

Line 26: I don't have to give this a record format name, but I do in case I ever need to read this View using RPG. I cannot think why I would, but you never know.

Lines 27 – 39: This where I am giving the columns/fields the equivalent of column headings.

Line 40 – 52: Here I am giving the columns/fields the equivalent of field text.

Having created by view using the Run SQL Statement command, RUNSQLSTM, I can check to see if the files are joined and union-ed the way I want:

SELECT * FROM ORDHDRV0 
 ORDER BY ORDER_NBR,ORDER_ENTRY_DATE


Order    Order Entry   Vendor   Order Due   Order   Vendor                Vendor
No.      date          No.      date        status  name                  address 1
110021    20,150,918   V1054    20,151,231     99    OFFICE SUPPLY INC.   116 FIRST ST
198860    20,151,203   V1054    20,160,202     99    OFFICE SUPPLY INC.   116 FIRST ST
200711    20,170,125   V1054    20,171,005     20    OFFICE SUPPLY INC.   116 FIRST ST
200855    20,170,126   V1054    20,170,901     30    OFFICE SUPPLY INC.   116 FIRST ST

It should come as no surprise that is there is an Order Header and Order Header History files there are also an Order Details and Order Detail History file too.

     * Order Detail
01  A                                      REF(ORDHDR)
02  A                                      UNIQUE
03  A          R RORDDTL
04  A            ORDNBR    R
05  A            ORDDTE    R
06  A            LINNBR         5P 0
07  A            PART          15A
08  A            LINQTY         7P 0
09  A            UPRICE         7P 3
10  A            LINSTS    R               REFFLD(ORDSTS)
11  A          K ORDNBR
12  A          K ORDDTE
13  A          K LINNBR
     * Order Detail History
01  A                                      REF(ORDDTL)
02  A                                      UNIQUE
03  A          R RORDDTLH
04  A            ORDNBR    R
05  A            ORDDTE    R
06  A            LINNBR    R
07  A            PART      R
08  A            LINQTY    R
09  A            UPRICE    R
10  A            LINSTS    R
11  A          K ORDNBR
12  A          K ORDDTE
13  A          K LINNBR

As these are only examples both of the files contain a few records:

Order Detail
ORDNBR       ORDDTE   LINNBR   PART        LINQTY    UPRICE   ORDSTS
200711   20,170,125        1   YW0717          20      .500     20
200711   20,170,125        2   PE11834          1     1.250     20
200711   20,170,125        3   PP1508-B         2     6.000     20
200855   20,170,126        1   PE34415         10      .300     40
200855   20,170,126        2   HE21007         45     1.500     20
Order Detail History
ORDNBR       ORDDTE   LINNBR   PART        LINQTY    UPRICE   ORDSTS
110021   20,150,918        1   LX3144         400      .050     99
198860   20,151,203        1   KG17240         13    25.000     99
198860   20,151,203        2   ES11407         24    11.400     99

The code needed to union these two files together is just like the previous View's definition.

01  CREATE OR REPLACE VIEW MYLIB.ORDDTLV0 (
02    ORDER_NBR FOR "ORDNBR",
03    ORDER_ENTRY_DATE FOR "ORDDTE",
04    LINE_NBR FOR "LINENBR",
05    PART_NBR FOR "PART",
06    LINE_QTY FOR "LINEQTY",
07    UNIT_PRICE FOR "UNITPRICE",
08    EXTENDED_PRICE FOR "EXTPRICE",
09    LINE_STATUS FOR "LINSTS"
10  )
11  AS
12    SELECT ORDNBR,ORDDTE,LINNBR,PART,LINQTY,
13           UPRICE,(LINQTY * UPRICE),LINSTS
14      FROM ORDDTL
15    UNION
16    SELECT ORDNBR,ORDDTE,LINNBR,PART,LINQTY,
17           UPRICE,(LINQTY * UPRICE),LINSTS
18      FROM ORDDTLH
19    RCDFMT RORDDTLV0 ;

20  LABEL ON COLUMN ORDDTLV0 (
21           ORDER_NBR IS 'Order               No.',
22    ORDER_ENTRY_DATE IS 'Order Entry         date',
23            LINE_NBR IS 'Line                No.',
24            PART_NBR IS 'Part                number',
25            LINE_QTY IS 'Line                qty',
26          UNIT_PRICE IS 'Unit                price',
27      EXTENDED_PRICE IS 'Extended            price',
28         LINE_STATUS IS 'Line                status'
29  ) ;

30  LABEL ON COLUMN ORDDTLV0 (
31           ORDER_NBR TEXT IS 'Order number',
32    ORDER_ENTRY_DATE TEXT IS 'Order entry date',
33            LINE_NBR TEXT IS 'Order line number',
34            PART_NBR TEXT IS 'Part number',
35            LINE_QTY TEXT IS 'Order line quantity',
36          UNIT_PRICE TEXT IS 'Unit price',
37      EXTENDED_PRICE TEXT IS 'Extended price',
38         LINE_STATUS TEXT IS 'Order line status'
39  ) ;

Notice that on lines 14 and 17 I have calculated the extended price. By doing this in the View every time I use the View the extended price is recalculated, and I don't have to do the calculation in whatever program is using this View.

Once created the new View combines the data from the two Details files:

SELECT * ORDDTLV0
 ORDER BY ORDER_NBR,ORDER_ENTRY_DATE,LINE_NBR


Order    Order Entry   Line   Part       Line       Unit     Extended   Line
No.      date          No.    number      qty        price   price      status
110021    20,150,918      1   LX3144      400        .050      20.000     99
198860    20,151,203      1   KG17240      13      25.000     325.000     99
198860    20,151,203      2   ES11407      24      11.400     273.600     99
200711    20,170,125      1   YW0717       20        .500      10.000     20
200711    20,170,125      2   PE11834       1       1.250       1.250     20
200711    20,170,125      3   PP1508-B      2       6.000      12.000     20
200855    20,170,126      1   PE34415      10        .300       3.000     40
200855    20,170,126      2   HE21007      45       1.500      67.500     20

As I have two views for the Header and Detail data I want to create third View that will combine the all of the data into one.

"Won't this be a performance hit?" I can hear some of you ask. "Surely it is not efficient to build a View over other views. You should build a View that joins all of these files together."

One of the wonderful parts of the IBM i operating system is the DB2 for i database engine. Whenever a SQL statement is run the "engine" works out the optimal way to get the data. When I build Views over other Views the "engine" does not build each View in turn, it works out the most efficient way to get the data from the available files and the access paths they have over them and gets it in one go. That is very cool!

Therefore, my third View is a combination of the two earlier ones.

01  CREATE OR REPLACE VIEW MYLIB.ORDALLV0 (
02    ORDER_NBR FOR "ORDNBR",
03    ORDER_ENTRY_DATE FOR "ORDDTE",
04    LINE_NBR FOR "LINENBR",
05    ORDER_STATUS FOR "ORDSTS",
06    LINE_STATUS FOR "LINSTS",
07    ORDER_DUE_DATE FOR "DUEDTE",
08    VENDOR_NBR FOR "VENNBR",
09    PART_NBR FOR "PART",
10    LINE_QTY FOR "LINEQTY",
11    UNIT_PRICE FOR "UNITPRICE",
12    EXTENDED_PRICE FOR "EXTPRICE",
13    VENDOR_NAME FOR "VENNME",
14    VENDOR_ADDRESS_1 FOR "VENADR1",
15    VENDOR_CITY FOR "VENCITY",
16    VENDOR_STATE FOR "VENSTATE",
17    VENDOR_ZIP FOR "VENPCDE",
18    VENDOR_COUNTRY FOR "VENCNTRY"
19  )
20  AS
21    SELECT A.ORDER_NBR,A.ORDER_ENTRY_DATE,B.LINE_NBR,
22           A.ORDER_STATUS,B.LINE_STATUS,
23           A.ORDER_DUE_DATE,A.VENDOR_NBR,
24           B.PART_NBR,B.LINE_QTY,B.UNIT_PRICE,
25           B.EXTENDED_PRICE,
26           A.VENDOR_NAME,A.VENDOR_ADDRESS_1,
27           A.VENDOR_CITY,A.VENDOR_STATE,A.VENDOR_ZIP,
28           A.VENDOR_COUNTRY
29      FROM ORDHDRV0 A LEFT OUTER JOIN ORDDTLV0 B
30        ON A.ORDER_NBR = B.ORDER_NBR
31       AND A.ORDER_ENTRY_DATE = B.ORDER_ENTRY_DATE
32    RCDFMT RORDALLV0 ;

After creating this view I can use it to see the information about any part of the order:

SELECT * FROM ORDALLV0
 ORDER BY ORDER_NBR,ORDER_ENTRY_DATE,LINE_NBR


Order    Order Entry   Line   Order   Line     Order Due   Vendor   Part       Line
No.      date          No.    status  status   date        No.      number     qty
110021    20,150,918      1     99      99    20,151,231   V1054    LX3144      400
198860    20,151,203      1     99      99    20,160,202   V1054    KG17240      13
198860    20,151,203      2     99      99    20,160,202   V1054    ES11407      24
200711    20,170,125      1     20      20    20,171,005   V1054    YW0717       20
200711    20,170,125      2     20      20    20,171,005   V1054    PE11834       1
200711    20,170,125      3     20      20    20,171,005   V1054    PP1508-B      2
200855    20,170,126      1     30      40    20,170,901   V1054    PE34415      10
200855    20,170,126      2     30      20    20,170,901   V1054    HE21007      45

I am finding I am using Views more and more as I can create one just the way I want and use it in as many places as I want.

 

You can learn more about the SQL's UNION clause from the IBM website here.

 

This article was written for IBM i 7.3, and should work for earlier releases too.

7 comments:

  1. I cant help but think of Temporal tables while reading this example. I am sure in the real world use you would be using that instead of the UNION, but this is a good easy example of union with 2 identical tables ....the next is making a union out of various tables some missing columns etc. ...now thats fun.

    ReplyDelete
    Replies
    1. A Temporal table would be a great thing to use in this situation if it was created from new. But, like many IBM i developers the ERP application pre-dates Temporal tables.

      Delete
  2. I am a big fan of these union views over current and history files. I usually use "UNION ALL" so it does not try to remove any duplicate records. I also like to add a field to let me know where the data came from. I use a literal in each of the select statements like 'CURR' or 'HIST' and use "as Source" as the column name. This way if I ever need to know which file the record came from I will know.

    ReplyDelete
    Replies
    1. Good advice about the column names.

      The "UNION ALL" will be addressed in a future post.

      Delete
  3. Oh when will we get an edit code in SQL so we don't have to look at commas in numbers like numeric dates and times? DDS can do this (EDTCDE) and SQL gladly honors it in the result set. I really don't care to wrap DIGITS around them in SELECTs. Ringer.

    ReplyDelete
  4. This is very, very cool and I will definitely be playing around with it! Although, because of the typical volume of order history, I don't think I would attempt running this during business hours.

    ReplyDelete

To prevent "comment spam" all comments are moderated.
Learn about this website's comments policy here.

Some people have reported that they cannot post a comment using certain computers and browsers. If this is you feel free to use the Contact Form to send me the comment and I will post it for you, please include the title of the post so I know which one to post the comment to.