 
When anyone asks for information that they are going to "pull" from the IBM i into another database or to interface to another application on the IBM i I do not want them to access the "live" files for several reasons:
- Control the information they can get to, don't show them what they do not need
- If there are calculations or other mapping that needs to be performed perform them, rather than rely on them to do it and then find they have not done so in the right manner
- Map data from the files and tables into the type of data format they want
- Prevent them from being able to insert, change, or delete data from files and tables
In my opinion the perfect tool for this is a SQL View. I can build it the way I want and tell them to use that.
A recent example was a request for the date of the latest test runs of a certain job on the development server. This is complex as there are ten environments (library lists of related libraries) and they wanted the information from all. The ERP application writes the data from each run into a DDS file. This file contains a lot more than the start and end dates selected for the job, which is not needed and I am not going to give it.
All the data that was needed was:
- Environment designator – which is not held in the file, but was hard coded as a single character
- From date – packed 6,0 field
- To date – packed 6,0
- Submitted date – date
- Submitted time – time
In these examples I am not going to show how to combine ten environments of data. What works for three can be easily modified to work for ten.
As the layout of the file and data is all the same I can combine the data into SQL View using the UNION clause:
| 01 SELECT 'A',A.* FROM LIBRARY1.TESTFILE A 02 UNION 03 SELECT 'B',B.* FROM LIBRARY2.TESTFILE B 04 UNION 05 SELECT 'C',C.* FROM LIBRARY3.TESTFILE C 06 ORDER BY SBMDATE DESC,SBMTIME DESC ; | 
By all means the first rows shown are the most recent, but the results include every row from the three files.
| Top 5 results only 00001 FRMDTE THRDTE SBMDATE SBMTIME ----- ------ ------ ---------- -------- C 42921 43021 2021-05-01 07.55.15 A 42421 42521 2021-04-26 09.48.45 B 42121 42221 2021-04-21 13.48.34 B 41021 42021 2021-04-21 12.56.27 C 40121 41521 2021-04-16 17.39.58 | 
My first attempt of only returning the first row from each environment was:
| 01 SELECT 'A',A.* FROM LIBRARY1.TESTFILE A 02 LIMIT 1 03 UNION 04 SELECT 'B',B.* FROM LIBRARY2.TESTFILE B 05 LIMIT 1 06 UNION 07 SELECT 'C',C.* FROM LIBRARY3.TESTFILE C 08 LIMIT 1 09 ORDER BY SBMDATE DESC,SBMTIME DESC ; | 
Lines 2, 5, and 8: The LIMIT instructs the statement to only return the first record from each SQL statement.
I am performing all of these test in ACS's "Run SQL scripts" and the error returned looks like:
| SQL State: 42601 Vendor Code: -199 Message: [SQL0199] Keyword UNION not expected. | 
This was easily overcome by making each part of the statement a sub-select:
| 01 (SELECT 'A',A.* FROM LIBRARY1.TESTFILE A 02 LIMIT 1) 03 UNION 04 (SELECT 'B',B.* FROM LIBRARY2.TESTFILE B 05 LIMIT 1) 06 UNION 07 (SELECT 'C',C.* FROM LIBRARY3.TESTFILE C 08 LIMIT 1) 09 ORDER BY SBMDATE DESC,SBMTIME DESC ; | 
The results were exactly what was desired:
| 00001 FRMDTE THRDTE SBMDATE SBMTIME ----- ------ ------ ---------- -------- C 42921 43021 2021-05-01 07.55.15 A 42421 42521 2021-04-26 09.48.45 B 42121 42221 2021-04-21 13.48.34 | 
Now there is one of those gotchas. The job can be run a maximum of twice per day, most of time it is only run once, but on occasion it can be run a second time. To be able to return the two most recent results from each environment only required me to change the number of records returned by the LIMIT.
| 01 (SELECT 'A',A.* FROM LIBRARY1.TESTFILE A 02 LIMIT 2) 03 UNION 04 (SELECT 'B',B.* FROM LIBRARY2.TESTFILE B 05 LIMIT 2) 06 UNION 07 (SELECT 'C',C.* FROM LIBRARY3.TESTFILE C 08 LIMIT 2) 09 ORDER BY 1,4 DESC,5 DESC ; | 
Line 9: As I am just playing around the statement to get what I want rather than type in the column names in the ORDER BY I can just give the number to the place that column is in the results. In this example I am sorting by the first column, environment, fourth column, submit date, and fifth column, submit time.
The results now show two rows from each of the files:
| 00001 FRMDTE THRDTE SBMDATE SBMTIME ----- ------ ------ ---------- -------- A 42421 42521 2021-04-26 09.48.45 A 30121 33121 2021-04-02 16.10.45 B 42121 42221 2021-04-21 13.48.34 B 41021 42021 2021-04-21 12.56.27 C 42921 43021 2021-05-01 07.55.15 C 40121 41521 2021-04-16 17.39.58 | 
Now to turn this statement into one to create a SQL View:
| 01 CREATE OR REPLACE VIEW MYLIB.UNION_VIEW_1 02 FOR SYSTEM NAME "UNIONVIEW1" 03 (ENVIRONMENT FOR COLUMN "ENVIR", 04 FROM_DATE FOR COLUMN "FRMDTE", 05 THRU_DATE FOR COLUMN "THRDATE", 06 SUBMIT_DATE FOR COLUMN "SBMDATE", 07 SUBMIT_TIME FOR COLUMN "SBMTIME") 08 AS 09 ( 10 (SELECT 'A',A.* FROM LIBRARY1.TESTFILE A 11 LIMIT 2) 12 UNION 13 (SELECT 'B',B.* FROM LIBRARY2.TESTFILE B 14 LIMIT 2) 15 UNION 16 (SELECT 'C',C.* FROM LIBRARY3.TESTFILE C 17 LIMIT 2) 18 ORDER BY 1,4 DESC,5 DESC 19 ) ; | 
Line 2: I always like to give my SQL Views both a long name, UNION_VIEW_1, and a system short name, UNIONVIEW1.
Lines 3 – 7: As I have created a column, Environment, in the results that is not in the definition for the file(s) used I need to define the columns for the View. I do this many times even if there are no additional columns to give View's columns relevant long and system short names.
Lines 10 – 18: This is the SQL statement that I used in "Run SQL scripts" to generate the results I want.
Alas, when I tried to create this View I received the following message:
| SQL State: 428FJ Vendor Code: -20211 Message: [SQ20211] Clause not allowed for CREATE VIEW. Cause . . . . . : ORDER BY, FETCH FIRST n ROWS, and OFFSET are not allowed in the outer full select of a view. | 
Here it is the ORDER BY that is causing the problem. What I need to do is define it in each of the sub-selects, lines 11, 15, and 19.
| 01 CREATE OR REPLACE VIEW MYLIB.UNION_VIEW_1 02 FOR SYSTEM NAME "UNIONVIEW1" 03 (ENVIRONMENT FOR COLUMN "ENVIR", 04 FROM_DATE FOR COLUMN "FRMDTE", 05 THRU_DATE FOR COLUMN "THRDATE", 06 SUBMIT_DATE FOR COLUMN "SBMDATE", 07 SUBMIT_TIME FOR COLUMN "SBMTIME") 08 AS 09 ( 10 (SELECT 'A',A.* FROM LIBRARY1.TESTFILE A 11 ORDER BY SBMDATE DESC,SBMTIME DESC 12 LIMIT 2) 13 UNION 14 (SELECT 'B',B.* FROM LIBRARY2.TESTFILE B 15 ORDER BY SBMDATE DESC,SBMTIME DESC 16 LIMIT 2) 17 UNION 18 (SELECT 'C',C.* FROM LIBRARY3.TESTFILE C 19 ORDER BY SBMDATE DESC,SBMTIME DESC 20 LIMIT 2) 21 ) ; | 
Now the View is created and I can query its data with the following SQL statement:
| SELECT * FROM UNION_VIEW_1 ORDER BY 1,4 DESC,5 DESC | 
The results are:
| ENVIRONMENT FROM_DATE THRU_DATE SUBMIT_DATE SUBMIT_TIME ----------- --------- --------- ----------- ----------- A 42421 42521 2021-04-26 09.48.45 A 30121 33121 2021-04-02 16.10.45 B 42121 42221 2021-04-21 13.48.34 B 41021 42021 2021-04-21 12.56.27 C 42921 43021 2021-05-01 07.55.15 C 40121 41521 2021-04-16 17.39.58 | 
I need to provide the View's column definitions. Fortunately I can get that from the SYSCOLUMNS View:
| 01 SELECT TABLE_NAME,COLUMN_NAME,SYSTEM_COLUMN_NAME, 02 DATA_TYPE,LENGTH,NUMERIC_SCALE 03 FROM QSYS2.SYSCOLUMNS 04 WHERE TABLE_SCHEMA = 'MYLIB' 05 AND TABLE_NAME = 'UNION_VIEW_1' | 
Which returns to me:
| TABLE SYSTEM_ DATE_ NUMERIC _NAME COLUMN_NAME COLUMN_NAME TYPE LENGTH _SCALE ------------ ----------- ----------- ------- ------ ------- UNION_VIEW_1 ENVIRONMENT ENVIR VARCHAR 1 | 
Personally I dislike using decimal values masquerading as dates in a View like this. I am going to redefine them as true dates. And just because I can I am going to redefine the Environment from VARCHAR to CHAR. This is going to be a long statement, but you will recognize a most of it from the other examples I have shown above.
| 01 CREATE OR REPLACE VIEW MYLIB.UNION_VIEW_2 02 FOR SYSTEM NAME "UNIONVIEW2" 03 (ENVIRONMENT FOR COLUMN "ENVIR", 04 FROM_DATE FOR COLUMN "FRMDTE", 05 THRU_DATE FOR COLUMN "THRDATE", 06 SUBMIT_DATE FOR COLUMN "SBMDATE", 07 SUBMIT_TIME FOR COLUMN "SBMTIME") 08 AS 09 ( 10 WITH T1 (C1,C2,C3,C4,C5) AS 11 ((SELECT 'A',A.* FROM LIBRARY1.TESTFILE A 12 ORDER BY SBMDATE DESC,SBMTIME DESC 13 LIMIT 2) 14 UNION 15 (SELECT 'B',B.* FROM LIBRARY2.TESTFILE B 16 ORDER BY SBMDATE DESC,SBMTIME DESC 17 LIMIT 2) 18 UNION 19 (SELECT 'C',C.* FROM LIBRARY3.TESTFILE C 20 ORDER BY SBMDATE DESC,SBMTIME DESC 21 LIMIT 2)), 22 T2 (C1,C2,C3,C4,C5) AS 23 (SELECT CAST(C1 AS CHAR(1)), 24 DATE(TIMESTAMP_FORMAT(DIGITS(C2),'MMDDYY')), 25 DATE(TIMESTAMP_FORMAT(DIGITS(C3),'MMDDYY')), 26 C4,C5 27 FROM T1 28 ORDER BY C1,C4 DESC,C5 DESC) 29 SELECT * FROM T2 30 ) ; | 
Lines 1 – 9: The same as before, but I am calling this View a different name UNION_VIEW_2.
Lines 10 – 29: I have decided to do the part that gets the rows and format them in a Common Table Expression, CTE.
Part 1: Get the data
Line 10: All CTE start with WITH followed by a table name. As I like to keep things simple, K.I.S.S., I always call the first one T1. What follows the table name is the list of the columns within it. It does not matter what I call them, so to illustrate I am just calling the resulting columns C1 - C5.
Lines 11 – 21: This is the same SQL statement as before, but as it is being used in a temporary table, T1, it is enclosed with parenthesis ( ( ) ).
Line 21: Notice the comma at the end of this line.
Part 2: Format the data
Line 22: The second table does not need the WITH. The name of the table T2 is followed by the list of the columns in will contain.
Line 23: I am using the CAST to redefine the Environment column from VARCHAR(1) to CHAR(1).
Lines 24 and 25: Here I am converting those numbers masquerading as dates to real dates.
Lines 26: Submit date and time columns.
Line 27: The data comes from the temporary table I created earlier, T1.
Line 28: As I am with the temporary table definition I can sort the results by environment, submit date, and submit time. As this is the last temporary table definition there is no comma at the end of it.
Line 29: This Select statement "displays" the rows for the View, which will use the column names defined at the top of the View.
Why did I do the conversions in T2 rather than T1? If I had performed the conversions in T1 then all of the columns in the three TESTFILE would have been converted before the first two results would have been returned. As these files can contain large amounts of data that is a lot of processing just to return two rows. By performing the conversion in T2 I only convert the rows in just six results.
If I query the new View using:
| SELECT * FROM UNION_VIEW_2 | 
I can see that the from and through dates are now true dates:
| ENVIRONMENT FROM_DATE THRU_DATE SUBMIT_DATE SUBMIT_TIME ----------- ---------- ---------- ----------- ----------- A 2021-04-24 2021-04-25 2021-04-26 09.48.45 A 2021-03-01 2021-03-31 2021-04-02 16.10.45 B 2021-04-21 2021-04-22 2021-04-21 13.48.34 B 2021-04-10 2021-04-20 2021-04-21 12.56.27 C 2021-04-29 2021-04-30 2021-05-01 07.55.15 C 2021-04-01 2021-04-15 2021-04-16 17.39.58 | 
I can use the SYSCOLUMNS View to confirm that my conversion of the environment column was successful:
| 
SELECT TABLE_NAME,COLUMN_NAME,SYSTEM_COLUMN_NAME,
       DATA_TYPE,LENGTH,NUMERIC_SCALE
  FROM QSYS2.SYSCOLUMNS
 WHERE TABLE_SCHEMA = 'MYLIB'
   AND TABLE_NAME = 'UNION_VIEW_2'
 | 
The results show that environment is now CHAR(1), and confirms that all the dates are dates.
| TABLE SYSTEM_ DATE_ NUMERIC _NAME COLUMN_NAME COLUMN_NAME TYPE LENGTH _SCALE ------------ ----------- ----------- ----- ------ ------- UNION_VIEW_2 ENVIRONMENT ENVIR CHAR 1 | 
I can now tell the person I am working with to use the second View as there is only the data I want to share, and is formatted in the manner we both want.
This article was written for IBM i 7.4, and should work for some earlier releases too.
 


 
You might consider using UNION ALL for performance when you know you won't be unioning any duplicate rows, which is guaranteed by the environment variable in your query.
ReplyDeleteWhat we do is create ALIAS's over the various tables and UNION them...
ReplyDeleteCTE’S Allow step by step implementation, testing, debugging , performance analysis of complex SQL Queries. Use them to clearly state your steps. There is a limit on number of unions. Casting the environment code as CHAR(‘A’,1) as Environment. There is an issue with the VIEW UNION crossing libraries, change management will have issue and backups, and restores may have issues. Creating the Alias’s might be a way to avoid the view from having multiple libraries referenced in object. I did something similar on NN IBM i’s but I submitted a remote job on each to have them push their DATA into central repository/table, therefore NN systems simultaneously aggregated and formatted their data within a few seconds into central repository/table.
ReplyDeleteThe ERP when it writes data into table might be redirected to central table which might simplify this. It would have other advantages too.
Avoid DDM for this redirection, overriding could be done in RPGLE Without a CL command ERP is using RPGLE, If using SQL, the alias would work.
I like how you started - the 1-4 rules.