Wednesday, May 8, 2024

Retrieving file's keys with SQL

I was asked how I could present the following information in an "easy to use" manner for all of the "files" in a library:

  • Library name
  • File name, or Table, Index, etc. name
  • Type of file
  • When it was last used
  • File's key fields

The easiest way to get this information is using SQL. The first three pieces of information can be retrieved from the SYSFILES View. The last used information can come from the OBJECT_STATISTICS Table function. Getting the file's key fields is not in any of the usual SQL Views about fields/columns or files/tables.

There is a file that contains the keys for all files: QADBKATR. This is a logical file built over the file QADBKFLD. On all the IBM i partitions I use for testing this posts I found that I am not authorized to the physical file on any, and I am authorized to the logical file on all but one. Therefore, my examples will use QADBKATR.

I am going to provide the data elements asked for from all the files in my library, MYLIB. I have five "file" type objects:

  • TESTFILE:  DDS physical file
  • TESTFILEL0:  DDS logical file built over TESTFILE
  • TESTVIEW:  SQL View built over TESTFILE. Being a View it does not have a key
  • TESTTABLE:  SQL DDL Table
  • TESTINDEX:  SQL DDL Index built over TESTTABLE

To list the keys for those files I can use the following SQL statement:

01  SELECT DBKLIB AS "Library",
02         DBKFIL AS "File",
03         DBKFMT AS "Rcd format",
04         DBKFLD AS "Field",
05         DBKPOS AS "Key seq"
06    FROM QSYS.QADBKATR
07   WHERE DBKLIB = 'MYLIB'
08   ORDER BY DBKLIB,DBKFIL,DBKPOS

Line 5: These column contains the key sequence number, therefore, if I sort by it, line 8, I can list the key fields in key sequence.

The results, below, show the keys for the files in MYLIB.

Library  File        Field     Key seq
-------  ----------  --------  -------
MYLIB    TESTFILE    FIELD01         1
MYLIB    TESTFILE    FIELD02         2
MYLIB    TESTFILEL0  FIELD05         1
MYLIB    TESTFILEL0  FIELD04         2
MYLIB    TESTFILEL0  FIELD03         3
MYLIB    TESTFILEL0  FIELD02         4
MYLIB    TESTFILEL0  FIELD01         5
MYLIB    TESTINDEX   COLUMN_5        1
MYLIB    TESTINDEX   COLUMN_4        2
MYLIB    TESTINDEX   COLUMN_3        3
MYLIB    TESTINDEX   COLUMN_2        4
MYLIB    TESTINDEX   COLUMN_1        5
MYLIB    TESTTABLE   COLUMN_1        1
MYLIB    TESTTABLE   COLUMN_2        2

As TESTVIEW does not contain a key it is not in the results.

When I list the key fields in my final results I do not want to have one row per key field. I want to have one column the contains the keys listed in key sequence. Fortunately the LISTAGG scalar function will do that for me.

I tried to use a Common Table Expression, CTE, to do this. But no matter how I tried I could not get the results from the first part sorted the way I needed. Therefore, I had to make this two steps:

  1. Extract the data I want from QADBKATR
  2. Use LISTAGG to aggregate the key fields into one

The first step is accomplished with the following statement:

01  CREATE TABLE QTEMP.WORK0 AS
02  (SELECT DBKLIB,DBKFIL,DBKFLD,DBKPOS 
03     FROM QSYS.QADBKATR
04    WHERE DBKLIB = 'MYLIB'
05    ORDER BY DBKLIB,DBKFIL,DBKPOS)
06  WITH DATA

The contents of the Table WORK0 is the same as the results from the previous statement.

Now I have the data in Table I can then aggregate the key fields with the following:

01  SELECT DBKFIL,LISTAGG(RTRIM(DBKFLD),',') AS KEY_FIELDS
02    FROM QTEMP.WORK0
03   GROUP BY DBKLIB,DBKFIL

Line 1: I decided I did not need to show the library column as I know that will be 'MYLIB'. I need to right trim the key field column as it is fixed width of ten characters, and I am going to separate each key with a comma ( , ).

Line 3: I need to group the results, here by the library and file, so that I get all the rows for each file.

The results are:

DBKFIL      KEY_FIELDS
----------  --------------------------------------------
TESTFILE    FIELD01,FIELD02
TESTFILEL0  FIELD05,FIELD04,FIELD03,FIELD02,FIELD01
TESTINDEX   COLUMN_5,COLUMN_4,COLUMN_3,COLUMN_2,COLUMN_1
TESTTABLE   COLUMN_1,COLUMN_2

The other columns needed for the desired results can be gathered using the following statement:

01  SELECT SYSTEM_TABLE_SCHEMA,SYSTEM_TABLE_NAME,
02         NATIVE_TYPE,FILE_TYPE,B.LAST_USED_TIMESTAMP 
03    FROM QSYS2.SYSFILES,
04  LATERAL 
05  (SELECT LAST_USED_TIMESTAMP 
06     FROM TABLE(QSYS2.OBJECT_STATISTICS(
07            SYSTEM_TABLE_SCHEMA,'FILE',SYSTEM_TABLE_NAME))) B 
08   WHERE SYSTEM_TABLE_SCHEMA = 'MYLIB'

Why did I use the SYSFILES View rather than SYSCOLUMNS?

IMHO the columns in SYSFILES are more IBM i centric when compared to the columns in SYSCOLUMNS, which tends to be more relevant to all the members of the Db2 family.

Line 2: The NATIVE_TYPE columns contains how the IBM i considers the object. With the "files" in my library TESTFILE and TESTTABLE are considered physical files, the others are considered logical files. The FILE_TYPE allows me to distinguish between data and source files. The last used timestamp comes from the OBJECT_STATISTICS Table function.

Line 4: I find that using the Lateral is the easiest way to join results from a Table function to the results from a View as I can use columns from the View in the Table function statement, line 7.

The results are:

SYSTEM_ SYSTEM_
TABLE_  TABLE_     NATIVE   FILE  LAST_USED_
SCHEMA  NAME       _TYPE    _TYPE TIMESTAMP
------- ---------- -------- ----- --------------------------
MYLIB   TESTINDEX  LOGICAL  DATA  <NULL>
MYLIB   TESTFILEL0 LOGICAL  DATA  <NULL>
MYLIB   TESTFILE   PHYSICAL DATA  2024-03-05 00:00:00.000000
MYLIB   TESTTABLE  PHYSICAL DATA  2024-03-05 00:00:00.000000
MYLIB   TESTVIEW   LOGICAL  DATA  <NULL>

Only TESTFILE and TESTTABLE have been used, the others have not and their last used timestamp is null.

The last used timestamp only returns the date. It does not return a time value, just the default of all zeroes.

Having shown how the individual SQL statements work on their own, I can now put them into a RPG program that will generate the results desired:

01  **free
02  ctl-opt main(Main) option(*srcstmt) ;

03  dcl-proc Main ;
04    dcl-pi *n ;
05      inLibrary char(10) ;
06    end-pi ;

07    dcl-s String varchar(350) ;

08    exec sql CALL QSYS2.QCMDEXC('DLTF QTEMP/WORK*') ;
09    exec sql DROP TABLE IF EXISTS QTEMP.FINAL ;

10    String = 'CREATE TABLE QTEMP.WORK0 AS +
11              (SELECT SYSTEM_TABLE_SCHEMA,SYSTEM_TABLE_NAME,+
12                      NATIVE_TYPE,FILE_TYPE,B.LAST_USED_TIMESTAMP +
13                 FROM QSYS2.SYSFILES, +
14               LATERAL +
15               (SELECT LAST_USED_TIMESTAMP +
16                  FROM TABLE(QSYS2.OBJECT_STATISTICS(+
17                           SYSTEM_TABLE_SCHEMA,''FILE'',+
18                           SYSTEM_TABLE_NAME))) B +
19               WHERE SYSTEM_TABLE_SCHEMA = ''' + %trimr(inLibrary) +
20                ''') WITH DATA' ;

21    exec sql EXECUTE IMMEDIATE :String ;

22    exec sql CREATE TABLE QTEMP.WORK1 AS
23             (SELECT DBKLIB,DBKFIL,DBKFLD,DBKPOS
24                FROM QSYS.QADBKATR
25               WHERE (DBKLIB,DBKFIL) IN
26                       (SELECT SYSTEM_TABLE_SCHEMA,SYSTEM_TABLE_NAME
27                          FROM QTEMP.WORK0)
28               ORDER BY DBKLIB,DBKFIL,DBKPOS)
29             WITH DATA ;

30    exec sql CREATE TABLE QTEMP.FINAL AS
31             (SELECT A.*,B.KEY_FIELDS
32                FROM QTEMP.WORK0 A,
33              LATERAL
34              (SELECT LISTAGG(RTRIM(DBKFLD),',') AS KEY_FIELDS
35                 FROM QTEMP.WORK1
36                WHERE (DBKLIB,DBKFIL) = (A.SYSTEM_TABLE_SCHEMA,
37                                         A.SYSTEM_TABLE_NAME)
38                GROUP BY DBKLIB,DBKFIL) B)
39             WITH DATA ;

40    exec sql MERGE INTO QTEMP.FINAL A USING QTEMP.WORK0 B
41               ON (A.SYSTEM_TABLE_SCHEMA,A.SYSTEM_TABLE_NAME) =
42                    (B.SYSTEM_TABLE_SCHEMA,B.SYSTEM_TABLE_NAME)
43              WHEN NOT MATCHED THEN
44                INSERT VALUES (B.SYSTEM_TABLE_SCHEMA,
45                               B.SYSTEM_TABLE_NAME,
46                               B.NATIVE_TYPE,B.FILE_TYPE,
47                               B.LAST_USED_TIMESTAMP,' ') ;
48  end-proc ;

Line 1: My code is always modern RPG.

Line 2: I am using a Main procedure, which I am calling "Main".

Line 3: Start of the Main procedure.

Lines 4 6: This is the procedure interface for the Main procedure. The library name is passed to this program.

Line 7: Declaration of a variable that will be used to contain the first SQL statement.

Line 8: I will be creating two work files, whose names will start with the characters "WORK". If they already exist in this job's QTEMP I want to delete them. I could have had two DROP TABLE statements here. But I decided to use one Delete file command, DLTF, executed by the QCMDEXC SQL procedure.

Line 9: This file will contain the file data. I need to delete it too if it already exists. As this is just one file I can use the DROP statement. To stop an error if the Table does not exist I use the IF EXISTS.

Lines 10 20: This is the SQL statement that will be used to create the first work file, WORK0. It is basically the same as one of the example SQL statement I gave above. As I cannot use the passed library parameter in the WHERE clause, line 19, I am placing this statement in the variable String. Why did I use the RPG %TRIMR BiF rather than SQL's RTRIM? I chose to use so that the SQL statement would be a little shorter than it would have been if I had used the RTRIM.

Line 21: I use the EXECUTE IMMEDIATE statement to execute the SQL statement contained in the String variable.

Lines 22 29: This SQL statement make an output file, WORK1, listing all the key fields for the files in the file WORK0.

Lines 30 39: This last statement takes the data WORK0 and adds to it the aggregated key fields from WORK1, to create the output file FINAL.

Lines 40 47: The above only includes all "files" that are found in both WORK0 and WORK1, in other words all the "files" that have key fields. What about those "files" that are unkeyed? This MERGE statement inserts all rows from WORK0 that are not in FINAL, which are the unkeyed "files". The last parameter of the Insert, line 47, is a blank as that is the column for the key fields.

When the program completed I can query the Table FINAL with the following statement:

SELECT * FROM QTEMP.FINAL

And the results are:

SYSTEM_ SYSTEM_
TABLE_  TABLE_     NATIVE   FILE  LAST_USED_
SCHEMA  NAME       _TYPE    _TYPE TIMESTAMP      KEY_FIELDS
------- ---------- -------- ----- -------------- --------------------------------------------
MYLIB   TESTFILE   PHYSICAL DATA  2024-03-05-... FIELD01,FIELD02
MYLIB   TESTFILEL0 LOGICAL  DATA  <NULL>         FIELD05,FIELD04,FIELD03,FIELD02,FIELD01
MYLIB   TESTTABLE  PHYSICAL DATA  2024-03-05-... COLUMN_1,COLUMN_2
MYLIB   TESTINDEX  LOGICAL  DATA  <NULL>         COLUMN_5,COLUMN_4,COLUMN_3,COLUMN_2,COLUMN_1
MYLIB   TESTVIEW   LOGICAL  DATA  <NULL>

As you have seen that is a straightforward way to be able to get the key list for any keyed "file".

If you are not authorized to use the file QADBKATR you will have to get the key information from the Display File Description command, DSPFD. You can see how to do that here.

 

This article was written for IBM i 7.5 TR3 and 7.4 TR9.

No comments:

Post a Comment

To prevent "comment spam" all comments are moderated.
Learn about this website's comments policy here.

Some people have reported that they cannot post a comment using certain computers and browsers. If this is you feel free to use the Contact Form to send me the comment and I will post it for you, please include the title of the post so I know which one to post the comment to.