Db2 optimizes very large SQL tables by dividing the data contained within into partitions. These partitions store rows of data separately from other rows. In Db2 for i these are implemented using something we are all familiar with, members. This allows us, IBM i users, to use the Db2 view SYSPARTITIONSTAT to get information about members in the files and tables in our systems.
Where do we use members? The two scenarios that I am sure spring to all of our minds are in source files and in data files.
I often use SYSPARTITIONSTAT to search for source members that have the same name in multiple source files. This allows me to determine if a member name has already been used. Or if I have multiple versions of the same source member in different source files.
In this first example I just want a count of all the members in any source files that are in my library, MYLIB.
01 SELECT SYSTEM_TABLE_SCHEMA AS "Library", 02 SYSTEM_TABLE_NAME AS "File", 03 TO_CHAR(COUNT(*),'999G999') AS "No. mbrs" 04 FROM QSYS2.SYSPARTITIONSTAT 05 WHERE SYSTEM_TABLE_SCHEMA = 'MYLIB' 06 AND SOURCE_TYPE IS NOT NULL 07 GROUP BY SYSTEM_TABLE_SCHEMA,SYSTEM_TABLE_NAME 08 ORDER BY 1,2 |
Line 1 and 2: I am using the system names for the library and the source file, rather than the long SQL names.
Line 3: I am getting the count of the number of rows. I am also formatting the number by inserting thousand separators into the number make it easier to understand.
Line 5: I am only interested in objects in my library.
Line 6: Source file members have a source type, all other members are null in this column. Thus, if I want source file members I need to only include those rows where this column is not null.
Line 7: This GROUP BY clause gives the criteria by which the results will be grouped together.
Line 8: I am ordering by the first column, library, and the second, source file name.
My results are:
Library File No. mbrs ------- ------ -------- MYLIB DEVSRC 519 MYLIB OLDSRC 25 MYLIB TFRSRC 2 |
Having looked at the summary, this next statement returns to me the detail, a list of members in the source files in my library. I have picked the following columns as I thought they would be useful to me for to learn about these members:
- SYSTEM_TABLE_SCHEMA: Library name
- SYSTEM_TABLE_NAME: File name, in this example the source file's name
- SYSTEM_TABLE_MEMBER: Member name
- SOURCE_TYPE: Source type. If these are source members from a source file then this column contains values
- NUMBER_ROWS: Number of rows/records in the source member
- DATA_SIZE: Size of the member is bytes
- CREATE_TIMESTAMP: When the source member was created
- LAST_SOURCE_UPDATE_TIMESTAMP: When the source member was last updated
- PARTITION_TEXT: Member's text
My statement is:
01 SELECT SYSTEM_TABLE_SCHEMA AS "Library", 02 SYSTEM_TABLE_NAME AS "File", 03 SYSTEM_TABLE_MEMBER AS "Member", 04 SOURCE_TYPE AS "Type", 05 TO_CHAR(NUMBER_ROWS,'999G999G999G999') AS "Records", 06 TO_CHAR(DATA_SIZE,'999G999G999G999') AS "Size", 07 CREATE_TIMESTAMP AS "Created", 08 LAST_SOURCE_UPDATE_TIMESTAMP AS "Last updated", 09 PARTITION_TEXT AS "Member text" 10 FROM QSYS2.SYSPARTITIONSTAT 11 WHERE SYSTEM_TABLE_SCHEMA = 'MYLIB' 12 AND SOURCE_TYPE IS NOT NULL 13 ORDER BY 1,2,3 |
Line 12: Don't forget as I am only interested in source file members the SOURCE_TYPE cannot be null.
I am only going to show a few rows of the results:
Library File Member Type Records Size ------- ------ ------- -------- ------- ---- MYLIB DEVSRC EG001 RPGLE 45 12,288 MYLIB DEVSRC EG002 RPGLE 32 8,192 MYLIB DEVSRC EG003 RPGLE 40 12,288 MYLIB DEVSRC EG004B SQLRPGLE 24 8,192 MYLIB DEVSRC EG004C SQLRPGLE 24 8,192 MYLIB DEVSRC EG004P SQLTABLE 4 8,192 MYLIB DEVSRC EG004P2 PF 3 8,192 MYLIB DEVSRC EG005 CLLE 5 8,192 MYLIB DEVSRC EG005A SQLRPGLE 27 8,192 Created Last updated Member text -------------------------- -------------------------- --------------- 2017-09-14 02:49:07.000000 2017-09-14 02:53:57.000000 Create & wri... 2017-09-14 02:56:25.000000 2017-09-14 03:01:03.000000 Read user sp... 2017-09-14 03:05:00.000000 2017-09-14 04:01:24.000000 Get user soa... 2017-10-01 16:51:11.000000 2017-10-01 17:28:50.000000 Show error w... 2017-10-18 19:24:39.000000 2017-10-18 19:25:40.000000 Show error w... 2017-10-01 16:42:26.000000 2017-10-01 17:05:47.000000 SQL table 2017-10-18 19:22:33.000000 2017-10-18 19:23:41.000000 SQL table 2017-10-31 15:07:35.000000 2017-10-31 15:17:38.000000 |
Let me do something more useful. How about a list of source member names common to two source files?
01 SELECT A.SYSTEM_TABLE_SCHEMA AS "Library 1", 02 A.SYSTEM_TABLE_NAME AS "File 1", 03 A.SYSTEM_TABLE_MEMBER AS "Member 1", 04 A.SOURCE_TYPE AS "Type 1", 05 TO_CHAR(A.NUMBER_ROWS,'999G999G999G999') AS "Records 1", 06 A.CREATE_TIMESTAMP AS "Created 1", 07 A.LAST_SOURCE_UPDATE_TIMESTAMP AS "Last updated 1", 08 B.SYSTEM_TABLE_SCHEMA AS "Library 2", 09 B.SYSTEM_TABLE_NAME AS "File 2", 10 B.SYSTEM_TABLE_MEMBER AS "Member 2", 11 B.SOURCE_TYPE AS "Type 2", 12 TO_CHAR(B.NUMBER_ROWS,'999G999G999G999') AS "Records 2", 13 B.CREATE_TIMESTAMP AS "Created 2", 14 B.LAST_SOURCE_UPDATE_TIMESTAMP AS "Last updated 2" 15 FROM QSYS2.SYSPARTITIONSTAT A 16 INNER JOIN QSYS2.SYSPARTITIONSTAT B 17 ON A.SYSTEM_TABLE_MEMBER = B.SYSTEM_TABLE_MEMBER 18 WHERE A.SYSTEM_TABLE_SCHEMA = 'MYLIB' 19 AND A.SYSTEM_TABLE_NAME = 'DEVSRC' 20 AND B.SYSTEM_TABLE_SCHEMA = 'MYLIB' 21 AND B.SYSTEM_TABLE_NAME = 'OLDSRC' |
This statement is pretty similar to the previous one shown. The differences to note are:
Lines 1 – 14: The columns returned are duplicated so I will have the results for any matches showing the same information.
Line 15 – 17: I have joined SYSPARTITIONSTAT to itself with an inner join, this will only return rows that match in both views.
The results are:
Library 1 File 1 Member 1 Type 1 Records 1 --------- ------ --------- -------- --------- MYLIB DEVSRC TESTSQL SQL 9 MYLIB DEVSRC TESTTABLE SQLTABLE 2 MYLIB DEVSRC TESTVIEW SQLVIEW 8 MYLIB DEVSRC TESTFILE PF 4 MYLIB DEVSRC TESTFILE1 PF 3 Created 1 Last updated 1 -------------------------- -------------------------- 2016-12-17 17:57:11.000000 2019-06-29 23:03:33.000000 2020-07-21 12:49:02.000000 2021-02-14 22:10:01.000000 2020-12-24 14:57:31.000000 2021-01-09 18:49:46.000000 2021-03-13 21:39:20.000000 2021-12-06 15:12:05.000000 2021-07-25 16:21:35.000000 2021-07-25 16:22:07.000000 Library 2 File 2 Member 2 Type 2 Records 2 --------- ------ --------- ------- --------- MYLIB OLDSRC TESTSQL SQL 14 MYLIB OLDSRC TESTTABLE SQL 10 MYLIB OLDSRC TESTVIEW SQL 12 MYLIB OLDSRC TESTFILE PF 11 MYLIB OLDSRC TESTFILE1 PF 5 Created 2 Last updated 2 -------------------------- -------------------------- 2016-05-16 02:49:42.000000 2014-12-27 17:51:14.000000 2016-05-16 02:49:43.000000 2016-03-06 06:22:11.000000 2016-05-16 02:49:43.000000 2016-03-13 18:11:04.000000 2016-05-16 02:49:41.000000 2016-05-15 18:31:21.000000 2016-05-16 02:49:41.000000 2016-03-13 17:37:18.000000 |
By comparing:
- Records 1 to Records 2
- Created 1 to Created 2
- Last updated 1 to Last Updated 2
I can tell that the two sets of members are not the same.
Next example shows if I want to learn what objects are in one source file, but not the other. Here I decided to use a Common Table Expression, CTE, to return to me that information.
01 WITH T1(LIBRARY,FILE,MEMBER,TYPE) 02 AS (SELECT SYSTEM_TABLE_SCHEMA,SYSTEM_TABLE_NAME, 03 SYSTEM_TABLE_MEMBER,SOURCE_TYPE 04 FROM QSYS2.SYSPARTITIONSTAT 05 WHERE SYSTEM_TABLE_SCHEMA = 'MYLIB' 06 AND SYSTEM_TABLE_NAME = 'DEVSRC'), 07 T2(LIBRARY,FILE,MEMBER,TYPE) 08 AS (SELECT SYSTEM_TABLE_SCHEMA,SYSTEM_TABLE_NAME, 09 SYSTEM_TABLE_MEMBER,SOURCE_TYPE 10 FROM QSYS2.SYSPARTITIONSTAT 11 WHERE SYSTEM_TABLE_SCHEMA = 'MYLIB' 12 AND SYSTEM_TABLE_NAME = 'OLDSRC') 13 SELECT T1.LIBRARY,T1.FILE,T1.MEMBER 14 FROM T1 LEFT JOIN T2 15 ON T1.MEMBER = T2.MEMBER 16 WHERE T2.MEMBER IS NULL 17 ORDER BY 1,2,3 ; |
Lines 1 – 6: The first part of the statement makes a virtual table in memory of the data I select only for the source file DEVSRC.
Lines 7 – 12: The second part makes a virtual table for the data for the source file OLDSRC.
Lines 13 - 17: This is the statement that puts the two virtual tables together.
Line 13: I only want the columns for DEVSRC, which are in the virtual file T1.
Lines 14 and 15: A left outer join includes all the columns from the "left" file, DEVSRC, and if there is no match in OLDSRC the value in the "right" side is null. I am joining the two files by the member name.
Line 16: By selecting the rows where T2.MEMBER is null I am selecting the unmatched results.
Line 17: I am sorting the results by library, file, and member.
I am only going to show a subset of the results:
LIBRARY FILE MEMBER ------- ------ ---------- MYLIB DEVSRC AVAILSPACE MYLIB DEVSRC CCSID_CVT MYLIB DEVSRC CHK_FILE MYLIB DEVSRC CITY MYLIB DEVSRC CPY2IMPF MYLIB DEVSRC DOWNLOAD |
One more example with source members and files. How about I want to find every member in any source file that matches a particular string. In this case I am searching for any member that starts with string TEST.
01 SELECT SYSTEM_TABLE_SCHEMA AS "Library", 02 SYSTEM_TABLE_NAME AS "File", 03 SYSTEM_TABLE_MEMBER AS "Member", 04 SOURCE_TYPE AS "Type", 05 TO_CHAR(NUMBER_ROWS,'999G999G999G999') AS "Records", 06 CREATE_TIMESTAMP AS "Created", 07 LAST_SOURCE_UPDATE_TIMESTAMP AS "Last updated" 08 FROM QSYS2.SYSPARTITIONSTAT 09 WHERE SYSTEM_TABLE_SCHEMA LIKE 'MYLIB%' 10 AND SYSTEM_TABLE_MEMBER LIKE 'TEST%' 11 AND SOURCE_TYPE IS NOT NULL 12 ORDER BY 3,1,2 |
Line 9: I have restricted my search to just my libraries, that all start with MYLIB. I did try statement for all libraries, and after 30 minutes the statement had not finished. Be careful of the scope of your search.
Line 10: I am looking for any source members that start with TEST.
Line 11: Don't forget to test for not null.
A subset of the results look like:
Library File Member Type Records ------- ------ --------- -------- ------- MYLIB DEVSRC TEST_PROC RPGLE 12 MYLIB OLDSRC TESTALL TXT 315 MYLIB TFRSRC TESTAPI SQLRPGLE 95 MYLIB DEVSRC TESTCL CLLE 35 MYLIB DEVSRC TESTCL_1 CLLE 33 MYLIB DEVSRC TESTCL_2 CLLE 39 Created Last updated -------------------------- -------------------------- 2017-11-25 21:44:45.000000 2017-11-25 21:47:42.000000 2016-05-16 02:49:41.000000 2015-03-22 19:26:05.000000 2021-04-13 02:06:34.000000 2021-04-13 02:08:47.000000 2016-09-06 00:38:30.000000 2021-10-12 17:22:29.000000 2021-07-27 13:32:57.000000 2021-07-27 13:50:31.000000 2021-07-27 14:36:17.000000 2021-07-27 15:12:07.000000 |
Enough of source files and members. What about members in what I like to call "data files". They can have multiple members too. I have been doing a lot of clean up at work recently. One of the big projects was to remove the deleted records from files. When you have 11 million deleted records and only 32 active ones it is time to clean up.
I cannot assume that every file has one member, and SYSPARTITIONSTAT allows me to discover which files' members have those deleted records.
01 SELECT TO_CHAR(NUMBER_DELETED_ROWS,'999G999G999') AS "Deleted", 02 SYSTEM_TABLE_SCHEMA AS "Library", 03 SYSTEM_TABLE_NAME AS "File", 04 SYSTEM_TABLE_MEMBER AS "Member" 05 FROM QSYS2.SYSPARTITIONSTAT 06 WHERE SYSTEM_TABLE_SCHEMA = 'MYLIB' 07 AND SOURCE_TYPE IS NULL 08 AND NUMBER_DELETED_ROWS > 0 09 ORDER BY 1 DESC |
Line 1: We have a new column I have not mentioned before. This contains the number of deleted rows in the member.
My results are:
Deleted Library File Member ------- ------- -------- ------ 500,000 MYLIB TESTFILE FIRST 333,333 MYLIB TESTFILE SECOND |
I now know I need to RGZPFM these two members.
I could use the SYSTABLESTAT view to get the total number of records:
01 SELECT SYSTEM_TABLE_SCHEMA,SYSTEM_TABLE_NAME,NUMBER_DELETED_ROWS 02 FROM QSYS2.SYSTABLESTAT 03 WHERE SYSTEM_TABLE_SCHEMA = 'MYLIB' 04 AND SYSTEM_TABLE_NAME = 'TESTFILE' |
But that is the number of deleted records for all members in the file.
SYSTEM_TABLE_SCHEMA SYSTEM_TABLE_NAME NUMBER_DELETE_ROWS ------------------- ----------------- ------------------ MYLIB TESTFILE 833333 |
Looking at the number of deleted rows you see why I like to insert the thousand separators into the numbers.
When reorganizing files to remove their deleted records I am going to use SYSPARTITIONSTAT to make sure I reorganize all of the members, not just the first.
You can learn more about the SYSPARTITIONSTAT SQL view from the IBM website here.
This article was written for IBM i 7.4, and should work for some earlier releases too.
good one Simon, thanks for sharing
ReplyDeleteSimon, Searching for a specific member in *all schema and *all source files is extremely slow, do you believe there is a way to create an index over SYSPARTITIONSTAT by member ? Thanks - Great site ! :-)
DeleteSimon -
ReplyDeletePerhaps you already know this, but you can also retrieve the number of members (partitions) for a given IBM i table by querying SYSTABLSTAT:
SELECT
CAST(TABNAME AS CHAR(10)) AS FILE
, CAST(TABSCHEMA AS CHAR(10)) AS LIB
, NBRPARTS AS NBR_OF_MBRS
, CARD AS TOTAL_ROWS
FROM SYSTABLESTAT
WHERE TABSCHEMA = 'JDFSRC'
Display Data
Position to line . . . . .
....+....1....+....2....+....3....+....4....+....5....+....6....+.
FILE LIB NBR_OF_MBRS TOTAL_ROWS
JDEEXT JDFSRC 10 153
JDESRC JDFSRC 27,253 20,259,740
F98CRTCMD JDFSRC 919 3,056
JDECPY JDFSRC 1,899 248,816
******** End of data ********
Newbie to supporting this technology - your blog has been very helpful Simon - thanks! I am using the SQL on QSYS2.SYSPARTITIONSTAT to list all members in a library as shown above. Great. So now I get Members of various SOURCE_TYPE like 'CLP', 'PF', SQLC, etc that have anywhere from 2-100 "Records" as per your SQL. What IBM i catalog view do I use to drill one layer down? IE - your query outputs for me that my Member = XYZSQL is of type = 'SQLC' and has 19 records. What subsequent query can I run / catalog table or view can i use to view those 19 records?
ReplyDeleteTo be able to view the contents of the member still using SQL you would need to build an ALIAS for the member, see here. And then you would just SELECT * FROM .
DeleteIt would be a lot simpler just to use the DSPPFM command to view the member's contents.
Thanks much, Simon. I created an alias and it works. For someone who is used to other technologies, it seems odd to have to create something before being able to view it - I dont get why i cant do select * from library.file(). As far as using DSPPFM command - the main person who supports this particular system from my organization has only so far given me read-only SQL access and not ability to run any commands :) I will have to change that if my responsibilities in supporting this particular system increase. In the mean time, I assume/hope I cannot do any harm by creating aliases?!?
DeleteI'm wondering why you didn't use a LEFT EXCEPTION JOIN (or simply EXCEPTION JOIN) instead of:
ReplyDelete13 SELECT T1.LIBRARY,T1.FILE,T1.MEMBER
14 FROM T1 LEFT JOIN T2
15 ON T1.MEMBER = T2.MEMBER
16 WHERE T2.MEMBER IS NULL
This is very handy Simon. Thank you for sharing. Much easier using DSPPFM command by output to a PF.
ReplyDeleteJust came in handy. Thank you Simon.
ReplyDelete