Wednesday, November 20, 2019

Using SQL to list directories and files in IFS

use sql table function to list ifs files and directories

Thanks to the folks at RZKH who downloaded and applied the PTFs for IBM i 7.4 TR1 on the first day they became available, what great service!, I have been able to explore some of the new features added to the operating system we know and love. Amongst the additions to Db2 for i, or SQL, is a number of table functions to list information about the directories and files in the Integrated File System, IFS. My original thought was to write one post about them all, but the more I played with the IFS_OBJECT_STATISTICS table function I came to the conclusion that it deserved its own post.

The IFS_OBJECT_STATISTICS table function returns similar information to the RTVDSKINF command. All I want is just some basic information about the directories and files I choose, but there are many more columns returned by the table function. I encourage you to check the link at the bottom of this post which will take you to the appropriate page in IBM's documentation.

In my folder, MyFolder, I have created the following files and subdirectories:

  • Directory: MyFolder
    • File: list.txt
    • File: report.csv
    • File: test_file.txt
    • File: xmlfile.xml
    • Directory: subfolder1
      • File: 1.txt
      • Directory: subfolder1a
        • File: 1a.txt
    • Directory: subfolder2
      • File: 2.txt

There are four parameters that can be passed to the IFS_OBJECT_STATISTICS table function to select, or omit, the information that is returned in the results:

Parameter Description
START_PATH_NAME The path name for starting the search. For example: /directory/subdirectory
SUBTREE_DIRECTORIES In the subdirectories of the path name given.
NO Only objects in the start path are listed.
YES Include all the directories found in the start path, and its subdirectories.
OBJECT_TYPE_LIST Object types to be included. Values must be separated by a space.
*ALLDIR Just the directories are returned.
*ALLSTMF All the steam file object types: *MBR, *DOC, *STMF, *DSTMF, *USRSPC. *MBR All database file members.
*NOQSYS Exclude all objects in QSYS.LIB.
OMIT_LIST Directories or subdirectories to exclude from the result. If there is more than one entry they must be separated by a space.

I have found that the parameter names are optional. I will show examples both with and without them. Regardless of whether I use the parameter names they must be separated by a comma ( , ).

Let me get a list of the files and subdirectories in my folder, MyFolder:

SELECT PATH_NAME,OBJECT_TYPE,CREATE_TIMESTAMP,LAST_USED_TIMESTAMP,
       DATA_SIZE
FROM TABLE(QSYS2.IFS_OBJECT_STATISTICS('/MyFolder','YES'))

In this example I want to have the following columns of information returned to me:

  • PATH_NAME: Path name of the directory or file
  • OBJECT_TYPE: Type of object
  • CREATE_TIMESTAMP: The timestamp of when the file or directory was created
  • LAST_USED_TIMESTAMP: The timestamp of when the file was last used. Null is returned for directories
  • DATA_SIZE: Size in bytes of the object

Notice that I did not use the parameter names, just the values I want for the first two parameters.

The returned results look like:

                                         OBJECT
PATH_NAME                                _TYPE   CREATE_TIMESTAMP
---------------------------------------  ------  ------------------
/MyFolder                                *DIR    2019-11-02 20:15:08
/MyFolder/report.csv                     *STMF   2019-11-02 20:52:30
/MyFolder/list.txt                       *STMF   2019-11-02 21:05:58
/MyFolder/xmlfile.xml                    *STMF   2019-11-03 17:21:46
/MyFolder/test_file.txt                  *STMF   2019-11-16 21:28:41
/MyFolder/subfolder1                     *DIR    2019-11-17 16:35:05
/MyFolder/subfolder2                     *DIR    2019-11-17 17:35:37
/MyFolder/subfolder1/subfolder1a         *DIR    2019-11-17 16:35:53
/MyFolder/subfolder1/1.txt               *STMF   2019-11-17 16:36:24
/MyFolder/subfolder2/2.txt               *STMF   2019-11-17 17:38:56
/MyFolder/subfolder1/subfolder1a/1a.txt * STMF   2019-11-17 16:37:01


LAST_USED_TIMESTAMP  DATA_SIZE
-------------------  ---------
-                         8192
2019-11-17 21:08:23         54
2019-11-17 21:08:23         95
2019-11-17 21:08:23         42
2019-11-16 06:57:44          0
-                         8192
-                         8192
-                         8192
2019-11-17 21:08:23         19
2019-11-17 21:08:23         19
2019-11-17 21:08:23         19

What is the difference if I change the subtree parameter to "NO"?

SELECT PATH_NAME,OBJECT_TYPE FROM
TABLE(QSYS2.IFS_OBJECT_STATISTICS('/MyFolder','NO'))

The returned results are only for those objects in MyFolder, and not in its subdirectories:

PATH_NAME                OBJECT_TYPE
-----------------------  -----------
/MyFolder                *DIR
/MyFolder/report.csv     *STMF
/MyFolder/list.txt       *STMF
/MyFolder/xmlfile.xml    *STMF
/MyFolder/test_file.txt  *STMF
/MyFolder/subfolder1     *DIR
/MyFolder/subfolder2     *DIR

If I wanted to get a list of just the subdirectories in MyFolder I would use the following:

SELECT PATH_NAME,OBJECT_TYPE FROM
TABLE(QSYS2.IFS_OBJECT_STATISTICS(START_PATH_NAME => '/MyFolder',
                                 SUBTREE_DIRECTORIES => 'YES',
                                 OBJECT_TYPE_LIST => '*ALLDIR'))

This time I have the parameter names. The OBJECT_TYPE_LIST "asks" for only directories to be returned in the subtree of directories:

PATH_NAME                         OBJECT_TYPE
--------------------------------  -----------
/MyFolder                         *DIR
/MyFolder/subfolder1              *DIR
/MyFolder/subfolder2              *DIR
/MyFolder/subfolder1/subfolder1a  *DIR

In this example I want to list all of the files in the MyFolder and its subfolders, except those in subfolder1. By omitting subfolder1 I also omit any subfolders it has, subfolder1a.

SELECT PATH_NAME,OBJECT_TYPE FROM
TABLE(QSYS2.IFS_OBJECT_STATISTICS(START_PATH_NAME => '/MyFolder',
                            SUBTREE_DIRECTORIES =>'YES',
                            OMIT_LIST => '/MyFolder/subfolder1'))

The results are just from the root of MyFolder and the subfolder2 subdirectory:

PATH_NAME                   OBJECT_TYPE
--------------------------  -----------
/MyFolder                   *DIR
/MyFolder/report.csv        *STMF
/MyFolder/list.txt          *STMF
/MyFolder/xmlfile.xml       *STMF
/MyFolder/test_file.txt     *STMF
/MyFolder/subfolder2        *DIR
/MyFolder/subfolder2/2.txt  *STMF

One of the major reasons I want to look at objects in the IFS is to identify any extremely large ones that could be deleted. In this example I am looking for any files in MyFolder, and its subdirectories, that are greater than 50 bytes, well this is only test data and I do not have really big files in the IFS of this IBM i partition. The WHERE stipulates that the size of the object be more than 50 bytes, and the results are returned in descending order, i.e. the largest file first.

SELECT PATH_NAME,DATA_SIZE FROM
TABLE(QSYS2.IFS_OBJECT_STATISTICS('/MyFolder','YES','*ALLSTMF'))
WHERE DATA_SIZE > 50
ORDER BY DATA_SIZE DESC

I just have two files that fit this criteria:

PATH_NAME              DATA_SIZE
---------------------  ---------
/MyFolder/list.txt            95
/MyFolder/report.csv          54

As I hinted when I was explaining the parameters for this function it is also possible to use this to access the "IBM i directory environment". In this environment all of the libraries are subdirectories of QSYS.LIB. Here I want a list of all the files in my library, MYLIB, with their CCSID and sorted by size in descending order:

SELECT PATH_NAME,OBJECT_TYPE AS "Type",DATA_SIZE AS "Size",CCSID
FROM TABLE(QSYS2.IFS_OBJECT_STATISTICS(START_PATH_NAME => 
                                           '/QSYS.LIB/MYLIB.LIB', 
OBJECT_TYPE_LIST => '*FILE'))
ORDER BY DATA_SIZE DESC

The START_PATH_NAME has to be in the IFS naming convention, therefore, my "directory" MYLIB.LIB is a subfolder of QSYS.LIB. The OBJECT_TYPE_LIST uses the usual IBM i object type for files. My results are as follows:

PATH_NAME                         Type  Size   CCSID
--------------------------------- ----- -----  -----
/QSYS.LIB/MYLIB.LIB/DEVSRC.FILE   *FILE 24576   273
/QSYS.LIB/MYLIB.LIB/EG055P0.FILE  *FILE 16384    37
/QSYS.LIB/MYLIB.LIB/EG055P1.FILE  *FILE 16384    37
/QSYS.LIB/MYLIB.LIB/FTPSRC.FILE   *FILE 16384    37
/QSYS.LIB/MYLIB.LIB/OLDSRC.FILE   *FILE 16384    37
/QSYS.LIB/MYLIB.LIB/OTHERSRC.FILE *FILE 16384    37
/QSYS.LIB/MYLIB.LIB/PERSON.FILE   *FILE 16384    37

As source files and data files are both types of files they are included in the results. Display and printer files would be included too if I had any in MYLIB.

The example below shows how I can select more than one object type, in this case programs and modules, and use a wild card WHERE to return any program or modules that start with "TEST".

SELECT PATH_NAME,OBJECT_TYPE FROM
TABLE(QSYS2.IFS_OBJECT_STATISTICS(START_PATH_NAME => 
                                     '/QSYS.LIB/MYLIB.LIB', 
OBJECT_TYPE_LIST => '*PGM *MODULE'))
WHERE PATH_NAME LIKE '/QSYS.LIB/MYLIB.LIB/TEST%'

I only have two objects, a program and a module that fit this criteria:

PATH_NAME                          OBJECT_TYPE
---------------------------------- -----------
/QSYS.LIB/MYLIB.LIB/TESTPGM.PGM    *PGM
/QSYS.LIB/MYLIB.LIB/TESTMOD.MODULE *MODULE

In this last example I want to show how the path can work with files in the "IBM i environment". As I have shown in a previous example I use the source file DEVSRC in the library MYLIB. I can use IFS_OBJECT_STATISTICS to list all the members in that source file with this statement:

SELECT 
SUBSTR(PATH_NAME,
       LOCATE_IN_STRING(PATH_NAME,'/',1,4)+1,
       (LOCATE_IN_STRING(PATH_NAME,'.MBR',1)) - 
            (LOCATE_IN_STRING(PATH_NAME,'/',1,4)+1) )
       AS "Member",TEXT_DESCRIPTION
FROM TABLE(QSYS2.IFS_OBJECT_STATISTICS(START_PATH_NAME => 
                           '/QSYS.LIB/RPGPGM1.LIB/DEVSRC.FILE', 
                     OBJECT_TYPE_LIST => '*MBR'))

Notice that the path goes down to the level of the source file. It may seem redundant to have the OBJECT_TYPE_LIST parameter asking for only members as a file can only contain members. If I omit this I get a row returned in the results for the source file itself, which I do not want.

I only want the name of the members, not the names followed by ".MBR". To remove that from the string returned in the results I need to use a substring, but the starting position of where the member names starts will likely vary depending upon the length of the library's and source file's names. This is why I have used the LOCATE_IN_STRING function. The first LOCATE_IN_STRING determines where the fourth slash ( / ) is in the string, this is the separator between the source file name, DEVSRC.FILE, and the member name, I also need to add one to this value as I want to start the first character after the slash. The second and third are used to determine the length of the string I want to return. The second will return the value where the ".MBR" part of the string starts, and the third is where the member name starts, by subtracting these two values I have the length of the string of the member name. This means my results look like:

Member      TEXT_DESCRIPTION
----------  --------------------
AVAILSPACE  View available space
CCSID_CVT   -
DOWNLOAD    -
EG00RR1     Get user space creat
EG001       Create & write to us

As I mentioned above I can see this being really useful for finding those large files hiding deep in subdirectories in the IFS. Using this table function is also an easier way to get a list of files in a particular directory, much easier than using the RTVDSKINF command. At this point I don't think I would use it for retrieving information from the "IBM i environment" as I think there are better tables functions and views to use for that information.

 

You can learn more about the IFS_OBJECT_STATISTICS table function from the IBM website here.

 

This article was written for IBM i 7.4 TR1, and should work for IBM i 7.3 TR7 too.

5 comments:

  1. the solution I have used for years....

    https://www.itjungle.com/2014/08/27/fhg082714-story02/

    ReplyDelete
    Replies
    1. I think we all came up with our own solutions, I did too.

      But now with what I have described above we do not have to use what we used before. We can go directly to the data, rather than use an API, command, etc.

      Just K.I.S.S.

      Delete
  2. Great article Simon - I resolved this using an API - your solution is much simpler.

    ReplyDelete
  3. I’m currently using a qsh command to output listings to a qtemp file to read.
    Works great. Very clean.
    But I’ll take an sql approach any day.

    ReplyDelete

To prevent "comment spam" all comments are moderated.
Learn about this website's comments policy here.

Some people have reported that they cannot post a comment using certain computers and browsers. If this is you feel free to use the Contact Form to send me the comment and I will post it for you, please include the title of the post so I know which one to post the comment to.