Wednesday, August 2, 2023

Retrieving data from a JSON array with SQL

I have noticed with a few of the enhancements made in recent Technology Refreshes to various SQL Views and Table functions that new columns have been added that contain JSON arrays. For example the BASED_ON_FILES column in the SYSFILES View. I want to be able to convert the data held within the JSON array into individually columns. With Join logical files, SQL Views and Indexes the object could be based upon more than one data source.

Before I show how to extract the data from the BASED_ON_FILES columns I want to give a simpler example that will allow me to explain how to do it. Before I start I need to thank Brian Hill for his help with the code.

My example JSON array looks like:

{"ARRAY_ELEMENTS":[{"ELEM_1":"1","ELEM_2":"2","ELEM_3":"3"},
 {"ELEM_1":"101","ELEM_2":"102","ELEM_3":"103"}]}

The easiest way to describe the JSON array in RPG terms is that it is sorta-kinda like a data structure array. The "data structure", in this example it is called ARRAY_ELEMENTS, contains three subfields, ELEM_1, ELEM_2, and ELEM_3. The first element at the "data structure" level contains the name of the array, all subsequent elements do not.

The above JSON array contains the following data:

Array
Element
ELEM_1 ELEM_2 ELEM_3
1 1 2 3
2 101 102 103

I am fortunate that Db2 for i SQL provides me with a Table function, JSON_TABLE, that I can use to extract the data from the JSON array. Below is an example of using this Table function:

01  WITH T0(DATA) 
02  AS (VALUES
03  '{"ARRAY_ELEMENTS":[{"ELEM_1":"1","ELEM_2":"2","ELEM_3":"3"},
04   {"ELEM_1":"101","ELEM_2":"102","ELEM_3":"103"}]}'
05  )

06  SELECT JSON.ELEMENT_1,JSON.ELEMENT_2,JSON.ELEMENT_3
07   FROM T0,
08    JSON_TABLE(
09      DATA,
10      '$.ARRAY_ELEMENTS[*]'
11      COLUMNS(
12        ELEMENT_1 CHAR(5) PATH '$.ELEM_1',
13        ELEMENT_2 CHAR(5) PATH '$.ELEM_2',
14        ELEMENT_3 CHAR(5) PATH '$.ELEM_3')
15  ) AS JSON ;

I am using a Common Table Expression, CTE.

Line 1: The first line of all CTE starts with WITH. I am creating a temporary table, called T0, that contains one column, DATA.

Line 2: I am using the VALUES SQL statement to load my JSON array into the column DATA.

Lines 3 and 4: My JSON array.

Lines 6 – 15: My Select statement to split the JSON array into individual columns of data.

Line 6: These three columns are generated by the JSON_TABLE Table function. I prefixed them all with "JSON" to make it obvious where they came from.

Line 7: I need to have the temporary table T0 in the statement so that I can use its column DATA.

Lines 8 – 15: The JSON_TABLE Table function.

Line 9: The first parameter passed to the Table function is the name of the column that contains the JSON array.

Line 10: Here I define the path to the JSON array, ARRAY_ELEMENTS. I use the "$" as I am not passing a path name to the Table function. As the array can contain more than one first level element I need to give [*] so that the Table functions knows there can be more than one element.

Lines 11 – 14: Definitions of the second level elements. Here the "$" is used as this is not a path.

Line 15: The columns returned with be prefixed with JSON.

The results are:

ELEMENT_1  ELEMENT_2  ELEMENT_3
---------  ---------  ---------
1          2          3    
101        102        103  

The three second level values have been extracted from the JSON array into their own columns.

Now onto my original purpose… to extract the data from the column BASED_ON_FILES in the SYSFILES View. The SQL statement will list all the logical type files in my library. Logical files are DDS logical files, DDL Indexes and Views:

01  SELECT SYSTEM_TABLE_NAME,BASED_ON_FILES
02    FROM QSYS2.SYSFILES
03   WHERE SYSTEM_TABLE_SCHEMA = 'MYLIB' 
04     AND NATIVE_TYPE = 'LOGICAL'

Line 1: I only want the system file name and the based on files columns.

Line 3: I am only interested in the objects in my library.

line 4: I only want the results for logical type files.

I have extract two results of different types for my example:

SYSTEM
_TABLE_
NAME       BASED_ON_FILES
---------  ---------------------
TESTFILEL  {"BASED_ON_FILES":[{"LIBRARY":"MYLIB","FILE":"TESTFILE",
              "MEMBER":"TESTFILE","LF_FORMAT":"TESTFILER"}]}

VIEW001    {"BASED_ON_FILES":[{"LIBRARY":"QPFRD74001","FILE":"QAPMHDWR",
              "MEMBER":"QAPMHDWR","LF_FORMAT":"SIRIUSUSRS"},
           {"LIBRARY":"QSYS2","FILE":"USER_INFO",
              "MEMBER":"USER_INFO","LF_FORMAT":"SIRIUSUSRS"}]}

TESTFILEL is a DDS logical file created over the physical file TESTFILE.

VIEW001 is a SQL View built over the files QAPMDWR and USER_INFO, as it was built over two tables/files/Views there are two array elements.

The statement I used to extract the data from the based on files JSON array:

01  WITH T0(FILE,DATA) 
02  AS (
03  SELECT SYSTEM_TABLE_NAME,BASED_ON_FILES 
04           FROM QSYS2.SYSFILES 
05          WHERE SYSTEM_TABLE_SCHEMA = 'MYLIB' 
06            AND NATIVE_TYPE = 'LOGICAL'
07  )

08  SELECT T0.FILE AS "File",
09         JSON.LIB AS "Based lib",
10         JSON.FILE AS "Based file",
11         JSON.MBR AS "Based mbr",
12         JSON.FORMAT AS "LF format"
13    FROM T0,
14  JSON_TABLE(
15    DATA,
16    '$.BASED_ON_FILES[*]'
17    COLUMNS(
18      LIB CHAR(10) PATH '$.LIBRARY',
19      FILE CHAR(10) PATH '$.FILE',
20      MBR CHAR(10) PATH '$.MEMBER',
21      FORMAT CHAR(10) PATH '$.LF_FORMAT')
22  ) AS JSON ;

Like the previous example this is a CTE.

Lines 1 – 7: In the first part I am extracting the system name of the file and JSON array into the temporary table T0.

Lines 8 – 12: The columns I want returned:

  • Name of the file
  • The library where the based on file exists
  • Name of the based on file
  • Name of the member of the based on file
  • Record format of the "logical" file, Index or View

I have given the columns meaningful column headings to make understanding the results easier.

Line 16: The name of this JSON array.

Lines 18 – 21: Definitions of the second level elements into columns.

My results are:

File        Based lib   Based file  Based mbr   LF format
----------  ----------  ----------  ----------  ----------
TESTFILEL   MYLIB       TESTFILE     TESTFILE   TESTFILER 
VIEW001     QPFRD74001  QAPMHDWR     QAPMHDWR   SIRIUSUSRS
VIEW001     QSYS2       USER_INFO    USER_INFO  SIRIUSUSRS

There is only one row for TESTFILEL, as it is only based on one file. There are two rows for VIEW001 each giving the information for the two based on objects.

Now I have the information in a format I can use anyway I want.

 

You can learn more about the JSON_TABLE SQL Table function from the IBM website here.

 

This article was written for IBM i 7.5 TR2 and 7.4 TR8.

2 comments:

  1. Thank you Simon Hutchinson! I was looking for this!

    ReplyDelete
  2. Originally I used CTE's for this, but, I think the "nested" syntax allows you to do the same thing in one query without used a CTE.

    ReplyDelete

To prevent "comment spam" all comments are moderated.
Learn about this website's comments policy here.

Some people have reported that they cannot post a comment using certain computers and browsers. If this is you feel free to use the Contact Form to send me the comment and I will post it for you, please include the title of the post so I know which one to post the comment to.