Wednesday, July 18, 2018

Getting data from a XML file using SQL

retrieve xml data from file in ifs

A couple of months ago I write a post about creating XML from an IBM i hosted file and write it to a file in the IFS. It was obvious to me that I would have to write this post, describing how to get data from the XML file using SQL.

I could extract the data from the XML file in the IFS using the RPG operation code XML-INTO, but I want to use SQL to do this.

Where to start? I guess with the XML data. The XML file, xmlfile.xml, is in my folder, MyFolder, in the IFS. It contains the same data I used in the post about writing to the XML file. In the file the data is not formatted, it is just one long string of data.

<?xml version="1.0" encoding="UTF-8"><New_Orders><Order><Order_Nu

To make it easier to understand what is contained within I have formatted it below.

<?xml version="1.0" encoding="UTF-8">
<New_Orders>
  <Order>
    <Order_Number>15170Q</Order_Number>
    <Customer_Number>9877A</Customer_Number>
    <Order_Date>2018-05-01</Order_Date>
    <Order_Amount>100.23</Order_Amount>
    <Part_Number>1BTEF8U1</Part_Number>
  </Order>
  <Order>
    <Order_Number>565548</Order_Number>
    <Customer_Number>9877A</Customer_Number>
    <Order_Date>2018-05-06</Order_Date>
    <Order_Amount>9.51</Order_Amount>
    <Part_Number>AEW445</Part_Number>
  </Order>
</New_Orders>

My example program is going to perform two tasks:

  1. Get the XML data from the IFS file and load it into an IBM i file/table
  2. Take the XML data from the IBM i file/table and format it into columns that I can then validate in a program

The RPG program I wrote to perform this may look long and complicated, but in reality it is simple. Let me start with the definitions.

01  **free
02  ctl-opt option(*nodebugio:*srcstmt:*nounref) ;

03  dcl-ds Data qualified dim(9999) ;
04    OrderNo char(10) ;
05    Customer char(10) ;
06    OrderDate char(10) ;
07    OrderAmt char(10) ;
08    PartNo char(20) ;
09    Sequence int(10) ;
10  end-ds ;

11  dcl-s Rows uns(5) inz(%elem(Data)) ;
12  dcl-s InFile sqltype(clob_file) ;
13  dcl-s Path varchar(100) inz('/MyFolder/xmlfile.xml') ;

14  exec sql SET OPTION COMMIT = *NONE, 
                        CLOSQLCSR = *ENDMOD, 
                        DATFMT = *ISO ;

Line 1: As with all of my RPG code nowadays it is totally free form RPG.

Line 2: My favorite control options make it easier for me to debug the program.

Lines 3 - 10: Those of you who have read my previous articles will recognize this as a data structure array, and know that I will be performing a multiple row Fetch. The subfields in this data structure correspond to the elements in the XML document. They do not have to be the same names as the elements.

Line 11: I will be using this variable, Rows in the Fetch statement to say how many rows I will fetch. I have initialized it with the number of elements in the data structure array.

Line 12: This variable is used to retrieve the data from the IFS file. To learn about what the SQLTYPE keyword does see here. Basically this is a CLOB, Character Large Object, SQL variable, which is something that cannot be defined using standard RPG definitions.

Line 13: The path to the XML IFS file is contained in this variable.

Line 14: I add these SQL options to all my programs to make sure they are not forgotten when the program is compiled.

The next part of the program gets the XML data from the IFS file.

15  clear InFile ;
16  Infile_Name = %trimr(Path) ;
17  Infile_NL = %len(%trimr(Infile_Name)) ;
18  Infile_FO = SQFRD ;

19  exec sql DROP TABLE QTEMP.TESTFILE ;
20  exec sql CREATE TABLE QTEMP.TESTFILE (COL1 XML) ;
21  exec sql INSERT INTO QTEMP.TESTFILE VALUES(:InFile) ;

Lines 15 – 18: RPG makes the CLOB variable into a data structure. I need to provide the path name, line 16, the length of the path name, line 17, and a flag that says I will be reading the file, line 18. For more information see the post about creating the XML file mentioned in the first paragraph of this post.

Line 19: As I am going to create a table to contain the data I will be getting from the XML file in the IFS I am using the DROP TABLE statement to delete it if it already exists in the library QTEMP.

Line 20: My file contains just one column, COL1, which is defined as a XML type, in other words this column can only contain XML data.

Line 21: This is so simple. This one line takes all the data from the IFS file and inserts it into the table I just created in QTEMP.

If I was to look in the output file at this time I would not see any recognizable XML, just this:

COL1
*POINTER
********  End of report  ********

Which is not at all helpful, except to tell me I have data.

Now I need to get the XML data and format it into my data structure array.

21  clear Data ;

22  exec sql DECLARE C0 CURSOR FOR
23             SELECT A.* FROM QTEMP.TESTFILE B,
24               XMLTABLE('$doc/New_Orders/Order'
25               PASSING B.COL1 AS "doc"
26               COLUMNS
27               OrderNo CHAR(10) PATH 'Order_Number',
28               Customer CHAR(10) PATH 'Customer_Number',
29               OrderDate CHAR(10) PATH 'Order_Date',
30               OrderAmt CHAR(10) PATH 'Order_Amount',
31               PartNo CHAR(20) PATH 'Part_Number',
32               Sequence FOR ORDINALITY
33               ) AS A ;

34  exec sql OPEN C0 ;
35  exec sql FETCH C0 FOR :Rows ROWS INTO :Data ;
36  exec sql GET DIAGNOSTICS :Rows = ROW_COUNT ;
37  exec sql CLOSE C0 ;

Line 21: I always like to clear the data structure array before I use it.

Line 22 – 33: This is where I define how the XML will be broke into columns.

Line 22: Standard start of a cursor definition.

Line 23: I will be selecting all of the columns I create, A.*, from my output file in QTEMP.

Line 24: The XMLTABLE where I define how to break it up into columns. This line states that the elements I wants are contained within the Order element that is contained within the New_Orders element. The $doc is defined in the line below.

Line 25: I am passing the contents of the outfile's column COL1 to the "variable" doc.

Line 26: This the start of where I define the columns from the XML.

Line 27 - 31: The columns are all defined like:

  DataStructureSubfield CHAR(x) PATH 'XML_element',

For example on line 27 the XML element Order_Number is mapped to the data structure subfield OrderNo, which I have defined as a 10 character column. This size matches the definition of the subfield in the data structure array.

I have defined the amount and date columns as character as in my experience these elements not always a valid number or date. Sometimes the number might contain a thousand separate character or a currency symbol. I would rather capture the amount as character and handle any invalid characters in the program, than have an error if I had defined it as numeric.

Line 32: This column is generated by this SQL statement. It will increment for every set of data (row), i.e. ... .

Line 34: I open the cursor.

Line 35: I do my multiple row Fetch to get 9,999 rows at one time.

Line 36: I get the number of rows I actually fetched, which I would use later in the program to fetch from the data structure array.

Line 37: I have to close the cursor.

When I add a debug breakpoint on line 38 I can see that I retrieved two rows of data, and loaded it into the data structure array.

EVAL rows
ROWS = 2

EVAL data
DATA.ORDERNO(1) = '15170Q    '
DATA.CUSTOMER(1) = '9877A     '
DATA.ORDERDATE(1) = '2018-05-01'
DATA.ORDERAMT(1) = '100.23    '
DATA.PARTNO(1) = '1BTEF8U1            '
DATA.SEQUENCE(1) = 1
DATA.ORDERNO(2) = '565548    '
DATA.CUSTOMER(2) = '9877A     '
DATA.ORDERDATE(2) = '2018-05-06'
DATA.ORDERAMT(2) = '9.51      '
DATA.PARTNO(2) = 'AEW445              '
DATA.SEQUENCE(2) = 2
DATA.ORDERNO(3) = '          '

I could now validate the data in the data structure array and insert the validated data into another file or table.

Yes, it is that easy.

 

You can learn more about the XMLTABLE table function from the IBM website here.

 

This article was written for IBM i 7.3, and will work for every release since 7.1 TR4.

3 comments:

  1. very useful I applied it and it gave me good results.
    Thanks a lot.

    ReplyDelete
  2. You can't see the XML data in table TESTFILE with RUNQRY or STRSQL, but the Run SQL Scripts feature of ACS displays the data.

    ReplyDelete
  3. Tried to do your example on my system but I was getting some errors until I found out that the sample XML file was missing a trailing '?' character in the tag on the first line, right after "UTF-8". Thank you Simon for a site full of gems that I've been using for years.

    ReplyDelete

To prevent "comment spam" all comments are moderated.
Learn about this website's comments policy here.

Some people have reported that they cannot post a comment using certain computers and browsers. If this is you feel free to use the Contact Form to send me the comment and I will post it for you, please include the title of the post so I know which one to post the comment to.