Wednesday, May 9, 2018

Creating a XML file

create xml file using sql and rpg

In an earlier post I described how to retrieve information from a XML file. This left me thinking how do I create a XML file just using native IBM i functionality?

I wanted to keep my program simple, K.I.S.S. After a good deal of research I have an example program of 28 lines, which takes data from a DDS file or DDL table, converts the data to XML, and outputs the data to a file in the IFS. I need acknowledge Birgitta Hauser here. She is someone whose work I study, and articles I always read. Some of the code in this example is based upon a series of example procedures and programs she posted on Github. I will post a link to her Github at the bottom of this post along with the other reference links.

Before I continue I want to describe the parts of a XML document so you understand when I call an element something what it is. Below is an example of a XML document:

01  <?xml version="1.0" encoding="UTF-8">
02  <root>
03    <parent>
04      <child1>xxx</child1>
05      <child2>xxx</child2>
06    </parent>
07    <parent>
08      <child1>xxx</child1>
09      <child2>xxx</child2>
10    </parent>
11  </root>

Line 1: Every well formed XML document starts with one of these elements. It gives the version of the XML standard that is being used, and the character set.

Line 2: root is the root element, and it has its matching end on line 11. Usually there is just one root element per file.

Lines 3 – 6 and 7 – 10: There are two parent elements within this file. Each parent has starting tags, lines 3 and 7, and matching ending tags, 6 and 10.

Lines 4 and 5, 8 and 9: The children contain the "data”.

Now I have explained that I can start with the program. There is a lot of cool interesting stuff in this program, so I am going to break it up into parts to make it easier for me to explain the code and I hope for you to understand. I will give the entire program at the bottom of this post.

Let me start with defining the variables:

01  **free

02  dcl-s XmlData sqltype(xml_clob:3000) ccsid(37) ;
03  dcl-s Outfile sqltype(clob_File) ;
04  dcl-s Data sqltype(clob:1000000) ccsid(*utf8) ;

05  dcl-s Changed varchar(3000) ;
06  dcl-s Path varchar(100) inz('/MyFolder/xmlfile.xml') ;
07  dcl-s ThisCustomer char(10) inz('9877A') ;
08  dcl-s StartDate date(*iso) inz(d'2018-04-01') ;

Line 1: For those of you who have read this blog before should now be familiar with the **FREE, as all of my examples are in totally free RPG.

Lines 2 – 4: RPG does not have equivalent variables to SQL's Large Objects, LOBs, therefore, the SQLTYPE keyword was added to the variable definition. At compile time the SQL precompiler replaces these declarations with their RPG equivalents. I can hear you all say: "What are all the LOBs I can use?” Rather than describe them all in this post I will write about them in a future post. In this example we have three different types:

Line 2: XmlData is defined as XML_CLOB. This will receive the results from the SQL statement I will use to create the XML output. I have made it 3,000 characters long. I have also defined it with the CCSID of 37, which allow me to see the data in a recognized form in my program.

Line 3: Outfile is defined as CLOB_FILE. This will allow me to write data directly to the file in the IFS.

Line 4: Data, which is a CLOB, will contain the data I want to write to the IFS file. I have defined it as 1 million characters, and with the CCSID of UTF-8.

How does the SQL precompiler convert these into RPG:

//*DCL-S XMLDATA SQLTYPE(XML_CLOB:3000) CCSID(37) ;
        DCL-DS XMLDATA;
          XMLDATA_LEN UNS(10);
          XMLDATA_DATA CHAR(3000) CCSID(37);
        END-DS XMLDATA;
 //*DCL-S OUTFILE SQLTYPE(CLOB_FILE) ;
        DCL-DS OUTFILE;
          OUTFILE_NL UNS(10);
          OUTFILE_DL UNS(10);
          OUTFILE_FO UNS(10);
          OUTFILE_NAME CHAR(255) CCSID(*JOBRUNMIX);
        END-DS OUTFILE;
 //*DCL-S DATA SQLTYPE(CLOB:1000000) CCSID(*UTF8) ;
        DCL-DS DATA;
          DATA_LEN UNS(10);
          DATA_DATA CHAR(1000000) CCSID(1208);
        END-DS DATA;

They have been converted to RPG data structures. I will explain how to use the subfields as I use them.

Lines 5 – 8: These variable definitions are very standard. ThisCustomer and StartDate will be used in the SQL statement to select the results I desire.

And on to the SQL statements…

09  exec sql SET OPTION COMMIT=*NONE,CLOSQLCSR=*ENDMOD,
                        DATFMT=*ISO ;

10  exec sql VALUES(SELECT XMLGROUP(RTRIM(ORDERNBR) AS 
                                                 "Order_Number",
11                                  RTRIM(CUSTOMER) AS 
                                              "Customer_Number",
12                                  ORDERDATE AS "Order_Date",
13                                  ORDERAMT AS "Order_Amount",
14                                  RTRIM(SKU) AS "Part_Number"
15                         OPTION ROW "Order" ROOT "New_Orders")
16                         FROM ORDERFILE
17                         WHERE CUSTOMER = :ThisCustomer
18                           AND ORDERDATE >= :StartDate)
19            INTO :XmlData ;

Line 9: I put the SQL options in every program I use SQL in so that I don't have to worry about them when compiling the code.

Lines 10 - 19: This is where the magic happens and the relational data from the file/table ORDFILE is made into XML.

Line 10: There is a lot going on here. I am using the SQL VALUES statement to place the results of my Select statement into variable XmlData, line 19. XmlData has to be defined as XML_CLOB to the results are formatted correctly. The XMLGROUP function returns the results of the Select statement as "well formatted XML”. RTRIM, right trim, in this format removes any trailing spaces from the string. The AS allows me to give the XML children elements the names I desire.

Line 15: This is where I can give the names for the root and parent elements. ROW defines the parent's name, and ROOT the root's name.

Lines 16 – 18: Are pretty standard. FROM is the file/table/view the data comes from. WHERE allows me to select the information for just one customer and for all order after the given date.

Line 19: The XML string generated is placed in the variable XmlData.

If I was running the program using debug I could see the results from this SQL statement:

EVAL XmlData
XMLDATA_LEN OF XMLDATA = 639
XMLDATA_DATA OF XMLDATA =

  1 '<?xml version="1.0" encoding="IBM037"?><New_Orders><Order><O'
 61 'rder_Number>15170Q</Order_Number><Customer_Number>9877A</Cus'
121 'tomer_Number><Order_Date>2018-05-01</Order_Date><Order_Amoun'
181 't>100.23</Order_Amount><Part_Number>1BTEF8U1</Part_Number></'
241 'Order><Order><Order_Number>56554B</Order_Number><Customer_Nu'
301 'mber>9877A</Customer_Number><Order_Date>2018-05-06</Order_Da'
361 'te><Order_Amount>9.51</Order_Amount><Part_Number>AEW445</Par'
421 't_Number></Order><Order><Order_Number>22231C</Order_Number><'
481 'Customer_Number>9877A</Customer_Number><Order_Date>2018-04-1'
541 '5</Order_Date><Order_Amount>78.00</Order_Amount><Part_Number'
601 '>N/A</Part_Number></Order></New_Orders>                     '

It is not formatted nicely with each element on its own line, but it still "well formed”. Do notice that in the first element, for the character set, it has IBM037 as the character set. This is because I defined XmlData, on line 2, with the CCSID 37.

The next few lines of the program are used to define the IFS file:

20  clear Outfile ;
21  Outfile_Name = %trimr(Path) ;
22  Outfile_NL = %len(%trimr(Outfile_Name)) ;
23  Outfile_FO = SQFCRT ;

There are four subfields in this data structure:

  1. OUTFILE_NAME: Name of the IFS file, including the full path.
  2. OUTFILE_NL: Length of the IFS file name.
  3. OUTFILE_FO: File operation, see below.
  4. OUTFILE_DL: Not used.

I don't have to remember the numbers that are used for the different file operations as the SQL precompiler includes them in the program as constants:

01  DCL-C SQFRD CONST(2);
02  DCL-C SQFCRT CONST(8);
03  DCL-C SQFOVR CONST(16);
04  DCL-C SQFAPP CONST(32);
  • SQFRD: Read only
  • SQCRT: Create
  • SQFOVR: Overwrite
  • SQFAPP: Append

Line 21: I move the IFS path and file name into the subfield for the file name.

Line 22: I determine the length of the file name using RPG's %TRIM and %LEN built in functions.

Line 23: I want to create new IFS file so I use the value from the SQFCRT constant.

And onto the last few lines of the program…

24  Changed = XmlData_Data ;
25  Data_Data = %scanrpl('IBM037':'UTF-8':Changed) ;
26  Data_Len = %len(%trimr(Data_Data)) ;

27  exec sql SET :Outfile = :Data ;

28  *inlr = *on ;

Line 24 - 26: When I created my XML, contained in XmlData, I used the CCSID 37 which ended up as the character type definition element at the start of the file. I do not want my IFS file to be CCSID 37, I want it to be UTF-8. It is easy to convert the data from character set to another by just moving it from one variable to another. But my definition element has to change too. To be able to use the Scan and Replace built in function, %SCANRPL, to make the change all three of the BiF's parameters have to be the same CCSID. Therefore, I have to move the contents of XmlData to a variable, Changed, that uses the program's default CCSID, line 24. I can then use the %SCANRPL BiF to replace the IBM037 with UTF-8, and move the result into the variable I defined as UTF-8, line 25. Finally I need to determine the length of the string in Data_Data and place the result in Data_Len, line 26.

Line 27: This is, in my humble opinion, one of the coolest lines in the whole program. This SQL Set statement create a file in the IFS, and then moves the contents of the variable Data to it. Wow!

When I go to the IFS folder MyFolder I find my file xmlfile.xml, and it contains the XML data I generated using the SQL statement, lines 10 – 19. When I open the file the data looks as it did in the variable XmlData, but I have formatted it to make it easier to understand below:

<New_Orders>
  <Order>
    <Order_Number>15170Q</Order_Number>
    <Customer_Number>9877A</Customer_Number>
    <Order_Date>2018-05-01</Order_Date>
    <Order_Amount>100.23</Order_Amount>
    <Part_Number>1BTEF8U1</Part_Number>
  </Order>
  <Order>
    <Order_Number>56554B</Order_Number>
    <Customer_Number>9877A</Customer_Number>
    <Order_Date>2018-05-06</Order_Date>
    <Order_Amount>9.51</Order_Amount>
    <Part_Number>AEW445</Part_Number>
  </Order>
  <Order>
    <Order_Number>22231C</Order_Number>
    <Customer_Number>9877A</Customer_Number>
    <Order_Date>2018-04-15</Order_Date>
    <Order_Amount>78.00</Order_Amount>
    <Part_Number>N/A</Part_Number>
  </Order>
</New_Orders>

The other of XML format I know is where the data is placed within the parent element as "attributes”:

01  <?xml version="1.0" encoding="UTF-8">
02  <root>
03    <parent child1=”xxx” child2=”xxx” />
03    <parent child1=”xxx” child2=”xxx” />
04  </root>

Again Db2 for i comes to my rescue. With the addition of the AS ATTRIBUTES on line 16, below, the result is now formatted into this form.

10  exec sql VALUES(SELECT XMLGROUP(RTRIM(ORDERNBR) AS
                                                 "Order_Number",
11                                 RTRIM(CUSTOMER) AS 
                                              "Customer_Number",
12                                  ORDERDATE AS "Order_Date",
13                                  ORDERAMT AS "Order_Amount",
14                                  RTRIM(SKU) AS "Part_Number"
15                         OPTION ROW "Order" ROOT "New_Orders"
16                         AS ATTRIBUTES)
17                         FROM ORDFILE
18                         WHERE CUSTOMER = :ThisCustomer
19                           AND ORDERDATE >= :StartDate)

The results in XMLData looks like:

EVAL XmlData
XMLDATA_LEN OF XMLDATA = 423
XMLDATA_DATA OF XMLDATA =

  1 '<?xml version="1.0" encoding="IBM037"?><New_Orders><Order Or'
 61 'der_Number="15170Q" Customer_Number="9877A" Order_Date="2018'
121 '-05-01" Order_Amount="100.23" Part_Number="1BTEF8U1"/><Order'
181 ' Order_Number="56554B" Customer_Number="9877A" Order_Date="2'
241 '018-05-06" Order_Amount="9.51" Part_Number="AEW445"/><Order '
301 'Order_Number="22231C" Customer_Number="9877A" Order_Date="20'
361 '18-04-15" Order_Amount="78.00" Part_Number="N/A"/></New_Orde'
421 'rs>                                                         '

One parent element now looks like:

<Order Order_Number="15170Q" Customer_Number="9877A" 
 Order_Date="2018-05-01" Order_Amount="100.23"
 Part_Number="1BTEF8U1"/>

If this was a production program it would be shorter. I would define the variable XmlData as UTF-8, thereby removing the need for lines needed to convert the data type to UTF-8 & replace the IBM037 from the string. But as this was an example I wanted to show what the SQL statement produced before I wrote the data to the file.

I think all that remains is to thank Birgitta for her example, and to list the reference links below.

 

 

This article was written for IBM i 7.3, and should work for earlier releases too.

 

Complete program

01  **free

02  dcl-s XmlData sqltype(xml_clob:3000) ccsid(37) ;
03  dcl-s Outfile sqltype(clob_File) ;
04  dcl-s Data sqltype(clob:1000000) ccsid(*utf8) ;

05  dcl-s Changed varchar(3000) ;
06  dcl-s Path varchar(100) inz('/MyFolder/xmlfile.xml') ;

07  dcl-s ThisCustomer char(10) inz('9877A') ;
08  dcl-s StartDate date(*iso) inz(d'2018-04-01') ;

09  exec sql SET OPTION COMMIT=*NONE,CLOSQLCSR=*ENDMOD,DATFMT=*ISO ;

10  exec sql VALUES(SELECT XMLGROUP(RTRIM(ORDERNBR) AS "Order_Number",
11                                  RTRIM(CUSTOMER) AS "Customer_Number",
12                                  ORDERDATE AS "Order_Date",
13                                  ORDERAMT AS "Order_Amount",
14                                  RTRIM(SKU) AS "Part_Number"
15                         OPTION ROW "Order" ROOT "New_Orders")
16                         FROM ORDERFILE
17                         WHERE CUSTOMER = :ThisCustomer
18                           AND ORDERDATE >= :StartDate)
19            INTO :XmlData ;

20  clear Outfile ;
21  Outfile_Name = %trimr(Path) ;
22  Outfile_NL = %len(%trimr(Outfile_Name)) ;
23  Outfile_FO = SQFCRT ;

24  Changed = XmlData_Data ;
25  Data_Data = %scanrpl('IBM037':'UTF-8':Changed) ;
26  Data_Len = %len(%trimr(Data_Data)) ;

27  exec sql SET :Outfile = :Data ;

28  *inlr = *on ;

9 comments:

  1. Hello Simon, i think there is limit of XML output file size ?

    ReplyDelete
    Replies
    1. According to IBM's documentation a XML_CLOB variable must be greater than 1 and less that 16,773,100

      Delete
  2. Sergio L Puentes-ValladaresMay 9, 2018 at 5:30 AM

    Buenísimo, siempre con funcionalidades útiles, GRACIAS buddy

    ReplyDelete
  3. Is there any way to parse the XML without using XML sax functions in Iseries? An example would be much appreciated. Thanks in advance.

    ReplyDelete
  4. Sweet example Simon - Thanks! Now, lets see how to do the same thing with JSON ;)

    ReplyDelete
  5. Hi Simon,i tried this program but it is returning Unicode/Hex values into IFS folder's XML file .Is it version issue.

    ReplyDelete
    Replies
    1. Don't we want the output to be in Unicode?
      that is the best output format.
      if not you will have to Play with the CCSID values for the variables until you get the CCSID what you want.

      Delete

To prevent "comment spam" all comments are moderated.
Learn about this website's comments policy here.

Some people have reported that they cannot post a comment using certain computers and browsers. If this is you feel free to use the Contact Form to send me the comment and I will post it for you, please include the title of the post so I know which one to post the comment to.