Wednesday, August 24, 2016

Extracting data from journals using SQL

display_journal journal dspjrn

Previously I wrote about how to extract data from a Journal using the Display Journal command, DSPJRN. There is an alternative approach with SQL using a User Defined Table Function, UDTF, called DISPLAY_JOURNAL. When I use a SQL Select statement this will return the same information as the output I was using from the DSPJRN command. Unlike the output from the DSPJRN command that comes in four different formats, the output from DISPLAY_JOURNAL has to be all things to all men, and contains more information than I need. You can see all the available columns in the IBM documentation that I have provide a link to at the bottom of this post.

I am just going to do the same as I did in the previous post, I want to get the changes made to a file on a specific date. As this is really a continuation of the previous post I am not going to repeat a lot of what I talked about there, if you feel I have missed something then check in that post.

The information I want is:

Description DISPLAY_JOURNAL
column
DSPJRN
equivalent
Time change happened ENTRY_TIMESTAMP JOTSP
Journal code, type of information JOURNAL_CODE JOCODE
Change type (insert, delete, update) JOURNAL_ENTRY_TYPE JOENTT
User JOB_USER JOUSER
Job name JOB_NAME JOJOB
Job number JOB_NUMBER JONBR
File, library, member OBJECT JOOBJ
Object type OBJECT_TYPE N/A
Program PROGRAM_NAME JOPGM
Program library PROGRAM_LIBRARY N/A
Data from record ENTRY_DATA JOESD
Sequence number within journal SEQUENCE_NUMBER JOSEQN

The column with the record data, ENTRY_DATA is defined as a BLOB (Binary Large OBject) data type, which will need to be converted before I use it. DISPLAY_JOURNAL has a number of parameters that can be passed to it to reduce the amount of data returned. In this example I will be using the parameters for:

  • Journal name and library, which are mandatory
  • Journal code "R", which is code for record changes
  • File I want the data for

There are many other parameters, and I recommend you check the IBM documentation to see if any would apply to your scenarios.

Now let's go straight to the code. My example is written in CL, I could have written it equally well in RPG, but in this case I just wrote it in CL.

Let's start with the simple stuff:

01  DCL VAR(&CHECK_DATE) TYPE(*CHAR) LEN(10) VALUE('2016-07-15')

02  RUNSQL SQL('DROP TABLE QTEMP/@JOURNAL') COMMIT(*NC)
03  MONMSG MSGID(SQL0000)

Line 1: This variable will be used to extract the data from the Journal for only this date.

Line 2 and 3: Before I create the table/file I need to delete it if it is already in QTEMP, and as this example is all SQL I have used the DROP TABLE rather than the DLTF command. Those of you who are familiar with this blog know I use the RUNSQL statement a lot. If it is new to you please read Run SQL statements in your CL.

Now for the code to extract the data from the Journal into a table/file in QTEMP:

04  RUNSQL SQL('CREATE TABLE QTEMP/@JOURNAL AS +
05              (SELECT ENTRY_TIMESTAMP AS ENTRY,+
06                      JOURNAL_CODE AS JRNCODE,+
07                      CASE WHEN JOURNAL_ENTRY_TYPE = ''PT'' +
08                             THEN ''INSERT'' +
09                           WHEN JOURNAL_ENTRY_TYPE = ''PX'' +
10                             THEN ''INSERT BY RRN'' +
11                           WHEN JOURNAL_ENTRY_TYPE = ''UB'' +
12                             THEN ''UPDATE BEFORE'' +
13                           WHEN JOURNAL_ENTRY_TYPE = ''UP'' +
14                             THEN ''UPDATE AFTER'' +
15                           WHEN JOURNAL_ENTRY_TYPE = ''DL'' +
16                             THEN ''DELETE'' +
17                           ELSE JOURNAL_ENTRY_TYPE +
18                      END AS JRNTYPE,+
19                      JOB_NAME AS JOBNAME,+
20                      JOB_USER AS JOBUSER,+
21                      JOB_NUMBER AS JOBNBR,+
22                      SUBSTR(OBJECT,1,10) AS FILE,+
23                      SUBSTR(OBJECT,11,10) AS FILELIB,+
24                      SUBSTR(OBJECT,21,10) AS FILEMBR,+
25                      OBJECT_TYPE AS OBJTYPE,+
26                      PROGRAM_NAME AS PGMNAME,+
27                      PROGRAM_LIBRARY AS PGMLIB,+
28                      CAST(ENTRY_DATA AS CHAR(2000)) AS RECORD,+
29                      SEQUENCE_NUMBER AS SEQNBR +
30                 FROM TABLE(DISPLAY_JOURNAL(''PRODLIB'',+
31                                            ''ORDJRN'',+
32                           JOURNAL_CODES => ''R'',+
33                          OBJECT_LIBRARY => ''PRODLIB'',+
34                             OBJECT_NAME => ''ORDDTL'',+
35                           OBJECT_MEMBER => ''ORDDTLR'',+
36                          OBJECT_OBJTYPE => ''*FILE'')) +
37                 AS X  +
38                WHERE SUBSTR(CHAR(ENTRY_TIMESTAMP),1,10) = +
39                             ''' || &CHECK_DATE || ''' +
40                ORDER BY SEQUENCE_NUMBER) +
41              WITH DATA') +
42           COMMIT(*NC)

Before I get started I need to define something. Being English and living in the USA there are times I lose track as to whether the word I use for something is English-English or American-English. A case in point is what I would call the single quote ( ' ), perhaps others would call in a apostrophe. When using the RUNSQL command if I have alphanumeric strings within the statement I need to have two single quotes ( '' ) before and after it, not a double quote ( " ).

Line 4: I am going to make a table of this information, and it will be called @JOURNAL in the library QTEMP.

Line 5: This is the start of the Select statement that ends on line 40. Like every Select statement it starts with a list of fields that I want. The first is ENTRY_TIMESTAMP, and I am using the AS to give it a short name, ENTRY.

Lines 7 – 17: Here I am "translating" the Journal Entry Code to something more meaningful using a CASE. The word "INSERT" is a lot easier for everyone to understand rather than the code "PT". For more information on using the CASE see Creating derived columns in SQL View.

Lines 23 – 24: There are not separate columns for the file name, library and member. It all comes in one, OBJECT. On these lines I am using the substring function to split OBJECT into three columns.

Line 28: As I mentioned above the data from the file is in the column ENTRY_DATA which is defined as a BLOB with the length of 8,000. On this line I am using the CAST to convert it to a character column of 2,000. If this was real program, not a test, I would CAST this to the record length of the ORDDTL file.

Lines 30 and 31: The FROM part of a Select statement is normally where I would give the name of the table/file to get the data from. As I am using DISPLAY_JOURNAL, which is a User Defined Table Function, UDTF, I use it instead. This function comes with a whole load of parameters I can use. The first two are mandatory as they are the file the Journal is in, line 30, and the Journal, line 31. All of the other parameters are optional.

Line 32: As I am interested in the changes to the Journaled file I want the Journal Code "R".

Lines 33 – 36: As I am only interested in the data from one of the files that are being journaled this is where I give its information. I have to give the library, file name, member, and type for this to work.

Line 37: As this is UDTF I need to have an AS, the letter(s) that follow are unimportant. I am lazy so I have just used "X".

Line 38 and 39: As I am only interested in the data for one day this is where I stipulate the date, which I defined on line 1.

Line 40: I am sorting the file by sequence to ensure that the rows are retrieved in the order they were added to the Journal.

Line 41: As this is a Create Table I need WITH DATA as I want data in my table.

In the last part of my example I want to give the columns descriptions that are meaningful to me. To do this I am using the method I described in Changing column headings in output file.

43  RUNSQL SQL('LABEL ON COLUMN QTEMP/@JOURNAL (+
44                ENTRY IS ''Time trigger        happened'',+
45              JRNCODE IS ''Journal             code'',+
46              JRNTYPE IS ''Journal             entry type'',+
47              JOBNAME IS ''Job                 name'',+
48              JOBUSER IS ''Job                 user'',+
49               JOBNBR IS ''Job                 number'',+
50                 FILE IS ''File                name'',+
51              FILELIB IS ''File                library'',+
52              FILEMBR IS ''File                member'',+
53              OBJTYPE IS ''Type of             object'',+
54              PGMNAME IS ''Program             name'',+
55               PGMLIB IS ''Program             library'',+
56               RECORD IS ''Record              data'',+
57               SEQNBR IS ''Sequence            number'')') +
58           COMMIT(*NC)

So what does it look like when I run this program and then look in @JOURNAL?

Time trigger                Journal  Journal        Job         Job      Job
happened                    code     entry type     name        user     number
2016-07-15-10.15.41.152368     R     INSERT         QPADEV000A  JANED    295956
2016-07-15-11.11.41.152368     R     INSERT         QPADEV000A  JANED    295961
2016-07-15-15.31.33.569520     R     UPDATE BEFORE  QPADEV0015  SUZIEQ   295752
2016-07-15-15.31.33.569520     R     UPDATE AFTER   QPADEV0015  SUZIEQ   295752


File        File        File        Type of  Program     Program     Record
name        library     member      object   name        library     data
ORDDTL      PRODLIB     ORDDTLR     *QDDS    ORD001      -           SQ01820
ORDDTL      PRODLIB     ORDDTLR     *QDDS    ORD001      -           SQ01821
ORDDTL      PRODLIB     ORDDTLR     *QDDS    ORD001A     -           MA10017
ORDDTL      PRODLIB     ORDDTLR     *QDDS    ORD001A     -           MA10017

I have not displayed all of the "Record data" as it is file specific and be different for each file.

I could access the Journal directly using DISPLAY_JOURNAL in a Fetch, rather than extract the information to a work file first. I am going to "block" my Fetches to reduce the amount of input operations I need to perform to the journal. My example of this approach would be:

01  dcl-s CheckDate char(10) inz('2016-08-22') ;

02  dcl-ds Jrn qualified dim(999) ;
03    Entry timestamp ;
04    Code char(1) ;
05    Type char(2) ;
06    JobName char(10) ;
07    JobUser char(10) ;
08    JobNbr char(6) ;
09    File char(10) ;
10    FileLib char(10) ;
11    FileMbr char(10) ;
12    PgmName char(10) ;
13    Record char(2000) ;
14    SeqNbr packed(21) ;
15  end-ds ;

16  dcl-s Elements packed(3) inz(%elem(Jrn)) ;

17  exec sql DECLARE C0 CURSOR FOR
18             SELECT ENTRY_TIMESTAMP,JOURNAL_CODE,JOURNAL_ENTRY_TYPE,
19                    JOB_NAME,JOB_USER,JOB_NUMBER,SUBSTR(OBJECT,1,10),
20                     SUBSTR(OBJECT,11,10),SUBSTR(OBJECT,21,10),
21                     PROGRAM_NAME,CAST(ENTRY_DATA AS CHAR(2000)),
22                     SEQUENCE_NUMBER
23               FROM TABLE(DISPLAY_JOURNAL('PRODLIB',
24                                          'ORDJRN',
25                         JOURNAL_CODES => 'R',
26                        OBJECT_LIBRARY => 'PRODLIB',
27                           OBJECT_NAME => 'ORDDTL',
28                         OBJECT_MEMBER => 'ORDDTL',
29                        OBJECT_OBJTYPE => '*FILE'))
30               AS X
31              WHERE SUBSTR(CHAR(ENTRY_TIMESTAMP),1,10) = :CheckDate
32              ORDER BY SEQUENCE_NUMBER ;

33  exec sql OPEN C0 ;

34  dow (1 = 1) ;
35    exec sql FETCH NEXT FROM C0 FOR :Elements ROWS
                     INTO :Jrn ;
36    if (SQLCOD <> 0) ;
37      leave ;
38    endif ;


39  enddo ;

40  exec sql CLOSE C0 ;

Line 1: Definition of the date I would to Fetch the data for. I will be using this when I define the cursor.

Lines 2 – 15: This data structure array contains all of the columns I want.

Line 16: Rather than "hard code" the number of rows I want to fetch in the Fetch statement by initializing the variable with the number of elements in the data structure array if I change the number of elements in the array I don't have to change any other code.

Lines 17 – 32: The declaration of the cursor is pretty much the same as my example when creating the work file. I did drop a couple of the columns, OBJECT_TYPE and PROGRAM_LIBRARY, and I did not define short names for the columns in the SQL statement. The short names are defined in the data structure array as subfields.

Line 33: Having declared the cursor I need to open it before I can use it.

Lines 34 – 39: This is a very simple Do-loop that I would code to fetch all of the data for the Journal. I am not going to explain how to do this in too much detail as I gave examples in SQL blocking fetches, getting more than one row at a time.

Line 35: This is where I fetch the same number rows as I have elements in my data structure array, into that array. Examples of how to tell how many rows were retrieved, etc, are given in the post I mentioned in the previous paragraph.

Lines 36 - 38: If something went amiss, such as no records were retrieved, which happens at "end of file", SQLCOD will not be zero and I want to leave this Do-loop.

Line 40: Upon existing the Do-loop I have no further use for this cursor, therefore, I close it.

Before using the approach to directly Fetch the data from the Journal I would perform a test to compare the amount of time it takes to extract the data from the Journal into a work file and the programs to report this data versus directly Fetch the data from the Journal. You might find that one approach takes a lot longer than the other depending upon the amount of data in the Journal for the file and date(s) you are interested in.

Rather than define all of the columns, selection criteria etc time and again I thought i would make it easier by creating a View:

CREATE VIEW PRODLIB.ORDVIEW01 AS
            (SELECT ENTRY_TIMESTAMP AS ENTRY,
                    JOURNAL_CODE AS JRNCODE,
                    CASE WHEN JOURNAL_ENTRY_TYPE = 'PT'
                           THEN 'INSERT'
                         WHEN JOURNAL_ENTRY_TYPE = 'PX'
                           THEN 'INSERT BY RRN'
                         WHEN JOURNAL_ENTRY_TYPE = 'UB'
                           THEN 'UPDATE BEFORE'
                         WHEN JOURNAL_ENTRY_TYPE = 'UP'
                           THEN 'UPDATE AFTER'
                         WHEN JOURNAL_ENTRY_TYPE = 'DL'
                           THEN 'DELETE'
                         ELSE JOURNAL_ENTRY_TYPE
                    END AS JRNTYPE,
                    JOB_NAME AS JOBNAME,
                    JOB_USER AS JOBUSER,
                    JOB_NUMBER AS JOBNBR,
                    SUBSTR(OBJECT,1,10) AS FILE,
                    SUBSTR(OBJECT,11,10) AS FILELIB,
                    SUBSTR(OBJECT,21,10) AS FILEMBR,
                    OBJECT_TYPE AS OBJTYPE,
                    PROGRAM_NAME AS PGMNAME,
                    PROGRAM_LIBRARY AS PGMLIB,
                    CAST(ENTRY_DATA AS CHAR(1000)) AS RECORD,
                    SEQUENCE_NUMBER AS SEQNBR
                    FROM TABLE(DISPLAY_JOURNAL('PRODLIB',
                                               'ORDJRN',
                              JOURNAL_CODES => 'R',
                             OBJECT_LIBRARY => 'PRODLIB',
                                OBJECT_NAME => 'ORDDTL',
                              OBJECT_MEMBER => 'ORDDTLR',
                             OBJECT_OBJTYPE => '*FILE''))
               AS X) ;

While I can use this View using STRSQL, I cannot use it in CL, RPG, and Query Management Query. When I do I get a SQL Code of -901, SQL State of 58004, telling me to look in my joblog. There I see a CPF4204, with the message text of "Internal failure occurred in query processor". Doing a quick search it would appear that the IBM i I working on needs a PTF, which I could not get loaded before publishing this.

  select * from ordrview01
  where substr(char(Entry),1,10) = '2016-08-15'
  order by SeqNbr

Having tried both the DSPJRN command and the DISPLAY_JOURNAL which do I prefer? I would use the DISPLAY_JOURNAL because, in my experience, it is slightly faster.

 

You can learn more about DISPLAY_JOURNAL from the IBM website here.

 

This article was written for IBM i 7.2, and should work for earlier releases too.

8 comments:

  1. Our platform keeps getting better and better. Thanks for keeping us up to date.

    ReplyDelete
  2. This is REALLY cool! I can't get it to work on V7R1, though. It complains that "[SQL0204] DISPLAY_JOURNAL in *LIBL type *N not found."

    ReplyDelete
    Replies
    1. First did you make sure QSYS2 is in your library list.

      Second DISPLAY_JOURNAL only came in the base release of 7.2 . It was a PTF for earlier releases for 7.1 the PTF(s) are:

      DB2 for IBM i Group PTF: SF99701: 710 DB2 for IBM i - Level 3

      Or

      5770SS1 V7R1M0 SI39820
      5770SS1 V7R1M0 SI39821

      Delete
  3. Do you have a technique for retrieving the current sequence number for a journal? for journaling to be really useful you would want to know this before the first use. I guess you could use a Timestamp but that's not as absolute as querying the current CSN/LSN/SEQNO.

    ReplyDelete
    Replies
    1. There is a SEQNO column that you can retrieve, as I do in my example. Is that what you mean?

      Delete
  4. Yep, typical CDC usage, read current SEQNO, do full load, process journal up to what the SEQNO was (so you can handle any reprocessing that was done during the load), then using the Last SEQNO processed for each of the next journal pulls to apply transactional consistent changes.
    Doing something like a display_journal with max(seqno) as the starter is doable but its a pretty high cost query especially on a really busy system. In other platforms there is a simple call to get the LSN/CSN (equivalent to SEQNO) but I haven't found an easy way to get it, the best I have seen, and I lost the link was a proc/program that creates a dummy entry, captures the seqno, then rolls it back.

    ReplyDelete
  5. hi, any suggstions on how can someone parse ENTRY_DATA into individual BigInt cols(if any)? for ex: this can give entry_data from given position as varcar:
    cast( cast(substring(entry_data,3, 15) as varchar(15) for bit data) as varchar(15) ccsid 37) as Item
    Is there a way to do the same for BigInt?

    ReplyDelete

To prevent "comment spam" all comments are moderated.
Learn about this website's comments policy here.

Some people have reported that they cannot post a comment using certain computers and browsers. If this is you feel free to use the Contact Form to send me the comment and I will post it for you, please include the title of the post so I know which one to post the comment to.