Wednesday, July 1, 2026

Retrieving the value of a column from a previous row

I was sent a copy of a job log from weeks ago, and had been asked to identify which parts of the job had run the longest. Fortunately, the spool file of the job log was still available so I could do some SQL magic to extract useful information from it. The first thing I want to do is to determine is how long each program or command took. I can extract the timestamp from the spool file, but I need to be able to retrieve the timestamp for the previous entry too. How can I retrieve that information for the previous entry?

After a bit of searching I found a SQL function that would allow me to do this. Rather than go straight into showing what I did with the joblog spool file, I am going to start with a simpler example. I have a DDL table, TESTTABLE, that has a decimal column, TEST_NUMBER, and nine rows of data. I can show the data using the following SQL statement:

01  SELECT TEST_NUMBER
02    FROM TESTTABLE

Which returns:

TEST_NUMBER
-----------
          1
          2
          3
          4
          5
          6
          7
          8
          9

To get the details from the previous row I need to use a LAG, the syntax of which is:

 LAG(< column >, 1) OVER (ORDER BY < key column >)

Where:

  • < column > :  The name of the column I want to retrieve the data from
  • 1 :  This means that I want to retrieve the value that was in the column in the previous record
  • < key column > :  A column that contains something I can use as a key to identify the previous row

As I want the value of the column TEST_NUMBER from the previous row I use a 1 in the LAG, and I order by TEST_NUMBER.

01  SELECT TEST_NUMBER,
02         LAG(TEST_NUMBER,1) OVER (ORDER BY TEST_NUMBER) AS "Previous"
03    FROM TESTTABLE

The results are:

TEST_NUMBER   Previous
-----------   --------
          1   <NULL>
          2          1
          3          2
          4          3
          5          4
          6          5
          7          6
          8          7
          9          8

The value for the previous row for the row where TEST_NUMBER is "1" is null as there is no row before it.

I can guess what the results will be, but I want to prove to myself what happens if I want to see the data from the row prior to the previous row. I replace the "1" with "2".

01  SELECT TEST_NUMBER,
02         LAG(TEST_NUMBER,2) OVER (ORDER BY TEST_NUMBER) AS "Previous"
03    FROM TESTTABLE

And now the results show the values from the row that is two records previous.

TEST_NUMBER   Previous
-----------   --------
          1   <NULL>
          2   <NULL>
          3          1
          4          2
          5          3
          6          4
          7          5
          8          6
          9          7

What happens if I have two rows with the same value? I had to test it:

01  INSERT INTO TESTTABLE VALUES(5) ;

02  SELECT TEST_NUMBER,
03         LAG(TEST_NUMBER,1) OVER (ORDER BY TEST_NUMBER) AS "Previous"
04    FROM TESTTABLE ;

Line 1: I am inserting a new row into the table where the TEST_NUMBER is "5".

Lines 2 – 4: I want to retrieve the previous row's data.

The results are:

TEST_NUMBER   Previous
-----------   --------
          1   <NULL>
          2          1
          3          2
          4          3
          5          4
          5          5
          6          5
          7          6
          8          7
          9          8

Notice that there are two rows where TEST_NUMBER is "5". The first has a previous row value of "4", and the second has "5" as it is the second row with the value of "5".

At the start I said I was using this to determine run times from a joblog. What I want to do is to extract the timestamp from each of the records like the one shown below, and compare that to the timestamp of the previous equivalent record in the spool file:

CPF1124    Information             00   06/07/26  01:30:02.436409  QWTPIIPP     Q...

The date is shown in *MDY format in the partition I am using, it may be displayed in a different format on the partition you use.

In these examples I am extracting the data directly from a small joblog spool file I generated for this post, using the SPOOLED_FILE_DATA table function. It returns two columns of data:

  1. ORDINAL_POSITION:  The spool file line number
  2. SPOOLED_DATA:  The spool file record

I can extract and convert the line with the "timestamp" in it to a true timestamp with the following SQL statement:

01  SELECT ORDINAL_POSITION AS "Line",
02         SUBSTR(SPOOLED_DATA,41,25) AS "Substr",
03         TIMESTAMP_FORMAT(SUBSTR(SPOOLED_DATA,41,25),'MM/DD/YY  HH24:MI:SS.NNNNNN')
04            AS "Timestamp"
05    FROM TABLE(SYSTOOLS.SPOOLED_FILE_DATA(
06    JOB_NAME => '539939/QUSER/TESTCL',
07    SPOOLED_FILE_NAME => 'QPJOBLOG',
08    SPOOLED_FILE_NUMBER => '*LAST'))
09    WHERE SUBSTR(SPOOLED_DATA,53,1) = ':' AND SUBSTR(SPOOLED_DATA,56,1) = ':'
10    ORDER BY ORDINAL_POSITION

Line 1: I want the spool file line number.

Line 2: I am substring the character date and time from the spool file record.

Line 3 and 4: Converting the substring-ed data to a true timestamp. If I was using a different partition that used a different date format I would need to change "MM/DD/YY" to match the partitions date format.

Lines 5 – 8: Definition and the parameters for SPOOLED_FILE_DATA.

Line 9: This is what makes the records I am looking for unique. If they have a colon ( : ) in positions 53 and 56 these are the records I want.

Line 10: Order by the spool file line number.

Which gives me the true timestamp value:

Line   Substr                      Timestamp
----   -------------------------   --------------------------
   5   06/07/26  01:30:02.436409   2026-06-07 01:30:02.436409
   8   06/07/26  01:30:02.437185   2026-06-07 01:30:02.437185
  20   06/07/26  01:30:02.437439   2026-06-07 01:30:02.437439
  22   06/07/26  01:30:02.437831   2026-06-07 01:30:02.437831
  27   06/07/26  01:30:02.438040   2026-06-07 01:30:02.438040
  32   06/07/26  01:30:12.468285   2026-06-07 01:30:12.468285
  38   06/07/26  01:30:12.492354   2026-06-07 01:30:12.492354
  43   06/07/26  01:30:22.522558   2026-06-07 01:30:22.522558
  49   06/07/26  01:30:22.524595   2026-06-07 01:30:22.524595

Having shown how I made the values I needed I can now show my final statement:

01  SELECT ORDINAL_POSITION AS "Line",
02         TIMESTAMP_FORMAT(SUBSTR(SPOOLED_DATA,41,25),'MM/DD/YY  HH24:MI:SS.NNNNNN')
03            AS "This row",
04         LAG(TIMESTAMP_FORMAT(SUBSTR(SPOOLED_DATA,41,25),'MM/DD/YY  HH24:MI:SS.NNNNNN'),1) 
05            OVER (ORDER BY ORDINAL_POSITION) AS "Previous row",
06         TO_CHAR(TIMESTAMP_FORMAT(SUBSTR(SPOOLED_DATA,41,25),'MM/DD/YY  HH24:MI:SS.NNNNNN') - 
07         LAG(TIMESTAMP_FORMAT(SUBSTR(SPOOLED_DATA,41,25),'MM/DD/YY  HH24:MI:SS.NNNNNN'),1) 
08             OVER (ORDER BY ORDINAL_POSITION),'000G00G00.000000') as "Difference"
09    FROM TABLE(SYSTOOLS.SPOOLED_FILE_DATA(
10      JOB_NAME => '539939/QUSER/TESTCL',
11      SPOOLED_FILE_NAME => 'QPJOBLOG',
12      SPOOLED_FILE_NUMBER => '*LAST'))
13    WHERE SUBSTR(SPOOLED_DATA,53,1) = ':' AND SUBSTR(SPOOLED_DATA,56,1) = ':'
14    ORDER BY ORDINAL_POSITION

Line 1: Line number of the current record.

Lines 2 and 3: Converting to character timestamp of the current record to a "true" timestamp.

Lines 4 and 5: Using LAG to get the timestamp from the previous record.

Lines 6 – 8: Calculate the difference between the two timestamps, and then convert the result to a character string using TO_CHAR. The "G" will be replaced by a comma ( , ).

Lines 9 – 14: Same as before.

The results are:

Line   This row                     Previous row                 Difference
----   --------------------------   --------------------------   ---------------
   5   2026-06-07 01:30:02.436409   <NULL>                       <NULL>
   8   2026-06-07 01:30:02.437185   2026-06-07 01:30:02.436409   000,00,00.000776
  20   2026-06-07 01:30:02.437439   2026-06-07 01:30:02.437185   000,00,00.000254
  22   2026-06-07 01:30:02.437831   2026-06-07 01:30:02.437439   000,00,00.000392
  27   2026-06-07 01:30:02.438040   2026-06-07 01:30:02.437831   000,00,00.000209
  32   2026-06-07 01:30:12.468285   2026-06-07 01:30:02.438040   000,00,10.030245
  38   2026-06-07 01:30:12.492354   2026-06-07 01:30:12.468285   000,00,00.024069
  43   2026-06-07 01:30:22.522558   2026-06-07 01:30:12.492354   000,00,10.030204
  49   2026-06-07 01:30:22.524595   2026-06-07 01:30:22.522558   000,00,00.002037

The commas in the Difference column are used to separate the parts of the time. The first comma separate hours from minutes, and the second separates minutes from seconds. The decimal point indicates microseconds.

I now have a way to determine which parts of the job are taking the longest, and can analyze those.

 

This article was written for IBM i 7.6, and should work for some earlier releases too.

No comments:

Post a Comment

To prevent "comment spam" all comments are moderated.
Learn about this website's comments policy here.

Some people have reported that they cannot post a comment using certain computers and browsers. If this is you feel free to use the Contact Form to send me the comment and I will post it for you, please include the title of the post so I know which one to post the comment to.