Wednesday, March 13, 2019

Using SQL to check if a batch job is active

sql to find if job is active in subsystem

A colleague asked me if there was an easy way, using SQL, to tell if a job was active in a subsystem. This subsystem, I am going to call it TESTSBS, should contain two jobs that remain active all day, copying and sending data between various IBM i partitions. During the day-end process the jobs are ended, and then resubmitted after the backups.

Recently there have been problems with these jobs. If one of these jobs errored the system operators would answer the message with "C", which would end the job and the data would not be transferred until someone else noticed. Or the jobs would not end during the day-end process, therefore, after the backups completed two new versions of the job would be submitted resulting in four jobs, two of each.

My colleague had looked into ways of determining which jobs were active in the subsystem. During the day if one of the jobs was missing it could be resubmitted. After the backup finished only if the job was not active would it be submitted. All of the methods she had come up with to determine if the jobs were active were, in her opinion, too complicated. Which is why she came to me.

In this example only one of the two jobs is active, the second job, SECONDJOB, is not active.

                           Work with Subsystem Jobs

Subsystem  . . . . . . . . . . :   TESTSBS

Type options, press Enter.
  2=Change   3=Hold   4=End   5=Work with   6=Release   7=Display
  8=Work with spooled files   13=Disconnect

Opt  Job         User        Type     -----Status-----  Function
     FIRSTJOB   SOMEUSER     BATCH    ACTIVE            DLY-60

I have written about the ACTIVE_JOB_INFO table function a couple of times before, including showing how it can be used to monitor for errors.

This table function can have various parameters passed to it to reduce the amount of information it searches through to find the desired results, one of those is the subsystem name. By using the SUBSYSTEM_LIST_FILTER parameter I can limit the search to just one subsystem. For example if I want to return a list of all the jobs active in TESTSBS I can just use the following Select statement.

SELECT JOB_NAME
FROM TABLE(QSYS2.ACTIVE_JOB_INFO(SUBSYSTEM_LIST_FILTER 
=> 'TESTSBS')) A

The results show me that I am missing that second job.

JOB_NAME
236408/QSYS/TESTSBS
295285/SOMEUSER/FIRSTJOB

I could create a program with the subsystem and job names hard coded, but in my opinion that would be a waste. It would be better if I create a program where I would pass it:

  1. Subsystem name
  2. Job name

And it would return:

  1. Job active?

My program, SBSTEST, is RPG with the SQL statement embedded within. I think it is short enough to be shown all at once.

01  **free
02  ctl-opt Main(Main) option(*srcstmt) dftactgrp(*no) ;

03  dcl-pr Main extpgm('SBSTEST') ;
04    *n char(10) ;  //Subsystem name
05    *n char(10) ;  //Job name
06    *n char(1) ;   //Active=Y, Inactive=N
07  end-pr ;

08  dcl-proc Main ;
09    dcl-pi *n ;
10      SbsName char(10) ;
11      SbsJob char(10) ;
12      JobActive char(1) ;
13    end-pi ;

14    JobActive = 'N' ;

15    exec sql SELECT 'Y' INTO :JobActive
16       FROM TABLE(QSYS2.ACTIVE_JOB_INFO(SUBSYSTEM_LIST_FILTER 
                                           => :SbsName)) A
17       WHERE JOB_NAME LIKE CONCAT('%',RTRIM(:SbsJob))
18       FETCH FIRST ROW ONLY ;
19  end-proc ;

Line 1: No surprise that this is written in totally free RPG.

Line 2: My control options state that there is a Main procedure, I only need the source statement option as there is no file updated in this program, and as the program contains a (Main) subprocedure it cannot run in the default activation group.

Lines 3 – 7: As parameters are passed to this program I need a procedure prototype for the Main procedures. As I described above it has three parameters. I never bother to give the parameters names in the prototype, therefore, I need to use *N (null) instead.

Line 8: Start of the Main procedure.

Lines 9 – 13: Here is the procedure interface. I don't have to give it a name, so I use *N instead. Here is where I name the parameters.

Line 14: Set the default value of "N" to JobActive, which is the value the calling program will use to know if the job is active in the subsystem.

Lines 15 - 18: This is the SQL statement that will let me know if a job is active.

Line 15: "Y" is moved to the parameter JobActive if the statement returns a result. I am sure you all know that all the RPG parameters, variables, etc. have to start with a colon ( : ) when used in a SQL statement.

Line 16: The SUBSYSTEM_LIST_FILTER parameter is used with the ACTIVE_JOB_INFO table function for just the subsystem passed in the parameter SbsName.

Line 17: I only need to job name part of the Job Name column, the job number and job user are irrelevant. The job name is the last of the three parts of the Job Name column, therefore, I need to use a WHERE with a LIKE to do a "wild card" type search. The percentage character ( % ) is the wild card character. Here I want to return all results where the Job Name column ends with the job name parameter. To make the value of that will be used I need to concatenate the percentage character to the start of the job name, which must be right trimmed to remove all trailing blanks. If it is not right trimmed the WHERE will look for the job name followed by the number of blank characters afterwards. Using the right trim allows WHERE to return any row where the Job Name ends with the passed job name.

Line 18: If there are multiple jobs running in the subsystem with the same job name I can only return the first one, otherwise I will get an SQL error.

Line 19: The Main procedure ends with no *INLR or RETURN required.

There are other ways I could have coded the concatenation, like:

17      WHERE JOB_NAME LIKE '%' CONCAT TRIM(:SbsJob) ;

Or I could have used the pipe symbols ( || ).

17      WHERE JOB_NAME LIKE '%' || TRIM(:SbsJob) ;

What about a program to call this? I wrote this simple program CL program to show how SBSTEST can be called, and what to do with the returned result.

01  PGM

02  DCL VAR(&ACTIVE) TYPE(*CHAR) LEN(1)

03  CALL PGM(SBSTEST) PARM('TESTSBS' 'FIRSTJOB' &ACTIVE)

04  IF COND(&ACTIVE = 'N') THEN(DO)
      /* Submit first job */
05  ENDDO

06  CALL PGM(SBSTEST) PARM('TESTSBS' 'SECONDJOB' &ACTIVE)

07  IF COND(&ACTIVE = 'N') THEN(DO)
      /* Submit second job */
08  ENDDO

09  ENDPGM

Line 2: The variable &ACTIVE will contain the returned value from the called program.

Line 3: The first time the program is called I am passing the subsystem name, TESTSBS, and the job name, FIRSTJOB as strings. The result is returned in the variable &ACTIVE.

Line 4: Here I know that FIRSTJOB is active, and &ACTIVE contains "Y". Therefore, the job is not submitted again.

Line 6: SBSTEST is called again with the same subsystem, but with the job name of SECONDJOB.

Line 7: We know that &ACTIVE is "N" as there is no job called SECONDJOB in this subsystem. Now the second job can be submitted.

So there you have it, a simple way using SQL to know whether any job in any particular system is active. My colleague was impressed with how few lines of code this took, and copied it to her environment to use.

 

You can learn more about the ACTIVE_JOB_INFO table function from the IBM website here.

 

This article was written for IBM i 7.3 and 7.2.

8 comments:

  1. I know this sounds funny but ...
    I looked at library QSYS2 on our production partition. The code works beautifully for a subsystem I look at daily to validate all 11 jobs are running in it.
    How would I locate the table function in QSYS2. It isn’t listed in the library.

    ReplyDelete
    Replies
    1. ACTIVE_JOB_INFO is the "long" name of the object. The system "short" name will be something different.

      My opinion is who cares what the "short" name is as you can only use this in a SQL interface.

      Delete
  2. in case you have more than one job that end in "%JOB" if you change the where clause, it will find the exact match:

    where substr(job_name, locate('/', job_name, locate('/', job_name) + 1) + 1) = :sbsjob

    ReplyDelete
  3. Maybe I didn't explain my question quite right ... if I open a green screen and run dsplib qsys2, I can't see a table function which is where I looked. I then opened ACS, Databases, Schemas, open the library (QSYS2), opened Functions and then the list of functions is listed.

    ReplyDelete
  4. Great article Thank you

    ReplyDelete
  5. I have used one of these tables before with a filter (I think it was job_info with a user filter) and I found it could take a surprisingly long time to run, up to about 60 seconds on our production system. I had to abandon using sql and use a different technique which was much faster. Was that just due to the large number of jobs on the system? Any performance issues with this table?

    ReplyDelete
  6. Simon, thanks for sharing. It get better and better every release . I used user space to do the same , my code design was about the same less the embedded sql statement.. great read and code examples. Another teaching moment.

    ReplyDelete

To prevent "comment spam" all comments are moderated.
Learn about this website's comments policy here.

Some people have reported that they cannot post a comment using certain computers and browsers. If this is you feel free to use the Contact Form to send me the comment and I will post it for you, please include the title of the post so I know which one to post the comment to.