CHAIN versus SETLL the results @ RPGPGM.COM

Sunday, September 1, 2013

Having started the debate on which is faster in my post Validation: CHAIN versus SETLL, I decided to put the theory to the test.

I created a DDS file with a single key:

  A                                      UNIQUE
  A          R TESTFILER
  A            KEY            7P 0
  A            F1             3
  A            F2             5P 0
  A            F3            30
  A            F4             3P 2
  A            F5            50
  A          K KEY

I filled the file with a million records, with the field KEY containing the values 1 to 1 million.

I created two almost identical programs. Each one performs its operation 1 million times, then writes a record to an output file with the start time, end time, and the number of microseconds it took to perform the 1 million operations (in RPG the *MS duration code means microseconds, not milliseconds). Those programs are listed below:

CHAIN program

01 FTESTFILE  IF   E           K DISK
02 FTESTPF    O    E             DISK

03 D KeyFld          S                   like(KEY)
04 D TmeStmp         S               Z
05 D i               S             10U 0
    /free
06     PGM = 'CHAIN' ;
07     START = %timestamp() ;

08     for i = 1 by 1 to 1000000 ;
09        TmeStmp = %timestamp() ;
10        KeyFld = %subdt(TmeStmp:*ms) ;
11        chain KeyFld TESTFILER ;
12     endfor ;

13     FINISH = %timestamp() ;
14     DIFFERENCE = %diff(FINISH:START:*ms) ;
15     write TESTPFR ;
16     *inlr = *on ;

SETLL program

01 FTESTFILE  IF   E           K DISK
02 FTESTPF    O    E             DISK

03 D KeyFld          S                   like(KEY)
04 D TmeStmp         S               Z
05 D i               S             10U 0
    /free
06     PGM = 'SETLL' ;
07     START = %timestamp() ;

08     for i = 1 by 1 to 1000000 ;
09        TmeStmp = %timestamp() ;
10        KeyFld = %subdt(TmeStmp:*ms) ;
11        setll KeyFld TESTFILER ;
12     endfor ;

13     FINISH = %timestamp() ;
14     DIFFERENCE = %diff(FINISH:START:*ms) ;
15     write TESTPFR ;
16     *inlr = *on ;

The only differences between the two programs are:

Line 6: The program name written to the PGM field in the output file.
Line 11: The operation performed, CHAIN or SETLL.

A CL program was written to call each of these programs five times.

The CL was submitted to the QINTER job queue on an IBM i 8406 70Y on a Sunday afternoon. As it was a holiday weekend, I knew that no one else would be running jobs on this server. The results were:

PGM	DIFFERENCE (microseconds)
CHAIN	34,178,000
SETLL	32,557,000
CHAIN	34,057,000
SETLL	32,547,000
CHAIN	34,271,000
SETLL	32,556,000
CHAIN	34,356,000
SETLL	32,785,000
CHAIN	34,467,000
SETLL	32,608,000
Average CHAIN	34,265,800
Average SETLL	32,610,600

This shows that in these programs, where the operation was performed 1 million times, SETLL is 1.66 seconds faster than CHAIN (34,265,800 - 32,610,600 = 1,655,200 microseconds).

For one single operation the difference is negligible. But we all need to make our own decisions about when performance differences like these matter, when to use one operation code rather than another, and what that choice conveys to others, or to ourselves at a later date, when looking at the code.

Monday, September 2: John Blenkinsop made an interesting comment, asking what would happen if some of the SETLL and CHAIN operations were unsuccessful.

I rebuilt the input file with the key field, KEY, containing the values of 1 to 1 million, but this time I incremented by 2, resulting in a file of 500,000 records.

I ran the same programs that I had before and these were the results:

PGM	DIFFERENCE (microseconds)
CHAIN	61,133,000
SETLL	55,052,000
CHAIN	59,066,000
SETLL	47,901,000
CHAIN	58,410,000
SETLL	47,259,000
CHAIN	56,340,000
SETLL	48,568,000
CHAIN	57,335,000
SETLL	48,109,000
Average CHAIN	58,456,800
Average SETLL	49,377,800

The difference is even larger this time: 9.08 seconds in favor of SETLL.

11 comments:

Unknown, September 1, 2013 at 9:29 PM
I don't think it matters much that there is only about 1.7 seconds of difference over a million executions. What matters is when you couple this with 10 to 20 other places in that program where you do the more expensive operation, along with other CPU wasters. Then you are talking about real time, especially if you couple that with another 20 or 30 programs that follow the same philosophy.

To me it all boils down to this: are you willing to do things the most efficient way (without using something obfuscated and hard to maintain), or do you just not care and add a bigger, faster CPU when you need it?

Sure, one operation run even a million times in a one-time program is no problem, but 50 to 100 of them in as many programs, run every night in a nightly cycle, could make the difference as to whether you have time to do your backups or get the system back to the users in a timely manner, and in the 24/7 world that can be important.

Anonymous, September 2, 2013 at 4:38 AM
(John Blenkinsop, AS400 Specialists @ LinkedIn)

In your test program, you created a test file with 1,000,000 records keyed from 1 to 1,000,000.

You CHAIN or SETLL with a key value derived from the timestamp - that is, an integer number of microseconds. Every CHAIN or SETLL will get a hit.

But what about the difference in performance when there is NO matching record?

I created the test data key from 1 to 1,000,000 in an increment of 2, giving 500,000 records in the test file.

I then used amended versions of your programs which set the key from 1 to 1,000,000 in an increment of 1. Therefore half of the CHAINs and SETLLs would fail.

The resulting difference (admittedly in only one run of each program) was 6 seconds:

PGM	START	FINISH	DIFFERENCE (microseconds)
CHAIN	2013-09-02-11.15.40.450000	2013-09-02-11.16.25.139000	44,689,000
SETLL	2013-09-02-11.16.31.679000	2013-09-02-11.17.10.219000	38,540,000

Perhaps some more tests should be run to get an average, since other system operations of course have an effect, but it does point to the greater efficiency of SETLL, for verifying existence, against CHAIN for the same purpose.

Jeshua, September 2, 2013 at 10:39 AM
Well, maybe SETLL is "faster" than CHAIN, but the difference is very, very small: for a single operation it would be around 1.7 * 10^-6 or 0.0000017 seconds. Insignificant!

Jon Paris, September 2, 2013 at 10:45 AM
There is a flaw in the testing methodology.

RPG timestamps are only populated to thousandths of a second, so the test could only ever generate 1 in every 1,000 keys. As a result you will get the exact same value over and over again. Even if it were accurate to the microsecond, the speed of the machine would still render the same values multiple times. Milliseconds are just too long for modern hardware.

To do it properly would require a pseudo-random number generator, e.g. CEERAN0.
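
A minimal sketch of that suggestion, not the code used for the results above: the loop from the test programs with a pseudo-random key instead of the timestamp-derived one. It is written in fully free-form RPG, which needs a more recent release than the original programs did. The CEERAN0 prototype is an assumption based on the ILE CEE API documentation (seed, random number, optional feedback code), and the names Seed and RandNbr are made up for illustration.

```rpgle
**free
// Sketch only: pseudo-random keys instead of timestamp-derived keys.
ctl-opt dftactgrp(*no);          // needed to call a bound API with CRTBNDRPG

dcl-f TESTFILE keyed;            // same input file as the test programs

// Assumed prototype for the CEE random number API.
dcl-pr CEERAN0 extproc('CEERAN0');
  pSeed    int(10);                 // in/out: 0 lets the API pick a seed
  pRandNbr float(8);                // out: value between 0 and 1
  pFc      char(12) options(*omit); // feedback code, omitted here
end-pr;

dcl-s KeyFld  like(KEY);
dcl-s Seed    int(10) inz(0);    // hypothetical name
dcl-s RandNbr float(8);          // hypothetical name
dcl-s i       uns(10);

for i = 1 to 1000000;
  CEERAN0(Seed : RandNbr : *omit);
  KeyFld = %int(RandNbr * 1000000) + 1;   // spread keys over 1 to 1,000,000
  setll KeyFld TESTFILER;                 // operation under test
endfor;

*inlr = *on;
```

Replacing setll with chain on the marked line would give the matching CHAIN version. The call to CEERAN0 adds its own cost to every iteration, so only the difference between the two programs would be meaningful, not the absolute times.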

Regardless - I don't think there was ever any real doubt that in most cases SETLL would outperform CHAIN. (Although if NOUNREF were specified then the difference would probably be smaller.)

The original question concerned validation. An in my opinion a subprocedure (containing whatever method you prefer) is the only approach in a modern programs and the only one where the intent is unambiguous.
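
As an illustration of the subprocedure idea (a sketch under assumptions, not code from the post or from the commenter), the existence check could be wrapped like this. The procedure name KeyExists and its parameter pKey are hypothetical; the file and record format are the ones from the test programs. SETLL with %EQUAL is used inside, but the caller would read the same if the body used CHAIN with %FOUND instead.

```rpgle
**free
ctl-opt dftactgrp(*no);          // subprocedures need this with CRTBNDRPG

dcl-f TESTFILE keyed;

dcl-s KeyFld like(KEY) inz(1);

if KeyExists(KeyFld);
  // the key is on file; carry on with processing
endif;

*inlr = *on;

// Hypothetical helper: returns *on when a record with this key exists.
dcl-proc KeyExists;
  dcl-pi *n ind;
    pKey like(KEY) const;
  end-pi;

  setll pKey TESTFILER;        // positions the file; no record is read
  return %equal(TESTFILE);     // *on only for an exact key match
end-proc;
```

The calling line states the intent directly, and the choice between SETLL and CHAIN becomes a detail that can be changed in one place.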

Paul Wade, September 2, 2013 at 11:59 AM
SETLL is definitely more efficient - the data is loaded into memory. The CHAIN instruction will also lock a record, depending on coding, so be careful that you are testing like for like. If you only want to know whether a record exists, use SETLL; if you actually want the data, use CHAIN, and when using CHAIN be careful if the file is update capable, because it will lock the record unless you tell it not to. Timings will also vary depending on what else the system is doing, so in a simulation with little else contending for cache the timings may suggest a closer difference between SETLL and CHAIN. On a busier system you might well get a much greater difference. These performance tricks, or simply an awareness of performance from a programming perspective, are very important as systems scale.

Louise Schmidt, September 2, 2013 at 8:08 PM
Thanks, This is great!

Anonymous, September 3, 2013 at 6:34 AM
I rarely build physical files with keys in them, having worked with ERP applications for a while; logical files built on PF accessed via CHAIN or SETLL will be another practical scenario where the performance results could show significant time difference in favor of SETLL.

Leslie Paulus, September 5, 2013 at 2:59 PM
I coach my programmers to code based upon the desired and planned result. So every line of code should be understood and should execute what is needed, not more, not less. That means that SETLL is the code unless you are purposely retrieving data. Using CHAIN because it doesn't hurt in a particular circumstance sets a bad precedent.

Victor Pomortseff, March 6, 2024 at 11:06 PM
Strictly speaking, the test is not entirely correct. The fact is that SetLL does not read the record, it only sets a pointer to the record with the required key value. While Chain finds a record by key and reads its contents.
This is noticeable (as mentioned above) in cases where you need to check the existence of a record with a given key value. Especially where the data is stored in PF, and the key is stored separately in the LF file. Using SetLL + %Equal will be faster than Chain due to the fact that in the first case there will be no access to PF, only to LF.
If you compare, then you need to compare Chain vs SetLL + Read - only in this case the final result (read contents of the record) will be identical.
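
The like-for-like comparison described in this comment might look like the sketch below. It is free-form RPG against the test file from the post, with an arbitrary key value of 1 assumed for illustration. Both options end with the record's fields loaded, which is what makes them comparable.

```rpgle
**free
dcl-f TESTFILE keyed;

dcl-s KeyFld like(KEY) inz(1);

// Option 1: CHAIN finds the key and reads the record in one operation.
chain KeyFld TESTFILER;
if %found(TESTFILE);
  // fields F1 to F5 now hold the record's data
endif;

// Option 2: SETLL only positions the file; READ then fetches the record.
setll KeyFld TESTFILER;
if %equal(TESTFILE);
  read TESTFILER;
  // fields F1 to F5 now hold the record's data
endif;

*inlr = *on;
```

If only existence matters, Option 2 can stop after the %equal test, which is the case the timings in the post measure.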