Every developer type person who has worked on the IBM i, or its predecessors, has had to use the Copy File command, CPYF, to copy data from one file to another. Most have rarely used more that the first screen's parameters, thereby missing a way to make the command execute faster. For small files the difference is negligible, but in a file of several 100,000 records or more the method I describe below will make a noticeable difference.
I have to thank Stu Haddock for reminding me of this. It is all down to the value of the FROMRCD parameter of the CPYF command.
The default for the FROMRCD parameter is *START, which means that you want to start copying the data from the start of the file. You can also enter from which record you wish to copy. If I enter "1", to indicate that I want to start copying from the first records. I would expect that to be the same as the FROMRCD(*START). But it is not. If I am copying a keyed file:
- FROMRCD(*START) - The records are copied in keyed order.
- FROMRCD(1) - The records are copied in arrival sequence, in other words the order the records are in the file, and not in keyed order.
But does this really make a difference?
I created a file called BIGFILE that contained eight fields. The first field, FLD1, is a packed field and is the file's only key field. I was not going to do much of a test if the records were number sequentially, so I wrote a program to write 1 million records to the file and FLD1 contains a random number generated by the CEERAN0 API. If you are not familiar with this API or with generating random numbers you should read the post Generating random numbers.
I created three programs:
The first used CPYF with FROMRCD(*START):
CPYF FROMFILE(MYLIB/BIGFILE) + TOFILE(QTEMP/@BIGFILE) MBROPT(*ADD)
The second used CPYF with FROMRCD(1):
CPYF FROMFILE(MYLIB/BIGFILE) + TOFILE(QTEMP/@BIGFILE) MBROPT(*ADD) + FROMRCD(1)
And the third program used CPYF with FROMRCD(1) and control blocking. To learn about how to use control blocking see Using control blocking to improve database performance.
OVRDBF FILE(BIGFILE) TOFILE(MYLIB/BIGFILE) + OVRSCOPE(*JOB) SEQONLY(*YES *BUF256KB) OVRDBF FILE(@BIGFILE) TOFILE(QTEMP/@BIGFILE) + OVRSCOPE(*JOB) SEQONLY(*YES *BUF256KB) CPYF FROMFILE(MYLIB/BIGFILE) + TOFILE(QTEMP/@BIGFILE) MBROPT(*ADD) + FROMRCD(1)
I ran this on an IBM i where I was the only user signed on. Each program was submitted ten times to a "single threaded" job queue in the order of program 1, 2, 3, 1, 2, 3, etc. The time taken for each CPYF was calculated in seconds, and the average for each type of copy file.
|Type of copy||Average
|FROMRCD(1) with control blocking||13.6|
The FROMRCD(*1) took less than half the time it took the FROMRCD(*START) to complete. Using control blocking made the CPYF even faster completing in 38% of the time it takes using the FROMRCD(*START) to finish.
Next time you have to copy a large file you should use control blocking and FROMRCD(1) rather than just the CPYF command's defaults.
Do note that the FROMRCD parameter can only be used with Physical file and SQL tables, not with Logical files.
What happens if you want to use selection criteria? See here.
You can learn more about the CPYF command on the IBM web site here.
This article was written for IBM i 7.2, and should work for earlier releases too.