Speeding up CPYF @ RPGPGM.COM

Wednesday, August 19, 2015

Speeding up CPYF

cpyf fromrcd *start and control blocking *buf256

Every developer type person who has worked on the IBM i, or its predecessors, has had to use the Copy File command, CPYF, to copy data from one file to another. Most have rarely used more that the first screen's parameters, thereby missing a way to make the command execute faster. For small files the difference is negligible, but in a file of several 100,000 records or more the method I describe below will make a noticeable difference.

I have to thank Stu Haddock for reminding me of this. It is all down to the value of the FROMRCD parameter of the CPYF command.

The default for the FROMRCD parameter is *START, which means that you want to start copying the data from the start of the file. You can also enter from which record you wish to copy. If I enter "1", to indicate that I want to start copying from the first records. I would expect that to be the same as the FROMRCD(*START). But it is not. If I am copying a keyed file:

FROMRCD(*START) - The records are copied in keyed order.
FROMRCD(1) - The records are copied in arrival sequence, in other words the order the records are in the file, and not in keyed order.

But does this really make a difference?

I created a file called BIGFILE that contained eight fields. The first field, FLD1, is a packed field and is the file's only key field. I was not going to do much of a test if the records were number sequentially, so I wrote a program to write 1 million records to the file and FLD1 contains a random number generated by the CEERAN0 API. If you are not familiar with this API or with generating random numbers you should read the post Generating random numbers.

I created three programs:

The first used CPYF with FROMRCD(*START):

  CPYF FROMFILE(MYLIB/BIGFILE) +
         TOFILE(QTEMP/@BIGFILE) MBROPT(*ADD)

The second used CPYF with FROMRCD(1):

  CPYF FROMFILE(MYLIB/BIGFILE) +
         TOFILE(QTEMP/@BIGFILE) MBROPT(*ADD) +
         FROMRCD(1)

And the third program used CPYF with FROMRCD(1) and control blocking. To learn about how to use control blocking see Using control blocking to improve database performance.

  OVRDBF FILE(BIGFILE) TOFILE(MYLIB/BIGFILE) +
           OVRSCOPE(*JOB) SEQONLY(*YES *BUF256KB)

  OVRDBF FILE(@BIGFILE) TOFILE(QTEMP/@BIGFILE) +
           OVRSCOPE(*JOB) SEQONLY(*YES *BUF256KB)

  CPYF FROMFILE(MYLIB/BIGFILE) +
         TOFILE(QTEMP/@BIGFILE) MBROPT(*ADD) +
         FROMRCD(1)

I ran this on an IBM i where I was the only user signed on. Each program was submitted ten times to a "single threaded" job queue in the order of program 1, 2, 3, 1, 2, 3, etc. The time taken for each CPYF was calculated in seconds, and the average for each type of copy file.

Type of copy	Average seconds
`FROMRCD(*START)`	36.0
`FROMRCD(1)`	15.2
`FROMRCD(1)` with control blocking	13.6

The FROMRCD(*1) took less than half the time it took the FROMRCD(*START) to complete. Using control blocking made the CPYF even faster completing in 38% of the time it takes using the FROMRCD(*START) to finish.

Next time you have to copy a large file you should use control blocking and FROMRCD(1) rather than just the CPYF command's defaults.

Do note that the FROMRCD parameter can only be used with Physical file and SQL tables, not with Logical files.

What happens if you want to use selection criteria? See here.

You can learn more about the CPYF command on the IBM web site here.

This article was written for IBM i 7.2, and should work for earlier releases too.

17 comments:

steve pearsonAugust 19, 2015 at 7:21 PM
that's actually pretty useful. i do a lot of crazy things with CPYF but have completely ignored that parm before.
ReplyDelete
Replies
BrunoAugust 19, 2015 at 11:22 PM
Hi,
I have written several file copy programs using SQL. The blocking factor is more important and very significant time savings. When the target files include index, it is very interesting to hold rebuild while the copywith CHGLF command.
ReplyDelete
Replies
BhabaniAugust 20, 2015 at 7:06 AM
Sir, This info is really helpful as i frequently use this command.

Thanks a lot keep posting new things which are hard to find in internet.
ReplyDelete
Replies
AuzaiderAugust 20, 2015 at 2:03 PM
Thanks a lot Simon!!!
ReplyDelete
Replies
Hashan AriyathilakeAugust 22, 2015 at 12:02 AM
That's really useful..Thanks a lot...
ReplyDelete
Replies
ken holwayAugust 23, 2015 at 4:14 AM
but is CPYF better than using sql?
ReplyDelete
Replies
AnonymousAugust 25, 2015 at 2:45 PM
Perhaps in a world of constraints and triggers, we should follow the advice of IBMs experts and "Throw CPYF and CRTDUPOBJ away" in favor of SQL inserts.
--Keith Hodges
ReplyDelete
Replies
AnonymousAugust 27, 2015 at 7:12 PM
Great post Simon. We have extremely large files in our organisation which we copy sometimes and the FROMRCD and blocking will play a big part in my command from now on.
ReplyDelete
Replies
AnonymousFebruary 9, 2016 at 9:59 AM
Kind of a related question: If I have a partial file on one system, can I copy using FROMRCD and TORCD with these options to put the records into the same relative record numbers on the "target"? In my case, I have a large number of deleted rows on the file copy, the target. If I copy from record number 1,000,000 on the original, the source, will those get copied into the corresponding relative record numbers on the file copy, the target?
ReplyDelete
Replies
AnonymousJune 6, 2017 at 7:24 AM
And also could divide and conquer. Retrieve the number of rows in the table and divide it by the number of batch jobs to do CPYF. Say 200,000,000 rows, 20 jobs, that's 10,000,000 rows per job. Submit each batch job to QUSRNOMAX to do a CPYF with specific FROMRCD/TORCD row numbers (1 to 10,000,000, 10,000,001 to 20,000,000, etc). This can be done in CL and pass in number of jobs and do SBMJOBs in a loop. When the TO table has no more locks, it's done (or use API QUSRJOBI 'JOBI0400' to check each batch job).

Ringer
ReplyDelete
Replies
OlivierJanuary 8, 2019 at 9:28 AM
Hi,

Is the creation of a SQL index can speed a cpyf using INCREL option?

Regards,

Olivier Drubigny.
ReplyDelete
Replies
Todd McMillianMay 6, 2021 at 5:26 AM
Thanks for sharing
ReplyDelete
Replies

Add comment

To prevent "comment spam" all comments are moderated.
Learn about this website's comments policy here.

Some people have reported that they cannot post a comment using certain computers and browsers. If this is you feel free to use the Contact Form to send me the comment and I will post it for you, please include the title of the post so I know which one to post the comment to.

RPGPGM.COM - From AS400 to IBM i

Pages

Wednesday, August 19, 2015

Speeding up CPYF

17 comments: