Wednesday, November 8, 2017

Inserting more than one row at a time

sql insert to add more than 1 record at a time

Most of us who have been programming in RPG for many years are used to doing things at a record level: we read, write, update, and delete one record at a time. Moving to SQL allows us to work with a set of data, more than one row/record, at a time. We can fetch multiple rows from a table/file, delete more than one row with a single statement, and add multiple rows/records to a table/file at once. You can still do row/record level access using SQL, but until you start thinking in sets you will not experience the I/O speed that multiple row/record statements can give.

In this post I am going to give two examples of performing multiple row inserts.

  1. Inserting more than one row/record at a time with just one statement with hard coded values
  2. Inserting just a few rows/records from one table to another, as you would do to create testing data

 

1. Inserting multiple hard coded rows

For the posts I wrote about regular expressions I used a table called PERSON.

CREATE TABLE MYLIB.PERSON (
  PERSON_ID FOR "PID" INTEGER
            GENERATED ALWAYS AS IDENTITY
            (START WITH 1, INCREMENT BY 1, NO CYCLE),
  FIRST_NAME FOR "FNAME" VARCHAR(20),
  LAST_NAME FOR "LNAME" VARCHAR(30)
)

The table contains three columns. The first is an identity column, so Db2 for i generates its value when a row is inserted into the table. As the column is GENERATED ALWAYS I must not supply a value for it, so I use DEFAULT in its place. If I were inserting just one row I would use:

INSERT INTO PERSON VALUES(DEFAULT,'ROBERT','ALLEN')
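
As a variation on the statement above (not from the original post), I could instead name only the columns I am giving values for and omit the identity column entirely; Db2 for i generates the PERSON_ID value just as it does when DEFAULT is used:

INSERT INTO PERSON (FIRST_NAME,LAST_NAME)
  VALUES('ROBERT','ALLEN')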

I had 25 rows to add. I could have used 25 separate Insert statements, but why do that when I can add all 25 rows with a single Insert statement?

INSERT INTO PERSON VALUES(DEFAULT,'ROBERT','ALLEN'),
                         (DEFAULT,'John','Austin'),
                         (DEFAULT,'don','bennett'),
                         (DEFAULT,'DIETMAR','BRUCK'),
                         (DEFAULT,'Brian','Caine'),
                         (DEFAULT,'george','curtis'),
                         (DEFAULT,'ALAN','DALEY'),
                         (DEFAULT,'Ron','Framer'),
                         (DEFAULT,'lol','harvey'),
                         (DEFAULT,'RON','HEWITT'),
                         (DEFAULT,'Brian','Hill'),
                         (DEFAULT,'peter','hill'),
                         (DEFAULT,'COLIN','HOLDER'),
                         (DEFAULT,'Stuart','Imlach'),
                         (DEFAULT,'eric','jones'),
                         (DEFAULT,'MICK','KEARNS'),
                         (DEFAULT,'Frank','Kletzenbauer'),
                         (DEFAULT,'arthur','lightening'),
                         (DEFAULT,'BILLY','MYERSCOUGH'),
                         (DEFAULT,'Brian','Nicholas'),
                         (DEFAULT,'reg','ryan'),
                         (DEFAULT,'KEN','SATCHWELL'),
                         (DEFAULT,'Nelson','Stiffle'),
                         (DEFAULT,'ray','straw'),
                         (DEFAULT,'BOB','WESSON')

The column values for each row are enclosed in parentheses ( ( ) ), and the rows are separated by commas ( , ). I formatted the statement above in a way I thought would clearly show each row, making it easier for another programmer to understand what the statement does. I could just as easily have had each row's columns follow the previous row's:

INSERT INTO PERSON VALUES(DEFAULT,'ROBERT','ALLEN'),(DEFAULT,'John',
'Austin'),(DEFAULT,'don','bennett'),(DEFAULT,'DIETMAR','BRUCK'),
(DEFAULT,'Brian','Caine'),(DEFAULT,'george','curtis'),(DEFAULT,
'ALAN','DALEY'),(DEFAULT,'Ron','Framer'),(DEFAULT,'lol','harvey'),
(DEFAULT,'RON','HEWITT'),(DEFAULT,'Brian','Hill'),(DEFAULT,'peter',
'hill'),(DEFAULT,'COLIN','HOLDER'),(DEFAULT,'Stuart','Imlach'),
(DEFAULT,'eric','jones'),(DEFAULT,'MICK','KEARNS'),(DEFAULT,'Frank',
'Kletzenbauer'),(DEFAULT,'arthur','lightening'),(DEFAULT,'BILLY',
'MYERSCOUGH'),(DEFAULT,'Brian','Nicholas'),(DEFAULT,'reg','ryan'),
(DEFAULT,'KEN','SATCHWELL'),(DEFAULT,'Nelson','Stiffle'),(DEFAULT,
'ray','straw'),(DEFAULT,'BOB','WESSON')

Personally I prefer the first example: with each row on its own line it is easier to see what the statement does.

As this is a single Insert statement all 25 rows are added to the table in one output operation, which is a lot faster than 25 individual ones.

 

2. Inserting just a few records from one table to another

We all create test data to test our programs with. Most of the time I do not want to use all the records/rows from a file/table, as there is just too much data: the program can take a long time to run, and I am then faced with a daunting amount of data to check. Usually I just copy a few records/rows from the production file into a test file, and use that.

How do I extract the data for my test file?

I could use the Copy File command, CPYF, to copy a number of records from my production file to my test file. I only want records where the field COLUMN1 is equal to 'A', 'B', or 'C', and only five of each.

CPYF FROMFILE(PRODLIB/PRODFILE)
       TOFILE(TESTLIB/TESTFILE)
       MBROPT(*ADD)
       FROMRCD(1)
       TORCD(15)
       INCREL((*IF COLUMN1 *EQ 'A')
              (*OR COLUMN1 *EQ 'B')
              (*OR COLUMN1 *EQ 'C'))

Using FROMRCD(1) makes the copy faster than if I had used FROMRCD(*START).

The problem with this command is that it only examines the first 15 records of the file for ones meeting my selection criteria. If none of the first 15 records qualify nothing is copied; or I might get, say, ten 'A' records and five 'B' records rather than five of each.

I could instead use SQL Insert statements to select the eligible rows and insert them into my test file. The code snippet below is from an RPG program; the same could be done in a CL program using the RUNSQL command, or the statements could be placed in a source member and executed with the Run SQL Statements command, RUNSQLSTM.

exec sql DELETE FROM TESTLIB.TESTFILE ;
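
In a CL program the same delete could be run with the RUNSQL command. A sketch, assuming the file is not under commitment control:

RUNSQL     SQL('DELETE FROM TESTLIB.TESTFILE') +
             COMMIT(*NONE) NAMING(*SQL)

NAMING(*SQL) is needed here as the statement qualifies the table name with a period rather than a slash.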

exec sql INSERT INTO TESTLIB.TESTFILE
          SELECT * FROM PRODLIB.PRODFILE
           WHERE COLUMN_1 = 'A'
           FETCH FIRST 2 ROWS ONLY ;

exec sql INSERT INTO TESTLIB.TESTFILE
          SELECT * FROM PRODLIB.PRODFILE
           WHERE COLUMN_1 = 'B'
           FETCH FIRST 2 ROWS ONLY ;

exec sql INSERT INTO TESTLIB.TESTFILE
          SELECT * FROM PRODLIB.PRODFILE
           WHERE COLUMN_1 = 'C'
           FETCH FIRST 2 ROWS ONLY ;

The interesting part of these Insert statements is the FETCH FIRST 2 ROWS ONLY clause: once the second eligible row is found the statement ends. Note that as there is no ORDER BY, which two eligible rows are returned is not guaranteed.

   Col 1
     A
     A
     B
     B
     C
     C
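
If I need the copied rows to be the same every time the statements are run, an ORDER BY can be added ahead of the FETCH FIRST clause. A sketch, where UNIQUE_KEY is a hypothetical key column in PRODFILE:

INSERT INTO TESTLIB.TESTFILE
  SELECT * FROM PRODLIB.PRODFILE
   WHERE COLUMN_1 = 'A'
   ORDER BY UNIQUE_KEY
   FETCH FIRST 2 ROWS ONLY ;

Without an ORDER BY the database is free to return any two eligible rows.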

The problem with this approach is that I need to execute three statements, and PRODFILE is searched three times for the eligible records. It would be more efficient, and faster, if I could do it all in one statement.

DELETE FROM TESTLIB.TESTFILE ;

INSERT INTO TESTLIB.TESTFILE
  (SELECT * FROM PRODLIB.PRODFILE
    WHERE COLUMN_1 = 'A'
    FETCH FIRST 2 ROWS ONLY)
UNION
  (SELECT * FROM PRODLIB.PRODFILE
    WHERE COLUMN_1 = 'B'
    FETCH FIRST 2 ROWS ONLY)
UNION
  (SELECT * FROM PRODLIB.PRODFILE
    WHERE COLUMN_1 = 'C'
    FETCH FIRST 2 ROWS ONLY) ;

This code is from a source member I created; I then executed the statements within it using the RUNSQLSTM command.

The UNION combines the three selects into one statement, so when this Insert statement runs all of the selected rows are inserted in a single operation. I can see that when I look in TESTFILE.

   Col 1
     A
     A
     C
     B
     C
     B

The interleaved order of the inserted rows shows that the Insert ran as one statement rather than three separate ones, which is clearly more efficient.
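
One point worth knowing about UNION: it removes duplicate rows across the combined selects. If the production file can contain genuine duplicate rows, UNION ALL keeps every fetched row and spares the database the duplicate-elimination step:

INSERT INTO TESTLIB.TESTFILE
  (SELECT * FROM PRODLIB.PRODFILE
    WHERE COLUMN_1 = 'A'
    FETCH FIRST 2 ROWS ONLY)
UNION ALL
  (SELECT * FROM PRODLIB.PRODFILE
    WHERE COLUMN_1 = 'B'
    FETCH FIRST 2 ROWS ONLY)
UNION ALL
  (SELECT * FROM PRODLIB.PRODFILE
    WHERE COLUMN_1 = 'C'
    FETCH FIRST 2 ROWS ONLY) ;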

 

I hope that the above examples help you see the benefits of looking at data in sets, rather than at a record level. If you can change your mindset you will quickly find how much easier it is to use this multiple row/record functionality, and how much faster your programs run.

 

You can learn more about the SQL Insert statement on the IBM website here.

 

This article was written for IBM i 7.3, and should work for earlier releases too.

3 comments:

  1. I think it's better to include the column names so pre-existing inserts won't fail in the future when a new column is added. Column names not referenced get assigned their default value.

    INSERT INTO PERSON
    (PERSON_ID, FIRST_NAME, LAST_NAME)
    VALUES(DEFAULT,'ROBERT','ALLEN')

    Ringer

  2. Coincidentally, we were just using this technique to load data into a new database, from an older one. The older database stored date values as the number of days since 01-01-1901, which was a bit odd. Our SELECT statement performed the conversion as follows:

    date('1901-01-01') + rntjdt days as todt, ...

    The data-conversion operations were actually quite involved, but we managed that by building a series of cascading SQL views (views that are based on preceding ones), until we had a view that resembled the tables that we were loading data into.

    1. I have the same as you, built Views using other Views to get the data in the manner I desire. It is so cool that Db2 for i is smart enough to combine all the logic of the Views and Views of Views into one "statement" when those Views of Views are used.

      See Build Views and Views of Views

