Wednesday, March 8, 2023

Copying data with an Identity column from one table to another and not change the value

It has always been a problem with copying data from one SQL DDL table to another if the table contains an identity column. Identity columns are controlled by the database itself. When a new row is inserted into the table the database assigns the Identity column in the new row the next sequence number. They are a great way to ensure that the table has a unique key, and they can be shared with other files as an external key to link back to the original table.

The issue has always been that if I copy data from one table to another with an Identity column of the same name the database will assign the inserted rows new identity values using the sequence of the copied to table. This has been a great frustration to us all, and has led to many forsaking Identity columns altogether.

In the latest round of Technology Refreshes to IBM i 7.5 TR1 and IBM i TR7 a new global variable was introduced to allow me to copy from one file to another and the Identity column's values not be changed.

Global variables have been around for several releases. They store their data externally, outside of any program that uses them. Their beauty is that the value they contain is only for that session, all other sessions will either see the global variable's default value, or whatever value that session has moved into the global variable.

The latest TRs brings us a new Global Variable, REPLICATION_OVERRIDE. The variable can only contain two values:

  • N which is the default
  • Y

If the value for the session is 'Y' then I can copy the contents from one table to another without replacing the values in the Identity column.

Before I start showing you this I need to have a Table, TESTTABLE, I will be copying data from.

I created a very simple table with only two columns:

01  CREATE TABLE MYLIB.TESTTABLE (
02   IDENTITY FOR UNIQUE NUMERIC(10,0)
03     GENERATED ALWAYS AS IDENTITY
04     (START WITH 1, INCREMENT BY 1, NO CYCLE),
05   OTHER_COLUMN FOR COLUMN OTHERCOL VARCHAR(20))

Lines 2 – 4: These are the lines that define an Identity column. If you want to learn more about what you need to give for a column to be an Identity column read my first post about Identity columns.

Line 5: I added a second column which is just a regular variable length character, VARCHAR.

I can confirm my identity really is an Identity column by using the SYSCOLUMNS View:

01  SELECT TABLE_SCHEMA,TABLE_NAME,COLUMN_NAME,IS_IDENTITY 
02    FROM QSYS2.SYSCOLUMNS 
03   WHERE TABLE_SCHEMA = 'MYLIB'
04     AND TABLE_NAME = 'TESTTABLE'

Line 1: I only want the following columns returned in my results:

  • TABLE_SCHEMA:  Schema or library the Table is in
  • TABLE_NAME:  Name of the table
  • COLUMN_NAME:  Name of the column
  • IS_IDENTITY:  Is the column an Identity column?

The results from this are:

TABLE_SCHEMA  TABLE_NAME  COLUMN_NAME   IS_INDENTITY
------------  ----------  ------------  ------------
MYLIB         TESTTABLE   IDENTITY      YES
MYLIB         TESTTABLE   OTHER_COLUMN  NO

The results confirm that my Identity column really is one.

Now I can insert data into this table:

01  INSERT INTO TESTTABLE (OTHER_COLUMN)
02    VALUES('first'),
03          ('second'),
04          ('third'),
05          ('fourth'),
06          ('fifth')

Line 1: As the Identity column is updated by the database I do not include it in the columns to be inserted into. Therefore, I only need to list the non-identity columns, which in this case is OTHER_COLUMN.

Lines 2 – 6: I am inserting multiple rows into the table in one statement.

I can use the following to check the identity value the database gave these inserted rows:

SELECT * FROM TESTTABLE

Which returns:

IDENTITY   OTHER COLUMN
--------   ------------
       1   first
       2   second
       3   third
       4   fourth
       5   fifth

I am going to delete all the rows from the Table:

DELETE TESTTABLE

And then insert them again

01  INSERT INTO TESTTABLE (OTHER_COLUMN)
02    VALUES('first'),
03          ('second'),
04          ('third'),
05          ('fourth'),
06          ('fifth')

When I use the Select statement to view the contents of the Table the Identity value was not reset when I deleted the old rows from the Table.

IDENTITY   OTHER COLUMN
--------   ------------
       6   first
       7   second
       8   third
       9   fourth
      10   fifth

I need a Table to copy these rows into. I created a second Table, TESTTABLE2, that is identical to the first:

01  CREATE TABLE MYLIB.TESTTABLE2 (
02    IDENTITY FOR UNIQUE NUMERIC(10,0)
03     GENERATED ALWAYS AS IDENTITY
04     (START WITH 1, INCREMENT BY 1, NO CYCLE),
05   OTHER_COLUMN FOR COLUMN OTHERCOL VARCHAR(20))

I use the Insert statement to copy the rows from TESTTABLE into TESTTABLE2. Most of the time I would use the following statement to copy all the columns in the Table to the other:

01  INSERT INTO TESTTABLE2
02    SELECT * FROM TESTTABLE

Alas, I get an error message:

SQL State: 428C9 
Vendor Code: -798 
Message: [SQL0798] Value cannot be specified for GENERATED ALWAYS column UNIQUE. 
Cause . . . . . :   A value cannot be specified for column UNIQUE because it is
defined as GENERATED ALWAYS.

As I cannot copy the Identity column I have to use the following Insert statement:

01  INSERT INTO TESTTABLE2 (OTHER_COLUMN)
02    SELECT OTHER_COLUMN FROM TESTTABLE

Notice I have only inserted the values in the OTHER_COLUMN column. The Identity column's values have been given by the database. When I look at the rows in TESTTABLE2:

SELECT * FROM TESTTABLE2

I see:

IDENTITY   OTHER COLUMN
--------   ------------
       1   first
       2   second
       3   third
       4   fourth
       5   fifth

The Identity values have been calculated afresh for this Table. Which is not what I want.

This is where I use the REPLICATION_OVERRIDE Global Variable.

I can retrieve what the Global Variable contains for this session by using:

VALUES QSYS2.REPLICATION_OVERRIDE

Which returns the variable's default value:

00001
------
N

To be able to change the Global Variable I need to be authorized to the QIBM_DB_GENCOL_OVERRIDE Function Usage Identifier. I can check whether I am or not using the SQL_CHECK_FUNCTION_USAGE scalar function:

VALUES QSYS2.SQL_CHECK_FUNCTION_USAGE('QIBM_DB_GENCOL_OVERRIDE')

This will return '1' if I am authorized to it or '0' if I am not. In this case I am not authorized to it:

00001
------
N

I authorize myself to the function with the following command:

CHGFCNUSG FCNID(QIBM_DB_GENCOL_OVERRIDE) USER(<user-profile>) USAGE(*ALLOWED)

You might need your IBM i Sys Admin to do this for you if you do not have adequate authority to run the command.

I change the Global Variables value with a SQL SET statement:

SET QSYS2.REPLICATION_OVERRIDE = 'Y'

And I can see that the value within it has changed:

VALUES QSYS2.REPLICATION_OVERRIDE


00001
------
Y

Just to prove it is only this session that has the changed value I opened a second Run SQL Scripts window and checked what it showed for the value in REPLICATION_OVERRIDE:

VALUES QSYS2.REPLICATION_OVERRIDE


00001
------
N

If I try to an Insert statement with just OTHER_COLUMN:

INSERT INTO TESTTABLE2 (OTHER_COLUMN) 
  SELECT OTHER_COLUMN FROM TESTTABLE

I receive the following error message:

SQL State: 22005 
Vendor Code: -20584 
Message: [SQ20584] Default not allowed for table TESTTABLE2 in MYLIB. 
Cause . . . . . :   A value of DEFAULT was specified for at least one column in
table TESTTABLE2 in MYLIB.  The value was either explicitly provided, specified
as an extended indicator value, or implicitly requested by omission of the 
column from the insert or update operation. When the REPLICATION_OVERRIDE
global variable is set to Y, defaults are not supported.
Recovery  . . . :   Specify a value for all columns in the INSERT, UPDATE, or
MERGE statement.

I must insert all the columns from TESTTABLE into TESTTABLE2, including the Identity column. My Insert statement now becomes:

INSERT INTO TESTTABLE2 
  SELECT * FROM TESTTABLE

And the values of the Identity column have been copied to TESTTABLE2's identity column:

SELECT * FROM TESTTABLE2


IDENTITY   OTHER COLUMN
--------   ------------
       6   first
       7   second
       8   third
       9   fourth
      10   fifth

Now I am done with the Global Variable I can set it back to its default for this session with another SQL Set statement:

SET QSYS2.REPLICATION_OVERRIDE = 'N'

Or when the session or job ends the value for that session is lost, and the next session will see the default value for REPLICATION_OVERRIDE.

 

You can learn more about the REPLICATION_OVERRIDE SQL Global Variable from the IBM website here.

 

This article was written for IBM i 7.5 TR1 and 7.4 TR7.

4 comments:

  1. I have struggled with this for years. I use the overriding system value. In this example I have Order Header and Order Detail that I want to copy from the production environment to the test environment. Order detail uses the identity of Order Header as a foreign key.
    First, clear the order detail since it has a foreign key constraint.
    TRUNCATE TESTLIB/ORDER_DETAIL;
    Then clear the order header.
    TRUNCATE TESTLIB/ORDER_HEADER;
    Then copy the order header from production to test. Keep the identity column values from production.
    INSERT INTO TESTLIB/ORDER_HEADER OVERRIDING SYSTEM VALUE
    (SELECT * FROM PRODLIB/ORDER_HEADER)
    Then copy the order detail from production to test. This retains the correct foreign key relationship.
    INSERT INTO TESTLIB/ORDER_DETAIL OVERRIDING SYSTEM VALUE
    (SELECT * FROM PRODLIB/ORDER_DETAIL)
    Also, I have a command that I run immediately after this that resets the next identity value to the max(identity value) + 1.
    Looking at the documentation for REPLICATION_OVERRIDE I think it basically does the same as OVERRIDING SYSTEM VALUE.
    I am curious if you still need to update the next identity value with this new option. What happens when you try to add a new record to your TESTTABLE2 after adding those records? I think it will try to add a record with a duplicate identity value and assuming there is a unique key it should fail.

    ReplyDelete
    Replies
    1. I have found that it does allow me to enter rows that will then result in duplicate identity values!

      I would need to reset the identity column to greater than the rows I inserted into TESTTABLE2 using what is described here.

      Delete
  2. The CPYF command will also copy records and not change the identity column values.

    ReplyDelete

To prevent "comment spam" all comments are moderated.
Learn about this website's comments policy here.

Some people have reported that they cannot post a comment using certain computers and browsers. If this is you feel free to use the Contact Form to send me the comment and I will post it for you, please include the title of the post so I know which one to post the comment to.