Wednesday, October 3, 2018

Creating uniqueness using rowid

rowid column in sql table to ensure record uniqueness

The Db2 for i manual offers using a row id column, ROWID, as an alternative to an Identity column for ensuring table row uniqueness. While I wrote about using a ROWID SQL data type in RPG, I did not describe how the ROWID works when defined as a column in a table.

ROWID can only be used in SQL DDL tables, I cannot add it to a DDS file, using ALTER TABLE. It is defined in a similar manner to an Identity column where I say that the columns contents are generated by Db2 for i automatically.

For example, here is the DDL table I will be using:

01  CREATE TABLE MYLIB.TESTTABLE
02  (COLUMN1 CHAR(3) NOT NULL,
03   COLUMN2 CHAR(10),
04   COLUMN3 ROWID GENERATED ALWAYS IMPLICITLY HIDDEN
05  )

Line 4: COLUMN3 is the row id column. I have to say that it is GENERATED ALWAYS as Db2 for i will place the value in this column when I insert a row into the table.

I have also given IMPLICITLY HIDDEN, this means that the column will only be displayed by a Select statement if I list it in the list of columns I want to retrieve. If use * in the Select statement then it is not included in the returned results. I discussed this further in a post about adding audit columns to tables or files.

I need data in the table, so I used a multiple row Insert statement to populate it:

INSERT INTO TESTTABLE (COLUMN1,COLUMN2)
                      VALUES('001','FIRST 1'),
                            ('002','SECOND'),
                            ('001','FIRST 2'),
                            ('003','THIRD'),
                            ('001','FIRST 3'),
                            ('004','FOURTH'),
                            ('005','FIVE'),
                            ('006','SIX'),
                            ('001','FIRST 4')

On the first line I have only given the first two columns, and the values inserted only do so to two columns. As the third column, COLUMN3, is GENERATED ALWAYS Db2 for i generates the value of the row id when the row is inserted.

Now I can show the difference that the IMPLICITLY HIDDEN makes. In this first Select I am using * to select all the rows:

SELECT * FROM TESTTABLE

COLUMN1  COLUMN2
  001    FIRST 1
  002    SECOND
  001    FIRST 2
  003    THIRD
  001    FIRST 3
  004    FOURTH
  005    FIVE
  006    SIX
  001    FIRST 4

COLUMN3 is missing, as I did to give it in the Select statement:

SELECT COLUMN1,COLUMN2,COLUMN3 FROM TESTTABLE

COLUMN1  COLUMN2     COLUMN3
  001    FIRST 1     ~þób±8°*10*2A08R         *
  002    SECOND      ~þób°ùØ*10*2A08R         *
  001    FIRST 2     ~þób°ÿ-*10*2A08R         *
  003    THIRD       ~þób°\-*10*2A08R         *
  001    FIRST 3     ~þób°÷ *10*2A08R         *
  004    FOURTH      ~þób°S**10*2A08R         *
  005    FIVE        ~þób°T *10*2A08R         *
  006    SIX         ~þób°T\*10*2A08R         *
  001    FIRST 4     ~þób°U^*10*2A08R         *

I can see that the row id column contains hexadecimal data. But can I convert that to something I can read? Yes, using the following statement:

SELECT HEX(COLUMN3)
  FROM TESTABLE


HEX
A18ECE828FF89001F1F010F2C1F0F8D940400000000000000001
A18ECE8290DD8001F1F010F2C1F0F8D940400000000000000001
A18ECE8290DF6001F1F010F2C1F0F8D940400000000000000001
A18ECE8290E06001F1F010F2C1F0F8D940400000000000000001
A18ECE8290E14001F1F010F2C1F0F8D940400000000000000001
A18ECE8290E22001F1F010F2C1F0F8D940400000000000000001
A18ECE8290E30001F1F010F2C1F0F8D940400000000000000001
A18ECE8290E3E001F1F010F2C1F0F8D940400000000000000001
A18ECE8290E4B001F1F010F2C1F0F8D940400000000000000001

The only data type I can convert, CAST, a row id column to is character, which must be 40 long. Then I use the HEX to convert the character representation of the hexadecimal value to something more readable.

I needed to prove to myself that the row id was not like the relative record number, RRN, which can change. I ran a Select statement to view my three columns and the RRN of the rows.

SELECT RRN(*) As RRN,COLUMN1,COLUMN2,HEX(CAST(COLUMN3 AS CHAR(40)))
  FROM TESTTABLE A


RRN COLUMN1 COLUMN2  HEX
  1   001   FIRST 1  A18ECE828FF89
  2   002   SECOND   A18ECE8290DD8
  3   001   FIRST 2  A18ECE8290DF6
  4   003   THIRD    A18ECE8290E06
  5   001   FIRST 3  A18ECE8290E14
  6   004   FOURTH   A18ECE8290E22
  7   005   FIVE     A18ECE8290E30
  8   006   SIX      A18ECE8290E3E
  9   001   FIRST 4  A18ECE8290E4B

Note:  I am only showing the part of the hexadecimal of the row id that is unique, this makes it easier to see in this post.

Now I am going to delete all of the rows where the value in COLUMN1 is not '001':

DELETE FROM TESTTABLE WHERE COLUMN1 <> '001'

I use the SQL View SYSTABLESTAT to confirm that my table contains deleted records:

SELECT SYSTEM_TABLE_SCHEMA,SYSTEM_TABLE_NAME,NUMBER_ROWS,
       NUMBER_DELETED_ROWS
  FROM QSYS2.SYTABLESTAT
 WHERE SYSTEM_TABLE_SCHEMA = 'MYLIB'
   AND SYSTEM_TABLE_NAME = 'TESTTABLE'


LIBRARY     FILE  NUMBER_ROWS  NUMBER_DELETED_ROWS
NAME        NAME
MYLIB  TESTTABLE       4                    5

As I would expect four "active" rows and five deleted. The RRN will not change until I remove the deleted rows, using the Reorganize Physical File Member command, RGZPFM.

RGZPFM FILE(TESTTABLE)

I run the Select using the SYSTABLESTAT to confirm that the deleted records have been moved.

LIBRARY     FILE  NUMBER_ROWS  NUMBER_DELETED_ROWS
NAME        NAME
MYLIB  TESTTABLE       4                    0

When I run the Select on my table I can see that the RRN has changed, but the row id has not.

RRN COLUMN1 COLUMN2  HEX
  1   001   FIRST 1  A18ECE828FF89
  2   001   FIRST 2  A18ECE8290DF6
  3   001   FIRST 3  A18ECE8290E14
  4   001   FIRST 4  A18ECE8290E4B

Like the Identity column, As the row id is Db2 for i generated it cannot be changed.

How would I use the row id in a RPG program?

In the example below I want to update the value in COLUMN2 for most recent row where COLUMN1 = '001'.

01  **free
02  ctl-opt option(*nodebugio:*srcstmt:*nounref) ;

03  dcl-s Col2 char(10) ;
04  dcl-s Col3 sqltype(rowid) ;

05  exec sql SET OPTION COMMIT = *NONE, CLOSQLCSR = *ENDMOD ;

06  exec sql SELECT COLUMN3 INTO :Col3
07             FROM TESTTABLE
08            WHERE COLUMN1 = '001'
09            ORDER BY COLUMN3 DESC ;

Do some other stuff...

20  exec sql UPDATE TESTTABLE SET COLUMN2 = :Col2
21            WHERE COLUMN3 = :Col3 ;

Line 1: Nowadays I only code in RPG totally free.

Line 2: My favorite control options.

Line 3: I have defined this variable to be the same as COLUMN2.

Line 4: This variable is defined as a SQL type ROWID compatible variable.

Line 5: I always add these SQL options into the source of my programs to ensure that I, or others, do not forget them when compiling the source.

Lines 6 – 9: Here I am selecting into the variable Col3 the value of the most recent row id for COLUMN1 '001', the one with highest row id which will be first as I have sorted COLUMN3 in descending order.

I left a gap after this as in the real world other processing would happen.

Lines 20 and 21: After the other processing I want to update the value of the COLUMN2. As I have the row id of the most recent '001' rows I can use Col3 in the where clause to identify the row I want to update.

All pretty easy so far... But what would I do if I want to hard code the value of row id, not retrieve it from the table.

01  **free
02  ctl-opt option(*nodebugio:*srcstmt:*nounref) ;

03  dcl-s Col2 char(10) ;
04  dcl-s Col3 sqltype(rowid) ;

05  exec sql SET OPTION COMMIT = *NONE, CLOSQLCSR = *ENDMOD ;

06  Col3 = x'A18ECE8290E4B001F1F010F2C1F0F8D940400000000000000001' ;

07  exec sql SELECT COLUMN2 INTO :Col2
08             FROM TESTTABLE
09            WHERE COLUMN3 = :Col3 ;

Lines 1 – 5: Same as previous example program.

Line 6: I am moving the hexadecimal value of the row id for the last record to the program row id variable, Col3.

Lines 7 – 9: I can now use the variable Col3 to retrieve the row I desire.

I can also use the ROWID scalar function as part of the WHERE clause with the hexadecimal value of the row id:

07  exec sql SELECT COLUMN2 INTO :Col2
08             FROM TESTTABLE
09            WHERE COLUMN3 = 
10  ROWID(x'A18ECE8290E4B001F1F010F2C1F0F8D940400000000000000001') ;

Having taking sometime to experiment with the ROWID would I use it in place of an Identity column? No. I feel safer continuing to use the Identity column as the row id is just too complicated.

 

You can learn more about this from the IBM website:

 

This article was written for IBM i 7.3, and should work for some earlier releases too.

4 comments:

  1. > I feel safer continuing to use the Identity column...
    I would agree with this. One possible advantage to ROWID could be it can't be reset back to the minimum value so always unique. But an Identity column defined as NO CYCLE (the default) does not get reset either.

    Ringer

    ReplyDelete
  2. I know this is an old article, but I have been searching for this answer. Is the ROWID generated value unique to the specific instance of the specific file or unique,period. Could I see the same ROWID value in two instances of the same file on the same system only in different schema?

    ReplyDelete
  3. I enjoy looking through a post that will make people think.

    Also, many thanks for allowing for me to comment!

    ReplyDelete
  4. It is in reality a nice and useful piece of information. I
    am happy that you shared this helpful information with us.
    Please keep us informed like this. Thank you for sharing.

    ReplyDelete

To prevent "comment spam" all comments are moderated.
Learn about this website's comments policy here.

Some people have reported that they cannot post a comment using certain computers and browsers. If this is you feel free to use the Contact Form to send me the comment and I will post it for you, please include the title of the post so I know which one to post the comment to.