Wednesday, August 7, 2019

Using a SQL Sequence to renumber a column

sql sequence to provide a sequentail number to a sql statement

The title sounds a bit strange, but so is the subject of this post. There is a thing in Db2 for i that I can create called a Sequence. A Sequence uses the rules I give it to return to me a sequential value I can use to update column in a table.

I am sure it will become a bit clearer when I give my example.

I have ten students who have completed taken a course. At the end of the course the students take a test. The results of this test are contained in a SQL table I created, along with their name.

01  CREATE OR REPLACE TABLE MYLIB.STUDENT
02  (LAST_NAME VARCHAR(30),
03   FIRST_NAME VARCHAR(20),
04   MID_INITIAL CHAR(1),
05   SCORE DECIMAL(5,2)) ;

The students and their scores are:

LAST_NAME   FIRST_NAME   MID_INITIAL   SCORE
---------   ----------   -----------   -----
FUESTER     MAX                        25.69
ECKERT      TRISTAN           M        49.86
APUZZO      CAROL             C        14.89
TORRES      JOSE                       94.96
PENA        JUANITA           M        74.09
HONG        XIA                        18.54
REED        STEVE             J        65.83
OKEEFE      VINCENT                     6.02
KING        CATHERINA         K        18.76
WILEY       CHRISTIAN         J        60.77

There is a prize for the top five students, the size of which depends upon the rank of their test score. I am going to save this into a table that I can forward onto the people who fund the prize.

01  CREATE OR REPLACE TABLE MYLIB.RANKING
02  (STUDENT VARCHAR(50),
03   SCORE DECIMAL(5,2),
04   RANK SMALLINT) ;

I can use the following Insert statement to fill this table:

01  INSERT INTO MYLIB.RANKING
02  SELECT RTRIM(LAST_NAME) || ', ' ||
03           RTRIM(FIRST_NAME) || ' ' || 
04           MID_INITIAL,
05         SCORE,0
06    FROM STUDENT
07   ORDER BY SCORE DESC
08   LIMIT 5 ;

Lines 2 – 4: I am concatenating the students' names into one column. The RTRIM removes the spaces on the right side of the column. The double pipe ( || ) signify what is concatenated together.

Line 8: The LIMIT is used to state how many results I want returned, in this case five.

STUDENT              SCORE   RANK
------------------   -----   ----
TORRES, JOSE         94.96      0
PENA, JUANITA M      74.09      0
REED, STEVE J        65.83      0
WILEY, CHRISTIAN J   60.77      0
ECKERT, TRISTAN M    49.86      0

I can infer the rank of each student, but it would be better to have the rank as a number. These is where the Sequence comes into play.

I used the following statement to create my Sequence:

01  CREATE OR REPLACE SEQUENCE MYLIB.CALC_RANK
02    START WITH 1
03    INCREMENT BY 1
04    NO MAXVALUE
05    NO CYCLE ;

Line 1: When I execute this statement a data area of the same name is created. As I qualified the data area is created in my library, MYLIB. In the documentation from IBM makes it clear that we should not change the created data area:

The *DTAARA objects should not be changed with the Change Data Area (*CHGDTAARA) or any other similar interface because doing so may cause unexpected failures or unexpected results when attempting to use the SQL sequence through SQL.

I have given the most basic statement to show how easy this is to create. There are other parameters, but for this example, they are not necessary.

I want to:

  • Start my sequence with number 1 (line 2)
  • Increment the returned number by 1 (line 3)
  • There is no maximum value (line 4)
  • The sequence number will not cycle when its maximum value is reached (line 5)

To update the RANK column I would use the following statement:

UPDATE RANKING SET RANK = NEXT VALUE FOR CALC_RANK ;

And the RANK column in my table is updated.

STUDENT              SCORE   RANK
------------------   -----   ----
TORRES, JOSE         94.96      1
PENA, JUANITA M      74.09      2
REED, STEVE J        65.83      3
WILEY, CHRISTIAN J   60.77      4
ECKERT, TRISTAN M    49.86      5

Be warned that if I run the Update statement a second time the numbers in the RANK column will be 6 – 10.

If I want to resequence I would need to restart the sequence. Fortunately there is a statement to do that:

ALTER SEQUENCE CALC_RANK RESTART ;

After executing this if I run the Sequence again the first row has a value of 1.

Now I have created my Sequence, and I know how to restart it I can now include it in the Insert statement that inserts row into the table RANKING, rather than have it as a separate step.

01  DELETE FROM RANKING ;
02  ALTER SEQUENCE CALC_RANK RESTART ;

03  INSERT INTO RANKING
04  SELECT RTRIM(LAST_NAME) || ', ' ||
05           RTRIM(FIRST_NAME) || ' ' ||
06           MID_INITIAL,
07         SCORE,
08         NEXT VALUE FOR CALC_RANK
09    FROM STUDENT
10   ORDER BY SCORE DESC
11   LIMIT 5 ;

Line 1: I have the Delete statement to delete any rows that maybe present in the table.

Line 2: Restart the number returned from the Sequence.

Line 8: This is the only change from the previous Insert statement. This adds the value returned from the Sequence into the RANK column.

My results are as expected:

STUDENT              SCORE   RANK
------------------   -----   ----
TORRES, JOSE         94.96      1
PENA, JUANITA M      74.09      2
REED, STEVE J        65.83      3
WILEY, CHRISTIAN J   60.77      4
ECKERT, TRISTAN M    49.86      5

If I am done with my Sequence I can delete it using the SQL Drop statement.

DROP SEQUENCE CALC_RANK

If I wanted to I could create new function that would start with 10 and increment by 10:

CREATE SEQUENCE MYLIB.NEW_SEQ
  START WITH 10
  INCREMENT BY 10
  NO MAXVALUE
  NO CYCLE ;

Or 100s, or whatever. What I need to set my Sequence to return just depends on what I want values I want.

 

You can learn more about this from the IBM website:

 

This article was written for IBM i 7.3, and should work for some earlier releases too.

5 comments:

  1. wouldn't a window function be more practical in that case?

    select rtrim(last_name) || ', ' || rtrim(first_name) || mid_initial, score, rank() over(order by score desc)
    from student
    fetch first 5 rows only

    ReplyDelete
    Replies
    1. Remember that the examples I give in these posts are very simple, therefore, there could well be better ways of doing the same exact example.

      RANK() is new to me. Something I will have a "play" with.

      Delete
  2. Why not use the AS IDENTITY clause on the CREATE TABLE statement?
    https://www.ibm.com/support/knowledgecenter/en/SSEPGG_10.5.0/com.ibm.db2.luw.admin.dbobj.doc/doc/c0020108.html

    ReplyDelete
    Replies
    1. This is just an example of how to use a Sequence.

      If I was doing this in "live" I would probably use an identity column.

      Delete
  3. Ok so here is an example that might show where this could be useful. I have a time created lists of objects via the DSPOBJD command to an outfile. Then using SQL "wrote" a script to delete all of those objects. The script is a single column table. Then I have to CPYF the script table back to a member in QCLSRC. I think doing an OVRDBF to the new member name and then using this construct I can create the sequence number column.

    ReplyDelete

To prevent "comment spam" all comments are moderated.
Learn about this website's comments policy here.

Some people have reported that they cannot post a comment using certain computers and browsers. If this is you feel free to use the Contact Form to send me the comment and I will post it for you, please include the title of the post so I know which one to post the comment to.