Wednesday, September 5, 2018

Using SQL to determine how many records end with...

determine what alpha fields end with

I was asked an intriguing question that I thought it would make a good post for this blog:

Using SQL, how would I get a count of all records in a file where the value in an alphanumeric field ends with 18?

Rather than show just the finished SQL statement I am going to show how, and why, I built each part of the select statement.

Let me start with my test file. It will come as no surprise to regular readers of this blog that it is called TESTFILE, and is in the library QTEMP. The file contains one alphanumeric/character field, CHARFIELD, its length is irrelevant. Its contents looks like:

01  SELECT CHARFIELD
02    FROM QTEMP.TESTFILE


CHARFIELD
0018
1800
18
00000018
118
X18X

The first step I am going to do is to remove all leading zeroes. I can do this using the left trim, LTRIM, built in function. This built in function has two parameters:

  1. Name of the column/field
  2. Characters to remove

In this case I want the LTRIM to remove any leading zeroes from the string in the fields.

01  SELECT CHARFIELD,
02         LTRIM(CHARFIELD,'0')
03    FROM QTEMP.TESTFILE


CHARFIELD   LTRIM
0018        18
1800        1800
18          18
00000018    18
118         118
X18X        X18X

I cannot use the use the results from the LTRIM for testing the last two characters as the strings still contain blanks to the right. I can remove those using the right trim built in function, RTRIM, and this time I do not want to remove any trailing characters that are not blank.

01  SELECT CHARFIELD,
02         LTRIM(CHARFIELD,'0'),
03         RTRIM(LTRIM(CHARFIELD,'0'))
04    FROM QTEMP.TESTFILE


CHARFIELD   LTRIM       RTRIM
0018        18          18
1800        1800        1800
18          18          18
00000018    18          18
118         118         118
X18X        X18X        X18X

Line 3 shows that I RTRIM the results from the previous LTRIM.

What I need to do now is to retrieve the last two characters in each string. Fortunately there is a built in function to do this, RIGHT. This built in function will return in the results the rightmost number of characters I require, in this case two.

01  SELECT CHARFIELD,  
02         LTRIM(CHARFIELD,'0'),
03         RTRIM(LTRIM(CHARFIELD,'0')),
04         RIGHT(RTRIM(LTRIM(CHARFIELD,'0')),2)
05    FROM QTEMP.TESTFILE


CHARFIELD   LTRIM       RTRIM       RIGHT
0018        18          18          18
1800        1800        1800        00
18          18          18          18
00000018    18          18          18
118         118         118         18
X18X        X18X        X18X        8X

Line 4 shows that I want the two rightmost characters of the result returned by the RTRIM, which is using the results from the LTRIM.

I know some people will ask do I really need to use the RTRIM? Couldn't I just use the RIGHT on the results from only the LTRIM? Let me see what happens if I don't use the RTRIM.

01  SELECT CHARFIELD,
02         LTRIM(CHARFIELD,'0'),
03         RTRIM(LTRIM(CHARFIELD,'0')),
04         RIGHT(RTRIM(LTRIM(CHARFIELD,'0')),2),
05         RIGHT(LTRIM(CHARFIELD,'0'),2) AS RIGHT2
06    FROM QTEMP.TESTFILE
                  

CHARFIELD   LTRIM       RTRIM       RIGHT       RIGHT2
0018        18          18          18
1800        1800        1800        00
18          18          18          18
00000018    18          18          18
118         118         118         18
X18X        X18X        X18X        8X

The RIGHT2 column, defined on line 5, is blank. If you think about it so it should be. The strings within the columns all contain trailing blanks. If I pick the row/field with the longest string it contains: '00000018  '. The result of the LTRIM, from line 2 of the statement, still has trailing blanks: '18  '. Thus, the rightmost two characters are '  '.

The RTRIM, line 3, removes the trailing blanks: '18', so now the rightmost characters are '18'.

The question asked wanted a count not a list of rows.

01  SELECT COUNT(*)
02    FROM QTEMP.TESTFILE
03   WHERE RIGHT(RTRIM(LTRIM(CHARFIELD,'0')),2) = '18'


COUNT ( * )
          4

This statement says I want a count of the rows/records, line 1, that fit the criteria in the where, line 3. The statement in the where clause is the same as the RIGHT column in the earlier statements. The result is 4, as there are four rows/records where the string in the alphanumeric field end with '18'.

 

This article was written for IBM i 7.3, and should work for earlier releases too.


Addendum

Since publishing this post this morning I have received numerous messages informing me that an easier way is to use the following:

SELECT * FROM QTEMP.TESTFILE
 WHERE CHARFIELD LIKE '%18'

Alas, that does not work as none of the strings have "18" in the last two positions of the column/field.

Michel Bortolotto, David Ford, John Panzenhazen, Brian Rusch, and one Anonymous came up with an approach, similar to the above, that works using either the TRIM or RTRIM.

SELECT * FROM QTEMP.TESTFILE
WHERE RTRIM(CHARFIELD) LIKE '%18'

Which gives me the following:

CHARFIELD
0018
18
118
00000018

If I change the select to a count I get the same result as my earlier example:

SELECT COUNT(*) FROM QTEMP.TESTFILE
 WHERE RTRIM(CHARFIELD) LIKE '%18'


COUNT ( * )
          4

5 comments:

  1. Hi Simon,
    there is also locate_in_string which can lookup from the right (using a negative number as third parameter):
    with data (s) as (values
    cast('0018' as char(10)),
    cast('1800' as char(10)),
    cast('18' as char(10)),
    cast('00000018' as char(10)),
    cast('118' as char(10)),
    cast('X18X' as char(10)))
    select s
    from data
    where locate_in_string(trim(s),'18', -1) = length(trim(s)) - 1

    S
    0018
    18
    00000018
    118

    ReplyDelete
  2. One more:
    with data (s) as (values
    cast('0018' as char(10)),
    cast('1800' as char(10)),
    cast('18' as char(10)),
    cast('00000018' as char(10)),
    cast('118' as char(10)),
    cast('X18X' as char(10)))
    select s
    from data
    where regexp_count(s,'18\b') > 0

    ReplyDelete
  3. And one more (if you don't mind):
    with data (s) as (values
    cast('0018' as char(10)),
    cast('1800' as char(10)),
    cast('18' as char(10)),
    cast('00000018' as char(10)),
    cast('118' as char(10)),
    cast('X18X' as char(10)))
    select s
    from data
    where regexp_count(trim(s),'(18)$') > 0

    ReplyDelete
  4. like is pretty expensive. A faster, cheaper solution is

    SELECT COUNT(*) FROM QTEMP.TESTFILE
    WHERE RIGHT(' '||TRIM(CHARFIELD),2) = '18';

    The concat is needed to prevent an error when the trimmed field is less than 2 characters long.

    ReplyDelete

To prevent "comment spam" all comments are moderated.
Learn about this website's comments policy here.

Some people have reported that they cannot post a comment using certain computers and browsers. If this is you feel free to use the Contact Form to send me the comment and I will post it for you, please include the title of the post so I know which one to post the comment to.