Wednesday, April 16, 2025

LISTAGG scalar function can return distinct results

I did not realize it had been so long since I first wrote about the LISTAGG scalar function. I mentioned very briefly in that article something that became very useful to me just a short while ago.

The LISTAGG scalar function aggregates the results from a one or more rows into a single new column, separated by a character of my choice.

In my scenario I have a table, TESTTABLE, that contains a column of colors, COLOR:

01  SELECT COLOR,COUNT(*) "Count" 
02    FROM TESTTABLE
03   GROUP BY COLOR

Some of these colors are repeated:

COLOR   Count
------  -----
ORANGE      1
BLUE        1
GREEN       1
RED         3
YELLOW      2

If I use a minimal LISTAGG:

01  SELECT LISTAGG(COLOR,', ') AS "Colors"
02    FROM TESTTABLE

Line 1: I am creating the new column Colors using LISTAGG with the comma ( , ) as the separator character.

When executed the result is:

Colors
--------------------------------------------------
RED, GREEN, BLUE, YELLOW, RED, RED, YELLOW, ORANGE

It is easier to see the duplicates if I sort the values in COLOR, I do this using the following:

01  SELECT LISTAGG(COLOR,', ') 
02           WITHIN GROUP (ORDER BY COLOR) AS "Colors"
03    FROM TESTTABLE

Line 2: This is the code that does the ordering. The WITHIN GROUP acts like a GROUP BY. The (ORDER BY COLOR) is the sort criteria.

My result is:

Colors
--------------------------------------------------
BLUE, GREEN, ORANGE, RED, RED, RED, YELLOW, YELLOW

I can also sort the aggregated results in descending order with the following:

01  SELECT LISTAGG(COLOR,', ') 
02           WITHIN GROUP (ORDER BY COLOR DESC) AS "Colors"
03    FROM TESTTABLE

Line 2: The order by now contains DESC, sort in descending order.

Which returns:

Colors
--------------------------------------------------
YELLOW, YELLOW, RED, RED, RED, ORANGE, GREEN, BLUE

While sorting the values in the column works in most situations, there are some occasions where it does not. For example, if I was to use LISTAGG with text lines from a Table I might not want to sort them as the results become confusing:

01  SELECT LISTAGG(RTRIM(TEXT_COMMENTS),', ')
02           WITHIN GROUP (ORDER BY TEXT_COMMENTS) AS "Texts"
03    FROM TESTTABLE_1

The above produces this result:

Texts
--------------------------------------------------------
First line of text, Fourth line, Second line, Third line

This is not what I wanted as the fourth line of text follows the first, and is before the second line.

Here it is best not to sort:

01  SELECT LISTAGG(RTRIM(TEXT_COMMENTS),', ') AS "Texts"
02    FROM TESTTABLE_1

The results are what I wanted:

Texts
--------------------------------------------------------
First line of text, Second line, Third line, Fourth line

Returning to my original scenario, if I want to only have each color once I would use the following:

01  SELECT LISTAGG(DISTINCT COLOR,', ') 
02           WITHIN GROUP (ORDER BY COLOR) AS "Colors"
03    FROM TESTTABLE

Line 1: DISTINCT has been added to the LISTAGG. Distinct selects only unique results. This gives me:

Colors
--------------------------------
BLUE, GREEN, ORANGE, RED, YELLOW

Which is what I desired.

I also found that I can combine multiple columns into one LISTAGG result. In this example I want to have a list of customers' names and the states where they live:

01  SELECT LISTAGG((RTRIM(NAME) || ',' || STATE),' : ')
02    FROM CUSTOMER

Line 1: I need to trim the NAME column as it is CHAR, fixed width character. I am using the double pipe ( || ) in place of the word CONCAT to concatenate the name and state, separated by a comma. I am separating each aggregation by a colon ( : ).

The results are:

00001
-----------------------------------------------
Micky Mouse,FL : Donald Duck,CA : Goofy Goof,OH

Both of these saved me a lot of extra coding to get the results I desired.

 

This article was written for IBM i 7.5, and should work for some earlier releases too.

2 comments:

  1. How do you do undo this if someone gave you an aggregated report?

    ReplyDelete
    Replies
    1. I guess it would depend how the data was aggregated.

      You could use the SQL SPLIT scalar function, see here.

      Or you could use the RPG BiF %SPLIT, see here.

      Delete

To prevent "comment spam" all comments are moderated.
Learn about this website's comments policy here.

Some people have reported that they cannot post a comment using certain computers and browsers. If this is you feel free to use the Contact Form to send me the comment and I will post it for you, please include the title of the post so I know which one to post the comment to.