Wednesday, March 4, 2020

Adding subtotals and a total to a SQL Select

adding subtotals and total to sql select

I have often looked at the results returned to me by a SQL statement and wondered to myself "Wouldn't it be nice if I could add a subtotal to these results".

Yesterday I decided to have a search using my favorite search engine to see if there is an easy way to do this. I could find examples in other flavors of SQL, but not in Db2 for i. Feeling piqued I decided to try one of these examples with one of my SQL Select statements, and was really pleased to find that what was given in the example also worked in Db2 for i too. Thank goodness for SQL interoperability and standardization.

I am not going to give the source code for the SQL DDL table I will be using in these examples. My table, TABLE_OF_THINGS, has three columns:

  1. TYPE
  2. SUBTYPE
  3. THING

For those of you wondering why I am using such a long table name that is too long to have a decent system name I used the FOR SYSTEM in the CREATE TABLE statement to give it the system name: THINGSTAB.

I can use the SQL statement below to view the table's contents:

01  SELECT * FROM TABLE_OF_THINGS

TYPE  SUBTYPE     THING
----  ----------  ----------
1     FRUIT       STRAWBERRY
1     VEGETABLE   MAIZE
2     VERTEBRATE  BEAR
1     FRUIT       RASPBERRY
1     VEGETABLE   TURNIP
2     VERTEBRATE  DOG
1     FRUIT       APPLE
1     VEGETABLE   CELERY
2     VERTEBRATE  CAT
1     FRUIT       STRAWBERRY

I can add a count and group the results, but that is not really what I want:

01  SELECT TYPE,SUBTYPE,THING,COUNT(*) AS COUNT
02    FROM TABLE_OF_THINGS
03   GROUP BY TYPE,SUBTYPE,THING
04   ORDER BY TYPE,SUBTYPE,THING

TYPE  SUBTYPE     THING       COUNT
----  ----------  ----------  -----
1     FRUIT       APPLE           1
1     FRUIT       RASPBERRY       1
1     FRUIT       STRAWBERRY      2
1     VEGETABLE   CELERY          1
1     VEGETABLE   MAIZE           1
1     VEGETABLE   TURNIP          1
2     VERTEBRATE  BEAR            1
2     VERTEBRATE  CAT             1
2     VERTEBRATE  DOG             1

I was fearing that I would have to take these results and then run another statement over them to create subtotals and a total. That was until I discovered a post for Microsoft SQL Server that gave me three examples of how I can create subtotals.

 

GROUP BY ROLLUP

This addition to my statement gives me subtotals and a total at the bottom of the results.

01  SELECT TYPE,SUBTYPE,THING,COUNT(*) AS COUNT
02    FROM TABLE_OF_THINGS
03   GROUP BY ROLLUP (TYPE,SUBTYPE,THING)

Line 3: The GROUP BY ROLLUP has to be followed by the columns the results need to be rolled up for.

With this statement I will get a count of all of the things, a subtotal of the subtype and type, and a total of all the results:

TYPE  SUBTYPE     THING       COUNT
----  ----------  ----------  -----
1     FRUIT       APPLE           1
1     FRUIT       RASPBERRY       1
1     FRUIT       STRAWBERRY      2
1     FRUIT       -               4
1     VEGETABLE   CELERY          1
1     VEGETABLE   MAIZE           1
1     VEGETABLE   TURNIP          1
1     VEGETABLE   -               3
1     -           -               7
2     VERTEBRATE  BEAR            1
2     VERTEBRATE  CAT             1
2     VERTEBRATE  DOG             1
2     VERTEBRATE  -               3
2     -           -               3
-     -           -              10

The hyphens ( - ) denote null values.

This nice and simple SQL statement fulfills the purpose I was looking for. But there are other, dare I say weirder, ways of using the GROUP BY.

 

GROUP BY GROUPING SETS

The grouping sets allow me to group the results into different ways as they are returned in the results from the SQL statement. For example:

01  SELECT TYPE,SUBTYPE,THING,COUNT(*) AS COUNT
02    FROM TABLE_OF_THINGS
03   GROUP BY GROUPING SETS ((TYPE,SUBTYPE,THING),(THING,TYPE),
                             (TYPE),())

Line 3: I can group together the columns in any way I want, providing they are enclosed in parentheses ( ( ) ) and each column within them is separated by a comma.

The empty set of parentheses at the end will produce the grand total.

When run I get the following:

TYPE  SUBTYPE     THING       COUNT
----  ----------  ----------  -----
1     FRUIT       APPLE           1
1     FRUIT       RASPBERRY       1
1     FRUIT       STRAWBERRY      2
1     VEGETABLE   CELERY          1
1     VEGETABLE   MAIZE           1
1     VEGETABLE   TURNIP          1
1     -           APPLE           1
1     -           CELERY          1
1     -           MAIZE           1
1     -           RASPBERRY       1
1     -           STRAWBERRY      2
1     -           TURNIP          1
1     -           -               7
2     VERTEBRATE  BEAR            1
2     VERTEBRATE  CAT             1
2     VERTEBRATE  DOG             1
2     -           BEAR            1
2     -           CAT             1
2     -           DOG             1
2     -           -               3
-     -           -              10

I know the results are a bit strange and not really practical, but I just wanted to show what was possible.

I could also use the following statement to replicate the results I generated with the GROUP BY ROLLUP:

01  SELECT TYPE,SUBTYPE,THING,COUNT(*) AS COUNT
02    FROM PGMSDHTST3.TABLE_OF_THINGS
03    GROUP BY GROUPING SETS ((TYPE,SUBTYPE,THING),(TYPE,SUBTYPE),
                              (TYPE),())

Line 3: Each grouping set produces the "levels" of results I want.

 

GROUP BY CUBE

This where the results get really weird. The GROUP BY CUBE creates all possible combination of the columns in the results. For example with these three columns:

TYPE SUBTYPE THING
Y Y Y
Y Null Y
Y Y Null
Y Null Null
Null Y Y
Null Null Y
Null Y Null
Null Null Null

The statement looks simple enough:

01  SELECT TYPE,SUBTYPE,THING,COUNT(*) AS COUNT
02    FROM TABLE_OF_THINGS
03   GROUP BY CUBE (TYPE,SUBTYPE,THING)
04   ORDER BY TYPE,SUBTYPE,THING

It took me a bit of time to get my head around the results:

TYPE  SUBTYPE     THING       COUNT
----  ----------  ----------  -----
1     FRUIT       APPLE           1
1     FRUIT       RASPBERRY       1
1     FRUIT       STRAWBERRY      2
1     FRUIT       -               4
1     VEGETABLE   CELERY          1
1     VEGETABLE   MAIZE           1
1     VEGETABLE   TURNIP          1
1     VEGETABLE   -               3
1     -           APPLE           1
1     -           CELERY          1
1     -           MAIZE           1
1     -           RASPBERRY       1
1     -           STRAWBERRY      2
1     -           TURNIP          1
1     -           -               7
2     VERTEBRATE  BEAR            1
2     VERTEBRATE  CAT             1
2     VERTEBRATE  DOG             1
2     VERTEBRATE  -               3
2     -           BEAR            1
2     -           CAT             1
2     -           DOG             1
2     -           -               3
-     FRUIT       APPLE           1
-     FRUIT       RASPBERRY       1
-     FRUIT       STRAWBERRY      2
-     FRUIT       -               4
-     VEGETABLE   CELERY          1
-     VEGETABLE   MAIZE           1
-     VEGETABLE   TURNIP          1
-     VEGETABLE   -               3
-     VERTEBRATE  BEAR            1
-     VERTEBRATE  CAT             1
-     VERTEBRATE  DOG             1
-     VERTEBRATE  -               3
-     -           APPLE           1
-     -           BEAR            1
-     -           CAT             1
-     -           CELERY          1
-     -           DOG             1
-     -           MAIZE           1
-     -           RASPBERRY       1
-     -           STRAWBERRY      2
-     -           TURNIP          1
-     -           -              10

See what I mean.

 

Having briefly played with all three of these I can definitely see myself using the GROUP BY ROLLUP. I will probably use the GROUP BY GROUPSETS. I am stumped to think of a scenario where using GROUP BY CUBE would fit a scenario in the environment I work in.

 

This article was written for IBM i 7.4, and should work for some earlier releases too.

6 comments:

  1. At a place I once worked, we had data stored by company, area, region, district, store, year, period, week, day, hour, item. We could have used cube with a materialized query table to provide a data warehouse since we downloaded and added new data every night.

    ReplyDelete
  2. Just so much power!

    ReplyDelete
  3. I think it was Scott Forstie that had an example where he replaced the nulls with *TOTAL* or *SUBTOTAL* to make it easier to understand. In the select statement use the coalesce function. for example:
    SELECT
    coalesce(TYPE, '*TOTAL*') as TYPE,
    coalesce(SUBTYPE, '*TOTAL*') as SUBTYPE,
    coalesce(THING, '*SUBTOTAL*') as THING,
    COUNT(*) AS COUNT
    FROM TABLE_OF_THINGS
    GROUP BY ROLLUP (TYPE,SUBTYPE,THING)

    ReplyDelete
  4. Actually it has been possible to have subtotals in SQL result sets since IBM i 6.1...!
    See this article from 2009 by the excellent Michael Sansoterra:

    https://www.itjungle.com/2009/03/11/fhg031109-story02/

    ReplyDelete
  5. Thanks Simon!

    ReplyDelete
  6. stefan uytterhoevenAugust 11, 2023 at 6:57 AM

    wow , nice! Didn't know that. 'group by rollup'

    ReplyDelete

To prevent "comment spam" all comments are moderated.
Learn about this website's comments policy here.

Some people have reported that they cannot post a comment using certain computers and browsers. If this is you feel free to use the Contact Form to send me the comment and I will post it for you, please include the title of the post so I know which one to post the comment to.