Wednesday, February 3, 2021

Calculating differences between dates in SQL

calcuate difference between 2 dates and then show it in words

I was asked if there is an easy way in SQL to calculate the difference between two date fields.

I am sure there are many different ways to do this. In my example I have a file, TESTFILE, that contains a date field, WKDATE. I want to compare this date to today's date and produce a string that tells me the number of years, months, and days difference. Let me start off with the calculation of the difference:

01  SELECT WKDATE,
02         CURRENT_DATE AS "Curr date",
03         CURRENT_DATE - WKDATE AS "Diff"
04      FROM TESTFILE ;

Line 2: CURRENT_DATE is SQL's special register that contains today's date.

Line 3: To calculation the difference between the file's date, WKDATE, and today's date I can perform a simple subtraction.

The results are:

WKDATE      Curr date   Diff
----------  ----------  -------  
2020-01-31  2021-02-03        3
2020-06-04  2021-02-03      729
1995-02-21  2021-02-03   251110
2020-04-01  2021-02-03     1002
1904-10-24  2021-02-03  1160310

The results in the Diff column show the number of years, months, and days. I find it easier to look at the result in reverse order to understand what it shows:

  • Last two digits are the number of days
  • Third and fourth digits from the end are the number of months
  • Everything before that is the number of years

The value of 3 in the first row of the results shows that the difference between the two dates in 3 days.

The value in the second row of the results shows that the difference is 7 months and 29 days.

The third row's difference is 25 years, 11 months, and 10 days.

If I want to make a nicely formatted string I need to convert the result from decimal to be character:

01  SELECT WKDATE,
02         CURRENT_DATE AS "Curr date",
03         TO_CHAR(CURRENT_DATE - WKDATE) AS "Diff"
04    FROM TESTFILE ;

The results now look like:

WKDATE      Curr date   Diff
----------  ----------  -------  
2020-01-31  2021-02-03  3
2020-06-04  2021-02-03  729
1995-02-21  2021-02-03  251110
2020-04-01  2021-02-03  1002
1904-10-24  2021-02-03  1160310

As the Diff column is now character the values are left justified. This makes it too hard for me to substring out the year, month, and days from the results. Even if I replace the TO_CHAR with a Cast statement, on line 3, the results remain left justified.

01  SELECT WKDATE,
02         CURRENT_DATE AS "Curr date",
03         CAST((CURRENT_DATE - WKDATE) AS CHAR(11)) AS "Diff"
04    FROM TESTFILE ;

This is where I started thinking creatively. I remembered that I could format any number using the TO_CHAR with an equivalent of an edit word. This would do two things for me:

  1. Right justify the result
  2. Place a character to separate the years, months, and days
01  SELECT WKDATE,
02         CURRENT_DATE AS "Curr date",
03         TO_CHAR(CAST((CURRENT_DATE - WKDATE) AS DEC(10,0)),
             '9999G99G99') AS "Diff"
04    FROM TESTFILE ;

By using the formatting, 9999G99G99, the TO_CHAR will replace the G characters with commas. Making the results look like:

WKDATE      Curr date   Diff
----------  ----------  ---------  
2020-01-31  2021-02-03          3
2020-06-04  2021-02-03       7,29
1995-02-21  2021-02-03   25,11,10
2020-04-01  2021-02-03      10,02
1904-10-24  2021-02-03  116,03,10

If I count the number of commas in each result I can determine if the result is:

  • No commas = Days only
  • 1 comma = Months and days
  • 2 commas = Years, months, and days

I decide to count the number of commas using the Regular Expression for Counting, REGEXP_COUNT, to count the number of commas in the string produced by the TO_CHAR.

01  SELECT WKDATE,
02         CURRENT_DATE AS "Curr date",
03         TO_CHAR(CAST((CURRENT_DATE - WKDATE) AS DEC(10,0)),
             '9999G99G99') AS "Diff"
04         REGEXP_COUNT((TO_CHAR(CAST((CURRENT_DATE - WKDATE) 
             AS DEC(10,0)),'9999G99G99')),',') AS "Commas"
05    FROM TESTFILE ;

Which gives me the following:

WKDATE      Curr date   Diff      Commas
----------  ----------  --------- ------
2020-01-31  2021-02-03          3      0
2020-06-04  2021-02-03       7,29      1
1995-02-21  2021-02-03   25,11,10      2
2020-04-01  2021-02-03      10,02      1
1904-10-24  2021-02-03  116,03,10      2

At first I tried to make a single statement to break apart the results into the strings I wanted. But this became too ugly and, in my opinion, difficult to show and understand. Therefore, I made it a CTE, Common Table Expression, which combines the results of two statements to give me what I desire.

01  WITH T1(WKDATE,CURRDATE,NBR,COMMAS) AS
02  (SELECT WKDATE,
03     CURRENT_DATE,
04     TO_CHAR(CAST((CURRENT_DATE - WKDATE) AS DEC(10,0)),
         '9999G99G99'),
05     REGEXP_COUNT((TO_CHAR(CAST((CURRENT_DATE - WKDATE) 
         AS DEC(10,0)),'9999G99G99')),',')
06  FROM TESTFILE),

07  T2(WKDATE,CURRDATE,NBR,COMMAS,TEXT) AS
08  (SELECT T1.*,
09     CASE COMMAS
10       WHEN 0 THEN LTRIM(TO_CHAR(NBR)) || ' days'
11       WHEN 1 THEN LTRIM(SUBSTR(NBR,7,2)) || ' months ' || 
           TO_CHAR(TO_NUMBER(SUBSTR(NBR,10,2))) || ' days'    
12       ELSE LTRIM(SUBSTR(NBR,1,5)) || ' years ' ||
           TO_CHAR(TO_NUMBER(SUBSTR(NBR,7,2))) || ' months ' ||
           TO_CHAR(TO_NUMBER(SUBSTR(NBR,10,2))) || ' days'
13     END
14  FROM T1)
      
15  SELECT * FROM T2 ORDER BY WKDATE DESC ;

I look at the above and break it up into three parts:

  1. Lines 1 – 6
  2. Lines 7 – 14
  3. Line 15

Line 1: All CTE start with the word WITH. I am going to build a temporary table, in memory, called T1. It will contain four columns: WKDATE, CURRDATE, NBR, and COMMAS.

Lines 2 – 6: This is the SQL statement that builds T1. It is the same statement as I showed above. This statement is ended with a comma at its end on line 6.

Line 7: Start of the second part. I am creating another table, T2, This table contains the same columns as T1 with the addition of one called TEXT.

Line 8: This will include all the columns from T1 in T2.

Lines 9 – 13: I am using a Case statement to condition how I build the string that will be in the column TEXT.

Line 10: If the COMMAS column contains no commas then the DIFF column is just days. I need to remove any leading zeros. I do this by using the TO_CHAR, which removes leading any leading zeroes. I then use the LTRIM to remove any leading blanks.

Line 11: If there is one comma then I know the column NBR contains months and days. I know the NBR column is 11 long, therefore, I can substring out the months as they start in the 7th position, if the month is less than 10 by using the LTRIM I left justify the month. Days start in the 10th position of NBR. I substring the days out, convert that string to decimal with TO_NUMBER, and then back to character with TO_CHAR to remove any leading zero.

Line 12: The Else will only be executed when there are three commas. Here I use the same ways the month and day were extracted in line 11. First I extract the years, and then the month and days.

Line 14: Notice that there is no comma at the end of this line.

Line 15: This statement "builds" the results from T2 that is displayed to the user.

I removed the CURRDATE from the results so that this will fit in the width of this post, as we all know what today's date is.

                       COM
WKDATE      NBR        MAS  TEXT
----------  ---------  ---  --------------------------
2020-01-31          3    0  3 days
2020-06-04       7,29    1  7 months 29 days
2020-04-01      10,02    1  10 months 2 days
1995-02-21   25,11,10    2  25 years 11 months 10 days
1904-10-24  116,03,10    2  116 years 3 months 10 days

The approach of using the CTE has made this simple to use and to follow.

 

This article was written for IBM i 7.4, and should work for some earlier releases too.

7 comments:

  1. Thanks to share this feature, I didn't know about it.

    ReplyDelete
  2. Line 12: The Else will only be executed when there are three commas.

    Not to be nit-picky, but this should be "...there are two commas". I love your articles though. I always find some new trick! Thanks!

    ReplyDelete
  3. I came across this unusual way of representing the difference between timestamps when trying to find a way to use SQL to calculate the duration of processes based on their start and end timestamps. The result of one calculation was 113 which means one minute and 13 seconds (i.e. 73 seconds), rather than 113 seconds. Does anyone know of a simple way to return the actual number of second between two timestamps in SQL?

    ReplyDelete
  4. Just a comment, it is possible to determine years, months, days, hours, minutes, seconds directly from the numeric difference of a date/time values.

    Here an Example for generating the text difference in a different way.

    WITH T1(WKDATE,CURRDATE,NumDiff)
    AS (SELECT WKDATE, CURRENT_DATE, Current_Date - WKDate
    FROM TESTFILE)

    Select T1.*,
    Year(NumDiff) NbrYears,
    Month(NumDiff) NbrMonths,
    Day(NumDiff) NbrDays,
    Case When Year(NumDiff) > 0
    Then Year(NumDiff) concat ' years '
    Else '' End concat
    Case When Month(NumDiff) > 0
    Then Month(NumDiff) Concat ' months '
    Else '' End concat
    Case When Day(NumDiff) > 0
    Then Day(NumDiff) concat ' days'
    Else '' End Text
    from T1;

    ReplyDelete
  5. Sachin AbeywardenaMay 10, 2021 at 8:57 AM

    Helpful!!!!! Tip...Thank you Simon...

    ReplyDelete

To prevent "comment spam" all comments are moderated.
Learn about this website's comments policy here.

Some people have reported that they cannot post a comment using certain computers and browsers. If this is you feel free to use the Contact Form to send me the comment and I will post it for you, please include the title of the post so I know which one to post the comment to.