Tuesday, September 20, 2022

Using SQL to perform a domain lookup

We all develop our own algorithms to validate email addresses. There must be an "@" sign in the string, there must be a period after that, etc. Most of these I have seen fail due to the domain name part of the email address. There are now Top Level Domains, TLD, (the letters that come after the "dot") that are longer than three characters. I know of many non-IBM i applications that use a DNS lookup to determine if the domain is active, although this will not guarantee that the domain has email.

As part of the IBM i 7.5 and 7.4 TR6 release is a simple way to do a DNS lookup using SQL.

Before I get started I need to state the obvious: What I am going to show you here will only work if your IBM i partition can connect to the internet. If it cannot then you will be unable to duplicate what I describe here.

The DNS lookup is performed by a Db2 for i Table Function, DNS_LOOKUP, which is found in the library QSYS2. Two parameters can be passed to the Table function:

  1. SEARCH_NAME:  The domain name I want to resolve the host for
  2. DOMAIN_SERVER:  Host name or IP address of the domain name server. Can be omitted

Four columns are returned in the results. I am only interested in the first two:

  1. ADDRESS_SPACE_TYPE:  IP type for the IP address. Either IPV4 or IPV6
  2. IP_ADDRESS:  Either the IPV4 or IPV6 address

In this first example I want to perform a DNS look up for the following three domains:

  1. GOOGLE.COM
  2. IBM.COM
  3. MICROSOFT.COM

Rather than have three sets of results, I thought I would combine them into one using an Union clause:

SELECT 'google.com',
       ADDRESS_SPACE_TYPE AS "Type",
       IP_ADDRESS AS "IP address"
  FROM TABLE(QSYS2.DNS_LOOKUP('google.com'))
UNION
SELECT 'IBM.COM',ADDRESS_SPACE_TYPE,IP_ADDRESS
  FROM TABLE(QSYS2.DNS_LOOKUP('IBM.COM'))
UNION
SELECT 'Microsoft.Com',ADDRESS_SPACE_TYPE,IP_ADDRESS
  FROM TABLE(QSYS2.DNS_LOOKUP(' Microsoft.Com'))

For the Union to be use correctly I have to return the same number of columns from each Select statement. I have only given the first parameter, the search name (domain name). Notice that I have entered the domain name in different cases.

Results are returned regardless of the case of the search pattern:

What           Type  IP address
-------------  ----  ------------------------
google.com     IPV4  142.250.181.238
google.com     IPV6  2a00:1450:4001:80f::200e
IBM.COM        IPV4  23.79.150.191
IBM.COM        IPV6  2a02:26f0:480:38d::3831
IBM.COM        IPV6  2a02:26f0:480:383::3831
Microsoft.Com  IPV4  20.103.85.33
Microsoft.Com  IPV4  20.81.111.85
Microsoft.Com  IPV4  20.84.181.62
Microsoft.Com  IPV4  20.112.52.29
Microsoft.Com  IPV4  20.53.203.50

If I pass the Table Function a domain that does not exist no results are returned.

How can I use it to validate the domain part of an email address? This is a RPG program to do that. I am going to show it in two parts:

  1. The main body
  2. The subprocedure that does all the interesting stuff

First, the main body:

01  **free
02  ctl-opt option(*srcstmt) dftactgrp(*no) ;

03  dcl-s EmailAddress char(100) ;
04  dcl-s Valid ind ;

05  EmailAddress = 'Someone@GMail.com' ;
06  Valid = CheckDomain(EmailAddress) ;

07  EmailAddress = 'Bad.One@rubbish.bad'  ;
08  Valid = CheckDomain(EmailAddress) ;

09  *inlr = *on ;

Line 1: Yes, it is totally free RPG!

Line 2: My favorite control options. I need the DFTACTGRP as I am going to call a subprocedure.

Lines 3 and 4: Definition of the "global" variables.

Line 5: Moving a potential email address into the variable that will be passed to the subprocedure.

Line 6: The procedure is called and the returned value is placed in the variable Valid.

Lines 7 and 8: Are the same as lines 5 and 6, for another potential email address.

As I said all the interesting stuff is in the subprocedure:

10  dcl-proc CheckDomain ;
11    dcl-pi *n ind ;
12      EmailAddress char(100) ;
13    end-pi ;

14    dcl-s Domain char(100) ;
15    dcl-s Domain20 char(20) ;
16    dcl-s Found char(1) ;

17    Found = '0' ;

18    Domain = %subst(EmailAddress : %scan('@' : EmailAddress) + 1) ;

19    exec sql SELECT '1' INTO :Found
                 FROM TABLE(QSYS2.DNS_LOOKUP(RTRIM(:Domain))) 
                LIMIT 1 ;

20    Domain20 = Domain ;
21    dsply ('Domain ' + %trimr(Domain20) + ' found = ' + Found) ;

22    return Found ;
23  end-proc ;

Lines 11 – 13: The procedure interface. I have not named the subprocedure in the procedure interface, line 11, so I give *N. This subprocedure returns an indicator value. A 100 character variable is passed to the subprocedure, and I have called it EmailAddress.

Lines 14 – 16: The definition of the "local" variables, that are only available in this subprocedure.

Line 17: I will be using this variable to contain the result of my validation of the domain name.

Line 18: I am extracting the domain name from the email address variable. I do this using a substring built in function, %SUBST, I am only using two of its parameters: the name of the variable and where to start the "substring-ing". If I do not give a length it will "substring" from the starting point to the end of the variable. I determining where to start the substring by scanning the variable for an "@", and adding one to that for the first character of the domain name.

Line 19: Here I am using the SQL Select statement to validate that there is a result returned by DNS_LOOKUP. The first line states that it will move '1’ into the variable Found if the statement completes successfully. If it does not Found will remain unchanged. I decided to use the RTRIM scalar function on the Domain variable. A domain name can be a maximum of 63 characters, although most are a lot shorter. I used the RTRIM to only pass the domain name, without any trailing blanks. You do not have to do this, unless you want to.

Lines 20 and 21: These lines are only present for testing purposes. I have to use Domain20 as the program will not compile if I use Domain in the DSPLY operation code, as it has the potential for being longer than the 50 characters DSPLY allows.

Line 22: I return the value in Found, allowing the main body of the program know if the domain is valid or not.

After compiling the program I ran it and these lines were written to the job log by the DSPLY operation code.

DSPLY  Domain GMail.com found = 1
DSPLY  Domain rubbish.bad found = 0

I like the ease of using DNS_LOOKUP. I am considering changing my email address validation procedure to include it.

 

You can learn more about the DNS_LOOKUP SQL Table Function from the IBM website here.

 

This article was written for IBM i 7.5 and 7.4 TR6.

5 comments:

  1. This also works for 7.3 TR 12.

    ReplyDelete
  2. Thanks as always Simon!

    ReplyDelete
  3. This is fantastic Simon... Thank you for sharing

    ReplyDelete
  4. Specifying DNS_LOOKUP(' Microsoft.Com') with a leading space resulted in an error "INVALID SEARCH_NAME". I'm assuming you have a typo. I can also confirm this works on 7.3. Thank you for this. This is new to me.

    ReplyDelete

To prevent "comment spam" all comments are moderated.
Learn about this website's comments policy here.

Some people have reported that they cannot post a comment using certain computers and browsers. If this is you feel free to use the Contact Form to send me the comment and I will post it for you, please include the title of the post so I know which one to post the comment to.