Tuesday, July 7, 2020

Selects with Intersect to find common records in two files

sql selects with intersect to find common data

Having written about Selects with Union I thought it was time to mention to Selects with Intersect.

Select statements with the INTERSECT combines the matching results from two tables into one set of results. To reiterate "matching" means that the same value must appear in the given column(s) in both files.

I have two example tables with the same column names, they contain the following data:

Table 1 Table 2
ANGELA
BETTY
DOROTHY
EVELYN
GEORGINA
GEORGINA
FRANCESCA
EVA
DOROTHY
CAROLINE
ANGELA

I deliberately reversed the names in the second table to show that INTERSECT does not use the table's relative record number, RRN.

In this example the SQL statement is very simple:

01  SELECT FIRST_NAME FROM TABLE1
02    INTERSECT
03  SELECT FIRST_NAME FROM TABLE2

The returned results are:

FIRST_NAME
----------
GEORGINA
DOROTHY
ANGELA

The results are returned in the order they are (I call it "reverse order") as that is the order the rows are in the second table in the statement. If I want the results in alphabetical order I would add an ORDER BY to my statement:

01  SELECT FIRST_NAME FROM TABLE1
02    INTERSECT
03  SELECT FIRST_NAME FROM TABLE2
04   ORDER BY 1


FIRST_NAME
----------
ANGELA
DOROTHY
GEORGINA

I can achieve the same results with a INNER JOIN:

01  SELECT A.FIRST_NAME
02    FROM TABLE1 A
03   INNER JOIN TABLE2 B
04      ON A.FIRST_NAME = B.FIRST_NAME
05   ORDER BY 1

The returned results are the same as the Selects with the INTERSECT.

FIRST_NAME
----------
ANGELA
DOROTHY
GEORGINA

 

You can learn more about the INTERSECT from the IBM website here.

 

This article was written for IBM i 7.4, and should work for some earlier releases too.

5 comments:

  1. INTERSECT eliminates duplicate records, so wouldn't the inner join be:

    SELECT DISTINCT A.FIRST_NAME
    FROM TABLE1 A INNER JOIN TABLE2 B
    ON A.FIRST_NAME = B.FIRST_NAME
    ORDER BY 1

    ReplyDelete
  2. Nice!
    I've always used the inner join to do this. Intersect is much more efficient... in writing the statement anyway.

    ReplyDelete
  3. Grazie per la dritta!!! Lo proverò sicuramente

    ReplyDelete
  4. Simon, great read. Never used the intersect option before only inner and outer. Look like it’s very interesting and straightforward. Thanks for sharing.

    ReplyDelete
  5. The statement with no ORDER BY is more correctly described to return the rows "in an undetermined order". When we upgraded from OS 6.1 to 7.2 some years ago we learned this the hard way, as the OS that had *always* returned an unordered SELECT in arrival sequence no longer did so, creating several problems for us where we had relied on that fact. This "unorderedness" has been documented for SQL, as far as I know, indefinitely. The golden rule is... if order matters at all, YOU must specify it.

    ReplyDelete

To prevent "comment spam" all comments are moderated.
Learn about this website's comments policy here.

Some people have reported that they cannot post a comment using certain computers and browsers. If this is you feel free to use the Contact Form to send me the comment and I will post it for you, please include the title of the post so I know which one to post the comment to.