Equivalent of SETLL using SQL @ RPGPGM.COM

Tuesday, September 3, 2013

I received several messages and comments in response to my post Validation: CHAIN versus SETLL informing me that I can do the same in SQL.

Alas, no one sent me any examples of how they would do it. Therefore, after some Googling, I found a way.

In this example I will perform two "SETLLs" to a file, ORDFILE, looking for a match to the Order Number field, ORDNBR.

In the first scenario I use a valid Order Number and in the second an invalid one.

01 D OrdNo1          S              7    inz('MSFH540')
02 D OrdNo2          S                   like(OrdNo1) inz('???????')
03 D i               S              5I 0
04  /free
05     exec sql select 1 into :i
06        from ordfile where ordnbr = :OrdNo1 ;

07     exec sql select 1 into :i
08         from ordfile where ordnbr = :OrdNo2 ;

Lines 1 and 2 define the work fields I will be using for the test order numbers.

On line 3 I have defined an integer field, i, that will be used to flag whether the "SETLL" was successful or not.

Lines 5 and 6 are the SQL statement that performs the look up matching the value in the field OrdNo1 to the field ORDNBR, which is successful and 1 is moved to i.

Lines 7 and 8 perform an unsuccessful match. As there is no record in ORDFILE with a key that matches, no value is moved to i. If I then test whether i is equal to 1 to see if the match was successful, I get a false positive, as i remains unchanged and still equal to 1 from the first statement.

To overcome the possibility of false positives I would insert two new lines to move zero to i before performing the SQL statement, see below:

01 D OrdNo1          S              7    inz('MSFH540')
02 D OrdNo2          S                   like(OrdNo1) inz('???????')
03 D i               S              5I 0
04  /free
05     i = 0 ;
06     exec sql select 1 into :i
07        from ordfile where ordnbr = :OrdNo1 ;

08     i = 0 ;
09     exec sql select 1 into :i
10         from ordfile where ordnbr = :OrdNo2 ;

Now when the unsuccessful "SETLL" is performed i is equal to zero.
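One way to package the reset-then-select pattern is a small subprocedure, so the flag cannot carry over from an earlier lookup. What follows is only a sketch under a few assumptions: the procedure name OrderExists is made up for illustration, ORDNBR is assumed to be a 7 character field as the test values suggest, and the free-form declarations need a compiler level newer than base 7.1 (7.1 TR7, as far as I am aware).

```rpgle
       ctl-opt dftactgrp(*no);   // subprocedures cannot run in the default activation group

       dcl-s OrdNo1 char(7) inz('MSFH540');

       if OrderExists(OrdNo1);
         dsply 'Order found';
       else;
         dsply 'Order not found';
       endif;

       *inlr = *on;
       return;

       // Hypothetical helper: returns *on when ORDFILE holds the order number
       dcl-proc OrderExists;
         dcl-pi *n ind;
           OrderNumber char(7) const;
         end-pi;
         dcl-s i int(5) inz(0);   // automatic storage, so it is zero on every call

         exec sql select 1 into :i
                    from ordfile where ordnbr = :OrderNumber;

         return (i = 1);
       end-proc;
```

Because i is a local field it starts at zero each time the procedure is called, which takes the place of the explicit i = 0 lines.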

Do you know a better way? If so add a Comment, below, describing how you would do it.

This article was written for IBM i 7.1, and it should work with earlier releases too.

38 comments:

Alok Kumar, September 3, 2013 at 2:55 PM
Thanks for this post

Anonymous, September 4, 2013 at 6:15 AM
(John Blenkinsop, via LinkedIn AS400 Specialists Group)

I'm running a test of the CHAIN, SETLL and SQL methods at the moment on our 'test' machine. All methods are very similar to your blog examples. Test makes one million iterations over 500,000 records so that half are hits.

Preliminary results of the first set (of ten) show :

CHAIN 42,280,000
SETLL 39,069,000
SELECT 264,819,000

Values are milliseconds.

The SQL method is far slower, the program is 48KB bigger, and in my opinion the code is much more obscure. I'm all for SQL in its place, but its place is not record existence validation in RPG programs.

Alok Kumar, September 4, 2013 at 7:08 AM
We can also use
select count(*) into :i from where

and after this statement we can check the value of i.

If i > 0, a record exists in the file for the condition in the WHERE clause.

Christopher Burns, Sr., September 4, 2013 at 7:08 AM
I have often used "SELECT COUNT(*) INTO :MYVAR WHERE...", and then tested whether MYVAR was greater than zero. The method you described is probably more efficient.

I wonder if the precompiler would tolerate "SELECT '1' INTO :MYVAR WHERE...", if MYVAR were an indicator variable. That might help the subsequent logic read a little better.

Hugh Brady, September 4, 2013 at 7:11 AM
I've used select count(*) in the past but I think you can do something with a "where exists" clause that would ensure the query returns after finding the first occurrence of the record. The main advantage to thinking in terms of SQL is not necessarily the benchmark speed of every query. It’s more about all the flexibility of the DDL and DML statements and all the enhancements IBM continues to make to the SQL query engine. I certainly don’t hate native I/O, it’s probably still better for some applications. Also, I wouldn’t suggest replacing or rewriting things that work. If it’s not broken, then don’t fix it.

Anonymous, September 4, 2013 at 8:17 AM
(John Blenkinsop, via LinkedIn AS400 Specialists Group)

I've tested Hugo Cantor's method (COUNT) against Simon's select into method (SELECT). Hugo's method is below:

select count(*) into :i
from sysibm.sysdummy1
where exists(select key from testfile where key = :keyfld);

The times for the tests (I only did two of each, interleaved in a batch job) are:

SELECT 263,640,000
COUNT 437,418,000
SELECT 263,214,000
COUNT 440,169,000

Note that the COUNT method is almost twice as slow. I noted that SQL methods opened more than one SQL view of the data file. In the COUNT method, the sysdummy1 file was, of course, also opened, and had 2 million IOs for 1 million from one open of the test file. This would contribute to the increased run time.

Despite Hugh Brady's optimism regarding the flexibility of DDL and DML, and the continuing enhancement of the engine, at some point you have to look at which tool is best for the job in hand, and using embedded SQL just to check a record's existence is slow, unintuitive and vastly inefficient.

Anonymous, September 4, 2013 at 9:36 AM
Any time you are using count(*) you must process the entire file/table to get the answer. I prefer the following for sql:

select '1' into :NamedRecordFoundIndicator
from sysibm.sysdummy1
where exists(select key from testfile where key = :keyfld);
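In RPG the statement above could look like the sketch below. It assumes that the SQL precompiler accepts an RPG indicator field as a one character host variable, which is worth verifying on your own release. The field name OrderFound is hypothetical, and ORDFILE and ORDNBR are borrowed from the article. The flag still has to be reset first, since no row is returned when the EXISTS test fails.

```rpgle
       dcl-s OrdNo1     char(7) inz('MSFH540');
       dcl-s OrderFound ind;

       OrderFound = *off;      // nothing is returned on a miss, so reset first
       exec sql select '1' into :OrderFound
                  from sysibm.sysdummy1
                 where exists (select 1 from ordfile
                                where ordnbr = :OrdNo1);

       if OrderFound;
         dsply 'Order found';
       else;
         dsply 'Order not found';
       endif;

       *inlr = *on;
```

SYSIBM.SYSDUMMY1 holds a single row, so the outer select returns at most one row however many orders match.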

Sharon WK, September 4, 2013 at 10:27 AM
When I was first learning SQL, I was told that using "select count(1) ....." instead of "Select count(*) ...." was more efficient for validation checking, because using the wildcard "*" retrieved all the fields whereas using a number simply counted the occurrences of the records, possibly from the indexes if present.
I have never been able to verify that with statistics, however.

Hugh Brady, September 4, 2013 at 2:50 PM
I feel like John is missing the point of the SQL argument and inventing his own counterargument. SQL is better because it’s portable: you can run the same query on any database that conforms to the SQL standard. It’s easier to do unit testing because you can run your queries interactively. It’s easier to connect to remote partitions and systems (potentially non-DB2) using SQL-connect, JDBC and so forth. It has better support for transaction control. It’s more human readable, especially to non-IT folk. It supports relational integrity constraints within the DBMS. It has more advanced indexing options like encoded vector indexes. Most importantly, it allows for a loose coupling between the data definition layer (tables), the data presentation layer (views, procedures and functions) and your application layer (queries and code). You might have the fastest native I/O program in the world but you’ll still need to modify and recompile it every time someone wants to add a column to your database file. Again, the point is not to replace native file operations with SQL and then have a pissing contest. The point of SQL is to add an additional layer of abstraction between any programming language (RPG, C, and Java) and its data source. I realize it’s too often abused and I think that’s the underlying cause for a lot of misperception.

This trivial and contrived performance debate between SETLL and CHAIN is utterly moot. I’ve made a good portion of my living fixing performance issues in System i environments for the last decade, and never once has the solution to a performance related problem been to use SETLL in place of CHAIN.

Anonymous, September 4, 2013 at 11:06 PM
In SQL scripts (stand-alone or embedded in RPG) I use the SQL EXISTS clause for checking the presence or absence of a key value.

In RPG I almost always use SETLL with the %EQUAL BIF. I use the EXISTS in a SQL statement for record selection, where the record selection from one table depends on whether or not something is found in another table.

I would think that having SQL count all records would be more time consuming than just seeing if a key value exists, either with EXISTS or SETLL and %EQUAL.

Rocky Marquiss, September 5, 2013 at 4:54 AM
Without testing I'd suspect there's still a performance hit with an aggregate function (COUNT) over just selecting a value - in other words I'd replace select count(*) into :i with select 1 into :i - it's a simple assignment.

However - you probably noticed I did prefix my statement with 'IF I were to use SQL...' - I'm a believer of using the right tool for the job - IF the right tool is SQL - use it. IF the right tool is native I/O (SETLL, CHAIN, etc) - use it. We all too often try to use pliers when a hammer is called for...

Rajendra Sharma, September 5, 2013 at 4:59 AM
I have used EXISTS, which is much faster in SQL, as it only tests whether a record exists.
SETLL does the same thing: it returns whether a record exists.
If you want, I can provide some generic statements that I have used in my project.

John Trevino, September 5, 2013 at 11:03 AM
In my experience, SQL, CHAIN, SETLL each has its place depending upon the environment and accessed files. Having also benchmarked each method for use in a high throughput online environment during benchmark testing with IBM, I found native IO calls faster (SETLL and then CHAIN) when checking existence for a single file record. The data suggested that the overhead associated with SQL makes it a better fit for situations involving simultaneous multiple file access/queries.

Tai Wong, September 6, 2013 at 2:48 PM
Simon, you may offend the SQL purist by sandwiching the SQL within RPGLE. You are a sly fox, in that you already know the answer will performance wise be a no-contest. It is like pitting RAM against a Hard Drive for retrieval, fundamentally from an architectural view point. However, if you want bees, just stir up the hive.

Vernon Hamberg, September 7, 2013 at 8:32 AM
I would think a combination of a WHERE clause and ORDER BY in descending order.

This presents a situation not easily handled in SQL, seems to me. I've worked with this kind of thing in writing an Open Access handler for RLA opcodes. I'd not considered the SETLL/READPE combination - only SETLL/READP and SETGT/READPE.

Problem is, how do you know which direction you are going?

Maybe another answer is a brute force method - prepare the SELECT and open it, then FETCH from start until you get a match where you are currently, then FETCH back - or save the record at each level - YUK! So much IO!

Ideas! Ideas! Ideas!

Federico Cambero Fenoy, September 9, 2013 at 9:30 AM
I believe that programming should be clean, streamlined and robust, and I believe that in these cases using embedded SQL is not the best option, as the language itself gives you the solution.

Christopher Burns, Sr., September 30, 2013 at 10:01 AM
I have followed this thread with significant interest. When I was first developing the Inuendo open source (http://inuendo.us), one of the concerns I had with the GET functions was performance in a very high volume environment, because an SQL SELECT INTO was being used instead of a SETGT/READPE for time index sensitive searches.

The reason I took that approach was so porting the open source code to other platforms would be simpler, and so non-RPG programmers who wanted to use the same techniques in other languages would have an easier time.

This conversation has enticed me to do a high volume benchmark test when I can carve out some time. Stay tuned.

mlevin, October 16, 2013 at 8:49 AM
I know I am a bit late for this debate, but yesterday I had to use SQL to check for existence rather than the traditional SETLL, because the files in question did not have any keys set up for the field I was checking, and going the traditional way would have required me to read these files record by record from beginning to end. SQL did the job in two lines of code.

Wayne Groulx, November 26, 2013 at 7:02 AM
It is amazing to see the various solutions provided within this post. Whether you use SQL or RPG, a lock is issued against the record (if one exists) before you can delete it. Birgitta is also correct in her statement that you must be careful, and what some are suggesting with the use of SQL to get a count is madness for performance.

Imagine reading a 10 million record file to find how many records fit your criteria and then asking the system to do it again to perform the deletion.

This code will allow you to avoid doing a read prior to your delete clause. (RPG Free)

setll (Key) Filename;

// DELETE with a key sets %FOUND, not %EOF, so loop until no record is found
dou not %found(Filename);
  delete (Key) Filename;
enddo;

Please note that your test for deletion must be on the key of the file. In other words, if you have to insert an IF clause into that logic, you must read, test, then perform the delete. This makes your question moot, and using SQL may be more efficient.

SQL optimization is based on the key access paths of views (Logicals) and parent (Physical) files. When you use an SQL fetch it locks the row (record) to allow the deletion.
I'm not sure what benefit is gained in performance by determining whether a record exists prior to executing a "Delete from file where" statement. The SQLSTATE after the statement has run will tell you whether the record existed, and you incur the performance hit only once instead of twice.

Hope this helps.
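The delete-then-check idea described above might be sketched as follows. This is an illustration only: it reuses ORDFILE, ORDNBR and the test order number from the article, it assumes the program's commitment control setting suits the file, and it relies on SQLSTATE '02000' being the "no data" state that a searched DELETE returns when no row qualifies.

```rpgle
       dcl-s OrdNo1 char(7) inz('MSFH540');

       // One statement does both the existence check and the delete
       exec sql delete from ordfile where ordnbr = :OrdNo1;

       select;
         when sqlstate = '02000';                  // no row matched, nothing deleted
           dsply 'Order not found';
         when %subst(sqlstate : 1 : 2) = '00';     // successful completion
           dsply 'Order deleted';
         other;                                    // a warning or an error
           dsply ('Delete returned ' + sqlstate);
       endsl;

       *inlr = *on;
```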

AVROHOMN, January 5, 2017 at 7:32 AM
Wrong. If you run the example SQL twice, once so that it finds the record and a second time so that it does not, it still keeps a 1 in variable i.

Better to use SQLCODE. 100 = Not found. 0 = Found. A negative number means the SQL failed.

Besides that, using i as a variable name is not good practice. If you have a big source member it will be hard to search for it. Why not use a more meaningful name: Found, or even better FoundMyfile.
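The SQLCODE test suggested above can be sketched like this. The field names Dummy and FoundOrder are hypothetical, ORDFILE and ORDNBR come from the article, and the free-form declarations assume a compiler level that supports them.

```rpgle
       dcl-s OrdNo2     char(7) inz('???????');
       dcl-s Dummy      int(5);
       dcl-s FoundOrder ind;

       exec sql select 1 into :Dummy
                  from ordfile where ordnbr = :OrdNo2;

       select;
         when sqlcode = 0;       // a row was returned
           FoundOrder = *on;
         when sqlcode = 100;     // no row satisfied the WHERE clause
           FoundOrder = *off;
         other;                  // positive = warning, negative = error,
           FoundOrder = *off;    // e.g. -811 when more than one row matches
           dsply ('Unexpected SQLCODE ' + %char(sqlcode));
       endsl;

       *inlr = *on;
```

Testing SQLCODE means the result no longer depends on what the host variable held before the statement ran, so the reset to zero is not needed.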

elhay, December 9, 2018 at 3:17 PM
I almost never use SETLL and instead I use the SQL way,
at least when performance isn't the main issue (and it's not 95% of the time).
The main reason for me is that if I use SQL I don't have to declare the file in the F spec and I don't have to remember which logical file has the key I need.
SQL is just more comfortable to use.