Tuesday, November 5, 2013

Validating numbers without TESTN

I have received an email asking me the following question:

How can I validate numbers in RPGLE /free without using testn?

I have spent a considerable amount of time dealing with data from Microsoft Excel spreadsheets that has been uploaded to the IBM i (AS400). It is easy to use the TESTN operation code to test an ideally formatted value such as "00123456", but what about "123456  " or "1,234.56  "?

In this post I am going to use five fields of the kind you could encounter when trying to convert an alphanumeric field to a numeric field:

D A0              S              8    inz
D A1              S              8    inz('00001234')
D A2              S              8    inz('1234')
D A3              S              8    inz('1234.56')
D A4              S              8    inz('1,234.56')

The TESTN operation code offers three indicators to report back to us the results of the test:

  • First indicator position (71-72) - The field contains numeric characters. In the example below indicator 01 is used.
  • Second indicator position (73-74) - The field contains at least one leading blank, for example " 1234".
  • Third indicator position (75-76) - The field is blank. In the example below indicator 02 is used.
C                   testn                   A0                   01  02
C                   testn                   A1                   01  02
C                   testn                   A2                   01  02
C                   testn                   A3                   01  02
C                   testn                   A4                   01  02

So what happens when the TESTN is used to test our fields?

  • A0 - indicator 02 is on as the field is blank.
  • A1 - indicator 01 is on as the field only contains numeric characters.
  • A2 and A3 - no indicator is on as the blanks at the end of the fields fail the TESTN test.
  • A4 - no indicator is on as it contains an invalid character, the comma (,).

I have seen people use the Check Characters BIF, %CHECK, to test the characters in the field, although I wonder why they do.

   /free                           
     i = %check('0123456789.':A0) ;
     i = %check('0123456789.':A1) ;
     i = %check('0123456789.':A2) ;
     i = %check('0123456789.':A3) ;
     i = %check('0123456789.':A4) ;

The field i contains the position of the first character that is not in the list of allowed values, or zero if every character is in the list.

  • A0 - i = 1 as blank is not in the allowed list of characters.
  • A1 - i = 0 success! The field only contains numeric characters.
  • A2 - i = 5 as the fifth character in the field is blank.
  • A3 - i = 8 as the eighth character in the field is blank.
  • A4 - i = 2 as the second character is a comma (,).
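One reason to be wary of %CHECK on its own is that it only confirms that each character is allowed, not that the characters together form a number. The following is a minimal sketch of that gap, assuming a compiler level that accepts free-form declarations (IBM i 7.1 with the free-form definition PTFs, or later). The names Text, Pos and N are illustrative only and do not come from the examples above.

```rpgle
       dcl-s Text char(8) inz('1.2.3');
       dcl-s Pos  int(10);
       dcl-s N    packed(8:2);

       // Every character is a digit or a period, so %CHECK on the
       // trimmed value finds nothing to complain about
       Pos = %check('0123456789.' : %trim(Text));

       monitor;
         // Two decimal points do not make a valid number, so the
         // conversion fails even though %CHECK was satisfied
         N = %dec(Text : 8 : 2);
       on-error;
         dsply ('Not a number: ' + Text);
       endmon;

       *inlr = *on;
```

A value can therefore pass a character-by-character test and still fail the conversion, so the conversion has to be monitored anyway.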

I would use the Convert to Decimal BIF, %DEC, within a MONITOR group. For more information about the MONITOR group read the post MONITOR for errors in RPG. I am not going to repeat the code below five times; I will show just one of the conversions, as they would all be identical.

   /free                           
01   monitor ;
02     N = %dec(Ax:8:2) ;
03     on-error ;
04       dsply Ax ;
05   endmon ;

On line 2 I am converting the alphanumeric field to a numeric field of 8,2. If the conversion produces an error, line 4 displays the value of the alphanumeric field. I would never do that in a production program, where I would have proper error handling in place. But as this is a test program I am using the DSPLY operation code to show me the error.

  • A0 - error as blanks cannot be converted to a number.
  • A1, A2, A3 - success!
  • A4 - error as the comma cannot be converted.

To cope with the problems caused by A0 and A4 I change the code to this:

   /free                           
01   if (Ax = ' ') ;
02     N = 0 ;
03   else ;
04     Ax = %xlate(',':' ':Ax) ;
05     monitor ;
06       N = %dec(Ax:8:2) ;
07       on-error ;
08         dsply Ax ;
09     endmon ;
10   endif ;

Line 1 tests whether the alphanumeric field is blank, as A0 is. If it is, zero is moved to N.

Line 4 uses the Translate BIF, %XLATE, to translate the comma to a blank, if one is present. A4 is translated from "1,234.56" to "1 234.56", and then converted to a number on line 6.

The code above may look more complicated than the old TESTN, but as you can see it is a better way to validate and convert an alphanumeric representation of a number to a numeric field.
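If the same check is needed in more than one place, the logic could be wrapped in a subprocedure. The following is only a sketch under a few assumptions: the compiler accepts free-form declarations (IBM i 7.1 with the free-form definition PTFs, or later), and the names toNumber, Text, Result, Work, Amount and Valid are hypothetical ones chosen for this illustration.

```rpgle
       ctl-opt dftactgrp(*no);

       dcl-s A4     char(8) inz('1,234.56');
       dcl-s Amount packed(8:2);
       dcl-s Valid  ind;

       Valid = toNumber(A4 : Amount);
       if not Valid;
         dsply A4;
       endif;

       *inlr = *on;

       // Hypothetical helper: returns *on and sets Result when Text
       // can be converted, returns *off when it cannot
       dcl-proc toNumber;
         dcl-pi *n ind;
           Text   varchar(32) const;
           Result packed(8:2);
         end-pi;

         dcl-s Work varchar(32);

         Result = 0;

         // A blank or empty value is treated as zero
         if (%len(%trim(Text)) = 0);
           return *on;
         endif;

         // Replace thousands separators with blanks before converting
         Work = %xlate(',' : ' ' : Text);

         monitor;
           Result = %dec(Work : 8 : 2);
         on-error;
           return *off;
         endmon;

         return *on;
       end-proc;
```

The precision and decimal positions passed to %DEC must be literals or constants, so a helper like this is tied to one numeric size; a different size would need its own version.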

This article was written for IBM i 7.1, and it should work with earlier releases too.

38 comments:

  1. I remember that the old TESTN didn't work on negative numbers - so "-123N" would be valid with TESTN

  2. Here is a way to do this, Simon:

    If %CHECK('0123456789':Char_Field) = 0;
    // the character field is all numeric.
    Else;
    // there are non-numeric characters in the character field
    Endif;

  3. Simon,

    With RPG Free, the Monitor/On-Error/Endmon structure is also a great and simple way to perform validations whether it be a numeric field, date field. etc. For example:

    /Free
    Monitor;
    newNum = testNum;
    On-Error;
    newNum = *zero;
    Endmon;
    /End-Free

    Replies
    1. I agree with you about the MONITOR structure. So much so I wrote a post about it: MONITOR for errors in RPG

  4. //=============================================================
    // Test field for numerics
    //=============================================================
    d posn S 3 0
    d numbers S 12 Inz(' .0123456789')
    d wrkNumber S 11 2
    d inputNumber S 10 Inz(' $1,234.56')
    d error S n

    C/free

    // Option 1 - just test for valid numerics
    Monitor;
    wrkNumber = %dec(inputNumber:10:2);
    On-error;
    error = *on;
    EndMon;

    // Option 2 - remove non-numerics from field, then convert
    posn = %check(numbers:inputNumber);
    Dow posn<>*zero;
    inputNumber = %replace('':inputNumber:posn:1);
    posn = %check(numbers:inputNumber:posn);
    EndDo;

    wrkNumber = %dec(inputNumber:10:2);

    *inlr = *on;

    /end-free

    Note that the "numbers" field could contain the minus (-) sign to catch negative numbers, as long as the '-' is leading or trailing. If editing numbers like tax IDs or phone numbers, which may contain embedded hyphens, you'd want to leave the minus sign out of the %check operation.

  5. i am using monitor function.
    monitor;
    decValue = %dec(charValue);
    on-error;
    ....

  6. Alakesan Vyravippillai, November 6, 2013 at 10:12 PM

    Monitor ;
    IntNum = %Dec(CharNum:6:0);
    On-Error 105; // Error code 105 is active if it is not number
    IntNum = 0 ;
    @Error = 'Y' ;
    EndMon ;

  7. Before I knew about the TESTN statement, I used MOVEA to put the field to be tested into an array with as many elements as the field has characters. Then I looped through the array, checking each element with a LOKUP statement against a table containing "0123456789". If every element of the array was found in the table, the field was numeric; otherwise the field was to be evaluated as alphanumeric.

  8. Might be helpful to write a follow-up to this conversation with an example of a function that wraps this logic for generic number checking with any length

  9. Very often, the test for 'is this a number' precedes converting that string to a numeric field. Many times, this is some sort of data import from an external source, say a CSV file. Sometimes, the CSV file will have a string like '$12,345.67'. Pretty much all of the ways to test for numeric will indicate that this is not a number, and yet we want to extract the number to put in our database.

    In 2000, Barbara Morris published a routine to extract the number from a string which allows one to specify a currency symbol, decimal and thousands separator symbols. http://wiki.midrange.com/index.php/Character_to_number Mel Rothman and Giovanni Perotti have numeric conversion routines in CGIDEV2 as well (free and open source).

  10. That brings up a good question. Does the FREE FORM (monitor) method for test numeric have any issues with negative numbers? ......I would test this myself--but being unemployed...I don't have an iSeries in front of me. Can someone please test this?

    Replies
    1. I have tested with the minus sign at the front, '-123.45', and at the end, '123.45-', and the numeric field I moved them to contains -123.45

  11. Alakesan Vyravippillai, November 7, 2013 at 11:05 PM

    monitor;
    $Amt = %Dec( '1,234.56' : 9 : 2 ) ;
    on-error 105;
    $Amt = 0 ;
    endmon ;

  12. How about you read Simon's article before posting the same solutions he already provides?

  13. with the enhanced free format, you don't have to use /free and /end-free at all. therefore, you'll be able to mix free and fixed "free"ly in calculations.

    So mixing in a TESTN will not be so ugly as it would have been before.

    Coming soon to a v7.1 near you!

    Replies
    1. I don't think it works like that. There is no TESTN statement in free format so you can't use it. If that were the case there would be no need for the character and numeric bifs. Just continue to use MOVE and MOVEL.

    2. Sorry, Paul, I wasn't clear enough.

      With the new support, there is no need anymore to use /free - therefore, you can use both free and fixed format one after the other without the extra directives.

      // some free form stuff
      variable = value;
      // a fixed form statement
      C TESTN ... whatever
      // more free form stuff
      var2 = val2;

    3. According to the updated ILE RPG Reference, "If you code the /FREE or /END-FREE directive, it will be ignored,"

      To be more specific, here are a couple items from that manual -

      The /FREE and /END-FREE directives are no longer required. The compiler will ignore them.

      Free-form statements and fixed-form statements may be intermixed.

    4. That's going to lead to some horrible and ugly code!

    5. Au contraire, mon frere! Pardon my horrible French!

      I am of a different opinion - when non-free is needed, it will be much cleaner than it was. For most of what we do, we will need only free form, of course.

      Barbara mentioned a specific item that requires fixed form - a TAG as the target for the WHENEVER clause in SQL. We can't even use a subroutine or subprocedure for that!

      And we can always make a subprocedure, to contain the majority of our fixed-form stuff, as needed. This kind of test is a natural for that, eh?

    6. Holy Cow! The /FREE directive is now optional? Well, it looks like one step forward and one step back.

      If they really were concerned about TAG as target for SQL's WHENEVER clause, surely they could have come up with a better solution. Many other languages use a colon to identify a label. There's no reason not to allow the same syntax in RPG. But why the concern now? RPG has had embedded SQL for longer than /FREE.

      Back when free-form calcs were designed, feedback from RPG programmers heavily influenced the design of the feature. One concern from programmers was that it should not be too easy to switch between fixed and free form calcs. Thus, one of the reasons for /FREE was to provide a speed-bump, to make it inconvenient to switch willy-nilly between fixed and free form.

      Regarding TESTN to test if you have numeric data, please please please read the RPG Reference manual very carefully. As Simon points out in his article, TESTN does not do what you want. You might be surprised that TESTN treats "123M" as a valid numeric. Likewise, "-123" is not considered by TESTN as valid numeric.

    7. Holy compiler directive, Batman!!

      As to getting something to really replace the opcodes that didn't make it, Barbara did lobby for some, like TAG, for the reason that there is no free-form way to specify the target of a WHENEVER. But time and money and all that - you know the routine when doing development. There were reasons that those nice MOV* opcodes weren't included - I still wish some of them could have been. NOT CAB*, thank you very much!!

      The point of my last comments was to clear up something Paul Wren said - the question of using TESTN was not germane - and the cautions are well-taken!

    8. Vernon: Don't forget, I was there at the time. TAG wasn't included in /FREE simply because GOTO wasn't included. We knew about SQL's WHENEVER, but it wasn't considered a compelling enough reason to keep TAG. If I recall correctly, I think the intention was to include a tag using a colon syntax at some point. But it would appear that that thought was forgotten somewhere along the way. That would not be very difficult to implement.

      The MOVEx opcodes were excluded simply because they have goofy semantics. (A lot of the old RPG opcodes sort of make sense but only if you understand the old 360 instruction set.)

    9. Yeah, Hans - Barbara mentioned something about how the compiler reads things that makes some things unusable, I'll say - I forget the details, not important now, I suppose. Something about the order of the keywords and all, and assumptions made around that..

      This newer free format offers to me a nice, cleaner syntax for the occasional use of fixed form - not as jerky as using the /free /end-free - that just clutters up things. Nice that they are gone, IMO.

      this is getting into minutiae - I'm just glad to have more free-form options.

    10. Vernon: First, there should be very little need for any of the opcodes excluded from free form calcs. All old-style opcodes have alternatives. The next step in the evolution of RPG (if there ever is to be one) would be a fully free-form language, with all fixed form features fully deprecated. They could start that process now with a compiler option, allowing statements starting in column 1, with no limit to line length. In fact, the free-form calcs were initially designed with that eventual possibility in mind.

      Second, even if your use of fixed-form calcs is limited, having to deal with the /FREE and /END-FREE directives shouldn't be that much of a nuisance. I can understand the dislike of the need for the directives before free-form P, D, H, and F specs. But they're free form now, so the vast majority of new programs won't even need the directives.

      Third, I know how the compiler works, I can't see any reason for not having a colon syntax for tags, like other programming languages. You read an identifier, which you do anyways for any statement, then read the colon. Bingo! You define the identifier as a tag name and move on.

    11. Hans. I totally agree that removing the /free constraint can be seen as a step forward by some, and a backward step by others. The /free and /end-free eye sores forced programmers to keep the coding straight. I am now afraid spaghetti logic will prevail and ultimately kill RPG, with usage of not only MOVE, MOVEL etc but also GOTO.

      On the other hand it has hastened my desire to shift to Java as I am now sure of RPGs demise before my retirement :)

    12. Hassan, I can't disagree with you more. I don't know how having the eyesores helps prevent spaghetti code. Personally, I want the code to flow, to be clean - and I will not often mix free-form and fixed-form - but I will when it helps me. The SCAN fixed-form does something that %scan doesn't that I'm aware of - puts the positions into an array - that is VERY handy, so I'll use it. I might put it into a subprocedure, however, to keep the flow smooth in the main processing area.

      I in NO way am saying we should use GOTO and TAG and all that. Please do NOT even begin to think that.

      I do say that IF I do need a fixed-form statement, I can do it without the extra directives, which provide no help at all in reading the code, they only distract me.

      If someone wants to write spaghetti code, they will - no matter what the language encourages. If you want to write nicely structured code in RPGLE, you can, just as you always have been able to.

      I agree generally with Hans that there are free-form alternatives for the fixed-form opcodes that were excluded. I will admit that at a previous position, in old code, with rather "interesting" uses of MOVE and MOVEL, I was reluctant to change that to free-form - that was more a matter of the time it takes, and often there just wasn't time to be sure the result was the same. Quality assurance matters, right?

      So Hassan, there is nothing in the new free-form that will encourage less-than-desirable programming practices. Tell you what, get the PTFs when they are available - maybe by the 15th - try the thing out - if you want to keep using the directives, you can - just as you could put SR at the beginning of a BEGSR spec if you want, it is just ignored.

    13. Vernon: Then you're missing the whole point of free-form calcs. And I'm rather surprised I need to make this point at all. The reason people use free-form calcs is so they can indent statements to properly show the structure of the code. The advantages of indentation are so stunningly obvious that every programmer in every other language does it. However, if you mix free-form and fixed-form calcs, you lose that. It's not a matter of spaghetti code. It's a matter of clarity. This was understood by all of us a dozen years ago when we implemented /FREE. This was exactly why we wanted a speed bump a dozen years ago, to make it more difficult to mix fixed and free calcs.

      By the way, I'm not advocating for the addition of the GOTO opcode to free-form calcs. However, with SQL's WHENEVER, there does seem to be a need for a tag in free-form calcs. The one does not necessarily require the other.

      (On the other hand, it's hard to argue against GOTO in a language that also has pointers, in my opinion. I consider pointers the "goto" of data structures. But that's another argument.)

    14. Slackers and Amateurs use GOTOs

      One "Brilliant Developer" said that is what a subroutine does. In the underlying machine code, it uses a GOTO. And he was doing "Advance Coding" just like MI (Machine-Instruction). See how "BRILLIANT" he is!!!

      Every College teaches the Evils of GOTOs. I guess some fell asleep during that class....

      You cannot fix IGNORANT developers. Just force them to maintain their own code! It seems to work most times!

  14. When converting character data to numeric, be sure you understand the format your character data will be in, and use the appropriate test for that format.

    TESTN tests whether a character field can be MOVEd to a numeric field properly, so it expects zoned-decimal format: all characters numeric, except the last character can also be x'C0'-x'C9' for positive numbers or x'D0'-x'D9' for negative numbers. (TESTN can also accept leading blanks since blanks become zeros when MOVEd.) TESTN considers '123N' numeric because it's the zoned decimal representation of -1235 and that's the value you'd get if you MOVEd '123N' to a numeric field, but TESTN considers '-1,235' non-numeric (the comma can't be MOVEd, and the minus sign wouldn't make the number negative when MOVEd).

    RPG IV functions like %DEC are more tailored for the latter format, removing editing characters like commas and automatically handling negative signs and decimal points - but not zoned-decimal representations of negative numbers.

  15. GOTO/TAG, >>>by the lack of something like Java labeled breaks<<< (!), can make the code more readable in error conditions.

    For example:

    chain
    if %found;
    // check value.
    if not "wrong value"
    ...
    if ok;
    // here is the interesting code, n levels deep.
    endif;
    ...
    endif;
    endif;

    too many nesting levels.


    A cleaner alternative:

    dou 0=0;

    if not %found;
    leave;
    endif;
    if "wrong value";
    leave;
    endif;
    if not ok;
    leave;
    endif;

    // interesting code

    enddo;

    Much cleaner: escape to end of block when error.


    However, if you edit this code, and you add another nested do-block (e.g. a loop) the "leave" opcode may not escape to the right point (always the inner block).

    With a labeled leave opcode, you can *explicitly* state the block to escape, so no problems changing the code.

    For this reason, i want my GOTO/TAG back (and trust me i don't use it to just jump around).

    With or without GOTO/TAG, bad programmers still create bad code. Just removing it won't help much. Besides, 99% of RPG code in production is spaghetti.

    It's not that simply not using GOTO/TAG means you always do structured programming. You can do structured programming with GOTO/TAG (or any programming language). It's a mind set. Code is divided in blocks, and each block has exactly one entry (the top), and one exit (the bottom). What syntax you use for this does not determine whether it's structured or not.

  16. In my opinion, if you're nesting statements more than three or four levels deep, your first task is to restructure your code to reduce the nesting. With shallow nesting, the temptation to use GOTO will diminish.

  17. Everyone - I regret that so many picked on the TAG statement I made - and this descended into the maelstrom from there!

    The ONLY thing I was speaking of was that TAG is still needed in the very narrow context of the embedded SQL WHENEVER construct.

    Peace out - and that's the end of my part in this thread. Cheers, y'all!!

    Replies
    1. Vernon. Let me explain. Companies do have a lot of legacy code. In fact in a typical RPG shop there is many times more RPGII style legacy code than the Java style code. Now RDPi allows conversion of a lot of statements to /free, but not all of them. Especially when you use the fabulous style of converting date with a combo of MOVE & MOVEL. The eye sore of /free and /end-free forces to spend some time and convert that code into modern code using date functions. With /free & /end-free gone, developers would be tempted to not invest time in the rewriting the code but just delete the /free and /end-free directives and get rid of the eye sores.

    2. Hassan - I see your point. I think that the different look of fixed-form specifications is a sufficient eye-sore, so that I would not need the directives to be encouraged to look at changing them.

      There was some really odd - to me, anyhow - odd use of MOVE and MOVEL at an earlier position of mine, and I just wasn't comfortable nor had the time to make the change.

      So I see how you would use these directives - me, I prefer to be rid of them completely - it's good we are both here, right?

  18. Once I encountered a situation where TESTN recognized as valid the single characters A-F - maybe it thought it was Hex.

  19. Blanks are being taken as zero in a field. I handled it with the VALNUM and CHECK(RZ) keywords, but the system is throwing "Invalid character sequence in numeric only field". Can we show a user-specific error in place of this?

    Please suggest

    Replies
    1. If you want the validation done in the display file, use CHECK(RZ) and VALNUM.

      If you want to control the error checking then you will need to remove VALNUM, and handle the validation in your program.
