Wednesday, July 30, 2014

Easy way to convert CSV file to DDS file

csv excel cpyfromimpf cpytostmf

I am sure we have all been given a Microsoft Excel spreadsheet and told to upload the data from it into IBM i files. The first thing I do with the Excel spreadsheet is to save is as a CSV file. CSV stands for comma-separated values, the data elements, or columns, can be of varying length and are separated with a comma. It could be described as an Excel file without all the special formatting. If you have a Windows PC and you right-click on a CSV file, select "Open with", and then "Notepad" the file will look like this:

First,Second,,"4,444",5.5
First field,Second field,Third field,4.4,"5,555,555"

In conversations with other IBM i folks I am finding that the "Send file to host" function of Client Access is being shut off on their IBM is as it is rightly regarded as a security risk. What can be used as an alternative?

I am going to describe two scenarios in this post:

  1. CSV file is in a folder in the IFS
  2. CSV exists in a “flat" file in the IBM i


CSV file is in a folder in the IFS

A user received an email with an Excel spreadsheet. They saved the Excel to their local drive, opened it and saved it as a CSV file with a given name, then copied the CSV to a shared drive on their PC, which is mapped to a folder in the IFS.

In this example the IFS folder is called "Transfer" and the CSV file is called "test.csv". I am going to copy it to a file called "OUT_FILE" in the library "MYLIB". I am using the CSV data shown above. The DDS for "OUT_FILE" looks like:

01 A          R OUT_FILER
02 A            FLD1          15A 
03 A            FLD2          15A 
04 A            FLD3          15A 
05 A            FLD4          10A 
06 A            FLD5          10A

Even though the fourth and fifth columns of the CSV are numeric I still define the fields as alphanumeric. I will explain why I do below.

02 CPYFRMIMPF FROMSTMF('/Transfer/test.csv') +
                TOFILE(MYLIB/OUT_FILE) RCDDLM(*CRLF) +
                RPLNULLVAL(*FLDDFT)
 
03 RMVLNK OBJLNK('/Transfer/test.csv')

The CPYFRMIMPF is the command to copy this CSV file to the DDS file.

The FROMSTMF parameter contains where the CSV file is.

The TOFILE parameter contains the library and file I want to copy the data to.

The RCDDLM parameter gives the code that is the record delimiter. In most cases *CRLF (carriage return and line feed) works.

I do not have the specify the field delimiter, FLDDLM parameter, in this file as the fields in the CSV are delimited with a comma, which is the command's default. I will give an example later in this post of using another delimiter character.

The CSV file that is being copied contains a null value, which occurs when a column contains no value. The third column in the first record of the CSV file is empty, the two commas are next to each other with no character separating them, this is null. I find converting null values when copying the file from the IFS is the easier that either having to use null indicators in your RPGLE code, or remembering to compile your RPGLE program with the ALWNULL set as either *INPUTONLY or *YES. The RPLNULLVAL(*FLDDFT) replaces any null values in the CSV file’s data with the default value in the DDS file. Which in this case will be a blank as FLD3 is alphanumeric.

After the CPYFRMIMPF command is run I delete the CSV file from the IFS using the RMVLNK command.

If I had defined FLD4 and FLD5 as numeric I would have received an error when performing the CPYFRMIMPF command as the fourth column in the first record and the fifth column of the second record contain a thousand separator character (,). This will have to be removed before the value in those fields can be converted to a numeric field.


CSV exists in a "flat" file in the IBM i

In this scenario a CSV file has been copied using FTP to a file in the IBM i and created what I call a "flat" file. In this case a pipe ( | ) has been used as the column separator.

01|18 inch wide broom|34.9
02|3 gallon bucket|6.78

As I used the IFS in the previous example I am going to use the QDLS file system in this one. I find there is no advantage to using the IFS or the QDLS.

This process needs to perform two copies:

  1. Copy the “flat" file to the QDLS (or IFS).
  2. Copy the file in the QDLS (or IFS) to the DDS file.
01 CPYTOSTMF FROMMBR('/qsys.lib/mylib.lib/in_file.file+
                        /in_file.mbr') +
               TOSTMF('/qdls/filetran/test.csv')

02 CPYFRMIMPF FROMSTMF('/qdls/filetran/test.csv') +
                TOFILE(MYLIB/OUT_FILE2) RCDDLM(*CRLF) +
                FLDDLM('|') RPLNULLVAL(*FLDDFT)
   
03 RMVLNK OBJLNK('/qdls/filetran/test.csv')

The CPYTOSTMF is the command to copy the “flat" file to the QDLS.

The FROMMBR parameter looks a bit strange. I have to give the path for the file’s member that contains the data I want to copy. This naming convention is called name.object-type. The first part is qsys.lib, the library QSYS is the equivalent of the root folder for the IBM i. My library, MYLIB or mylib.lib in this format, like every other library is found within QSYS. The file that contains the member is IN_FILE which needs to be given as in_file.file. And finally the member needs to be given, in this case the member has the same name as the file, and it is given as in_file.mbr.

The TOSTMF parameter is where I want the data copied to. In this example it will be copied to a file, test.csv, in the QDLS folder filetran. The QDLS is like an old DOS environment with its limits for the length of folder and file names.

Now the file is in the QDLS I need to copy it from there into the DDS file, OUT_FILE2. I use the CPYFRMIMPF as I did in the previous example. The only difference is here I have to use the FLDDLM parameter as the field delimiter is the pipe symbol.

And, as I did in the previous example, I delete the file from the folder.


Gotchas

What do I mean by gotcha? These are problems that can cause you problems you need to be aware of before using CSV files.

Column separator characters. I prefer not to use the comma. I have had situations where the data in the column was not enclosed by double quote ( " ) and a comma was contained in the data. In the example below the comma in the first record will be included in the data for the second field. The comma in the second record is recognized as a column separator, and the second record now contains an extra column.

1208,"Calendar 2014, stating in January",3.10
1219,Calendar 2014, starting in April,3.10

Double quote. The symbol or abbreviation for inches can be the double quote symbol. If the double quote is used for inches it will cause issues as the conversion process will regard it as the start or end of a string of text. You have to insist that whoever sends you file will use "in" as the abbreviation of inches.

01|"18" wide broom,black"|34.9 
02|"3 gallon bucket,blue"|6.78


You can learn more about these from the IBM web site:


This article was written for IBM i 7.1, and it should work with earlier releases too.

9 comments:

  1. "...I find there is no advantage to using the IFS or the QDLS."

    Simon - a couple of notes:

    1) Just a nomenclature: QDLS file system is one file system in the Integrated File System. They are not two separate things.

    2) Don't use QDLS if you can avoid it. QDLS is out there for compatibility with DOS. It is extremely slow for most operations compared to other areas of the IFS.

    ReplyDelete
  2. Simon, freaking awesome! I was just going to upload a file using the CA file transfer and this is SOOOO much easier. Thank you. Ann Richmond

    ReplyDelete
  3. Simon, I forgot to add your gotchas are spot on. I have found " for inches and commas within field very often. I tend to use; in lieu of comma, but | is a good idea . thanks Ann

    ReplyDelete
  4. Simon, my preferred separator is *TAB when saving Excel files to the IFS. Just specify FLDDLM(*TAB). But Excel is funny, not in a laughing sense, as it still wraps column data that contains a comma (,) with double-quotes.

    ReplyDelete
  5. I use an "add in" in MS Excel that is undoubtedly doing much of this behind the scenes. I always have to resort to inserting dummy data in the first row to mimic the attributes of each field to ensure the file that gets created on the iSeries has the field attributes I intended. For example, if a numeric field of say a dollar amount of N(7.2) I would insert 12345.67 in that column in my first row of dummy data. I read with interest that your method does away with that bit of tedium. I do a lot of uploading so I will be coming back to this article for future reference as I migrate my apps to iSeries based modules.

    ReplyDelete
  6. It all depends on the quality of the csv data you are getting
    If you are 100% sure the data is good (no data in the csv to big to fit the DDS field, no illegal characters in numeric fields, right number of fields etc) then CPYFRMIMPF will do it
    If you are unsure then you need an RPG pgm to read each record in the csv file, parse it to get the data in each field by looking for the delimiters,compare the data to the file field description (There is a system maintained file in QSYS with every field in every file you can use) and if it is bad, reject the record or the whole file. This would be a generic pgm, you don't need a special one for each file.
    if the data is verified, then use CPYFRMIMPF
    if the data is bad and you dont check it the the command will crash.

    ReplyDelete
  7. Nice tip, thanks!

    ReplyDelete
  8. I have CSV file in C drive. How can I transfer this file into IFS folder?

    ReplyDelete
    Replies
    1. Open Operations Navigator.
      Then drag and drop the CSV file to the IFS folder.
      Or...
      How to map an IFS folder as a Windows share folder
      Or...
      You could FTP from your PC to the IFS folder, providing your IBM i has its FTP started.

      Delete

To prevent "comment spam" all comments are moderated.
Learn about this website's comments policy here.

Some people have reported that they cannot post a comment using certain computers and browsers. If this is you feel free to use the Contact Form to send me the comment and I will post it for you, please include the title of the post so I know which one to post the comment to.