Skip to main content  
        iSeries home   |   Easy400     |   CGIDEV2     |   MMAIL  
Public-Source
 
Introduction
 
XLSCONVERT utility
 xlstable utility
Some CGI programs
 
Appendixes:
JVMSTARTUP
Questions & Answers
Java products table
 
 Display the Change Log
 
 PDF of this tutorial
 
 
 Download
 
 

 
 
XLSTABLE utility
Convert an Excel XLS/XLSX spreadsheet to a db file
  1. What
  2. Command XLSTABLE
  3. An XLSTABLE failure
  4. The process of command XLSTABLE
  5. Spreadsheets containing formulas
  6. About the target file
  7. (Re)Creating the target file
  8. Suggestions for a recursive use of the same XLS/XLSX spreadsheet model

1- What

After talking to some users about the XLSCONVERT utility, I realized that they needed something more.
It was not enough to provide a way to access an image of an XLS/XLSX spreadsheet (as the one created in library QTEMP from the XLSCONVERT utility) with their traditional ILE-RPG programming, via subprocedure xlsgetcell.
They simply wanted to have the an Excel XLS/XLSX spreadsheet transformed into a table: an obvious OS/400 physical file that can be read by any programming language for the System i.
This is what commands XLSTABLE is about.

2- Command XLSTABLE

Command XLPARSE2/XLSTABLE generates and fills a physical file (target file) with the data from an Excel XLS/XLSX spreadsheet.

  • The Java job execution environment must not have been already set (through the command ADDENVVAR ENVVAR(CLASSPATH) ...). If a classpath other than the expected one was set, you receive a Java escape message saying that a class was not found.
  • The user profile running command XLSTABLE must have special authority *JOBCTL. This is required by command ADDENVVAR which is executed by the program.
  • The Excel XLS/XLSX spreadsheet must be on the IFS and must be coded in ASCII characters (CCSID 819-ASCII ISO Latin 1, 1212-PC USA, or 1252-IBM PC).
  • Though the spreadsheet may contain several spreadsheets, just one spreadsheet is processed.
  • Only the first 256 cells of a row are retrieved, subsequent cells are ignored.

                        Generate PF from XLS/XLSX (XLSTABLE)

 Type choices, press Enter.

 .xls/xlsx stream file  . . . . . XLS                                             
                                                                                  
 Spreadsheet sequence number  . . SHEET       1             1-999
 'Sheet name' . . . . . . . . . . SHEETNAME   *SHEET                              
 Target file  . . . . . . . . . . FILE                      Name
   Library  . . . . . . . . . . .                           Name
 Target member  . . . . . . . . . FILEMBR     *FIRST        Name, *FIRST
 Replace or add records . . . . . MBROPT      *REPLACE      *ADD, *REPLACE
 (Re)create target file . . . . . CRTFILE     *NO           *YES, *NO
 Source file  . . . . . . . . . . SRCFILE     QDDSSRC       Name
   Library  . . . . . . . . . . .               *FILELIB    Name, *FILELIB
 Source member  . . . . . . . . . SRCMBR      *FILE         Name, *FILE
 Record format name . . . . . . . RCDFMT      *AUTO         Name, *AUTO
 Record format field names  . . . RCDFLDNMS                 Name, *NONE
                + for more values                       
 Header lines . . . . . . . . . . HLINES      0             0-99
 Columns forced as numeric  . . . FRCNUMCOLS  *NONE         1-500, *NONE
                + for more values                  
 Field names  . . . . . . . . . . FLDNAMES    *DFT          *DFT, *COLHDG
 Date columns . . . . . . . . . . DATECOL     *AUTO         Number, *AUTO, *NONE
                + for more values                  
 Date format  . . . . . . . . . . DATEFMT     *YMD          *YMD, *MDY, *DMY, *ISO...
 Time columns . . . . . . . . . . TIMECOL     *NONE         Number, *NONE
                + for more values                  
 Time format  . . . . . . . . . . TIMEFMT     *HMS          *HMS, *ISO, *USA, *EUR, *JIS
 Run in a sync submitted job  . . SBMJOB      *NO           *YES, *NO
 Job queue  . . . . . . . . . . . JOBQ        QSYSNOMAX     Name        
   Library  . . . . . . . . . . .               *LIBL       Name, *LIBL
 Type of failure message  . . . . FAILMSG     *ESCAPE       *ESCAPE, *DIAG
 Display target file  . . . . . . DSPFILE     *YES          *YES, *NO
  • .xls/xlsx stream file (XLS) - This is the qualified name of the stream file (CCSID 819-ASCII ISO Latin 1, 1212-PC USA, or 1252-IBM PC) containing the Excel XLS/XLSX spreadsheet to be transformed into the physical file specified in parameter FILE.
    IFS directory '/xlparse2/samples' contains a number of spreadsheet stream files that can be used for testing command XLSTABLE.
  • Spreadsheet sequence number (SHEET) - The sequence number (1 to 99) of the spreadsheet to be processed.
    Instead of specifying the spreadsheet sequence number in parameter SHEET, you may specify the spreadsheet name in parameter SHEETNAME. This is allowed only if:
    • the IBM i OS release is at least V6R1,
    • utility HSSFCGI is installed, and
    • service program HSSFCGI/XLPARSERJP is available.
    Note. When you specify a sheet name in parameter SHEETNAME, the value specified in parameter SHEET is ignored.
  • Sheet name (SHEETNAME) - The name of the sheet to be processed. This value is case sensitive and must be specified within quotes (').
    When you specify the spreadsheet name in parameter SHEETNAME, the sheet sequence number in parameter SHEET is ignored.
    Parameter SHEETNAME requires
    • at least IBM i OS release V6R1,
    • utility HSSFCGI must be installed, and
    • service program HSSFCGI/XLPARSERJP must be available.
  • Target file (FILE) - Qualified name of the physical file output from the process.
    If the physical files does not exist, and CRTFILE(*YES) specified, a DDS source member is generated and is used to create the physical file. The physical file is created with MAXMBRS(*NOMAX).
  • Target member (FILEMBR) - Physical file member output from the process. If the member specified does not exist,
    • if the file is defined as MAXMBRS(*NOMAX), the member is added to the file
    • otherwise a program exception is generated.
  • Replace or add records (MBROPT) - Specifies whether the new records replace or are added to the existing records. Select one of the following:
    • *REPLACE - The program clears the existing member and adds the new records.
    • *ADD - The program adds the new records to the end of the existing records.
  • (Re)create target file (CRTFILE) - Select one of the following:
    1. *YES
      • if the target file does not yet exist, or
      • if the target file exists already, but you want its record format layout be re-computed from the XLS/XLSX spreadsheet.
    2. *NO
      if the target file already exists and you want to keep its record format layout that way it currently is. If you select *NO, you must be aware that possible format changes to the Excel XLS/XLSX spreadsheet may impair the ability to correctly load the data into the existing target file.
  • Source file (SRCFILE) (only for CRTFILE(*YES)) - The source file to contain the DDS generated for the target file. If not yet existing, the source file is automatically generated.
  • Source member (SRCMBR) (only for CRTFILE(*YES)) - The name of the source member to contain the DDS generated for the target file. You may use *FILE to mean the same name as the target file name in parameter FILE.
  • Record format name (RCDFMT) - Record format name of the physical file to be created. This parameter may be used only when CRTFILE(*YES).
    Two options are available:
    1. Leave *AUTO - The record format name is automatically generated.
    2. Enter your own record format name.
  • Record format field names (RCDFLDNMS) - When CRTFILE(*YES), you may choose among three different ways to assign record format file names:
    1. Define up to 100 field names in parameter RCDFLDNMS
    2. Leave RCDFLDNMS(*NONE) and pick up field names from spreadsheet column headings, see parameters HLINES and FLDNAMES(*COLHDG).
    3. Leave RCDFLDNMS(*NONE) and FLDNAMES(*DFT). In this case field names FLD1, FLD2, ... FLDn are automatically generated.
  • Header lines (HLINES) - Number of header lines that should be ignored.
    Sometimes one or more initial rows of a sheet contain headers. Header columns must not be processed as they do not contain valid record data. This parameter allows to tell how many initial rows should be skipped.
  • Columns forced as numeric (FRCNUMCOLS) - Columns containing number-format cells and text-format cells do result into a character-data fields in the generated database file record format.
    To force these columns to generate numeric-data fields in the generated database file record format, you must mention their numbers (1 for column A, etc.) in the entries of this parameter.
    Up to 50 such columns may be specified.
    Of course, if a specified column happens to contain just number-format cells, a numeric-data record format field is generated.
    A text-format cell, to be converted to a number must containg a string where
    1. the minus ("-") sign, if needed, is the first character
    2. thousand separators are not used
    3. the "decimal point", if needed, is either character (".") or character (",") according to the national language of the Excel spreadsheet.
    Valid examples:
    • English language spreadsheet: -8927.353 or 8927.36
    • Italian language spreadsheet: -8927,353 or 8927,36
  • Field names (FLDNAMES) - This parameter is displayed only when CRTFILE(*YES) and the value of parameter HLINES (Header lines) is greater than zero.
    It establishes the naming rule for the record format fields of the physical file (defined in parameter FILE) to be created.
    Select one of the following naming rules:
    • *DFT - Field names are named FLD followed by a sequence number
    • *COLHDG - Field names are taken from the columns of the last hreading row, provided that they are valid field names.
  • Date columns (DATECOL) - In Excel spreadsheets, a date is stored as a floating point number. This number represents the number of days elapsed since January 1, 1900. In order to have numeric values converted back to dates, do one of the following:
    • Type *NONE to avoid converting any numeric value to a date value.
      This is the recommended choice unless you know that the spreadsheet contains at least one date column.
    • Enter up to 50 column numbers to identify the columns where numeric values must be converted to date values.
      This is the recommended choice when you know that the spreadsheet contains at least one date column.
    • Type *AUTO to let the program establish which columns contain numeric values to be converted to date values.
      This choice may provide unexpected results on numeric columns and is suggested only for test purposes, in order to find out if any date columns exist in the spreadsheet.
  • Date format (DATEFMT) - Date format (DATEFMT) - This parameter is made available when CRTFILE(*YES) and parameter DATECOL does not specify *NONE.
    It is used to establish the format of date fields (data type L) when creating the target database file (specified in parameter FILE). Select one of the available formats:
    formatrepresentation
    *ISOyyyy-mm-dd
    *USAmm/dd/yyyy
    *EURdd.mm.yyyy
    *JISyyyy-mm-dd
    *YMDyy?mm?ddwhere character ? is the date separator used at job level
    *MDYmm?dd?yywhere character ? is the date separator used at job level
    *DMYdd?mm?yywhere character ? is the date separator used at job level
  • Time columns (TIMECOL) - In Excel spreadsheets, the internal representation of a time value (example: 16:48:56) is a number (example: 700648). This number measures the time in millionths of a day (example of computation: there are 84,600 seconds in 24 hours; time 16:48:56 is 69,536 seconds; therefore the Excel internal representation of time 16:48:56 is (69,536/84,600)*1,000,000=700648 .
    In order to have Excel time numeric values converted back to a conventional time format (example: hh.mm.ss), do one of the following:
    • Type *NONE to avoid converting any numeric value to a time value. This is the recommended choice unless you know that the spreadsheet contains at least one time column.
    • Enter up to 50 column numbers to identify the columns where numeric values must be converted to time values. This is the recommended choice when you know that the spreadsheet contains at least one time column.
  • Time format (TIMEFMT) - This parameter is made available when CRTFILE(*YES) and parameter TTMECOL does not specify *NONE.
    It is used to establish the format of time fields (data type T) when ctreating the target database file (specified in parameter FILE). Select one of the available formats:
    • *HMS - Time representation is hh:mm:ss
    • *ISO - Time representation is hh.mm.ss
    • *USA - Time representation is hh:mm AM or hh:mm PM
    • *EUR - Time representation is hh.mm.ss
    • *JIS - Time representation is hh:mm:ss
  • Run in a sync submitted job (SBMJOB) - If you specify SBMJOB(*YES), the XLSTABLE command is executed in a separate submitted job. The current job waits until the submitted job completes, then it resumes execution.
    This can be useful in two cases:
    • you do not want XLSTABLE to start a Java Virtual Machine in the current job, OR
    • you do not want XLSTABLE to run in the current job because it may conflict with a Java Virtual Machine already active in this job.
    Should you absolutely need to run this command in the current job, though the JVM has already been started by a previous Java application, read about command JVMSTARTUP in Appendix JVMSTARTUP.
  • Job queue - If you specify SBMJOB(*YES), this is the job queue where the synchronized job is submitted.
  • Type of failure message (FAILMSG) - When you specify SBMJOB(*YES), the XLSTABLE command is executed in a synchronized submitted job.
    It may however happen that the XLSTABLE command fails during its execution. In such a case a message is sent to the current program in the current job. You may choose whether this message would be an *ESCAPE or a *DIAG (diagnostic) one.
  • Display target file (DSPFILE) - whether the target file should be displayed at the end of the process.

3- An XLSTABLE failure

In some cases, XLSTABLE command may send the escape message "No significant cells detected in this sheet".
This may happen when a spreadsheet was generated from some software tool and some properties are missing.
You can manually fix the Excel workbook on a PC in this way:
  • Open the spreadsheet with Excel (2003 or subsequent)
  • Save the spreadsheet; Excel will add the missing properties.
Should you need that be done in an IBM i procedure of yours, you may bypass this problem by specifying in parameter SHEETNAME the name of the spreadsheet you want to process.
Of course, utility HSSFCGI with service program XLPARSERJP is required.

4- The process of command XLSTABLE

It may be helpful to understand what goes on under the covers:

  1. Some initial checks are done.
  2. Command XLPARSE2/XLSCONVERT is executed, thus creating file QTEMP/XLSOUTF from the Excel XLS/XLSX spreadsheet.
    Command XLPARSE2/XLSCONVERT really reads all the spreadsheet cells, using the Scott Klements XLPARSER4 utility (Scott's xlparse Java class and POI Java classes).
    As a result from this process, file QTEMP/XLSOUTF contains a database representation of all the spreadsheets in the XLS/XLSX workbook.
  3. If CRTFILE(*YES) specified
    1. Subprocedure xlsgetcell is used to receive from file QTEMP/XLSOUTF all the cells from the requested sheet (see parameter SHEET).
      At the end of this process, the data type and the size of each column in the sheet are known, a DDS member is generated (any previous version of the target file member is replaced by the new one (!!!)) and the target file is created.
  4. If CRTFILE(*NO) specified
    1. DDS are not regenerated, and the target file is not rebuilt.
  5. Cells data are used to build, in a temporary stream file, an "imported file" script. The "imported file" script is used by command CPYFRMIPMF to upload the target file.
  6. Next, the load of records to the target file takes place:
    • The "imported file" script is used by command CPYFRMIPMF to upload the target file.
  7. Last, if DSPFILE(*YES), the contents of the target file are displayed.
    A far better insight of the target file can be displayed by another open-source Easy400.net utility: CGI_WRKDBF.

5- Spreadsheets containing formulas

Values resulting from formulas may have conversion problems. A simple way to bypass these problems is documented in this topic.

6- About the target file

In the record format of the target file, each field corresponds to a column in the XLS/XLSX sheet.
There are four type of fields:
  • Numeric fields - A spreadsheet column containing only numeric data (formulas are considered be numeric) generates a numeric field.
    A numeric field is always generated as zoned, 30 digits, of which 6 are decimal digits.
  • Character fields - A spreadsheet column not containing text data generates a character field.
    Character fields have a critical factor, their sizes. The size of a character field is computed from the largest cell data in a sheet column.
    This means that, if you process via command XLSTABLE an XLS/XLSX sheet similar (same type and number of columns) the resulting re-created target file record format may result different simply because the largest character cell in a given column has a size different from the largest character cell in the same column of the previous similar sheet.
  • Date fields - These fields are generated from the spreadsheet columns listed in parameter DATECOL. All generated date fields share the same format, the one specified in parameter DATEFMT.
  • Time fields - These fields are generated from the spreadsheet columns listed in parameter TIMECOL. All generated time fields share the same format, the one specified in parameter TIMEFMT.
  • Special conversion cases
    • Case 1 - Generic format cell containing a positive number, example 1234,56.
      This cell is converted to a decimal packed(30 6) field.
    • Case 2 - Generic format cell containing a negative number number with sign minus as first character, example -1234,56.
      This cell is converted to a decimal packed(30 6) field.
    • Case 3 - Generic format cell containing a negative number number with sign minus as last character, example 1234,56-.
      This cell is converted to a text field.
    • Case 4 - Columns containing both numeric format and text format cells. All cells are converted to text fields. Numeric data are right aligned, text data are left aligned.
      However, if you mention these columns in XLSTABLE command parameter FRCNUMCOLS, all cells are converted to a numeric field.
    • Case 5 - Columns containing both numeric format and generic format cells. All cells are converted to text fields. Generic data and text data are left aligned.
      However, if you mention these columns in XLSTABLE command parameter FRCNUMCOLS, cells may be converted to a numeric field.

7- (Re)Creating the target file

Use parameter CRTFILE(*YES) to re-create the target file.
When re-creating the target file, you have some options regarding
  1. The record format name.
    Two choices are available:
    1. Leaving the default value RCDFMT(*AUTO) creates a record format name made of the first (max 7) characters of the file name, followed by "RCD".
    2. You may specify your record format name in parameter RCDFMT.
  2. The record format field names.
    Three choices are available:
    1. Up to 100 field names can be specified in parameter RCDFLDNMS.
    2. If the spreadsheet has column headings that can be used as field labels, you may obtain such field names by
      • Leaving RCDFLDNMS(*NONE), and
      • specifying for parameter HLINES a number of heading rows higher than zero, and
      • specifying FLDNAMES(*COLHDG)
      In such a case, field names are taken from the columng headings of the last heading rows specified in parameter HLINES.
    3. If you leave RCDFLDNMS(*NONE) and FLDNAMES(*DFT), default field names FLD1, FLD2, ... FLDn are automatically generated.
Note 1. The following field data types can be generated when CRTFILE(*YES):
  • Character (data type A), any length.
  • Zoned numeric (data type Z), always 30 digits, 6 decimal positions.
  • Date (data type L), with the format specified in parameter DATEFMT.
  • Time (data type T), with the format specified in parameter TIMEFMT.
Note 2. When CRTFILE(*NO) character, numeric, date and time fields are mapped to the format of the corresponding fields in the receiving target file.
Note 3. Assigning your own field names may be of great help in copying a generated target physical file to an existing database file with identical field names, with the same data type, but different field length and/or precision. That is easily done using command CPYF ... FMTOPT(*MAP).

8- Suggestions for a recursive use of the same XLS/XLSX spreadsheet model

In most cases you will be planning for a recurrent use of command XLSTABLE for some XLS/XLSX spreadsheet models.
What you must expect is that a worksheet of a given model is uploaded via XLSTABLE to its appropriate target file member, so that the target file can then be processed by a given program of yours, without the need of re-compile it to match the record format of the target file.
In order for this to happen without troubles, you must make sure that
  1. A given worksheet has always the same number of columns
  2. All columns in the spreadsheet maintain their data type (Character/Numeric; however, see NOTE 1 below.)
  3. The character fields in the record format of the target file are large enough to account for the largest case. This can simply be done - once for ever - in the following way:
    1. review the DDS:
      1. You CAN assign all fields the names that best fit your process.
      2. You CAN add field level keywords.
      3. You CAN add keys.
      4. You are strongly recommended to increase the lengths of the character fields in order to fit the largest case.
      5. You MUST NOT change data type, digits and decimal positions of the numeric fields. They must always specify 30S 6 .
        See however NOTE 1 below.
      6. You MUST NOT resequence fields.
    2. recreate the target file
    3. run command clrlib qtemp
    4. run command xlstable ... crtfile(*no) ...
  4. Make sure that CRTFILE(*NO) is specified in command XLSTABLE. If you specify CRTFILE(*YES) your DDS member is cleared and new DDS are generated from the program !!!
NOTE 1- There might be cases where some columns of a spreadsheet may some times show up as numeric and some other time show up as character. In such cases, the best thing to do is to
  1. redefine via DDS the corresponding target database fields as 30 char fields
  2. run command clrlib qtemp
  3. run command xlstable ... crtfile(*no) ...