[comp.lang.pascal] Reading files

dsc@drutx.ATT.COM (DavisCS) (11/08/87)

Help a beginner in Pascal??

Using Turbo Pascal on a PC - Is there a simple way (w/o parsing) to
read a file (on disk) where the values are a mixture of reals and
strings (on the same line). I DO know the layout, but only the maximum
length of the strings. For example the file might look like

22.33 0.1234 12.19
January 1.09
22.54 99.22 Strawberries
999.0 1.2345 Blueberries
1.1 2.2 Apples
    .
    .
    .
99.44 0.56 Ivory
0.0 0.0 LastAlpha

I need to be able to pull out the reals and the strings for further
processing.

Thanks in advance.

catone@dsl.cis.upenn.edu (Tony Catone) (11/12/87)

In article <5852@drutx.ATT.COM> dsc@drutx.ATT.COM (DavisCS) writes:
>
>Using Turbo Pascal on a PC - Is there a simple way (w/o parsing) to
>read a file (on disk) where the values are a mixture of reals and
>strings (on the same line). I DO know the layout, but only the maximum
>length of the strings. For example the file might look like
>
>22.33 0.1234 12.19
>January 1.09
>22.54 99.22 Strawberries
>999.0 1.2345 Blueberries
>1.1 2.2 Apples
>    .
>I need to be able to pull out the reals and the strings for further
>processing.

The following program should do it for you.  I am posting it because
it illustrates a very simple and elegant method of parsing using the
built-in Turbo functions paramcount and paramstr in conjunction with
an absolute variable declared over the unformatted parameter area
of the Program Segment Prefix.  There was a letter about this some
time back in PC-Mag's Turbo column; credit to the original innovator,
whoever s/he was.  Good luck!

					- Tony
					  catone@dsl.cis.upenn.edu
					  catone@wharton.upenn.edu

{============================ Cut here ===============================}

{$R+}  (* Turn on range checking for safety *)

program Demonstrate_Parsing_using_Absolute_Variables(input, output);
(*
 *  File: AbsParse.PAS
 *  Created on: 11/11/87 - TC
 *  Author: Tony Catone (TC)
 *          catone@dsl.cis.upenn.edu
 *          catone@wharton.upenn.edu
 *
 *  Purpose: This file demonstrates a very simple and elegant method of
 *           parsing input using Turbo Pascal 3.01a's paramcount and
 *           paramstr functions.  Tokens are separated by either spaces
 *           or tabs.  Other separators could be used if the program first
 *           scanned the input string and replaced these separators with
 *           blanks.  The significance of using different separators would
 *           then of course be lost.  From an idea I read about once in
 *           PC-Mag's Turbo column; credit to the original innovator,
 *           whoever s/he was.
 *)

const
  (*
   *  Arbitrary choice of maximum string length,
   *  change to suit your application.
   *)
  String_Token_Length = 20;
  Escape_Character = ^[;  (* chr(27) *)
  (*
   *  64 bytes is the longest possible fully qualified file name
   *  DOS function calls will accept.
   *)
  Longest_Possible_Fully_Qualified_File_Name = 64;

var
  (*
   *  DOS limitation makes the maximum line length 127.
   *  Lines greater than this length cannot be processed.
   *  Even attempting to do so may really foul things up by
   *  overwriting code with data (I have not checked this out).
   *  CSeg: $80 is where the unformatted parameter area of
   *  the Program Segment Prefix (PSP) stores command line
   *  parameters, which is why this strategy works.
   *)
  Line_to_Parse : string[127] absolute cseg:$80;
  Input_File : text;
  Input_File_Name : string[Longest_Possible_Fully_Qualified_File_Name];
  Index : integer;
  Valid_Input_File_Obtained : boolean;
  Real_Number_Token : real;
  String_Token : string[String_Token_Length];
  Numeric_Conversion_Return_Code : integer;
  Current_Line_Number : integer;
  Throw_Away_User_Response_Character : char;

begin

  writeln('This program prompts for a file to be parsed into tokens.');
  writeln('The tokens are labelled according to type, numeric or string,');
  writeln('and echoed to the console.  No other processing is done.');
  writeln('This program is for demonstration purposes only.');
  writeln;
  writeln('Please press any key to continue');

  read(kbd, Throw_Away_User_Response_Character);
  if (Throw_Away_User_Response_Character = Escape_Character)
     and keypressed then
    (*
     *  Got an extended ASCII key.  Clear the input buffer with
     *  another read from the keyboard.
     *)
    read(kbd, Throw_Away_User_Response_Character);
  writeln;
  writeln;

  Valid_Input_File_Obtained := false;
  repeat
    write('Please enter input file name: ');
    readln(Input_File_Name);
    assign(Input_File, Input_File_Name);
    {$I-}
    reset(Input_File);
    {$I+}
    if ioresult = 0 then
      (*
       *  Got a valid file opened.
       *)
      Valid_Input_File_Obtained := true
    else
      begin
        (*
         *  Open failed.  Close the file handle.
         *)
        close(Input_File);
        writeln('Unable to open file.  Please try again.');
      end;
  until Valid_Input_File_Obtained;
  writeln;


  Current_Line_Number := 0;
  while not eof(Input_File) do
    begin
      readln(Input_File, Line_to_Parse);
      Current_Line_Number := Current_Line_Number + 1;
      writeln('There are ',
              paramcount,
              ' parameters on line ',
              Current_Line_Number);

      for Index := 1 to paramcount do
        begin
          (*
           *  Use the built in Turbo 3.01a procedure val to
           *  convert the string into a number, in this case a real.
           *  An error code is returned if the conversion is
           *  unsuccessful.  The numeric variable can also be an
           *  integer, which could be useful if you needed to
           *  separate out integer and real input; attempting
           *  to convert a real number string into an integer will
           *  yield an error code.  I don't bother to do that here;
           *  I treat all numbers as reals.
           *)
          val(paramstr(Index),
              Real_Number_Token,
              Numeric_Conversion_Return_Code);
          if Numeric_Conversion_Return_Code = 0 then
            begin
              writeln('Token ',
                      Index,
                      ' is numeric and has a value of: ',
                      Real_Number_Token:0:4);
            end
          else
            begin
              String_Token := paramstr(Index);
              writeln('Token ',
                      Index,
                      ' is a string with contents: ',
                      String_Token);
            end;
        end;
      writeln;
    end;

  close(Input_File);
end.

abcscnuk@csun.UUCP (Naoto Kimura) (11/14/87)

In article <5852@drutx.ATT.COM> dsc@drutx.ATT.COM (DavisCS) writes:
>
>Help a beginner in Pascal??
>
>Using Turbo Pascal on a PC - Is there a simple way (w/o parsing) to
>read a file (on disk) where the values are a mixture of reals and
>strings (on the same line). I DO know the layout, but only the maximum
>length of the strings. For example the file might look like
>
>22.33 0.1234 12.19
>January 1.09
>22.54 99.22 Strawberries
>999.0 1.2345 Blueberries
>1.1 2.2 Apples
>    .
>    .
>    .
>99.44 0.56 Ivory
>0.0 0.0 LastAlpha
>
>I need to be able to pull out the reals and the strings for further
>processing.
>
>Thanks in advance.

Is the format of the file going to be just like the file above?
That is, sort of like this (# denotes real number, month, desc are
strings) :

# # #
month #
# # desc
...

The first line and the last set of lines aren't any problem, since you
can just use readln

readln(num1,num2,num3);  (* reads three numbers on first line *)

readln(n1,n2,str);       (* reads two numbers then a string *)

to read them (Pascal takes care of things like that).
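For a file on disk, the same readln calls take a text file variable as
their first argument.  What follows is only a rough sketch: the file
name DATA.TXT, the identifiers, and the four sample lines it writes out
first are all made up for illustration.  It assumes there are no blank
lines in the data.  Depending on the compiler, the string may come back
with the separating blank still on the front, so the sketch strips
leading blanks to be safe.

```pascal
program ReadFixedLayout;
(* Sketch only: DATA.TXT and every identifier here are assumptions. *)
var
  f          : text;
  n1, n2, n3 : real;
  desc       : string[40];   (* known maximum length plus some slack *)
begin
  (* write a small sample file so the sketch is self-contained *)
  assign(f, 'DATA.TXT');
  rewrite(f);
  writeln(f, '22.33 0.1234 12.19');
  writeln(f, 'January 1.09');
  writeln(f, '22.54 99.22 Strawberries');
  writeln(f, '1.1 2.2 Apples');
  close(f);

  reset(f);

  readln(f, n1, n2, n3);          (* first line: three reals *)
  writeln(n1:0:4, ' ', n2:0:4, ' ', n3:0:4);

  readln(f);                      (* skip the "month #" line; it needs parsing *)

  while not eof(f) do
    begin
      readln(f, n1, n2, desc);    (* two reals, then the rest of the line *)
      (* drop any blanks left in front of the string *)
      while pos(' ', desc) = 1 do
        delete(desc, 1, 1);
      writeln(n1:0:4, ' ', n2:0:4, ' [', desc, ']');
    end;

  close(f);
end.
```

The square brackets in the output are only there to make stray blanks
around the string visible.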

The second line of the input could be a problem, since you probably
have a varying-length string followed by a number.   Lines like this
one you'll have to parse.   Just read the line into a string, then
process it.   All you have to do is to scan through the line, copying
the characters into a string variable, until you hit a space (which I
am assuming you are using as a delimiter).

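    readln(buf);
    i := 1;
    done := i > length(buf);
    while not done do
	begin
	    (* advance until we run off the end or hit the delimiter *)
	    i := i + 1;
	    if i > length(buf) then
		done := true
	    else
		done := buf[i] = ' ';
	end;
    (* take the characters before the delimiter; copy sets the length of s too *)
    s := copy(buf, 1, i - 1);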

After extracting the string, you should have the number in the string
after the space, which could be extracted by using a procedure or
function (I don't recall the name, but it exists, just look in the
manual) which converts a string into a number.
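The whole "string, then number" step can be sketched roughly as
follows.  This is a sketch under assumptions, not a definitive
version: the identifiers are invented, the line is held in a string
constant rather than read from the file, and the conversion uses
Turbo's val procedure, the same one Tony Catone's program earlier in
this thread relies on.  It uses pos to find the blank instead of the
hand-written loop, and it assumes nothing follows the number on the
line, since a trailing blank would probably make val report an error.

```pascal
program SplitStringAndNumber;
(* Sketch only: all identifiers are made up for this example. *)
var
  buf   : string[127];
  month : string[20];
  num   : real;
  code  : integer;
  i     : integer;
begin
  buf := 'January 1.09';   (* stands in for readln(f, buf) *)

  (* everything before the first blank is the string *)
  i := pos(' ', buf);
  if i = 0 then
    i := length(buf) + 1;
  month := copy(buf, 1, i - 1);

  (* remove the string and its delimiter, then any extra blanks *)
  delete(buf, 1, i);
  while pos(' ', buf) = 1 do
    delete(buf, 1, 1);

  (* what is left should be the number; code is 0 on success *)
  val(buf, num, code);

  if code = 0 then
    writeln(month, ' ', num:0:2)
  else
    writeln('Bad number at position ', code, ' in "', buf, '"');
end.
```

A nonzero code from val gives the position of the first character it
could not convert, which is handy for reporting bad input lines.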

                //-n-\\					Naoto Kimura
        _____---=======---_____				(csun!abcscnuk)
    ====____\   /.. ..\   /____====
  //         ---\__O__/---        \\	Enterprise... Surrender or we'll
  \_\                            /_/	send back your *&^$% tribbles !!