[comp.sys.mac.hypercard] Reading/Writing Text files - some helpful discoveries

psych@watdcsu.waterloo.edu (R.Crispin - Psychology) (06/02/89)

	I was working with HyperTalk to read/write data from/to text 
files and wanted to determine the best method to do this. I found 
no information in the several HyperTalk books that I checked, so I 
devised the following 3 methods to read/write the data.
1) read/write the file/field one line at a time and put "it" into 
the field/file a line at a time.
2) read/write the file/field one line at a time and put each line 
into a line of a variable then at the end put the variable into 
the field/file
3) read/write the entire file/field at once and put "it" into the 
field/file at once.

I also designed 2 tests to see what influence line processing has.
1) read the file/field and write the field/file as indicated above
2) read the file/field but only write words 2 and 3 of each line 

I used a text file that had 346 lines totaling 14,682 characters. 
NOTE: the size of the file is important since there is a 32K limit 
on fields and, I discovered, a 16K limit for variables.

I put the data into a scrolling field for the read tests. I then 
used the field to write the data to a file for the write tests.
 
The following table shows the results (a tick is 1/60 of a second):

                    Test 1            Test 2
                    ticks    secs     ticks    secs
Method 1   read     17,046    284     10,908    182
           write     3,540     59      3,652     61
Method 2   read      3,752     62      2,595     43
           write     5,975    100      4,693     78
Method 3   read         76      1      3,679     61
           write        14      0      ------   ---


To test each method I created a card with two fields and a button. 
One field was called "junk" and received the data. The second 
field was called "nlines" and reported the time and the size of 
"junk" (to be sure that everything matched). The button had the 
following script for method 1, test 1:

on mouseUp
  put "Richard:Janet Logs:Junk Input" into infile
  open file infile
  put empty into field "junk"
  put empty into field "nlines"
  put empty into out
  set cursor to watch
  put ticks() into starttime
  -- start modifications here
  put 0 into inline
  repeat forever
    read from file infile until return
    if it is empty then exit repeat
    add 1 to inline
    put it into line inline of field "Junk"
  end repeat
  -- end modifications here
  put ticks() into endtime
  close file infile  -- close the file once timing is done
  set cursor to arrow
  put (endtime-starttime)&&"ticks" ¬
     into line 1 of field "Nlines"
  put (the number of lines in field "Junk")&&"lines" ¬
     into line 2 of field "Nlines"
  put (the length of field "junk")&&"chars" ¬
     into line 3 of field "Nlines"
end mouseUp

For method 2, substitute the following for the centre portion 
(the lines between the two "modifications" comments):

  put 0 into inline
  repeat forever
    read from file infile until return
    if it is empty then exit repeat
    add 1 to inline
    put it into line inline of out
  end repeat
  put out into field "Junk"
  
For method 3 substitute the following for the centre portion
  
  read from file infile for 32000
  put it into field "Junk"

For test 2, the only change in methods 1 & 2 was substituting "word 
2 to 3 of it" for "it" in the line before "end repeat".
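
As a sketch of that substitution (assuming nothing else in the 
loops changes), the line before "end repeat" would read as follows 
for method 1:

  put word 2 to 3 of it into line inline of field "Junk"

and for method 2:

  put word 2 to 3 of it into line inline of out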

Method 3 was modified to the following:

  put the number of lines in it into nlines
  repeat with i=1 to nlines
    put (word 2 to 3 of line i of it) into line i of out
  end repeat
  put out into field "Junk"

After the read tests, I duplicated the stack and made the 
modifications needed for the write tests. Note that the write 
version of method 3 test 2 ended up being similar to method 2 
test 2, so I didn't bother running it.
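
The write scripts are not shown in this post. As a rough sketch 
only, the centre portions for the write tests might look something 
like the following. The names outfile and i are placeholders, not 
taken from the original stack, and the file is assumed to have been 
opened with "open file outfile" beforehand and closed with 
"close file outfile" afterwards.

  -- method 1 (sketch): write the field to the file a line at a time
  repeat with i = 1 to the number of lines in field "Junk"
    write line i of field "Junk" & return to file outfile
  end repeat

  -- method 3 (sketch): write the entire field at once
  write field "Junk" to file outfile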

From the table it is easy to see that method 3 is by far the 
fastest way to get data into or out of a HC stack as long as you 
are not processing anything (76 ticks to read and 14 to write, 
against thousands for the line-at-a-time methods). Method 2 is the 
most consistent: its times stay between roughly 2,600 and 6,000 
ticks no matter which test or direction is used. Method 1 is the 
worst for reading a file but the better of the two line-at-a-time 
methods for writing to the file. I cannot explain the large time 
differences that exist between methods 1 and 2 when reading. I 
would have thought they would be closer. I was also quite 
surprised at the speed of the write test using method 1. 

I believe this information is accurate. If you find a problem with 
what I have done, please let me know (email to an address below).

Richard Crispin
Dept. of Psychology             Bitnet: psych@watdcs 
University of Waterloo          Unix  : psych@watdcsu.UWaterloo.ca 
Waterloo, Ont.   Canada   N2L 3G1
(519)885-1211 ext 2879