[comp.sys.mac.programmer] Line-oriented File IO

earleh@eleazar.dartmouth.edu (Earle R. Horton) (04/14/89)

In article <4015@ece-csc.UUCP> jnh@ece-csc.UUCP (Joseph Nathan Hall) writes:
>In article <6987@hoptoad.uucp> tim@hoptoad.UUCP (Tim Maroney) writes:
>>In article <4012@ece-csc.UUCP> jnh@ece-csc.UUCP (Joseph Nathan Hall) writes:
>>
>>>I dunno, this isn't
>>>nearly as difficult as dealing with the low-level File Manager routines.
>>>For example, if you want to read a text file, using the newline flag
>>>(reported to exist and occasionally mentioned in the documentation)...
>>

	PBClear(pb);			/* zeroes a ParamBlockRec 1st time */
	pb->ioParam.ioRefNum = _iofrefnum; /* obtained from FSOpen earlier */
	pb->ioParam.ioBuffer = tmpline;	   /* buffer to hold input line */
	pb->ioParam.ioReqCount = NSTRING;  /* size of tmpline */
	pb->ioParam.ioPosMode = 0x0080 | (EOLCHAR << 8);
	PBRead(pb,FALSE);
	/* Repeat until you get 'em all. */

>>Why would you want to do that?  Can you say "slow"?  Surely reading a
>>text file in a block-by-block way is not beyond the skills of a
>>professional programmer!  Remember, the key to speed is to read in
>>as much as possible into the largest buffer possible.  If you follow
>>this rule, the actual code calling the File Manager is nearly trivial.

>	4) Anyway, I disagree with the simple assertion that the "key to speed"
>	   is reading as much as possible at a time from disk.  This is just
>	   not true...  
>	   ...In systems that provide fast character I/O, it can
>	   be SLOWER to do the "raw" block reads yourself if you have to
>	   write the supporting code in a HLL.
>

     I have to agree with Tim on this one.  Microemacs, which I ported
to the Mac, uses newline mode to read in text files.  In fact, the
code you see above comes from my (highly modified) copy of fileio.c.
I timed it against QUED, another RAM-based editor.  I do not have
source to QUED, but MacsBug tells me it reads in files the way Tim
recommends: it gobbles up the whole file at once.  I used a Mac II
with internal HD SC 80 and with the RAM cache turned off.  I made
several tests with each editor, reading into a buffer the file
{MPW}AStructMacs:FlowCtlMacs.a from MPW 2.0.2.  This is a moderately
large assembler header file with 2756 lines and 137259 characters.

     Microemacs read in the file in an average of 5 seconds.  QUED did
the job in 1.5.  You really shouldn't say that something is "just not
true" unless you have tested it yourself.  The MacOS supported line
oriented IO is definitely slower.  Of course, with a small to medium
source file (~10-50k) the difference is so small as to be not worth
mentioning, even when reading from floppies.
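
For comparison, here is a minimal sketch of the block-oriented approach Tim
recommends: read the whole file in one request, then find the line boundaries
in memory. It is portable standard C using stdio rather than the Mac File
Manager, so it only illustrates the technique, not QUED's or Microemacs's
actual code. The names slurp_file and count_lines are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: read an entire file into one malloc'd buffer.
   Returns NULL on failure; on success stores the byte count in *len.
   The caller frees the buffer. */
char *slurp_file(const char *path, size_t *len)
{
    FILE *fp = fopen(path, "rb");
    char *buf;
    long size;

    if (fp == NULL)
        return NULL;
    if (fseek(fp, 0L, SEEK_END) != 0 || (size = ftell(fp)) < 0) {
        fclose(fp);
        return NULL;
    }
    rewind(fp);
    buf = malloc((size_t)size + 1);   /* +1 so an empty file still allocates */
    if (buf == NULL) {
        fclose(fp);
        return NULL;
    }
    *len = fread(buf, 1, (size_t)size, fp);   /* one large read request */
    fclose(fp);
    return buf;
}

/* Hypothetical helper: count the lines in a buffer. Each line ends with
   eol; a trailing fragment with no terminator also counts as a line. */
size_t count_lines(const char *buf, size_t len, int eol)
{
    size_t lines = 0;
    const char *p = buf;
    const char *end = buf + len;

    while (p < end) {
        const char *nl = memchr(p, eol, (size_t)(end - p));
        lines++;
        if (nl == NULL)
            break;            /* last line had no terminator */
        p = nl + 1;           /* continue after the terminator */
    }
    return lines;
}
```

Passing the terminator as a parameter keeps the splitter usable with '\r'
(the usual Mac text line ending) as well as '\n'.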

     I have been distributing microemacs for the Mac for two years.
It has used newline mode of reading text files the entire time, and I
have yet to receive a complaint about speed.  I do, however, expect to
get lots of them now that I have publicly admitted that QUED is more
than 3 times as fast at disk IO!

>The lack of explicit Toolbox support for non-block-oriented file I/O is, at
>best, a weird omission.  Sure, it's there, but >hiss< >boo< it's buried in
>the low-level routines and it's not well documented.

     I have to agree that it's "buried in the low-level routines," but
I disagree with your comment on documentation:

	"...Bit 7 of ioPosMode is the newline flag...
	 The high-order byte of ioPosMode contains the newline character."

     Looks pretty durn straightforward to me!  Note that on the Mac,
the newline character can be anything from '\0' to '\377' inclusive.
Bet you can't get fgets() to do that for you!

Earle R. Horton

Graduate Student.  Programmer.  God to my cats.
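
The ioPosMode encoding quoted in the post above (newline flag in bit 7,
newline character in the high-order byte) can be sketched as plain C
arithmetic. This is not Toolbox code: NEWLINE_FLAG, make_newline_posmode,
posmode_has_newline and posmode_newline_char are hypothetical names chosen
for illustration, and the only thing assumed about ioPosMode is the 16-bit
layout the quoted documentation describes.

```c
/* Sketch only: mirrors the quoted documentation, not a Toolbox API. */
#define NEWLINE_FLAG 0x0080u   /* bit 7 of ioPosMode: newline mode on */

/* Build a positioning-mode word that enables newline mode with the
   given terminator. The unsigned char parameter keeps characters above
   '\177' from being sign-extended before the shift. */
unsigned short make_newline_posmode(unsigned char eol)
{
    return (unsigned short)(NEWLINE_FLAG | ((unsigned)eol << 8));
}

/* Nonzero if the newline flag is set in mode. */
int posmode_has_newline(unsigned short mode)
{
    return (mode & NEWLINE_FLAG) != 0;
}

/* Extract the newline character stored in the high-order byte. */
unsigned char posmode_newline_char(unsigned short mode)
{
    return (unsigned char)(mode >> 8);
}
```

With a plain (possibly signed) char, an expression like EOLCHAR << 8 can
misbehave for terminators such as '\377', which is why the sketch converts
to an unsigned type first.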

jnh@ece-csc.UUCP (Joseph Nathan Hall) (04/14/89)

In article <13054@dartvax.Dartmouth.EDU> earleh@eleazar.dartmouth.edu (Earle R. Horton) writes:
>In article <4015@ece-csc.UUCP> jnh@ece-csc.UUCP (Joseph Nathan Hall) writes:
>>	   ...In systems that provide fast character I/O, it can
>>	   be SLOWER to do the "raw" block reads yourself if you have to
>>	   write the supporting code in a HLL.

>     I have to agree with Tim on this one.  Microemacs, which I ported
>to the Mac, uses newline mode to read in text files.  In fact, the
... [microemacs is slower than another RAM-based editor which uses
block reads]

I won't argue this point in the case of the Macintosh; however, your results
will vary depending upon operating system and language support.

It's mere coincidence, also, if your internal text format happens to match
up with your disk file format.  The ubiquitous TextEdit example where the
code to load and save a text file is about 10 lines long per function is
elegant for precisely this reason.  I presume, also, that Microemacs and
several other text-only editors use this format for convenience.

The advantage of using the O/S-supplied newline mode (to me) is that it is
more versatile than the corresponding "C" and Pascal equivalents without
being that much more complex.  It's reasonably fast and easier to prototype
with.  Then again, your personal preferences may vary.


-- 
v   v sssss|| joseph hall                      || 201-1D Hampton Lee Court
 v v s   s || jnh@ece-csc.ncsu.edu (Internet)  || Cary, NC  27511
  v   sss  || joseph@ece007.ncsu.edu (Try this one first)
-----------|| Standard disclaimers and all that . . . . . . . . . . . . . .

oster@dewey.soe.berkeley.edu (David Phillip Oster) (04/15/89)

In article <4019@ece-csc.UUCP> jnh@ece-csc.UUCP (Joseph Nathan Hall) writes:
_>The advantage of using the O/S-supplied newline mode (to me) is that it is
_>more versatile than the corresponding "C" and Pascal equivalents without
_>being that much more complex.  It's reasonably fast and easier to prototype
_>with.  Then again, your personal preferences may vary.

Warning! Warning! The newline mode is _only_ implemented by the file
manager. In particular, it is not implemented by the Serial port Manager.
I was disgusted when I found this out, since there is no reasonable way
to do serial asynchronous i/o from a device that sends variable length
lines without doing a PBRead() for each character (or periodically polling
the receive buffer, which is just as bad.)
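
To make the cost concrete, here is a rough sketch of the bookkeeping an
application ends up doing itself when the driver offers no newline mode:
it feeds whatever bytes each poll returns into an accumulator until a
terminator shows up. This is portable C with no Serial Driver calls;
LineAcc, lineacc_reset, lineacc_feed and LINE_MAX_LEN are hypothetical
names, and how the bytes are obtained is left entirely to the caller.

```c
#include <stddef.h>

#define LINE_MAX_LEN 256   /* hypothetical line-length limit for this sketch */

typedef struct {
    char   buf[LINE_MAX_LEN];   /* bytes of the line collected so far */
    size_t len;                 /* number of valid bytes in buf */
} LineAcc;

/* Start collecting a new line. */
void lineacc_reset(LineAcc *acc)
{
    acc->len = 0;
}

/* Feed a chunk of n bytes of any size. Returns how many bytes were
   consumed. If a terminator was found, *done is set to 1, the
   terminator is consumed but not stored, and acc->buf[0..len) holds
   the line; the caller should use it, call lineacc_reset, and feed the
   rest of the chunk again. Bytes beyond LINE_MAX_LEN are dropped. */
size_t lineacc_feed(LineAcc *acc, const char *data, size_t n,
                    int eol, int *done)
{
    size_t i;

    *done = 0;
    for (i = 0; i < n; i++) {
        if ((unsigned char)data[i] == (unsigned char)eol) {
            *done = 1;
            return i + 1;       /* consume the terminator too */
        }
        if (acc->len < LINE_MAX_LEN)
            acc->buf[acc->len++] = data[i];
    }
    return n;                   /* chunk exhausted, line still open */
}
```

A caller would loop over each received chunk, advancing by the returned
count, so that a single chunk containing several lines is handled and a
line split across two polls is reassembled.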

--- David Phillip Oster            --"When we replace the mouse with a pen,
Arpa: oster@dewey.soe.berkeley.edu --3 button mouse fans will need saxophone
Uucp: {uwvax,decvax}!ucbvax!oster%dewey.soe.berkeley.edu --lessons." - Gasee