[comp.lang.pascal] file windows

schwartz@shire.cs.psu.edu (Scott Schwartz) (07/12/89)

Alastair Milne writes:

| >	while (input^ in [space, tab, newline, cr, formfeed]) do
| >		get (input)
| 
|     How does this constitute not committing to read the file variable?  You are
|     doing exactly that when you dereference the file variable to get its
|     window.

I didn't say you don't commit to reading the window variable.  The
thing you don't commit to is advancing the file pointer: committing to
read from the FILE.

|  I don't see how this differs at all from
| 	repeat
| 	  read( input, ch);
| 	until not (ch in [space, tab, newline, cr, formfeed]);
|     except that you first copy what would have been the file variable
|     reference into another variable.

Right!  In particular, I use the file variable to decide _not to do_
the read.  Thus:

	while (input^ in [...]) do
		read (input, ch);  (* ignore ch *)

Is fine, but you have to use while, not repeat, since you have to
check the file window first.  If you don't, then the last read in your
loop eats up and discards one of the valid inputs you are waiting for.

(I hope it is clear that I interpreted the question "is get better than
read" to mean "read-sans-file-window", since that's what Borland
implemented.  If you have file windows and get and put, read and write
are just syntactic sugar; although the reverse isn't true.)

I don't claim you should never use "read"/"write", just that you
shouldn't always (have to) use them.

Just to make this concrete, assume the above code were wrapped in a
procedure, like this:
	procedure skipwhite;
	begin
		<< code as above >>
	end;

then you could do the following:
	...
	writeln ('I have declared Russia illegal.  Begin bombing? ');
	skipwhite;
	read (ch); 
	if (ch in ['n', 'N']) then safe else sorry;
	...

Now, if we use my code, and type "No" to the prompt everything is
fine.  If we use the code you gave above, then the read will return 'o'
instead of 'N', with dire consequences.
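
For readers more at home in C, the difference between the two loops can
be sketched roughly as follows.  This is only an illustration: the
Window type and the win_* and skip_white_* names are invented here, and
a string stands in for the input file.  read is modelled as
"ch := input^; get(input)".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical string-backed stand-in for a Pascal text file. */
typedef struct {
    const char *pos;   /* next unread character; *pos plays the role of input^ */
} Window;

/* input^ : look at the current character without consuming it. */
static char win_peek(const Window *w) {
    assert(w != NULL && w->pos != NULL);
    return *w->pos;
}

/* get(input) : advance past the current character (no-op at the end). */
static void win_get(Window *w) {
    if (win_peek(w) != '\0')
        w->pos++;
}

/* read(input, ch) modelled as ch := input^; get(input). */
static char win_read(Window *w) {
    char ch = win_peek(w);
    win_get(w);
    return ch;
}

static int is_white(char c) {
    return c != '\0' && strchr(" \t\n\r\f", c) != NULL;
}

/* while (input^ in [...]) do get(input) */
static void skip_white_peek(Window *w) {
    while (is_white(win_peek(w)))
        win_get(w);
}

/* repeat read(input, ch) until not (ch in [...]) */
static void skip_white_read(Window *w) {
    char ch;
    do {
        ch = win_read(w);
    } while (is_white(ch));
}
```

With the buffer "No", skip_white_peek leaves 'N' as the next character
to be read, while skip_white_read has already consumed it and leaves 'o'.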

Having "skipwhite" return the last char it sees doesn't work either,
since you could be looking for a number to read next.

| >This works without side effects, so that it can be bundled up in a
| >black box and called by anyone.  

|     I/O almost always has side effects, because there's almost always
|     something that can go wrong.  [ discussion of error conditions deleted ]

Ok, granted that Pascal has poor I/O error handling semantics.  That wasn't
quite the issue.

|     To the extent, however, that your while loop is free of side effects, and
|     can be called as a module, so is and can the repeat.  

No, your loop had different semantics than mine. 

[ more deleted ]
| >I would say a lot harder.  ....This requires
| >checking all calls to read and write to make sure they read from your
| >pushed-back data if it is available.  
| 
|     Again, I'm not following: I see no pushed-back data in the loop you gave.

The file variable handles this.  That's the point:  I don't have to handle it.

|     The series of get's the while does is exactly equivalent to the series of
|     read(ch)'s in the repeat I added.  In fact, if the correct definition of
|     read is used, they must be, because the read is simply "ch := input^; get
|     (input);"

Except for the last "read", whose corresponding "get" never happens because
the current character is not whitespace.

[ deletions ]
|     Anybody else like to see UNIX translated to Pascal?  I must admit it's
|     been a pet dream of mine for quite some time.

Someone once told me that Apollo's Aegis was mostly written in Pascal.

--
Scott Schwartz		<schwartz@shire.cs.psu.edu>

dat0@rimfaxe.diku.dk (Dat-0 undervisningsassistent) (07/12/89)

schwartz@shire.cs.psu.edu (Scott Schwartz) writes:
>Alastair Milne writes:
>| >	while (input^ in [space, tab, newline, cr, formfeed]) do
>| >		get (input)
>| 
>|     How does this constitute not committing to read the file variable?  You are
>|     doing exactly that when you dereference the file variable to get its
>|     window.
>I didn't say you don't commit to reading the window variable.  The
>thing you don't commit to is advancing the file pointer: committing to
>read from the FILE.

As somebody has already pointed out, "read(input, ch)" is exactly
equivalent to "ch:=input^;get(input)" (by definition of the ISO standard).

The interesting thing about this is (IMHO) that using the filewindow you 
can check what you are about to read, not just what you have already read.
I have seen a couple of examples, where this saves you a lot of trouble.

BTW it is the file-window that allows you to check eoln and eof. How do you 
manage to check that, if you don't have some kind of look-ahead?




Kristian Damm Jensen 
(dat0@diku.dk)
Institute of datalogi, University of Copenhagen (DIKU)

scl@virginia.acc.virginia.edu (Steve Losen) (07/25/89)

In article <4623@freja.diku.dk> dat0@rimfaxe.diku.dk (Dat-0 undervisningsassistent) writes:
>schwartz@shire.cs.psu.edu (Scott Schwartz) writes:
>>Alastair Milne writes:
>>| >	while (input^ in [space, tab, newline, cr, formfeed]) do
>>| >		get (input)
>>| 
>>|     How does this constitute not committing to read the file variable?  You are
>>|     doing exactly that when you dereference the file variable to get its
>>|     window.
>>I didn't say you don't commit to reading the window variable.  The
>>thing you don't commit to is advancing the file pointer: committing to
>>read from the FILE.
>
>As somebody has already pointed out, "read(input, ch)" is exactly
>equivalent to "ch:=input^;get(input)" (by definition of the ISO standard).
>
>The interesting thing about this is (IMHO) that using the filewindow you 
>can check what you are about to read, not just what you have already read.
>I have seen a couple of examples, where this saves you a lot of trouble.
>
>
...
>
>Kristian Damm Jensen 
>(dat0@diku.dk)
>Institute of datalogi, University of Copenhagen (DIKU)

Here's a good example of how convenient the file buffer variable is:

Suppose you want to allow the user to input a bunch of integers, separated
by any sequence of non-numeric characters.  For example, the following
atrocity is legal input:

,,100 ,20,
,

   30
40,, 5050
,,,

The program should process the integers 100, 20, 30, 40, and 5050.
Here is a trivial piece of Standard Pascal that does the job correctly.

while not eof do begin
	if (input^ >= '0') and (input^ <= '9') then begin
		read(number);
		process(number);
	end
	else
		get(input);
end

I admit that for most programming languages it is possible to come up
with simple and elegant solutions to selected bizarre problems.  Too
bad there isn't one language that is universally applicable.
-- 
Steve Losen     scl@virginia.edu
University of Virginia Academic Computing Center

schwartz@shire.cs.psu.edu (Scott Schwartz) (07/25/89)

In article <4623@freja.diku.dk> Kristian Damm Jensen writes:

| >| >	while (input^ in [space, tab, newline, cr, formfeed]) do
| >| >		get (input)

| BTW it is the file-window that allows you to check eoln and eof. How do you 
| manage to check that, if you don't have some kind of look-ahead?

Well, actually the above should have been written as

	while (not eof (input)) and (input^ in [...]) do
		get (input)

since eof and eoln are "out of band" conditions that can't necessarily
be detected just by looking at input^.   

--
Scott Schwartz		<schwartz@shire.cs.psu.edu>

eru@tnvsu1.tele.nokia.fi (Erkki Ruohtula) (07/26/89)

Kristian Damm Jensen (dat0@diku.dk) writes
>As somebody has already pointed out, "read(input, ch)" is exactly
>equivalent to "ch:=input^;get(input)" (by definition of the ISO standard).
>
>The interesting thing about this is (IMHO) that using the filewindow you 
>can check what you are about to read, not just what you have already read.
>I have seen a couple of examples, where this saves you a lot of trouble.
>                                        -------------------------------
>BTW it is the file-window that allows you to check eoln and eof. How do you 
>manage to check that, if you don't have some kind of look-ahead?

I have seen many examples where it gives you no end of trouble...
This file window business is the reason why there is no such thing
as a portable interactive Pascal program. I have seen two ways of dealing
with it: lazy I/O (do the I/O when the file buffer variable is referenced) and
implicitly prepending an Eoln to a text file, thus making it the value
of the file buffer variable when the program starts up. An interactive
program written for one convention acts strangely when compiled with a compiler
using the other convention... No doubt there are other variations around.
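
The lazy convention can be sketched in C roughly like this.  It is an
illustration only, not any particular compiler's runtime: LazyFile and
the lazy_* names are invented here, and an in-memory string stands in
for the terminal.  The fetches counter shows when the source is
actually touched.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lazy file window over an in-memory string. */
typedef struct {
    const char *src;   /* stands in for the terminal or file */
    int filled;        /* nonzero once the window holds a valid state */
    int at_eof;        /* nonzero once the source has run out */
    char window;       /* value of f^ when filled */
    int fetches;       /* how often the source was actually touched */
} LazyFile;

static void lazy_open(LazyFile *f, const char *src) {
    assert(f != NULL && src != NULL);
    f->src = src;
    f->filled = 0;     /* nothing is fetched at open time */
    f->at_eof = 0;
    f->window = ' ';
    f->fetches = 0;
}

/* Fetch only if open or a previous get left the window empty. */
static void lazy_fill(LazyFile *f) {
    if (f->filled)
        return;
    f->fetches++;
    if (*f->src == '\0')
        f->at_eof = 1;
    else
        f->window = *f->src++;
    f->filled = 1;
}

/* eof(f): needs lookahead, so it forces the pending fetch. */
static int lazy_eof(LazyFile *f) { lazy_fill(f); return f->at_eof; }

/* f^: forces the pending fetch on first reference. */
static char lazy_peek(LazyFile *f) { lazy_fill(f); return f->window; }

/* get(f): consumes the current item but defers fetching the next one. */
static void lazy_get(LazyFile *f) { lazy_fill(f); f->filled = 0; }
```

Opening the file and calling lazy_get do not touch the source by
themselves; only lazy_peek and lazy_eof do, which is what lets a prompt
be written before the program blocks waiting for input.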

Having an "unread" operation would be a much better way of dealing with
lookahead, because then you get lookahead only when you want it. Maybe this
could be added to the rumoured new Pascal standard?

Erkki Ruohtula    ! Nokia Telecommunications
eru@tele.nokia.fi ! P.O. Box 33 SF-02601 Espoo, Finland

lmiller@venera.isi.edu (Larry Miller) (07/27/89)

In article <2929@virginia.acc.virginia.edu> scl@virginia.acc.Virginia.EDU (Steve Losen) writes:
>In article <4623@freja.diku.dk> dat0@rimfaxe.diku.dk (Dat-0 undervisningsassistent) writes:
>>schwartz@shire.cs.psu.edu (Scott Schwartz) writes:
>>>Alastair Milne writes:
>>>| >	while (input^ in [space, tab, newline, cr, formfeed]) do
>>>| >		get (input)
>>>| 
>>>|     How does this constitute not committing to read the file variable?  You are
>>>|     doing exactly that when you dereference the file variable to get its
>>>|     window.
>
>Here's a good example of how convenient the file buffer variable is:
>
>Suppose you want to allow the user to input a bunch of integers, separated
>by any sequence of non-numeric characters.  For example, the following
>atrocity is legal input:
>
>,,100 ,20,
>,
>
>   30
>40,, 5050
>,,,
>
>The program should process the integers 100, 20, 30, 40, and 5050.
>Here is a trivial piece of Standard Pascal that does the job correctly.
>
>while not eof do begin
>	if (input^ >= '0') and (input^ <= '9') then begin
>		read(number);
>		process(number);
>	end
>	else
>		get(input);
>end
>
>I admit that for most programming languages it is possible to come up
>with simple and elegant solutions to selected bizarre problems.  Too
>bad there isn't one language that is universally applicable.

I've been of the school that Pascal is ugly in this sense: it seems that
the language is lacking because you don't say what you want to do.  Here is
the same code in C, more or less

	#include <stdio.h>
	#include <ctype.h>

	int main(void)
	{
	  int	c = 0;
	  int	val;

	  while (1)
	  {
	    while (!isdigit(c = getchar()))
	      if (c == EOF)
		return 0;
	    ungetc(c, stdin);		/* read one character too many */
	    scanf("%d", &val);
	    printf("%d\n", val);
	  }
	}

In C, we can't surreptitiously read a character (Pascal: input^).  We have
to really and truly read it using getchar().  Once we discover that we've
read one too many characters, we "push it back" into the input stream:
ungetc(c, stdin).  Of course, the input stream must be buffered.

The advantage of C's method is that we don't get obscure behavior: if we
want to read a character, we read it (with getchar).  If we want to "unread"
it, we call ungetc.
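
The read-then-push-back pair can also be wrapped up so that it behaves
much like a look at input^.  A minimal sketch, assuming only the one
character of pushback that ungetc guarantees; peekc is a name invented
here, not a standard library function.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: return the next character of fp without
   consuming it, or EOF at end of file. */
static int peekc(FILE *fp) {
    int c;
    assert(fp != NULL);
    c = getc(fp);          /* really read the character ...      */
    if (c != EOF)
        ungetc(c, fp);     /* ... and immediately push it back   */
    return c;
}
```

Repeated calls to peekc return the same character until something
actually reads it with getc, scanf, or similar.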

Larry Miller				lmiller@venera.isi.edu (no uucp)
USC/ISI					213-822-1511
4676 Admiralty Way
Marina del Rey, CA. 90292

diamond@csl.sony.co.jp (Norman Diamond) (07/27/89)

In article <SCHWARTZ.89Jul24202811@shire.cs.psu.edu> schwartz@shire.cs.psu.edu (Scott Schwartz) writes:

>Well, actually the above should have been written as
>	while (not eof (input)) and (input^ in [...]) do
>		get (input)

In ISO Extended Pascal,
	while (not eof (input)) and_then (input^ in [...]) do
		get (input)

I don't like the underscores in the new keywords and_then and or_else
but at least these operators are available.
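
The guarantee in question is the one C's && already gives: the right
operand is not evaluated when the left one is false.  A rough sketch,
with a string standing in for the file; Window and the win_* names are
invented for illustration, and the asserts model "it is an error to
touch the window at end of file".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct { const char *pos; } Window;  /* hypothetical string-backed file */

/* eof(input) */
static int win_eof(const Window *w) {
    assert(w != NULL && w->pos != NULL);
    return *w->pos == '\0';
}

/* input^ : only meaningful while not at end of file. */
static char win_peek(const Window *w) {
    assert(!win_eof(w));
    return *w->pos;
}

/* get(input) */
static void win_get(Window *w) {
    assert(!win_eof(w));
    w->pos++;
}

static void skip_white(Window *w) {
    /* The window is inspected only after the eof test has passed. */
    while (!win_eof(w) && strchr(" \t\n\r\f", win_peek(w)) != NULL)
        win_get(w);
}
```

On input consisting only of whitespace the loop stops cleanly at end of
file without ever peeking past it.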

--
-- 
Norman Diamond, Sony Computer Science Lab (diamond%csl.sony.jp@relay.cs.net)
  The above opinions are inherited by your machine's init process (pid 1),
  after being disowned and orphaned.  However, if you see this at Waterloo or
  Anterior, then their administrators must have approved of these opinions.

diamond@csl.sony.co.jp (Norman Diamond) (07/27/89)

In article <437@mjolner.tele.nokia.fi> eru@tnvsu1.UUCP (Erkki Ruohtula) writes:

>Having an "unread" operation would be a much better way of dealing with
>lookahead, because then you get lookahead only when you want it. Maybe this
>could be added to the rumoured new Pascal standard?

I would not object to "unread," but the new standard is about 6 years
past the rumour stage.  Sorry to say that it's far too late.

Every Pascal compiler that I've used implements lazy I/O.  Many of these
are broken in other respects, and many of them remain blissfully unaware
that there was a previous standard 10 years ago, but they implement the
de-facto standard of lazy I/O.

--
-- 
Norman Diamond, Sony Computer Science Lab (diamond%csl.sony.jp@relay.cs.net)
  The above opinions are inherited by your machine's init process (pid 1),
  after being disowned and orphaned.  However, if you see this at Waterloo or
  Anterior, then their administrators must have approved of these opinions.

abcscnuk@csuna.csun.edu (Naoto Kimura) (07/29/89)

In article <2929@virginia.acc.virginia.edu> scl@virginia.acc.Virginia.EDU (Steve Losen) writes:
> ... (text deleted) ...
>while not eof do begin
>	if (input^ >= '0') and (input^ <= '9') then begin
>		read(number);
>		process(number);
>	end
>	else
>		get(input);
>end

You shouldn't inspect the file buffer before checking for eof.  It is
an error to access the file buffer if you are at the end of file.

procedure SkipJunk( var f : text;  Legal : set of char );
    var
	done : boolean;
    begin
	if not eof(f) then
	    repeat
		if f^ in Legal then
		    done := true
		else begin
		    get(f);
		    done := eof(f)
		  end
	    until done
    end;

Also, you can't inspect the file buffer to see if you're at the end of
a line.  According to the standard, you'll end up seeing a space.  The end
of line marker is just that - a marker and not necessarily a character.
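
To illustrate the point in C terms: in the sketch below the window
reads as a space at an end-of-line marker, and only a separate eoln
test can tell the two apart.  TextFile and the text_* names are
invented here, and '\n' in a string is merely this sketch's way of
representing the marker.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { const char *pos; } TextFile;  /* hypothetical; '\n' is the line marker */

/* eof(f) */
static int text_eof(const TextFile *f) {
    assert(f != NULL && f->pos != NULL);
    return *f->pos == '\0';
}

/* eoln(f) : the only way to detect the end-of-line marker. */
static int text_eoln(const TextFile *f) {
    assert(!text_eof(f));
    return *f->pos == '\n';
}

/* f^ : at an end-of-line marker the window reads as a space. */
static char text_peek(const TextFile *f) {
    return text_eoln(f) ? ' ' : *f->pos;
}

/* get(f) */
static void text_get(TextFile *f) {
    assert(!text_eof(f));
    f->pos++;
}
```

For the text "a \nb" the second and third positions both peek as a
space; only text_eoln distinguishes the real space from the line end.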

Interactive I/O in Pascal is a pain in the neck because Pascal wasn't
designed to be interactive.  On some implementations, I've encountered
problems if too many calls were made to eof or eoln when doing
interactive I/O (it would lose characters).  I've had to do silly things
like force flushing of the buffer by outputting enough null characters
to fill the output buffer in some compilers.

>I admit that for most programming languages it is possible to come up
>with simple and elegant solutions to selected bizarre problems.  Too
>bad there isn't one language that is universally applicable.
>-- 
>Steve Losen     scl@virginia.edu
>University of Virginia Academic Computing Center

                //-n-\\			 Naoto Kimura
        _____---=======---_____		 (abcscnuk@csuna.csun.edu)
    ====____\   /.. ..\   /____====
  //         ---\__O__/---         \\	Enterprise... Surrender or we'll
  \_\                             /_/	send back your *&^$% tribbles !!

filbo@gorn.santa-cruz.ca.us (Bela Lubkin) (07/30/89)

In article <2929@virginia.acc.virginia.edu> scl@virginia.acc.Virginia.EDU
(Steve Losen) writes:
>Here's a good example of how convenient the file buffer variable is:
>
>Suppose you want to allow the user to input a bunch of integers, separated
>by any sequence of non-numeric characters.  For example, the following
>atrocity is legal input:
>
>,,100 ,20,
>,
>
>   30
>40,, 5050
>,,,
>
>The program should process the integers 100, 20, 30, 40, and 5050.
>Here is a trivial piece of Standard Pascal that does the job correctly.
>
>while not eof do begin
>	if (input^ >= '0') and (input^ <= '9') then begin
>		read(number);
>		process(number);
>	end
>	else
>		get(input);
>end
>
>I admit that for most programming languages it is possible to come up
>with simple and elegant solutions to selected bizarre problems.  Too
>bad there isn't one language that is universally applicable.

My experience with "Standard Pascal" is pretty ancient, but I think that
at least in this case you are fooling yourself.  The file buffer may be
"convenient", but you're not actually using it -- you commit to reading
the data whether or not it's a digit.  A >simpler< and >more< elegant
piece of Standard Pascal code that does the same thing is:

while not eof do begin
        read(number);
        if (number >= '0') and (number <= '9') then process(number);
           { else don't do anything, thus disposing of the invalid char }
end

In >most< situations where you might use the file buffer, it's not difficult
to store away the item in your own one-item buffer; in this case, not even
that is necessary.  There are also cases where the file buffer is really a
Good Thing; I'm not trying to excuse Turbo Pascal's unfortunate omission.
In general I think that there are very few places where a file buffer would
be more useful than true random access as in TP, and if TP wasn't
pretending to be Pascal I wouldn't mind the omission.  As it stands, it's
an annoying portability issue.

>Steve Losen     scl@virginia.edu
>University of Virginia Academic Computing Center

Bela Lubkin     * *   filbo@gorn.santa-cruz.ca.us    CIS: 73047,1112
     @        * *     ...ucbvax!ucscc!gorn!filbo     ^^^the slowest route
R Pentomino     *     Filbo @ Pyrzqxgl (408) 476-4633 & XBBS (408) 476-4945

schwartz@shire.cs.psu.edu (Scott Schwartz) (07/30/89)

In article <1.filbo@gorn.santa-cruz.ca.us>  Bela Lubkin writes:
+-----
|   A >simpler< and >more< elegant
|   piece of Standard Pascal code that does the same thing is:
|
|   while not eof do begin
|	   read(number);
|	   if (number >= '0') and (number <= '9') then process(number);
|	      { else don't do anything, thus disposing of the invalid char }
|   end
+-----

In the previous posting, Number was of type REAL (or INTEGER), not
CHAR.  Why write code to "process" it when the language will do it for
you?  That's what file windows buy you.  Your example is incomplete,
then, whereas the other posting contained complete code to do the
operation, in approximately the same number of lines.  A telling point
in favor of using language features to their fullest.




--
Scott Schwartz		<schwartz@shire.cs.psu.edu>

filbo@gorn.santa-cruz.ca.us (Bela Lubkin) (07/31/89)

In article <SCHWARTZ.89Jul30113356@shire.cs.psu.edu> schwartz@shire.cs.psu.edu (Scott Schwartz) writes:
>
>In article <1.filbo@gorn.santa-cruz.ca.us>  Bela Lubkin writes:
>+-----
>|   A >simpler< and >more< elegant
>|   piece of Standard Pascal code that does the same thing is:
>|
>|   while not eof do begin
>|	   read(number);
>|	   if (number >= '0') and (number <= '9') then process(number);
>|	      { else don't do anything, thus disposing of the invalid char }
>|   end
>+-----
>
>In the previous posting, Number was of type REAL (or INTEGER), not
>CHAR.  Why write code to "process" it when the language will do it for
>you?  That's what file windows buy you.  Your example is incomplete,
>then, whereas the other posting contained complete code to do the
>operation, in approximately the same number of lines.  A telling point
>in favor of using language features to their fullest.

My mistake; as I said, my Standard Pascal's a bit rusty.  I didn't catch the
implications of Input^ (type Char) vs. Number (type Real).  I had the
impression that the inputs were being put together by the procedure Process.

I disagree about "approximately the same number of lines": my version is
definitely shorter and would generate smaller, faster code; >IF<, that is,
it was solving the same problem.
>--
>Scott Schwartz		<schwartz@shire.cs.psu.edu>

Bela Lubkin     * *   filbo@gorn.santa-cruz.ca.us    CIS: 73047,1112
     @        * *     ...ucbvax!ucscc!gorn!filbo     ^^^the slowest route
R Pentomino     *     Filbo @ Pyrzqxgl (408) 476-4633 & XBBS (408) 476-4945

scl@virginia.acc.virginia.edu (Steve Losen) (07/31/89)

Sorry folks.  The example I posted earlier was incomplete and has
caused quite a bit of confusion.  Here is a complete Pascal program.
It looks for integers in a file and sums them.  It doesn't care
how many integers are on a line nor what separates them.

program sum(input, output);
var
	number, total : integer;
begin
	total := 0;
	while not eof do begin
		if (input^ >= '0') and (input^ <= '9') then begin
			read(number);
			total := total + number;
		end
		else
			get(input);
	end;
	writeln(total);
end.

Here is the sample input:

,,100 ,20,
,

   30
40,, 5050
,,,

The only point that I was trying to make was that Standard Pascal
handles this particular problem very nicely.  It certainly doesn't
do everything, however.

Furthermore, if a Pascal compiler either fails to compile this or
fails to print 5240 with the above input, then that compiler does
not conform to the ISO Standard.
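
For comparison, the same peek-then-decide loop can be sketched in C
over an in-memory string.  This is an illustration under stated
assumptions, not a translation of any compiler's read: sum_integers is
a name invented here, only unsigned decimal digits are recognised, and
overflow is not checked.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: add up every run of decimal digits in s. */
static long sum_integers(const char *s) {
    long total = 0;
    assert(s != NULL);
    while (*s != '\0') {                        /* while not eof          */
        if (isdigit((unsigned char)*s)) {       /* input^ in '0'..'9'     */
            long number = 0;
            while (isdigit((unsigned char)*s))  /* read(number)           */
                number = number * 10 + (*s++ - '0');
            total += number;
        } else {
            s++;                                /* get(input)             */
        }
    }
    return total;
}
```

Fed the sample input above as one string with embedded newlines, the
function returns 5240.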

Again, I apologize for any confusion that I have caused.
-- 
Steve Losen     scl@virginia.edu
University of Virginia Academic Computing Center