[net.lang.mod2] Modula-2 I/O

powell@decwrl.UUCP (Mike Powell) (09/06/84)

I would like to provoke a little discussion about the I/O facilities present in
(or absent from) Modula-2.  There seem to be 3 + N possible alternatives:
    1) Pascal-like I/O
    2) What Wirth has proposed (module InOut)
    3) No defined I/O
    N) Pick your favorite I/O mechanism

1)  Pascal I/O (especially the "I") has been widely discussed and criticized,
so I won't rehash it.  Using it would at least make Modula-2 more compatible
with Pascal, and would have guaranteed a facility no worse than Pascal's.

2)  What Wirth has proposed is not adequate even for many simple programs.
There is only one input stream and one output stream for formatted operations.
Although it is possible for people to define their own I/O modules as needed,
this degenerates into the next alternative.

3)  Not defining I/O even in the easy cases will make it impossible to write
portable Modula-2 programs.  No interesting textbooks will be written and the
language will not be taught on a large scale.  

N)  It seems to me that 2 and 3 are both worse than 1.  The question is, is
some N better than 1?  How about this:

Define a builtin I/O module patterned after Unix/C printf and scanf.  I believe
this same (or similar) I/O interface has been used in a variety of languages on
a variety of machines and systems, so it seems pretty portable.  Formatted
input and output are accomplished with procedures that accept a variable number
of parameters.  The first two parameters are always the file variable and a
format string.  Additional parameters may be specified, and they are matched
against the format string to see how to read or write the variables.  Although
in C the parameters are not checked (of course, C never checks parameters) and
are blindly used by the library at runtime, Modula-2 could check the parameters
against the format, and even generate calls to implementation-dependent library
routines to perform the I/O.
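
A rough sketch of the checking step described above, written in C for
illustration only.  It is not the DECWRL compiler's code: ArgKind, kind_for,
and check_format are hypothetical names, and only a handful of conversions
(%d, %f, %c, %s, %%) are handled.  It assumes the compiler can supply a kind
tag for each actual parameter.

```c
#include <stddef.h>

/* Hypothetical tags for what a compiler knows about each actual parameter. */
typedef enum { ARG_INT, ARG_REAL, ARG_CHAR, ARG_STRING } ArgKind;

/* Map a conversion letter to the kind it requires; returns 0 if unknown. */
int kind_for(char conv, ArgKind *out)
{
    switch (conv) {
    case 'd': *out = ARG_INT;    return 1;
    case 'f': *out = ARG_REAL;   return 1;
    case 'c': *out = ARG_CHAR;   return 1;
    case 's': *out = ARG_STRING; return 1;
    default:  return 0;
    }
}

/* Return 1 if the format and the argument kinds agree in number and kind. */
int check_format(const char *format, const ArgKind *args, size_t nargs)
{
    size_t used = 0;
    for (const char *p = format; *p != '\0'; p++) {
        if (*p != '%')
            continue;
        p++;
        if (*p == '%')              /* "%%" is a literal percent sign */
            continue;
        /* Skip a simple flag, width, and precision such as "-8" or "6.2". */
        while (*p == '-' || *p == '.' || (*p >= '0' && *p <= '9'))
            p++;
        ArgKind want;
        if (*p == '\0' || !kind_for(*p, &want))
            return 0;               /* truncated or unknown conversion */
        if (used >= nargs || args[used] != want)
            return 0;               /* too few arguments, or wrong kind */
        used++;
    }
    return used == nargs;           /* leftover arguments are also an error */
}
```

A compiler would run a check like this once per call site, so a mismatch
between the format and the parameters is reported before the program runs.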

I liked this idea so much I implemented it in my compiler, and I include the
pseudo-definition module (since it must be built into the compiler to handle
the variable-length parameter lists) below.  I'm interested in reactions or
suggestions for other alternatives.  I would also like to know what other
operations you think are essential for a "standard" Modula-2 I/O library.

					Michael L. Powell
					Digital Equipment Corporation
					Western Research Laboratory
					Los Altos, CA  94022
					{decvax,ucbvax}!decwrl!powell
					powell@decwrl

Appendix:  io.def

This module should not be compiled, but is supplied for documentation purposes.

definition module io;
export qualified
    File, input, output, terminal, Open, Close,
    Readf, Writef, SReadf, SWritef,
    Readc, Writec, Readb, Writeb;

type File = pointer to FileRec;		(* Open file variable type *)

var
    (* Standard files connected to Unix standard input, output, and error *)
    input, output, terminal : File;

procedure Open(name : array of Char; mode : array of Char) : File;
    (* open a file *)
    (* name : file name; mode = "r" for input, "w" for output *)
    (* return value : opened file or nil *)

procedure Close(f : File);
    (* close a file *)

procedure Readf(f : File; (* constant *) format : array of Char;
		var arg1 : ArgType1; var arg2 : ArgType2; ...) : integer;
    (* read a list of values from a file according to a format string *)
    (* f : an open file; format : constant string format (like Unix scanf) *)
    (* argn : variable for corresponding format item, type must match *)
    (* return value : number of values read,  < 0 for end of file *)

procedure Writef(f : File; (* constant *) format : array of Char;
		arg1 : ArgType1; arg2 : ArgType2; ...);
    (* write a list of values to a file according to a format string *)
    (* f : an open file; format : constant string format (like Unix printf) *)
    (* argn : value for corresponding format item, type must match *)

procedure Readc(f : File; var c : Char) : integer;
    (* read the next character from the file *)
    (* f : an open file; c : variable to read next char into *)
    (* return value : >= 0 if read OK, < 0 if end of file *)

procedure Writec(f : File; c : Char);
    (* write a character to a file *)
    (* f : an open file; c : value for next char to write *)

procedure SReadf(s : array of Char; format : (* constant *) array of Char;
		var arg1 : ArgType1; var arg2 : ArgType2; ...) : integer;
    (* read a list of values from a string according to a format string *)
    (* s : a string; format : constant string format (like Unix scanf) *)
    (* argn : variable for corresponding format item, type must match *)
    (* return value : number of values read *)

procedure SWritef(var s : array of Char; format : (* constant *) array of Char;
		arg1 : ArgType1; arg2 : ArgType2; ...);
    (* write a list of values to a string according to a format string *)
    (* s : a string; format : constant string format (like Unix printf) *)
    (* argn : value for corresponding format item, type must match *)

procedure Readb(f : File; var buff : array of byte; size : cardinal) : integer;
    (* read binary data from a file *)
    (* f : an open file; buff : variable to read into *)
    (* size : number of bytes to read *)
    (* return value : if read OK, = number of bytes read, < 0 if end of file *)

procedure Writeb(f : File; buff : array of byte; size : integer);
    (* write binary data to a file *)
    (* f : an open file; buff : variable to write *)
    (* size : number of bytes to write *)

end io.

cca@pur-phy.UUCP (Charles C. Allen) (09/10/84)

Here are a couple of things I would like to see in ANY implementation of a
Modula-2 I/O library:

    *	The ability to have an 8-bit CHAR.  The range of type CHAR could be
	extended to 377C, or CHAR could be made a subrange of a new type
	that goes to 377C.

    *	I/O for enumerated types.  The usefulness of enumerated types is
	drastically reduced if you can't do I/O with them.

I'd be interested in hearing whether existing compilers have these
capabilities.

Charlie Allen
UUCP:		pur-ee!Physics:cca, purdue!Physics:cca
INTERNET:	cca@pur-phy.UUCP

merlin@emory.UUCP (09/10/84)

Michael Powell has raised some important questions in his discussions
of Modula-2 I/O.  I agree with the thrust of his comments but am
opposed to some of his specific remedies.

I strongly support the need for a standard I/O module.  It is
essential for the study and promotion of the language.  Such a
standard should meet several criteria.  It must support a
comprehensive set of input/output capabilities, including, at a
minimum, file and terminal oriented operations.  A standard should
define unambiguously the results of both successful and unsuccessful
I/O requests.  Reliable error-detection and error-recovery must be
supported in all I/O capabilities.

I feel that Wirth's "InOut.def" is an example of a reasonable
approach to standardization.  It encompasses most of the I/O
constructs that conventional applications require.  It needs a
considerable amount of refinement in order to qualify as a practical
standard, since there are many ambiguous or ill-defined components.

I am not championing "InOut.def" as the Modula-2 I/O standard.  It
clearly is a contender and, unless someone can demonstrate how it is
woefully inadequate on *functional* grounds, it appears to me to be
the appropriate starting point.

I must admit a strong aversion to Modula-2 clones of the UNIX
standard "printf" and "scanf".  My grounds for objection are
manifold.

In a fundamental way this approach opposes much of what Modula-2 is
intended to accomplish.  If the language is to enforce strong
type-checking then procedure calls with "cross-coupled" parameter
lists must be validated during some phase of compilation.  The
compiler must recognize these specially designated procedures as well
as their *internal* syntactical structures.  These procedures become
aspects of the language itself.  It is not inconceivable to me that we
may wish to consider language-defined I/O mechanisms, but I strongly
suggest that this be evaluated as a language design consideration and
not as an ad hoc method of introducing an I/O standard.

There are other serious problems created by this formulation.  The
code generated for a program requiring, for example, only simple
integer I/O may incorporate facilities for many unused I/O
conversions.  To the extent that the format string is itself defined
only during program execution, all possible I/O conversions must be
made available at run-time.  The inability of the programmer to
specify only those functions requisite for her purposes may hamper
development of a variety of systems.

Moving on to other concerns.  Format-string directed I/O tends to
create a need for explicit control-character representation (as with
C's backslash notation).  The problem is that this couples the logical
functioning of input-output procedures with specific characters in
data strings.  Representations of record, line or file terminators
should be details hidden beneath appropriate procedure calls such as
Wirth's "WriteLn".  This allows a much more effective way to adapt to
variations in system-specific control-character conventions.

My other objections are more matters of personal taste.  These
functions exemplify an extreme control-coupled approach to argument
transmission that transforms procedure calls into veritable
input-output *vegematics*.  The odd thing is that, apart from
familiarity, there is very little to recommend them.  Granted, they do
capture an image of the structure of the input or output in a compact
form, but only at a substantial cost.

We need to act now to establish a simple reliable standard.  It does
not have to do everything.  It is important to remember that the
language is intentionally *modular*.  New solutions can be
incorporated at a later time as long as we refrain from incorporating
I/O support in the language definition through elaborate compiler
enhancements.

Marc Merlin 
Emory University
Atlanta, GA
(akgau!emory!merlin)

sjc@angband.UUCP (Steve Correll) (09/23/84)

I like Mike Powell's idea of adapting Unix "printf" for use with Modula-2.
Despite its shortcomings, I like the combination of convenience and
(even more important) readability which "printf" provides.

Perhaps the objection to variable-length argument lists could be
answered by taking a hint from the Unix "execv" system call and making
"printf" take exactly two arguments, the first of them being a format
string and the second being an open array of addresses of the desired
data items.  (The Unix "argv" argument employs a similar notion.)

As in the Unix implementation, the user would be expected to terminate
the list of addresses with a sentinel like NIL so that the callee knows
where to stop; although this is unnecessary in "printf", where the
format string dictates the number of items required, the redundancy
would allow runtime checking of the number of items. (Note that relying
on the Modula-2 "high" mechanism rather than a sentinel would force the
caller to declare an array of exactly the right size for each call.)

Admittedly, this is a bit less convenient than the C language version,
particularly when the caller must store constants into variables so
as to obtain addresses for them. But I rarely find I want to pass
constants to "printf".
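
The two-argument form might look roughly like the following C sketch.  The
name printf2 and its restriction to %d, %s, and %% are assumptions made for
illustration; the point is only the NULL sentinel and the count check that
the redundancy makes possible.

```c
#include <stdio.h>

/* Hypothetical two-argument printf.  items is a NULL-terminated array of
   addresses: an int * for each %d and a char * for each %s.  Returns the
   number of characters written to stdout, or -1 (writing nothing) if the
   format is malformed or the item count disagrees with it. */
int printf2(const char *format, const void *const items[])
{
    size_t wanted = 0, supplied = 0;

    /* First pass: count conversions so a mismatch is caught before output. */
    for (const char *p = format; *p != '\0'; p++) {
        if (*p != '%')
            continue;
        p++;
        if (*p == 'd' || *p == 's')
            wanted++;
        else if (*p != '%')
            return -1;              /* unknown or truncated conversion */
    }

    /* The sentinel tells us how many addresses the caller really passed. */
    while (items[supplied] != NULL)
        supplied++;
    if (supplied != wanted)
        return -1;

    /* Second pass: emit text, fetching each value through its address. */
    int written = 0;
    size_t next = 0;
    for (const char *p = format; *p != '\0'; p++) {
        if (*p != '%') {
            putchar(*p);
            written++;
            continue;
        }
        p++;
        if (*p == '%') {
            putchar('%');
            written++;
        } else if (*p == 'd') {
            written += printf("%d", *(const int *)items[next++]);
        } else {
            written += printf("%s", (const char *)items[next++]);
        }
    }
    return written;
}
```

Note that the integer has to live in a variable so that its address can go
into the array, which is the inconvenience with constants mentioned above.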

As for the objection that "\n" in the C language poses portability
problems in comparison with "writeln" in Pascal, perhaps the Modula-2
"printf" should regard "\n" as meaning exactly what "writeln" does:
generate a new line in whatever manner is appropriate to the underlying
file system.

While the "\nnn" notation (where n is an octal digit) for specifying an
unprintable character is part of the C language rather than the Unix
"printf", I'd like the Modula-2 "printf" to retain it, since it seems
consistent with the spirit of the Modula-2 "nnC" notation.


                                                           --Steve Correll
sjc@s1-c.ARPA, ...!decvax!decwrl!mordor!sjc, or ...!ucbvax!dual!mordor!sjc

russell@muddcs.UUCP (Russell Shilling) (09/27/84)

> I like Mike Powell's idea of adapting Unix "printf" for use with Modula-2.
> Despite its shortcomings, I like the combination of convenience and
> (even more important) readability which "printf" provides.

Quick note:
DECWRL Modula has readf, writef, fprintf, ...
VERY nice to use.  But the standard Modula InOut is not there,
so there would be some porta/compati-bility problems here.

rcd@opus.UUCP (Dick Dunn) (09/28/84)

A couple of comments on printf (or something like it) for Modula-2:

In some sense, building a "real" formatting and I/O system for a language
(if it's not built into the language) is a test of the power of the
language.  If you can't do it because the language won't let you express it
in a reasonable fashion, that's a defect in the language that needs to be
fixed.  (I'm not saying that Modula-2 is deficient, only that you can apply
this test.)

However, keep in mind that the idea of a format string (as in C or FORTRAN)
which is bound to the I/O list is NOT the only way to do it; in fact it's
probably a poor way to go.  It's very seldom that you need a variable
format string and even less common to need the dynamic binding of I/O list
elements to format elements.  Pascal's approach shows a hint of a better
way to go, although it comes out as a somewhat clunky special case and it
has to be implemented specially by the compiler.  ("Write" in Pascal is NOT
a procedure, no matter what you say:-)  Anyway, if you bind format elements
to (input) variables or (output) expressions at compilation time, you can
get much-needed type checking.  A general mechanism here would have much
wider applicability than just I/O.  In fact, the main problem isn't really
an I/O problem anyway--it has to do with control over the formatting that
happens before the I/O.
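
One way to read the compile-time binding idea is sketched below in C, using
the C11 _Generic selection (which postdates this discussion) to pick a
conversion routine from the static type of each expression.  PUT, put_int,
put_real, and put_text are hypothetical names, and the fixed two-decimal
format for reals is an arbitrary choice.

```c
#include <stdio.h>

/* One routine per type, each taking a field width, in the spirit of
   Pascal's write(x:8).  Each returns the number of characters written. */
int put_int(FILE *f, int value, int width)
{
    return fprintf(f, "%*d", width, value);
}

int put_real(FILE *f, double value, int width)
{
    return fprintf(f, "%*.2f", width, value);
}

int put_text(FILE *f, const char *value, int width)
{
    return fprintf(f, "%*s", width, value);
}

/* The routine is chosen at compile time from the type of x.  A type with
   no entry here (a struct, say) is a compile-time error, not a run-time
   surprise. */
#define PUT(f, x, width) _Generic((x), \
        int:          put_int,         \
        double:       put_real,        \
        char *:       put_text,        \
        const char *: put_text)(f, x, width)
```

With this, PUT(stdout, 42, 6) and PUT(stdout, "ok", 4) each bind their
formatting routine at compile time, and no format string is parsed at run
time.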
-- 
Dick Dunn	{hao,ucbvax,allegra}!nbires!rcd		(303)444-5710 x3086
   ...Cerebus for dictator!

chris@umcp-cs.UUCP (Chris Torek) (09/28/84)

Also, have a care for ``internal I/O'' (stuff like F66 ENCODE, DECODE
or F77's array READ and WRITE statements).  One of the most annoying
things about Pascal is that it is impossible to write a version of
WRITE that doesn't actually go to a FILE or OUTPUT.  (Yes I know it
*can* be done, it's just extremely painful:  `wrint' `wrbool' ....)

(Give me sprintf, any day!)  (Oops, sorry, this isn't net.lang.c :-) )
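
For readers who have not used it, internal I/O in the sprintf style looks
roughly like this in C.  encode_point and decode_point are made-up names,
and snprintf is the later bounds-checked variant of sprintf; the sketch only
shows formatting to and from a character array instead of a file.

```c
#include <stdio.h>

/* Format a point into a caller-supplied buffer.
   Returns 0 on success, -1 if the text did not fit. */
int encode_point(char *buf, size_t cap, int x, int y)
{
    int n = snprintf(buf, cap, "(%d,%d)", x, y);
    return (n >= 0 && (size_t)n < cap) ? 0 : -1;
}

/* Parse the same text back out of a string.
   Returns 0 on success, -1 if the input is malformed. */
int decode_point(const char *text, int *x, int *y)
{
    return sscanf(text, "(%d,%d)", x, y) == 2 ? 0 : -1;
}
```

The same format strings serve for files and for strings, which is what the
SReadf and SWritef procedures in the proposed module are after.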
-- 
(This page accidently left blank.)

In-Real-Life: Chris Torek, Univ of MD Comp Sci (301) 454-7690
UUCP:	{seismo,allegra,brl-bmd}!umcp-cs!chris
CSNet:	chris@umcp-cs		ARPA:	chris@maryland