[net.lang.ada] Text_io considered error prone?

Bakin@MIT-MULTICS.ARPA ("David S. Bakin") (09/11/85)

Dear Ada users:

I said last week I would mention this week the error-prone construct
of text_io.page.  Well, I was going to, but after playing around with
text_io some more I decided I would just call text_io itself an
error-prone construct. 

My discussion will take the form of a challenge:

Using only the I/O described in Chapter 14 of the Ada LRM provide a
program which accurately copies an arbitrary text file.  (To make it
easy, let's assume this arbitrary text file contains only the
characters in the 128 element ASCII set.)  For extra credit, make sure
it is an efficient program. 

[Now, part of the problem may be that I am using Dec Ada on VMS, which
has a record-oriented file system.  You may be better off if you are
on Unix or MS-DOS and can use sequential I/O instantiated on a
character. Still you should try it and make sure sequential I/O works
for your system, then try it using text_io alone.  Furthermore, I
arbitrarily rule that your use of sequential I/O instantiated on a
character gets no points for the extra credit:  efficiency.]

[OK, so let's assume you're using Dec Ada on VMS, or that you're going
to use text_io anyway:] 

Try your program on the following three test cases:

1.  A file which ends in a non-blank line.
2.  A file which ends in an explicit page-terminator (for Dec Ada,
    the last line in the file contains one character, an ASCII form
    feed).
3.  A file which ends in a blank line followed by an explicit page-
    terminator (for Dec Ada, the next to last line in the file has no
    characters, the last line has only an ASCII form feed).
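
For concreteness, a naive first attempt that one might run against
these three cases could look like the sketch below.  It is only a
sketch, not a solution to the challenge:  it copies current input to
current output, the 512-character buffer is an arbitrary assumption,
and how page terminators are represented and detected is up to the
implementation.

```ada
with Text_IO;

--  Naive copy of current input to current output using only Text_IO.
procedure Copy_Text is
   Line      : String (1 .. 512);   -- buffer size is an arbitrary assumption
   Last      : Natural;
   Last_Page : Text_IO.Positive_Count :=
                 Text_IO.Page (Text_IO.Current_Input);
begin
   while not Text_IO.End_Of_File loop
      Text_IO.Get_Line (Line, Last);
      Text_IO.Put (Line (1 .. Last));

      --  Last < Line'Last means Get_Line stopped at a line terminator
      --  (and skipped it) rather than because the buffer filled up.
      if Last < Line'Last then
         Text_IO.New_Line;

         --  Skipping a line terminator also skips a page terminator
         --  that immediately follows it, so a page break is visible
         --  only as a change in the input page number.
         if Text_IO.Page (Text_IO.Current_Input) /= Last_Page then
            Text_IO.New_Page;
            Last_Page := Text_IO.Page (Text_IO.Current_Input);
         end if;
      end if;
   end loop;
end Copy_Text;
```

As I read Chapter 14, End_Of_File is also true when only a line
terminator, a page terminator, and a file terminator remain, so the
loop body never sees a final empty line.  That alone is enough to make
a sketch like this suspect on the test cases above.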

In case you think I'm just causing trouble:  I haven't been able to
write a program using text_io that correctly handles the above three
cases.  Sequential_IO instantiated for character doesn't work since
for Dec Ada each sequential_io.read reads a record of the file. (A
reasonable interpretation of sequential_io.read for a record oriented
system.) 
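
The character-at-a-time approach mentioned above can be sketched as
follows.  This is a minimal sketch, assuming a byte-stream file system
of the Unix or MS-DOS kind; the file names are hypothetical, and on a
record-oriented system each Read may deliver a record rather than a
single character, as described above.

```ada
with Sequential_IO;

--  Copy a file one Character at a time using Sequential_IO.
procedure Copy_Chars is
   package Char_IO is new Sequential_IO (Character);

   Source : Char_IO.File_Type;
   Target : Char_IO.File_Type;
   C      : Character;
begin
   --  "input.txt" and "output.txt" are hypothetical names.
   Char_IO.Open   (Source, Char_IO.In_File,  "input.txt");
   Char_IO.Create (Target, Char_IO.Out_File, "output.txt");

   while not Char_IO.End_Of_File (Source) loop
      Char_IO.Read  (Source, C);
      Char_IO.Write (Target, C);
   end loop;

   Char_IO.Close (Source);
   Char_IO.Close (Target);
end Copy_Chars;
```

Note that nothing here goes through Text_IO, so none of its line or
page bookkeeping is available to the program.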

By the way, because of the way unconstrained records work in Ada you
can't instantiate Sequential_IO with an unconstrained type for
element_type (like String) and expect to READ a variable length file. 
[Even if your compiler lets you instantiate Sequential_IO with an
unconstrained type for element_type.]  (Someone please correct me if I
am wrong here.) 

I ASSERT that Ada I/O should enable you to write a program that can
copy any file created with Ada I/O.  This assertion is to my mind a
reasonable inference from the first two paragraphs of the LRM
Introduction where it is asserted that Ada is to be useful for systems
programming.  OK, I want to write a text editor, or a compiler.

(Note that this is a rather weak requirement in real life, since it
doesn't even enable you to write a program which can copy any text
file on your system (much less binary files), just those which are
written by an Ada program.)  (Note that this was once a big deal for
things like Fortran compilers on IBM mainframes:  Could you write a
file using Fortran I/O which the compiler could read?  For all I know 
it is still a big deal for things like Fortran compilers.)

[Note:  If you are using a compiler which doesn't implement Ada I/O as
described in the LRM you get 0 points.  If you don't have access to
any Ada compiler, you may not be able to do this problem at all, but
if you do, make sure you are reading the LRM very very carefully.] 

-- Dave  (Bakin @ mit-multics.arpa)

[Disclaimer:  I work for Alsys, a company deeply involved with Ada,
and highly supportive of Ada as a programming language.  These
opinions are my own, not to be attributed to Alsys or any other person
at Alsys.  I too am highly supportive of Ada as a programming
language.  Nevertheless, I don't think it's contradictory or wrong to be
supportive of the Ada effort and yet believe that it isn't perfect.  I
also think it useful to discuss problems with the Ada language, in the
hope that it can be improved, or that my own misperceptions can be
corrected.  This disclaimer probably won't appear on future messages I
send ... I find these disclaimers obvious and unnecessary in general
... but I thought I should mention it at least once.] 

Evans@TL-20B.ARPA ("Art Evans") (09/11/85)

As I read Chapter 14, Bakin's challenge is unfair.  In 14.3(6) and (7),
there is a careful definition of what Ada means by "text file".  TEXT_IO
provides for I/O from such a file, and for nothing else.  The last
sentence of (7) says
    The effect of input or output of control characters (other than
    horizontal tabulation) is not defined by the language.
Thus it's not surprising that copying an arbitrary string of ASCII
characters is beyond the capabilities of TEXT_IO.

As for the three cases Bakin wants to test for, #2 and #3 refer to "an
ASCII form feed".  Since that's an ASCII control character, the sentence
I just quoted makes it clear that it cannot be dealt with.

I think Bakin's subject field for his message ("text_io considered error
prone?") is unfair to Ada, as his message points out deficiencies in
TEXT_IO but does not address the question of whether or not TEXT_IO is
error prone.  It's not surprising that TEXT_IO is "deficient", in the
sense that there are things one might want to do with arbitrary files of
ASCII characters that cannot be done with TEXT_IO.  So what?  TEXT_IO
provides useful functionality for an important subset of all possible
ASCII files, and I expect that expanding its capabilities would have
unacceptable costs.  Moreover, as Bakin points out, since Ada provides
other ways to accomplish what is beyond TEXT_IO, there is no loss of
functionality.

As an additional point: When Bakin said he was concerned with an
"arbitrary text file", I expected he would worry about bare CR or bare
LF (one of those characters other than in the CR-LF sequence).  But,
that was before I reread 14.3.  Such files are beyond many I/O systems.

Art Evans/Tartan Labs
-------

Bakin@MIT-MULTICS.ARPA ("David S. Bakin") (09/11/85)

A flippant response to Art's comment about the subject line "text_io
considered error prone" is that a construct that doesn't do what I expect
it to do when I use it is error prone for me ... and since I expect to
be able to use Ada text_io to read and write text files, and it doesn't
do a good job of that, it causes errors.

However, I will quickly concede that the thrust of my message is that
text_io is inadequate, as opposed to error prone, and maybe the discussion
can just go on from there.

By the way ... I don't understand Art's comment that I mentioned other
ways to process these files in Ada.  I mentioned using sequential_io
and commented that it didn't work on VAX/VMS.  I could further comment
that even on Unix and MS-DOS systems where you could read a file a
character at a time using sequential_io you don't have the facilities
of TEXTUAL IO -- namely, the ability to instantiate integer_io and the
rest of the textual IO modules.

Finally, even within the limitations of text_io as defined in the LRM
as Art pointed out, I really do find text_io error prone.  It took me
a long time to be able to write a file copy program to copy a text_io
file that included page terminators ... and it still doesn't handle the
difference between two output files, one which ends:

     text_io.new_line(f);
     text_io.close(f);

and one which ends:

     text_io.new_line(f);
     text_io.new_page(f);
     text_io.close(f);

I think those are two different files.  And VMS thinks they are two
different files ... they have different contents.  But when I read
those files with text_io I can't tell them apart.
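
A small experiment along these lines might be sketched as below.  It
writes one file with each ending, reads both back, and prints what
Text_IO reports.  The file names and the single line of text are
hypothetical, and no particular output is claimed, since the physical
file contents depend on the implementation.

```ada
with Text_IO;

procedure Two_Endings is

   --  Write one line, optionally call New_Page explicitly, then Close.
   procedure Write (Name : in String; Explicit_Page : in Boolean) is
      F : Text_IO.File_Type;
   begin
      Text_IO.Create (F, Text_IO.Out_File, Name);
      Text_IO.Put (F, "some text");
      Text_IO.New_Line (F);
      if Explicit_Page then
         Text_IO.New_Page (F);
      end if;
      Text_IO.Close (F);
   end Write;

   --  Read the file back, counting lines, and print the line count
   --  together with the page number Text_IO reports at end of file.
   procedure Report (Name : in String) is
      F     : Text_IO.File_Type;
      Lines : Natural := 0;
   begin
      Text_IO.Open (F, Text_IO.In_File, Name);
      while not Text_IO.End_Of_File (F) loop
         Text_IO.Skip_Line (F);
         Lines := Lines + 1;
      end loop;
      Text_IO.Put_Line
        (Name & ":" & Natural'Image (Lines) & " line(s), page"
         & Text_IO.Positive_Count'Image (Text_IO.Page (F)));
      Text_IO.Close (F);
   end Report;

begin
   Write ("implicit.txt", Explicit_Page => False);
   Write ("explicit.txt", Explicit_Page => True);
   Report ("implicit.txt");
   Report ("explicit.txt");
end Two_Endings;
```

As I read Chapter 14, Close on an output file acts like New_Page unless
the current page is already terminated, so in the LRM's model both
writers end with the same sequence:  line terminator, page terminator,
file terminator.  That would be consistent with the reader seeing no
difference even where the bytes on disk differ.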

I hope this discussion will continue under two different subject lines:

1)  text_io considered error prone?

    Use this subject line for comments related to the error-prone-ness
    of text_io WITHIN the restrictions of text_io files as defined in the
    LRM.

2)  text_io considered inadequate?

    Use this subject line for comments related to what you'd really
    like to see in an Ada I/O facility for text_io'ish files.  This
    is a reasonable point for discussion since it is clear (I hope)
    that text_io really is inadequate when you're trying to write
    system programming tools on current operating systems -- one of
    the stated goals of the Ada programming language.

For my own curiosity, are there still implementations of Ada extant
that don't use 'real' text_io as defined in the LRM?   And, are there
implementations of Ada that, somehow, go beyond text_io and the LRM
in defining I/O?  (For example, Dec Ada has a set of "mixed" I/O
packages that might be worth discussing.)

-- Dave  (Bakin -at mit-multics)

KFL@MIT-MC.ARPA (Keith F. Lynch) (09/12/85)

    Date: Tue 10 Sep 85 22:34:22-EDT
    From: "Art Evans" <Evans@TL-20B.ARPA>

    As I read Chapter 14, Bakin's challenge is unfair.  In 14.3(6) and (7),
    there is a careful definition of what Ada means by "text file".  TEXT_IO
    provides for I/O from such a file, and for nothing else.  The last
    sentence of (7) says
        The effect of input or output of control characters (other than
        horizontal tabulation) is not defined by the language.
    Thus it's not surprising that copying an arbitrary string of ASCII
    characters is beyond the capabilities of TEXT_IO.

 I think that what 14.3(7) means is that the effect on the output device
is not defined.  For instance some control character may make the output
device go into graphics mode, and that would not be a violation of the Ada
specs.  The context can be judged by 2.1, which defines the Ada character
set as consisting of the letters, numerals, several special symbols, the
space, and the control characters HT, VT, CR, LF, and FF.  These characters
all have meaning to Ada.  All other ASCII characters are merely arbitrary
bit patterns as far as Ada is concerned.  It does NOT follow from that that
Ada is unable to copy a file containing such characters.
								...Keith

hilfingr@UCBRENOIR (Paul Hilfinger) (09/12/85)

>    I think that what [14.3(7)] means is that the effect on the output
>    device is not defined. For instance some control character may
>    make the output device go into graphics mode, and that would not
>    be a violation of the Ada specs.... It does NOT follow from that
>    that Ada is unable to copy a file containing such characters.

I'm afraid not.  The powers that be interpret the final sentence of 14.3(7),

   "The effect of input or output of control characters (other
    than horizontal tabulation) is not defined by the language."

to mean that, for example, outputting ASCII.NUL via TEXT_IO can result in a
file that ends at the NUL, or can output an 'A' instead, or....

Paul Hilfinger

Bakin@MIT-MULTICS.ARPA ("David S. Bakin") (09/17/85)

[This missive contains a, well, let's call it a flame, about Ada
text_io.  If you don't want to hear about it, send mail to me now and
stop reading this message.  -- Dave]

I'm wondering a little about the silence now present on the subject
of text_io in particular and Ada I/O in general.  I thought I'd be
hearing a bit more about it.  If I'm the only one who thinks that
text_io is inadequate then at least send me some private mail saying
that you think I'm all wet and I'll shut up.  The responses to date
seem to be of the Henry Ford variety:  "Text_io does everything you
want it to do, as long as what you want to do can be done by text_io."

I still don't know the way to copy a varying length file with standard
Ada I/O.  In fact, I don't know how to read one:  Farokh Morshed told
me that the way to do it with sequential_io is to instantiate sequential_io
with an unconstrained array type, declare a large buffer (larger than
any record in the file) of that type, then loop as follows -- fill it
with a 'known' value, do a read, check the buffer from the end to see
where the 'known' value stops and thus the real data ends, then wow,
you know how long the record is, so write it.  He neglected to say
that you need to have checks off when you do it, and even so it is
probably implementation-dependent whether the I/O package raises
USE_ERROR or something else anyway.  Is everybody happy with this
solution?

Dec seems to think Ada I/O is missing something.  They implemented
some packages to do "mixed" I/O, that is, you could mix binary
integers and enum types and arrays and stuff in one file.

Rational seems to think Ada I/O is missing something.  They have
implemented some packages to do "polymorphic" I/O, that is, you could
mix binary integers and enum types and arrays and stuff in one file.

Does anyone else but me wonder why the two packages can't be the
same for the sake of portability?

Rational also has a package to do bit and byte stream I/O.  I think
that's useful too.

I'd like to be able to write a file using text_io on our VAX and
transfer it to our PC and read it with another Ada program.  And
reverse it.  Isn't that one part of program portability?  Here's
an opinion:  If text_io goes to all the trouble of specifying line
boundaries and page boundaries and file boundaries on the one hand,
and Ada goes to a lot of trouble defining the whole language to work
with the ASCII character set (e.g., STANDARD.ASCII and the definition
of string and character not even letting you get to your machine's
underlying character type), then it is surprising to me that the
definition of text_io DOESN'T include a way to transport text_io
files from one machine to another.  Sequential_io I can understand ...
you have byte-ordering problems, etc.  There are defined symbols in
the language called standard.ascii.ff and standard.ascii.lf, not to
mention standard.ascii.rs and standard.ascii.fs.  Why go to all the
trouble of defining these and ignore the definitions given to them
in the ASCII standard by letting implementations choose incompatible
ways of defining line, page, and file terminators?

OK folks, why am I complaining?  Well, the Ada language isn't fixed
for all time.  Like all other ANSI languages it'll be reviewed from
time to time and improvements suggested.  I know there is a scheme
for making comments -- you send mail to ada-comment or something, and
then you can poke around in the comment-archives and see what other
people have to say.  Only I believe that the kinds of comments sent to
ada-comment are much more precise, or at least, definite, than the
kinds of comments I'm willing to make now.  I don't know the answer
to my questions about Ada I/O ... my feeling is it needs discussion
now; we're not even at the stage of writing papers for Ada Letters
or comments to be read by the language maintenance committee.

So, I bring the matter up once more.  Now's your chance:  If you
don't think ada-info is the proper place to discuss this send me
private mail, I'll comply with the general opinion.  Or if you
think text_io isn't worthy of this much discussion, let me know that
too.  Or finally, if you think of another way for interested people
to discuss this with me I'd like to hear about it (bearing in mind
that I don't go to Ada-ish conferences to talk about it with you
face-to-face).

Remember:  I'm not interested in complaint with no purpose, I'd
like to improve Ada.  Furthermore, I'm right now using Ada to do
almost everything I want to do ... I just want to do it better, and
in this case, with standard Ada I/O of some sort.

Thanks for letting me flame on (that is, if you've read this far)
-- Dave  (Bakin -at mit-multics.arpa)