[comp.lang.ada] get_line and partial lines

ecragg@GMUVAX.GMU.EDU ("EDWARD CRAGG") (08/18/90)

Message-Id: <9008180014.AA06206@vrdxhq.verdix.com>
To: uunet!src.honeywell.com!beck@uunet.UU.NET (Todd Beckering)
Cc: svug@uunet.UU.NET
Subject: Re: get_line and partial lines 
In-Reply-To: Your message of Mon, 13 Aug 90 17:07:37 -0500.
             <9008132207.AA08720@futility.src.honeywell.com> 
Date: Fri, 17 Aug 90 20:14:46 EDT
From: vrdxhq!deller@uunet.UU.NET

=...
=I appreciate your reply on this problem.
=
=When I tried the solution it didn't work in my program, although
=your sample program did.  I haven't had time until now to follow it up,
=but it looks like it doesn't work because the value of 'last'
=is being stored in a register, whose value isn't being stored when
=the exception occurs.
=
=...
=Is there a known solution to this problem?

Todd,
The code I gave you is bogus as discussed on the net and as you seem to have 
discovered.  It incorrectly expects output parameters to be valid after an
exception -- that is simply not guaranteed.  Sorry.  We all make mistakes :-{.

Below is an alternative I worked out.  It is pretty much OS independent and
portable across Ada systems.  It does not suffer from any "non-Ada dependency"
that I know of.  There are two "OS/vendor specific" dependencies I know of: 
    1. ascii.lf as the EOL indicator (a UNIX dependency that is NOT required
       in all UNIXes).
    2. tiny_integer for the size of a byte (a VADS dependency -- there are
       reasons why "range 0..255" is not desirable).
Each of these is easily changed for other systems.

The package provides an enhanced get_line that works just like TEXT_IO get_line
with an additional out parameter "eol", which is true when an ascii.lf ended
the line and false otherwise (the other reasons for completion being that the
string provided for input filled up, or that the input ended without a final
ascii.lf).
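
The three ways a line can complete can be tried out without VADS.  What follows
is a minimal sketch, not the package from this post: eol_sketch, scan, and
eol_demo are hypothetical names, and scan reads from an in-memory string rather
than a file, so it assumes nothing beyond a standard Ada compiler.  Unlike the
post's get_line, it does not raise END_ERROR when called with nothing left.

```ada
-- Hypothetical stand-in for the post's get_line: scans an in-memory
-- string instead of a file, so it needs no VADS types.
package eol_sketch is
   procedure scan ( source : in string ;
                    pos    : in out integer ;  -- next unread index in source
                    text   : out string ;
                    last   : out integer ;
                    eol    : out boolean ) ;   -- true only if ascii.lf ended the line
end eol_sketch ;

package body eol_sketch is
   procedure scan ( source : in string ;
                    pos    : in out integer ;
                    text   : out string ;
                    last   : out integer ;
                    eol    : out boolean ) is
      l : integer := text'first - 1 ;
      c : character ;
   begin
      eol := false ;
      while l < text'last and pos <= source'last loop
         c := source(pos) ;
         pos := pos + 1 ;
         if c = ascii.lf then
            eol := true ;   -- reason 1: the end-of-line character was consumed
            exit ;
         end if ;
         l := l + 1 ;
         text(l) := c ;
      end loop ;
      -- Leaving the loop without eol: text is full (reason 2)
      -- or source is exhausted (reason 3).
      last := l ;
   end scan ;
end eol_sketch ;

with text_io ;
with eol_sketch ;
procedure eol_demo is
   source : constant string := "ab" & ascii.lf & "cdefgh" & ascii.lf & "xy" ;
   pos    : integer := source'first ;
   buf    : string (1..4) ;   -- deliberately short, to force a split line
   last   : integer ;
   eol    : boolean ;
begin
   while pos <= source'last loop
      eol_sketch.scan( source, pos, buf, last, eol ) ;
      text_io.put_line( buf(1..last) & " eol=" & boolean'image(eol) ) ;
   end loop ;
   -- Prints: "ab eol=TRUE", "cdef eol=FALSE", "gh eol=TRUE", "xy eol=FALSE"
end eol_demo ;
```

A caller that wants whole lines can keep appending pieces until eol comes back
true or the input runs out.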

One unfortunate consequence is that each byte requires a UNIX system call, so 
the throughput is only about 3K characters per second.  If that is an issue
(which is likely), then os_files.a and unix.a in VADS "standard" can be used to
develop a buffered I/O that reads many bytes with each call and thus can
provide 10x to 100x or better speedups.  I do not have the time right now
to develop this enhancement myself.
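
For readers without the VADS sources, the buffering idea can be sketched with a
different technique: Ada.Streams.Stream_IO, which assumes an Ada 95 or later
compiler and is not what os_files.a or unix.a provide.  The names
buffered_bytes, reader, and next, and the 4096-element buffer size, are
hypothetical choices made for illustration.  The speedup figures above are not
verified for this sketch.

```ada
with Ada.Streams.Stream_IO ;

-- Hypothetical package: hands out one character at a time, but refills
-- its buffer with a single block read instead of one read per byte.
package buffered_bytes is
   type reader is limited private ;
   procedure open  ( r : in out reader ; name : in string ) ;
   procedure close ( r : in out reader ) ;
   -- got is false at end of input; otherwise c holds the next character.
   procedure next  ( r : in out reader ; c : out character ; got : out boolean ) ;
private
   type reader is limited record
      file : Ada.Streams.Stream_IO.File_Type ;
      buf  : Ada.Streams.Stream_Element_Array (1 .. 4096) ;
      pos  : Ada.Streams.Stream_Element_Offset := 1 ;  -- next unread element
      last : Ada.Streams.Stream_Element_Offset := 0 ;  -- last valid element
   end record ;
end buffered_bytes ;

package body buffered_bytes is
   use type Ada.Streams.Stream_Element_Offset ;

   procedure open ( r : in out reader ; name : in string ) is
   begin
      Ada.Streams.Stream_IO.Open( r.file, Ada.Streams.Stream_IO.In_File, name ) ;
      r.pos  := 1 ;
      r.last := 0 ;
   end open ;

   procedure close ( r : in out reader ) is
   begin
      Ada.Streams.Stream_IO.Close( r.file ) ;
   end close ;

   procedure next ( r : in out reader ; c : out character ; got : out boolean ) is
   begin
      if r.pos > r.last then
         -- Buffer is used up: one call fetches up to a full buffer.
         -- At end of file, Read returns with last = buf'first - 1.
         Ada.Streams.Stream_IO.Read( r.file, r.buf, r.last ) ;
         r.pos := 1 ;
      end if ;
      if r.pos > r.last then
         c   := ascii.nul ;
         got := false ;
      else
         c     := character'val( r.buf(r.pos) ) ;
         r.pos := r.pos + 1 ;
         got   := true ;
      end if ;
   end next ;
end buffered_bytes ;
```

A get_line like the one in this post could call next where it now calls
byte_io.read, treating got = false the way it treats byte_io.END_ERROR.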

Pity that Ada has no block I/O so we could get speed AND portability.  The
POSIX Ada binding does have a portable block I/O facility.  Once that binding
is a standard and is supported, then you could use that facility for the block
I/O.

Steve

with sequential_io ;
package read_unix_reliably is
   -- tiny_integer is the VADS-specific one-byte integer type
   package byte_io is new sequential_io( tiny_integer ) ;
   procedure get_line ( file : byte_io.file_type ;
                        text : out string ;
                        last : out integer ;
                        eol : out boolean ) ;     -- true if line ends w/ EOL
   END_ERROR : exception ;   -- raised by get_line when called at end of file
   pragma inline( get_line ) ;
end ;

with unchecked_conversion ;
package body read_unix_reliably is
   function to_char is new unchecked_conversion( tiny_integer, character ) ;
   procedure get_line ( file : byte_io.file_type ;
                        text : out string ;
                        last : out integer ;
                        eol : out boolean ) is 
      l : integer := text'first - 1 ;
      b : tiny_integer ;
      c : character ;
   begin
      if byte_io.end_of_file(file) then
         raise END_ERROR ;  -- True end error, not just a short line
      end if ;

      begin  -- reading text
         while l < text'last loop 
            byte_io.read( file, b ) ;
            c := to_char( b ) ;
            exit when c = ascii.lf ;
            l := l + 1 ;
            text(l) := c ;
         end loop ;
         eol := l < text'last ;
      exception 
         when byte_io.END_ERROR => eol := false ; -- EOF without ascii.lf
      end ;

      last := l ;
      return ;
   end ;
end ;

with text_io ;
with read_unix_reliably ; 
procedure readshort is
    use read_unix_reliably ; use byte_io ;

    f : file_type ;
    line_num : integer := 0 ;
    line : string (1..100) ;
    chars : integer ;
    eol : boolean ;

begin
    open ( f, in_file, "SHORT" ) ;
    while not end_of_file( f ) loop  
        get_line( f, line, chars, eol ) ;
        line_num := line_num + 1 ;
        text_io.put_line( "Line" & integer'image(line_num) & 
                          ", length" & integer'image(chars) ) ;
        if not eol then 
           text_io.put_line( "SHORT line, no EOL ends the text" ) ;
        end if ;
        text_io.put_line( line(1..chars) ) ;
    end loop ;
    close( f ) ;
end ;