[comp.lang.c++] Newline hook needed in streams

benson@odi.com (Benson I. Margulies) (12/18/89)

I've needed to implement two special kinds of ostreams that seem to
need "line buffering" support. That is, I need to have code of mine
get control (via virtual function or function pointer) whenever
a newline is inserted into the buffer.

The two examples are an indenting stream (add a settable amount
of indentation to the left margin) and a syslog stream.
The indenting stream needs to know newline boundaries to know
when to insert the indentation. The syslog stream wants to
actually call syslog(3) for each line of output.

Perhaps the next release of C++ could enhance the base classes
(ios and streambuf) to have line buffering in addition to
just plain buffering and non-buffering.


-- 
Benson I. Margulies

jss@jra.ardent.com (12/22/89)

In article <1989Dec18.132745.5187@odi.com> benson@odi.com (Benson I. Margulies) writes:
>I've needed to implement two special kinds of ostreams that seem to
>need "line buffering" support. That is, I need to have code of mine
>get control (via virtual function or function pointer) whenever
>a newline is inserted into the buffer.
>
>The two examples are an indenting stream (add a settable amount
>of indentation to the left margin) and a syslog stream.
>The indenting stream needs to know newline boundaries to know
>when to insert the indentation. The syslog stream wants to
>actually call syslog(3) for each line of output.
>

This request came up a lot while I was designing iostreams, but
I was never able to solve the fundamental problem: line buffering
requires some action on every character to check whether it is a
newline, and (since there are already complaints about the amount
of code inlined by put) that would amount to making the stream
unbuffered.

That does not mean there is no way to solve the problems, but
it means you have to do some extra work.  I will describe one
approach that I have found useful.

First, the easiest place to do this kind of thing is in the
streambuf, rather than the ostream class.  What I have typically
done is derive a new class from filebuf with a redefined "flush"
member.  This member does whatever extra processing is required
and then calls filebuf::flush to really output the data.  This approach
will probably be sufficient for the syslog stream.
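
A minimal sketch of the flush-hook approach described above. It assumes
the modern standard-library std::streambuf interface, which postdates
this thread; there sync() plays roughly the role of the "flush" member.
The class name LineFlushBuf and the handler callback are hypothetical,
and the handler stands in for a real call such as syslog(3).

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <ostream>
#include <streambuf>
#include <string>
#include <utility>
#include <vector>

// Hypothetical class: buffers output and, whenever the buffer is flushed,
// hands each complete line to a handler.
class LineFlushBuf : public std::streambuf {
public:
    using LineHandler = std::function<void(const std::string&)>;

    explicit LineFlushBuf(LineHandler h) : handler_(std::move(h)) {
        assert(handler_);
        setp(area_, area_ + sizeof area_);   // install the put area
    }

protected:
    // Reached only when the put area is full.
    int_type overflow(int_type ch) override {
        drain();
        if (!traits_type::eq_int_type(ch, traits_type::eof()))
            sputc(traits_type::to_char_type(ch));
        return traits_type::not_eof(ch);
    }

    // Reached on an explicit flush of the stream.
    int sync() override {
        drain();
        return 0;
    }

private:
    // Move buffered characters aside and deliver every complete line.
    void drain() {
        pending_.append(pbase(), static_cast<std::size_t>(pptr() - pbase()));
        setp(area_, area_ + sizeof area_);
        std::string::size_type nl;
        while ((nl = pending_.find('\n')) != std::string::npos) {
            handler_(pending_.substr(0, nl));
            pending_.erase(0, nl + 1);
        }
    }

    char area_[256];
    std::string pending_;   // text after the last newline seen so far
    LineHandler handler_;
};

// Usage: records the order in which things happen.
std::vector<std::string> flush_demo() {
    std::vector<std::string> events;
    LineFlushBuf buf([&events](const std::string& l) { events.push_back(l); });
    std::ostream log(&buf);
    log << "first\nsecond\npart";            // stays in the put area
    events.push_back("<flush requested>");
    log.flush();                             // "first" and "second" delivered
    log << " three\n" << std::flush;         // "part three" delivered
    return events;
}
```

Note that nothing reaches the handler until the stream is flushed or the
put area fills up, so the lines arrive late but with very few virtual calls.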

For an indenting ostream this is not adequate, since some means
is probably required to tell the streambuf the current indent.
The amount ought to be kept as formatting information associated
with the ostream (use xalloc and iword) and passed on to the
streambuf as appropriate (presumably using a member function).
Also, it is a bit restrictive to assume that the indenting
can only be done on a file, so rather than deriving a new
class from filebuf, it might be better to have a
more general approach that would allow indenting to be
inserted between the ostream and an arbitrary streambuf.
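
One way the indenting filter described above might look, as a sketch
only. It assumes the modern std::streambuf interface rather than the
library under discussion. IndentBuf, indent_index and set_indent are
hypothetical names. The indent is recorded on the ostream with
xalloc/iword and passed to the streambuf through a member function.

```cpp
#include <cassert>
#include <ios>
#include <ostream>
#include <sstream>
#include <streambuf>
#include <string>

// Hypothetical filter: sits between an ostream and any other streambuf.
class IndentBuf : public std::streambuf {
public:
    explicit IndentBuf(std::streambuf* dest) : dest_(dest) {
        assert(dest_ != nullptr);
    }

    void set_indent(int n) { indent_ = n < 0 ? 0 : n; }

protected:
    // No put area is installed, so every character passes through here.
    int_type overflow(int_type ch) override {
        if (traits_type::eq_int_type(ch, traits_type::eof()))
            return traits_type::not_eof(ch);
        const char c = traits_type::to_char_type(ch);
        if (at_line_start_ && c != '\n') {
            for (int i = 0; i < indent_; ++i) {
                if (traits_type::eq_int_type(dest_->sputc(' '),
                                             traits_type::eof()))
                    return traits_type::eof();
            }
        }
        at_line_start_ = (c == '\n');
        return dest_->sputc(c);
    }

    int sync() override { return dest_->pubsync(); }

private:
    std::streambuf* dest_;
    int indent_ = 0;
    bool at_line_start_ = true;
};

// Slot in the ostream's formatting state that remembers the indent.
int indent_index() {
    static const int index = std::ios_base::xalloc();
    return index;
}

// Record the indent on the stream and pass it on to the streambuf.
void set_indent(std::ostream& os, IndentBuf& buf, int n) {
    os.iword(indent_index()) = n;
    buf.set_indent(n);
}

std::string indent_demo() {
    std::ostringstream target;          // any streambuf would do here
    IndentBuf filter(target.rdbuf());
    std::ostream out(&filter);
    out << "begin\n";
    set_indent(out, filter, 4);
    out << "x = " << 1 << "\ny = 2\n";
    set_indent(out, filter, 0);
    out << "end\n";
    return target.str();
}
```

Because the filter forwards to a plain streambuf pointer, the same class
works over a file, a string, or another filter. The price is one virtual
call per character.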

I'd be interested to hear from anybody who has written this
kind of streambuf class about what approaches they have taken.

Some easily overlooked parts of the iostream library are
relevant to these problems:

    There is a "unitbuf" flag in the ios interface.
    When this is set it causes flush() to be called 
    frequently (although not on every character).  

    There is an ostream constructor that takes an explicit
    streambuf as an argument.  It is therefore not necessary
    to declare a new ostream class in order to use a streambuf
    that does something special.

    streambuf::flush is not required to consume data. If
    it is called while there is still room for more characters in
    the arrays it can return without doing anything.  
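
The first two points can be tried out with a short sketch. It assumes
the modern standard library, where unitbuf makes the stream flush its
streambuf after each output operation. The details of the 1989 library
may differ. CountingBuf and count_syncs are hypothetical names.

```cpp
#include <cassert>
#include <ios>
#include <ostream>
#include <sstream>
#include <streambuf>

// Hypothetical helper: a stringbuf that counts how often it is flushed.
class CountingBuf : public std::stringbuf {
public:
    int syncs() const { return syncs_; }

protected:
    int sync() override {
        ++syncs_;
        return 0;
    }

private:
    int syncs_ = 0;
};

// Writes three items through a plain ostream built on an explicit
// streambuf; no new ostream class is declared.
int count_syncs(bool use_unitbuf) {
    CountingBuf buf;
    std::ostream os(&buf);
    if (use_unitbuf)
        os.setf(std::ios_base::unitbuf);
    os << "pid " << 42 << '\n';
    assert(os.good());
    return buf.syncs();
}
```

With unitbuf set, the three insertions produce three flush requests.
Without it, none are made until someone flushes explicitly.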

Jerry Schwarz
jss@ardent.com

benson@odi.com (Benson I. Margulies) (12/22/89)

In article <9804@ardent.UUCP> jss@jra.ardent.com () writes:
>In article <1989Dec18.132745.5187@odi.com> benson@odi.com (Benson I. Margulies) writes:
>>I've needed to implement two special kinds of ostreams that seem to
>>need "line buffering" support. That is, I need to have code of mine
>>get control (via virtual function or function pointer) whenever
>>a newline is inserted into the buffer.
>>
>>The two examples are an indenting stream (add a settable amount
>>of indentation to the left margin) and a syslog stream.
>>The indenting stream needs to know newline boundaries to know
>>when to insert the indentation. The syslog stream wants to
>>actually call syslog(3) for each line of output.
>>
>
>This request came up a lot while I was designing iostreams, but
>I was never able to solve the fundamental problem that doing
>line buffering required some action on every character
>to check if it is a newline and (since there are already complaints 
>about the amount of code inlined by put) would amount to making 
>the stream unbuffered.
>
>That does not mean there is no way to solve the problems, but
>it means you have to do some extra work.  I will describe one
>approach that I have found useful.
>
>First, the easiest place to do this kind of thing is in the
>streambuf, rather than the ostream class.  What I have typically
>done is derive a new class from filebuf with a redefined "flush"
>member.  This member does whatever extra processing is required 
>and then calls filebuf::flush to really output the data.  This approach
>will probably be sufficient for the syslog.  
>

That's exactly what I did, except that I didn't use filebuf.
I used streambuf directly. Unfortunately, it isn't good enough for syslog.
It is not acceptable to wait for a flush to get the log messages
out. They have to get out as generated.
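
A sketch of what "out as generated" could look like, assuming the modern
std::streambuf interface. LineNowBuf and its handler are hypothetical,
and the handler stands in for syslog(3). With no put area installed,
overflow() sees every character, so a line is delivered the moment its
newline arrives. The call counter shows the cost: one virtual call per
character.

```cpp
#include <cassert>
#include <functional>
#include <ostream>
#include <streambuf>
#include <string>
#include <utility>
#include <vector>

// Hypothetical class: delivers each line as soon as its newline is written.
class LineNowBuf : public std::streambuf {
public:
    using LineHandler = std::function<void(const std::string&)>;

    explicit LineNowBuf(LineHandler h) : handler_(std::move(h)) {
        assert(handler_);
    }

    unsigned long calls() const { return calls_; }

protected:
    // No put area is installed, so every character arrives here.
    int_type overflow(int_type ch) override {
        ++calls_;
        if (traits_type::eq_int_type(ch, traits_type::eof()))
            return traits_type::not_eof(ch);
        const char c = traits_type::to_char_type(ch);
        if (c == '\n') {
            handler_(line_);     // the line goes out immediately
            line_.clear();
        } else {
            line_.push_back(c);
        }
        return ch;
    }

private:
    LineHandler handler_;
    std::string line_;
    unsigned long calls_ = 0;
};

// Usage: no flush is ever requested, yet the complete line is delivered.
std::vector<std::string> immediate_demo(unsigned long& virtual_calls) {
    std::vector<std::string> lines;
    LineNowBuf buf([&lines](const std::string& l) { lines.push_back(l); });
    std::ostream log(&buf);
    log << "disk full\nretry";
    virtual_calls = buf.calls();
    return lines;
}
```

Here "disk full" is delivered without any flush, "retry" is still waiting
for its newline, and the 15 characters written cost 15 virtual calls.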

>For an indenting ostream this is not adequate since some means
>is probably required to tell the streambuf the current indent.
>The amount ought to be kept as formatting information associated 
>with the ostream (use xalloc and iword) and passed on to the
>streambuf as appropriate (presumably using a member function).
>Also it is a bit restrictive to assume that the indenting
>can only be done on a file, so rather than deriving a new 
>class derived from filebuf, it might be better to have a 
>more general approach that would allow indenting to be 
>inserted between the ostream and an arbitrary streambuf.  
>
>I'd be interested to hear from anybody who has written this
>kind of streambuf class about what approaches they have taken.
>
>Some easily overlooked parts of the iostream library that 
>are relevant to these problems.
>
>    There is a "unitbuf" flag in the ios interface.
>    When this is set it causes flush() to be called 
>    frequently (although not on every character).  
>

Unfortunately, ostream_withassign::operator= busts this.
It runs the non-virtual function "init" on the streambuf, which 
clears unitbuf.

>    There is an ostream constructor that takes an explicit
>    streambuf as an argument.  It is therefore not necessary
>    to declare a new ostream class in order to use a streambuf
>    that does something special.
>
>    streambuf::flush is not required to consume data. If
>    it is called while there is still room for more characters in
>    the arrays it can return without doing anything.  
>

In general, a half-dozen "virtual" markings in streambufs and streams
would make this sort of thing trivial. Since the stream library is
supposed to be the basis of whatever wild flights of imagination the
rest of us can come up with, IMHO almost everything should be
virtual.



-- 
Benson I. Margulies

jss@jra.ardent.com (12/23/89)

In article <1989Dec22.124143.19874@odi.com> benson@odi.com (Benson I. Margulies) writes:
>
>In general, a half-dozen "virtual" markings in streambufs and streams
>would make this sort of thing trivial. Since the stream library is
>supposed to be the basis of whatever wild flights of imagination the
>rest of us can come up with, IMHO almost everything should be
>virtual.
>

There is a school of thought that member functions ought to
be virtual unless there is good reason not to.  I subscribe
to the opposite opinion.  Member functions ought to be 
non-virtual unless there is a good reason to make them virtual.
There are costs associated with making functions virtual
(in terms of the cleanliness of the interface, not just the
overhead of calling a virtual function).

In C++, designing the "protected interface" of a class (the
interface seen by derived classes) is an important part of
the design of a class.  One of the ways that C++ differs
from other "object oriented languages" is that the public
interface and the protected interface will normally be different.

When I describe the interface for a non-virtual public member function
I state certain things about what it is supposed to do.  These are
guarantees for the callers.  They impose constraints, but
within those constraints there is a significant amount of freedom
in the implementation.  I don't specify how it manipulates the class's
private data, when it is called by other member functions, what malloc
calls it makes, and so on.  When I specify a virtual member for
the protected interface I must specify much more detail, and that
specification imposes constraints on future implementations.

For example, suppose opfx were virtual. Could it change
formatting flags?  Would that have an effect on the formatting
that is currently going on?  Many inserters call opfx more
than once (because they recursively call other inserters).  Are
we constraining when they look at the formatting flags?

I suppose this kind of interface could be worked out, but it would
be a mistake to just make opfx a virtual without thinking very
hard about it.

And, of course there is the performance issue.  At the moment
iostream output is slower than stdio's and there are people who
refuse to use it for that reason.  However, I can imagine an improved
implementation (or improved compilers) that would improve the
performance, but if the interface imposes a function call for every 
character then there would be no hope.
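
The performance point can be made concrete with a toy model. It is a
hypothetical miniature, not the iostream library's actual code, and
MiniBuf and slow_path_calls_for are invented names. The public put() is
non-virtual and small enough to inline. The protected virtual hook is
reached only when the 64-character area is full.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical miniature of a buffered output class.
class MiniBuf {
public:
    virtual ~MiniBuf() = default;

    // Non-virtual fast path: one comparison and one store per character.
    void put(char c) {
        if (used_ < kSize) {
            area_[used_++] = c;
        } else {
            ++slow_path_calls_;
            overflow(c);          // virtual call, only when the area is full
        }
    }

    void flush() { drain(); }

    unsigned slow_path_calls() const { return slow_path_calls_; }
    const std::string& output() const { return out_; }

protected:
    // The protected interface: the single hook a derived class may replace.
    virtual void overflow(char c) {
        drain();
        area_[used_++] = c;
    }

    // Empties the area into the final destination (a string in this toy).
    void drain() {
        out_.append(area_, used_);
        used_ = 0;
    }

private:
    static constexpr std::size_t kSize = 64;
    char area_[kSize];
    std::size_t used_ = 0;
    unsigned slow_path_calls_ = 0;
    std::string out_;
};

// Writes n characters and reports how many virtual calls that needed.
unsigned slow_path_calls_for(std::size_t n) {
    MiniBuf buf;
    for (std::size_t i = 0; i < n; ++i)
        buf.put('x');
    buf.flush();
    assert(buf.output().size() == n);
    return buf.slow_path_calls();
}
```

In this model 1000 characters cost 15 virtual calls. A design that made
put() itself virtual, or checked every character for a newline through a
hook, would cost 1000.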

Jerry Schwarz

Bob.Stout@p6.f506.n106.z1.fidonet.org (Bob Stout) (12/24/89)

I've written an I/O stream class (actually it hasn't been fully "classified"  
yet since the initial application required C rather than C++) which allows  
user-installable stream filters analogous to internal pipes which could offer  
one solution. Currently I've implemented everything from simple case  
conversion filters through data encryption and compression/expansion filters.  
Filters are stackable with some restrictions. For example, one of my test  
files performs case conversion prior to encryption by simply installing the  
two filters back-to-back after the stream is opened. In another application, I  
wrote a screen capture utility which had to work in any PC text or graphics  
mode (there are a bunch of them). I used the new stream I/O to write the  
captured screen out and after everything else was debugged, I added one line  
to install a simple RLE compression filter which dropped my 115K EGA screen  
dumps down into the 10-25K range.
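
The package described above is written in C and its interface is not
shown here, so the following is only a guess at the general idea,
expressed with C++ streambufs. CharFilterBuf is a hypothetical class.
Each filter transforms one character and passes it to the next streambuf,
so filters stack by pointing one at another. ROT13 is a toy stand-in for
a real encryption filter.

```cpp
#include <cassert>
#include <cctype>
#include <functional>
#include <ostream>
#include <sstream>
#include <streambuf>
#include <string>
#include <utility>

// Hypothetical per-character filter that forwards to another streambuf.
class CharFilterBuf : public std::streambuf {
public:
    using Transform = std::function<char(char)>;

    CharFilterBuf(std::streambuf* next, Transform f)
        : next_(next), f_(std::move(f)) {
        assert(next_ != nullptr);
        assert(f_);
    }

protected:
    // No put area is installed, so every character passes through here.
    int_type overflow(int_type ch) override {
        if (traits_type::eq_int_type(ch, traits_type::eof()))
            return traits_type::not_eof(ch);
        return next_->sputc(f_(traits_type::to_char_type(ch)));
    }

    int sync() override { return next_->pubsync(); }

private:
    std::streambuf* next_;
    Transform f_;
};

// Case conversion filter.
char to_upper(char c) {
    return static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
}

// Toy stand-in for encryption; not a real cipher.
char rot13(char c) {
    if (c >= 'a' && c <= 'z')
        return static_cast<char>('a' + (c - 'a' + 13) % 26);
    if (c >= 'A' && c <= 'Z')
        return static_cast<char>('A' + (c - 'A' + 13) % 26);
    return c;
}

// Case conversion happens first, then "encryption", then the destination.
std::string stacked_demo() {
    std::ostringstream file;                    // stands in for a real file
    CharFilterBuf encrypt(file.rdbuf(), rot13);
    CharFilterBuf upcase(&encrypt, to_upper);
    std::ostream out(&upcase);
    out << "Hello, world";
    return file.str();
}
```

Writing "Hello, world" through the stack yields "URYYB, JBEYQ". A
stateful filter such as RLE compression would also need to emit its
pending run in sync(), which this sketch leaves out.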

As a general purpose package, it's surprisingly efficient as written, but I'd  
really love to give it first-class library status by removing its reliance on  
an underlying layer of existing code. It would be fairly easy to do with  
Zortech since I'm a beta tester for them and have their existing library  
source, but they haven't shown any interest in pursuing it after having been  
given first crack at it - any takers? If anyone is interested in parallel or  
co-operative development, the structures and installation mechanisms are  
published - contact me either at the address above or rbs@lnic2.hprc.uh.edu.