[comp.unix.wizards] preserving message boundaries on named pipes - System V

murrey@lehi3b15.csee.Lehigh.EDU (Erik Murrey) (05/17/89)

System V.3.1 Unix

Is it possible to preserve message boundaries on named pipes under
System V?  For instance, if a process sends two separate messages (via
two write()'s) down a named pipe, how can the receiver read those as
two separate messages?  It seems that right now, if the reading
process unblocks after both writes have completed, then it gets *both*
messages in one read.

I really don't want to muck with terminating null characters, and
the like...

... Erik

-- 
Erik Murrey
Lehigh University
murrey@csee.Lehigh.EDU
erik@mpx.com

prc@erbe.se (Robert Claeson) (05/18/89)

In article <571@lehi3b15.csee.Lehigh.EDU> murrey@lehi3b15.csee.Lehigh.EDU (Erik Murrey) writes:

>Is it possible to preserve message boundaries on named pipes under
>System V?  For instance, if a process sends two separate messages (via
>two write()'s) down a named pipe, how can the receiver read those as
>two separate messages?

You can't, as far as I know. I'd suggest you take a look at message
queues instead. They are also more efficient than pipes since they don't
deal with the file system at all.
-- 
          Robert Claeson      E-mail: rclaeson@erbe.se
	  ERBE DATA AB

wietse@wzv.UUCP (Wietse Z. Venema) (05/18/89)

In article <571@lehi3b15.csee.Lehigh.EDU> murrey@lehi3b15.csee.Lehigh.EDU (Erik Murrey) writes:
>System V.3.1 Unix
>
>Is it possible to preserve message boundaries on named pipes under
>System V?  For instance, if a process sends two separate messages (via
>two write()'s) down a named pipe, how can the receiver read those as
>two separate messages?  

Under UNIX, the reader just reads what the writer wrote into the pipe.
There are no magic cookies between the `data'. What you seem to want
is provided by the message-queue (msgop(2)) facilities.
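
A minimal sketch of what that looks like, assuming the usual System V
msgget()/msgsnd()/msgrcv() calls are available.  The names text_msg,
send_text, recv_text, queue_demo and the MAX_TEXT limit are made up for
illustration; only the system calls themselves are real.

```c
/* Sketch: each msgsnd() on a System V queue is received as one message. */
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

#define MAX_TEXT 128	/* illustrative payload limit (an assumption) */

struct text_msg {
	long mtype;		/* required first member, must be > 0 */
	char mtext[MAX_TEXT];
};

/* Send one message; returns 0 on success, -1 on error. */
int send_text(int qid, const char *text)
{
	struct text_msg m;
	size_t len = strlen(text);

	if (len > MAX_TEXT) return -1;
	m.mtype = 1;
	memcpy(m.mtext, text, len);
	return msgsnd(qid, &m, len, 0);
}

/* Receive one message into buf (NUL-terminated); returns length or -1. */
int recv_text(int qid, char *buf, size_t bufsize)
{
	struct text_msg m;
	ssize_t n = msgrcv(qid, &m, MAX_TEXT, 0, 0);

	if (n < 0 || (size_t)n >= bufsize) return -1;
	memcpy(buf, m.mtext, (size_t)n);
	buf[n] = '\0';
	return (int)n;
}

/* Two sends, then two receives; returns 0 if boundaries were kept. */
int queue_demo(void)
{
	char a[MAX_TEXT + 1], b[MAX_TEXT + 1];
	int ok;
	int qid = msgget(IPC_PRIVATE, IPC_CREAT | 0600);

	if (qid < 0) return -1;
	ok = send_text(qid, "first") == 0
	  && send_text(qid, "second") == 0
	  && recv_text(qid, a, sizeof a) == 5
	  && recv_text(qid, b, sizeof b) == 6
	  && strcmp(a, "first") == 0
	  && strcmp(b, "second") == 0;
	msgctl(qid, IPC_RMID, NULL);	/* remove the queue again */
	return ok ? 0 : -1;
}
```

Both messages are already queued before the first msgrcv(), yet each
receive returns exactly one of them, which is the behaviour asked for.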
-- 
work:	wswietse@eutrc3.uucp	| Eindhoven University of Technology
work:	wswietse@heitue5.bitnet	| Mathematics and Computing Science
home:	wietse@wzv.uucp		| Eindhoven, The Netherlands

les@chinet.chi.il.us (Leslie Mikesell) (05/19/89)

In article <571@lehi3b15.csee.Lehigh.EDU> murrey@lehi3b15.csee.Lehigh.EDU (Erik Murrey) writes:

>System V.3.1 Unix

>Is it possible to preserve message boundaries on named pipes under
>System V?  For instance, if a process sends two separate messages (via
>two write()'s) down a named pipe, how can the receiver read those as
>two separate messages?  It seems that right now, if the reading
>process unblocks after both writes have completed, then it gets *both*
>messages in one read.

>I really don't want to muck with terminating null characters, and
>the like...

The easy way is to use fixed length write()'s and matching read()'s.
If the input is sufficiently varied that this approach would impose
a lot of overhead you might precede the data with a length value.  This
should be done as a single write() if multiple processes are writing
to the same FIFO to avoid the possibility of interleaving between the
length and data.  The reader would do a fixed length read() to find
out the size the next read() should be for the data.
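
A rough sketch of the length-prefix scheme with the header and data
sent in a single write().  The names framed_write, framed_read,
read_exact and frame_demo, and the two-byte header, are assumptions
made for illustration.  It relies on writes of at most PIPE_BUF bytes
not being interleaved with other writers' data, and does not retry on
EINTR.

```c
#include <assert.h>
#include <limits.h>	/* PIPE_BUF, where the system defines it */
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef PIPE_BUF
#define PIPE_BUF 512	/* fall back to the POSIX minimum */
#endif
#define HDR_SIZE 2	/* two-byte big-endian length header */

/* Write header and payload with one write(); returns length or -1. */
int framed_write(int fd, const char *data, size_t len)
{
	unsigned char frame[PIPE_BUF];

	/* refuse anything that could not go out as one atomic write */
	if (len > (size_t)(PIPE_BUF - HDR_SIZE) || len > 0xffff) return -1;
	frame[0] = (unsigned char)(len >> 8);
	frame[1] = (unsigned char)(len & 0xff);
	memcpy(frame + HDR_SIZE, data, len);
	if (write(fd, frame, HDR_SIZE + len) != (ssize_t)(HDR_SIZE + len))
		return -1;
	return (int)len;
}

/* Read exactly n bytes; returns 0 on success, -1 on error or EOF. */
static int read_exact(int fd, unsigned char *buf, size_t n)
{
	while (n > 0) {
		ssize_t cc = read(fd, buf, n);
		if (cc <= 0) return -1;
		buf += cc;
		n -= (size_t)cc;
	}
	return 0;
}

/* Read one framed message into buf; returns its length or -1. */
int framed_read(int fd, char *buf, size_t bufsize)
{
	unsigned char hdr[HDR_SIZE];
	size_t len;

	if (read_exact(fd, hdr, HDR_SIZE) < 0) return -1;
	len = ((size_t)hdr[0] << 8) | hdr[1];
	if (len > bufsize) return -1;
	if (read_exact(fd, (unsigned char *)buf, len) < 0) return -1;
	return (int)len;
}

/* Two writes, then two reads; returns 0 if each read got one message. */
int frame_demo(void)
{
	int p[2];
	char buf[16];

	if (pipe(p) < 0) return -1;
	if (framed_write(p[1], "ab", 2) != 2) return -1;
	if (framed_write(p[1], "cde", 3) != 3) return -1;
	if (framed_read(p[0], buf, sizeof buf) != 2) return -1;
	if (memcmp(buf, "ab", 2) != 0) return -1;
	if (framed_read(p[0], buf, sizeof buf) != 3) return -1;
	if (memcmp(buf, "cde", 3) != 0) return -1;
	close(p[0]);
	close(p[1]);
	return 0;
}
```

The copy into frame[] is the price of the single write(); capping the
message at PIPE_BUF keeps that buffer small.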


Les Mikesell

libes@cme.nbs.gov (Don Libes) (05/20/89)

In article <8486@chinet.chi.il.us> les@chinet.chi.il.us (Leslie Mikesell) writes:
>In article <571@lehi3b15.csee.Lehigh.EDU> murrey@lehi3b15.csee.Lehigh.EDU (Erik Murrey) writes:
>>Is it possible to preserve message boundaries on named pipes ...
>
>The easy way is to use fixed length write()'s and matching read()'s.
>If the input is sufficiently varied that this approach would impose
>a lot of overhead you might precede the data with a length value.

I wrote two functions that preserve message boundaries on any stream
using just this idea.

	sized_write(fd,buffer,nbytes) - just like write()
	sized_read(fd,buffer,nbytes) - just like read() except that
					it returns what was written
					by one call to sized_write()

They work on any stream, not just named pipes.  I originally used them
on top of TCP.

>This
>should be done as a single write() if multiple processes are writing
>to the same FIFO to avoid the possibility of interleaving between the
>length and data.

I didn't go this far, mainly because I can't see a simple way to do
this without copying and reserving or mallocing a potentially enormous
auxiliary space.
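
On systems that provide writev(2) (a BSD call, later adopted by POSIX;
it may well be missing on System V.3.1), the header and data can be
handed to the kernel in one call without copying.  The sketch below is
only an illustration: sized_writev and writev_demo are hypothetical
names and are not part of the code that follows.  As far as I know, a
writev() to a pipe is only safe from interleaving when the total is at
most PIPE_BUF bytes, the same as for write().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>	/* writev, struct iovec */
#include <arpa/inet.h>	/* htonl, ntohl */
#include <unistd.h>

/* Send length header and data with one writev(); returns nbytes or -1. */
int sized_writev(int fd, const char *buffer, uint32_t nbytes)
{
	uint32_t netlong = htonl(nbytes);	/* network byte order */
	struct iovec iov[2];
	ssize_t cc;

	iov[0].iov_base = &netlong;
	iov[0].iov_len = sizeof netlong;
	iov[1].iov_base = (void *)buffer;	/* writev() only reads it */
	iov[1].iov_len = nbytes;

	cc = writev(fd, iov, 2);
	if (cc != (ssize_t)(sizeof netlong + nbytes))
		return -1;		/* error or short write */
	return (int)nbytes;
}

/* Send "hello" through a pipe and check header and data on the far end. */
int writev_demo(void)
{
	int p[2];
	uint32_t netlong;
	char data[5];

	if (pipe(p) < 0) return -1;
	if (sized_writev(p[1], "hello", 5) != 5) return -1;
	if (read(p[0], &netlong, sizeof netlong) != (ssize_t)sizeof netlong)
		return -1;
	if (ntohl(netlong) != 5) return -1;
	if (read(p[0], data, 5) != 5) return -1;
	if (memcmp(data, "hello", 5) != 0) return -1;
	close(p[0]);
	close(p[1]);
	return 0;
}
```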

Don Libes          libes@cme.nbs.gov      ...!uunet!cme-durer!libes


----------------------------cut here---------------------------------
/* sized_io.c - Preserve message boundaries in stream io.

These two routines enable us to use stream io, but still detect end of
record marks.  Each call to sized_read() returns a complete buffer, that is,
what was written by one call to sized_write().

Notes:

The IPC system seems to be a confusing mess.  I.e. unusual conditions are
handled in all different ways.  Specifically,

While we are reading, if the writer goes away, we sometimes get a read()
== -1 && errno == ECONNRESET.  Sometimes we get a read() == 0.  Why the
difference?

While we are writing, if the reader goes away, we get a signal (SIGPIPE).


Don Libes
National Institute of Standards and Technology
(301) 975-3535
libes@cme.nist.gov
...!uunet!cme-durer!libes

*/

#include <stdio.h>
#include <errno.h>
extern int errno;
#include <sys/types.h>
#include <netinet/in.h>

int	/* returns number of bytes read or -1 if error (i.e. EOF) */
sized_read(fd,buffer,maxbytes)
int fd;
char *buffer;
int maxbytes;	/* unlike read(), this parameter is the maximum size of */
		/* the buffer */
{
	int size;	/* size of incoming packet */
	int cc;
	int rembytes;	/* remaining bytes */
	u_long netlong;	/* network byte ordered length */

	/* read header; compare against the size actually requested */
	if (sizeof(netlong) != (cc = read(fd,(char *)&netlong,sizeof(netlong)))){
		/* if the connection is broken, we end up here */
#ifdef DEBUG
		fprintf(stderr,"sized_read: expecting buffer size but only read %d chars\n",cc);
#endif
		if (cc == -1)
			if (errno != ECONNRESET) perror("read");
		return(-1);
	}

	size = ntohl(netlong);

	/* read data */
	if (size == 0) return(0);
	else if (size > maxbytes) {
		fprintf(stderr,"sized_read: buffer too small.  ");
		fprintf(stderr,"buffer size was %d  actual size was %d\n",
			maxbytes,size);
		return(-1);
	}

	/* handle buffers too large to fit in one transfer */
	rembytes = size;
	while (rembytes) {
		if (-1 == (cc = read(fd,buffer,rembytes))) {
			fprintf(stderr,"sized_read(,,%d) = read(,,%d) = %d\n",
							size,rembytes,cc);
			if (errno != ECONNRESET) perror("read");
			return(-1);
		}

		/* new! */
		if (0 == cc) {	/* EOF - process died */
			return(-1);
		}

#ifdef DEBUG
		if (rembytes != cc)
			fprintf(stderr,"sized_read(,,%d) = read(,,%d) = %d\n",
							size,rembytes,cc);
#endif
		/* read() returned more bytes than requested!?!?!?! */
		/* this can't happen, but appears to be anyway */
		if (cc > rembytes) {
			fprintf(stderr,"sized_read(,,%d) = read(,,%d) = %d!?!?!\n",
							size,rembytes,cc);
			fprintf(stderr,"read() returned more chars than requested!  Aborting program.\n");
			abort();
		}
		buffer += cc;
		rembytes -= cc;
	}
	return(size);
}

int	/* returns number of data bytes written or -1 if error */
sized_write(fd,buffer,nbytes)
int fd;
char *buffer;
int nbytes;
{
	int cc;
	int rembytes;
	u_long netlong;	/* network byte ordered length */

	/* write header */
	netlong = htonl(nbytes);
	if (sizeof(netlong) != (cc = write(fd,(char *)&netlong,
							sizeof(netlong)))) {
#ifdef DEBUG
		/* this can never happen (SIGPIPE will always occur first) */
		fprintf(stderr,"sized_write: tried to write buffer size but only wrote %d chars\n",cc);
#endif
		if (cc == -1) perror("write");
		return(-1);
	}

	/* write data */
	if (nbytes == 0) return(0);

	rembytes = nbytes;
	while (rembytes) {
		if (-1 == (cc = write(fd,buffer,rembytes))) {
		      fprintf(stderr,"sized_write(,,%d) = write(,,%d) = %d\n",
							nbytes,rembytes,cc);
			perror("write");
			return(-1);
		}
#ifdef DEBUG
		if (rembytes != cc) 
		      fprintf(stderr,"sized_write(,,%d) = write(,,%d) = %d\n",
							nbytes,rembytes,cc);
#endif
		buffer += cc;
		rembytes -= cc;
	}
	return(nbytes);
}

cowan@marob.MASA.COM (John Cowan) (05/20/89)

In article <683@maxim.erbe.se> prc@maxim.UUCP (Robert Claeson) writes:
>In article <571@lehi3b15.csee.Lehigh.EDU> murrey@lehi3b15.csee.Lehigh.EDU (Erik Murrey) writes:
>
>>Is it possible to preserve message boundaries on named pipes under
>>System V?  For instance, if a process sends two separate messages (via
>>two write()'s) down a named pipe, how can the receiver read those as
>>two separate messages?
>
>You can't, as far as I know. I'd suggest you take a look at message
>queues instead. They are also more efficient than pipes since they don't
>deal with the file system at all.


Claeson is technically correct.  However, it is an undocumented, but reliable,
property of the 'write' system call on a pipe that it is atomic if the
number of bytes written is less than a certain magic number, typically
10240.  Therefore, a message protocol can be superimposed on the regular
named pipe bytestream by some such scheme as "prefix each message by its
length" or "delimit each message with a reserved character".  No mixing of
messages will occur, as long as the above length restriction is observed.

These remarks apply equally to named and regular pipes, and to all versions
of Unix, modulo the exact value of the magic number.
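
POSIX documents this limit as PIPE_BUF, so on a system with
fpathconf() the magic number can be queried rather than guessed.  A
small sketch; write_is_atomic and atomic_demo are hypothetical helper
names.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Returns 1 if a write of len bytes to fd is guaranteed atomic,
 * 0 if it is not, and -1 if the limit cannot be determined. */
int write_is_atomic(int fd, size_t len)
{
	long limit = fpathconf(fd, _PC_PIPE_BUF);

	if (limit < 0) return -1;
	return len <= (size_t)limit ? 1 : 0;
}

/* Check that a 512-byte write (the POSIX minimum) counts as atomic. */
int atomic_demo(void)
{
	int p[2], small;

	if (pipe(p) < 0) return -1;
	small = write_is_atomic(p[1], 512);
	close(p[0]);
	close(p[1]);
	return small == 1 ? 0 : -1;
}
```

A writer would call write_is_atomic() once per pipe and refuse (or
split) any message that does not fit.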
-- 
John Cowan <cowan@marob.masa.com> or <cowan@magpie.masa.com>
UUCP mailers:  ...!uunet!hombre!{marob,magpie}!cowan
Fidonet (last resort): 1:107/711
Aiya elenion ancalima!

dave@micropen (David F. Carlson) (05/20/89)

> In article <571@lehi3b15.csee.Lehigh.EDU> murrey@lehi3b15.csee.Lehigh.EDU (Erik Murrey) writes:
 >System V.3.1 Unix
 >
 >Is it possible to preserve message boundaries on named pipes under
 >System V?  For instance, if a process sends two separate messages (via
 >two write()'s) down a named pipe, how can the receiver read those as
 >two separate messages?  
 
Although inefficient and contrived, if limits.h has PIPE_BUF == PIPE_MAX
and the reader and writer used *ONLY* that length, each read would
be an atomic message with no magic cookies required.
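
A sketch of the fixed-length idea.  It assumes a record size of 256
bytes rather than PIPE_BUF itself (any fixed size no larger than
PIPE_BUF should do); RECORD_SIZE, write_record, read_record and
record_demo are illustrative names.  Short messages are padded with
zero bytes, and the reader loops in case a read() returns early.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define RECORD_SIZE 256	/* assumption: fixed size, <= PIPE_BUF */

/* Pad text out to one fixed-size record and write it in one call. */
int write_record(int fd, const char *text)
{
	char rec[RECORD_SIZE];
	size_t len = strlen(text);

	if (len >= RECORD_SIZE) return -1;
	memset(rec, 0, sizeof rec);
	memcpy(rec, text, len);
	return write(fd, rec, sizeof rec) == (ssize_t)sizeof rec ? 0 : -1;
}

/* Read exactly one record; returns 0 on success, -1 on error or EOF. */
int read_record(int fd, char rec[RECORD_SIZE])
{
	size_t got = 0;

	while (got < RECORD_SIZE) {
		ssize_t cc = read(fd, rec + got, RECORD_SIZE - got);
		if (cc <= 0) return -1;
		got += (size_t)cc;
	}
	return 0;
}

/* Two records in, two records out; returns 0 if they stayed separate. */
int record_demo(void)
{
	int p[2];
	char a[RECORD_SIZE], b[RECORD_SIZE];

	if (pipe(p) < 0) return -1;
	if (write_record(p[1], "first") < 0) return -1;
	if (write_record(p[1], "second") < 0) return -1;
	if (read_record(p[0], a) < 0) return -1;
	if (read_record(p[0], b) < 0) return -1;
	close(p[0]);
	close(p[1]);
	return strcmp(a, "first") == 0 && strcmp(b, "second") == 0 ? 0 : -1;
}
```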


-- 
David F. Carlson, Micropen, Inc.
micropen!dave@ee.rochester.edu

"The faster I go, the behinder I get." --Lewis Carroll

mike@BRL.MIL (Mike Muuss) (05/20/89)

What you seek is provided by a subroutine library that we call LIBPKG --
a way of moving messages of arbitrary sizes between processes, without
having to worry about preserving message boundaries, buffer allocation,
connection establishment, etc.  All those things are handled by the
library.  It's free; write me if you want a copy.
	Best,
	 -Mike