[comp.sys.sgi] network data transfers scrambled

loki@NAZGUL.PHYSICS.MCGILL.CA (Loki Jorgenson Rm421) (03/27/91)

	Hey ho....

	I am trying to figure out why network machine-to-machine data
transfer is scrambling and I can't make much sense of it.  Here are the
details:

	I am transferring about 250 kbytes at a time between two processes
which are connected via tcp (AF_INET) sockets.  When the two processes are
running on a 4D/25G, there is no trouble with the transfer.  When they are
running on our other IRIS, a 4D/20, there is no trouble.  Of course, both of
these cases are trivial.  

	In the important case, one process on each of the two machines,
the transfer works going from the 4D/25G to the 4D/20 but.... it seems
to be scrambling going the other way.  I can't deduce anything from this
since there doesn't seem to be anything to conclude.  It doesn't seem
to relate to load average or any other obvious machine-dependent aspect.

	Is there something so different about the 4D/20 hardware that something
is not working correctly?

	One thing I have considered but can't resolve is:  Is it
important to write/read to/from socket with an eye to a minimum/maximum
buffer size?  I am currently reading/writing in 8192 byte chunks.  I
have tried varying this without noticing any difference.  I have heard
that it is imperative that one write/read in 1024-byte chunks on a SUN
but that this wasn't important for a 4D IRIS.

	The only other clue I have is that the 8192 byte chunks "seem"
to be out-of-order.... as if the transferred packets arrived out of
sequence.  I haven't yet verified this rigorously.  I thought that
TCP handled the transfers so that this was impossible.

	Does anyone have any suggestions?  What to look for?

                             __          __
Loki Jorgenson              / /          \ \  node:  loki@Physics.McGill.CA
Grad, Systems Manager      / //////  \\\\\\ \ BITNET: PY29@MCGILLA
Physics, McGill University \ \\\\\\  ////// / fax:   (514) 398-8434
Montreal Quebec CANADA      \_\          /_/  phone: (514) 398-7027

mike@BRL.MIL (Mike Muuss) (03/28/91)

There is a slight chance that your problem may be due to not checking
the return value from your read() system call, which is the number of
bytes actually read.  When sending data over the network, it does not
always arrive in the same size "chunks" as you sent it in.  (The same
thing can also happen on pipes, but is seen less often, for a bunch of
complicated reasons.)

I attach a very handy subroutine written by Bob Miles which you can
use to alleviate this difficulty.  I hope this solves your problem.

	Best,
	 -Mike

------

/*
 *			M R E A D
 *
 * This function performs the function of a read(II) but will
 * call read(II) multiple times in order to get the requested
 * number of characters.  This can be necessary because pipes
 * and network connections don't deliver data with the same
 * grouping as it is written with.
 */
static int
mread(fd, bufp, n)
int fd;
register char	*bufp;
unsigned	n;
{
	register unsigned	count = 0;
	register int		nread;

	do {
		nread = read(fd, bufp, n-count);
		if(nread == -1)
			return(nread);		/* error: errno is set by read() */
		if(nread == 0)
			return((int)count);	/* EOF: return the short count */
		count += (unsigned)nread;
		bufp += nread;
	} while(count < n);

	return((int)count);
}
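
A rough sketch of how the same idea might look on the sending side, plus a
small self-check over a pipe.  None of this is from the posts above: the
names write_full, read_full and roundtrip_ok are hypothetical, read_full is
only mread restated with an ANSI prototype so the sketch compiles on its own,
and the retry on EINTR is an added assumption that mread itself does not make.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical counterpart to mread(): call write() until all n bytes
 * have been accepted.  Returns n on success, -1 on error. */
static long
write_full(int fd, const char *buf, size_t n)
{
	size_t	count = 0;

	assert(buf != NULL);
	while (count < n) {
		ssize_t	nw = write(fd, buf + count, n - count);
		if (nw == -1) {
			if (errno == EINTR)
				continue;	/* interrupted by a signal; retry */
			return -1;
		}
		count += (size_t)nw;
	}
	return (long)count;
}

/* Same loop as mread(), restated so this sketch is self-contained.
 * Returns the bytes read (short only at EOF), or -1 on error. */
static long
read_full(int fd, char *buf, size_t n)
{
	size_t	count = 0;

	assert(buf != NULL);
	while (count < n) {
		ssize_t	nr = read(fd, buf + count, n - count);
		if (nr == -1) {
			if (errno == EINTR)
				continue;	/* interrupted by a signal; retry */
			return -1;
		}
		if (nr == 0)
			break;			/* EOF */
		count += (size_t)nr;
	}
	return (long)count;
}

/* Three 1000-byte writes, then one 3000-byte read.  Returns 1 if every
 * byte arrives, in order, regardless of how the writes were grouped. */
static int
roundtrip_ok(void)
{
	enum { CHUNK = 1000, NCHUNK = 3 };
	int	fds[2], i, ok = 1;
	char	out[CHUNK], in[CHUNK * NCHUNK];

	if (pipe(fds) == -1)
		return 0;
	for (i = 0; i < NCHUNK; i++) {
		memset(out, 'A' + i, CHUNK);	/* chunk i is filled with 'A'+i */
		if (write_full(fds[1], out, CHUNK) != CHUNK)
			ok = 0;
	}
	close(fds[1]);				/* reader will see EOF after the data */
	if (read_full(fds[0], in, sizeof in) != (long)sizeof in)
		ok = 0;
	for (i = 0; ok && i < NCHUNK; i++)
		if (in[i * CHUNK] != 'A' + i || in[i * CHUNK + CHUNK - 1] != 'A' + i)
			ok = 0;
	if (read_full(fds[0], in, 1) != 0)	/* nothing left: expect EOF */
		ok = 0;
	close(fds[0]);
	return ok;
}
```

Filling each chunk with a distinct byte, as roundtrip_ok does, is also one
cheap way to check whether chunks really arrive out of order or whether a
short read() is merely leaving part of a buffer unfilled.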