[comp.protocols.tcp-ip] Batching vs lock-step

urlichs@smurf.sub.org (Matthias Urlichs) (10/29/90)

In comp.protocols.tcp-ip, article <1990Oct25.165545@envy.bellcore.com>,
  karn@thumper.bellcore.com writes:
< 
< While we're on the subject of piggybacking, another thing I would
< really like to see is widespread use of batched SMTP on the Internet.
< I think the number of packets it takes for most SMTP implementations
< to transfer a short mail message is criminal, especially when the
< message has several recipients on the same system.  There's no reason
< that you shouldn't be able to send a series of SMTP commands in a
< single TCP segment and receive a series of responses, except that many
< SMTP servers inexplicably blow up when you try this. Given that TCP is
< supposed to be a reliable byte stream protocol, the designers of these
< systems must have gone well out of their way to keep this from working.
< 
The problem is that on top of TCP there's the C stdio library which tends to
buffer your data when a program does an fgets().

After reading the first request, said program is likely to say "wait on the
 _connection_ for ten minutes until data become available".
No programmer bothers to examine the stdio buffer first because
- the protocol is supposed to be lock-step anyway,
- examining a stdio buffer on whether it contains data is not standardized,
- the alternative is to use alarm() and signal(), but longjmp()ing from a
  signal handler back into a program, bypassing the stdio library on the way
  out, may not be what the system designers had in mind.

A side effect of this is that sending a single character to such an
implementation, and leaving the TCP stream open, will hang it indefinitely.
;-)
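
One way to sidestep all three points above is to drop stdio for input and keep
a private line buffer, calling select() only when that buffer holds no complete
line. The following is a minimal sketch under that assumption, not code from
any real SMTP or NNTP server; `struct linebuf` and `linebuf_getline()` are
hypothetical names made up for illustration.

```c
#include <assert.h>
#include <string.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical private input buffer that takes the place of a FILE *. */
struct linebuf {
    int    fd;            /* the connected socket (or any descriptor) */
    size_t used;          /* bytes currently held in data[] */
    char   data[1024];
};

/*
 * Copies the next line, without its '\n', into out.
 * Returns 1 on success, 0 on timeout or EOF, -1 on error or overlong line.
 * select() is consulted only when the buffer holds no complete line, so
 * commands that arrived batched in one segment are never left waiting.
 */
int linebuf_getline(struct linebuf *lb, char *out, size_t outlen,
                    int timeout_sec)
{
    assert(lb->used <= sizeof lb->data);

    for (;;) {
        char *nl = memchr(lb->data, '\n', lb->used);

        if (nl != NULL) {
            size_t len = (size_t)(nl - lb->data);

            if (len >= outlen)
                return -1;                /* caller's buffer too small */
            memcpy(out, lb->data, len);
            out[len] = '\0';
            lb->used -= len + 1;          /* drop the line and its '\n' */
            memmove(lb->data, nl + 1, lb->used);
            return 1;
        }
        if (lb->used == sizeof lb->data)
            return -1;                    /* line longer than the buffer */

        fd_set rfds;
        struct timeval tv = { timeout_sec, 0 };

        FD_ZERO(&rfds);
        FD_SET(lb->fd, &rfds);
        int ready = select(lb->fd + 1, &rfds, NULL, NULL, &tv);
        if (ready <= 0)
            return ready;                 /* 0 = timeout, -1 = error */

        ssize_t n = read(lb->fd, lb->data + lb->used,
                         sizeof lb->data - lb->used);
        if (n <= 0)
            return (int)n;                /* 0 = EOF, -1 = error */
        lb->used += (size_t)n;
    }
}
```

With this shape a lone character without a newline runs into the timeout
instead of hanging the server, and no signal handler is involved. Note that the
timeout restarts after every successful read; a real server would probably
track an overall deadline instead.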

-- 
Matthias Urlichs -- urlichs@smurf.sub.org -- urlichs@smurf.ira.uka.de     /(o\
Humboldtstrasse 7 - 7500 Karlsruhe 1 - FRG -- +49+721+621127(0700-2330)   \o)/

enag@ifi.uio.no (Erik Naggum) (10/29/90)

In article <cyt^f2.ck6@smurf.sub.org> urlichs@smurf.sub.org (Matthias Urlichs) writes:

   The problem is that on top of TCP there's the C stdio library which tends to
   buffer your data when a program does an fgets().

   After reading the first request, said program is likely to say "wait on the
    _connection_ for ten minutes until data become available".
   No programmer bothers to examine the stdio buffer first because
   - the protocol is supposed to be lock-step anyway,
   - examining a stdio buffer on whether it contains data is not standardized,
   - the alternative is to use alarm() and signal(), but longjmp()ing from a
     signal handler back into a program, bypassing the stdio library on the way
     out, may not be what the system designers had in mind.

   A side effect of this is that sending a single character to such an
   implementation, and leaving the TCP stream open, will hang it indefinitely.
   ;-)

I didn't get this.  Are you saying you switch between stdio routines
and raw socket reads at some point after the first fgets?

Why would anyone do this?

I've used the stdio library, which presents to me a stream of
characters, to read from sockets bound to a TCP port, and this maps
very elegantly on top of TCP, which is also a stream.  I haven't had
problems of any kind.

In particular, I can't think of any real situation where your "side
effect" would occur.

What are you doing?

Phil Karn wrote:
> Given that TCP is supposed to be a reliable byte stream protocol,
> the designers of these systems must have gone well out of their way
> to keep this from working.

If the above scenario is real, I agree completely with Phil, here.

I've seen NNTP get slightly confused when GNUS (a GNU Emacs news
reader) stuffed a whole bunch of command lines down its throat, and
the cause was that NNTP used select(2) to wait for input.  Apparently,
it read the data into a large buffer, strchr'ed for '\n', replaced it
with a '\0', and parsed the command.  Of course, this failed miserably
when GNUS was waiting for more than one reply code.  I don't think
this implies that the programmer has gone out of his way to keep it
from working, since he's only supposed to get one command line at a
time.  Anyway, it works with a conforming NNTP client, although a bit
on the "be conservative in what you accept and liberal in what you
send" side.
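
For comparison, a buffer-and-strchr style reader can cope with several command
lines in one read if it loops over every newline instead of stopping at the
first. This is only a sketch of that idea, not the actual NNTP server code;
`dispatch_lines()`, `line_handler` and `count_line()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback invoked once per complete command line. */
typedef void (*line_handler)(const char *line, void *ctx);

/*
 * Calls handle() for every complete line in buf[0..len), replacing each
 * '\n' (and a preceding '\r', if any) with '\0' in place.  Any trailing
 * partial line is moved to the front of buf; its length is returned so
 * the caller can append the next read() after it.
 */
size_t dispatch_lines(char *buf, size_t len, line_handler handle, void *ctx)
{
    char *start = buf;
    char *end = buf + len;
    char *nl;

    assert(buf != NULL && handle != NULL);

    while ((nl = memchr(start, '\n', (size_t)(end - start))) != NULL) {
        *nl = '\0';
        if (nl > start && nl[-1] == '\r')
            nl[-1] = '\0';
        handle(start, ctx);
        start = nl + 1;
    }

    size_t rest = (size_t)(end - start);
    memmove(buf, start, rest);            /* keep the unfinished line */
    return rest;
}

/* Example handler: just counts the commands it is given. */
void count_line(const char *line, void *ctx)
{
    (void)line;
    ++*(int *)ctx;
}
```

Fed a buffer holding two full commands and the start of a third, the handler
runs twice and the partial command stays in the buffer for the next read.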

--
[Erik Naggum]		Naggum Software; Gaustadalleen 21; 0371 OSLO; NORWAY
	I disclaim,	<erik@naggum.uu.no>, <enag@ifi.uio.no>
  therefore I post.	+47-295-8622, +47-256-7822, (fax) +47-260-4427
--

urlichs@smurf.sub.org (Matthias Urlichs) (10/30/90)

In comp.protocols.tcp-ip, article <ENAG.90Oct29110031@hild.ifi.uio.no>,
  enag@ifi.uio.no (Erik Naggum) writes:
< In article <cyt^f2.ck6@smurf.sub.org> urlichs@smurf.sub.org (Matthias Urlichs) writes:
< 
<  The problem is that on top of TCP there's the C stdio library which tends to
<  buffer your data when a program does an fgets().
< 
<  After reading the first request, said program is likely to say "wait on the
<   _connection_ for ten minutes until data become available".
<  [...]
<    A side effect of this is that sending a single character to such an
<    implementation, and leaving the TCP stream open, will hang it indefinitely.
<    ;-)
< 
< I didn't get this.  Are you saying you switch between stdio routines
< and raw socket reads at some point after the first fgets?
< 
No, I said (and quite a few programs actually do it) that the programs
alternate between reading via stdio and using select() on the raw socket.

< In particular, I can't think of any real situation where your "side
< effect" would occur.
< 
Consider a program which does:
- while(TCP stream OK)
-   select(TCP stream, 10 minutes)
-   fgets(stream, data)
-   process (data)

Now if you send a single character to this program, the select call will say
OK, but the fgets() will hang, waiting for a Newline.
Alternatively, if you send two short lines, the fgets() will read both into its
buffer (and return the first to the program), and next time 'round the select
will block.

< I've seen NNTP get slightly confused when GNUS (a GNU Emacs news
< reader) stuffed a whole bunch of command lines down its throat, and
< the cause was that NNTP used select(2) to wait for input.  Apparently,
< it read the data into a large buffer, strchr'ed for '\n', replaced it
< with a '\0', and parsed the command.  [...]

It's far simpler -- see above.

<   Anyway, it works with a conforming NNTP client, although a bit
< on the "be conservative in what you accept and liberal in what you
< send" side.
< 
That seems to be the problem. ;-)

-- 
Matthias Urlichs -- urlichs@smurf.sub.org -- urlichs@smurf.ira.uka.de     /(o\
Humboldtstrasse 7 - 7500 Karlsruhe 1 - FRG -- +49+721+621127(0700-2330)   \o)/