[comp.protocols.iso] TP0-4 question

cliff@violet.berkeley.edu (Cliff Frost) (10/17/89)

Hi,
I was in Marshall Rose's (very well done) class at InterOp, and that
is essentially what I know about OSI (plus some OSI sessions and random
stuff picked up from this newsgroup).  Please bear this in mind if this
turns out to be a wildly stupid question.

I think I understand (at least some of) the controversy between
connectionless and connection-oriented Network layers, but I don't
understand why OSI has to have 5 Transport standards.  TP4 could certainly
run over X.25 as well as over a connectionless network, couldn't it?

So, why not just run TP4 everywhere and leave the controversy in the 
Network layer?  

If you have two ES's connected with X.25, what cost would TP4 add if
you used it in place of TP0?  Wouldn't header prediction algorithms work
100% of the time in that case?  So, the extra cost would be the
checksumming and some timer maintenance.  But, Van Jacobson originally,
and now a whole host of other folks have shown that this processing is
NOT a bottleneck if your transport is properly implemented (and they've
also shown how to implement it).
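
For a sense of scale, here is the core of the Fletcher-style checksum
that TP4 uses (ISO 8073), as a Python sketch of my own -- the real
protocol also solves for two check octets, which I omit:

    # Two running sums mod 255, updated once per octet.  This is
    # the per-byte work that the checksumming argument is about.
    def fletcher_sums(data):
        c0 = c1 = 0
        for octet in data:
            c0 = (c0 + octet) % 255
            c1 = (c1 + c0) % 255
        return c0, c1

    print(fletcher_sums(b"example TPDU contents"))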

It can't be this simple, I must be missing some important TECHNICAL
argument.  (I'm not interested in Political reasons at the moment.)
	Thanks very much,
			Cliff Frost <cliff@cmsa.berkeley.edu>

mckenzie@bbn.com (Alex McKenzie) (10/17/89)

The cost people were trying to avoid is coding (and memory space) cost,
not execution cost.  I won't argue whether or not people should have
been concerned about these costs, but they were.  The argument was that
if they could count on a network layer that either delivered data
reliably or reset the connection, they should not be obliged to
implement a complicated Transport Layer.  They similarly argued for
"null" or highly specialized Session, Presentation, and Application
layers in many cases.  Of course politics and religion are also present
in these discussions, but you said you weren't interested in those.

Alex McKenzie
 

jh@tut.fi (Heinänen Juha) (10/17/89)

In article <1989Oct16.173618.25068@agate.berkeley.edu> cliff@violet.berkeley.edu (Cliff Frost) writes:

   It can't be this simple, I must be missing some important TECHNICAL
   argument.  (I'm not interested in Political reasons at the moment.)

The only arguments are European politics and that the PTTs know
how to take your money (= do accounting) in the X.25 world.
--
--	Juha Heinanen, Tampere Univ. of Technology, Finland
	jh@tut.fi (Internet), tut!jh (UUCP), jh@tut (Bitnet)

Christian.Huitema@MIRSA.INRIA.FR (Christian Huitema) (10/18/89)

Juha, you tend to speak too fast, with a bit of a bias against the
European PTTs, and other European bodies in general.  It is true that
their main argument for not using TP-4 is historical: they specified
TP-0 for the TELETEX service back in 1980 and want to "maintain
compatibility".  Given the failure of the TELETEX service to gain any
significant market share (users bought faxes instead), this is a very
weak argument.  It is also true that their second argument, that of
simplicity (TP-0 means 50 times less code than TP-4), does not hold
water, as TP-4 implementations are already available and just need to
be copied.  But the real difference is not between TP-4 and TP-0; it is
between CONS and CLNS.  In practice, TP-4 implementations are deployed
in a CLNS environment -- assuming a stateless, fully connected network.

The difference between CONS and CLNS is indeed largely political, as
one could very well carry isograms over X.25 virtual circuits.  But this
has a cost, i.e. the volume charges for carrying 50 extra bytes of
headers per packet, plus paying for extra acknowledgement packets: TP
ACKs are carried as data, and charged for, while X.25 RRs are not.  To
give an order of magnitude, we have observed that running TCP over IP
over X.25 involves approximately 25% overhead compared to running the
application straight over X.25; as our X.25 bill here runs to approx
500,000 FF per year, that would mean spending an extra 125,000 FF
(16,000 ECU, 20,000 US$) per year.  We would rather not.  And the
solution for not paying any overhead is called TP-0, not TP-4.
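
For concreteness, the arithmetic in a few lines (nothing here beyond
the figures already quoted):

    # Back-of-envelope restatement of the figures above.
    annual_x25_bill_ff = 500000           # approx. X.25 bill, FF/year
    overhead = 0.25                       # observed TCP/IP-over-X.25 overhead
    print(annual_x25_bill_ff * overhead)  # 125000.0 FF/year extra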

Christian Huitema

jh@TUT.FI (Heinänen Juha) (10/18/89)

   one could very well carry isograms over X.25 virtual circuits.  But this
   has a cost, i.e. the volume charges for carrying 50 extra bytes of
   headers per packet, plus paying for extra acknowledgement packets: TP
   ACKs are carried as data, and charged for, while X.25 RRs are not.  To
   give an order of magnitude, we have observed that running TCP over IP
   over X.25 involves approximately 25% overhead compared to running the
   application straight over X.25; as our X.25 bill here runs to approx
   500,000 FF per year, that would mean spending an extra 125,000 FF
   (16,000 ECU, 20,000 US$) per year.  We would rather not.  And the
   solution for not paying any overhead is called TP-0, not TP-4.

Christian,

I agree with the estimate of 25% extra, but who says that the traffic
has to be volume-charged, and who says you have to use X.25 at the
bottom at all?  We have educated our PTT to accept charging by the
bandwidth one uses to connect to the network, and there is no X.25
bullshit involved.

Right now the Finnish state PTT has started to market a service called
DATANET, which is a public TCP/IP (later also ISO/IP) service based on
Cisco routers and fixed-bandwidth charging.  I feel this is the right
approach, and one that should also be adopted by the monopolistic EC
PTTs (like yours).

-- Juha

larry@pdn.paradyne.com (Larry Swift) (10/18/89)

In article <1989Oct16.173618.25068@agate.berkeley.edu> cliff@violet.berkeley.edu (Cliff Frost) writes:
>If you have two ES's connected with X.25, what cost would TP4 add if
>you used it in place of TP0?  Wouldn't header prediction algorithms work
>100% of the time in that case?  So, the extra cost would be the
>checksumming and some timer maintenance.  But, Van Jacobson originally,
>and now a whole host of other folks have shown that this processing is
>NOT a bottleneck if your transport is properly implemented (and they've
>also shown how to implement it).

In article <46993@bbn.COM> mckenzie@labs-n.bbn.com (Alex McKenzie) writes:
>The cost people were trying to avoid is coding (and memory space) cost,
>not execution cost.  I won't argue whether or not people should have

Both of you seem to be saying that TP4 flow control (on top of X.25's
own) doesn't contribute a performance hit over a single L4 & L3 connection.
Since I have heard of contrary experiences, can you explain?  In
particular, what are "header prediction algorithms"?


Larry Swift                     larry@pdn.paradyne.com
AT&T Paradyne, LG-132           Phone: (813) 530-8605
8545 - 126th Avenue, North
Largo, FL, 34649-2826           She's old and she's creaky, but she holds!

cliff@violet.berkeley.edu (Cliff Frost) (10/19/89)

Larry,
Well, I have heard that if one does TCP/IP/X.25 there can be
problems due to flow control conflicts between TCP and X.25.  But, my
(naive) thought was that a TP4/X.25 stack might be engineered to avoid
this particular problem.  (Seems simpler than having 5 transport
protocols, each incompatible with the others...)

By "header prediction algorithms" I was referring to a part of Van
Jacobson's work with TCP.  I don't have the exact citation, but there
is an article in a very recent Journal of the IEEE by Van Jacobson,
Dave Clark and others which gives details.  (I stupidly lent my copy
to someone who appears to have lost it...)  This work should be
directly applicable to TP4, but it has nothing to do with the flow
control issue that you raise.

So far this flow control problem is the only argument I've heard that
is really technical.  Has it been spelled out somewhere, or is it
obviously insuperable to anyone very familiar with the protocols
involved?

	Thanks,
		Cliff
Re:
In article <6670@pdn.paradyne.com> larry@pdn.paradyne.com (Larry Swift) writes:
> ...
>Both of you seem to be saying that TP4 flow control (on top of X.25's
>own) doesn't contribute a performance hit over a single L4 & L3 connection.
>Since I have heard of contrary experiences, can you explain?  In
>particular, what are "header prediction algorithms"?
>
>
>Larry Swift                     larry@pdn.paradyne.com

craig@bbn.com (Craig Partridge) (10/19/89)

The header prediction algorithms are described in an article in
the July 1989 issue of IEEE Communications by Clark, Jacobson,
Romkey and Salwen.  The larger point of this article is that
protocol processing costs in hosts are lower than you might think
(and the numbers they use for analysis are fairly conservative).
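
For readers without the article to hand, a rough sketch of the
header-prediction idea -- my own simplification in Python, not the
authors' code: if an arriving segment is exactly the next in-order
data segment on an established connection, take a short fast path;
anything unusual falls through to the full protocol machine.

    from dataclasses import dataclass

    @dataclass
    class Conn:
        rcv_nxt: int = 0          # next sequence number expected

    @dataclass
    class Seg:
        seq: int
        data: bytes
        plain: bool = True        # no SYN/FIN/RST/URG, no options

    def receive(conn, seg):
        # Fast path: the overwhelmingly common case in a bulk transfer.
        if seg.plain and seg.seq == conn.rcv_nxt and seg.data:
            conn.rcv_nxt += len(seg.data)
            return "fast", seg.data
        return "slow", None       # hand off to the general code

    c = Conn()
    print(receive(c, Seg(seq=0, data=b"hello")))  # ('fast', b'hello')
    print(receive(c, Seg(seq=99, data=b"x")))     # ('slow', None)

On a single X.25 virtual circuit delivering packets in order, the test
succeeds essentially every time, which is the point of Cliff's original
question.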

As for TCP over X.25 -- my impression from experience (now some years
ago) with CSNET's IP over X.25 implementation is that the key problem
is proper round-trip estimation, given  that X.25 introduces additional
variance into the estimate.  An example may help here:

    If TCP introduces 8 segments into an X.25 network with a maximum
    outstanding window of 2 packets (say because you go through an
    X.75 gateway), then if the round-trip time of the X.25 network
    is T, the observed TCP round-trip times will be T, T, 2T, 2T,
    3T, 3T, 4T, 4T.  Wilder scenarios occur in the face of mistaken
    retransmissions by the TCP.  But a good round-trip time
    estimator (e.g. Jacobson's, described in SIGCOMM '88) should
    detect this and keep your retransmission interval correct
    for the X.25 network characteristics.
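
As an illustration (mine, using the commonly quoted gains from the
SIGCOMM '88 paper, not anyone's production code), feeding exactly that
sample pattern to a Jacobson-style estimator keeps the timeout above
the samples:

    # Smoothed RTT and variance, fed with the T, T, 2T, 2T, ...
    # samples above (T = 1 unit).
    srtt, rttvar = 1.0, 0.5                 # assumed initial values
    for sample in [1, 1, 2, 2, 3, 3, 4, 4]:
        err = sample - srtt
        srtt += err / 8                     # gain 1/8
        rttvar += (abs(err) - rttvar) / 4   # gain 1/4
        rto = srtt + 4 * rttvar             # retransmission timeout
        print(f"sample={sample}  srtt={srtt:.2f}  rto={rto:.2f}")

The timeout stays above every observed sample, so the X.25-induced
queueing does not by itself trigger spurious retransmissions.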

Note that users, however, may notice this bizarre pattern.  Until the
Nagle algorithm was in common use, users of TELNET would find that
the echo time on their characters varied widely.  It is still true
that redrawing a screen can have an oddly syncopated look to it.
One might speculate that because TP4 doesn't allow re-aggregation of data
in the transport layer, TP4 may look worse (but then again, so should TP0).
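
The Nagle rule itself is simple enough to sketch (a paraphrase of RFC
896, not any particular implementation): hold back small segments while
earlier data is unacknowledged, so single-character writes coalesce.

    # Send a small segment only if nothing is outstanding; otherwise
    # buffer it until the outstanding data is acknowledged.
    class Sender:
        def __init__(self, mss=536):
            self.mss = mss
            self.unacked = 0      # bytes sent but not yet ACKed
            self.pending = b""    # small data held back

        def write(self, data):
            self.pending += data
            if len(self.pending) >= self.mss or self.unacked == 0:
                seg, self.pending = self.pending, b""
                self.unacked += len(seg)
                return seg        # goes on the wire now
            return None           # held, to coalesce with later writes

        def ack(self, nbytes):
            self.unacked -= nbytes

    s = Sender()
    print(s.write(b"a"))          # b'a'  - nothing outstanding, sent
    print(s.write(b"b"))          # None  - held while 'a' is unacked
    print(s.write(b"c"))          # None  - coalescing
    s.ack(1)
    print(s.write(b"d"))          # b'bcd' - one segment, not three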

Craig

larry@pdn.paradyne.com (Larry Swift) (10/19/89)

In article <1989Oct19.005408.236@agate.berkeley.edu> cliff@violet.berkeley.edu (Cliff Frost) writes:
>Larry,
>Well, I have heard that if one does TCP/IP/X.25 there can be
>problems due to flow control conflicts between TCP and X.25.  But, my

No doubt the same problem as with TP4 and X.25.

>So far this flow control problem is the only argument I've heard that
>is really technical.  Has it been spelled out somewhere, or is it
>obviously insuperable to anyone very familiar with the protocols
>involved?

The obviousness of it is arguable, but in any case I can't put my
finger on any publication at the moment that explains it or gives
statistics.  I think you'll find that a model of the protocols will
show a double delay (often, if not always) while first Layer 3 waits
for an ACK, and then Layer 4 does.  This injects a significant delay
into ONE transport/network connection.  It is possible to reduce the
delay by using more connections for the one application path and by
tuning window sizes, but that always seems to me like fixing symptoms
instead of the problem -- the redundancy of function in the two layers.
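
A toy worst-case model of that double delay, under assumptions that are
mine alone (window of 1 at each layer, and the two acknowledgement
exchanges serializing rather than overlapping):

    # One-way delay d; the L4 ACK is itself carried as L3 data and
    # completes its own L3 exchange, so the round trips stack.
    d = 1.0                       # one-way delay, arbitrary units
    rtt = 2 * d

    single_layer = rtt            # only one layer acknowledges
    stacked = 2 * rtt             # L3 data+RR, then L4 ACK as L3 data+RR
    print(single_layer, stacked)  # 2.0 4.0 -> per-segment time doubles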

Larry Swift                     larry@pdn.paradyne.com
AT&T Paradyne, LG-132           Phone: (813) 530-8605
8545 - 126th Avenue, North
Largo, FL, 34649-2826           She's old and she's creaky, but she holds!

huitema@JERRY.INRIA.FR (Christian Huitema) (10/20/89)

Craig,

The outcome of "TELNET + TCP-IP" should be compared to either "X.29 +
X.25" or "VTP+PRES+SES+TP[0-4]", not to "TELNET+TP[0-4]".  Both X.29 and
VTP+PRES implement character buffers, so that the size of the packets
will grow when the TP (or network) window-based flow control becomes a
bottleneck.  I know that this works in practice with X.28/X.29, but must
confess I have never seen anyone using VTP...

In any case, you are right about the particular handling of timers
that is necessary with TCP-IP over X.25.  You get strange patterns at
low load due to the window mechanism; these were however somewhat
hidden here because we use large windows and don't use the end-to-end
window on X.25 (no ``D'' bit).  We also used to get outright congestion
patterns if too many stations competed for the X.25 VC linking two
subnets: if the effective throughput dropped under 9.6k, the whole
thing would fall apart.  That was with Berkeley 4.2 implementations,
and I suppose that better time-out prediction algorithms would have
helped, provided they could converge on a window of less than 2 Kbytes
and a timer over 5 seconds.

Christian Huitema

craig@NNSC.NSF.NET (Craig Partridge) (10/20/89)

Christian:

    Thank you for the clarification. I had not realized that VTP+PRES
had character buffers that could adapt to transport throughput.

Thanks!

Craig