[comp.protocols.tcp-ip] Pseudo-Headers & Checksumming

Stevens@A.ISI.EDU (Jim Stevens) (01/11/88)

Spencer Garrett responded to my message about TCP/IP Precedence &
Preemption and raised a very interesting point about pseudo-headers
and checksumming.  Garrett's message follows:

    > I think one of the biggest warts in TCP/IP is the stupid
    > checksum spec.  Including noncontiguous, and sometimes
    > nonexistent, fields in the checksum ensures that they will be
    > slow and awkward to compute.  If you're designing a new
    > protocol, PLEASE have the checksum include all of, and only,
    > your header and data.  If you feel you can't trust your IP
    > level to check its own header checksum, then please recompute
    > the ip checksum yourself; don't pluck bits of that header out
    > and pretend they're your own.  If the IP header got mangled,
    > you shouldn't even see the packet.  If that's not true, you
    > need to fix your IP module.  Let's hear it for layering.

Since I am designing a new protocol, I am interested in people's
comments on the implementation difficulty of using pseudo-headers. 

In addition, I am interested in any responses on why we have
pseudo-headers which are checksummed at all in TCP.  Especially in
light of the fact that there are other IP fields which TCP could use
but does not checksum.  Two such IP fields are precedence (within
IP type of service) and security options.

Is the old "End to End Argument" a reason for having pseudo-headers?
(Reference "End-to-End Arguments in System Design" by J.H. Saltzer,
D.P. Reed, and D.D. Clark, ACM Transactions on Computer Systems, Nov
1984.)

Does the pseudo-header checksumming issue have anything in common
with the issue of whether the ARP packets should have their own
checksum in addition to relying upon the Ethernet checksum?
(Remember all the messages on this subject on TCP-IP about a year or
so ago.)

If we are worried about implementation efficiency, should we place
the checksum field at the end of the protocol packet (i.e. a protocol
data unit for a particular protocol) as well as make the data to be
checksummed contiguous?  For example, David Cheriton's Versatile
Message Transaction Protocol (VMTP) places the checksum at the very
end of the VMTP packets (even after the user data) to allow the
checksum "to be calculated as part of a software copy or hardware
transmission or reception as expected in an intelligent network
interface".   (Quotes are from Cheriton's July 1987 Preliminary
Version 0.3 description of VMTP.)
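
A rough Python sketch of why a trailing checksum helps: the sender can
emit each piece of the packet as soon as it has been summed, and the
checksum goes out last.  This uses the 16-bit Internet checksum as a
stand-in; it is NOT VMTP's actual checksum algorithm, and the names
add_words, transmit and verify are invented for the example.

```python
import struct


def add_words(data: bytes, acc: int = 0) -> int:
    """Add 16-bit big-endian words into acc with end-around carry."""
    if len(data) % 2:
        data += b"\x00"  # only the final piece may be odd-length
    for i in range(0, len(data), 2):
        acc += (data[i] << 8) | data[i + 1]
    while acc >> 16:
        acc = (acc & 0xFFFF) + (acc >> 16)
    return acc


def transmit(payload: bytes, chunk_size: int = 8):
    """Yield the packet piece by piece, checksum last.

    Each piece is summed in the same pass that hands it onward, so
    nothing has to be buffered until the checksum is known.
    """
    if chunk_size % 2:
        raise ValueError("chunk_size must be even to keep word alignment")
    acc = 0
    for off in range(0, len(payload), chunk_size):
        chunk = payload[off:off + chunk_size]
        acc = add_words(chunk, acc)
        yield chunk
    yield struct.pack("!H", ~acc & 0xFFFF)  # trailer


def verify(packet: bytes) -> bool:
    """Receiver side: body and trailer must sum to all ones."""
    body, trailer = packet[:-2], packet[-2:]
    return add_words(trailer, add_words(body)) == 0xFFFF
```

With the checksum in the header instead, the sender would need a
complete pass over the data before the first byte could leave, which
is the cost Stevens is asking about.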

Jim Stevens
-------

cheriton@PESCADERO.STANFORD.EDU ("David Cheriton") (01/11/88)

VMTP has no need for pseudo headers a la TCP because the "entity identifiers",
the transport-level endpoints, are (inter)network level independent as is the
rest of the packet.  The data to be checksummed is contiguous as well.
Putting the checksum anywhere except at the very end of the packet seems
disastrous if you want to look good on a 100 megabit network.

David Cheriton

narten@PURDUE.EDU (Thomas Narten) (01/11/88)

>VMTP has no need for pseudo headers a la TCP because the "entity identifiers",
>the transport-level endpoints, are (inter)network level independent as is the
>rest of the packet.

One other important implication of the independence between the VMTP
and IP layers concerns ICMP errors. The ICMP spec requires that the
first 64 bits (8 bytes) of the transport level header be returned
along with the IP header. In protocols like TCP/UDP, the 8 byte
source/destination pair is actually part of the IP header, and by
careful arrangement of the transport header fields, source/destination
ports reside at the beginning of the packet. Because of the pseudo
header, TCP/UDP actually get back 12 bytes of useful transport header
information.  Errors such as source quench and port unreachable can then
be matched up with the protocol control block that originated the
offending datagram.
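
As an illustration of the matching described above, here is a small
Python sketch that pulls the 12 useful bytes (two addresses, two
ports) out of an ICMP error and uses them as a lookup key.  It assumes
the ICMP layout of RFC 792 (8-byte ICMP header, then the original IP
header, then the first 8 bytes of its payload).  The function names
and the pcbs table are hypothetical, and the sample datagram leaves
its checksum fields zero.

```python
import socket
import struct


def parse_icmp_error(icmp: bytes):
    """Return (icmp_type, icmp_code, key); key identifies the sender's PCB."""
    icmp_type, icmp_code = icmp[0], icmp[1]
    quoted = icmp[8:]                 # skip type, code, checksum, unused
    ihl = (quoted[0] & 0x0F) * 4      # quoted IP header length in bytes
    proto = quoted[9]
    src = socket.inet_ntoa(quoted[12:16])
    dst = socket.inet_ntoa(quoted[16:20])
    # Ports are the first four of the eight quoted transport bytes.
    sport, dport = struct.unpack("!HH", quoted[ihl:ihl + 4])
    return icmp_type, icmp_code, (proto, src, sport, dst, dport)


def sample_port_unreachable() -> bytes:
    """Hypothetical test datagram: ICMP type 3 code 3 quoting a UDP packet."""
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 28, 1, 0, 64, 17, 0,
                     socket.inet_aton("10.0.0.1"),
                     socket.inet_aton("10.0.0.2"))
    udp = struct.pack("!HHHH", 5000, 53, 8, 0)
    icmp_header = struct.pack("!BBHI", 3, 3, 0, 0)
    return icmp_header + ip + udp


# Hypothetical table of protocol control blocks, keyed the same way.
pcbs = {(17, "10.0.0.1", 5000, "10.0.0.2", 53): "resolver socket"}
```

Looking up the key returned by parse_icmp_error in pcbs finds the
control block of the sender.  A transport whose endpoint identifiers
do not fit in those first 8 bytes has nothing comparable to key on.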

In newer protocols like VMTP, I wonder if 8 bytes of transport header
is sufficient.  According to the packet format given in the 86 SIGCOMM
paper, entity identifiers are 32 bits long, hence the
source/destination identifiers would use up all 64 bits of data. This
leaves no room for other possibly important information like the
forwarded entity identifier.  Without the forwarded entity identifier
field, ICMP error processing would appear to be much more difficult if
not impossible in some cases.

For instance, consider a request to read a file. The request would
first go to the directory server, which might forward the request to a
second server which answers the query directly. If the second server
is unreachable, where would ICMP errors go? Most likely, they go back
to the first server (not the originating client). The server would not
be able to match the source entity identifier with any of its own,
and the ICMP error would likely be ignored. Meanwhile, the client
originating the request retransmits a few times and finally times out.

Is ICMP useful to newer protocols, and if not, can the Internet
operate effectively without it?

Thomas

ron@TOPAZ.RUTGERS.EDU (Ron Natalie) (01/12/88)

You seem to imply that you can throw the IP addresses away when you
get to the TCP layer.  The TCP port numbers are only significant when
paired with the ip addresses as well.  You must remember this
information somewhere, so you need the pseudoheader.  Since the idea
of the checksum is to prevent insidious little errors that "can't"
happen (it's of little use for error correction, as it is very
weak), you really want to cover all the relevant

mckenney@distek4..istc.sri.com (Paul E. McKenney) (01/12/88)

In article <8801111505.AA02117@percival.cs.purdue.edu> narten@PURDUE.EDU
(Thomas Narten) writes:
>Is ICMP useful to newer protocols, and if not, can the Internet
>operate effectively without it?

Better yet:  Can ICMP be used by newer protocols that cannot fit the
source information into eight bytes?

High-speed hosts with lots of memory to burn could use the IP
`Identification' field to help match ICMP replies with recently-sent
packets, but only as long as they never send more than 65536 packets per
RTT.  This would not interfere with the normal use of this field (as an
aid in reassembling datagrams).  Furthermore, a host could unilaterally
make this use of the field; no cooperation is required.
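
A minimal Python sketch of that idea, assuming a single per-host
counter for the 16-bit Identification field (real hosts may assign
IDs differently) and a hypothetical IdentTracker class.  It also shows
where the 65536-packet limit comes from: once the counter wraps, an
old entry is overwritten.

```python
import struct


class IdentTracker:
    """Remember which connection sent each recent IP Identification value."""

    def __init__(self):
        self.next_id = 0
        self.pending = {}  # ident -> caller-supplied context

    def send(self, context) -> int:
        """Allocate the next ident and record who is using it."""
        ident = self.next_id
        self.next_id = (self.next_id + 1) & 0xFFFF  # 16-bit wraparound
        # After 65536 sends this overwrites the oldest entry, so a late
        # ICMP error for that older packet would be matched wrongly.
        self.pending[ident] = context
        return ident

    def match(self, quoted_ip_header: bytes):
        """Look up the context for the IP header quoted in an ICMP error."""
        # Identification is bytes 4-5 of the IP header.
        (ident,) = struct.unpack("!H", quoted_ip_header[4:6])
        return self.pending.get(ident)
```

The sender calls send() for every outgoing datagram and match() when
an ICMP error arrives.  No change to the packet format is needed,
which is the "unilateral" property noted above.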

If it is necessary to support more than 65536 packets per RTT, a new IP
OPTION field analogous to the SATNET `Stream Identifier' could be defined --
at a cost of an additional 32 to 64 bits per packet, of course!

				Thanx, Paul

Stevens@A.ISI.EDU (Jim Stevens) (01/13/88)

Ron, 

Sorry if I did not explain myself sufficiently.

The IP RFC 791 Receive Packet interface to Higher Level Protocols
specifies that the following information is to be made available to the
Higher Level Protocols:

RECV (BufPTR, prot, => result, src, dst, TOS, len, opt)
 
 where:

  BufPTR = buffer pointer
  prot = protocol
  result = response
    OK = datagram received ok
    Error = error in arguments
  len = length of buffer
  src = source address
  dst = destination address
  TOS = type of service (including precedence)
  opt = option data (including security/compartment information).

Now although TCP can use all of this information, in common practice
only the prot, len, src, and dst parameters are used.

Note that TCP does NOT need a pseudoheader to receive this information,
rather the pseudoheader is ONLY used to verify that the most common
information passed to TCP from IP is End-to-End correct.

Thus the question under consideration is not whether TCP (and other
transport protocols) need the information which is passed from IP to
TCP.

Rather the questions under consideration are:

1.  Since TCP (and other transport protocols) need this information
from IP, should TCP trust the information to be correct and not to have
been corrupted along the way?

2.  If TCP cannot trust IP to deliver the information 99.9999999%
correct, why does TCP only specify part of the IP information in the
pseudo-header instead of all of the passed information?
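
To make question 2 concrete, here is a small Python sketch of a
receive-side check driven by the RECV parameters listed above.  The
function names (checksum16, pseudo_header, seal, recv_ok) are
hypothetical, and addresses are passed as raw 4-byte strings.  The
point is only that src, dst, prot and len feed the checksum while TOS
and opt do not.

```python
import struct


def checksum16(data: bytes) -> int:
    """One's-complement sum of 16-bit big-endian words, folded to 16 bits."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def pseudo_header(src: bytes, dst: bytes, prot: int, length: int) -> bytes:
    """RFC 793 pseudo-header: src, dst, zero, protocol, TCP length."""
    return struct.pack("!4s4sBBH", src, dst, 0, prot, length)


def seal(src: bytes, dst: bytes, segment: bytes) -> bytes:
    """Sender side: fill in the checksum field (bytes 16-17, zero on entry)."""
    csum = ~checksum16(pseudo_header(src, dst, 6, len(segment)) + segment)
    return segment[:16] + struct.pack("!H", csum & 0xFFFF) + segment[18:]


def recv_ok(buf: bytes, prot: int, src: bytes, dst: bytes,
            tos: int = 0, opt: bytes = b"") -> bool:
    """Receiver side: tos and opt arrive from IP but are not checked."""
    return checksum16(pseudo_header(src, dst, prot, len(buf)) + buf) == 0xFFFF
```

If IP hands up a corrupted dst or prot, recv_ok rejects the segment.
If it hands up a corrupted TOS (precedence) or security option, the
check passes unchanged, because neither is part of the pseudo-header.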


Jim Stevens

-------

deering@PESCADERO.STANFORD.EDU (Steve Deering) (01/14/88)

	From: Thomas Narten <narten@purdue.edu>:

	     ...According to the packet format given in the 86 SIGCOMM
	paper, entity identifiers are 32 bits long, hence the
	source/destination identifiers would use up all 64 bits of data. This
	leaves no room for other possibly important information like the
	forwarded entity identifier.  Without the forwarded entity identifier
	field, ICMP error processing would appear to be much more difficult if
	not impossible in some cases.

The VMTP header format has changed significantly since the SIGCOMM paper
was written.  Entity identifiers are now 64 bits long, and the client
entity id is the first field of the header, which means it (and it alone)
is returned in ICMP messages.  That works out well, because the client
entity id is all that is needed to identify the appropriate protocol
control block, both at the client end and the server end, as well as
in any intermediate forwarders.

Steve Deering

narten@purdue.EDU (Thomas Narten) (01/14/88)

Where can one find more recent information on VMTP than the SIGCOMM
paper?

Thomas

braden@VENERA.ISI.EDU (01/15/88)

Thomas,

"Real soon now" Dave Cheriton will submit an RFC on VMTP [we hope].

    Bob Braden

CERF@A.ISI.EDU (01/18/88)

Jim,

I am going through a bunch of messages serially so I haven't yet
seen any responses to your checksum query (if any). 

The pseudo-header was an attempt to fashion a true end/end checksum
at the TCP level which included everything necessary at the TCP level
to be sure you were getting a valid packet from the putative origin.

We considered replicating information from the IP header in the TCP
header as a way of making the TCP header easier to checksum, but
the header was already so big, we decided to try the pseudo-header
approach instead. 

It's possible that we just went too far on the "end/end" road and
could have left some of the information covered in the TCP checksum
out (that is, left it to IP level), but at the time, there was
great concern that the IP level would only be checked host-gateway,
gateway-gateway and gateway-host, not really end/end. So many times
we found problems with lower level subsystems by doing end/end checking
that we allowed ourselves the "awkward" luxury of the TCP pseudo-header.

I seem to recall a recent exchange on TCP-IP in which the TCP level
checksum proved very helpful in protecting against some LAN problems
arising at the IP level, but that only underscores the value of
end/end, not necessarily making an argument for the pseudo-header
we chose for TCP. As I recall it, the real purpose of the pseudo-header
was to avoid making the TCP and IP headers redundant.

Vint Cerf