[net.lan] OSI protocols in a socket environment

ch@gipsy.UUCP (Christian Huitema) (04/15/85)


While trying to implement the OSI transport and session layers in a "socket"
environment, we encountered several problems that, I suppose, may have been
solved elsewhere.

1) Connection phase:
	Unlike the "usual" (IP - TCP - UDP) protocols, the OSI
	architecture is strongly connection oriented. Each connection must
	be acknowledged. Connection request and connection acknowledge may
	carry "user" data (e.g. layer N+1 connection rq/ak). The connection
	acknowledge has end to end significance, and carries the result of
	a negotiation (e.g. end to end throughput class, message length,
	etc). Thus, we propose the following improvements:
		1) Allow the passing of (limited size) user data in the
		"CONNECT" primitive.
		2) Create a "CONFIRM" primitive, similar to "CONNECT", that
		would be invoked after "ACCEPT".
	Between the invocation of ACCEPT and CONFIRM, the user could
	call "SET_SOCK_OPT" to fix negotiated parameters. The "CONNECT"
	primitive should return the "user_data" sent within the CONFIRM.

2) Transfer phase:
	OSI connections enable the users to transmit "messages". The
	boundaries of these messages are part of the syntax, and should be
	preserved during the transfer.
	User messages above the transport layer are not limited in size. More
	precisely, their size is only limited by an end to end agreement
	between the applications. Examples of likely sizes are 1 to 3 Kbytes
	for "Teletex", and 100 Kbytes to 1 or 2 Mbytes for image transfers.
	These messages will be cut into manageable size segments at the
	transport level.

	If we were writing the OSI protocols as v7 character oriented
	drivers, this would not be a problem. It would be sufficient to
	accumulate the incoming segments (OSI connections preserve their
	order), within a "read", into the "u_area". The end to end
	negotiation of message sizes implies that the user programs would
	be able to forecast the maximum size of the messages, and pass the
	correct arguments to the "read" call. They will be happy if they
	are only woken up after an "end of message" segment has been
	received.

	This is less easy to do in a socket environment. The "correct" thing
	would be to accumulate all segments into a chain of "mbufs" until an
	"end-of-message" indication has been received, and then pass the
	whole chain to the upper layer. But this implies that, depending
	only on the application process, a huge amount of memory would have
	to be held at intermediate levels. It would be much "nicer" to
	pass the incoming segments to the upper levels, along with an
	"end-of-message" mark. A possible way would be to terminate a chain
	of "mbufs" with the pointer "(int)m_next = -1" if more data is
	expected, or "m_next = 0" if this is the last segment of a message.

There are also some minor problems when implementing all the features of the
OSI session layer, e.g. checkpoints or activities. However, I would like to
know the opinion of Unix experts about these two extensions, i.e. "confirm"
and the "end-of-message" mark.

Christian HUITEMA			()!mcvax!inria!gipsy!ch
GIPSI/INRIA
Rocquencourt BP105
78153 Le Chesnay - FRANCE

chris@umcp-cs.UUCP (Chris Torek) (04/26/85)

Well, you can actually do almost anything you want between a socket()
and a connect() or accept() call, by using (or abusing) the
setsockopt() system call at the SOL_PROTO level (yes I know this
doesn't work in 4.2 as distributed).  This is how we set the stream
type and EOM bit in our XNS implementation.

As for reading and writing arbitrary sized messages, this would take
some hacking.  The kernel really wants to move all the data into mbufs
before passing it to (or receiving it from) the protocol level.  This
makes the protocol code cleaner, since it need not deal with sockbufs.

It might be nice if the write code told the protocol level code what
uio_resid was after it put the data into the mbuf chain that goes with
the PRU_SEND request; this would at least let the protocol estimate the
amount of data left to be moved.  A new mbuf field on the input side
for holding the amount of data yet to be received might also then be
useful.  Of course, this means you'd have to pretend that your atomic
protocol was actually a stream:  more kludgery.  Oh well.

In any case, it is clear that the existing code is not up to dealing
with multi-kilobyte sends and receives.  Pick a real solution and try
it, but please don't store funny integers in mbuf pointers!  (I removed
the "m_act = 1" kludge for atomic protocols when we implemented XNS.
I was amused to discover that Sam Leffler also removed the same kludge
from the UCB kernels at about the same time.  Isn't there some quote
about minds that applies? :-) )
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 4251)
UUCP:	{seismo,allegra,brl-bmd}!umcp-cs!chris
CSNet:	chris@umcp-cs		ARPA:	chris@maryland

sylvain@lvbull.UUCP (Sylvain Langlois RCG-ARS) (05/04/85)

 In article <151@gipsy.UUCP>, ch@gipsy.UUCP (Christian Huitema) says:
>	Contrarily to the "usual" (IP - TCP - UDP) protocols, the OSI
>	architecture is strongly connection oriented. 
Not really! I always thought TCP was connection oriented. And now, we've
got a Connectionless Network Protocol (ISO DIS 8473) very similar to IP in
some ways!

>		1) Allow the passing of (limited size) user data in the
>		"CONNECT" primitive.
>		2) create a "CONFIRM" primitive, similar to "CONNECT", that
>		would be invoiced after "ACCEPT".
As you know, Christian, I've been working on this problem for a few months
now. I don't think you really need to pass user-data down in the "connect"
call. That requires too much change to the existing code (even if it is not a
*huge* change). The way I do it is to pass user-data in a "setsockopt" call,
prior to any action that may need it (before the "connect" when I'm about to
send a CR TPDU, and after an "accept" when a CC is about to be sent).
I agree that *negotiation* of connection parameters is the main
point. In fact, nobody tells you to use "connect" and "accept" the way the
Berkeley people did when coding TCP/IP. In particular, the "accept" doesn't
have to mean that the connection is already established! I do use a new system
call (called "confirm", believe it or not!) which tells the kernel whether the
Transport Service User really accepts the connection or not. That gives
me time to perform negotiation through the "setsockopt/getsockopt" mechanism.

>	Between the invocation of ACCEPT and CONNECT, the user could
>	"SET_SOCK_OPT" to fix negociated parameters. The "CONNECT" primitive
>	should return the "user_data" sent within the CONFIRM.
The same point as above: "confirm" doesn't have to carry any user-data field.
I can always do a "getsockopt" with an appropriate flag to see if there
was any user-data in the previously received TPDU.
I keep my "confirm" call as small as possible so that it ports easily. All
it does is say to the kernel: "Right, Big Chief, I'm done feeding you
junk qos parameters! Build your TPDU and send it to my friend over the net!"

>	.... The "correct" thing
>	would be to accumulate all segments into a chain of "mbufs" until an
>	"end-of-message" indication has been received, and then pass the
>	whole chain to the upper layer. But this implies that, depending
>	only of the application process, a huge amount of memory would have
>	to be stored at intermediate levels. It would be much "nicer" to
>	pass the incoming segments to the upper levels, along with an
>	"end-of-message" mark. A possible way would be to terminate a chain
>	of "mbufs" by a pointer "(int)m_next = -1" if more data is excepted,
>	or "m_next = 0" if this is ths last segment of a message.
What about non-blocking reads? I will not disagree: this is the worst point of
the OSI services. It may be even worse if you try to do clean interface
flow control to regulate data passing. But maybe we could use some socket
state flags to let the user know whether the "message" is over or not.
One other problem you did not mention is "expedited data" handling. I
still don't know if the "out-of-band" facility is a good way to do it. Anyway,
the OOB stuff needs to be adapted to what the OSI TS User is expecting.

Why don't you cross the road and come over here? We could have a cup of tea
(sorry, no beer here!) and discuss all these tiny vicious problems
we've got in common: you are welcome at any time!


PS: Sorry for not speaking French to you, but I've been told that some of
the other news readers didn't understand our dialect!

PPS:
>	However, I would like to know the opinion of Unix experts ... 
I'm not really one, but don't tell anybody, in case they think I am!


-- 
Sylvain "Panic Trap" Langlois		
UUCP Address:	(...!mcvax!vmucnam!lvbull!sylvain)
Postal Address:	BULL, PC 33/05, 68 route de Versailles,
		F-78430 Louveciennes, France.

mark@cbosgd.UUCP (Mark Horton) (05/06/85)

In article <391@lvbull.UUCP> sylvain@lvbull.UUCP (Sylvain Langlois) writes:
> In article <151@gipsy.UUCP>, ch@gipsy.UUCP (Christian Huitema) says:
>>	Contrarily to the "usual" (IP - TCP - UDP) protocols, the OSI
>>	architecture is strongly connection oriented. 
>Not really! I always thought TCP was connection oriented. And now, we've
>got a Connectionless Network Protocol (ISO DIS 8473) very similar to IP in
>some ways!

My understanding is that there are two contingents in the ISO standardization
effort: the Telcos are pushing the connection oriented protocols and
the computer scientists are pushing the connectionless protocols.  The
result is that the ISO network layer has both: CLNS for IP style datagrams
and CONS for X.25 style virtual circuits.

If I correctly interpret this, it means that internetworking is suddenly
4 times as complex as it used to be - connection oriented transport protocols
have to be prepared to deal with either CLNS or CONS network protocols,
and connectionless transport protocols have to deal with both also.  IP
makes internetworking a reasonable problem - only datagrams need to be
sent, so there is no state in the internetwork layer.  I shudder to think
of routing a datagram across potentially several networks, some of which
are connectionless and some connection oriented.

The problem is further complicated when you look at the protocols that
fit underneath CLNS/CONS.  For LANs, it's not a problem: you can use
Ethernet or any number of datagram based networks.  However, for a long
haul network, I understand that X.25 is the only game in town, and that
therefore CLNS will have to be implemented on top of X.25 virtual circuits.
This, of course, defeats the whole purpose of CLNS, which is to cheaply
pass a datagram to an arbitrary host without the overhead of setting
up a connection.  Given these cheap (but unreliable) datagrams, it is
then possible to cleanly construct a reliable transport layer protocol,
a la TCP on top of IP.

I gather you folks in France are more closely tied into what's really
going on than I am.  Is there a nice solution to these problems in the
works, or is it really this big a mess?

	Mark