[comp.sys.hp] rpc registration and how to tell of a cnode death ?

markl@hpbbi4.HP.COM (#Mark Lufkin) (05/29/90)

> What!!?  Are you suggesting that Greg rewrites `amd' to use NCS??

	I must admit I have no idea what 'amd' is or where it comes
	from but the idea is not as stupid as you might think ... it does
	not take long to convert a program (it has been done in house ...
	could have something to do with the fact that we are pushing NCS
	now though and having SUNrpc based applications does not look
	good :-)). Being serious, the exercise is not as stupid as you would
	like to imply.

> Yawn.  Look, I've got nothing against HP/Apollo NCS, and I've heard that
> it has some technical improvements over SunRPC.  Great - I have some
> tentative thought of my own about SunRPC failings.  However, I've never
> heard the particular technical arguments in favour of NCS (despite asking
> HP in January), and *would* be grateful if someone would post a brief
> account of the differences, or point me at an HP document.  Enquiring
> minds want to know.
> 
> To be brutal, I get the feeling that your posting is simply OSF posturing.
> Fine, there are certainly people out there that like this sort of tosh
> -- but please don't post it out in the guise of a non-answer to some guy's
> question about a vaguely related problem.  If you want to advertise
> NCS vs. SunRPC in this forum, *please* tell us why it's better.
> ``Because OSF says it is'' is not really good enough!

	OK, so I asked for that a bit. I promise I don't work for OSF though.
	It was not OSF posturing but I would admit to NCS posturing - it is
	my opinion though that NCS is better.

	The reasons why NCS was chosen over the Netwise/SUNrpc offering are
	summarised below; there is a fuller description in the following documents:

	Apollo NCA and SUN ONC: A Comparison (Nathaniel Mishkin)

	A Comparison of Commercial RPC Systems (Joshua Levy)
	A response to the above by Nathaniel Mishkin

	I do not have any of these electronically - if I find them I will
	make them available - if anyone else has them, they could do the
	same (the last two were in comp.misc some time back).

	Just as a summary of why OSF chose NCS over the SUNrpc/Netwise offering:

	- minimal programming complexity - NCS adheres more closely to
	  the local procedure model, making it easier to use (a rough
	  sketch of what this looks like follows this list).

	- the RPC protocol is clearly defined and not subject to change by
	  the user (as it is with the Netwise RPCTool IDL compiler).

	- uniform transport behaviour. NCS does not depend on transport
	  characteristics. An example - if UDP is chosen as the underlying
	  protocol, limits are imposed on the size and number of arguments
	  as well as on the reliability. NCS allows the choice of a variety of
	  transports without affecting the operation.

	- allows integration with a threading package, also integrated with
	  authentication s/w (Kerberos). Also allows a 'pipe' capability for
	  bulk transfer of data.
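
	As a quick illustration of the 'local procedure model' point,
	here is a rough sketch. All the names in it are invented; it is
	not real NCS output, just the general shape of a call made
	through a generated client stub:

	    /* Application code: the remote call reads exactly like a
	     * local one; the generated client stub hides the marshalling
	     * and the network I/O.  (All names invented.)
	     */
	    void print_balance(handle_t bank, long account_no)
	    {
	        status_t st;
	        long     balance;

	        balance = bank_get_balance(bank, account_no, &st);
	        if (st == status_ok)
	            printf("balance = %ld\n", balance);
	    }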

> >> Q-2) If you're on a cluster server, is there a way you can tell when/how a cnode
> >> dies (loses contact with the server) ?  This would seem to be so unusual of a 
> >> request.
> 
> >       Diskless nodes don't core dump (unless
> >       they have local swap) so you will not be able to get more information
> >       on why the crash occurred.
> 
> That's really neat.  Any reason why?

	Easy ... the crash is put into swap, the swap is on the server which
	we have just lost contact with ...
> 
> >       ... As far as the
> >	server is concerned it is simply that it can no longer communicate with
> >	the client.
> 
> Quite.  Getting the client to say ``I've died'' is a bit tricky once its
> dead :-) Detecting when it dies is a bit more feasible - you can
> periodically ping the client workstation to check that it's still
> responding.  There are a few ways to do this:

	Actually, diskless DOES use pinging. When it finds that it has not
	received any messages from a client it sends a ping packet and expects a
	reply. If after a kernel-definable number of tries the client still
	has not responded, then it is declared dead.
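
	Very roughly, the logic is something like the following sketch
	(my own pseudo-code with made-up names, not the actual kernel
	code):

	    /* Nothing heard from the cnode lately: ping it, retry up to
	     * a kernel-definable limit, then give it up for dead.
	     */
	    int tries;

	    for (tries = 0; tries < ping_retry_limit; tries++) {
	        send_ping(cnode);
	        if (ping_reply_received(cnode, ping_timeout))
	            break;                         /* still alive */
	    }
	    if (tries == ping_retry_limit)
	        declare_cnode_dead(cnode);         /* lost contact */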
> 
> 
> tim marsland,   <tpm@cam.eng.ac.uk>
> information engineering division,
> cambridge university engineering dept.,
> ----------

	OK, that's enough from me. I will now shut up for a while and try
	not to offend anyone else (seems to be difficult these days).
	I will admit to being biased in my opinions but then so is
	everyone else (and the world would be a boring place if we weren't).

tschuess,
Mark Lufkin
Tech Support
HP GmbH

tpm@eng.cam.ac.uk (tim marsland) (05/30/90)

In article <1720009@hpbbi4.HP.COM> markl@hpbbi4.HP.COM (#Mark Lufkin) writes:
>> What!!?  Are you suggesting that Greg rewrites `amd' to use NCS??
>
>	I must admit I have no idea what 'amd' is or where it comes
>       from but the idea is not as stupid as you might think

Well, speaking for myself, not knowing anything about a program generally
stops me from posting replies to people's questions about it :-)

>       ... it does
>	not take long to convert a program (it has been done in house ...
>	could have something to do with the fact that we are pushing NCS
>	now though and having SUNrpc based applications does not look
>       good :-)

Applications like NFS you mean?  Or is that going now?  This is one of the
more serious undercurrents that underlie this discussion.

>       Being serious, the exercise is not as stupid as you would
>	like to imply.

Also being serious, I concur that one can often rewrite programs to use
different RPC paradigms.  However, I'm not exactly sure how useful
gratuitous rewriting is in the context of the original query (summary:
"How do I register amq?").  Note that `amd' is only(!) about 17000+
lines of cunning C that has been ported to at least ten different vendors'
Unix variants ;-)

However, the main reason why I was so shocked by the suggestion (and
shocked I was) is not the idea of rewriting per se, but more that `amd' is
itself an NFS server i.e. it has to be based around SunRPC.
Specifically, rewriting `amd' to use a different RPC paradigm would have
meant rewriting the NFS code in the kernel of the local machine and of all
the mount daemons of the file servers it was accessing, *as* *well* *as*
amd itself.  Quite a few man-months of ``conversion'' work for poor Greg
methinks :-)

Perhaps now you understand why I tried to imply rewriting was stupid in
this case, and the general ferocity of my flame?

>	The reasons why NCS was chosen over the Netwise/SUNrpc offering are
>	summarised below; there is a fuller description in the following documents:
>
>	Apollo NCA and SUN ONC: A Comparison (Nathaniel Mishkin)
>
>	A Comparison of Commercial RPC Systems (Joshua Levy)
>	A response to the above by Nathaniel Mishkin
>
>	I do not have any of these electronically - if I find them I will
>	make them available - if anyone else has them, they could do the
>	same (the last two were in comp.misc some time back).

and

>	Just as a summary of why OSF chose NCS over the SUNrpc/Netwise offering:
>       ...

This is more like it, Mark -- thanks.  I'd very much like to read these
documents, and your summary of the factors that influenced the decision is
much appreciated.

>> >> Q-2) If you're on a cluster server, is there a way you can tell when/how a cnode
>> >> dies (loses contact with the server) ?  This would seem to be so unusual of a 
>> >> request.
>> 
>> >       Diskless nodes don't core dump (unless
>> >       they have local swap) so you will not be able to get more information
>> >       on why the crash occurred.
>> 
>> That's really neat.  Any reason why?
>
>	Easy ... the crash is put into swap, the swap is on the server which
> we have just lost contact with ...

That's not what I meant, i.e. the crash might not be *caused* by a network
error.  For example, there might have been a bug in a pseudo-device I was
trying to add to the kernel (say).  Is it still the case that the client
can't core dump when it panics for non-network-related problems?

>	I will admit to being biased in my opinions but then so is
>	everyone else (and the world would be a boring place if we weren't).

Yes, but ... this is a technical forum, not a marketing forum :), so we've
all got to try and keep our opinions under control wherever possible :-)

>	OK, that's enough from me. I will now shut up for a while and try
>	not to offend anyone else (seems to be difficult these days).

Hey, no problem Mark -- it's been fun.  Just hope everyone else has
enjoyed it.  I'll shut up now too.

tim marsland,   <tpm@cam.eng.ac.uk>
information engineering division,
cambridge university engineering dept.,

pae@athena.mit.edu (Philip Earnhardt) (06/13/90)

In article <1720009@hpbbi4.HP.COM> Mark Lufkin writes:

> In article <8606@rasp.eng.cam.ac.uk> tim marsland writes:
>> Yawn.  Look, I've got nothing against HP/Apollo NCS, and I've heard that
>> it has some technical improvements over SunRPC.  Great - I have some
>> tentative thought of my own about SunRPC failings.  However, I've never
>> heard the particular technical arguments in favour of NCS (despite asking
>> HP in January), and *would* be grateful if someone would post a brief
>> account of the differences, or point me at an HP document.  Enquiring
>> minds want to know.
>> 
>> To be brutal, I get the feeling that your posting is simply OSF posturing.
>> Fine, there are certainly people out there that like this sort of tosh
>> -- but please don't post it out in the guise of a non-answer to some guy's
>> question about a vaguely related problem.  If you want to advertise
>> NCS vs. SunRPC in this forum, *please* tell us why it's better.
>> ``Because OSF says it is'' is not really good enough!

Well, some time has passed since Mark's response to the request; I
wanted to present another point of view.  Disclaimer: I'm an engineer
for Netwise, and my opinions may not be impartial.  In any case, my
opinions do not reflect any official position of Netwise.

> OK, so I asked for that a bit. I promise I don't work for OSF though.
> It was not OSF posturing but I would admit to NCS posturing - it is
> my opinion though that NCS is better.
>
> The reasons why NCS was chosen over the Netwise/SUNrpc offering are
> summarised below; there is a fuller description in the following documents:
>
> Apollo NCA and SUN ONC: A Comparison (Nathaniel Mishkin)
>
> A Comparison of Commercial RPC Systems (Joshua Levy)
> A response to the above by Nathaniel Mishkin
>
> I do not have any of these electronically - if I find them I will
> make them available - if anyone else has them, they could do the
> same (the last two were in comp.misc some time back).

At the time, Mishkin was an employee of Apollo. In June of 1989, Tony
Andrews, a Netwise employee, posted "A Review of Current Product
Offerings", which has various benchmarks of current Sun/Apollo/Netwise
offerings (I believe this was to comp.protocols.tcp-ip). This would be
an appropriate document to look at, along with the ones that Mark
mentioned. I will re-post it if desired.

> Just as a summary of why OSF chose NCS over the SUNrpc/Netwise offering:

Mark's reference appears to be the "OSF Distributed Computing Environment
Rationale", dated May 14, 1990, from the Open Software Foundation. OSF is
specifying an entire Distributed Computing environment--Naming, Security,
Threads, as well as RPC.

OSF chose NCS 2.0: "...a joint submission of Digital and Hewlett-Packard."
This is not a shippable product. It is not appropriate to compare an
unavailable NCS 2.0 against Sun's RPCGen. Sun has announced availability of
the Netwise RPC Tool in the Second Half of 1990 ("Distributed Computing Road
Map--An Outlook on the Future of Open Network Computing" Sun Microsystems,
Inc., dated April 30, 1990). A comparison of NCS 2.0 and these new Sun
offerings would be appropriate. Based on Mark's highlights from the OSF
document:

> - minimal programming complexity - NCS adheres more closely to
>  the local procedure model, making it easier to use.

I'm not sure what this is a reference to. Both NCS 1.x and the current
Netwise RPC Tool have been generating C code for client and server
stubs. These stubs handle packing and unpacking the data and
networking operations, making an RPC call appear as if it's a local
subroutine call. Both will continue to use stub generation in future
versions.
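
To make the stub idea concrete, here is a rough sketch of the sort of
thing a generated client stub does. Every name in it is invented; this is
neither NCS nor RPC Tool output, just the general shape:

    /* Hypothetical generated client stub: marshal the arguments, send
     * the request, wait for the reply, unmarshal the result.
     */
    int remote_add(conn_t *conn, int a, int b)
    {
        buffer_t req, rep;

        buf_init(&req);
        pack_int(&req, a);                  /* marshal arguments */
        pack_int(&req, b);
        send_request(conn, OP_ADD, &req);   /* network I/O       */
        recv_reply(conn, &rep);
        return unpack_int(&rep);            /* unmarshal result  */
    }

The application simply calls remote_add(conn, 2, 3) as if it were local;
the server-side stub does the mirror image (unmarshal the arguments, call
the real procedure, marshal the result).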

Besides the joint Netwise/Sun submission to OSF, Sun submitted RPCGen
separately. Perhaps the OSF is talking about the non-transparency of RPCGen.
I would be interested in seeing some clarification of what OSF is talking
about--it's not clear from their Rationale.

> - the RPC protocol is clearly defined and not subject to change by
>  the user (as it is with the Netwise RPCTool IDL compiler).

The OSF rationale is referring to Netwise's customization feature. Both
the server and client sides of an RPC call are modeled as state
machines. A user can add hooks to modify what happens in a particular
state or change the state transitions.

There are two implications in the OSF document. The first is that
customization of the RPC specification is not valuable. Netwise's
experience has been that customization is important to our customers.
The second implication is that customization creates possible
interoperability problems. 

Since this is a technical forum, an appropriate way to explain
Netwise's philosophy is to provide a real example of customization.
I'll apologize in advance for glossing over some of the details, but
hopefully the general point will be clear:

One of our customers wanted to have a multi-tasking server (i.e., a
server that can handle requests from multiple clients simultaneously).
Multi-tasking servers are quite straightforward in a UNIX
environment--Netwise supplies "canned" code to provide a multitasking
server for UNIX systems.  Unfortunately, our customer's OS does not
support the concept of copying network file descriptors on a fork()
call. Put another way, the OS's fork() mechanism does
not permit multiple processes to use the same address for connections.
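
For reference, the "canned" UNIX multitasking server is essentially the
classic accept-and-fork loop. Roughly (my own sketch, not the actual
Netwise code; handle_requests() is a made-up per-client request loop):

    #include <sys/socket.h>
    #include <unistd.h>

    extern void handle_requests(int conn);     /* hypothetical */

    /* Accept a connection and fork a child to serve it.  The child
     * inherits the connected descriptor across fork(); that inheritance
     * is exactly what the customer's OS could not do.
     */
    void serve_forever(int listen_fd)
    {
        for (;;) {
            int conn = accept(listen_fd, 0, 0);  /* peer address not needed */

            if (conn < 0)
                continue;
            if (fork() == 0) {              /* child: serve this client */
                close(listen_fd);
                handle_requests(conn);
                _exit(0);
            }
            close(conn);                    /* parent: back to accept() */
        }
    }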

Netwise created a customization so that a client's first RPC call would
instead perform two RPC calls. The first call is to the named server,
which returns the (unnamed) address of a worker to the client. The
second call is from the client to the worker which establishes a
connection for subsequent RPC calls.
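
In outline, the client side of that customization behaves something like
this (all names invented, error handling and details glossed over):

    /* The application's first RPC on this binding really makes two calls
     * under the covers: ask the named server for a worker, then connect
     * to that (unnamed) worker for this and all subsequent calls.
     */
    addr_t   worker;
    conn_t  *conn;

    worker = get_worker_address(named_server);   /* call 1: named server */
    conn   = connect_to_worker(&worker);         /* call 2: the worker   */

    result = account_lookup(conn, account_no);   /* what the app asked for */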

From the point of view of the applications programmer, a single RPC
call has been performed. The customization, which is declared in the
RPC Specification File, provides a means of containment for the custom
functionality of the call. The application *could* have made a
separate RPC call to get the address of the worker, but that doesn't
permit the applications programmer to maintain the abstraction of the
RPC call. If our customer moves his server to a UNIX environment, he
can simply remove the customization, re-compile, and run with no
modification to the application code.

An RPC Toolkit could include some "standard" set of customization
features, but which ones would you include? Customizations could
include modifications to security, naming, auditing, asynchronous RPC
calls, etc. Which do you include with the Toolkit? Netwise felt the
only sane way to proceed was to let the users add whatever they
want -- an "Open Architecture" approach.

Finally, it's not clear what an "RPC protocol" is or what it would
mean to change one.  My feeling is that the RPC Specification File is
specifying the protocol for some set of RPC calls. The customization
is part of that Specification File. Sure, you could create a buggy
customization. In Toolkits where RPC customization wasn't available,
you could have bugs in the application code performing the
customization. Bugs are bugs. Again, the difference is that Netwise
contains the customization in one centralized location, providing a
layer of abstraction for the applications programmer.

This may well wind up being a *religious issue* about RPC technology. 

> - uniform transport behaviour. NCS does not depend on transport
>  characteristics. An example - if UDP is chosen as the underlying
>  protocol, limits are imposed on the size and number of arguments
>  as well as on the reliability. NCS allows the choice of a variety of
>  transports without affecting the operation.

The Apollo NCS 1.x uses datagram-oriented transports. Unfortunately,
RPC needs to be built on top of a reliable transport. Apollo built
their own protocol on top of datagram transports to achieve
reliability. In some (all?) systems, these are not implemented in the
kernel.

It's unclear why a home-brew reliable transport would be superior to
using the reliable transport provided by the OS. It could be slightly
faster in some environments, particularly if few packets are lost.
Tuning in more hostile environments would probably be difficult,
particularly since the application may not have access to the
appropriate real-time tools that kernel code does. Finally, there is
the added code space for implementing the reliable transport on top of
the datagram transport.
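
To be concrete, "reliability on top of datagrams" boils down to something
like the following stop-and-wait sketch. It is greatly simplified (no
sequence numbers or duplicate suppression) and is not Apollo's actual
protocol; reply_arrives_within() is an invented helper, e.g. a select()
wrapper:

    #include <sys/socket.h>

    #define MAX_RETRIES   5
    #define TIMEOUT_SECS  2

    extern int reply_arrives_within(int sock, int secs);   /* hypothetical */

    /* Send a request over an unreliable datagram socket and retransmit
     * until a reply arrives or the retries run out.
     */
    int reliable_call(int sock, char *req, int reqlen, char *rep, int replen)
    {
        int attempt;

        for (attempt = 0; attempt < MAX_RETRIES; attempt++) {
            send(sock, req, reqlen, 0);
            if (reply_arrives_within(sock, TIMEOUT_SECS)
                && recv(sock, rep, replen, 0) > 0)
                return 0;                     /* got the reply */
        }
        return -1;                            /* peer presumed unreachable */
    }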

In the NCS2.0 product, my understanding is that they will use both
connection-oriented and datagram-oriented transports (Mike?).  The
next-generation Netwise tool will be offering datagram transports.  However,
we will be offering the raw datagram functionality to the application--message
delivery will be unreliable and message size will be limited to the size
permitted by the transport. This is what OSF means when they say we don't have
uniform transport behavior.

Why did we make this choice? Basically, we feel that a connection-oriented
transport is the way to go. However, if an applications writer is willing to
deal with the reliability and space constraints, then a raw datagram transport
interface can be used.  Assuming appropriate reliability characteristics for
the datagram transport, we will have much lower overhead per packet than
either flavour of NCS2.0. Finally, a datagram transport is necessary to
support broadcast.

NCS2.0 will run on top of either a datagram or a connection-oriented
transport, but you're really getting a connection-oriented service in either
case. Mike: what will NCS2.0 do WRT broadcast? Will it be available with both
types of transport? If broadcast is not available under connection-oriented
transports, won't this constitute non-uniform transport behavior?

If a Netwise user really wanted unlimited data under datagrams, he could add
it with customization :-).

> - allows integration with a threading package, also integrated with
>  authentication s/w (Kerberos). Also allows a 'pipe' capability for
>  bulk transfer of data.

The Netwise tool has customization that would give an analogue to
'pipe'. The callback customization would allow a server to throttle
data coming from the client.

Sun's RPC currently does not have multithread support; they have
announced multithread support in the Second Half of 1991. They will
have Kerberos in the first half of 1991 (see document reference
above).

Netwise's current products support multithreaded applications (with
the restriction that each connection be accessed within a single
thread). What does NCS2.0 do in environments that don't support
multiple threads? 

----------------------------------------------------------------

Well, that's about it. I have two other issues about the OSF offering.  First,
will OSF be using OSI protocols? If so, which ones?  Also, what about
availability: have any dates been announced? What platforms will be supported?
Will the OSF DE be available on non-UNIX systems? What about DOS PCs?

Phil Earnhardt          Netwise, Inc.  2477 55th St.  Boulder, CO 80301
Phone:303-442-8280      UUCP: onecom!wldrdg!pae

bb@beach.cis.ufl.edu (Brian Bartholomew) (06/13/90)

> ---( Netwise's introductory text omitted )---
> ---( Defense of example as appropriate omitted )---
>
> One of our customers wanted to have a multi-tasking server (i.e., a
> server that can handle requests from multiple clients simultaneously).
> Multi-tasking servers are quite straightforward in a UNIX
> environment--Netwise provides "canned" code to provide a multitasking
> server for UNIX systems.  Unfortunately, our customer's OS does not
> support the concept of copying network file descriptors on a fork()
> call. Saying the problem another way, the OS's fork() mechanism does
> not permit multiple processes to use the same address for connections.

Let me see if I have this straight.  We are discussing two competing
networking standards that are or will be released with versions of U*IX.
But, instead of being shown an example of how NCS is more appropriate
for U*IX, I am shown how it fixes a problem with another OS.  If no
example can be found that shows it clearly superior for U*IX,
then perhaps there isn't a clear technical advantage.

> ---( Program flow description omitted )---
> ---( Architectural overview omitted )---
>
> ...If our customer moves his server to a UNIX environment, he
> can simply remove the customization, re-compile, and run with no
> modification to the application code.

If the customer had used a more capable OS in the first place, there
would be no reason to supplement its features via a networking package
add-on.  So instead, the U*IX users, the vast majority of users for
whom this standard is being written in the first place, are to be
penalized in wasted kernel memory, dead code, and unused features.

> ---( Tool-building philosophy omitted )---
> ---( RPC toolkit description omitted )---
> ---( History of Apollo implementation omitted )---
> ---( Discussion of what-is-an-RPC omitted )---
> ---( Discussion of features duplicated with U*IX omitted )---

Is there a forest that I am missing, but for all these trees?

"Any sufficiently advanced technology is indistinguishable from a rigged demo."
-------------------------------------------------------------------------------
Brian Bartholomew       UUCP:       ...gatech!uflorida!beach.cis.ufl.edu!bb
University of Florida   Internet:   bb@beach.cis.ufl.edu