[comp.dcom.lans] stateful vs stateless servers

trt@rti.UUCP (Thomas Truscott) (01/22/89)

> I became a firm believer in NFS's stateless server approach the very first
> time I saw a server crash and reboot. The client applications using it just
> froze.  When the server came back to life, they picked up and continued as
> if nothing had happened.

But stateful approaches do this too!
It has nothing to do with the server (which crashed, after all);
it has to do with whether or not
the *client* can recover the server's state.
Many people are firm believers in a stateful client approach,
though perhaps they are not aware of it.

Okay, what about servers, should they be stateful too?  Of course!
UNIX semantics demand it (and so do those of VMS, MS-DOS,
and every other operating system I can think of).
An obvious example is a remote "mkdir" (or unlink, exclusive open, etc.)
that succeeds on the server, but the ACK is dropped and so the retry fails.
Another is access to special devices, such as autodialers and tape drives.
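
A minimal sketch of the dropped-ACK case, assuming a toy in-memory
server; the names Server and client_mkdir are hypothetical and are
not part of NFS or any real RPC library.  It only shows why a
blindly retransmitted non-idempotent request can report failure
even though the operation succeeded.

```python
import errno


class Server:
    """Toy in-memory file server; only the set of directories is modelled."""

    def __init__(self):
        self.dirs = set()

    def mkdir(self, path):
        # Exclusive create: fails if the directory already exists.
        if path in self.dirs:
            return errno.EEXIST
        self.dirs.add(path)
        return 0


def client_mkdir(server, path, drop_first_reply):
    """Send mkdir, retransmitting once if the first reply is lost."""
    status = server.mkdir(path)  # the request arrives and is executed
    if drop_first_reply:
        # The ACK never arrived, so the client times out and retransmits.
        # The server has no memory of the first request and runs it again.
        status = server.mkdir(path)
    return status


server = Server()
lost_ack_status = client_mkdir(server, "/tmp/a", drop_first_reply=True)
clean_status = client_mkdir(Server(), "/tmp/a", drop_first_reply=False)
```

With the reply dropped, the client sees EEXIST although the directory
was in fact created by its own first request.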

What happens when a stateful server crashes and reboots?
The (stateful) client applications freeze, and when the server comes up
they reestablish their state and continue as if nothing had happened.
Of course reestablishing an ordinary file's state is much easier
than reestablishing that of a modem or graphics device,
so applications using the latter will probably fail.
But that is better than using stateless services which do not
allow such things in the first place!
Keep in mind that crashes are rare, even on NFS systems
(whose users seem to talk about crashes a lot).
With a stateful server, correct UNIX (or perhaps MVS, AOS, whatever)
semantics are easy, natural, and the norm.
Recovering state lost due to a crash is, however,
only an approximation, and is called "error recovery".
With stateless servers, approximation is the whole idea.
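
A minimal sketch of that recovery for an ordinary file, assuming a toy
server that keeps open-file handles in volatile memory and a client
that shadows the file offset itself.  StatefulServer, StatefulClient
and crash_and_reboot are hypothetical names for illustration, and only
a single client is modelled.

```python
class StatefulServer:
    """Toy server holding per-client open-file state in volatile memory."""

    def __init__(self, files):
        self.files = files      # path -> bytes; stands in for the disk
        self.handles = {}       # handle -> [path, offset]; lost on a crash
        self.next_handle = 0

    def open(self, path, offset=0):
        self.next_handle += 1
        self.handles[self.next_handle] = [path, offset]
        return self.next_handle

    def read(self, handle, n):
        path, offset = self.handles[handle]  # KeyError if the state is gone
        data = self.files[path][offset:offset + n]
        self.handles[handle][1] = offset + len(data)
        return data

    def crash_and_reboot(self):
        # The disk contents survive; the open-file table does not.
        self.handles.clear()
        self.next_handle = 0


class StatefulClient:
    """Client that keeps its own copy of the state so it can rebuild it."""

    def __init__(self, server, path):
        self.server = server
        self.path = path
        self.offset = 0
        self.handle = server.open(path)

    def read(self, n):
        try:
            data = self.server.read(self.handle, n)
        except KeyError:
            # The server forgot our handle: reopen at the remembered
            # offset and repeat the read.
            self.handle = self.server.open(self.path, self.offset)
            data = self.server.read(self.handle, n)
        self.offset += len(data)
        return data


server = StatefulServer({"/etc/motd": b"hello, world"})
client = StatefulClient(server, "/etc/motd")
first = client.read(5)
server.crash_and_reboot()
second = client.read(7)
```

The application calling client.read() never sees the reboot.  A device
such as a modem has no equivalent of "reopen at the remembered offset",
which is why recovery there will probably fail.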

While on the subject of crash recovery, please note that
fault tolerant systems provide it by replicating their state,
not by placing it in a single basket.

So how does NFS survive with stateless servers?
The first line of defense is that the network
rarely drops a packet.  The second line of defense is to introduce
server state (oops!) in the form of retry caches, and of course
the infamous "inode generation numbers".  This works well
except under heavy load or when the system crashes.
The third line of defense is to introduce a suite of "network services"
that provide stateful support for a few "most wanted" features
lacking in NFS such as file locking.
(I think vendors have to pay Sun extra $$ to get them.)
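
A minimal sketch of the retry cache idea from the second line of
defense, assuming requests carry a transaction id ("xid") and the
cache lives only in server memory.  CachingServer is a hypothetical
name; a real cache would also key on the client's address and would
be bounded in size, so entries can be evicted under heavy load.

```python
import errno


class CachingServer:
    """Toy server that replays cached replies for retransmitted requests."""

    def __init__(self):
        self.dirs = set()        # stands in for the disk; survives a crash
        self.reply_cache = {}    # xid -> status; in memory only

    def mkdir(self, xid, path):
        if xid in self.reply_cache:
            # Retransmission of a request already executed: replay the
            # original reply instead of running the operation again.
            return self.reply_cache[xid]
        status = errno.EEXIST if path in self.dirs else 0
        if status == 0:
            self.dirs.add(path)
        self.reply_cache[xid] = status
        return status

    def crash_and_reboot(self):
        # The cache is server state, and a crash loses it.
        self.reply_cache.clear()


server = CachingServer()
first = server.mkdir(xid=1, path="/tmp/a")
retry = server.mkdir(xid=1, path="/tmp/a")
server.crash_and_reboot()
retry_after_crash = server.mkdir(xid=1, path="/tmp/a")
```

The retry succeeds while the cache entry exists.  After the reboot the
same retransmission is treated as a new request and gets EEXIST.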

The fourth line of defense is that "NFS is not UNIX"
and that this is a small price to pay for crash recovery
and for operation in non-UNIX environments.
Well, it has always seemed to me that "NFS is not UNIX"
because "UNIX is not stateless",
that stateless servers are not necessary for crash recovery,
and that stateless servers hinder operation in non-UNIX environments.
Please consider the theorem:

	There is nothing a stateless system can do
	that cannot be done by a stateful system.

The proof is left as an exercise for the reader.

	Tom Truscott

guy@auspex.UUCP (Guy Harris) (01/24/89)

 >The third line of defense is to introduce a suite of "network services"
 >that provide stateful support for a few "most wanted" features
 >lacking in NFS such as file locking.
 >(I think vendors have to pay Sun extra $$ to get them.)

No.  The lock daemon, and other ONC services, come with the NFS tape, at
least for the latest ONC/NFS source tape....