[comp.unix.questions] Robust Mounts

mliverig@Verity.COM (Mike Liveright x7627) (01/05/90)

A few months back there was a discussion about making "nfs" networks
robust. It was my impression that there was a consensus that:

1) Soft mounts of read-write file systems would increase the risk of
corrupting the file systems.

2) Automounting was a good idea that was not yet safe.

3) Hard mounts should be placed so that a "get current directory" call
does not run into dead mount points. For example, if the user is in
/a/b/c, it is important not to mount another disk at /a/aa: the call would
lock up when, while searching /a for "b", it ran into the dead "/a/aa".
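
To make point 3 concrete, here is a minimal sketch in Python of the walk-up
technique that older "get current directory" implementations are generally
described as using. It is an illustration only, not the SunOS source, and
the name walk_up_getcwd is made up for this example. The lstat() on each
entry of the parent directory is the step that would block if one of those
entries were a dead hard mount.

```python
import os

def walk_up_getcwd(start="."):
    """Rebuild the absolute path of `start` by climbing through "..".

    At each level, the parent directory is searched for the entry whose
    (device, inode) pair matches the directory we just came from.
    """
    parts = []
    cur = start
    while True:
        parent = os.path.join(cur, "..")
        here = os.stat(cur)
        above = os.stat(parent)
        # At the root, ".." refers to the directory itself.
        if (here.st_dev, here.st_ino) == (above.st_dev, above.st_ino):
            break
        for name in os.listdir(parent):
            try:
                # This is the call that would hang if `name` were a
                # dead hard-mounted NFS mount point (e.g. /a/aa).
                entry = os.lstat(os.path.join(parent, name))
            except OSError:
                continue  # entry vanished or is unreadable; skip it
            if (entry.st_dev, entry.st_ino) == (here.st_dev, here.st_ino):
                parts.append(name)
                break
        else:
            raise FileNotFoundError("no entry for the directory in its parent")
        cur = parent
    return "/" + "/".join(reversed(parts))
```

Calling walk_up_getcwd(".") from /a/b/c examines the entries of /a/b, then
of /a, then of /. A sibling such as /a/aa is examined even though it is not
on the path being rebuilt.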

Question:
  a) Is this a correct summary of the situation today with respect to
         SunOS 4.0.1 and 4.0.3?
  b) Are there other thoughts on increasing the robustness of networked
         CPUs?

I will summarize the responses if there seems to be interest. Thanks...
    Mike Liveright, VERITY, mliverig@verity.com
---------------------------------------------

deke@ee.rochester.edu (Dikran Kassabian) (01/10/90)

In article <10284@zodiac.ADS.COM> mliverig@spark.uucp writes:
>A few months back there was a discussion about making "nfs" networks
>robust. It was my impression that there was a consensus that:
>
>1) Soft mounts of read-write file systems would increase the risk of
>corrupting the file systems.

I agree with this, and stick to it as a loose rule.  There are some occasions
when I ignore it, after thinking it over carefully.

>
>2) Automounting was a good idea that was not yet safe.

This is what caught my eye.  I have been reading up on, and testing, the
automounter for the last week and a half.  Why was it considered 'not yet
safe'?  I have apparently missed an important discussion.

      ^Deke Kassabian,   deke@ee.rochester.edu   or   ur-valhalla!deke
   Univ of Rochester, Dept of EE, Rochester, NY 14627     (+1 716-275-3106)

urlichs@smurf.ira.uka.de (01/12/90)

In comp.unix.questions deke@ee.rochester.edu writes:
< 
< In article <10284@zodiac.ADS.COM> mliverig@spark.uucp writes:
< >A few months back there was a discussion about making "nfs" networks
< >robust. It was my impression that there was a consensus that:
< >
< >1) Soft mounts of read-write file systems would increase the risk of
< >corrupting the file systems.
< 
How?

After all, whoever mounts the NFS volume is not doing any device-level stuff
(it's Network _File_, not _Volume_ System, after all ;-) ), so how could a
file system get corrupted?

If you mean that the internal structure of some files could get screwed up, on
the other hand, you're perfectly correct.

< I agree with this, and stick to it as a loose rule.  There are some occasions
< when I ignore it, after thinking it over carefully.
< 
Such as? (Inquiring minds want to know...)

-- 
Matthias Urlichs

deke@ee.rochester.edu (Dikran Kassabian) (01/16/90)

In article <1376@smurf.ira.uka.de> urlichs@smurf.ira.uka.de writes:
>In comp.unix.questions deke@ee.rochester.edu writes:
>< 
>< In article <10284@zodiac.ADS.COM> mliverig@spark.uucp writes:
>< >A few months back there was a discussion about making "nfs" networks
>< >robust. It was my impression that there was a consensus that:
>< >
>< >1) Soft mounts of read-write file systems would increase the risk of
>< >corrupting the file systems.
>< 
>How?

In the same way a system crash can result in corruption of local disks.
Summary information can be inaccurate depending on exactly when the crash
takes place, as it relates to pending disk writes.

An NFS rw,hard mount is a win in this case...  the process on the NFS client
hangs until the NFS mount becomes available again, and so gets to continue.
Not that this guarantees you a clean file system, but I believe that your
chances are lots better.
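
The difference between the two mount types can be sketched as a retry loop.
This is a toy model in Python, not the NFS client code: nfs_call and
make_server are hypothetical names, retrans stands in for the retransmission
limit of a soft mount, ETIMEDOUT is an assumed error code, and the real
timeouts and back-off delays are left out.

```python
import errno

def nfs_call(send, mode, retrans=3):
    """Issue one request through `send`, retrying on timeout.

    send    -- callable that returns a reply or raises TimeoutError
    mode    -- "hard" (retry forever) or "soft" (give up eventually)
    retrans -- retransmissions allowed before a soft mount gives up
    """
    timeouts = 0
    while True:
        try:
            return send()
        except TimeoutError:
            timeouts += 1
            if mode == "soft" and timeouts > retrans:
                # The caller sees a failed read()/write().
                raise OSError(errno.ETIMEDOUT, "NFS server not responding")
            # mode == "hard": loop again; the process appears hung.

def make_server(down_for):
    """Hypothetical server that times out on its first `down_for` requests."""
    state = {"seen": 0}
    def send():
        state["seen"] += 1
        if state["seen"] <= down_for:
            raise TimeoutError
        return "ok"
    send.state = state
    return send
```

With a server that is down for five requests, nfs_call(make_server(5), "hard")
keeps retrying and eventually returns, while the same call with "soft" and
retrans=3 raises an error that the application has to handle.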

>After all, whoever mounts the NFS volume is not doing any device-level stuff
>(it's Network _File_, not _Volume_ System, after all ;-) ), so how could a
>file system get corrupted?
>
>If you mean that the internal structure of some files could get screwed up, on
>the other hand, you're perfectly correct.

Absolutely;  I have seen this happen.

>< I agree with this, and stick to it as a loose rule.  There are some occasions
>< when I ignore it, after thinking it over carefully.
>< 
>Such as? (Inquiring minds want to know...)

Any time that small, infrequent writes from an NFS client are wanted, I'd
consider it.   Here's an example of when I'd use rw,soft:

* Server A has an area of disk I'd infrequently like to write to from client B.
* The data written will consist of small files which don't grow (perhaps from
	maintaining single record, statistical summaries).
* Server A has a tendency to go down more often than I like, and I don't
	want to hang client B when this happens ... I'd rather restore the
	data files from backups and/or run fsck in the rare case when I 
	happened to be writing just as A crashed.

BUT:

My preferred solution would be to use SunOS automount(8) or Jan-Simon Pendry's
'amd'.  I'm still hoping someone will comment on my question, which
asked about automounter, and why it might be considered 'not yet safe'.

>
>-- 
>Matthias Urlichs

      ^Deke Kassabian,   deke@ee.rochester.edu   or   ur-valhalla!deke
   Univ of Rochester, Dept of EE, Rochester, NY 14627     (+1 716-275-3106)

urlichs@smurf.ira.uka.de (01/17/90)

In comp.unix.questions deke@ee.rochester.edu (Dikran Kassabian) writes:
< In article <1376@smurf.ira.uka.de> urlichs@smurf.ira.uka.de writes:
< > In article <10284@zodiac.ADS.COM> mliverig@spark.uucp writes:
< > >
< > >1) Soft mounts of read-write file systems would increase the risk of
< > >corrupting the file systems.
< > 
< >How?
< 
< In the same way a system crash can result in corruption of local disks.
< Summary information can be inaccurate depending on exactly when the crash
< takes place, as it relates to pending disk writes.
< 
< An NFS rw,hard mount is a win in this case...  the process on the NFS client
< hangs until the NFS mount becomes available again, and so gets to continue.
< Not that this guarantees you a clean file system, but I believe that your
< chances are lots better.
< 
Well, I fail to see why, given the following sequence of events
- client sends NFS request
- server (partially or completely) processes the request
- server crashes

any one of the following outcomes
- client times out, user program gets error
or
- client hangs until server is back, user program continues
or
- client gets disconnected by automount until server is back, user program
  gets error

could possibly have any impact on the probability that
- server disk needs to be fsck'd, probably dropping some files
or that
- buffer was not written on server, causing inconsistent database although the
  client got an OK return from NFS.

A hard NFS mount obviously improves your chances if
- server crashes but managed to write its buffers, but
- client was doing things which left the database inconsistent.

In that case a hard mount helps, whereas either a soft mount or a client
disconnected by an automount daemon would cause problems, because the client
has no way to get the database back into a consistent state.

I hope I'm not missing anything here.
< 
< BUT:
< 
< My preferred solution would be to use SunOS automount(8) or Jan-Simon Pendry's
< 'amd'.  I'm still hoping someone will comment on my question, which
< asked about automounter, and why it might be considered 'not yet safe'.
< 
My understanding of an automount daemon is:
- It periodically tests if the server is still there.
- If the server has not answered for N seconds, it is unmounted. This has the
  same effect as a soft mount in that the client, trying to read or write a
  file, gets an error.
- Any request to the server returns an error immediately until the server is
  back online, in which case
- the automounter reconnects the client to the server.
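
The list above can be written down as a small state machine. This sketch
models only the behaviour as I have described it, not the actual automount(8)
or amd implementation; AutomountModel, grace_seconds and the EIO error code
are all assumptions made for the illustration.

```python
import errno

class AutomountModel:
    """Toy model of the behaviour described in the list above."""

    def __init__(self, grace_seconds):
        self.grace = grace_seconds  # the "N seconds" from the list
        self.mounted = True
        self.last_seen = 0.0        # time of the last successful probe

    def probe(self, now, server_alive):
        """Periodic liveness check, driven by the caller's clock."""
        if server_alive:
            self.last_seen = now
            self.mounted = True     # server is back: reconnect
        elif now - self.last_seen >= self.grace:
            self.mounted = False    # silent for N seconds: unmount

    def request(self):
        """A client read or write against the mount point."""
        if not self.mounted:
            raise OSError(errno.EIO, "server unmounted")
        return "forwarded to server"
```

In this model a client that was in the middle of a multi-step update when the
server went quiet gets an error from request() once the grace period has
passed, which is the same position a soft mount would leave it in.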

Would anyone more knowledgeable please enlighten me if I'm wrong?
-- 
Matthias Urlichs