[comp.unix.aix] Control NFS exported filesystems

CALT@SLACVM.SLAC.STANFORD.EDU (06/18/91)

On my machines, the 530s, there are a number of filesystems exported
to be mounted by a number of other machines here.

By experience, I have learned that mounting NFS filesystems with the
hard/foreground options may cause the machines to hang, while the
soft/background options seem to work OK.

Here comes the problem: if some machine DOES mount my exported filesystems
with the hard/foreground options, it may cause my machines to hang.  Is there
any way to configure my exported filesystems as follows:
   Only the machines which use the soft/background options will be allowed
to use my exported filesystems.

Is it possible? Or is there any other way to solve the problem?

Thanks in advance for any advice!

Ching Shih
shih@cithex.bitnet
shih@cithe1.cithep.caltech.edu

marc@ekhomeni.austin.ibm.com (Marc Wiz) (06/19/91)

In article <91169.000329CALT@SLACVM.SLAC.STANFORD.EDU>,
CALT@SLACVM.SLAC.STANFORD.EDU writes:
> By experience, I learned that mounting NFS filesystems by the
> hard/foreground options may cause the machines hanging, while
> by the soft/background options seems to work OK.
>

To put it mildly, this is not a good thing to do.  Remember, if you mount the
filesystem soft, the client process will get an error after three retries.
If your application can handle this, fine, but I have to wonder how many
applications cannot.  If you care about your data, I recommend hard mounts.
At least when the server/network comes back up, the data will be written/read
to/from the server.
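
Whether an application can handle a soft-mount failure comes down to whether
it notices the error at all.  Here is a minimal sketch of the kind of checking
I mean (hypothetical pathname; the exact errno you get back -- ETIMEDOUT or
EIO -- depends on the NFS client implementation):

    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        const char buf[] = "important data\n";
        ssize_t n;
        int fd;

        fd = open("/nfs/export/datafile", O_WRONLY | O_CREAT, 0644);
        if (fd < 0) {
            perror("open");
            return 1;
        }

        n = write(fd, buf, sizeof(buf) - 1);
        if (n != (ssize_t)(sizeof(buf) - 1)) {
            /* On a soft mount a dead server surfaces here (or at fsync/close)
             * as an error; the application must decide whether to retry,
             * save the data elsewhere, or give up. */
            fprintf(stderr, "write: %s\n",
                    n < 0 ? strerror(errno) : "short write");
            return 1;
        }

        if (fsync(fd) < 0 || close(fd) < 0) {
            perror("fsync/close");
            return 1;
        }
        return 0;
    }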

> Here comes a problem: If some machine DOES mount my exported filesystems
> by the hard/foreground options, it may cause my machines hanging. Is there
> any way to configure my exporting filesystems as follows:
>    Only the machines which use the soft/background options will be allowed
> to use my exported filesystems.
> 

The mount options are controlled from the client.  The server has no control
over this.

What are you trying to accomplish?


Marc Wiz 				MaBell (512)823-4780

NFS/NIS Change Team
Yes that really is my last name.
The views expressed are my own.

marc@aixwiz.austin.ibm.com 
or
uunet!cs.utexas.edu!ibmchs!auschs!ekhomeni.austin.ibm.com!marc

mrl@uai.com (Mark R. Ludwig) (06/19/91)

In article <8567@awdprime.UUCP>, marc@ekhomeni (Marc Wiz) writes:
>In article <91169.000329CALT@SLACVM.SLAC.STANFORD.EDU>,
>CALT@SLACVM.SLAC.STANFORD.EDU writes:
>> By experience, I learned that mounting NFS filesystems by the
>> hard/foreground options may cause the machines hanging, while
>> by the soft/background options seems to work OK.
>>
>
>To put it mildly this is not a good thing to do.  Remember if you mount the
>filesystem soft the client process will get an error after three
>retries.  If your
>application can handle this fine but I have to wonder how many applications
>can not.  If you care about your data I recommend hard mounts.  At least when
>the server/network comes back up the data will be written/read to/from
>the server.

I agree fully.  At least one of the Sun administration manuals states
it bluntly: if you are mounting the NFS read/write, you should mount
it hard.  To do otherwise is to risk corrupted files.  However, I
believe if you have *very* intelligent applications manipulating the
files, you may disregard this warning, but I dare say the average Unix
utility is not in this category.  Furthermore, why would you want
this?  Since your application probably really wants to write the file
it was trying to write when the server went silent, it has to keep
trying until the server responds.  With ``hard'' the system does that
for you.

The second part which I want to address is the foreground/background
part.  We use the ``bg'' option routinely, because the NFS partitions
are not required for the system to operate, and this allows the system
to finish multi-user startup without mounting all the NFS partitions.
The NFS partitions are only required for certain applications to run.
If the partition is required for the system, you probably must use
foreground.

>> Here comes a problem: If some machine DOES mount my exported filesystems
>> by the hard/foreground options, it may cause my machines hanging. Is there
>> any way to configure my exporting filesystems as follows:
>>    Only the machines which use the soft/background options will be allowed
>> to use my exported filesystems.

Come again?  You're saying that the *server* is hanging because the
*client* mounts the NFS hard?  I've never seen that happen.

>What are you trying to accomplish?

Right.  This is the first question we have to ask.  It helps to get
answers when you explain what you really want to do, and the
circumstances which caused you to be wedged into the corner.  Maybe
then we can get you centered in the room.
-- 
INET: mrl@uai.com       UUCP: uunet!uaisun4!mrl       PSTN: +1 213 822 4422
USPS: 7740 West Manchester Boulevard, Suite 208, Playa del Rey, CA  90293
WANT: Succinct, insightful statement to occupy this space.  Inquire within.

jona@iscp.Bellcore.COM (Jon Alperin) (06/19/91)

In article <1991Jun19.154830.17276@uai.com>, mrl@uai.com (Mark R. Ludwig) writes:
|> In article <8567@awdprime.UUCP>, marc@ekhomeni (Marc Wiz) writes:
|> >In article <91169.000329CALT@SLACVM.SLAC.STANFORD.EDU>,
|> >CALT@SLACVM.SLAC.STANFORD.EDU writes:
|> >> By experience, I learned that mounting NFS filesystems by the
|> >> hard/foreground options may cause the machines hanging, while
|> >> by the soft/background options seems to work OK.
|> >>
|> >
|> >To put it mildly this is not a good thing to do.  Remember if you mount the
|> >filesystem soft the client process will get an error after three
|> >retries.  If your
|> >application can handle this fine but I have to wonder how many applications
|> >can not.  If you care about your data I recommend hard mounts.  At least when
|> >the server/network comes back up the data will be written/read to/from
|> >the server.
|> 
|> I agree fully.  At least one of the Sun administration manuals states
|> it bluntly: if you are mounting the NFS read/write, you should mount
|> it hard.  To do otherwise is to risk corrupted files.  However, I
|> believe if you have *very* intelligent applications manipulating the
|> files, you may disregard this warning, but I dare say the average Unix
|> utility is not in this category.  Furthermore, why would you want
|> this?  Since your application probably really wants to write the file
|> it was trying to write when the server went silent, then your
|> application has to keep trying until the server responds.  With
|> ``hard'' the system does it for you.

	Hey...maybe this explains the reason that when I save a file
under VI which is kept on another NFS partition, VI tells me that it
was able to save the file, but because the real physical disk was full I
end up with a 0 length file (and lose all my work).....

|> -- 
|> INET: mrl@uai.com       UUCP: uunet!uaisun4!mrl       PSTN: +1 213 822 4422
|> USPS: 7740 West Manchester Boulevard, Suite 208, Playa del Rey, CA  90293
|> WANT: Succinct, insightful statement to occupy this space.  Inquire within.

-- 
Jon Alperin
Bell Communications Research

---> Internet: jona@iscp.bellcore.com
---> Voicenet: (908) 699-8674
---> UUNET: uunet!bcr!jona

* All opinions and stupid questions are my own *

jackv@turnkey.tcc.com (Jack F. Vogel) (06/20/91)

In article <1991Jun19.162331.25505@bellcore.bellcore.com> jona@iscp.Bellcore.COM (Jon Alperin) writes:
>In article <1991Jun19.154830.17276@uai.com>, mrl@uai.com (Mark R. Ludwig) writes:
[ stuff about using the 'hard' mount for data integrity deleted...]
>
>	Hey...maybe this explains the reason that when I save a file
>under VI which is kept on another NFS partition, VI tells me that it
>was able to save the file, but because the real physical disk was full I
>end up with a 0 length file (and lose all my work).....
 
No, I don't believe that mounting the filesystem 'hard' will prevent this
from happening.  The reason this can happen is that the NFS client is doing
a bawrite() (asynchronous), so it doesn't get an immediate error; rather,
the inode is just marked in error.  If you wrote enough data that multiple
calls to bawrite() were necessary, then the error would be noticed and vi
would tell you.

In the NFS in 4.3BSD-Reno, there was a mount option, 'synchronous', that
solves this by forcing the client to use bwrite(), so you are guaranteed
to see the error; of course, you are going to suffer somewhat of a
performance hit by using it.  I don't know if the current SunOS has
such an option or not.  Also, I don't know what level of NFS the 6000 is
based on, so it could have such an option for all I know; check the
man page for mount.

Disclaimer: I'm a kernel hacker not a company spokesweenie :-}.

-- 
Jack F. Vogel			jackv@locus.com
AIX370 Technical Support	       - or -
Locus Computing Corp.		jackv@turnkey.TCC.COM

marc@ekhomeni.austin.ibm.com (Marc Wiz) (06/20/91)

In article <1991Jun19.172354.9964@turnkey.tcc.com>,
jackv@turnkey.tcc.com (Jack F. Vogel) writes:
> From: jackv@turnkey.tcc.com (Jack F. Vogel)
> >	Hey...maybe this explains the reason that when I save a file
> >under VI which is kept on another NFS partition, VI tells me that it
> >was able to save the file, but because the real physical disk was full I
> >end up with a 0 length file (and lose all my work).....
>  
> No, I don't believe that mounting the filesystem 'hard' will prevent this
> from happening. The reason this can happen is that the NFS client is doing
> a bawrite() (asynchronous) so it doesn't get an immediate error, rather
> just the inode is marked in error. If you wrote enough data that multiple
> calls to bawrite() were necessary then the error would be noticed and vi
> would tell you.
> 
> In the NFS in BSD 4.3 reno, there was a mount option 'synchronous' that
> solves this by forcing the client to use bwrite(), thus you will be
> guaranteed to see the error, of course you are going to suffer somewhat
> of a performance hit by using it. I don't know if the current SunOS has
> such an option or not. Also I don't know what level of NFS the 6000 is
> based on, so it could have such an option for all I know, check the
> man page for mount.

Mounting the file system hard will not prevent the problem.  An application
will not "see" the error until an fsync or close is done on the file.
Probably what is happening with vi is that it is not checking the return
from close.  I can't ever remember seeing someone check the return from a
close in all the code that I have looked at.  (I'm sure there are folks out
there who have looked at a lot more code than I have :-)

NFS for the 6000 does not have a sync option for mount.  There is a way to
force synchronous operations on a file.  First of all, opening the file
O_SYNC does NOT do it.  (I don't know whether this is a problem or not.)
The way to do it is to open the file and then obtain a lock on a piece of
the file.  When NFS sees that a lock has been obtained, it turns off caching
for that file.
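
For what it's worth, a minimal sketch of that locking trick (hypothetical
pathname; it assumes the lock daemons are running and that the client
behaves as described above):

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        struct flock fl;
        int fd;

        fd = open("/nfs/export/datafile", O_RDWR | O_CREAT, 0644);
        if (fd < 0) {
            perror("open");
            return 1;
        }

        /* Lock the first byte of the file.  When the NFS client sees the
         * lock it turns off caching for the file, so writes go to the
         * server synchronously and errors show up on the write itself. */
        fl.l_type = F_WRLCK;
        fl.l_whence = SEEK_SET;
        fl.l_start = 0;
        fl.l_len = 1;
        if (fcntl(fd, F_SETLKW, &fl) < 0) {
            perror("fcntl(F_SETLKW)");
            return 1;
        }

        if (write(fd, "data\n", 5) != 5)
            perror("write");

        fl.l_type = F_UNLCK;
        (void) fcntl(fd, F_SETLK, &fl);
        if (close(fd) < 0)
            perror("close");
        return 0;
    }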

One thing to point out is that you would take one heck of a performance hit
if you were running synchronously.  One could do an fsync after every write,
but that means taking a performance hit, not to mention that you have to
change your application.  I'm sure that there are other ways to handle this
without impacting performance too much.  One way to do it (IMHO) is to have
your application hold on to its buffers between fsync calls.  If you get an
error from fsync, correct the problem and then reissue the fsync for that
batch of writes.  This means that the application will need some way of
knowing when to resume (i.e. the problem, a.k.a. file system full, has been
corrected).
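
A rough sketch of that hold-the-buffers approach (hypothetical function name;
the "wait until the problem is corrected" part is just a sleep-and-retry
loop here):

    #include <errno.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/types.h>
    #include <unistd.h>

    /* Write one batch that the application keeps in memory, then fsync it.
     * If the write or the fsync fails -- say the server filesystem is
     * full -- wait and retry the whole batch from the retained copy. */
    int write_batch(int fd, const char *buf, size_t len, off_t off)
    {
        for (;;) {
            if (lseek(fd, off, SEEK_SET) == (off_t)-1)
                return -1;
            if (write(fd, buf, len) == (ssize_t)len && fsync(fd) == 0)
                return 0;               /* batch is safely on the server */
            fprintf(stderr, "batch failed: %s; retrying in 30 seconds\n",
                    strerror(errno));
            sleep(30);                  /* wait for the problem to be corrected */
        }
    }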


As an aside does anyone out there check the return from close in their
programs?

Marc Wiz 				MaBell (512)823-4780

NFS/NIS Change team
Yes that really is my last name.
The views expressed are my own.

marc@aixwiz.austin.ibm.com 
or
uunet!cs.utexas.edu!ibmchs!auschs!ekhomeni.austin.ibm.com!marc

teexand@ioe.lon.ac.uk (Andrew Dawson) (06/20/91)

In <1991Jun19.162331.25505@bellcore.bellcore.com> jona@iscp.Bellcore.COM (Jon Alperin) writes:

>	Hey...maybe this explains the reason that when I save a file
>under VI which is kept on another NFS partition, VI tells me that it
>was able to save the file, but because the real physical disk was full I
>end up with a 0 length file (and lose all my work).....

This sounds like something we have been discussing with IBM recently. I think
essentially the client is caching requests, so although write() returns
successfully, the data hasn't been written to disk. Your application may pick up
an error if fsync is called, and the close should also fail (not that many
applications check this). However, I'm not sure even this much worked until
we'd applied an APAR fix.
-- 
#include <std_disclaimer.h>  /* My brain was swiss-cheesed when I wrote this */
JANET:    andrew@uk.ac.ucl.sm.uxm     UUCP/EARN/BITNET: andrew@uxm.sm.ucl.ac.uk
INTERNET: andrew%uxm.sm.ucl.ac.uk@nsfnet-relay.ac.uk
"Leapers do it with assistance from neurological holograms"

jackv@turnkey.tcc.com (Jack F. Vogel) (06/20/91)

In article <8623@awdprime.UUCP> marc@aixwiz.austin.ibm.com writes:
>In article <1991Jun19.172354.9964@turnkey.tcc.com>,
>jackv@turnkey.tcc.com (Jack F. Vogel) writes:

[ stuff about using the 'hard' nfs mount option not preventing data loss]

|Mounting the file system hard will not prevent the problem.  An
|application will
|not "see" the error until a fsync or close is done on the file.  Probably what
|is happening with vi is that it is not checking the return from close.  I can't
|ever remember seeing someone check the return from a close with all the
|code that
|I have looked at. 
 
Right, it's not just an issue of user-written applications not doing this,
it's that every application "command" in the system, like vi, doesn't do
this. Marc, you and I both know where this issue is coming from (like a
certain customer complaint :-) and this is BOGUS. The user is free to
write code to check the return from close should he like, but rewriting
vi?!? NO, the right option as far as I am concerned is what was done
in 4.3BSD-Reno: provide a synchronous option to mount!!

>As an aside does anyone out there check the return from close in their
>programs?

We could probably generate a month-long thread of arguments in
comp.unix.wizards on the appropriateness of this :-} :-}!

Disclaimer: I hack the kernel, I don't speak for the company.

-- 
Jack F. Vogel			jackv@locus.com
AIX370 Technical Support	       - or -
Locus Computing Corp.		jackv@turnkey.TCC.COM

chip@tct.com (Chip Salzenberg) (06/20/91)

According to marc@aixwiz.austin.ibm.com:
>As an aside does anyone out there check the return from close in their
>programs?

Yes.  In writing the mail delivery program "Deliver", I was (and still
am) paranoid about checking all system calls for which I have some
reasonable action in case of failure.
-- 
Chip Salzenberg at Teltronics/TCT     <chip@tct.com>, <uunet!pdn!tct!chip>
 "You can call Usenet a democracy if you want to.  You can call it a
  totalitarian dictatorship run by space aliens and the ghost of Elvis.
  It doesn't matter either way."  -- Dave Mack

jona@iscp.Bellcore.COM (Jon Alperin) (06/20/91)

In article <8623@awdprime.UUCP>, marc@ekhomeni.austin.ibm.com (Marc Wiz) writes:
|> In article <1991Jun19.172354.9964@turnkey.tcc.com>,
|> jackv@turnkey.tcc.com (Jack F. Vogel) writes:
|> > From: jackv@turnkey.tcc.com (Jack F. Vogel)

	<from my original post>

|> > >	Hey...maybe this explains the reason that when I save a file
|> > >under VI which is kept on another NFS partition, VI tells me that it
|> > >was able to save the file, but because the real physical disk was full I
|> > >end up with a 0 length file (and lose all my work).....
|> >  

	<Marc responds....>

|>  This means that the application will need someway of
|> knowing when to
|> resume (i.e. the problem aka file system full has been corrected)

	Yes, but has the "file system full" bug been fixed?  This is becoming
an increasing problem in large NFS server/WS based environments, since the
editing is being done on the WS, but the file gets trashed on the server.
Furthermore, the server copy is completely trashed (size = 0) rather than just
losing the changes made in that editing session.

|> 
|> Marc Wiz 				MaBell (512)823-4780
|> 
|> NFS/NIS Change team
|> marc@aixwiz.austin.ibm.com 


-- 
Jon Alperin
Bell Communications Research

---> Internet: jona@iscp.bellcore.com
---> Voicenet: (908) 699-8674
---> UUNET: uunet!bcr!jona

* All opinions and stupid questions are my own *

jona@iscp.Bellcore.COM (Jon Alperin) (06/20/91)

In article <1991Jun20.090136.14351@ioe.lon.ac.uk>, teexand@ioe.lon.ac.uk (Andrew Dawson) writes:
|> In <1991Jun19.162331.25505@bellcore.bellcore.com> jona@iscp.Bellcore.COM (Jon Alperin) writes:
|> 
|> >	Hey...maybe this explains the reason that when I save a file
|> >under VI which is kept on another NFS partition, VI tells me that it
|> >was able to save the file, but because the real physical disk was full I
|> >end up with a 0 length file (and lose all my work).....
|> 
|> This sounds like something we have been discussing with IBM recently. I think
|> essentially the client is caching requests, so although write() returns
|> successfully, the data hasn't been written to disk. Your application may pick up
|> an error if fsync is called, and the close should also fail (not that many
|> applications check this). However, I'm not sure even this much worked until
|> we'd applied an APAR fix.


	SO....what's the APAR #?

|> JANET:    andrew@uk.ac.ucl.sm.uxm     UUCP/EARN/BITNET: andrew@uxm.sm.ucl.ac.uk
|> INTERNET: andrew%uxm.sm.ucl.ac.uk@nsfnet-relay.ac.uk
|> "Leapers do it with assistance from neurological holograms"

-- 
Jon Alperin
Bell Communications Research

---> Internet: jona@iscp.bellcore.com
---> Voicenet: (908) 699-8674
---> UUNET: uunet!bcr!jona

* All opinions and stupid questions are my own *

marc@ekhomeni.austin.ibm.com (Marc Wiz) (06/20/91)

> 	Yes, but has the "file system full" bug been fixed. This is becoming
> an increasing problem in large NFS server/WS based environments, since the
> editing is being done on the WS, but the file gets trashed on the server.
> Furthermore, the server copy is completely trashed (size = 0) rather than
> just losing the changes made in that editing session.

The problem is being addressed, and someone is aware that
the server copy of the file is being trashed.

Marc Wiz 				MaBell (512)823-4780

Yes that really is my last name.
The views expressed are my own.

marc@aixwiz.austin.ibm.com 
or
uunet!cs.utexas.edu!ibmchs!auschs!ekhomeni.austin.ibm.com!marc

jpe@egr.duke.edu (John P. Eisenmenger) (06/21/91)

From article <91169.000329CALT@SLACVM.SLAC.STANFORD.EDU>, by CALT@SLACVM.SLAC.STANFORD.EDU:
> By experience, I learned that mounting NFS filesystems by the
> hard/foreground options may cause the machines hanging, while
> by the soft/background options seems to work OK.

Hmm.. I usually mount hard,bg,intr and have had no problems.  In general
I use hard mounts for read-write mounts and soft mounts for read-only
mounts.  Putting the mount in the background will prevent the client from
waiting indefinitely for the mount to complete...

-John

marc@ekhomeni.austin.ibm.com (Marc Wiz) (06/21/91)

In article <1991Jun20.141138.7555@bellcore.bellcore.com>,
jona@iscp.Bellcore.COM (Jon Alperin) writes:

> 	SO....whats the APAR #?

And the winning numbers are: ix18846 and ix20007.

Marc Wiz 				MaBell (512)823-4780

Yes that really is my last name.
The views expressed are my own.

marc@aixwiz.austin.ibm.com 
or
uunet!cs.utexas.edu!ibmchs!auschs!ekhomeni.austin.ibm.com!marc

jfh@rpp386.cactus.org (John F Haugh II) (06/24/91)

In article <1991Jun20.090136.14351@ioe.lon.ac.uk>, teexand@ioe.lon.ac.uk (Andrew Dawson) writes:
> In <1991Jun19.162331.25505@bellcore.bellcore.com> jona@iscp.Bellcore.COM (Jon Alperin) writes:
> 
> >	Hey...maybe this explains the reason that when I save a file
> >under VI which is kept on another NFS partition, VI tells me that it
> >was able to save the file, but because the real physical disk was full I
> >end up with a 0 length file (and lose all my work).....
> 
> This sounds like something we have been discussing with IBM recently. I think
> essentially the client is caching requests, so although write() returns
> successfully, the data hasn't been written to disk. Your application may pick
> up an error if fsync is called, and the close should also fail (not that many
> applications check this). However, I'm not sure even this much worked until
> we'd applied an APAR fix.

There have been problems with "vi" and other applications which do not check
the return status of "close" when using NFS on certain file systems.  For
example, I worked on an APAR where "vi" of a file on an NFS-mounted MVS file
system failed if the file was extended.  The solution was, as I recall, to
have fsync() called and to check its return status.

I'd suggest that anyone who finds a problem where a program thinks it has
exited successfully, but the data wasn't written, open an APAR.  The cause
will probably be very similar.
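
The fix amounts to something like this in the editor's save path (a minimal
sketch with a hypothetical function name; the point is only that fsync() and
close() are both checked before the program reports success):

    #include <fcntl.h>
    #include <stddef.h>
    #include <unistd.h>

    /* Save a buffer to 'path'; return 0 only when the data is known to be
     * on the server.  Errors that NFS defers (e.g. a full server
     * filesystem) surface at fsync() or close(), so both are checked. */
    int save_file(const char *path, const char *buf, size_t len)
    {
        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

        if (fd < 0)
            return -1;
        if (write(fd, buf, len) != (ssize_t)len || fsync(fd) < 0) {
            close(fd);
            return -1;
        }
        return close(fd);   /* close can still report a deferred error */
    }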
-- 
John F. Haugh II        | Distribution to  | UUCP: ...!cs.utexas.edu!rpp386!jfh
Ma Bell: (512) 255-8251 | GEnie PROHIBITED :-) |  Domain: jfh@rpp386.cactus.org
"UNIX signals are not interrupts.  Worse, SIGCHLD/SIGCLD is not even a UNIX
 signal, it's an abomination."  -- Doug Gwyn