[comp.unix.aux] Personal System Folders and NFS

hosking@cs.umass.edu (Tony Hosking) (12/02/90)

I am having trouble with personal system folders (created using the
systemfolder command) for users whose home directories are on an NFS-mounted
volume. When the user logs in at the console things go really slowly, with
some disk activity, then eventually the login just gives up and the Login
dialogue box comes back. No error messages, nothing, just plain refusal to log
in. If I remove the personal System Folder from the user's remote home
directory then they can log in again with no problems. Does anyone have any
idea what's going on and how to fix the situation? I have no problem with
personal system folders for users having a local home directory. Perhaps this
is a bug in A/UX 2.0? Has anyone else encountered this?
--
	Tony Hosking					
	Dept. of Computer and Information Science	 _--_|\
	University of Massachusetts			/      \
	Amherst, MA 01003				\_.--._/    )
	(413) 545-0256; hosking@cs.umass.edu		      v    /

urlichs@smurf.sub.org (Matthias Urlichs) (12/02/90)

In comp.unix.aux, article <HOSKING.90Dec1145333@ibis.cs.umass.edu>,
  hosking@cs.umass.edu writes:
< I am having trouble with personal system folders (created using the
< systemfolder command) for users whose home directories are on an NFS-mounted
< volume. When the user logs in at the console things go really slowly, with
< some disk activity, then eventually the login just gives up and the Login
< dialogue box comes back. No error messages, nothing, just plain refusal to log
< in.

I saw this too. The System _file_ must not be on an NFS volume.
(I don't remember any more whether the System _Folder_ can be...)

I have no idea why. Can anyone (preferably at Apple...) check?

-- 
Matthias Urlichs -- urlichs@smurf.sub.org -- urlichs@smurf.ira.uka.de     /(o\
Humboldtstrasse 7 - 7500 Karlsruhe 1 - FRG -- +49+721+621127(0700-2330)   \o)/

coolidge@cs.uiuc.edu (John Coolidge) (12/03/90)

urlichs@smurf.sub.org (Matthias Urlichs) writes:
>I saw this too. The System _file_ must not be on an NFS volume.
>(I don't remember any more whether the System _Folder_ can be...)

I'm using a personal System Folder on an NFS volume (both the folder
and the System file are on the NFS volume) and having no trouble.
I've been doing this since 2.0b9 and having no trouble. Under 2.0b3
we observed the same problems being described, but I haven't seen
them in a while.

--John

--------------------------------------------------------------------------
John L. Coolidge     Internet:coolidge@cs.uiuc.edu   UUCP:uiucdcs!coolidge
Of course I don't speak for the U of I (or anyone else except myself)
Copyright 1990 John L. Coolidge. Copying allowed if (and only if) attributed.
You may redistribute this article if and only if your recipients may as well.

liam@cs.qmw.ac.uk (William Roberts) (12/04/90)

In <HOSKING.90Dec1145333@ibis.cs.umass.edu> hosking@cs.umass.edu (Tony 
Hosking) writes:

>I am having trouble with personal system folders (created using the
>systemfolder command) for users whose home directories are on an NFS-mounted
>volume. When the user logs in at the console things go really slowly, with
>some disk activity, then eventually the login just gives up and the Login
>dialogue box comes back. No error messages, nothing, just plain refusal to log
>in. If I remove the personal System Folder from the user's remote home
>directory then they can log in again with no problems. Does anyone have any
>idea what's going on and how to fix the situation? I have no problem with
>personal system folders for users having a local home directory. Perhaps this
>is a bug in A/UX 2.0? Has anyone else encountered this?

All our users have home directories on NFS fileservers, and we have for now 
rejected the whole idea of personal system folders because we couldn't make 
them work satisfactorily: we did sometimes manage to log in, but it wasn't very 
reliable and was horrendously slow. 

I doubt that this is a bug in A/UX 2.0, just that some things are done in ways 
which are very inefficient via NFS. The only NFS-related snag we have had is 
that /mac/bin/Login tries to write the user's .aux_prefs file (which records 
the session type) as root, which falls foul of the uid 0 -> uid 65534 
translation.

We tried having personal system folders with local .fs_cache, .fs_dirIDs, 
DesktopDB and Desktop DF files, i.e. the personal System Folder had symbolic 
links to local versions of these files. This didn't work because the .fs_* 
files were removed almost instantly and re-written (onto the NFS fileserver). 
We now think that this might be due to date problems making the .fs_cache look 
out of date, but we haven't tried repeating the experiments yet.

We've also given up on accumulating a Desktop database: we have a compressed 
cpio archive containing clean, sensible versions of the files mentioned above, 
and we recreate the files from that archive for EVERY login (as part of the 
/mac/bin/mac32 script). We also use a semaphore so that an INIT can let the 
outside world know when startmac has finally started, and we run the 
CommandShell after that. The current "CommandShell &" scheme relies on 
everything happening before CommandShell's patience runs out, and that wasn't 
reliable either if the previous Mac session had died or been terminated 
brutally.
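
The semaphore arrangement described above could be sketched roughly as
follows. This is only an illustration: the post does not say what kind of
semaphore is used, so the flag file, its path, and the wait_for_flag name are
all assumptions, and the code is written for a POSIX sh rather than tested on
A/UX.

```shell
#!/bin/sh
# wait_for_flag FILE SECONDS
# Poll once a second until FILE exists (assumed to be created by the INIT
# once startmac is up). Returns 0 when the file appears, 1 on timeout.
wait_for_flag() {
    flag=$1
    remaining=$2
    while [ ! -e "$flag" ]; do
        [ "$remaining" -le 0 ] && return 1
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}

# Example (hypothetical flag path): start CommandShell only after the flag
# shows up, instead of relying on a bare "CommandShell &".
#   wait_for_flag /tmp/startmac.ready 120 && CommandShell &
```

Waiting on an explicit signal with a timeout avoids depending on
CommandShell's own patience, which is the weakness the post describes.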
--

William Roberts                 ARPA: liam@cs.qmw.ac.uk
Queen Mary & Westfield College  UUCP: liam@qmw-cs.UUCP
Mile End Road                   AppleLink: UK0087
LONDON, E1 4NS, UK              Tel:  071-975 5250 (Fax: 081-980 6533)

urlichs@smurf.sub.org (Matthias Urlichs) (12/04/90)

In comp.unix.aux, article <1990Dec3.045212.12037@julius.cs.uiuc.edu>,
  coolidge@cs.uiuc.edu writes:
< urlichs@smurf.sub.org (Matthias Urlichs) writes:
< >I saw this too. The System _file_ must not be on an NFS volume.
< >(I don't remember any more whether the System _Folder_ can be...)
< 
< I'm using a personal System Folder on an NFS volume (both the folder
< and the System file are on the NFS volume) and having no trouble.
< I've been doing this since 2.0b9 and having no trouble. Under 2.0b3
< we observed the same problems being described, but I haven't seen
< them in a while.
< 
Interesting. This starts to look like some sort of NFS incompatibility.

For the record, I saw the problem with our Ultrix 4.0 NFS server.

-- 
Matthias Urlichs -- urlichs@smurf.sub.org -- urlichs@smurf.ira.uka.de     /(o\
Humboldtstrasse 7 - 7500 Karlsruhe 1 - FRG -- +49+721+621127(0700-2330)   \o)/

mgchow@Apple.COM (Mike Chow) (12/04/90)

In article <h^8pg2.fc@smurf.sub.org> urlichs@smurf.sub.org (Matthias Urlichs) writes:
>In comp.unix.aux, article <HOSKING.90Dec1145333@ibis.cs.umass.edu>,
>  hosking@cs.umass.edu writes:
>< I am having trouble with personal system folders (created using the
>< systemfolder command) for users whose home directories are on an NFS-mounted
>< volume. When the user logs in at the console things go really slowly, with
>< some disk activity, then eventually the login just gives up and the Login
>< dialogue box comes back. No error messages, nothing, just plain refusal to log
>< in.
>
>I saw this too. The System _file_ must not be on an NFS volume.
>(I don't remember any more whether the System _Folder_ can be...)
>

This is untrue.  In fact, I usually run A/UX 2.0 with a system folder on
an NFS file system.  You might want to check that the permissions on your
system folder are correct -- the Mac environment quietly quits if a key
file is not accessible when it is starting up.  One thing to try is to
log into the console emulator (do this from the Login dialog), and once
in the console, type "mac32" to start the environment.

Other than NFS being slower than a local disk, there is no reason why a system
folder on an NFS file system shouldn't work.  However, the release notes do
caution against multiply using the same system folder on an NFS file system.
Since some of the Mac data files are not "re-entrant", you can only have one
Mac session accessing a system folder at a time.  Check for this situation as 
well.
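
The permissions check suggested above could be sketched roughly as follows.
This is a minimal illustration, not an Apple procedure: the check_readable
name and the example path are assumptions, and the code is written for a
POSIX sh. Run it as the affected user, since root can usually read
everything on a local disk.

```shell
#!/bin/sh
# check_readable DIR
# Print every entry under DIR that the invoking user cannot read.
# Prints nothing when everything is readable; returns 2 if DIR is missing.
check_readable() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "no such directory: $dir" >&2
        return 2
    fi
    find "$dir" -print | while IFS= read -r entry; do
        [ -r "$entry" ] || echo "not readable: $entry"
    done
    return 0
}

# Example (hypothetical path):
#   check_readable "$HOME/System Folder"
```

Any line of output points at a file worth checking before suspecting NFS
itself.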


Mike Chow
mgchow@apple.com

chet@Advansoft.COM (Chet Wood) (12/05/90)

>>>>> On 3 Dec 90 18:56:15 GMT, urlichs@smurf.sub.org (Matthias Urlichs) said:
matthias> Interesting. This starts to look like some sort of NFS
matthias> incompatibility.

matthias> For the record, I saw the problem with our Ultrix 4.0 NFS
matthias> server.

I've noticed the problem. Our file server is a Sun4/370 running SunOS
4.1. The disk with the home directories is big-- 1 GB. It was taking
me 5-10 minutes to log in until I moved my system folder to a local
disk. I eventually just created a home directory on the local disk. I
seem to remember that that speeded things up more.

One of our users has _never_ been able to log in-- the system hangs
for hours-- until you telnet in and kill the fake_mac_os process.
( Creating a local home directory for him will presumably cure the
problem-- if he ever again shows an interest in using the Mac.)

A friend in the A/UX group has hypothesized that it may be an ethernet
problem. We are running a 3COM board on our FX. I'm told it interrupts
the processor on every longword. You're supposed to have an apple ethernet
card of series "F" or something like that for decent throughput on the
FX.

chet.
--
Chet Wood                       ~                         (408)727-3357 X269
   chet@Advansoft.Com    .  Advansoft Research Corporation
     arc!chet@apple.COM    .      4301 Great America Parkway, 6th floor
            apple!arc!chet   .            Santa Clara, CA 95054, USA

steveg@ni.umd.edu (Steve Green) (12/05/90)

In article <CHET.90Dec4132813@mars.Advansoft.COM> chet@Advansoft.COM (Chet Wood) writes:
>
>>>>>> On 3 Dec 90 18:56:15 GMT, urlichs@smurf.sub.org (Matthias Urlichs) said:
>matthias> Interesting. This starts to look like some sort of NFS
>matthias> incompatibility.
>
>matthias> For the record, I saw the problem with our Ultrix 4.0 NFS
>matthias> server.
>
>I've noticed the problem. Our file server is a Sun4/370 running SunOS
>4.1. The disk with the home directories is big-- 1 GB. It was taking
>me 5-10 minutes to log in until I moved my system folder to a local
>disk. I eventually just created a home directory on the local disk. I
>seem to remember that that speeded things up more.
>
>One of our users has _never_ been able to log in-- the system hangs
>for hours-- until you telnet in and kill the fake_mac_os process.
>( Creating a local home directory for him will presumably cure the
>problem-- if he ever again shows an interest in using the Mac.)
>
>A friend in the A/UX group has hypothesized that it may be an ethernet
>problem. We are running a 3COM board on our FX. I'm told it interrupts
>the processor on every longword. You're supposed to have an apple ethernet
>card of series "F" or something like that for decent throughput on the
>FX.

This may or may not be the same problem I am having with NFS.  Anyway, I
have been complaining to Apple for a while about my inability to use NFS.

I have a CONSISTENT problem where code compiled on an NFS mount does not work
correctly or even at all.  What is strange is that all of the phases work
fine except for the ld phase.  I was told to turn off the biod's because of a 
race condition..  this did not help me but I understand that it has helped
some others...

BTW, I am using an Ultrix 4.0 machine for my server.

Anyone else?

-- 
Silica gel -- No not eat.				steveg@ni.umd.edu

coolidge@cs.uiuc.edu (John Coolidge) (12/05/90)

steveg@ni.umd.edu (Steve Green) writes:
>I have a CONSISTENT problem where code compiled on an NFS mount does not work
>correctly or even at all.  What is strange is that all of the phases work
>fine except for the ld phase.  I was told to turn off the biod's because of a 
>race condition..  this did not help me but I understand that it has helped
>some others...

I've got this one... most of the time. When compiling to disks from
the Encore server, I get bad links (the link finishes cleanly, but
the resulting image is damaged, often in strange and interesting
ways). On the other hand, when the remote server is a Sun or an A/UX
machine, I've never seen this problem.

The problem doesn't happen with small programs: hello world works
10 out of 10. Big programs (nn, gcc, g++, etc) always fail.

--John

--------------------------------------------------------------------------
John L. Coolidge     Internet:coolidge@cs.uiuc.edu   UUCP:uiucdcs!coolidge
Of course I don't speak for the U of I (or anyone else except myself)
Copyright 1990 John L. Coolidge. Copying allowed if (and only if) attributed.
You may redistribute this article if and only if your recipients may as well.

mst@mx.csun.edu (Michael Temkin) (12/10/90)

(The original posting has been lost to the mists of 'expire'...)

The problem was long login times when the home directory was nfs
mounted.  I have (as of 12+ hours ago) placed the following at
the top of my .cshrc (local to the A/UX disk):

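# Does a .home pointer file exist in the local (passwd) home directory?
test -f .home
if ( $status == 0 ) then
        # Point $home at the NFS directory named in .home and move there.
        set home=`cat .home`
        cd $home
endif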

This is the .cshrc that is in my passwd entry.  The .home file contains
a full path to my nfs mounted home directory (/usr/users/stf/mst).  I have
not tested it out fully (been Chanukkah shopping) yet, but I still log in
in the same amount of time as when my home dir was local (and yes, I do
know how long nfs logins are :-) ).  I have checked and my $home is correct,
my $HOME is correct, ~ expansion is correct...

If anyone has some test for me to try, let me know.  BTW, the .login that
gets executed after the .cshrc is in my /usr/users/stf/mst dir, so if any
of you try it, make sure you transfer any A/UX code to the new .login.
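
For users whose login shell reads a Bourne-style startup file instead of
.cshrc, the same pointer-file idea might look roughly like this. It is a
sketch under assumptions: the use_nfs_home name is hypothetical, the code is
written for a POSIX sh, and nothing here has been tried on A/UX.

```shell
#!/bin/sh
# use_nfs_home POINTER
# If POINTER exists, treat its contents as the path of the real (NFS) home
# directory, export HOME accordingly and cd there.
use_nfs_home() {
    pointer=$1
    [ -f "$pointer" ] || return 0     # no pointer file: keep the local home
    target=$(cat "$pointer")
    [ -d "$target" ] || return 1      # pointer names a missing directory
    HOME=$target
    export HOME
    cd "$HOME" || return 1
}

# Example, near the top of a local startup file:
#   use_nfs_home "$HOME/.home"
```

Unlike the csh version, this refuses to change HOME when the directory named
in the pointer file is not there, for instance when the NFS mount is down.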

Mike.
--
Mike Temkin
mst@csun.edu
Cal. State U. Northridge, School of Engineering and Computer Science
Voice phone: (818) 885-3919

maples@ddti.com (Greg Maples) (12/11/90)

>steveg@ni.umd.edu (Steve Green) writes:
>>I have a CONSISTENT problem where code compiled on an NFS mount does not work
>>correctly or even at all.  What is strange is that all of the phases work
>>fine except for the ld phase.  I was told to turn off the biod's because of a 
>>race condition..  this did not help me but I understand that it has helped
>>some others...
>
>I've got this one... most of the time. When compiling to disks from
>the Encore server, I get bad links (the link finishes cleanly, but
>the resulting image is damaged, often in strange and interesting
>ways). On the other hand, when the remote server is a Sun or an A/UX
>machine, I've never seen this problem.

I'm not sure that this is the solution to the above problems, but we have
also had similar problems with ld across NFS.  Our nightly development
build occurs across nfs with a sun 4/370, and we began to find it failing
badly once we started upgrading to A/UX 2.0.

The problem seems to lie (for us) with the ethernet cards.  We found that
16K cards fail, and 64K cards do not.  Apple informed us that the only cards
they support (or endorse) for A/UX are 64K cards.  3COM acted as if they were
totally unaware of this.

Upgrading to 64K cards seems to have solved the problem.  Good Luck.
Greg Maples
maples%ddtisvr@uunet.uu.net
DuPont Design Technologies

liam@cs.qmw.ac.uk (William Roberts) (12/11/90)

In <CHET.90Dec4132813@mars.Advansoft.COM> chet@Advansoft.COM (Chet Wood) 
writes:
>One of our users has _never_ been able to log in-- the system hangs
>for hours-- until you telnet in and kill the fake_mac_os process.
>( Creating a local home directory for him will presumably cure the
>problem-- if he ever again shows an interest in using the Mac.)

Don't blame NFS or A/UX for this problem - it is bound to be your user 
shooting him or herself in the foot. Classic techniques for this are:

1) Putting "exec something" in the .login or .cshrc
2) Putting an interactive program in the .login (e.g. a newsreader)

Remember that the .login and .cshrc files get run by the shell used to invoke 
the /mac/bin/mac32 script, so they happen long before startmac gets run, and 
you probably don't even see the prompt on the console because you are looking 
at that "Starting session for fred.." message.
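
A quick way to look for the first of these foot-guns is to scan the user's
startup files for lines beginning with exec. The sketch below is only an
illustration: the find_exec_lines name is hypothetical, the code is written
for a POSIX sh, and it cannot detect interactive programs, which have to be
spotted by reading the file.

```shell
#!/bin/sh
# find_exec_lines FILE...
# Print "file:line:text" for every line that starts with "exec" in the
# given startup files. Missing files are skipped silently.
find_exec_lines() {
    for f in "$@"; do
        [ -f "$f" ] || continue
        # /dev/null as a second file makes grep always print the file name.
        grep -n '^[[:space:]]*exec[[:space:]]' "$f" /dev/null || true
    done
    return 0
}

# Example (hypothetical user home):
#   find_exec_lines ~fred/.login ~fred/.cshrc
```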

>A friend in the A/UX group has hypothesized that it may be an ethernet
>problem. We are running a 3COM board on our FX. I'm told it interrupts
>the processor on every longword. You're supposed to have an apple ethernet
>card of series "F" or something like that for decent throughput on the
>FX.

Read this stuff about Ethernet cards to mean, "the FX is a damn good machine 
and deserves an Ethernet card with some useful memory on it". I entirely 
agree: the performance on the FX as an NFS server is better than doubled by 
using the Revision L Apple Ethernet Card which has 64K of memory on it rather 
than 16K, just because fewer packets get lost due to no buffer space available 
on the card. It might conceivably help with doing lots of NFS reads (8K+header 
packets mean you can't get two read replies into a 16k buffer), but I don't 
believe that that is the "long login time" problem. I intend to use nfswatch 
to find out what is really happening, and I'll mail this group when I have 
some answers.
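
The buffer arithmetic behind that parenthetical can be sketched as below. It
follows the post in treating the card buffer as if it held whole read
replies; the 128-byte header figure in the example is a placeholder
assumption, not a measured value, and replies_that_fit is a hypothetical
name.

```shell
#!/bin/sh
# replies_that_fit BUFFER_BYTES HEADER_BYTES
# Print how many NFS read replies (8K of data plus HEADER_BYTES of header)
# fit completely into a buffer of BUFFER_BYTES.
replies_that_fit() {
    echo $(( $1 / (8192 + $2) ))
}

# Example with an assumed 128-byte header:
#   replies_that_fit 16384 128    # 16K card
#   replies_that_fit 65536 128    # 64K card
```

With any non-zero header, two replies need more than 16384 bytes, so a 16K
buffer holds only one complete reply at a time.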
--

William Roberts                 ARPA: liam@cs.qmw.ac.uk
Queen Mary & Westfield College  UUCP: liam@qmw-cs.UUCP
Mile End Road                   AppleLink: UK0087
LONDON, E1 4NS, UK              Tel:  071-975 5250 (Fax: 081-980 6533)

liam@cs.qmw.ac.uk (William Roberts) (12/11/90)

In <1990Dec5.054623.2117@julius.cs.uiuc.edu> coolidge@cs.uiuc.edu (John 
Coolidge) writes:

>steveg@ni.umd.edu (Steve Green) writes:
>>I have a CONSISTENT problem where code compiled on an NFS mount does not work
>>correctly or even at all.  What is strange is that all of the phases work
>>fine except for the ld phase.  I was told to turn off the biod's because of a
>>race condition..  this did not help me but I understand that it has helped
>>some others...

>I've got this one... most of the time. When compiling to disks from
>the Encore server, I get bad links (the link finishes cleanly, but
>the resulting image is damaged, often in strange and interesting
>ways). On the other hand, when the remote server is a Sun or an A/UX
>machine, I've never seen this problem.

We've seen this too. Our experiments say that it is caused by "ld", that the 
significant factor is putting the final output file on the NFS server, that 
the bad thing which happens is patches of zeros in the output file, and that 
SunOS 4.1 and A/UX 2.0 servers don't suffer from this.
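
One way to look for such patches of zeros is to link the same objects twice,
once onto local disk and once onto the NFS server, and compare the results
byte by byte. The sketch below is an illustration under assumptions: the
function name and example paths are hypothetical, it is written for a POSIX
sh, and it presumes that two links of the same inputs would otherwise
produce identical files.

```shell
#!/bin/sh
# count_zeroed_bytes GOOD SUSPECT
# Print the number of byte positions where SUSPECT holds a zero byte but
# GOOD holds something else. cmp -l lists each differing byte as
# "offset octal-in-GOOD octal-in-SUSPECT".
count_zeroed_bytes() {
    cmp -l "$1" "$2" 2>/dev/null |
        awk '$3 == 0 { n++ } END { print n + 0 }'
}

# Example (hypothetical paths):
#   ld -o /tmp/prog.local *.o
#   ld -o /nfs/build/prog.nfs *.o
#   count_zeroed_bytes /tmp/prog.local /nfs/build/prog.nfs
```

A result of 0 means no zeroed bytes were found; a large count clustered in
the NFS copy would match the damage described above.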
--

William Roberts                 ARPA: liam@cs.qmw.ac.uk
Queen Mary & Westfield College  UUCP: liam@qmw-cs.UUCP
Mile End Road                   AppleLink: UK0087
LONDON, E1 4NS, UK              Tel:  071-975 5250 (Fax: 081-980 6533)

liam@cs.qmw.ac.uk (William Roberts) (12/11/90)

In <1990Dec10.182359.8057@ddti.com> maples@ddti.com (Greg Maples) writes:

>The problem seems to lie (for us) with the ethernet cards.  We found that
>16K cards fail, and 64K cards do not.  Apple informed us that the only cards
>they support (or endorse) for A/UX are 64K cards.  3COM acted as if they were
>totally unaware of this.

>Upgrading to 64K cards seems to have solved the problem.  Good Luck.

Naturally, Apple never seem to tell anyone this, let alone customers who a) 
don't live in the US, and b) have 100+ A/UX machines with 16K Apple 
Ethernet cards. How long have Apple been selling 64K cards anyway, and what 
does "non-support of 16K cards" mean for existing customers? You are talking 
about changing the card in the A/UX machines acting as clients, right?

16K -> 64K change of card means more buffering, so fewer "receive overflows" 
in the card. In our experiments which demonstrated the problem, there were no 
such messages at all. The only significant things you'd lose would be 
acknowledgements of non-idempotent operations which aren't covered by cached 
responses in the server, maybe create operations (which ld does seem to do 
rather a lot of).
--

William Roberts                 ARPA: liam@cs.qmw.ac.uk
Queen Mary & Westfield College  UUCP: liam@qmw-cs.UUCP
Mile End Road                   AppleLink: UK0087
LONDON, E1 4NS, UK              Tel:  071-975 5250 (Fax: 081-980 6533)