[comp.sys.apollo] NFS problem SCO Unix, executables over NFS

eli (Steve Elias) (09/14/90)

summary for comp.sys.apollo readers:
we are encountering a strange NFS problem when we run large
executables on SCO Unix from an NFS server.  in our case, our
processes get killed randomly.  in Tom's case, listed below, the
executables only die off if a particular Apollo OS version is
the NFS server.  hence i crosspost this to comp.sys.apollo in 
the hope that a reader out there will know of any differences between
Apollo OS 9.7 and 10.2 with regard to NFS.  (obscure enough for ya?) 

! I read your posting about the NFS problem in comp.unix.sysv386. I have a
! very similar problem with my 386 running isc 2.0.2. Every time I try
! to execute a file which is located on a nfs-mounted disc I get the terse
! message "killed".

interesting that the problem happens every time for you.  in our case,
it is intermittent.  my theory (ahem) is that it happens whenever a
crucial NFS packet is lost.  our network is known to drop packets
occasionally, and the frequency of packet drops seems to be in line
with the frequency of random SCO Unix executable-over-NFS deaths.

leading me to the question, Tom:  is your ethernet known to drop packets?
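
One way to probe the packet-loss theory above is to read the NFS-mounted executable repeatedly and count how often a read fails or returns different contents from a known-good local copy. The sketch below is only an illustration: the function names and the paths in the usage note are hypothetical, and client-side caching may satisfy repeated reads without touching the network, so a clean run proves little.

```python
import hashlib

def file_digest(path, chunk_size=8192):
    """Return the SHA-256 hex digest of the file at `path`."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large executables are not held in memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def count_mismatches(path, reference_digest, attempts=20):
    """Read `path` `attempts` times; count reads that fail or differ."""
    failures = 0
    for _ in range(attempts):
        try:
            if file_digest(path) != reference_digest:
                failures += 1
        except OSError:
            # An I/O error (for example a soft-mount timeout) counts as a failure.
            failures += 1
    return failures
```

Hypothetical usage: compute `file_digest("/usr/local/bin/bigprog")` on the local copy, then call `count_mismatches("/nfs/bin/bigprog", digest)` and compare the failure count with the known packet-drop rate of the network.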

! The NFS server is an Apollo workstation. The funny thing (if I did not try
! to laugh about it I would cry bitterly) is that it worked before
! we upgraded the Apollo OS from 9.7 to 10.2. So there seems to be a problem
! in the Apollo server?!!

i'm not so sure it's an apollo problem.  we've seen the problem with
SCO Unix acting as the server, and with a "real" HP series 300 HPUX
acting as server.  the Apollo dependencies you've listed might have 
some strange interaction with SCO Unix that causes the problem, however.

! That's all I can tell you at the moment. Apollo (er, HP now) could not
! answer my questions. I'm in the lucky position of having a big new internal
! disc, so the problem is not making my hair fall out.

our workaround is to run the executables off of local disk, as well. 
but my hair is still falling out!  :)

! If you think that I can help you with a more detailed description 
! (configuration ...) feel free to contact me.
! On the other hand I would appreciate it if you could email me helpful
! responses from other friendly netlanders.

you've been the most helpful so far, Tom.  thanks!

! +--------------------+--------------------------------------------------------+
! | ---- Thomas Winder | Departement of VLSI-Design  | VOICE: (+43.1)58801-8145 |
! |  / /---/ /-/-/     | Technical University Vienna | FAX:   (+43.1)569697     |
! | / /---/ / / /      | Treitlstrasse 3 | EMAIL: tom@vlsivie.tuwien.ac.at      |
! |                    | 1040 Vienna     |        tom@vlsivie.uucp              |
! | just do it!!!      |                 |        e182201@awituw01.bitnet       |
! +--------------------+-----------------+--------------------------------------|

ps -- our mailer barfs on that ac.at address above.  

; Steve Elias, eli@pws.bull.com;  617 932 5598 (voicemail)
; 508 294 0101 (SCO Unix fax)
; 508 294 7556 (work phone)

wjw@eba.eb.ele.tue.nl (Willem Jan Withagen) (09/15/90)

In article <15869@know.pws.bull.com> eli (Steve Elias) writes:
>
>summary for comp.sys.apollo readers:
>we are encountering a strange NFS problem when we run large
>executables on SCO Unix from an NFS server.  in our case, our
>processes get killed randomly.  in Tom's case, listed below, the
>executables only die off if a particular Apollo OS version is
>the NFS server.  hence i crosspost this to comp.sys.apollo in 
>the hope that a reader out there will know of any differences between
>Apollo OS 9.7 and 10.2 with regard to NFS.  (obscure enough for ya?) 
>

We've got just about the same problem. But the system running the commands
is a SUN 3/110 running SunOS 3.5. And then it only occurs when running a
c-shell. But it does not matter whether we use either the SR9.7 mount or
the SR10.2 mount, both just do not execute.

I've posted this already in comp.sys.apollo and comp.sys.sun (about 
2-3 months ago), but with very little response so far.

I once had a similar problem: running a SR10.2 script on a SR9.7 system
did not work. This was due to different file types for the script files
under the two OSes. But that is not the case here, since SunOS files are not typed.

The lost packets could be a cause, especially since I've soft mounted
the NFS disks. But I know too little of NFS to investigate on my own.
Another clue could be that the Apollo NFS does not deliver the beginning of 
the file intact. This would destroy the magic number in the files!
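
The magic-number suspicion above could be checked by comparing the first bytes of the executable as seen through the NFS mount with the same bytes of a known-good local copy. This is a minimal sketch, not a diagnosis: the function names, the 64-byte window, and the paths in the usage note are all assumptions made for illustration.

```python
def header_bytes(path, count=64):
    """Return the first `count` bytes of the file at `path`."""
    with open(path, "rb") as f:
        return f.read(count)

def first_difference(a, b):
    """Return the index of the first differing byte, or None if equal."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    # One header is a prefix of the other: they differ where the shorter ends.
    if len(a) != len(b):
        return min(len(a), len(b))
    return None

def compare_headers(local_path, nfs_path, count=64):
    """Compare the leading bytes of a local copy and an NFS-mounted copy."""
    return first_difference(header_bytes(local_path, count),
                            header_bytes(nfs_path, count))
```

Hypothetical usage: `compare_headers("/tmp/prog.local", "/apollo/bin/prog")` returns `None` when the two headers match and a byte offset when they do not. An offset of 0 or 1 would point at the magic number itself.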

Does anybody else have any clues?
	Willem Jan Withagen.

Eindhoven University of Technology   DomainName:  wjw@eb.ele.tue.nl    
Digital Systems Group, Room EH 10.10 BITNET: ELEBWJ@HEITUE5.BITNET
P.O. 513                             Tel: +31-40-473401
5600 MB Eindhoven                    The Netherlands

asherman@dino.ulowell.edu (Aaron Sherman) (09/16/90)

I have discovered that this problem exists with ALL of our SysV machines.

If a program which was compiled by the SysV machine is physically located on
an Apollo disk which is NFS-mounted, then execution of that program will
return the following message: "Killed".

On our Stellar GS2000 (running Stellix 2.0) the message returned is the same,
but a syslog message is also sent out which says that this file has been
corrupted. My initial guess (for which there is still no confirmation or
denial) is that the file-modification times are being changed on the Apollo
end AFTER the file is loaded.
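
The modification-time guess above could be tested from the client by recording the file's mtime, reading the whole file, and then checking the mtime again. The sketch below assumes that reading the file is a fair stand-in for the kernel loading it, which may not hold. The function name is hypothetical, and NFS attribute caching on the client can hide or delay a change made on the server.

```python
import os

def mtime_changed_by_read(path, chunk_size=8192):
    """Return (changed, before, after) for the mtime around a full read."""
    before = os.stat(path).st_mtime
    with open(path, "rb") as f:
        # Read the whole file, discarding the data, to mimic loading it.
        while f.read(chunk_size):
            pass
    after = os.stat(path).st_mtime
    return before != after, before, after
```

If `changed` comes back true for a file on the Apollo mount but false for the same file on a local disk, that would be consistent with the guess, though it would not confirm it.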

			-AJS

PS: I'm not sure if I posted about this last night, if I did, then sorry
    about the waste.
--
asherman@dino.ulowell.edu	or	asherman%cpe@swan.ulowell.edu
Note that as of 7/18/90 that's asherman@dino.cpe.ulowell.edu
"That that is is that that is not is not is that it it is."