[comp.sys.sgi] NFS problems and RE:RE:VT100/Keyboard

XBR2D96D@DDATHD21.BITNET (Knobi der Rechnerschrat) (11/05/87)

Hallo,

First, I have a serious problem. We are running three
IRIS 31xx machines on an Ethernet with TCP/IP and NFS. We have Rev. 3.5r1 of the
software (Rev. 1.00 of NFS). One machine has a Fujitsu Eagle, which
is local (EFS) to that machine and an NFS disk for the other two stations.

Besides the many problems we have encountered with NFS (dbx, graphics,
priority, etc., all posted months ago), yesterday I traced what I assume
is another bug:

Some programs (e.g. good old Kermit) do wildcard expansion in their
own code. For lack of existing routines they do it themselves (see
the "expand" modules in Kermit). Lacking BSD-like
opendir/closedir routines, they use open and read calls to access the
directory files (which are said to be "normal" files, with the exception
that you cannot write to them). Now the problem: with the EFS directories
everything works fine, but with the NFS directories everything I get from
read seems to be junk. Result: I cannot use Kermit to do wildcard
get/sends on the NFS disks. Is this a problem/bug or a feature? Is this
known? Will this be fixed in the next (3.6) release? (Sorry for the DEC
slang)
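
For comparison, here is a minimal sketch of wildcard expansion that never
reads the directory file itself. It assumes a system that provides
<dirent.h> and fnmatch(3), which the post above says the 3.5 release does
not; count_matches is a hypothetical helper, not Kermit code.

```c
#define _POSIX_C_SOURCE 200809L  /* expose the POSIX declarations */
#include <dirent.h>
#include <fnmatch.h>
#include <stddef.h>

/*
 * Hypothetical helper (not from Kermit): count the entries in `dir`
 * whose names match the shell-style `pattern`. Returns -1 if the
 * directory cannot be opened.
 */
int count_matches(const char *dir, const char *pattern)
{
    DIR *dp = opendir(dir);
    struct dirent *ent;
    int count = 0;

    if (dp == NULL)
        return -1;

    /* readdir() hides the on-disk (or over-the-wire) directory format. */
    while ((ent = readdir(dp)) != NULL) {
        /* FNM_PERIOD: a leading '.' must be matched explicitly, as in the shell. */
        if (fnmatch(pattern, ent->d_name, FNM_PERIOD) == 0)
            count++;
    }
    closedir(dp);
    return count;
}
```

Because the library routine decides how entries are fetched, the same
caller should work whether the directory is local or mounted over NFS.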

Second: I would like to thank Michael Toy for his reply to my posting
from last week. I'm unfortunately unable to reply directly.

Michael: I didn't know that a 24x80 wsh window is a VT100 emulator (what
about the (* flame again *) keyboard?). I had that machine for only four
days and couldn't discover all the new features. And again: is there a PD
VT100 emulator for the 31xx console?

Concerning the keyboard, I don't complain that much about the physical
layout. I can live with 17 keys on the keypad instead of the "standard"
18. But what I hate are the (* flame *) ESC sequences that are generated
by that child of IBM compatibility (* general flame: Is having a
three-letter abbreviation for the company's name not compatible enough ??? *).
Is there an easy way to remap those sequences to a handier set?
If the ads say: "Just recompile the 31xx code and enjoy", this should
be (almost) true not only for the graphics, but also for the lower classes
of applications. And the 4D keyboard is INCOMPATIBLE with the 31xx.

Regards
Martin Knoblauch

TH-Darmstadt
Dept. Physical Chemistry 1
Petersenstrasse 20
D-6100 Darmstadt
West-Germany

BITNET: <XBR2D96D@DDATHD21>

vjs@rhyolite.SGI.COM (Vernon Schryver) (11/11/87)

In article <2868@batcomputer.tn.cornell.edu>, sparks@batcomputer.tn.cornell.edu (Steve Gaarder) writes:
>             We have the following trouble:
> 1. "pwd" does not work in most (but not all) NFS directories.  It gives
>    "read error in .."
> 2. C-shell scripts do not work in the same directories.  The error here is
> "file not found".  Bourne shell scripts run fine.
Well,  nobody's perfect... :-)

The infamous "can't pwd" problem and, I think, the csh problem are a
difficulty with System V file directories.  Your server probably has a
large file system with >64K inodes.  Standard SV gives i-numbers 16 bits
in many places, including stat.h.  That makes things like ftw(3), pwd(1),
and so on unhappy when they try to stat(2) a file descriptor, and then
use readdir/getdents to find its name.  (Yes, pwd(1) does something else,
but the problem is the same.)  In the latest version of EFS for the IRIS 4D,
we changed to 32-bit i-numbers and recompiled the world.  We could do that
since we had control of all 4D binaries and file systems in the world.
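
A minimal sketch of how a 16-bit i-number can defeat a "stat it, then
search the directory for that number" lookup. It assumes, for illustration
only, that the directory listing carries the full number while stat keeps
just the low 16 bits; the struct and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory directory entry carrying the full i-number. */
struct wide_entry {
    uint32_t ino;
    const char *name;
};

/* What a 16-bit ino_t field keeps of a larger i-number. */
uint16_t narrow_ino(uint32_t ino)
{
    return (uint16_t)(ino & 0xFFFFu);
}

/*
 * Search a directory for the entry whose i-number equals the value
 * that stat reported. Returns the name, or NULL when nothing matches.
 */
const char *name_for_ino(const struct wide_entry *dir, size_t n,
                         uint32_t st_ino)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (dir[i].ino == st_ino)
            return dir[i].name;
    }
    return NULL;
}
```

For a file with i-number 70000, narrow_ino gives 4464 (70000 - 65536), so
searching a directory that lists the entry as 70000 finds nothing. Small
i-numbers are unaffected, which fits the "most (but not all)" symptom.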

However, release 3.5 and 3.6 kernels were/will be binary compatible for
application programs.  This means we could not change ino_t in types.h
to a long (one of the things done for the new EFS on the 4D).

Fortunately, there is a hack involving type punning in the user code that
makes csh, pwd, and other utilities work 99.999999% of the time (Yes, 
that's what I calculate it to be :-).  3.6 will have the hack in all of
the places that matter.  Contact the hotline if your application code
messes with i-numbers and you need the hack.  Try them if you need help
before you receive 3.6.

You might not need more than 64K inodes on your server file system.  A
300MB disk should have about 70K inodes, assuming 'typical' file sizes.
You might want to check 'df -i' or the ULTRIX equivalent to see if you
could rebuild it with <64K inodes.
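
The arithmetic behind that suggestion, as a rough sketch. It assumes
1 MB = 2^20 bytes and a "one inode per N bytes of disk" rule when the file
system is built; both are assumptions, and the helper names are hypothetical.

```c
/*
 * Number of inodes a file system of `disk_bytes` gets when one inode
 * is allocated per `bytes_per_inode` bytes of disk (assumed rule).
 */
unsigned long inode_count(unsigned long disk_bytes,
                          unsigned long bytes_per_inode)
{
    return disk_bytes / bytes_per_inode;
}

/* Nonzero if i-numbers 1..count all fit in an unsigned 16-bit field. */
int fits_16_bit_ino(unsigned long count)
{
    return count <= 65535UL;
}
```

Under these assumptions, about 70K inodes on 300MB corresponds to roughly
4.4KB of disk per inode. At exactly 4800 bytes per inode the count is
65536, one too many for 16 bits, so anything slightly sparser than that
keeps every i-number below 64K.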

> 3. Occasionally, if there are 2 or more users running ld, the server
> or one of the clients will crash.
If the MicroVAX crashes, you should call the manufacturer.
If an IRIS crashes, call the hotline.  We can only fix the problems
the hotline tells us about or that we find ourselves.

Inside SGI we use NFS quite heavily for development.  The source servers
tend to be 4D's with Eagles with 80K inodes.  Most of the hundreds of
clients are 3000's, but there are plenty of 4D's and others.  I daily use
two 3030's and four 4D's as clients and servers for kernel and user-code
builds and debugging--all linked with NFS.  The news article to which
I'm responding is NFS mounted on my personal 3030.

mike@BRL.ARPA (Mike Muuss) (11/13/87)

The "stat" problem can be easily solved, while retaining binary
compatibility, by renaming the existing syscall numbers as
"oldstat" and "oldfstat" (the only two affected syscalls), providing
--special-- code in the kernel to return the old (16-bit size) STAT
data, and then installing two new syscall numbers that are assigned
to "stat" and "fstat".

Thus, old code continues to see what it expects, while any code
recompiled will get the new #include files, and will link with the
new sys-call interface routines in libc.
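
A toy sketch of the scheme described above, not actual kernel code. The
struct layouts, the syscall numbers, and the use of an i-number in place of
a path argument are all hypothetical; only stat is shown, and fstat would
follow the same pattern.

```c
#include <stdint.h>

/* Hypothetical layouts; a real struct stat has many more fields. */
struct old_stat { uint16_t st_ino; uint32_t st_size; };
struct new_stat { uint32_t st_ino; uint32_t st_size; };

typedef int (*syscall_fn)(uint32_t file_ino, void *result);

/* New entry point: reports the full 32-bit i-number. */
static int sys_stat(uint32_t file_ino, void *result)
{
    struct new_stat *st = (struct new_stat *)result;

    st->st_ino = file_ino;
    st->st_size = 0;            /* placeholder; no real file behind this */
    return 0;
}

/* Old entry point: same data, squeezed into the old 16-bit layout. */
static int sys_oldstat(uint32_t file_ino, void *result)
{
    struct new_stat wide;
    struct old_stat *st = (struct old_stat *)result;
    int err = sys_stat(file_ino, &wide);

    if (err != 0)
        return err;
    st->st_ino = (uint16_t)wide.st_ino;   /* old binaries still see 16 bits */
    st->st_size = wide.st_size;
    return 0;
}

/* Hypothetical numbers: the existing slot keeps the old behaviour. */
enum { OLDSTAT_NR = 0, NEWSTAT_NR = 1, NSYSCALLS = 2 };

static const syscall_fn sysent[NSYSCALLS] = { sys_oldstat, sys_stat };

/* Dispatch by syscall number; -1 for a number outside the table. */
int do_syscall(int number, uint32_t file_ino, void *result)
{
    if (number < 0 || number >= NSYSCALLS)
        return -1;
    return sysent[number](file_ino, result);
}
```

An old binary keeps calling OLDSTAT_NR and gets the structure it was
compiled against, while recompiled code is linked against NEWSTAT_NR and
sees the full i-number.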

If SGI ships 3.6 without a reliable fix for this problem, I will be
personally very furious, because the 3030 on my desk stubs its toes
on this every few hours.

I hope you will seriously consider adding the necessary 20 lines of code
to the kernel (and similar amount to libc) to eliminate this problem.
	Best,
	 -Mike

sgf@nancy ( _/**/Sam_Fulcomer ) (11/18/87)

In article <8711122300.aa16787@SEM.BRL.ARPA> mike@BRL.ARPA (Mike Muuss) writes:
>If SGI ships 3.6 without a reliable fix for this problem...

Well, I'll be a little unhappy if SGI doesn't cough up the promised,
generally reworked, NFS. In addition to the problems I outlined in an earlier
posting (mostly to do with heterogeneous networks) I've come up with an ugly
little problem with links in NFS mounts. The kernel lookup code doesn't
properly deal with links on the NFS filesystem under some circumstances

(e.g.
	baby:/usr         nfs  100672  83961  16711  83%  /usr
	/usr/lib/crontab -> /private/usr/lib/crontab

	# vi /usr/lib/crontab

	:w

    Yields:
	Permission denied [Warning - /usr/lib/crontab is incomplete]

    It doesn't make any difference if /private/usr/lib/crontab doesn't exist
)
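
When chasing a report like this, it can help to confirm what the client
actually sees at the path. A minimal sketch, assuming a system with
lstat(2) and readlink(2); link_target is a hypothetical helper and says
nothing about what the kernel lookup code does internally.

```c
#define _POSIX_C_SOURCE 200809L  /* expose the POSIX declarations */
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical diagnostic: if `path` is a symbolic link, copy its
 * target into `buf` and return 1; return 0 if it is not a link and
 * -1 on error. A target longer than buflen - 1 bytes is truncated.
 */
int link_target(const char *path, char *buf, size_t buflen)
{
    struct stat st;
    ssize_t len;

    /* lstat() examines the link itself rather than following it. */
    if (buflen == 0 || lstat(path, &st) != 0)
        return -1;
    if (!S_ISLNK(st.st_mode))
        return 0;

    /* readlink() does not NUL-terminate, so leave room and do it here. */
    len = readlink(path, buf, buflen - 1);
    if (len < 0)
        return -1;
    buf[len] = '\0';
    return 1;
}
```

Running this on /usr/lib/crontab from both the server and a client would
show whether the two agree on where the link points before any write is
attempted.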

Sam


-------------------------------------------------------------------------
		BITNET		sgf@BROWNCS
		CSNET		sgf@cs.brown.edu
		ARPANET 	sgf%cs.brown.edu@relay.cs.net
		UUCP		{ihnp4,allegra,decvax,princeton}!brunix!sgf
		TELECOM		401-863-3618