[comp.unix.i386] NFS problems: ed

greyham@hades.OZ (Greyham Stoney) (03/03/90)

Ok, how's this for a bug:

SYMPTOM:
	when using ed(1) on a file that is on a filesystem that is NFS mounted
	from a remote host, every second 'w' (write) fails with a '?',
	starting with the first. Every other 'w' succeeds. Bizarre!

SYSTEM:
	386/ix 2.0.2 with NFS over TCP/IP.

REPEAT-BY:
	cd <nfs-mounted-directory>
	ed file
	w				(will fail with '?')
	w				(will succeed with '0')
	w				(will fail with '?')
	w				(will succeed with '0')
	etc etc

	Note: if 'file' existed already, you'll get a different number than 0;
	but otherwise it's as above. ed diagnostics don't help much.

Anyone got any clues on why this happens and perhaps a fix?

								Greyham.
-- 
/*  Greyham Stoney:                            Australia: (02) 428 6476  *
 *     greyham@hades.oz  - Ausonics Pty Ltd, Lane Cove, Sydney, Oz.      *
 * "Beware! Grid Bugs!"  \ Quotes from the Ultimate Video Experience...  *
 * "Nice Try, Timelord!" / Can you identify it? Win absolutely nothing!  */

timr@labtam.oz (Tim Roper) (03/08/90)

I think this is because ed has a "feature" whereby it refuses to attempt to
write to a file system that doesn't have enough space left.
Unfortunately it thinks blocks are always 512 bytes so it can make the
wrong decision for file systems with bigger blocks.  Maybe NFS filesystems
are the ones you have with bigger blocks but not too many of them.

Every second 'w' works because typing 'w' again is the ed way of saying
you really mean it (as for writes to read-only files).

-Tim.

daveb@i88.isc.com (Dave Burton) (03/10/90)

Disclaimer: ** This is not an official reply from ISC. **

In article <618@hades.OZ> greyham@hades.OZ (Greyham Stoney) writes:
| SYMPTOM:
|	when using ed(1) on a file that is on a filesystem that is NFS mounted
|	from a remote host, every second 'w' (write) fails with a '?',
|	starting with the first. Every other 'w' succeeds. Bizarre!
|
| SYSTEM:
|	386/ix 2.0.2 with NFS over TCP/IP.
| ...
| Anyone got any clues on why this happens and perhaps a fix?

In article <3974@labtam.oz> timr@labtam.oz (Tim Roper) replies:
: I think this is because ed has a "feature" whereby it refuses to attempt to
: write to a file system that doesn't have enough space left.
: Unfortunately it thinks blocks are always 512 bytes so it can make the
: wrong decision for file systems with bigger blocks.  Maybe NFS filesystems
: are the ones you have with bigger blocks but not too many of them.
:
: Every second 'w' works because typing 'w' again is the ed way of saying
: you really mean it (as for writes to read-only files).

The version of ed distributed from AT&T does not have this "feature". In
the case of a read-only file, successive 'w' requests will fail with '?'.

The toggling behavior is due to the coding in ed(1), which does (essentially):

	static int toggle = 0;
	struct ustat ub;
	...
	ustat(st_dev of file to write, &ub);	/* return value not checked */
	if (!toggle && ub.numblks_on_fs < (numblks for file + fudge)) {
		toggle = 1;
		report an error and restart input;	/* reset below is skipped */
	}
	toggle = 0;

Every other pass through this code, the write will fail, assuming
numblks_on_fs < (numblks for file + fudge), which it is, because:

SVR3.2 uses the device number of a mounted filesystem to identify
non-local filesystems, i.e. those with st_dev < 0.  NFS reports a
negative st_dev to conform with this expectation.  ustat(2) (sic)
detects a negative st_dev and calls the RFS-dependent code directly,
without going through the FSS.  The result is improper behavior for NFS
filesystems; i.e. calls to ustat(2) for NFS filesystems go through the
RFS code (please note the caveat at the bottom of the ustat(2) man
page).  Not surprisingly, ustat(2) fails for NFS (with ENOMEM),
although ed(1) doesn't check the return status.
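The alternation described above can be reproduced in isolation. What follows is only a sketch, not ed's actual source: try_write() and its two parameters are hypothetical names, and passing a free-block count of 0 stands in for whatever the unchecked, failed ustat(2) call leaves in the buffer.

```c
#include <stdio.h>

/* Hypothetical stand-in for the check in ed's write path. */
static int toggle = 0;

/*
 * Returns 0 if the write would proceed, -1 if ed would print '?'.
 * free_blocks models the (bogus) value read after a failed ustat().
 */
int try_write(long free_blocks, long needed_blocks)
{
    if (!toggle && free_blocks < needed_blocks) {
        toggle = 1;   /* remember that one attempt was refused */
        return -1;    /* report '?' and go back to reading commands */
    }
    toggle = 0;       /* re-arm the check for the next 'w' */
    return 0;
}
```

Calling try_write(0, 10) repeatedly yields -1, 0, -1, 0, ..., which matches the '?' / success pattern in the original report. With enough free blocks the check never fires and every call returns 0.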

One fix is to replace the ustat call with statfs(2), which uses the
FSS.  The workaround is to use ex(1) or vi(1) :-).
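For illustration, here is roughly what a space check that goes through the generic filesystem interface and checks the return status might look like. This is a sketch under assumptions: it uses POSIX statvfs(3) as a modern stand-in for the SVR3 statfs(2) call mentioned above (whose argument list differs), and enough_space() is a hypothetical helper, not anything in ed.

```c
#include <sys/statvfs.h>

/*
 * Returns 1 if the filesystem holding `path` has at least
 * `bytes_needed` bytes available, 0 if it does not, and -1 if the
 * query itself failed (which must not be treated as "disk full").
 */
int enough_space(const char *path, unsigned long long bytes_needed)
{
    struct statvfs vfs;
    unsigned long long avail;

    if (statvfs(path, &vfs) != 0)
        return -1;    /* unlike the ustat() call above, check for failure */

    /* f_bavail is counted in units of f_frsize, not a fixed 512 bytes. */
    avail = (unsigned long long)vfs.f_bavail * vfs.f_frsize;
    return avail >= bytes_needed ? 1 : 0;
}
```

A caller that gets -1 can simply skip the space check and attempt the write, so a failing query on a remote filesystem no longer shows up as a spurious '?'.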
--
Dave Burton
uunet!ism780c!laidbak!daveb