whm@sunquest.UUCP (Bill Mitchell) (02/23/90)
I found that on a particular DECstation 3100, "setld -i" when redirected to
an NFS-mounted filesystem on a Sun-3/280 would produce a file with some null
characters in it. The file would always have the right number of bytes, but
some bytes would be null. The 3100 is a loaner and our DEC rep inquired about
this problem on some sort of DEC internal network. He forwarded me a couple
of responses and they boiled down to "looks like a Sun bug".
I tried to reproduce the problem using another 3100 as an NFS server, but the
problem didn't appear.
I investigated further and found that "setld -i" has non-deterministic output:
the lines aren't always in the same order. setld is actually a script and for
some reason it does a lot of echo's in the background. I created a shell
script that does a bunch of echo's in the background (it follows at the
end of this message). When run on a 3100 and redirected to an NFS filesystem
on another 3100, the observed failure rate is 100%. When run on a Sun and
redirected to an NFS filesystem on a Sun or a 3100, the observed failure rate
is 0%. So, it looks like some sort of client-side Ultrix NFS bug.
If you'd like to try to reproduce this on your system, here's the script:
-------------------------------------------------------------------------
echo '.xx xxx x' &
echo '.xx xxxx' &
echo 'xxx \- xxxxxxxxxxx xxx xxxxx xxxx' &
echo '.xx xxxxxx' &
echo '.x xxx' &
echo '[\xx\-x\xx] [\xx\-x\xx] [\xx\-x\xx] [\xx\-x\xx] ' &
echo '[\xx\-x\xx\|] [\xx\-x\xx] [\xx\-x\xx] \xxxxxx...\xx' &
echo '.xx' &
echo '.xx xxxxxxxxxxx' &
echo '.xxx "xxx xxxxxxx"' &
echo '.xxx "xxxx" "xxxxxxxxxx"' &
echo 'xxx' &
echo '.xx xxx' &
echo 'xxxxxxx xxxxx xxxx' &
echo '.x xxxx' &
echo 'xx xxxxxxxx xxx xxxxxxxx xx xx xxx xxxxxxxx xxxxxx. xxxxxxxxx,' &
echo 'xx xxxxxxx xxx xxxx xx xxx xxxxxxxx xxxxxx xxx xxxx:' &
echo '.xx' &
echo 'xxx xxxx' &
echo '.xx' &
echo 'xx xxxxxxxxxxx xxx xxxxx xxx xxxxx xxx xxxxxx' &
echo 'xx xxx xxxxx xxx xxxx:' &
echo '.xx' &
echo 'xxx xxxxx xxxxn >xxxxn' &
echo '.xx' &
echo 'xx xx xxxxx xxxx xx xxxxx, xx xx xx xx xxxx (\-) xx xxxxxxxxxxx xx' &
echo 'xx xxxxxxxx,' &
echo '.xx xxx' &
--------------------------------------------------------------------
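To check a run's output without inspecting it by eye, one option is to count
the null bytes in the resulting file. The following is only a minimal sketch
and not part of the original test: the function names count_nulls and
check_output are made up for illustration, and it assumes a tr that accepts
the octal escape '\000'.

```shell
#!/bin/sh
# Hypothetical helpers (not from the original post) for spotting the
# corruption described above: null bytes in a file that should be text.

# count_nulls FILE: print the number of NUL bytes in FILE.
count_nulls() {
    # tr -cd '\000' deletes every byte except NUL; wc -c counts what remains.
    # The final tr strips the padding some wc implementations emit.
    tr -cd '\000' < "$1" | wc -c | tr -d ' '
}

# check_output FILE: print a one-line verdict; return 0 if clean, 1 if not.
check_output() {
    n=$(count_nulls "$1")
    if [ "$n" -eq 0 ]; then
        echo "$1: clean"
        return 0
    fi
    echo "$1: $n null byte(s)"
    return 1
}
```

For example, after redirecting the script to a file on the NFS filesystem,
"check_output /nfs/out" (a hypothetical path) reports whether that run was
affected. Because the script returns before its background echo's have
finished, give them time to complete, or add a "wait" as the last line of the
script, before checking the file.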
Bill Mitchell whm@sunquest.com
Sunquest Information Systems sunquest!whm@arizona.edu
930 N. Finance Center Dr. {arizona,uunet}!sunquest!whm
Tucson, AZ, 85710 602-885-7700

iglesias@orion.oac.uci.edu (Mike Iglesias) (02/24/90)
We just found out that there is a known Sun NFS bug (ref # 1014577)
for SunOS 4.0. Here's the info we have:

Synopsis: NFS mounted files occasionally get garbage/nulls written to them.

Description:
Occasionally when writing to NFS mounted files, parts of a file are replaced
exactly (no insertions or deletions) with garbage, usually nulls. This can
span several appends to the file by distinct processes running minutes apart.

Does that sound like the problem you're having? We've seen the results
of this bug, but have no idea (until now) how to cause it.

Mike Iglesias
University of California, Irvine

iglesias@orion.oac.uci.edu (Mike Iglesias) (02/24/90)
Well, I guess it isn't the Sun NFS bug. I tried it between a DECstation
3100 (Ultrix 3.1) and a Sun Sparc 1 (SunOS 4.0.3) and it does fail.
Looking at the packets with etherfind, I see nulls in the packets at
the exact places they ended up in the file. Looks like Ultrix is messing
up the file.

Mike Iglesias
University of California, Irvine

meissner@osf.org (Michael Meissner) (02/24/90)
In article <25E5B8AD.23477@orion.oac.uci.edu> iglesias@orion.oac.uci.edu
(Mike Iglesias) writes:
| We just found out that there is a known Sun NFS bug (ref # 1014577)
| for SunOS 4.0. Here's the info we have:
|
| Synopsis: NFS mounted files occasionally get garbage/nulls written to them.
|
| Description:
| Occasionally when writing to NFS mounted files, parts of a file are replaced
| exactly (no insertions or deletions) with garbage, usually nulls. This can
| span several appends to the file by distinct processes running minutes apart.
|
| Does that sound like the problem you're having? We've seen the results
| of this bug, but have no idea (until now) how to cause it.

When I was at Data General, we had the same problem with SunOS 3.5, which
we were using to bootstrap the AViiON software. Our network people
discovered that Sun was not turning on checksumming on the NFS UDP packets.
We kludged around it by taking the NFS source for the module that opens
the socket, turning on checksumming, and rebuilding the kernel with this
module.

I would hope that Ultrix turns on checksumming, but you never know....
--
Michael Meissner email: meissner@osf.org phone: 617-621-8861
Open Software Foundation, 11 Cambridge Center, Cambridge, MA

Catproof is an oxymoron, Childproof is nearly so