[comp.sys.apollo] Perl op.stat failure

dennis@peanuts.nosc.mil (Dennis Cottel) (11/02/90)

In mid-August I posted a followup to a person who found a problem
installing Perl on their Apollo.  I confirmed that I, too, had a
problem with the op.stat regression test failing.  I posted my log
entry:

     op.stat... FAILED on test 1
        This test opens a file and then looks to see if there is one hard
        link, but the link field is still 0.  When I added a close before
        the stat, this test worked (but, naturally, test 2 failed).

Jim Rees responded: 

   I'm not convinced.  I just tried it a bunch of ways, including
   using creat and open for read and/or write, and fstat as well as
   stat.  I can't get it to do anything unusual.

The failure happened again when I installed patches through 37.  On a
hunch I CRPed onto the node whose disk contains the Perl distribution
and the problem went away.  It appears to have something to do with
cross-node I/O.  It's nice to know I wasn't seeing things, but kind of
scary that such differences from normal UNIX behavior still exist.

   Dennis Cottel, dennis@NOSC.MIL, (619) 553-1645  
   Naval Ocean Systems Center, San Diego, CA  92152

kgallagh@digi.lonestar.org (Kevin Gallagher) (11/03/90)

In article <3071@nosc.NOSC.MIL> dennis@peanuts.nosc.mil (Dennis Cottel) writes:
>In mid-August I posted a followup to a person who found a problem
>installing Perl on their Apollo.  I confirmed that I, too, had a
>problem with the op.stat regression test failing.  I posted my log
>entry:
>
>     op.stat... FAILED on test 1
>        This test opens a file and then looks to see if there is one hard
>        link, but the link field is still 0.  When I added a close before
>        the stat, this test worked (but, naturally, test 2 failed).
>[stuff deleted]
>The failure happened again when I installed patches through 37.  On a
>hunch I CRPed onto the node whose disk contains the Perl distribution
>and the problem went away.  It appears to have something to do with
>cross-node I/O.  It's nice to know I wasn't seeing things, but kind of
>scary that such differences from normal UNIX behavior still exist.

I doubt it.  Our installation of Perl passes the op.stat tests regardless of the
node those tests are run on.  Our nodes are all 3500's and 3550's; some with
disks and some without.

It is more likely that the node under which perl was built has its directories
set up with apollo acl which emulates the Unix environment under which it was
built, bsd4.3 or sys5.3, whatever the case may be.  On the other hand, your
directories, within which you ran op.stat, probably have an acl setup not
matching that in the perl directories.  If perl was built expecting a bsd4.3
directory structure, it will get unexpected results if the directory structure
turns out to be something else.  Directory and file commands such as chmod
will not behave as perl expects.

Convert your directory containing op.stat (via chacl) to emulate the SAME
directory structure as that under which perl was built.  Then run the op.stat
test again.
-- 
----------------------------------------------------------------------------
Kevin Gallagher        kgallagh@digi.lonestar.org OR ...!uunet!digi!kgallagh
DSC Communications               OR apcihq!apcidfw!digi!kgallagh
----------------------------------------------------------------------------

kgallagh@digi.lonestar.org (Kevin Gallagher) (11/09/90)

In article <1219@digi.lonestar.org> kgallagh@digi.lonestar.org (Kevin Gallagher) writes:
>In article <3071@nosc.NOSC.MIL> dennis@peanuts.nosc.mil (Dennis Cottel) writes:
>>In mid-August I posted a followup to a person who found a problem
>>installing Perl on their Apollo.  I confirmed that I, too, had a
>>problem with the op.stat regression test failing.  I posted my log
>>entry:
>>
>>     op.stat... FAILED on test 1
>>        This test opens a file and then looks to see if there is one hard
>>        link, but the link field is still 0.  When I added a close before
>>        the stat, this test worked (but, naturally, test 2 failed).
>>[stuff deleted]
>>The failure happened again when I installed patches through 37.  On a
>>hunch I CRPed onto the node whose disk contains the Perl distribution
>>and the problem went away.  It appears to have something to do with
>>cross-node I/O.  It's nice to know I wasn't seeing things, but kind of
>>scary that such differences from normal UNIX behavior still exist.
>
>I doubt it.  Our installation of Perl passes the op.stat tests regardless of
>the node those tests are run on.  

Well, it turns out I was wrong.  I have perl, pl37, installed.  If I run
op.stat on my node in a directory on the disk in my node, the first op.stat
test passes.  If I creep on to another node (or simply log on to another node)
and run the test in the same directory on my disk, the first op.stat test
fails.  The number of links returned by stat is 0, when it fails.

I tried modifying the op.stat perl script by adding the line

    close(foo);

immediately before the invocation of stat.  Now, it did not matter if I ran it
from my node or some other node, stat always returned 0 as the number of
links.

So, I asked myself the question: "Is the perl stat broken or is Apollo's stat
broken?"  To check out the Apollo stat, I ran the following program:

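/*
** teststat.c: report the link count that stat() sees for a path.
*/
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

struct stat buf;
char *path;

int main(void)
{
  path = "./Op.stat.tmp";
  if (stat(path, &buf) != 0) {   /* stat fails if the file is missing */
    perror(path);
    return 1;
  }
  printf("%s number of links = %d\n", path, (int)buf.st_nlink);
  return 0;
}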

Well, it did not matter where we ran it, local or remote node, it always
reported 1 link for the file.

If I modified the perl op.stat script NOT to open the file before invoking the
stat command, it always returned 0 links, no matter where I was logged in.

It appears that, on the Apollo, perl's stat only works correctly if the file
is opened AND is located on a disk in the same node where the script is
executed.  But this is a preliminary guess.  I did not check if the two nodes
I used were on the same or different rings.  Maybe it works OK if the nodes
are in the same ring.  But I am puzzled why perl stat always returns 0 links
when the file is not opened.  Anyway, the problem needs further investigation.
I briefly searched the perl source this evening and suspect that the file
doio.c handles the perl stat processing.

Anyone got a clue what is going on here?
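
One way to narrow this down is to read the link count of a freshly created
file through both interfaces in the same process.  The following is only a
sketch under assumptions: compare_nlink() and report_nlink() are hypothetical
helper names, not part of perl or its test suite, and the scratch file name
is up to the caller.  It creates the file, calls fstat() on the open
descriptor and stat() on the path name, and hands back both counts so they
can be compared on a local and on a remote node.

```c
/* nlink_compare.c: a sketch, not part of the Perl test suite.
 * compare_nlink() and report_nlink() are hypothetical names. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Create `path` fresh, then read its link count twice: through the open
 * descriptor (fstat) and through the path name (stat).
 * Returns 0 on success, -1 if any system call fails.
 */
int compare_nlink(const char *path, long *by_fd, long *by_path)
{
    struct stat fd_buf, path_buf;
    int fd, rc = -1;

    unlink(path);               /* start from a nonexistent file; failure is fine */
    fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    if (fstat(fd, &fd_buf) == 0 && stat(path, &path_buf) == 0) {
        *by_fd = (long)fd_buf.st_nlink;
        *by_path = (long)path_buf.st_nlink;
        rc = 0;
    }
    close(fd);
    unlink(path);               /* clean up the scratch file */
    return rc;
}

/* Print both counts so that a mismatch is easy to spot. */
int report_nlink(const char *path)
{
    long by_fd, by_path;

    if (compare_nlink(path, &by_fd, &by_path) != 0) {
        perror(path);
        return -1;
    }
    printf("%s: fstat nlink = %ld, stat nlink = %ld\n", path, by_fd, by_path);
    return 0;
}
```

Calling report_nlink("./Op.stat.tmp") from a small main() on each node would
show whether the two calls disagree there.  On a system with ordinary UNIX
semantics both counts should be 1.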

rog@speech.kth.se (Roger Lindell) (11/28/90)

kgallagh@digi.lonestar.org (Kevin Gallagher) writes:

>I tried modifying the op.stat perl script by adding the line

>    close(foo);

>immediately before the invocation of stat.  Now, it did not matter if I ran it
>from my node or some other node, stat always returned 0 as the number of
>links.

[Stuff deleted]

>If I modified the perl op.stat script NOT to open the file before invoking the
>stat command, it always returned 0 links, no matter where I was logged in.

>It appears that, on the Apollo, perl's stat only works correctly if the file
>is opened AND is located on a disk in the same node where the script is
>executed.  But this is a preliminary guess.  I did not check if the two nodes
>I used were on the same or different rings.  Maybe it works OK if the nodes
>are in the same ring.  But I am puzzled why perl stat always returns 0 links
>when the file is not opened.  Anyway, the problem needs further investigation.
>I briefly searched the perl source this evening and suspect that the file
>doio.c handles the perl stat processing.

>Anyone got a clue what is going on here?

The problem is that perl's stat() is equivalent to Apollo's fstat(), and that
function takes a file descriptor as its argument. This means that the file
must be open when you do a stat in perl.

I wrote a small C-program which uses fstat() and reports the number of links
to a file. The source code follows:

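	#include <stdio.h>
	#include <fcntl.h>	/* open() and the O_* flags */
	#include <unistd.h>	/* close() */
	#include <sys/types.h>
	#include <sys/stat.h>

	int main(int argc, char **argv)
	{
	  int file_d ;
	  struct stat stat_buffer ;

	  if (argc != 2)
	  {
	    fprintf(stderr,"usage: %s file\n",argv[0]) ;
	    return 1 ;
	  }
	  /* Creates the file if it does not exist yet. */
	  file_d = open(argv[1],O_CREAT|O_RDWR|O_APPEND,0644) ;
	  if (file_d < 0 || fstat(file_d,&stat_buffer) != 0)
	  {
	    perror(argv[1]) ;
	    return 1 ;
	  }
	  printf("Number of hard links = %d\n",(int)stat_buffer.st_nlink) ;
	  close(file_d) ;
	  return 0 ;
	}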

When I run this program and give it the name of a nonexistent file as input,
I get the answer
	Number of hard links = 0
if the disk on which the file gets created is not local. If, on the other hand,
I run this program on the machine that has the disk, I get
	Number of hard links = 1

If I run the program and give it the name of an existing file as input, I get
the answer
	Number of hard links = 1
regardless of whether the file is on a local disk or not.

You can try this with perl's op.stat program: add these two lines before the
first open
	open(foo, ">Op.stat.tmp");
	close(foo);

If you do this and then run the op.stat program, it will report that everything
is OK, but since there is an unlink instruction preceding the original open, I
don't think that my fix is a "supported" fix.
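
The same pre-create idea can be sketched in C.  This is an illustration under
assumptions, not a fix for perl itself: nlink_after_precreate() is a
hypothetical helper name, and it relies only on the observation above that
fstat() reports 1 link for a file that already exists when it is opened.

```c
/* precreate.c: a sketch of the create, close, reopen sequence.
 * nlink_after_precreate() is a hypothetical name. */
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Returns the link count that fstat() reports for `path` after the file
 * has been created and closed once, or -1 if a system call fails.
 */
long nlink_after_precreate(const char *path)
{
    struct stat buf;
    long nlink = -1;
    int fd;

    /* Step 1: create the file and close it, like the added open/close pair. */
    fd = open(path, O_CREAT | O_WRONLY, 0644);
    if (fd < 0)
        return -1;
    close(fd);

    /* Step 2: reopen the now-existing file and fstat the descriptor. */
    fd = open(path, O_WRONLY | O_APPEND);
    if (fd < 0)
        return -1;
    if (fstat(fd, &buf) == 0)
        nlink = (long)buf.st_nlink;
    close(fd);
    return nlink;
}
```

If the remote-disk behaviour is as described, this should return 1 on either
node, whereas an fstat() right after the creating open may return 0 remotely.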

So the problem seems to be with Apollo's fstat().
Is this a genuine Apollo bug or is this something that is allowed by BSD UNIX?
I ask because the BUGS section of the fstat man page says:
     Applying fstat to a socket (and thus to a pipe) returns a zeroed buffer,
     except for the blocksize field, and a unique device and inode number.
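
For comparison, here is a small sketch that shows what fstat() reports for a
pipe on a given system.  The helper names stat_pipe() and print_pipe_stat()
are hypothetical, and the output will differ between systems; it only makes
the man page's remark easy to check.

```c
/* pipestat.c: a sketch that inspects fstat() output for a pipe.
 * stat_pipe() and print_pipe_stat() are hypothetical names. */
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* fstat() the read end of a fresh pipe into *out.  Returns 0 on success. */
int stat_pipe(struct stat *out)
{
    int fds[2];
    int rc;

    if (pipe(fds) != 0)
        return -1;
    rc = fstat(fds[0], out);
    close(fds[0]);
    close(fds[1]);
    return rc;
}

/* Print the fields that the BUGS note talks about. */
void print_pipe_stat(void)
{
    struct stat buf;

    if (stat_pipe(&buf) != 0) {
        perror("pipe/fstat");
        return;
    }
    printf("pipe: nlink = %ld, blksize = %ld, is FIFO = %d\n",
           (long)buf.st_nlink, (long)buf.st_blksize,
           S_ISFIFO(buf.st_mode) ? 1 : 0);
}
```

A link count of 0 here would match the zeroed buffer the man page describes;
whether a regular file on a remote disk should ever look like that is the
open question.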

Roger Lindell



--
Roger Lindell			rog@speech.kth.se
Phone: +46 8 790 75 73		Fax: +46 8 790 78 54
Dept. of Speech Communication and Music Acoustics
Royal Institute of Technology	Sweden