[comp.soft-sys.andrew] console gets bus error, other nasties -- PL 6

cwitty@csli.Stanford.EDU (Carl Witty) (08/18/90)

Some problems I've had with Andrew PL 6:

1) "console" compiled on a Sparcstation 1 gives a bus error in a
function called by the application_Start() macro on line 202 of
runapp.c.  It runs fine on a sun 3 or an IBM RT running AOS 4.3.  (The
suns both run 4.0.3).

How does one go about debugging this?  Normally, I'd go into a
debugger and get a backtrace, to see where it died, and then look at
the source code to see if I could figure out what was wrong.  With
this dynamic loading stuff, I can't figure out how to get the debugger
to tell me what function it died in, so I can't figure out how to
start debugging it.

2) .../overhead/util/lib/tokpak.c kills the Sparcstation compiler.  It
works fine if it's compiled without optimization.

3) .../config/vax_3/system.h is missing the declaration of osi_Times
and the definition of osi_GetSecs().

4) .../ams/demo/gendemo doesn't work if you don't set ANDREWDIR and
you don't build andrew in /usr/andrew.

5) .../overhead/class/testing/Imakefile should set MAKEDOFLAGS to
-b ../cmd -g -d $(BASEDIR)/lib .  Otherwise, it doesn't work if you're
not installing in /usr/andrew.

6) .../overhead/snap2/guardian/cmd/Imakefile and
.../overhead/snap2/pcserver/Imakefile refer to
${AFSBASEDIR}/lib/librauth.a, which is not part of AFS 3.0.  Things
seem to work fine if that library is removed from the compile
line--evidently AFS 3.0 has the necessary functions from that library,
but has moved them to some other library.

7) .../overhead/wpi/wpi.c , line 503, calls puts(WPI_Value("Fwd", entry));
however, if somebody has no forward field, WPI_Value returns NULL,
which makes puts() core dump.  wpi should probably check for this case
explicitly and print out something like "no forwarding address".

8) Is the AFS monitoring for console supposed to work under AFS 3.0?
The sample consoles that claim to monitor the file system always claim
"There has been no file system activity."

I'd appreciate any help with fixing the problems with console (numbers
1 and 8); the rest I've fixed myself.

Thanks,

Carl Witty
cwitty@cs.stanford.edu

Craig_Everhart@TRANSARC.COM (08/21/90)

Those are great bug reports.  The guys actually supporting Andrew should
take them seriously.

I can help with only a few of these.

Excerpts from internet.info-andrew: 18-Aug-90 console gets bus error,
oth.. Carl Witty@decwrl.dec.co (2083)

> 4) .../ams/demo/gendemo doesn't work if you don't set ANDREWDIR and
> you don't build andrew in /usr/andrew.

The Imakefile should be passing in a correct value for DESTDIR via the
following hack:
	-${CSHELL} gendemo  -force -DESTDIR ${DESTDIR} ${DESTDIR}/.MESSAGES/demo

I've always ignored any problems with ams/gendemo since I don't want an
AMS demo folder created where this thing wants to put it, anyway.  But
running this thing assumes that you have CUI configured properly (i.e.
assumes that you have an AndrewSetup file that will make things work). 
Generally, the time when you're building the software is not the time
that you want to get AMS running, so it's an inconvenient interruption. 
Sigh.

> 6) .../overhead/snap2/guardian/cmd/Imakefile and
> .../overhead/snap2/pcserver/Imakefile refer to
> ${AFSBASEDIR}/lib/librauth.a, which is not part of AFS 3.0.  Things
> seem to work fine if that library is removed from the compile
> line--evidently AFS 3.0 has the necessary functions from that library,
> but has moved them to some other library.

Yup, the AFS 3.0 distribution no longer generates librauth.a, and there
are backward-compatibility functions available.  Last November, which is
the last time I tested this, using these backward-compatibility
functions prevented PC users from authenticating with their cleartext
passwords, though it was possible to authenticate by copying tokens from
one process to another with no problems.  With dozens of other things to
do, I didn't pursue the problems further.  Sorry.

> 8) Is the AFS monitoring for console supposed to work under AFS 3.0?
> The sample consoles that claim to monitor the file system always claim
> "There has been no file system activity."

Yes, it works fine, but you have to say ``fs monitor localhost'' or its
equivalent.  (It used to be ``fs monitor <hostname>''.)  This gets the
local cache manager to send the messages that Console will monitor.

		Craig

gk5g+@ANDREW.CMU.EDU (Gary Keim) (08/21/90)

Excerpts from misc: 21-Aug-90 Re: console gets bus error,.. Craig F.
Everhart (2167+0)

> Those are great bug reports.  The guys actually supporting Andrew should
> take them seriously.

Some great, some not so.  Did I miss something here?

Excerpts from misc: 18-Aug-90 console gets bus error, oth.. Carl
Witty@decwrl.dec.co (2083)

> How does one go about debugging this?  

I think we sent (or are sending) info on gdb.

Excerpts from misc: 18-Aug-90 console gets bus error, oth.. Carl
Witty@decwrl.dec.co (2083)

> 2) .../overhead/util/lib/tokpak.c kills the Sparcstation compiler.  It
> works fine if it's compiled without optimization.

Why don't you send the error messages reported?

Excerpts from misc: 18-Aug-90 console gets bus error, oth.. Carl
Witty@decwrl.dec.co (2083)

> 3) .../config/vax_3/system.h is missing the declaration of osi_Times
> and the definition of osi_GetSecs().

Blunder on my part.  Fixed.

Excerpts from misc: 18-Aug-90 console gets bus error, oth.. Carl
Witty@decwrl.dec.co (2083)

> 4) .../ams/demo/gendemo doesn't work if you don't set ANDREWDIR and
> you don't build andrew in /usr/andrew.

Again, can you send the error messages printed?

Excerpts from misc: 18-Aug-90 console gets bus error, oth.. Carl
Witty@decwrl.dec.co (2083)

> 5) .../overhead/class/testing/Imakefile should set MAKEDOFLAGS to
> -b ../cmd -g -d $(BASEDIR)/lib .  Otherwise, it doesn't work if you're
> not installing in /usr/andrew.

Another mistake by me.  Fixed.

Excerpts from misc: 18-Aug-90 console gets bus error, oth.. Carl
Witty@decwrl.dec.co (2083)

> 6) .../overhead/snap2/guardian/cmd/Imakefile and
> .../overhead/snap2/pcserver/Imakefile refer to
> ${AFSBASEDIR}/lib/librauth.a, which is not part of AFS 3.0.  Things
> seem to work fine if that library is removed from the compile
> line--evidently AFS 3.0 has the necessary functions from that library,
> but has moved them to some other library.

I've put an #ifdef AFS30_ENV around librauth.a.

Excerpts from misc: 18-Aug-90 console gets bus error, oth.. Carl
Witty@decwrl.dec.co (2083)

> 7) .../overhead/wpi/wpi.c , line 503, calls puts(WPI_Value("Fwd", entry));
> however, if somebody has no forward field, WPI_Value returns NULL,
> which makes puts() core dump.  wpi should probably check for this case
> explicitly and print out something like "no forwarding address".

Fixed.
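
For readers following item 7, here is a minimal sketch of the kind of
check suggested there.  It is not the actual change made to wpi.c:
value_or() and print_forward() are hypothetical helper names, and the
fwd parameter merely stands in for the result of
WPI_Value("Fwd", entry).

```c
#include <assert.h>
#include <stdio.h>

/* Return value if it is non-NULL, otherwise the fallback text.
   (Hypothetical helper, not part of wpi.c.) */
const char *value_or(const char *value, const char *fallback)
{
    assert(fallback != NULL);
    return value != NULL ? value : fallback;
}

/* Print a forwarding address that may be missing.  `fwd` stands in
   for the result of WPI_Value("Fwd", entry), which can be NULL. */
int print_forward(const char *fwd)
{
    /* puts() is never handed a NULL pointer here. */
    return puts(value_or(fwd, "no forwarding address"));
}
```

Keeping the NULL test in one small helper means every other optional
field can be printed the same way.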

Thank you for the report.

Gary Keim
ATK Group

cwitty@csli.Stanford.EDU (Carl Witty) (08/22/90)

   Excerpts from misc: 18-Aug-90 console gets bus error, oth.. Carl
   Witty@decwrl.dec.co (2083)

   > 2) .../overhead/util/lib/tokpak.c kills the Sparcstation compiler.  It
   > works fine if it's compiled without optimization.

   Why don't you send the error messages reported?

rm -f tokpak.o
cc -c -I.  -O -Bstatic -I/afs/ir.stanford.edu/@sys/local/andrew/include/atk 
 -I/afs/ir.stanford.edu/@sys/local/andrew/include -I/usr/afsws/include 
 -I//usr/include/X11   tokpak.c
cc: Fatal error in iropt: Segmentation fault
*** Error code 1
make: Fatal error: Command failed for target `tokpak.o'

(I split the compile line for mailing--in reality, of course, it's all
one line.)

   Excerpts from misc: 18-Aug-90 console gets bus error, oth.. Carl
   Witty@decwrl.dec.co (2083)

   > 4) .../ams/demo/gendemo doesn't work if you don't set ANDREWDIR and
   > you don't build andrew in /usr/andrew.

   Again, can you send the error messages printed?

Do you want to create a messages demo folder in the directory
 /afs/ir/users/c/cwitty/.MESSAGES/amsdemo, erasing any previous
 contents AND ALL SUBDIRECTORIES recursively [no] ? yes
/usr/andrew/bin/cui: Command not found.
/afs/ir/users/c/cwitty/.MESSAGES/amsdemo is not a directory--aborting.

(Again, I split the "Do you want ..." line into three lines for
mailing.)

This fails because gendemo assumes that either you've got the
ANDREWDIR environment variable set, or you're installing in
/usr/andrew.  It would be nice if this didn't require you to set
ANDREWDIR, since (as far as I can tell) everything else in Andrew
works without ANDREWDIR being set.
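
The lookup order being asked for can be sketched as follows.  gendemo
itself is a shell script, so this C fragment only illustrates the
idea: prefer ANDREWDIR when it is set, otherwise fall back to a
default supplied at build time rather than a hard-wired /usr/andrew.
andrew_base_dir() and its build_default parameter are hypothetical
names, not part of the Andrew sources.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>

/* Pick the Andrew base directory: the ANDREWDIR environment variable
   if it is set and non-empty, otherwise the default chosen when the
   software was built.  (Hypothetical helper for illustration.) */
const char *andrew_base_dir(const char *build_default)
{
    const char *env = getenv("ANDREWDIR");

    assert(build_default != NULL);
    if (env != NULL && env[0] != '\0')
        return env;
    return build_default;
}
```

With a fallback like this, the path to cui would be built from the
install location recorded at build time whenever ANDREWDIR is unset.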

Carl Witty
cwitty@cs.stanford.edu

cwitty@csli.Stanford.EDU (Carl Witty) (08/24/90)

In article <14951@csli.Stanford.EDU> cwitty@csli.Stanford.EDU (Carl Witty) writes:
   1) "console" compiled on a Sparcstation 1 gives a bus error in a
   function called by the application_Start() macro on line 202 of
   runapp.c.  It runs fine on a sun 3 or an IBM RT running AOS 4.3.  (The
   suns both run 4.0.3).

   How does one go about debugging this?  Normally, I'd go into a
   debugger and get a backtrace, to see where it died, and then look at
   the source code to see if I could figure out what was wrong.  With
   this dynamic loading stuff, I can't figure out how to get the debugger
   to tell me what function it died in, so I can't figure out how to
   start debugging it.

Thanks to Gary Keim <gk5g+@andrew.cmu.edu> for sending me information
on debugging dynamically loaded programs.  Unfortunately, when I
attempted to recompile the offending parts with debugging information,
the problem went away.  His information was enough to localize the
function in which the core dump occurred, though, and I used
printf()'s (ugh) to find the line it was dumping core on: the problem
is in the file .../andrew/atk/console/lib/vmmon.c, InitStats(), around
line 61, which is:

    mask = sigblock(1 << (SIGCHLD - 1));

Since the problem goes away when optimization is turned off, it's
probably just a compiler bug.  (/bin/cc, SunOS 4.0.3c.  SparcStation 1.)

For now, I'll just compile that file without optimization.
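
For reference, the quoted line builds a BSD-style signal mask by hand
(signal N occupies bit N-1) and passes it to sigblock().  Below is a
minimal sketch of the same operation using the POSIX sigprocmask()
interface.  It is not the code in vmmon.c, and legacy_sigmask(),
block_sigchld(), restore_mask() and sigchld_is_blocked() are
hypothetical names chosen for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* The mask bit computed in the quoted line: signal N is bit N-1. */
unsigned long legacy_sigmask(int sig)
{
    assert(sig > 0 && sig <= (int)(8 * sizeof(unsigned long)));
    return 1UL << (sig - 1);
}

/* Block SIGCHLD, saving the previous mask in *old.
   Returns 0 on success, -1 on error. */
int block_sigchld(sigset_t *old)
{
    sigset_t set;

    sigemptyset(&set);
    sigaddset(&set, SIGCHLD);
    return sigprocmask(SIG_BLOCK, &set, old);
}

/* Restore a mask previously saved by block_sigchld(). */
int restore_mask(const sigset_t *old)
{
    return sigprocmask(SIG_SETMASK, old, NULL);
}

/* Return 1 if SIGCHLD is currently blocked, 0 if not, -1 on error. */
int sigchld_is_blocked(void)
{
    sigset_t cur;

    /* A NULL new set leaves the mask unchanged and just reports it. */
    if (sigprocmask(SIG_BLOCK, NULL, &cur) != 0)
        return -1;
    return sigismember(&cur, SIGCHLD);
}
```

A caller would bracket the critical section with block_sigchld() and
restore_mask(), much as the original code saves the old mask returned
by sigblock().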

Carl Witty
cwitty@cs.stanford.edu