[comp.sys.apollo] SR10 /com/tb

krowitz@RICHTER.MIT.EDU (David Krowitz) (04/29/89)

Ok, now it's my turn to ask for help ...
How do I get a traceback of a program that has died
under SR10? The process that was running the program
has, of course, disappeared by the time I get the
error message telling me whatever the Apollo system
fault was.


 -- David Krowitz

krowitz@richter.mit.edu   (18.83.0.109)
krowitz%richter@eddie.mit.edu
krowitz%richter@athena.mit.edu
krowitz%richter.mit.edu@mitvma.bitnet
(in order of decreasing preference)

wescott@LNIC1.HPRC.UH.EDU (Andrew M. Wescott) (04/29/89)

Well I started with Apollos at SR 10, so I don't know what
you did at SR 9, but I think I know what to do at SR 10.  If
the process has in fact disappeared, you can't do just a
"tb" but rather you must do a "tb -n //node_spec" to examine
the process dump file on a particular node.  I suggest you
start with tb(1) in the "BSD Command Reference" and abort(3)
in the "BSD Programmer's Reference".

We also uncovered some interesting tb behavior for f77 (and I
assume /com/cc) in the Unix environment.  After a run-time
failure on f77 compiled code, we got "no traceback information
matched your specifications" after invoking tb.  I decided 
that this was a bug when we found that tb worked properly
with the same code compiled under the ftn front-end.  The
response to my APR came yesterday, and they told me that Unix
has "no concept" of a traceback (yes, we've all gotten those
"core dumps"), so a tb invoked immediately after bombing a
Unix-compiled code would do nothing for me.  Hence you have
three options as I see it: (1) tb -n //node_spec , (2) set your
f77_dump_flag environment to "y", or (3) compile with the -g
switch (or you could use /com/ftn and /com/cc as we do).

I'm sure there are similar measures for /bin/cc, but those of
you who program more in C than I do can figure that out.  This
was just a general comment on some differences between Aegis
and Unix.

What was that nonsense I read here recently about SR 10 not
being real Unix?  As far as manufacturer additions/deletions
go, I think all the manufacturers are guilty.  Maybe OSF will
give us a "real" Unix, but what I described above is certainly
the essence of Unix.


Andrew M. Wescott
University of Houston
Department of Chemical Engineering

ced@apollo.COM (Carl Davidson) (05/01/89)

From article <8904281911.AA08487@richter.mit.edu>, by krowitz@RICHTER.MIT.EDU (David Krowitz):
> Ok, now it's my turn to ask for help ...
> How do I get a traceback of a program that has died
> under SR10? The process that was running the program
> has, of course, disappeared by the time I get the
> error message telling me whatever the Apollo system
> fault was.

/com/tb will still do the job.  Nowadays, though, it's really a link to
/usr/apollo/bin/tb.

When a process crashes it dumps crash info in the file
node_data/system_logs/proc_dump.  This file is a circular buffer which
can contain well over 100 dumps before it wraps on itself.  The
traceback program gets the appropriate info from here.
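
To make the wrap-around behaviour concrete, here is a minimal Python sketch
of a fixed-capacity circular buffer of crash records.  It is only an
illustration of the idea: the thread does not describe the real proc_dump
layout, and the class name DumpRing, the capacity of 3, and the record
strings are all hypothetical.

```python
class DumpRing:
    """Fixed-capacity ring of crash records (hypothetical, not the proc_dump format).

    Once the ring is full, each new record overwrites the oldest one.
    """

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.slots = [None] * capacity  # preallocated storage
        self.next = 0                   # slot the next record will be written to
        self.count = 0                  # number of valid records (<= capacity)

    def append(self, record):
        """Store a record, overwriting the oldest one if the ring is full."""
        self.slots[self.next] = record
        self.next = (self.next + 1) % self.capacity
        self.count = min(self.count + 1, self.capacity)

    def last(self):
        """Return the most recent record, or None if nothing was stored."""
        if self.count == 0:
            return None
        return self.slots[(self.next - 1) % self.capacity]

    def oldest_to_newest(self):
        """Return the surviving records in the order they were written."""
        start = (self.next - self.count) % self.capacity
        return [self.slots[(start + i) % self.capacity]
                for i in range(self.count)]


# Four crashes into a three-slot ring: the first record is overwritten.
ring = DumpRing(capacity=3)
for name in ["crash-1", "crash-2", "crash-3", "crash-4"]:
    ring.append(name)
```

After the loop the ring holds crash-2 through crash-4, and last() returns
crash-4.  A reader of such a buffer can therefore always reach the latest
dump, while older ones are eventually lost once the buffer wraps.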


-- 
  Carl Davidson              "Real life is too important to be taken seriously"
  Apollo Computer Inc.                 
  Chelmsford, MA 01824                 
  ced@apollo.com

nazgul@apollo.COM (Kee Hinckley) (05/02/89)

>with the same code compiled under the ftn front-end.  The
>response to my APR came yesterday, and they told me that Unix
>has "no concept" of a traceback (yes, we've all gotten those
>"core dumps"), so a tb invoked immediately after bombing a
>Unix-compiled code would do nothing for me.  Hence you have
>three options as I see it: (1) tb -n //node_spec , (2) set your
>f77_dump_flag environment to "y", or (3) compile with the -g
>switch (or you could use /com/ftn and /com/cc as we do).
>
That's just weird.  Unless the program really blew itself
away, a 'tb -l' (show last program traceback) works fine
for me.
-- 
### User Environment, Apollo Computer Inc. ###  Public Access ProLine BBS   ###
###     {mit-eddie,yale}!apollo!nazgul     ###  nazgul@pro-angmar.cts.com   ###
###           nazgul@apollo.com            ### (617) 641-3722 300/1200/2400 ###
I'm not sure which upsets me more; that people are so unwilling to accept
responsibility for their own actions, or that they are so eager to regulate
everyone else's.

oj@apollo.COM (Ellis Oliver Jones) (05/02/89)

In article <42f7bbce.1b147@apollo.COM> nazgul@apollo.COM (Kee Hinckley) writes:
>That's just weird.  Unless the program really blew itself
>away, a 'tb -l' (show last program traceback) works fine
>for me.

Some application programs have fault handlers which ``recover'' 
from faults by fixing them up, then printing a message and 
exiting.  This is irritating, as it prevents tracebacks.  
Could this be the problem?
/oj
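
The effect of such a "recovering" handler can be sketched by analogy in
Python.  This is not the Apollo fault-handling mechanism: a Python exception
stands in for the fault, and the function names faulty, swallow and report
are hypothetical.  The point is only that a handler which prints a message
and exits throws away the call-stack information a traceback needs.

```python
import traceback


def faulty():
    """Stand-in for application code that hits a fault."""
    raise RuntimeError("simulated fault")


def swallow(func):
    """'Recover' by reducing the fault to a one-line message.

    The call stack is discarded, so nothing is left to trace back.
    """
    try:
        func()
    except Exception as exc:
        return "fatal error: %s" % exc
    return None


def report(func):
    """Keep the full traceback text instead of hiding it."""
    try:
        func()
    except Exception:
        return traceback.format_exc()
    return None


short = swallow(faulty)   # message only
full = report(faulty)     # message plus the frames that led to the fault
```

Here short carries only the message, while full also names the frame
(faulty) in which the fault was raised.  An application that behaves like
swallow leaves the user with the message and nothing to trace.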