[comp.sys.sgi] Solution to Wavefront core dump

architec@cutmcvax.OZ (Phil Dench ) (06/16/90)

A few weeks ago I installed Wavefront (the real version) on
one of our 4D-70GT's.

Everything worked well except for one annoying problem. The
renderer ('image') core dumped with a SEGV.  Not the sort
of thing you expect from a professional piece of software.

To make things more confusing, it would only core dump if
the output image format was 'rla' (Wavefront's own) and not
targa, vista, pixar et al.

Unless, of course, it was run as root; then everything was OK.

The Wavefront hotline/hotfax suggested I check (and
recheck) permissions, ownerships etc.  I kept saying that
all the permissions were OK, and the answer was, in effect,
"It's your problem, not ours."

I finally decided to try and find the bug myself.  So I
whipped out dbx and had a look at the core file.  The
executable was stripped of course, so I started wading
through assembler listings to try and work out why it was
crashing.

It turned out that it was crashing in a strncpy. The second
(source string) argument was NULL.  I thought they must
have been calling something that only root had permission
to do and weren't checking for a NULL return (eg
fopen()).  Or maybe root had an environment variable that
the normal user didn't (ie getenv() returning NULL).
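
The getenv() guess is the easier of the two to guard against.
A minimal sketch in C, assuming nothing about Wavefront's
actual code; env_or_default is a hypothetical helper name
used only for illustration:

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper (not from Wavefront): return the value of the
 * environment variable `name`, or `fallback` when it is not set.
 * getenv() returns NULL for an unset variable, and handing that NULL
 * to strncpy() as the source is the kind of thing that ends in a SEGV. */
const char *env_or_default(const char *name, const char *fallback)
{
    const char *value = getenv(name);

    return (value != NULL) ? value : fallback;
}
```

The same pattern applies to fopen(): test the returned
pointer before using it.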

I then spent a few more hours tracing back from the strncpy
to maybe see if anything obvious was happening before the
strncpy call.  Nothing really useful was turning up so I
was looking for some other avenue of attack.

I then thought that I should check what this second arg to
strncpy held when the prog was run by root (ie when it
worked). As soon as I saw the contents it all made sense.
In less than a minute I had a fully working program.

The second argument was "root": the user name.  And we are
using yellow pages, so the local passwd file only contained
root, a few others and +::0:0:::.  They must be doing
something like a getpwuid() and not checking for a NULL
return.  Without -lsun the lookup presumably never consults
yellow pages, so any user missing from the local file comes
back as NULL.  Not checking the return value or forgetting
to link with -lsun appears to be quite a common mistake.
I've done it myself enough times. I'm just amazed that it got
through all their testing unnoticed. Obviously none of
their test sites or other customers are running on networks
using yellow pages.
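
The missing check might look something like the sketch
below. It is only a guess at the shape of a fix:
fill_user_field and the "uid<number>" fallback are
hypothetical choices made for illustration, and the real
'rla' header layout is not shown here.

```c
#include <pwd.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: fill `buf` (of size `buflen`) with a user name
 * for a header field.  `pw` is whatever getpwuid() returned and may be
 * NULL, e.g. when the user lives only in yellow pages and the lookup
 * consulted just the local passwd file.  In that case fall back to a
 * numeric name instead of passing NULL to strncpy(). */
void fill_user_field(const struct passwd *pw, uid_t uid,
                     char *buf, size_t buflen)
{
    if (buflen == 0)
        return;

    if (pw != NULL && pw->pw_name != NULL) {
        /* strncpy() does not terminate on truncation, so do it here. */
        strncpy(buf, pw->pw_name, buflen - 1);
        buf[buflen - 1] = '\0';
    } else {
        /* Assumed fallback format; any non-crashing default would do. */
        snprintf(buf, buflen, "uid%lu", (unsigned long)uid);
    }
}
```

A caller would pass the result of getpwuid(getuid()) as pw,
so a failed lookup ends up as a numeric name in the header
rather than a core dump.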

So I just added the wavefront owner to the local passwd
file and it all ran OK. And not surprisingly, the resultant
'rla' image header includes the user name.  What I will do
now is relink all the executables (Wavefront is distributed
as *.[ao]'s) with -lsun.

BTW, I'm not trying to say nasty things about the Wavefront
software.  Everything else about the installation went
smoothly and the product itself appears to be pretty fast,
easy to use and capable of excellent results.  I just
thought it was worthwhile telling others of my experience
so they too don't spend a day looking for a solution to
similar problems with other shop-bought software.

And I'm still interested in a Wavefront email address if
anyone has one.


	Phil Dench


--------------------------------------------+----------------------------------
                                            | School of Computer Science,
ACSNet: architec@cutmcvax.oz                | Curtin University of Technology,
UUCP:   ...!uunet!munnari!cutmcvax!architec | Kent Street,
ARPA:   architec%cutmcvax.oz@uunet.UU.NET   | Bentley
                                            | Western Australia, 6102
--------------------------------------------+----------------------------------