[rec.games.hack] Nethack crashes after picking character

emct35@castle.ed.ac.uk (Jason Smith) (08/11/89)

We managed to get around the fact that our compiler is worse than DUMB, STUPID,
STUPID_CPP and DUMB.Setup expect (by not including extern.h and declaring
the minimum of stuff by hand).

We can start the program up, it allows us to pick a character, then there is
a dramatic pause, followed by a segmentation (memory) fault.
This is on an ancient GEC63 machine running UX63 (SysV), and we can't get
a new compiler.
Sdb might have helped us, but the symbol table is far too big for the compiler
to cope with, so no cc -g.

We are using curses (no termlib).


Can anyone out there help us?

advTHANKSance

jriegel@cstemp2.almaden.ibm.com (Jeff Riegel) (08/22/89)

In article <97@castle.ed.ac.uk>, emct35@castle.ed.ac.uk (Jason Smith) writes:
> 
> We can start the program up, it allows us to pick a character, then there is
> a dramatic pause, followed by a segmentation (memory) fault.
> 
  I'm having the same problem using an IBM RT running AIX 2.2.1.  I'd
appreciate finding out what this problem is.  I compiled the program with
SYS V defined.
Thanks in advance.


-Jeff
jriegel@ibm.com

bolosky@cs.rochester.edu (Bill Bolosky) (08/22/89)

In article <1021@ks.UUCP> jriegel@cstemp2.almaden.ibm.com (Jeff Riegel) writes:
>In article <97@castle.ed.ac.uk>, emct35@castle.ed.ac.uk (Jason Smith) writes:
>> 
>> We can start the program up, it allows us to pick a character, then there is
>> a dramatic pause, followed by a segmentation (memory) fault.
>> 
>  I'm having the same problem using an IBM RT running AIX 2.2.1.  I'd
>appreciate finding out what this problem is.  I compiled the program with
>SYS V defined.
>Thanks in advance.
>
>
>-Jeff
>jriegel@ibm.com


This problem has also been around on RTs for several older versions of
nethack.  It's not operating system dependent, because it also happens under
Mach.  My (fairly strong) suspicion is that it's related to pointer
rounding.  The RT has (half and) fullword load and store operations that only
work when aligned on a (half or) fullword boundary.  When you try to make an
unaligned reference, it simply rounds the pointer down rather than
generating an error, thus postponing the eventual crash to sometime more
obscure and annoying.  malloc and the compiler do their best to align
everything on 32-bit boundaries, but it is quite possible to foil them.  For
example, if you do foo = malloc(100); *(int *)(foo+2) = x;, x will be
written at *(int *)foo, not *(int *)(foo + 2).
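
A minimal sketch of the effect described above, for anyone hunting for the
offending code.  It does not reproduce the RT hardware; rounded_offset() is
a hypothetical model of "round the pointer down", and store_int()/load_int()
are illustrative helper names (not from the nethack sources) showing one
portable way to write an int at an unaligned address without casting the
pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the rounding described above: the offset a
 * load/store would really use if the hardware silently cleared the low
 * address bits.  access_size must be a power of two (2 or 4 here). */
static size_t rounded_offset(size_t offset, size_t access_size)
{
    assert(access_size != 0 && (access_size & (access_size - 1)) == 0);
    return offset & ~(access_size - 1);
}

/* Portable replacement for *(int *)p = value when p may be unaligned:
 * copy the bytes instead of dereferencing a cast pointer. */
static void store_int(unsigned char *p, int value)
{
    memcpy(p, &value, sizeof value);
}

static int load_int(const unsigned char *p)
{
    int value;
    memcpy(&value, p, sizeof value);
    return value;
}

/* Usage: write an int at an arbitrary offset in a buffer, read it back. */
static int roundtrip(size_t offset, int value)
{
    unsigned char buf[32] = {0};
    assert(offset + sizeof(int) <= sizeof buf);
    store_int(buf + offset, value);
    return load_int(buf + offset);
}
```

Under this model a fullword store at offset 2 lands at offset 0, which is
the foo versus foo+2 case in the example above; the memcpy version stores
and reloads the value at the requested offset regardless of alignment.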

I have been meaning to look into the sources to find out who does this and
fix it, but I'm far too busy (and lazy).  If someone who's more familiar
with the code knows where something like this might happen, and would let me
know, I'd be happy to fix it.

Bill

norm@cfctech.UUCP (Norm Meluch) (08/23/89)

>In article <97@castle.ed.ac.uk>, emct35@castle.ed.ac.uk (Jason Smith) writes:
> 
> We can start the program up, it allows us to pick a character, then there is
> a dramatic pause, followed by a segmentation (memory) fault.
> 

We had this (and many other problems) with nethack on our 3B2/600.
The solution for us was to change the typedef of schar in config.h
from signed char to short int.  It seems that the room generation
functions use an "schar" as the termination flag of the room list for the
level, and the termination value for the list is -1.  Well, on our system
signed char is *not* signed, so the search of the list runs off the edge
of memory, causing the core dump.
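
A minimal sketch of that failure, assuming the list is scanned until an
entry equals -1.  The type and function names here are hypothetical and the
loop is not copied from the nethack sources; the limit argument exists only
so the sketch cannot itself run off the end of the array.

```c
#include <assert.h>
#include <stddef.h>

/* What "signed char" amounts to when the keyword is dropped and plain
 * char is unsigned on the machine. */
typedef unsigned char unsigned_schar;
/* The workaround: a type that is signed on every compiler. */
typedef short short_schar;

/* Count entries before the -1 terminator, giving up at limit. */
static size_t count_unsigned(const unsigned_schar *list, size_t limit)
{
    size_t n = 0;
    assert(list != NULL);
    /* list[n] promotes to an int in 0..255, so it never equals -1. */
    while (n < limit && list[n] != -1)
        n++;
    return n;
}

static size_t count_short(const short_schar *list, size_t limit)
{
    size_t n = 0;
    assert(list != NULL);
    while (n < limit && list[n] != -1)
        n++;
    return n;
}

/* Two rooms, a terminator, then whatever follows in memory. */
static size_t demo_unsigned(void)
{
    /* Storing -1 in an unsigned char leaves 255 in the array. */
    const unsigned_schar list[] = { 3, 7, (unsigned_schar)-1, 9 };
    return count_unsigned(list, sizeof list / sizeof list[0]);
}

static size_t demo_short(void)
{
    const short_schar list[] = { 3, 7, -1, 9 };
    return count_short(list, sizeof list / sizeof list[0]);
}
```

With the unsigned type the scan walks straight past the terminator and stops
only at the limit; with short it stops after the two real entries.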

(This bug took me about 1.5 days to find).

After we fixed this problem the program ran fine.

						- Norm

|~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Norman J. Meluch ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~|
| Mail:{sharkey|mailrus}!cfctech!norm                  Voice: (313) 244-1809   |
|~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~|
| Note: The opinions expressed here are in no way to be confused with valid    |
|_______ideas or corporate policy._____________________________________________|

Ralf.Brown@B.GP.CS.CMU.EDU (08/23/89)

In article <13641@cfctech.UUCP>, norm@cfctech.UUCP (Norm Meluch) wrote:
}We had this (and many other problems) with nethack on our 3B2/600.
}The solution for us was to change the typedef of schar in config.h
}from signed char to short int.  It seems that the room generation 
}functions use the "schar" for a termination flag of the room list for the 
}level, and the termination value for the list is -1.  Well on our system
}signed char is *not* signed, so the search of the list runs off the edge
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
}of memory causing the core dump.

It isn't signed because "signed" is #defined to the null string in tradstdc.h.
This one bit me with the Metaware High-C compiler on an IBM RT running Mach.
Look for the #ifdef __HC__ and add a case for the 3B2....
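
For compilers that get no special case, a different technique from the
tradstdc.h change suggested above is to pick the type from the signedness of
plain char.  This is only a sketch: it assumes the compiler supplies
<limits.h> with CHAR_MIN, which an old compiler may not, and schar_sketch
and holds_minus_one() are hypothetical names, not nethack's.

```c
#include <limits.h>

#if CHAR_MIN < 0
typedef char schar_sketch;   /* plain char is already signed here */
#else
typedef short schar_sketch;  /* plain char is unsigned: fall back to short */
#endif

/* Returns 1 if the chosen type can hold the -1 list terminator. */
static int holds_minus_one(void)
{
    schar_sketch s = -1;
    return s == -1 && s < 0;
}
```

The cost is that the type becomes two bytes wide on machines where char is
unsigned, the same trade-off as the short int workaround.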
--
UUCP: {ucbvax,harvard}!cs.cmu.edu!ralf -=-=-=-=- Voice: (412) 268-3053 (school)
ARPA: ralf@cs.cmu.edu  BIT: ralf%cs.cmu.edu@CMUCCVMA  FIDO: Ralf Brown 1:129/46
FAX: available on request                      Disclaimer? I claimed something?

"For numerical analysis, there are theorems that are true, and theorems that
 are *really* true."  -- John Dennis (in Upson's Familiar Quotations)