[comp.sys.ibm.pc.rt] Need help with kernel message from AIX 2.2.1 !!!

jason@cs.utexas.edu (Jason Martin Levitt) (07/16/90)

    I keep getting this message from the AIX 2.2.1 kernel:

  Core failure: 255

    No one responded to my posting a couple of weeks ago about
this message. I'm really worried that my machine may fail abruptly
since I seem to be getting the message more frequently now. 
    Can someone at IBM *please* grep through the AIX 2.2.1 kernel
source code and find this string? 

        --Jason     jason@cs.utexas.edu   (512) 326-9102

jeffe@sandino.austin.ibm.com (Peter Jeffe 512.823.4091) (07/28/90)

In article <212@gort.cs.utexas.edu> jason@cs.utexas.edu (Jason Martin Levitt) writes:
>    I keep getting this message from the AIX 2.2.1 kernel:
>  Core failure: 255

This means that core() couldn't create the core file.  The most likely reason
is that it doesn't have enough room on the disk.  Look for an old core file
that may indicate what's dumping, or else liberate some space and look for
it when next it occurs.
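
A rough sketch of the free-space check suggested above, written against the modern POSIX statvfs() interface. That call is an assumption on my part and may not exist on AIX 2.2.1, and free_bytes() is a hypothetical helper name, not anything from the kernel:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/statvfs.h>

/* Hypothetical helper: bytes available to unprivileged users on the
   filesystem that contains `path`, or -1 if statvfs() fails. */
long long free_bytes(const char *path)
{
    struct statvfs vfs;

    assert(path != NULL);
    if (statvfs(path, &vfs) != 0)
        return -1;
    /* f_bavail counts free blocks in units of f_frsize bytes. */
    return (long long)vfs.f_bavail * (long long)vfs.f_frsize;
}
```

Calling free_bytes(".") from the directory where the process would dump gives a quick idea of whether a core file could fit there.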
-------------------------------------------------------------------------------
Peter Jeffe   ...uunet!cs.utexas.edu!ibmchs!auschs!sandino.austin.ibm.com!jeffe
        first they want a disclaimer, then they make you pee in a jar,
                   then they come for you in the night

jason@cs.utexas.edu (Jason Martin Levitt) (07/28/90)

In article <2949@awdprime.UUCP> jeffe@sandino.austin.ibm.com (Peter Jeffe 512.823.4091) writes:
>In article <212@gort.cs.utexas.edu> jason@cs.utexas.edu (Jason Martin Levitt) writes:
>>    I keep getting this message from the AIX 2.2.1 kernel:
>>  Core failure: 255
>
>This means that core() couldn't create the core file.  The most likely reason
>is that it doesn't have enough room on the disk.  Look for an old core file
>that may indicate what's dumping, or else liberate some space and look for
>it when next it occurs.

  There are no core files on the machine. Also, I changed /etc/skulker a
long time ago so that it wouldn't remove core files left lying around. 
  I've already tried dumping core in a number of ways and it always
succeeds [so what else is new? :-)  ]
  I don't think that disk space is the issue since I haven't been close
to filling up in a long time.
  But, I tested your hypothesis by filling up a partition and then having
a program dump core. It gave the message:

   Core failure: 28

  A promising start.

 I stopped thinking that the 255 error was critical
after I was told that it was coming from core.c. Now, I'm just curious as
hell. 
 Since AIX 2.2.1 is going the way of the Dodo bird and Multics,
I didn't think I'd get a lot of response, but I never realized people
would be *this* apathetic.  :-\

   Thanks for the tip; maybe someone will take the time to figure out
what's going on in core.c ...

     peace,

     ---Jason    jason@cs.utexas.edu   (512) 326-9102

au0005@dundee.austin.ibm.com (08/03/90)

In article <212@gort.cs.utexas.edu>, jason@cs.utexas.edu (Jason Martin Levitt) writes:
> From: jason@cs.utexas.edu (Jason Martin Levitt)
> Subject: Need help with kernel message from AIX 2.2.1 !!!
> Date: 16 Jul 90 09:44:15 GMT
> 
> 
>     I keep getting this message from the AIX 2.2.1 kernel:
> 
>   Core failure: 255
> 
>     No one responded to my posting a couple of weeks ago about
> this message. I'm really worried that my machine may fail abruptly
> since I seem to be getting the message more frequently now. 
>     Can someone at IBM *please* grep through the AIX 2.2.1 kernel
> source code and find this string? 
> 
>         --Jason     jason@cs.utexas.edu   (512) 326-9102

Hello Jason,

Yes, this message does come from core.c. The value printed comes from
u.u_error, after a failure to dump a core file for some specific reason.

All the calls that core.c makes should return a proper error code,
like your 28, which is ENOSPC.
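
A small sketch of how a printed code can be mapped back to its errno name, using the standard <errno.h> constants. ENOSPC is 28 on the Unix systems I am aware of, but the numbers are not guaranteed to match everywhere. The choice of which codes to list is my own guess at what a core dump might hit, and errno_name() and errno_text() are hypothetical helpers:

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical helper: symbolic name for a few errno values that a
   failed core dump might plausibly report; others map to "unknown". */
const char *errno_name(int code)
{
    assert(code >= 0);          /* errno values are non-negative */
    switch (code) {
    case ENOSPC: return "ENOSPC";   /* no space left on device */
    case EFBIG:  return "EFBIG";    /* file too large */
    case EACCES: return "EACCES";   /* permission denied */
    case EIO:    return "EIO";      /* I/O error */
    default:     return "unknown";
    }
}

/* The C library's own description of the same code. */
const char *errno_text(int code)
{
    return strerror(code);
}
```

A value such as 255 falls through to "unknown", which is a hint that it did not come from one of the ordinary error paths.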

However, one of the first things that core.c does is grab a bunch of
space from the stack, so that he can parse a filename in there for the
core file.

He does this using the copyout() function, which only returns -1 on
failure. Since u.u_error is of type char (which is unsigned), the -1
is stored as 255, and this shows as "Core failure: 255".
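
A minimal user-space sketch of that conversion, assuming 8-bit chars. It is not the code from core.c: u_error_t, store_error() and format_core_failure() are hypothetical names standing in for u.u_error and the kernel's message:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for u.u_error, described above as an
   unsigned char. */
typedef unsigned char u_error_t;

/* Store a return code in the error field. Converting the int -1 to
   an 8-bit unsigned type wraps it around to 255. */
u_error_t store_error(int rc)
{
    return (u_error_t)rc;
}

/* Build the console message from the stored value. Returns the
   character count reported by snprintf(). */
int format_core_failure(char *buf, size_t len, int rc)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "Core failure: %d", store_error(rc));
}
```

Passing -1 produces "Core failure: 255", while a real error code such as 28 passes through unchanged.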

Basically, if you get a 255 for an error code, you have a problem with
the stack OR this filename, i.e., no stack space is available, the
filename causes the stack to grow beyond its limits, etc.

Hope that this helps.

Regards, 

Peter May.


#include <standard.disclaimer>
/*  My Comments are my own: I do not represent IBM. */

jason@cs.utexas.edu (Jason Martin Levitt) (08/03/90)

In article <3283@d75.UUCP> au0005@dundee.austin.ibm.com () writes:
> [answer deleted]
>Hope that this helps.
>
>Regards,
>Peter May.
>
>#include <standard.disclaimer>
>/*  My Comments are my own: I do not represent IBM. */

  Thanks Peter. I really appreciate the time and energy you took
to retrieve that answer. I wouldn't use this net bandwidth to
thank you except I can't get an email msg to you with the address
info provided. I don't understand the brain-dead topology of the mail
machines at IBM in Austin, but I do recall that the last mail msg
I sent there required about 6 "!"'s and that's quite a few considering
that the machine I'm typing this on calls IBM directly. 8-\

        peace,

       ---Jason   jason@cs.utexas.edu