[mod.amiga] An Explanation of Guru Meditation numbers

page@ulowell.UUCP (08/08/86)

[ This is a reprint of an old article.  The exact numbers have changed
a little with V1.2 of the system software (just that more distinctions
have been added), but this should help during those long nights while
you crash your machine.  --Bob]

                    "You Too Can Be An Amiga Guru!"
                           by Dave Boulton
               ('New & Improved' version as of 11 Feb 86)
 
 "Arrrrrrgh!" you snarl as the dreaded System Request box pops onto
 the Workbench screen:
 
    Software error - task held
    Finish ALL disk activity
    Select CANCEL to reset/debug
 
 You stare at the computerese for a moment, and then hit the CANCEL
 button. The current screen is pushed down and you find yourself
 staring at the orange and black finality of:
 
    Software Failure.          Press left mouse button to continue.
                Guru Meditation #02010009.00009310
 
 Many have wondered what the cryptic digits of the Guru Meditation
 Number were all about. Perhaps the flashing orange box has some
 strange mystical hypnotic powers. On the other hand, perhaps the
 number is some sort of digitized mantra which allows the Yogis of
 Los Gatos to attain perfect enlightenment. Whatever the
 interpretation was, it was beyond the reach of the average user.
 The most that the user is ever told is 'Not enough memory' or
 maybe 'Software Failure'.
 
 In fact, the Guru Meditation Number (or simply 'alert number')
 distills a lot of information about exactly what mishap has
 befallen your Amiga. To those able to decipher it, the alert
 number tells a great deal about who did what to whom as the
 machine crashed. Not that there is much that can be done about the
 situation after the fact; the information is mainly useful to help
 the Software Gurus in debugging their programs or in diagnosing
 what caused the fatal situation. It is a kind of post-mortem
 report, explaining why the patient died.
 
 But there are times when it would be valuable to the user to know
 precisely what happened to cause a crash. Other times, you may
 just be curious as to what was going on. All of the Alerts (the
 correct name for a Guru Meditation box) are defined in the header
 file called "exec/alerts" provided to software developers. What
 follows is an attempt at translating that information into a
 first-order approximation of English.
 
 Many of the error conditions that the Amiga OS detects are deeply
 intertwined with the various internal data structures and
 operating system calls. There isn't any way that I can define all
 of the terms used without reprinting most of the ROM Kernel Manual
 here in the newsletter. However, if you are familiar with the
 basics of how the system software works, then you can figure out a
 great deal about why a particular program has crashed.
 
      Specific error code -----+         +--- Task Address
                               |         |
                       02 01 0009 . 00009310       
                        |  |
      Subsystem number -+  +--- General error code
 
 An alert number is divided into several parts. The section to the
 right of the period is simply the hexadecimal RAM address of the
 task that was running when the error occurred. This helps tell
 someone who is debugging a program which of the many different
 programs running in the Amiga caused the problem. In the example
 given above, the running task was at 9310 hex. In this case that
 happens to mean that the error occurred in CLI process number 1.
 If I had been debugging a complex program which uses several
 tasks, this information would be useful. As it happens, all it
 tells me here is that the problem was in my software, not in any
 of the system tasks.
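
 As a minimal sketch of that first split (written in Python purely
 for illustration; the function name split_alert is my own and is
 not part of any Amiga software), the two halves of the number as
 it appears on screen can be pulled apart like this:

```python
import string


def split_alert(text):
    """Split 'XXXXXXXX.YYYYYYYY' into (alert_code, task_address) integers."""
    left, sep, right = text.strip().partition(".")
    if sep != "." or len(left) != 8 or len(right) != 8:
        raise ValueError("expected 8 hex digits, a period, 8 hex digits")
    if not all(ch in string.hexdigits for ch in left + right):
        raise ValueError("non-hexadecimal character in alert number")
    # Both halves are plain hexadecimal numbers.
    return int(left, 16), int(right, 16)


alert_code, task_address = split_alert("02010009.00009310")
print(f"alert code   = {alert_code:08X}")    # alert code   = 02010009
print(f"task address = {task_address:08X}")  # task address = 00009310
```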
 
 The left hand portion of the alert number is an encoded error
 number. There are several fields with different meanings:
 
 The first two digits tell which module of the operating system
 reported the error (this is technically known as the 'alert
 object' or a Subsystem ID). In the example above the alert object
 is 02, which tells me that the error was reported by the Graphics
 library. 
 
 The first digit can be encoded in a funny way. The 'most
 significant bit' of this digit says whether or not this alert is a
 'dead-end'; that is, does the system have a chance of recovering
 from the error or not. If the alert object had been given as 82
 instead of 02, then the alert would be a 'dead-end'.
 
 In practice, this is a pretty narrow distinction. By the time you
 have gotten to the Alert Box the system is in such dire straits
 that the only choices left are to reboot, or to enter the system
 debugger. The 'dead-end' bit could possibly cause some confusion
 if you aren't used to dealing in hexadecimal. If the first digit
 of the alert object is ever greater than 7 (hexadecimal) then
 subtract 8 from it. Thus if the first two digits were B1, then the
 alert is a dead-end error reported by the Workbench (Subsystem ID
 code 31). (B minus 8 equals 3, for those of you without the
 required 16 fingers!)
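
 For those who would rather let a computer do the subtraction, here
 is a small sketch in Python. The function name decode_alert_object
 is made up for this example; the only facts it relies on are the
 ones given above, namely that the top bit of the first digit is
 the 'dead-end' flag and the remaining seven bits are the Subsystem
 ID.

```python
def decode_alert_object(two_digits):
    """Return (is_dead_end, subsystem_id) for the first two hex digits."""
    value = int(two_digits, 16)       # e.g. "B1" becomes 0xB1
    is_dead_end = bool(value & 0x80)  # most significant bit of the first digit
    subsystem = value & 0x7F          # same as subtracting 8 from the first digit
    return is_dead_end, f"{subsystem:02X}"


print(decode_alert_object("B1"))  # (True, '31')
print(decode_alert_object("82"))  # (True, '02')
print(decode_alert_object("02"))  # (False, '02')
```

 Masking with 7F is just the 'subtract 8' rule written so that it
 also does the right thing when the first digit is 7 or less.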
 
 The Subsystem ID codes are as follows:
 
 Exec Library        01     Console Device      11
 Graphics Library    02     GamePort Device     12
 Layers Library      03     Keyboard Device     13
 Intuition Library   04     TrackDisk Device    14
 Math Library        05     Timer Device        15
 CList Library       06     CIA Resource        20
 DOS Library         07     Disk Resource       21
 RAM Library         08     Misc Resource       22
 Icon Library        09     BootStrap           30
 Audio Device        10     Workbench           31
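
 The table above can be turned into a simple lookup. The following
 Python sketch assumes the alert number has already been converted
 to a 32-bit integer; the names SUBSYSTEMS and subsystem_name are
 invented for this illustration. Note that the ID codes are
 hexadecimal, so 'Audio Device 10' is 0x10.

```python
# Subsystem ID codes, copied from the table in this article (hexadecimal).
SUBSYSTEMS = {
    0x01: "Exec Library",      0x11: "Console Device",
    0x02: "Graphics Library",  0x12: "GamePort Device",
    0x03: "Layers Library",    0x13: "Keyboard Device",
    0x04: "Intuition Library", 0x14: "TrackDisk Device",
    0x05: "Math Library",      0x15: "Timer Device",
    0x06: "CList Library",     0x20: "CIA Resource",
    0x07: "DOS Library",       0x21: "Disk Resource",
    0x08: "RAM Library",       0x22: "Misc Resource",
    0x09: "Icon Library",      0x30: "BootStrap",
    0x10: "Audio Device",      0x31: "Workbench",
}


def subsystem_name(alert_code):
    """Name the subsystem that reported a 32-bit alert code."""
    ident = (alert_code >> 24) & 0x7F  # top byte, with the dead-end bit dropped
    return SUBSYSTEMS.get(ident, f"unknown subsystem {ident:02X}")


print(subsystem_name(0x02010009))  # Graphics Library
print(subsystem_name(0xB1000000))  # Workbench
```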
 
 The next two digits specify the general type of error which has
 occurred. For many specialized types of errors this field is 00,
 instead of one of the general error codes below. This field is
 often very useful, since the user can easily tell such things as
 out-of-memory conditions, and missing libraries or device drivers
 (if you have deleted files from the LIBS or DEVS directories of
 your boot disk). In the example given above the general error code
 is 01, which means that the Graphics Library could not find enough
 free memory for something it was trying to allocate.
 
 The general error codes are:
 
 Insufficient memory   01    OpenDevice error    04
 MakeLibrary error     02    OpenResource error  05
 OpenLibrary error     03    I/O error           06
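
 The general error code occupies the third and fourth digits of the
 left-hand number, so it can be extracted with a shift and a mask.
 This Python sketch is only an illustration; GENERAL_ERRORS and
 general_error are names I have chosen, and a value of 00 is
 reported as None, since in that case only the specific error code
 carries any meaning.

```python
# General error codes, copied from the table in this article.
GENERAL_ERRORS = {
    0x01: "Insufficient memory",
    0x02: "MakeLibrary error",
    0x03: "OpenLibrary error",
    0x04: "OpenDevice error",
    0x05: "OpenResource error",
    0x06: "I/O error",
}


def general_error(alert_code):
    """Return the general error name for a 32-bit alert code, or None for 00."""
    field = (alert_code >> 16) & 0xFF  # digits three and four
    if field == 0:
        return None                    # a specialized error; no general code
    return GENERAL_ERRORS.get(field, f"unknown general error {field:02X}")


print(general_error(0x02010009))  # Insufficient memory
print(general_error(0x84000001))  # None
```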
 
 The last four digits of the alert number give specific information
 about exactly what error has occurred. The interpretation of the
 specific error code depends on which subsystem we are talking
 about. Each subsystem reuses the same values for the specific
 error code with different meanings. In our example the specific
 error code is 0009. Since we are talking about an error in the
 graphics library, we determine that the error is called
 'TextTmpRas' which means that a call to the Text() routine (trying
 to draw characters on the screen) ran out of memory when it tried
 to allocate memory for a TmpRas (temporary raster work area) data
 structure.
 
 If the error had been in a different subsystem (say the Intuition
 library) then the same error code of 0009 would have had a
 completely different interpretation (for Intuition it would mean
 that the Screen Type parameter to an OpenScreen call was not a
 valid type).
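
 Because the same specific code means different things to different
 subsystems, a lookup has to be keyed on both fields at once. The
 Python sketch below shows the idea with just three entries taken
 from the lists at the end of this article; the table is
 deliberately incomplete, and the names SPECIFIC_ERRORS and
 specific_error are hypothetical.

```python
# A few (subsystem ID, specific code) pairs from the lists in this article.
SPECIFIC_ERRORS = {
    (0x02, 0x0009): "TextTmpRas: text, no memory for TmpRas",
    (0x04, 0x0009): "SysScrnType: open sys screen, unknown type",
    (0x07, 0x0009): "BadChkSum: Invalid checksum",
}


def specific_error(alert_code):
    """Look up the specific error of a 32-bit alert code by subsystem."""
    subsystem = (alert_code >> 24) & 0x7F  # drop the dead-end bit
    specific = alert_code & 0xFFFF         # last four digits
    return SPECIFIC_ERRORS.get(
        (subsystem, specific),
        f"no entry for subsystem {subsystem:02X}, code {specific:04X}",
    )


print(specific_error(0x02010009))  # TextTmpRas: text, no memory for TmpRas
print(specific_error(0x84000009))  # SysScrnType: open sys screen, unknown type
```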
 
 There is one special case in dealing with Guru Meditation Numbers.
 Everything that we have discussed so far has to do with alerts
 that are detected and generated by the Amiga ROM Kernel. There is
 another case, which is when an alert is caused by a 68000
 processor exception (or 'trap'). Whenever a CPU trap occurs (for
 instance, an illegal opcode is executed) the Exec will generally
 cause an alert. A program can intercept this trap processing, and
 insert its own 'trap handler' to perform some other function, but
 usually these traps end up causing an alert. When this happens,
 the left hand part of the guru number will be a small value. The
 subsystem ID and the general error code will both be zero. The
 specific error code will be the 'trap number' of the trap that
 occurred. The trap numbers are part of the 68000 chip, and are not
 assigned by the ROM kernel like other error codes are.
 
 The following is a list of all the possible CPU traps. A few of
 these will never show up as an alert because they are always
 handled by the ROM Kernel. I list all of them here for
 completeness (and just in case I'm wrong, and they ever _do_ show
 up).
 
 Bus Error              02      Privilege Violation    08
 Address Error          03      Instruction Trace      09
 Illegal Instruction    04      Line A Emulation       0A
 Divide by Zero         05      Line F Emulation       0B
 CHK instruction        06      TRAP 0 ... 15          20 ... 2F
 TRAPV (Overflow)       07
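
 Recognizing a CPU trap is then a matter of checking that the upper
 four digits are all zero and looking up what is left. The Python
 sketch below assumes exactly that rule as described above; the
 names CPU_TRAPS and describe_trap are my own invention.

```python
# 68000 trap numbers, copied from the table in this article.
CPU_TRAPS = {
    0x02: "Bus Error",           0x08: "Privilege Violation",
    0x03: "Address Error",       0x09: "Instruction Trace",
    0x04: "Illegal Instruction", 0x0A: "Line A Emulation",
    0x05: "Divide by Zero",      0x0B: "Line F Emulation",
    0x06: "CHK instruction",
    0x07: "TRAPV (Overflow)",
}


def describe_trap(alert_code):
    """Return a trap description, or None if the code is not a CPU trap."""
    if alert_code >> 16 != 0:
        return None                     # subsystem or general code is set
    number = alert_code & 0xFFFF
    if 0x20 <= number <= 0x2F:
        return f"TRAP {number - 0x20}"  # the sixteen TRAP instructions
    return CPU_TRAPS.get(number, f"unknown trap {number:02X}")


print(describe_trap(0x00000003))  # Address Error
print(describe_trap(0x0000002F))  # TRAP 15
print(describe_trap(0x02010009))  # None
```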
 
 The following is a list of the specific error codes and a short
 comment about their meaning. The descriptions are a slightly
 edited version of the exec/alerts file, so they are very cryptic.
 
 Exec Library
 ExcptVect      81000001    CPU trap vector checksum
 BaseChkSum     81000002    ExecBase checksum error
 LibChkSum      81000003    library checksum failure
 LibMem         81000004    no memory to make library
 MemCorrupt     81000005    corrupted free memory list
 IntrMem        81000006    no memory for interrupt servers
 
 Graphics Library
 CopDisplay     82010001    copper display list, no memory
 CopInstr       82010002    copper instruction list, no mem.
 CopListOver    82000003    copper list too long
 CopIListOver   82000004    copper intermediate list too long
 CopListHead    82010005    copper list head, no memory
 LongFrame      82010006    long frame, no memory
 ShortFrame     82010007    short frame, no memory
 FloodFill      82010008    flood fill, no memory
 TextTmpRas     02010009    text, no memory for TmpRas
 BltBitMap      8201000A    BltBitMap, no memory
 
 Intuition Library
 GadgetType     84000001    unknown gadget type
 CreatePort     84010002    create port, no memory
 ItemAlloc      84010003    item plane alloc, no memory
 SubAlloc       84010004    sub alloc, no memory
 PlaneAlloc     84010005    plane alloc, no memory
 ItemBoxTop     84000006    item box top < RelZero
 OpenScreen     84010007    open screen, no memory
 OpenScrnRast   84010008    OpenScreen's AllocRast, no mem.
 SysScrnType    84000009    open sys screen, unknown type
 AddSWGadget    8401000A    add SW gadgets, no memory
 OpenWindow     8401000B    open window, no memory
 BadState       8400000C    Bad State Return entering Int.
 BadMessage     8400000D    Bad Message received by IDCMP
 WeirdEcho      8400000E    Weird echo causing problem
 NoConsole      8400000F    couldn't open the Console Device
 
 DOS Library
 StartMem       07010001    no memory at startup
 EndTask        07000002    EndTask didn't
 QPktFail       07000003    Qpkt failure
 AsyncPkt       07000004    Unexpected packet received
 FreeVec        07000005    Freevec failed
 DiskBlkSeq     07000006    Disk block sequence error
 BitMap         07000007    Bitmap corrupt
 KeyFree        07000008    Key already free
 BadChkSum      07000009    Invalid checksum
 DiskError      0700000A    Disk Error
 KeyRange       0700000B    Key out of range
 BadOverlay     0700000C    Bad overlay
 
 TrackDisk Device
 TDCalibSeek    14000001    calibrate: seek error
 TDDelay        14000002    delay: error on timer wait
 
 Timer Device
 TMBadReq       15000001    bad request
 
 Disk Resource
 DRHasDisk      21000001    get unit: already has disk
 DRIntNoAct     21000002    interrupt: no active unit
 
 BootStrap
 BootError      30000001    boot code returned an error
 