clark@b14.ingr.com (Clark Williams) (12/17/88)
I am trying to bring up AKCL on an Intergraph Interpro 240 workstation. This
machine runs System V release 3.1 on the Intergraph Clipper chip set.

AKCL is dying while trying to load code, prior to dumping a saved_kcl. I have
dug deep enough to find that the problem starts when raw_kcl is loading the
file autoload.lsp, and occurs at the following bit of code:

    (defvar *type-list*
            '(cons
              fixnum bignum ratio short-float long-float complex
              character symbol package hash-table
              array vector string bit-vector
              structure stream random-state readtable pathname
              cfun cclosure spice))

The output from an attempt to build KCL is:

    KCl (Kyoto Common Lisp)  June 1987  8192 pages
    In Autoload.lsp NEXT STATEMENT is (defvar *type-list*)!!!

    Error: #<OBJNULL> is not of type SYMBOL.
    Error signalled by SYSTEM:*MAKE-SPECIAL.

    Error: 0 is an illegal frs index.
    Error signalled by SYSTEM:UNIVERSAL-ERROR-HANDLER.

    [ Many instances of the above error deleted ]

    Unrecoverable error: bind stack overflow.
    sh: 1947 abort - core dumped

However, if I change the 'defvar' to look like this:

    (defvar *type-list*)
    (setq *type-list*
          '(cons
            fixnum bignum ratio short-float long-float complex
            character symbol package hash-table
            array vector string bit-vector
            structure stream random-state readtable pathname
            cfun cclosure spice))

I get the following DIFFERENT error output:

    KCl (Kyoto Common Lisp)  June 1987  8192 pages
    In Autoload.lsp NEXT STATEMENT is (defvar *type-list*)!!!

    >#<"COMPILER" package>
    COMPILER>#<"SYSTEM" package>
    SYSTEM>#<"USER" package>
    >#<"LISP" package>
    LISP>#<"SLOOP" package>
    SLOOP>#<"USER" package>
    >
    Error: Cannot find the external symbol *SYSTEM-DIRECTORY* in #<"SYSTEM" package>.
    Error signalled by READ.
    Broken at READ.  Type :H for Help.

    [ Many other strange and wonderful error messages deleted ]

Has ANYONE ever seen this sort of error when porting KCL/AKCL? I do not have
a clue as to where to begin looking for the problem. Hash table routines?
Package routines? Ouija board routines? Oops, I mean binding routines?
I would appreciate any suggestions (even the ones telling me to junk this
mother! ;-)

--
               _ /|    | Clark Williams
Ack Thippfft!  \'o.O`  | Intergraph Corp.
               =(___)= | ...uunet!ingr!b14!clark
                  U    | (205) 772-6881
jeff@aipna.ed.ac.uk (Jeff Dalton) (12/19/88)
In article <130@b14.ingr.com> clark@b14.ingr.com (Clark Williams) writes:
>I am trying to bring up AKCL on an Intergraph Interpro 240 workstation. This
>machine runs System V release 3.1 on the Intergraph Clipper chip set.

I've never ported AKCL, but I have dealt with KCL. AKCL may have somewhat
different problems, but I would still suspect the same parts of the system.

>Has ANYONE ever seen this sort of error when porting KCL/AKCL? I do not have
>a clue as to where to begin looking for the problem. Hash table routines?
>Package routines? Ouija board routines? Oops, I mean binding routines?
>I would appreciate any suggestions (even the ones telling me to junk this
>mother! ;-)

Most of KCL is very portable. I've ported it twice without having any
trouble with the compiler, interpreter, most built-in functions, and so on.
However, the system-dependent parts of KCL, such as memory allocation,
si:save-system (i.e., unexec), and compiled-file loading, can present
problems that often have non-obvious manifestations.

Some of this code doesn't do all the error checking it might. For example,
I spent a long time tracking down a problem that produced fairly
unenlightening errors, only to find that the data segment size limits for
the process were too small. The first attempt to allocate memory failed,
but the code went on as if it had worked.

I haven't had a problem exactly like yours, but my suspicions would focus
on the initialization code in general and the code that looks at memory
layout in particular.

First see if raw_kcl runs by itself. See if you can input symbols and lists
and have them printed out. See if symbols and lists have the right types
(i.e., try consp, symbolp, etc.). This is not really necessary, though,
because if the types aren't right the printer should have already gotten
confused.

The next thing to test is garbage collection. Try calling GBC (with arg T
or NIL) in raw_kcl and see if it still works afterwards.
I wouldn't expect any bugs in the garbage collector itself, but if
something's wrong with the memory setup it may show up after a GC. (And
make sure the code in bitop.c is right.)

While you're at it, you might try SI:SAVE-SYSTEM in raw_kcl just to see if
it works.

If all works this far, the problem still might involve memory management or
garbage collection but be somewhat subtle so that it's hard to make it
misbehave. Then you'll probably have to follow through the whole
initialization to see when it goes wrong. It may help to
(SETQ SI::*NOTIFY-GBC* T) before loading init_kcl.lsp.

The files I had to change when porting KCL are listed below. The ones
followed by "?" are unlikely to apply in your case, and there may be some
variation in the "unix" ones on a Sys V machine. But, apart from those
qualifications, I would be surprised if other files were involved.

    alloc.c       earith.c      include.h ?   unixfasl.c
    bitop.c       eval.c ?      main.c        unixint.c
    cfun.c ?      frame.c ?     num_co.c      unixsave.c

Anyway, the evidence you cite in your message seems consistent with my idea
of where the problem(s) might be:

>AKCL is dying while trying to load code, prior to dumping a saved_kcl. I have
>dug deep enough to find that the problem starts when raw_kcl is loading the
>file autoload.lsp

I think that's one of the first .lsp files loaded. It may be that it causes
a garbage collection. It's been a while since I thought about KCL
internals, though.

>However, if I change the 'defvar' to look like this:

This also suggests some sort of allocation problem. By making a trivial
change, you could cause garbage collection (or new memory allocation) to
occur at a slightly different point.

    Error: #<OBJNULL> is not of type SYMBOL.
    Error signalled by SYSTEM:*MAKE-SPECIAL.

    Error: 0 is an illegal frs index.
    Error signalled by SYSTEM:UNIVERSAL-ERROR-HANDLER.

If I recall correctly, #<OBJNULL> is a null pointer. The frs index error
for zero probably indicates that frs_org or frs_top is wrong.
(That is, I think 0 should be legal.) (Note too that small integers such as
zero are a special case as objects go because they're preallocated. They
may therefore be less likely to be zapped when things go wrong.)

I hope this message is some help and doesn't just suggest things you've
already done. I'd be interested in knowing how it turns out.

Jeff Dalton,                      JANET: J.Dalton@uk.ac.ed
AI Applications Institute,        ARPA:  J.Dalton%uk.ac.ed@nss.cs.ucl.ac.uk
Edinburgh University.             UUCP:  ...!ukc!ed.ac.uk!J.Dalton
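
[Editorial sketch, not part of the original post.] The failure described
above, where the first memory allocation failed because the data segment
limit was too small and the code carried on regardless, is the kind of
thing a checked allocation wrapper makes visible. The C below is only an
illustration under stated assumptions: checked_alloc and
data_segment_limit are hypothetical names, not KCL functions, and the code
uses malloc() and the POSIX getrlimit() call as found on a present-day
system rather than whatever alloc.c did on the machines in this thread.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/resource.h>

/* Hypothetical helper (not from KCL): allocate `bytes` and report failure
   to the caller instead of continuing with a null pointer.
   Returns 0 on success, -1 on failure. */
static int checked_alloc(size_t bytes, void **out)
{
    assert(out != NULL);
    assert(bytes > 0);   /* malloc(0) may legitimately return NULL */

    *out = malloc(bytes);
    if (*out == NULL) {
        fprintf(stderr, "allocation of %lu bytes failed\n",
                (unsigned long)bytes);
        return -1;
    }
    return 0;
}

/* Hypothetical helper: fetch the soft data-segment limit of this process.
   The value may be RLIM_INFINITY, meaning no limit is set.
   Returns 0 on success, -1 if getrlimit() itself fails. */
static int data_segment_limit(rlim_t *limit)
{
    struct rlimit rl;

    assert(limit != NULL);
    if (getrlimit(RLIMIT_DATA, &rl) != 0) {
        perror("getrlimit");
        return -1;
    }
    *limit = rl.rlim_cur;
    return 0;
}
```

A port could call something like data_segment_limit once at startup and
compare the result with the heap size it intends to claim, and route every
early allocation through a wrapper like checked_alloc, so that a limit
that is too small produces one clear message instead of later errors about
OBJNULL or frs indices.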