[comp.lang.lisp] Problems porting KCL/AKCL

clark@b14.ingr.com (Clark Williams) (12/17/88)

I am trying to bring up AKCL on an Intergraph Interpro 240 workstation. This
machine runs System V release 3.1 on the Intergraph Clipper chip set.

AKCL is dying while trying to load code, prior to dumping a saved_kcl. I have
dug deep enough to find that the problem starts when raw_kcl is loading the
file autoload.lsp, and occurs at the following bit of code:

	 (defvar *type-list*
	     '(cons
	     fixnum bignum ratio short-float long-float complex
	     character symbol package hash-table
	     array vector string bit-vector
	     structure stream random-state readtable pathname
	     cfun cclosure spice))

The output from an attempt to build KCL is:

	     KCl (Kyoto Common Lisp)  June 1987  8192 pages
	     In Autoload.lsp
	     NEXT STATEMENT is (defvar *type-list*)!!!
	     Error: #<OBJNULL> is not of type SYMBOL.
	     Error signalled by SYSTEM:*MAKE-SPECIAL.

	     Error: 0 is an illegal frs index.
	     Error signalled by SYSTEM:UNIVERSAL-ERROR-HANDLER.

     [ Many instances of the above error deleted ]

	     Unrecoverable error: bind stack overflow.
	     sh: 1947 abort - core dumped

However, if I change the 'defvar' to look like this:

	    (defvar *type-list*)
	    (setq *type-list* 
		    '(cons
		      fixnum bignum ratio short-float long-float complex
		      character symbol package hash-table
		      array vector string bit-vector
		      structure stream random-state readtable pathname
		      cfun cclosure spice))

I get the following DIFFERENT error output:

	    KCl (Kyoto Common Lisp)  June 1987  8192 pages
	    In Autoload.lsp
	    NEXT STATEMENT is (defvar *type-list*)!!!

	    >#<"COMPILER" package>

	    COMPILER>#<"SYSTEM" package>

	    SYSTEM>#<"USER" package>

	    >#<"LISP" package>

	    LISP>#<"SLOOP" package>

	    SLOOP>#<"USER" package>

	    >
	    Error: Cannot find the external symbol *SYSTEM-DIRECTORY* in #<"SYSTEM" package>.
	    Error signalled by READ.

	    Broken at READ.  Type :H for Help.

     [ Many other strange and wonderful error messages deleted ]



Has ANYONE ever seen this sort of error when porting KCL/AKCL? I do not have 
a clue as to where to begin looking for the problem. Hash table routines?
Package routines? Ouija board routines? Oops, I mean binding routines?
I would appreciate any suggestions (even the ones telling me to junk this 
mother! ;-)


	
-- 
                        _   /|       |    Clark Williams
          Ack Thippfft! \'o.O`       |    Intergraph Corp.
                        =(___)=      |    ...uunet!ingr!b14!clark
                           U         |    (205) 772-6881

jeff@aipna.ed.ac.uk (Jeff Dalton) (12/19/88)

In article <130@b14.ingr.com> clark@b14.ingr.com (Clark Williams) writes:
>I am trying to bring up AKCL on an Intergraph Interpro 240 workstation. This
>machine runs System V release 3.1 on the Intergraph Clipper chip set.

I've never ported AKCL, but I have dealt with KCL.  AKCL may have
somewhat different problems, but I should still suspect the same
parts of the system.

>Has ANYONE ever seen this sort of error when porting KCL/AKCL? I do not have 
>a clue as to where to begin looking for the problem. Hash table routines?
>Package routines? Ouija board routines? Oops, I mean binding routines?
>I would appreciate any suggestions (even the ones telling me to junk this 
>mother! ;-)

Most of KCL is very portable.  I've ported it twice without having any
trouble with the compiler, interpreter, most built-in functions, and
so on.  However, the system-dependent parts of KCL such as memory
allocation, si:save-system (i.e., unexec), and compiled-file loading
can present problems that often have non-obvious manifestations.  Some
of this code doesn't do all the error checking it might.  For example,
I spent a long time tracking down a problem that produced fairly
unenlightening errors only to find that the data segment size limits
for the process were too small.  The first attempt to allocate memory
failed, but the code went on as if it had worked.
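
As a rough illustration of the kind of check that was missing, here is
a minimal C sketch.  It is not taken from the KCL sources: the helper
names report_data_limit and alloc_or_die are hypothetical, and it
assumes a system that provides getrlimit() with RLIMIT_DATA (on a
System V release 3 machine the equivalent query may be ulimit(2)
instead).  The point is only that a failed allocation is reported at
once, together with the data segment limit, rather than ignored.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/resource.h>

/* Hypothetical helper, not part of KCL: print the soft limit on the
 * data segment size.  Returns 0 on success, -1 if the query fails. */
int report_data_limit(FILE *out)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_DATA, &rl) != 0) {
        fprintf(out, "getrlimit failed: %s\n", strerror(errno));
        return -1;
    }
    if (rl.rlim_cur == RLIM_INFINITY)
        fprintf(out, "data segment soft limit: unlimited\n");
    else
        fprintf(out, "data segment soft limit: %lu bytes\n",
                (unsigned long)rl.rlim_cur);
    return 0;
}

/* Hypothetical helper, not part of KCL: allocate nbytes or stop with
 * a diagnostic, instead of carrying on with a null pointer. */
void *alloc_or_die(size_t nbytes)
{
    /* malloc(0) may legitimately return NULL, so ask for at least 1. */
    void *p = malloc(nbytes ? nbytes : 1);

    if (p == NULL) {
        fprintf(stderr, "cannot allocate %lu bytes\n",
                (unsigned long)nbytes);
        report_data_limit(stderr);
        exit(1);
    }
    return p;
}
```

Calling report_data_limit(stderr) once at startup, before the first
large allocation, would have pointed at the limit directly in the case
described above.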

I haven't had a problem exactly like yours, but my suspicions would
focus on the initialization code in general and the code that looks
at memory layout in particular.  First see if raw_kcl runs by itself.
See if you can input symbols and lists and have them printed out.
See if symbols and lists have the right types (i.e., try consp,
symbolp, etc).  This is not really necessary, though, because if
the types aren't right the printer should have already gotten
confused.

The next thing to test is garbage collection.  Try calling GBC (with
arg T or NIL) in raw_kcl and see if it still works after.  I wouldn't
expect any bugs in the garbage collector itself, but if something's
wrong with the memory setup it may show up after a GC.  (And make sure
the code in bitop.c is right.)

While you're at it, you might try SI:SAVE-SYSTEM in raw_kcl
just to see if it works.

If everything works this far, the problem might still involve memory
management or garbage collection, but be subtle enough that it's
hard to provoke.  Then you'll probably have to follow
through the whole initialization to see when it goes wrong.  It
may help to (SETQ SI::*NOTIFY-GBC* T) before loading init_kcl.lsp.

The files I had to change when porting KCL are listed below.
The ones followed by "?" are unlikely to apply in your case,
and there may be some variation in the "unix" ones on a Sys
V machine.  But, apart from those qualifications, I would
be surprised if other files were involved.

alloc.c       earith.c      include.h ?   unixfasl.c
bitop.c       eval.c ?      main.c        unixint.c
cfun.c ?      frame.c ?     num_co.c      unixsave.c

Anyway, the evidence you cite in your message seems consistent
with my idea of where the problem(s) might be:

>AKCL is dying while trying to load code, prior to dumping a saved_kcl. I have
>dug deep enough to find that the problem starts when raw_kcl is loading the
>file autoload.lsp

I think that's one of the first .lsp files loaded.  It may be that it
causes a garbage collection.  It's been a while since I thought about
KCL internals, though.

>However, if I change the 'defvar' to look like this:

This also suggests some sort of allocation problem.   By making a
trivial change, you could cause garbage collection (or new memory
allocation) to occur at a slightly different point.

	     Error: #<OBJNULL> is not of type SYMBOL.
	     Error signalled by SYSTEM:*MAKE-SPECIAL.

	     Error: 0 is an illegal frs index.
	     Error signalled by SYSTEM:UNIVERSAL-ERROR-HANDLER.

If I recall correctly, #<OBJNULL> is a null pointer.  The frs index
error for zero probably indicates that frs_org or frs_top is wrong.
(That is, I think 0 should be legal.)  (Note too that small integers
such as zero are a special case as objects go because they're
preallocated.  They may therefore be less likely to be zapped
when things go wrong.)
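
To make the frs reasoning concrete, here is a small C sketch.  It
assumes the index check is a simple range test between a base pointer
and a top pointer; the names org and top only model the frs_org and
frs_top mentioned above, and struct frame and frs_index_ok are
hypothetical, not copied from the KCL sources.  Under that assumption,
index 0 is rejected only when top has ended up below org, which is why
the error hints at a bad frs_org or frs_top.

```c
#include <stddef.h>

/* Stand-in for a frame stack entry; the real layout does not matter
 * for the range test. */
struct frame {
    int placeholder;
};

/* Hypothetical check: index i is legal if org + i lies between org
 * and top inclusive.  Returns 1 if legal, 0 otherwise.  Both pointers
 * must point into the same array. */
int frs_index_ok(const struct frame *org, const struct frame *top, long i)
{
    ptrdiff_t depth = top - org;   /* negative if top is below org */

    if (i < 0)
        return 0;
    return i <= (long)depth;
}
```

With a sane stack (top at or above org) index 0 passes; with top one
slot below org, even index 0 fails.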

I hope this message is some help and doesn't just suggest things
you've already done.  I'd be interested in knowing how it turns out.

Jeff Dalton,                      JANET: J.Dalton@uk.ac.ed             
AI Applications Institute,        ARPA:  J.Dalton%uk.ac.ed@nss.cs.ucl.ac.uk
Edinburgh University.             UUCP:  ...!ukc!ed.ac.uk!J.Dalton