[net.lang.c] Recoding Lisp programs in C

michaelm@bcsaic.UUCP (michael b maxwell) (10/02/85)

The common wisdom is that AI-type applications should be developed in an AI 
language such as Lisp or Prolog, for obvious reasons.  At the same time, 
it is frequently asserted that a mature Lisp program can be recoded in C 
for increased speed and decreased size.

My question is, how much faster, and how much smaller?

Obviously this depends on many things.  Let's assume for the purposes of
discussion reasonably good Lisp and C compilers.  (e.g. Frantz Liszt, etc.;
I don't "speak" C, so insert your favorite C compiler here!)  Let's also
assume you've made what modifications in your Lisp program you can to speed
it up, e.g. doing (sstatus translink on), using localf (assuming that
works, which it doesn't on my Sun...), setting up the allocation of space
for different forms of data so as to optimize garbage collection, etc.
Likewise whatever optimizations you can make to the C version.

The improvements gained by recoding in C also obviously depend on what kind 
of a program you're recoding; so what programs are helped most?  
Pointer-intensive ones??  Arithmetic-intensive ones?

On a large program, recoding the entire thing in C is probably not
worthwile (I would guess).  How can you tell what parts would best benefit
from translation?  (I assume here a Lisp like Frantz, which allows
calling of C functions from Lisp code.)  Or does this really gain you
anything?

Have many people really done the translation?  Or is it one of those things 
that everyone talks about, but no one does anything about? :-)  What gains 
did you realize?  Are there any published studies?
-- 
Mike Maxwell
Boeing Artificial Intelligence Center
	..uw-beaver!{uw-june,ssc-vax}!bcsaic!michaelm

freeman@spar.UUCP (Jay Freeman) (10/04/85)

[]

There's another issue beside time and speed, having to do with real-time
systems.  A current net discussion bewails the plight of the poor F-16
pilot bounced by a Foxbat, when his or her Ada-coded countermeasures system
raises an error message due to run-time range checking.  But imagine the
plight of the same pilot when the revised fire-control system -- rewritten
in Lisp -- says "Garbage Collection" and goes to sleep for a while.

(There are Lisp implementations that do incremental garbage collection, of
course.)
-- 
Jay Reynolds Freeman (Schlumberger Palo Alto Research)(canonical disclaimer)

gwyn@BRL.ARPA (VLD/VMB) (10/06/85)

If the code needed to be done in LISP in the first place,
then any C translation is going to have to provide garbage
collection anyway.  I once toyed with the idea of providing
a set of C library routines to support LISPy programming,
but it seemed too weird to me.  Why not just use a good
LISP compiler instead, especially since the direct translation
into C will be much less maintainable than the LISP source.

Personally I think AI has been oversold.

preece@ccvaxa.UUCP (10/07/85)

> /* Written  8:39 pm  Oct  5, 1985 by gwyn@BRL.ARPA in ccvaxa:net.lang.c
> */ If the code needed to be done in LISP in the first place, then any C
> translation is going to have to provide garbage collection anyway.
----------
The key word there is "needed."  it may well be that Lisp provided
a more efficient environment for developing the application even if
the application didn't rely on especially Lisp-ish features like
garbage collection, type mutability, and symbols.
----------
> I once toyed with the idea of providing a set of C library routines to
> support LISPy programming, but it seemed too weird to me.
----------
If a C environment with similar capabilities for dynamic development
were available, many such applications might be written in C to
start with.
----------
> Why not just use a good LISP compiler instead, especially since the
> direct translation into C will be much less maintainable than the LISP
> source.
----------
The Lisp environment is likely to be loaded down with features that
are not strictly needed for the running of the application.  Lisp
compilers generally produce code that runs within the Lisp environment,
not as stand-alone applications (some do).

Of course, the translation to C allows the developer an obvious
opportunity to rework the application to preserve needed features
while eliminating unneeded flexibility.  I suspect the rewriting
is as important as the change in language.


-- 
scott preece
gould/csd - urbana
ihnp4!uiucdcs!ccvaxa!preece

richw@ada-uts.UUCP (10/07/85)

>> ***** ada-uts:net.lang.c / brl-tgr!gwyn /  9:39 pm  Oct  5, 1985
>> If the code needed to be done in LISP in the first place,
>> then any C translation is going to have to provide garbage
>> collection anyway.

What about having the C version be conscientious about using "malloc"
AND "free"?  Though many people I know feel that dynamic storage
allocation is a "detail" that programmers need never worry much about,
I wonder if making sure to "free" anything you "malloc" is all that
hard.

I wrote my own slightly modified versions of "malloc" and "free" which
remember how many allocated blocks haven't been freed, and then prints
out an error message at the end of the program (I also modified "exit"
to do this) if all allocated blocks haven't been freed.  Unfortunately,
I haven't had time to use these versions in recent work.

Has anyone out there REALLY used "free"?  I'm very curious about this...

>> Personally I think AI has been oversold.

I'd like to shake your hand.  I have my own opinions about AI that I
won't (and shouldn't) flame about, but KEEP THAT DISRESPECT FLOWING !!!

-- Rich

gww@aphasia.UUCP (George Williams) (10/11/85)

> Personally I think AI has been oversold.

No, no, no. People seem to think AI => LISP. *That* has been oversold, real AI
hasn't been developed yet and so can't be sold.

(If I have posted this twice, I apologize, I got a swap error somewhere in
posting).

				George Williams
				decvax!frog!aphasia!gww

God bless you, in case you sneeze.

matt@prism.UUCP (10/12/85)

/* Written 12:58 pm  Oct  7, 1985 by richw@ada-uts in prism:net.lang.c */
> 
> What about having the C version be conscientious about using "malloc"
> AND "free"?  Though many people I know feel that dynamic storage
> allocation is a "detail" that programmers need never worry much about,
> I wonder if making sure to "free" anything you "malloc" is all that
> hard.
> 
> Has anyone out there REALLY used "free"?  I'm very curious about this...

Ask anyone who has ever developed a piece of commercial code for a small
machine.  (Yes, Virginia, there are people who develop applications for IBM
PC's...)  I, for example, am currently working on a product that will operate
on potentially megabytes of data, and must run in 256 to 512 K of physical
memory, with no help from virtual memory.  You'd better believe we're careful
to match every malloc with an appropriate free!


-------------------------------------------------------------------------------
 Matt Landau      	   ARPA: matt%mirror@cca
 Mirror Systems, Inc.	   UUCP: {decvax!cca, ima!inmet, mit-eddie, wjh12}...
 Cambridge, MA						   ...mirror!prism!matt
-------------------------------------------------------------------------------
Blessed are they that run around in circles, for they shall be known as wheels.

petera@hcrvax.UUCP (Smith) (10/12/85)

	One of the problems of using malloc() and free() is that you may not 
be quite sure when you can free something. Having written a farily large Lisp
interpreter I was faced with this problem in just about every procedure. "What
is now free and what must I keep?" The answer is not easy because in a program
which is highly recursive, ie lots of procedures each of which can call a large
subset of the rest, it is very hard, if not impossible to know that item 'x'
is now free. My response to the question "why not recode in C?" is that reliable
garbage collection is hard and you are going to have to write it if the
program is fairily complex. One interesting aspect of 'mark and gather' garbage
collection in a recursive environment is that you have to mark all references
made by local variables in activated procedures. This means that you must run
down a stack of pointers to local variables and mark recursively all lists that
they refer to. For simple programs the cost of constructing this mark stack
is quite high and it has to be done for just about every recursive function.
This means that simple programs (where you know you are not going to run out
of memory doing one 'evaluation') are probably more effecient when written in C.

			Peter Ashwood-Smith
			Human Computing Resources,
			Toronto Ontario.

bilbo.barry@ucla-locus.ARPA (Barry Gold) (10/18/85)

Paul Eggert at UCSB found a neat hack for doing mark and gather garbage
collectors in a language like c:

run through the c stack (yes, it's not portable, but you only have to
rewrite a few (~10) lines to port it to most implementations):  pretend
everything you find there is a pointer to the stuff you're trying to
collect.  Now check each such pointer to make sure it actually points into
the area(s) you're collecting and is properly aligned.  If not, ignore it;
if it appears OK, mark the thing it points to.

(This works only if the 'type' of a lisp (or similar) object can be determined
from its address/the 'pointer' to it.)

When you're done, you'll have marked everything you should have marked
(because they're all in 'c' variables somewhere on the stack--except for
global or "special" variables, which you should keep in an array somewhere).
You'll also have marked a few things that are garbage--that should be thrown
away/freed/put on the available space list.  Paul claims that this 
falsely marked garbage will amount to less than 10% of the collectable
garbage.

So you'll hold on to everything you should hold on to, and also leave some
uncollected garbage lying around.  Chances are, you'll get that uncollected
stuff on the NEXT garbage collect.  So you trade-off: you need slightly
bigger spaces (to allow for that uncollected garbage), BUT the garbage
collecter AND the normal routine entry/exit run faster.

gwyn@BRL.ARPA (VLD/VMB) (10/19/85)

What we need are fewer "neat hacks" that don't work
and more good algorithms.

petera@hcrvax.UUCP (Smith) (10/21/85)

> what we need are fewer good hacks .... 

  While actually accessing the C stack is a short cut it does pay off. The
resulting LISP interpreter is about TWICE as fast because you don't need
to stack local variables twice. It also uses half as much stack space. 
It is not portable but then again the routine needed to scan through the
stack is quite simple and can easily be recoded for each machine
you put it on. I think the original authors choice of the word "hack"
is inapropriate I would call it a necessary evil to get the speed of the
interpreter up and space requirements down. It is also no more prone to
error than building a stack yourself. So, why build a stack when there is
already one available?

   I fail to see the reason why we should not try to make a program faster
at the expense of portability if such a change is well localized. Isn't the
effort worth it? Or don't we care about the performance of our programs!

	Peter Ashwood-Smith,
	Human Computing Resources,
	Toronto Ontario.

gwyn@BRL.ARPA (VLD/VMB) (10/24/85)

The original suggestion was to perform garbage collection
on the STACK, not on the HEAP.  There is a BIG difference.

Both the DMD (5620) and 8th Ed. UNIX have garbage-collecting
equivalents of malloc().  The DMD does it right, by using an
extra level of indirection, and the 8th. Ed. uses the "neat
hack", but only on the gcalloc()ed HEAP, not on the run-time
STACK.  (It coexists with malloc() and leaves the malloc()
arena alone.)

The best LISP garbage collectors perform on-the-fly, so
that there is no screeching halt while the trash compactor
operates.

I'm all for factors-of-2 performance improvements!

tp@ndm20 (10/29/85)

On the subject, could anyone give me literature references for a good
on-the-fly  garbage collector?   Something  in C  would be especially
useful.  What I'm looking at doing is the following:  I would like an
allocator that managed disk and memory, so I could allocate more than
I had room for (for machines with limited address space) and have the
allocator and associated routines move things off to disk if I am not
using them, garbage collect some space, and bring in the  item I want
from disk.  Basically software VM.  References to such  a beast would
be GREATLY appreciated, but I'll be happy with just  stuff on garbage
collection.

Thanks,
Terry Poot
Nathan D. Maier Consulting Engineers
(214)739-4741
Usenet: ...!{allegra|ihnp4}!convex!smu!ndm20!tp
CSNET:  ndm20!tp@smu
ARPA:   ndm20!tp%smu@csnet-relay.ARPA