[comp.arch] Programmed code generation

nather@ut-sally.UUCP (Ed Nather) (07/14/88)

In article <5262@june.cs.washington.edu>, pardo@june.cs.washington.edu (David Keppel) writes:
> 
> Let me rephrase my position.  There is nothing architecturally weird
> about programs that generate their own code.  Doing so does not cause
> any problems on any machines that I am aware of, although few
> OPERATING SYSTEMS support this.
> 

And no LANGUAGES that I'm aware of.  But that's the whole point. CAN they?

[ separate I-D cache arguments omitted, thus excising the neat idea
  of a "snoopy" cache ]
 
> Assume that you have an algorithm that is, say, 1000 instructions and
> that each instruction takes 4 cycles to execute (must be a RISC :-).
> One particular part of it, 10 instructions, is written as
> self-modifying code.  It can also be written as 100 instructions of
> non-self-modifying (static) code.  

I wouldn't use self-generated code in such a case, I'd do it the way you
suggest.  But in a different case:

The algorithm has 1000 or so set-up instructions, which generate a small,
fast loop that is executed for each and every pixel displayed on a screen,
but which must do different things to each pixel depending on current
program conditions.  Maybe 10 or 12 decisions can be folded into perhaps
4 or 5 generated instructions, but a static loop would need 10 or 12 branch
instructions (plus the instructions they guard) on every pass, re-learning
over and over (and over, and over) what the constraining program conditions
are.  And how many pixels are there on a screen?  Well, lots -- 1024 x 1024
is almost obsolete now :-).

The alternative, to have a separate (static) loop for each possibility,
will run you out of space very quickly if the program conditions that make
sense are combinatorial in nature -- the commonest case.  And a few hundred
copies of *nearly* the same loop would not be easy to maintain.
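The folding described above can be sketched in Python, with exec() standing
in for native code generation.  The flag names (invert, mask, shift) are
made-up stand-ins for the "10 or 12 decisions"; the point is that the
decisions are taken once, at generation time, not once per pixel:

```python
# Sketch: fold run-time display conditions into a specialized per-pixel
# loop once, instead of re-testing them for every pixel.  The flags are
# hypothetical; exec() stands in for planting native code.

def make_pixel_op(invert, mask, shift):
    steps = []
    if shift:
        steps.append(f"p >>= {shift}")
    if mask is not None:
        steps.append(f"p &= {mask:#x}")
    if invert:
        steps.append("p = ~p & 0xFF")
    body = "\n        ".join(steps) or "pass"
    src = f"""
def op(pixels):
    out = []
    for p in pixels:
        {body}
        out.append(p)
    return out
"""
    ns = {}
    exec(src, ns)          # "generate" the specialized loop
    return ns["op"]

op = make_pixel_op(invert=True, mask=0x0F, shift=1)
print(op([0x10, 0xFF]))    # [247, 240]
```

The generated loop contains no branches on the program conditions at all;
a new loop is generated whenever the conditions change.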

I believe we can agree (can we?) that useful situations do arise where
self-generated code can be very valuable.  The term "self-modifying code"
is, I think, a misnomer, implying a loop which changes itself into something
else, which then ...

But if code generation is kept separate from code execution (which might well
be a reasonable condition to impose in any formal description), I doubt any
serious confusion would arise, and I can't see that much would be lost.  Thus
the term "self-generating" code might be better, or perhaps even "programmed
code generation."  That makes it sound less scary.

-- 
Ed Nather
Astronomy Dept, U of Texas @ Austin
{backbones}!{noao,ut-sally}!utastro!nather
nather@astro.as.utexas.edu

kers@otter.hple.hp.com (Christopher Dollin) (07/15/88)

Ed Nather says (starting with a quote):

| In article <5262@june.cs.washington.edu>, pardo@june.cs.washington.edu 
| (David Keppel) writes:
| >
| > Let me rephrase my position.  There is nothing architecturally weird
| > about programs that generate their own code.  Doing so does not cause
| > any problems on any machines that I am aware of, although few
| > OPERATING SYSTEMS support this.
| >
|
| And no LANGUAGES that I'm aware of.  But that's the whole point. CAN they?

Pop11 and its predecessors Pop2, Pop10. In Pop11 the compiler is a bunch of
procedures that user programs can call to plant code which is then runnable
by those same user programs. But then that's what you expect in an incremental
programming environment.

In Poplog (the "home ground" of Pop11), the compiler procedures are also 
accessible from the other languages of the system - Common Lisp, Prolog, ML.
The "other" languages are implemented using those same compiler routines.

The shared virtual machine language is translated to native code in most
Poplog implementations. "All" it takes to move the system is for the bottom 
level to be ported - the rest of the compilers move for free.

So languages exist in which programs can generate "their own code". It's not
even peculiar to do so.
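For readers without Poplog to hand, Python's built-in compile() and exec()
give a rough analogue of the interface: the compiler is an ordinary
procedure that a running program can call, and the result is immediately
runnable by that same program.  (Python byte-compiles rather than producing
native code as most Poplog implementations do, so this illustrates the
interface, not the implementation.)

```python
# The compiler as a callable procedure: compile source at run time,
# plant the result, and call it from the same program.

source = "def double(x):\n    return x + x\n"
code = compile(source, "<generated>", "exec")   # call the compiler
ns = {}
exec(code, ns)                                  # plant the code
double = ns["double"]
print(double(21))                               # 42
```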

Regards,
Kers.   | "If anything anyone lacks, they'll find it all ready in stacks".

PS. The recent "partial application in C" debate in comp.whatever was 
interesting, as partial application is another Pop11 primitive procedure .....

peter@ficc.UUCP (Peter da Silva) (07/16/88)

In article <12381@ut-sally.UUCP>, nather@ut-sally.UUCP writes:
> In article ... pardo@june.cs.washington.edu (David Keppel) writes:
> > There is nothing architecturally weird about programs that generate
> > their own code. ...although few OPERATING SYSTEMS support this.

> And no LANGUAGES that I'm aware of.  But that's the whole point. CAN they?

Do you count Forth as a language?

For that matter, I've generated and then executed code in a program
written in Forth running under UNIX. Real code, not threaded code:

CREATE UNDER ASSEMBLER
	S )+ R0 MOV,	( pop top of stack into R0 )
	S )+ R1 MOV,	( pop second item into R1 )
	R0 S -) MOV,	( push old top )
	R1 S -) MOV,	( push old second )
	R0 S -) MOV,	( push old top again: a b -- b a b )
	NEXT,		( jump to the inner interpreter )

Which can immediately be used...
-- 
Peter da Silva  `-_-'  Ferranti International Controls Corporation.
"Have you hugged  U  your wolf today?" (uunet,tness1)!sugar!ficc!peter.

bzs@bu-cs.BU.EDU (Barry Shein) (07/17/88)

Did I miss something critical here? Most Lisp systems generate and
execute their own machine code. They usually do so by loading the code
into the data segment, but for no particular reason other than that few
OS's support extending and modifying a text segment. It probably doesn't
much matter, apart from a) not being able to mark the code read-only,
which would usually be desirable (tho this could be done in data or
text), and b) possibly missing out on a different paging strategy for
text vs data pages, if relevant. Neither is critical, tho both could be
very useful.
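A modern sketch of "loading code into the data segment", assuming Linux on
x86-64 (the six code bytes below are `mov eax, 42; ret`): we allocate
writable-plus-executable memory, copy native code into it, and call it
through ctypes. On other architectures, or under strict W^X policies, this
will not run.

```python
# Generate-and-execute in data memory: allocate a RWX page, write
# machine code into it, and call it as a C function.  x86-64 Linux only.
import ctypes, mmap

code = bytes([0xB8, 0x2A, 0x00, 0x00, 0x00,   # mov eax, 42
              0xC3])                           # ret

buf = mmap.mmap(-1, mmap.PAGESIZE,
                prot=mmap.PROT_READ | mmap.PROT_WRITE | mmap.PROT_EXEC)
buf.write(code)

addr = ctypes.addressof(ctypes.c_char.from_buffer(buf))
fn = ctypes.CFUNCTYPE(ctypes.c_int)(addr)     # treat the page as code
print(fn())                                   # 42
```

Marking the page read-only after writing (point a above) would take one
more mprotect-style step, which the mmap module does not expose directly.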

Exploiting shared library designs might make this more feasible in the
future. It occurs to me that it might be possible to dynamically build
a file object that the OS will treat as shared-library text and then
page it back in as part of the text section. I would imagine some
changes to current designs might be needed (eg. do the shareable
objects and symbols have to be fully specified at initial load time?)

Speaking of Lisp and self-modifying code: I worked on an IBM 370 Lisp
which implemented its trace and stepping modes (for debugging) by
self-modifying the top of the eval loop. Of course this saved little
more than a test and branch per entry, but the eval loop is entered for
every operator/function (eg. add, subtract, etc.), and on a 370/158
with less than a MIP and 100 logged-in users one becomes a fanatic
about such things. I don't know how much it helped in fact.

And there was always the IBM 370 MVC (move characters) instruction,
which moved only a fixed number of bytes specified at assemble time.
A common sequence was (particular register numbers irrelevant):

	stc	r1,*+5
	mvc	0(0,r2),0(r3)

The STC overwrites the MVC's length byte (the first zero inside the
parentheses) with a length computed at run time. The 370 of course
provided an EX (execute) instruction, which executed a target
instruction after OR'ing in a particular register's low byte; that was
designed precisely to accomplish the above, since the SS instructions
that took a fixed-length field all had the same basic format, using
either a byte or two nibbles as the length indicator. Why people tended
to use the store-char hack above rather than EX was always a mystery to
me, other than tribal practice.

	-Barry Shein, Boston University