[comp.lang.c] Programmed code generation

nather@ut-sally.UUCP (Ed Nather) (07/14/88)

In article <5262@june.cs.washington.edu>, pardo@june.cs.washington.edu (David Keppel) writes:
> 
> Let me rephrase my position.  There is nothing architecturally weird
> about programs that generate their own code.  Doing so does not cause
> any problems on any machines that I am aware of, although few
> OPERATING SYSTEMS support this.
> 

And no LANGUAGES that I'm aware of.  But that's the whole point. CAN they?

[ separate I-D cache arguments omitted, thus excising the neat idea
  of a "snoopy" cache ]
 
> Assume that you have an algorithm that is, say, 1000 instructions and
> that each instruction takes 4 cycles to execute (must be a RISC :-).
> One particular part of it, 10 instructions, is written as
> self-modifying code.  It can also be written as 100 instructions of
> non-self-modifying (static) code.  

I wouldn't use self-generated code in such a case, I'd do it the way you
suggest.  But in a different case:

The algorithm has 1000 or so set-up instructions, which generate a small,
fast loop that is executed for each and every pixel displayed on a screen,
but which must do different things to each pixel depending on current
program conditions -- maybe 10 or 12 decisions that can be folded into
perhaps 4 or 5 instructions, but which would require 10 or 12 branch
instructions (plus the executed instructions) for every pass, learning 
over and over (and over, and over) what the constraining program conditions 
are.  And how many pixels are there on a screen?  Well, lots -- 1024 X 1024 
is almost obsolete now :-).

The alternative, to have a separate (static) loop for each possibility,
will run you out of space very quickly if the program conditions that make
sense are combinatorial in nature -- the commonest case.  And a few hundred
copies of *nearly* the same loop would not be easy to maintain.
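
A rough C sketch of the contrast described above, under loud assumptions:
portable C cannot emit machine code, so the "generated" loop is stood in for
by a short list of function pointers built once at set-up. The condition
names and pixel operations (COND_INVERT, op_dim, and so on) are hypothetical
and come from nowhere in the original discussion. The sketch only shows the
decisions moving out of the per-pixel loop; the speed gain argued for here
would come from emitting straight-line machine code, which indirect calls
do not reproduce.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical program conditions; the names are illustrative only. */
enum {
    COND_INVERT    = 1u << 0,
    COND_DIM       = 1u << 1,
    COND_THRESHOLD = 1u << 2
};

typedef uint8_t (*pixel_op)(uint8_t);

static uint8_t op_invert(uint8_t p)    { return (uint8_t)(255 - p); }
static uint8_t op_dim(uint8_t p)       { return (uint8_t)(p / 2); }
static uint8_t op_threshold(uint8_t p) { return (uint8_t)(p >= 64 ? 255 : 0); }

/* Static version: every condition is re-tested for every pixel. */
void render_branchy(uint8_t *px, size_t n, unsigned conds)
{
    for (size_t i = 0; i < n; i++) {
        uint8_t p = px[i];
        if (conds & COND_INVERT)    p = op_invert(p);
        if (conds & COND_DIM)       p = op_dim(p);
        if (conds & COND_THRESHOLD) p = op_threshold(p);
        px[i] = p;
    }
}

/* Set-up step: test each condition once and record only the
   operations the loop actually needs. Returns the plan length. */
size_t build_plan(unsigned conds, pixel_op plan[3])
{
    size_t k = 0;
    if (conds & COND_INVERT)    plan[k++] = op_invert;
    if (conds & COND_DIM)       plan[k++] = op_dim;
    if (conds & COND_THRESHOLD) plan[k++] = op_threshold;
    return k;
}

/* Execution step: the per-pixel loop no longer inspects conds. */
void render_planned(uint8_t *px, size_t n, const pixel_op *plan, size_t k)
{
    for (size_t i = 0; i < n; i++) {
        uint8_t p = px[i];
        for (size_t j = 0; j < k; j++)
            p = plan[j](p);
        px[i] = p;
    }
}
```

Calling build_plan() once and render_planned() per frame keeps generation
and execution separate, and one plan builder covers every combination of
conditions without a static loop per combination.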

I believe we can agree (can we?) that some useful conditions can arise where
self-generated code can be very useful.  The term "self-modifying code" 
is, I think, a misnomer, implying a loop which changes itself into something
else, which then ... 

But if code generation is kept separate from code execution (which might well
be a reasonable condition to impose in any formal description), I doubt any
serious confusion would arise, and I can't see that much would be lost.  Thus
the term "self-generating" code might be better, or perhaps even "programmed
code generation."  That makes it sound less scary.

-- 
Ed Nather
Astronomy Dept, U of Texas @ Austin
{backbones}!{noao,ut-sally}!utastro!nather
nather@astro.as.utexas.edu

peter@ficc.UUCP (Peter da Silva) (07/16/88)

In article <12381@ut-sally.UUCP>, nather@ut-sally.UUCP writes:
> In article ... pardo@june.cs.washington.edu (David Keppel) writes:
> > There is nothing architecturally weird about programs that generate
> > their own code. ...although few OPERATING SYSTEMS support this.

> And no LANGUAGES that I'm aware of.  But that's the whole point. CAN they?

Do you count Forth as a language?

For that matter, I've generated and then executed code in a program
written in Forth running under UNIX. Real code, not threaded code:

CREATE UNDER ASSEMBLER
	S )+ R0 MOV,
	S )+ R1 MOV,
	R0 S -) MOV,
	R1 S -) MOV,
	R0 S -) MOV,
	NEXT,

Which can immediately be used...
-- 
Peter da Silva  `-_-'  Ferranti International Controls Corporation.
"Have you hugged  U  your wolf today?" (uunet,tness1)!sugar!ficc!peter.

bzs@bu-cs.BU.EDU (Barry Shein) (07/17/88)

Did I miss something critical here? Most lisp systems generate and
execute their own machine code. They usually do so by loading the code
into the data segment, but not for any particular reason other than
that few OS's support the extension and modification of a text
segment. It probably doesn't much matter, other than a) not
being able to mark it read-only, which would usually be desirable (tho
this could be done in data or text), and b) possibly taking advantage
of a different paging strategy for text vs data pages, if relevant.
None of that is critical, tho both could be very useful.

Exploiting shared library designs might make this more feasible in the
future. It occurs to me that it might be possible to dynamically build
a file object that the OS will consider a shared library text and then
page it back into the text section. I would imagine some changes might
be needed to current designs (eg. do the shareable objects and symbols
have to be fully specified at initial load-time?)

Speaking of lisp and self-modifying code, I worked on an IBM370 lisp
which accomplished trace and stepping modes (debugging) by
self-modifying the top of the eval loop. Of course it didn't save much
more than a test and branch, but eval loops are entered for every
operator/function (eg add, subtract etc etc etc). On a 370/158 with
less than a MIP and 100 logged in users one becomes a fanatic about
such things. I don't know how much it helped in fact.

And there was always the IBM370 MVC (move characters) instruction
which only took a fixed number of chars specified at assemble time. A
common sequence was (particular register numbers irrelevant):

	stc	r1,*+5
	mvc	0(r2,0),0(r3)

which overwrote the second zero (,0), the length. The 370 of course
provided an EX (execute) instruction which allowed indirect execution
of an instruction after OR'ing in a particular register's low byte.
Naturally that was designed to accomplish the above (SS instructions
which took a fixed length field all had the same basic format, using
either the byte or two nibbles as a length indicator). Why people
tended to use the store char hack above rather than EX was always a
mystery to me, other than tribal practice.
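
For readers without a 370 to hand, the difference between the two idioms can
be modelled in C. This is only a sketch under stated assumptions: the
instruction is treated as a plain 6-byte array in the SS layout (opcode,
length byte, two base/displacement halfwords), with the usual 4-byte STC so
that `*+5` lands on the second byte of the following MVC. The function names
are hypothetical, and nothing here executes real 370 code.

```c
#include <stdint.h>
#include <string.h>

enum { MVC_OPCODE = 0xD2, SS_LEN = 6 };

/* The "store char" hack: the register's low byte is written over the
   length byte of the instruction in storage, so the instruction itself
   is permanently changed (until the next store). */
void patch_length(uint8_t insn[SS_LEN], uint32_t reg)
{
    insn[1] = (uint8_t)(reg & 0xFFu);
}

/* The EX approach: the register's low byte is OR'ed into the second
   byte of a private copy, and that copy is what gets executed. The
   instruction in storage is left untouched. */
void ex_effective(const uint8_t insn[SS_LEN], uint32_t reg,
                  uint8_t out[SS_LEN])
{
    memcpy(out, insn, SS_LEN);
    out[1] |= (uint8_t)(reg & 0xFFu);
}

/* MVC encodes its length as (bytes to move - 1). */
unsigned mvc_byte_count(const uint8_t insn[SS_LEN])
{
    return (unsigned)insn[1] + 1u;
}
```

With a template whose length byte is zero, both routes yield the same
effective instruction; only patch_length() leaves the stored code modified,
which is the property that matters for read-only or shared text.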

	-Barry Shein, Boston University

boyne@hplvli.HP.COM (@Art Boyne) (07/18/88)

>>'When will you guys realize that there is "a rat" in sep*arat*e?'
>
>Why do you use *realism* and *realist* but then switch to *realize*?
>
>                             s m ryan

Because my Webster's indicates that *realize* is the only valid spelling!

smryan@garth.UUCP (Steven Ryan) (07/22/88)

>Because my Webster's indicates that *realize* is the only valid spelling!

I have a dictionary that says -ise is normal and -ize is an american variant.

Rather than continue off a tangent: somebody got all hot and bothered over
a misspelling. Big deal! Lots of clumsy fingers on clumsy keyboards; some
people typing faster than they're thinking; not everybody proofreading; some
stubborn spellers; British and American spellers; and lotsa nonnative speakers
of english out there who do a remarkable job of following along on a keyboard
and screen not designed for english. I just might be able to follow a
discussion in french, but I wouldn't stand a chance in german.

Hey, kids, let's be tolerant. Instead of finding differences to fight over,
let's enjoy the diversity and complexity of humanity.

rbutterworth@watmath.waterloo.edu (Ray Butterworth) (07/22/88)

In article <1051@garth.UUCP>, smryan@garth.UUCP (Steven Ryan) writes:
> >Because my Webster's indicates that *realize* is the only valid spelling!
> I have a dictionary that says -ise is normal and -ize is an american variant.

That's funny.  My Oxford English Dictionary says that -ize is the
correct English spelling everywhere and that -ise is an acceptable
British variant.

Here's an excerpt from the Oxford English Dictionary:

   "-isation", frequent variant of "-ization".
   
   "ise", a frequent spelling of "ize", suffix forming verbs, which see.
   
   "-ization", suffix forming nouns of action from verbs, in "ize".
   
   "ize", suffix forming verbs. ... and this became established as the
   normal form for the latinizing of Greek verbs, or the formation
   of verbs upon Greek analogies.  ... in modern French, the suffix has
   become "iser" ....  Hence, some have used the spelling "ise" in English,
   as in French, ... and some prefer "ise" in words formed in French
   or English from Latin elements, retaining "ize" for those of Greek
   composition.  But the suffix itself, whatever the element to which
   it is added, is in its origin the Greek iota-zeta-epsilon-iota-nu,
   Latin "izare"; and, as the pronunciation is also with z, there is
   no reason why in English the special French spelling should be followed,
   in opposition to that which is at once etymological and phonetic.
   In this Dictionary the termination is uniformly written "ize".

Note especially the last three lines:  "there is no reason why ...".

The unix spell program is especially bad on this.
The man page even documents that it is wrong.

The -ize words really should be put into the common word list.
Insisting on -ise with the -b option is almost as bad as
insisting on "nite" and rejecting "night" for American spelling.