[comp.lang.c] Partial application in C

leo@philmds.UUCP (Leo de Wit) (06/18/88)

In article <22944@oliveb.olivetti.com> chase@orc.olivetti.com () writes:
>(and now for something completely different)
>As a bit of an experiment based on a comment by Richard Stallman, I
>decided to implement partial application for C on a Sun, and I did it
>and it worked.  Perhaps you will find this interesting--comments are
>welcome.
>----------------
>The subroutine "make_p_app" takes a function and an "argument" (in
>this case, an int; I am well aware of the non-portability of this).
>The value returned is a pointer-to-function (actually some
>heap-allocated memory) which when called will prepend "argument" to
>the argument list in that call.  For example, you could use this to

If I understand it correctly the pointer points to a memory location
that you fill and jump to in run-time? Then you can expect some
problems: there are machines that don't allow you to execute data as
code (in fact that's safer I think; what will happen if some pointer
goes berserk and modifies your precious code? The least error is an
illegal instruction or invalid opcode, but worse things could happen).
It also has a smell of self-modifying code about it (but maybe you like
that smell 8-). If you remember the discussion about executing data we
have had recently - don't remember the exact title - you would also be
aware of the fact that something like

static short code[] = {
    0x....,0x....
    ....
};

    /* ===== */

    (*(void (*)())code)(x,y,z);

is not portable, not (only) for the 'code' but also for the construct.
(But I used it myself on an MC68000 micro for a program that tests how
many ticks it takes to execute a series of shorts!).  But let's
continue:

>the argument list in that call.  For example, you could use this to
>get a printf-like function to a file
>
>  gprintf = make_p_app(fd,stderr)
>
>or a function that always returns a new pointer to an integer
>
>  new_int = make_p_app(malloc,sizeof(int))
>
>These examples are only so-so---one could easily argue that these
>could just as easily be written as separate subroutines or subroutines
>that use static data (non-reentrant! boo! hiss!).  However, I think
>that situations will arise where this is more convenient or flexible.
   [Sun implementation of code left out]...

The only merit I see for such a scheme is for performance reasons;
however if you're talking performance your garbage collector should be
in hardware.  If performance is less crucial, you could do something
like:

typedef struct {
    void (*pa_func)();
    int  pa_parm;
} partapp;

    partapp new_int;

    new_int.pa_func = malloc; new_int.pa_parm = sizeof(int);
    /* ... or let a 'make_p_app()' initiate it; and then ... */

    exec_p_app(new_int,other,parameters,follow,here,...);

The exec_p_app could be created inline if you have a clever compiler;
it does something analogous to the code you created, only getting the
stack right is done on run-run time instead of compile-run time, if you
can follow me 8-). If it is inline it is even faster, requiring only
the rearrangement of the stack and the call to malloc(); if you're
lucky and/or you swap the pa_func and pa_parm members you possibly
won't have to rearrange at all. If it is not it will probably be about
as fast as the code you proposed.  I can figure that out if you're
interested. If you are, could you mail me a copy of the relevant part
of the code (your example is a bit terse)?
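
A minimal sketch of the struct-based scheme described above, assuming a
two-argument function whose first argument is bound in advance.  The
typedef pa_fn, the fixed "long" argument types and the sample function
sub2 are illustrative assumptions, not code from either post.

```c
/* Sketch only: a partial application kept as plain data. */
typedef long (*pa_fn)(long, long);  /* assumed shape: (bound, extra) */

typedef struct {
    pa_fn pa_func;   /* function to call later            */
    long  pa_parm;   /* argument fixed at creation time   */
} partapp;

/* Sample two-argument function, used only for illustration. */
long sub2(long bound, long x)
{
    return bound - x;
}

/* Record the function and its first argument; nothing is called yet. */
partapp make_p_app(pa_fn func, long parm)
{
    partapp p;
    p.pa_func = func;
    p.pa_parm = parm;
    return p;
}

/* Finish the call: prepend the stored argument to the one supplied. */
long exec_p_app(partapp p, long extra)
{
    return p.pa_func(p.pa_parm, extra);
}
```

With these definitions, exec_p_app(make_p_app(sub2, 10), 3) calls
sub2(10, 3).  The price is that the result is a struct, not something
that can be stored in an ordinary pointer-to-function variable.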

>Stallman thinks that similar code could also be written for a VAX,
>though the chunk of code must include a mask copied from the partially
>applied function and jump to the instruction after the mask.  Anyone
>care to do this for other machines and post their results? (send to me
>and I'll summarize)
>----------------

Note that the solution is not only machine but also
compiler-dependent.  It depends for instance on the calling convention
used (forward/reverse order of arguments on stack, size of elements on
stack, stack frame etc).  But generally speaking implementing such a
solution would be a piece of cake; I'll bake you one 8-).

>Reclaiming the memory is not a problem for me because I've got a
>garbage collector (if you don't believe in garbage collection, then
>you will have to invent your own solution;  I'm pretty happy with
>garbage collection because it lets me worry about more interesting
>problems.  If there is substantial interest I can post the collector;
>it is in the public domain, but the public doesn't know it.).
>
>David Chase
>Olivetti Research Center, Menlo Park

I love garbage collection; why else would I read netnews 8-) ?
And I'm interested in the g.c.

	Leo.

djones@megatest.UUCP (Dave Jones) (06/19/88)

From article <509@philmds.UUCP>, by leo@philmds.UUCP (Leo de Wit):

> there are machines that don't allow you to execute data as
> code.

Which ones?  


		Dave J.

jimp@cognos.uucp (Jim Patterson) (06/21/88)

In article <611@goofy.megatest.UUCP> djones@megatest.UUCP (Dave Jones) writes:
>From article <509@philmds.UUCP>, by leo@philmds.UUCP (Leo de Wit):
>> there are machines that don't allow you to execute data as
>> code.
>
>Which ones?  

The HP/3000 is one.
 

-- 
Jim Patterson                              Cognos Incorporated
UUCP:decvax!utzoo!dciem!nrcaer!cognos!jimp P.O. BOX 9707    
PHONE:(613)738-1440                        3755 Riverside Drive
                                           Ottawa, Ont  K1G 3Z4

leo@philmds.UUCP (Leo de Wit) (06/22/88)

In article <611@goofy.megatest.UUCP> djones@megatest.UUCP (Dave Jones) writes:
>From article <509@philmds.UUCP>, by leo@philmds.UUCP (Leo de Wit):
>> there are machines that don't allow you to execute data as
>> code.
>Which ones?  
>		Dave J.

Is good ol' PDP-11 a good enough example for you?
In fact every machine with a virtual memory system that separates data
and code. Read for instance Andrew S. Tanenbaum, Structured Computer
Organization, section 6.4.10. (virtual memory on the PDP-11). The
hardware maps virtual addresses to physical addresses, so that for
instance

    jmp 200

jumps to address 200 of the text space, and

    clr 200

clears the word at address 200 of the data space (which for the
PDP-11/44 is a totally different location). Trying to execute the
'200'-data address by calculating its 'text'-address (if you were able
to do so) results in a segmentation violation: the PC is not within the
text space boundaries.  Another example: an MC68000 with a decently used
MMU (I mean to say: many micros using a 68K don't exploit the use of
an MMU fully; obviously because this requires a REAL O.S. 8-).

Hope this answers your question (I cannot give you ALL the names, if 
that was what you were looking for) ?

    Leo.

djones@megatest.UUCP (Dave Jones) (06/25/88)

From article <3353@cognos.UUCP>, by jimp@cognos.uucp (Jim Patterson):
> there are machines that don't allow you to execute data as
> code.

So this guy goes to the doctor, see, and he wiggles his arm and
says, "Doc, my arm hurts when I do this."  So the doctor says,
"Then don't *DO* that!"

When I published a runtime linking loader a while back, there were
some who mentioned that on some machines you could not "execute data."
My immediate reaction was to say, "Then don't *USE* those kinds of machines!"

But of course, you may have to.  It's good to know that the restriction
exists, although I will continue to use dynamic loading, because I
have applications that absolutely scream for it.

I began to wonder why such a restriction might be deemed necessary.
Was it Big Brother engineering?  -- Thou shalt not modify thy
executable, for it is a Bad Thing. -- Or is there a valid technical
reason behind it?  I can see one possible rationale: You can have 128KB of 
memory in a sixteen bit machine, divided evenly between data and code,
if you use all the addresses for both kinds of memory.
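
The arithmetic behind that 128KB figure, as a small sketch.  The helper
names are hypothetical, and byte addressing is assumed: n address bits
reach 2^n bytes, and separate instruction and data spaces double that.

```c
/* Bytes reachable in one address space with the given pointer width. */
unsigned long space_bytes(unsigned bits)
{
    return 1UL << bits;               /* 16 bits: 65536 bytes = 64KB */
}

/* Total when instructions and data each get a full space of their own. */
unsigned long split_total_bytes(unsigned bits)
{
    return 2UL * space_bytes(bits);   /* 16 bits: 131072 bytes = 128KB */
}
```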

pardo@june.cs.washington.edu (David Keppel) (06/25/88)

djones@megatest.UUCP (Dave Jones) writes:
>From article <3353@cognos.UUCP>, by jimp@cognos.uucp (Jim Patterson):
>[ Some machines: can't execute data as code ]
[ Why?  So we can't change executables?  Or a valid technical reason? ]

Valid reason: You can have separate memories and separate busses and
be fetching both instructions and data at the same time.

	;-D on  ( Another wild and crazy opinion )  Pardo

gwyn@brl-smoke.ARPA (Doug Gwyn ) (06/25/88)

In article <619@goofy.megatest.UUCP> djones@megatest.UUCP (Dave Jones) writes:
>I began to wonder why such a restriction might be deemed necessary.
>Was it Big Brother engineering?  -- Thou shalt not modify thy
>executable, for it is a Bad Thing. -- Or is there a valid technical
>reason behind it?  I can see one possible rationale: You can have 128KB of 
>memory in a sixteen bit machine, divided evenly between data and code,
>if you use all the addresses for both kinds of memory.

That's one reason, probably the main one behind split-I&D PDP-11s.

Another reason is that high-performance processors generally pipeline
instructions, and if you could modify the code right in front of the
PC, it would require invalidation of the prefetch, which is extra
architectural overhead that we would prefer to do without.

daveb@llama.rtech.UUCP (Dave Brower) (06/26/88)

In article <3353@cognos.UUCP> jimp@cognos.UUCP (Jim Patterson) writes:
>In <611@goofy.megatest.UUCP> djones@megatest.UUCP (Dave Jones) writes:
>>From article <509@philmds.UUCP>, by leo@philmds.UUCP (Leo de Wit):
>>> there are machines that don't allow you to execute data as
>>> code.
>>
>>Which ones?  
>
>The HP/3000 is one.

Even the new Spectrum versions?  Odd, the UNIX on the 9000/8xx (same
processor) lets you execute data.

-dB
{amdahl, cpsc6a, mtxinu, sun, hoptoad}!rtech!daveb daveb@rtech.com <- FINALLY!

smryan@garth.UUCP (Steven Ryan) (06/26/88)

>I began to wonder why such a restriction might be deemed necessary.
>Was it Big Brother engineering?  -- Thou shalt not modify thy
>executable, for it is a Bad Thing. -- Or is there a valid technical
>reason behind it?

Most any system with page/segment descriptors allows execute only or read
only memory.

- Increased the address space for a PDP-11.

- Trashing data with a bad pointer is hard enough to track down; trashing
  code is even worse.

- If code cannot be modified, the system can safely and aggressively cache it.

The operating system should provide a way to move pages between data and
instruction space during execution.

blandy@marduk.cs.cornell.edu (Jim Blandy) (06/26/88)

About executing data:

	With rare exception, I think we can all say that self-modifying
	code is horrid, and anyone who writes it should be left alone.

	It's nice to have one's code segment protected from out-of-control
	writes.

	Caching is great.  I'd guess that if a program's having trouble
	with caching, the fault is in the design, not the caching.  I
	wouldn't be too surprised by a notable exception, but I think
	it's generally true;  YOU DON'T MODIFY ACTIVELY RUNNING CODE,
	so the assumptions made by a caching system should hold.

Where executing data really comes in handy is in situations like
interactive compilers.  For example, Chez Scheme is a neato implementation
of Scheme (a popular Lisp dialect); you define a function, chez compiles it
and lets you execute assembly language, not some wimpy scheme p-code.  If
you really want to do this right, you need to put the code somewhere in
your own address space; you need to be able to execute your data.

One could call this self-modifying code, and they'd be right, strictly
speaking, but it's a clean, upstanding use for executable data spaces.

(I have no affiliation with the Chez Scheme people; I just think it's a
good program.)
--
Jim Blandy - blandy@crnlcs.bitnet
"insects were insects when man was just a burbling whatsit."  - archie

boyne@hplvly.HP.COM (Art Boyne) (06/27/88)

djones@megatest.UUCP (Dave Jones) writes:
>From article <3353@cognos.UUCP>, by jimp@cognos.uucp (Jim Patterson):
>[ Some machines: can't execute data as code ]
[ Why?  So we can't change executables?  Or a valid technical reason? ]

Another valid reason:  on virtual memory machines, when bringing in a new
page to memory from disk, an unmodifiable code page can simply be deleted,
whereas a (modified) data page must be written back to disk.  Having
unmodifiable code segments therefore reduces disk activity.

Art Boyne, !hplabs!hplvly!boyne 

16012_3045@uwovax.uwo.ca (Paul Gomme) (06/28/88)

In article <619@goofy.megatest.UUCP>, djones@megatest.UUCP (Dave Jones) writes:
> 
> From article <3353@cognos.UUCP>, by jimp@cognos.uucp (Jim Patterson):
>> there are machines that don't allow you to execute data as
>> code.
> 

> 
> I began to wonder why such a restriction might be deemed necessary.
> Was it Big Brother engineering?  -- Thou shalt not modify thy
> executable, for it is a Bad Thing. -- Or is there a valid technical
> reason behind it?  I can see one possible rationale: You can have 128KB of 
> memory in a sixteen bit machine, divided evenly between data and code,
> if you use all the addresses for both kinds of memory.


	Unless my memory is failing me completely, I believe that OS/2 will
absolutely prohibit "executing data".  In fact, it does away with the
ubiquitous (MS-DOS) .COM file for the simple reason that they share code
and data segments.  My recollection is that this restriction is in place
in order to allow for "orderly" multitasking - i.e. if one process runs
amok, it shouldn't affect the other processes, which could occur if a program
can alter its code segment.
	Besides, I thought that self-modifying code was (a) extremely difficult
to write, and (b) considered poor programming practice.
-------------------------------------------------------------------------
Paul Gomme                             E-Mail:
Department of Economics
University of Western Ontario          Bitnet:  p.gomme@uwovax.bitnet
London, Ontario, Canada                         p.gomme@uwovax.uwo.ca
N6A 5B7                                ARPA:    p.gomme@uwo.ca
(519) 679-2111 ext. 6418

chase@Ozona.orc.olivetti.com (David Chase) (06/28/88)

(Replies, in no particular order)

> With rare exception, I think we can all say that self-modifying
> code is horrid, and anyone who writes it should be left alone.

You should look a little harder at the example I provided.  I was
implementing an interesting (to some people) abstraction; the use of
executed data was just one way to make this happen.  The code, once
written, does not later modify itself.  Partial application is
certainly less horrid than static data.  If nothing else, it is
re-entrant (unlike the Unix library).

> [comments on pipelines]

As a practical matter, since the "make_p_app" subroutine is
machine-dependent, one can always write code that does the
appropriate number of nops before returning the partially applied
function.

> [comments on caches]

The first time around you can count on that code NOT being in the
instruction cache.  One hopes, for the sake of dynamic loading of code
and similar things, that it is possible to get some sort of handle on
the caches.  Assuming that this is in fact true, then the
machine-dependent make_p_app flushes the appropriate caches in the
appropriate ways.

> [comments on split I&D]

I was aware of this, but wanted to know how pervasive this technique
is.  I'm not a real big fan of the technique precisely because it
makes this sort of thing difficult, but then that appears to be a
matter of taste.  So far it appears that the following machines use
separate I and D segments:

some PDP-11s
some 68ks (not Suns)
     HP3000
some 80386s (depending on segment registers?)

> It's nice to have one's code segment protected from out-of-control
> writes.

I agree, and that is not incompatible with my implementation of
partial application (it can't be, since my code segment is protected
from out-of-control writes).  (In fact, I agree so much that I have
become a fan of garbage collection.  As I said before, this is
possible in C, and though it is very easy to write code that breaks
the collector it is also very easy to write useful programs that do
not.).

I hope this covers all the various questions.  I was sort of hoping
that somebody out there would get excited about the idea of partial
application, or maybe send details on how to do it on their machine.

Someone also proposed a special widget for doing the rest of a partial
application--I'm sorry, but that is totally uninteresting.  I want a
result that looks exactly like any other function, so I can pass it
around, assign it to pointer-to-function variables, etc.

David

wes@obie.UUCP (Barnacle Wes) (06/28/88)

From article <3353@cognos.UUCP>, by jimp@cognos.uucp (Jim Patterson):
> there are machines that don't allow you to execute data as
> code.

In article <619@goofy.megatest.UUCP>, djones@megatest.UUCP (Dave Jones) replies:
% I began to wonder why such a restriction might be deemed necessary.
% Was it Big Brother engineering?  -- Thou shalt not modify thy
% executable, for it is a Bad Thing. -- Or is there a valid technical
% reason behind it?  I can see one possible rationale: You can have 128KB of 
% memory in a sixteen bit machine, divided evenly between data and code,
% if you use all the addresses for both kinds of memory.

This is called `Split I & D space' in PDP-11 vernacular.  Many other
machines have similar features - the '286 for instance. 

Every segment on the 286 (in 286 mode, of course) has a descriptor. 
Each descriptor has a byte, called the Access Rights Byte, that
contains protection bits for the segment.  Bit 3 is the 'executable'
flag; if 0, this is a Data Segment Descriptor, if 1, this is an
Executable Code Segment Descriptor.

Now, if this is a Data Segment Descriptor, bit 1 describes whether
this segment is read-only (0), or read-write (1).  If, on the other
tentacle, this is an Executable Code Segment Descriptor, bit 1
describes whether the segment can be read (1) or not (0).  You cannot
write to an Executable Code Segment!  The system, of course, loads a
program as data segments and then modifies the segment descriptors to
make them executable.
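
The decoding just described can be sketched as below.  The bit masks
assume the Intel 80286 access-rights layout in which bit 3 is the
executable flag, bit 4 marks an ordinary (non-system) segment
descriptor, and bit 1 is the writable/readable flag; that numbering is
an assumption to check against the processor manual.  The enum and
function names are hypothetical.

```c
/* Assumed bit positions within the access rights byte. */
#define AR_SEGMENT    0x10  /* bit 4: 1 = code or data segment descriptor */
#define AR_EXECUTABLE 0x08  /* bit 3: 1 = code segment, 0 = data segment  */
#define AR_RW         0x02  /* bit 1: data: writable; code: readable      */

enum seg_kind {
    SEG_SYSTEM,       /* not a code/data segment descriptor */
    SEG_DATA_RO,      /* data, read-only                    */
    SEG_DATA_RW,      /* data, read-write                   */
    SEG_CODE_XO,      /* code, execute-only                 */
    SEG_CODE_XR       /* code, execute and read             */
};

/* Classify a descriptor from its access rights byte.  No case yields
 * "code, writable": the layout has no bit that could express it. */
enum seg_kind classify_access(unsigned char ar)
{
    if (!(ar & AR_SEGMENT))
        return SEG_SYSTEM;
    if (ar & AR_EXECUTABLE)
        return (ar & AR_RW) ? SEG_CODE_XR : SEG_CODE_XO;
    return (ar & AR_RW) ? SEG_DATA_RW : SEG_DATA_RO;
}
```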

I hope this wasn't hopelessly overboard - looking at real data (even if
it is for a brain-dead cpu like the 286) makes things look a little
clearer to me.

-- 
  {uwmcsd1, hpda}!sp7040!obie!wes  | "If I could just be sick, I'd be fine."
	+1 801 825 3468	           |          -- Joe Housely --

smryan@garth.UUCP (Steven Ryan) (06/29/88)

>I hope this covers all the various questions.  I was sort of hoping
>that somebody out there would get excited about the idea of partial
>application, or maybe send details on how to do it on their machine.

It's easy to provide partial application and first class procedures
using a heap (especially with garbage collection). It's not really necessary
to modify instruction space to do this.

chris@mimsy.UUCP (Chris Torek) (06/29/88)

>It's easy to provide partial application and first class procedures
>using a heap (especially with garbage collection). It's not really necessary
>to modify instruction space to do this.

True enough.  The modified I-space will be faster, though, at least if
the newly created procedures are called often.  There is no particular
reason that OSes should not have calls to create a new chunk of I-space
(or get rid of an old one), except that the demand for it has been low.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

ddb@ns.UUCP (06/30/88)

In article <3950006@hplvly.HP.COM>, boyne@hplvly.HP.COM (Art Boyne) writes:
> ...on virtual memory machines,
> ...Having
> unmodifiable code segments therefore reduces disk activity.
  No, it's whether the code segments *ACTUALLY ARE* modified that should
control disk activity; any decent virtual memory system must be able to 
determine if a writable page has actually been written!  So, truly pure
code does reduce paging, but the ABILITY to have impure code shouldn't
cost anything unless it's used.
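
A toy model of that argument, assuming a per-page "dirty" flag that is
set on the first store.  The struct and function names are hypothetical
and no real pager is being described: eviction cost depends on whether
the page was written, not on whether writing was permitted.

```c
#include <stdbool.h>

struct page {
    bool writable;  /* protection: may the program store into it? */
    bool dirty;     /* set when a store actually happens          */
};

/* Simulate a store: refused on a read-only page, else marks it dirty. */
bool page_store(struct page *p)
{
    if (!p->writable)
        return false;
    p->dirty = true;
    return true;
}

/* Disk writes needed to evict the page: a clean page is just dropped. */
int evict_cost(const struct page *p)
{
    return p->dirty ? 1 : 0;
}
```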
-- 
                  -- David Dyer-Bennet
		     Fidonet 1:282/341.0

mcdonald@uxe.cso.uiuc.edu (07/01/88)

>Every segment on the 286 (in 286 mode, of course) has a descriptor. 
>Each descriptor has a byte, called the Access Rights Byte, that
>contains protection bits for the segment.  Bit 3 is the 'executable'
>flag; if 0, this is a Data Segment Descriptor, if 1, this is an
>Executable Code Segment Descriptor.

>Now, if this is a Data Segment Descriptor, bit 1 describes whether
>this segment is read-only (0), or read-write (1).  If, on the other
>tentacle, this is an Executable Code Segment Descriptor, bit 1
>describes whether the segment can be read (1) or not (0).  You cannot
>write to an Executable Code Segment!  The system, of course, loads a
>program as data segments and then modifies the segment descriptors to
>make them executable.

And, for all but brain-dead compilers and/or operating systems,
a cast of a data pointer to a code pointer (or vice-versa) should
still work. The compiler either does it directly, or, in the case
of aggressively over-protective operating systems, calls a system
routine which generates a segment descriptor of the proper type which
points to the same location in memory. I would presume that the ANSI
standard would describe in detail the requirements for this. If
a facility such as this is missing, C would be a seriously restricted
language: no Forth written in C, no Turbo Pascal or Turbo C, no
incremental compilation! OS/2, for example, has a system call to do this;
whether the compiler automatically makes that call I doubt.
I believe it is called DOSCreateAlias or DOSCreateCodeAlias. What is
the equivalent call in UNIX?

Doug McDonald

nevin1@ihlpf.ATT.COM (00704a-Liber) (07/08/88)

[followups to comp.os.misc]

In article <291@ns.nsc.com> ddb@ns.nsc.com (David Dyer-Bennet) writes:
|In article <3950006@hplvly.HP.COM>, boyne@hplvly.HP.COM (Art Boyne) writes:
|> ...on virtual memory machines,
|> ...Having
|> unmodifiable code segments therefore reduces disk activity.
|  No, it's whether the code segments *ACTUALLY ARE* modified that should
|control disk activity; any decent virtual memory system must be able to 
|determine if a writable page has actually been written!  So, truly pure
|code does reduce paging, but the ABILITY to have impure code shouldn't
|cost anything unless it's used.

It still does, however.  If an executing code segment is restricted to being
unmodifiable, then it NEVER has to be written out to disk.

If an executing code segment is modifiable, then a slightly more complex
paging scheme needs to be used.  It can be done, but not without cost.
-- 
 _ __			NEVIN J. LIBER	..!ihnp4!ihlpf!nevin1	(312) 510-6194
' )  )				You are in a little twisting maze of
 /  / _ , __o  ____		 email paths, all different.
/  (_</_\/ <__/ / <_	These are solely MY opinions, not AT&T's, blah blah blah

henry@utzoo.uucp (Henry Spencer) (07/08/88)

> Besides, I thought that self-modifying code was (a) extremely difficult
> to write, and (b) considered poor programming practice.

Don't think of self-modifying code, which is indeed an abomination.  Think
of code that generates other code at execution time.  For example, the
fastest implementations of RasterOp generate custom-built code at run time,
and then execute it, when the rasters being manipulated are big.  Various
incremental compiler/interpreter hybrids are another obvious example --
done well, dynamic code generation can give near-compiler speeds without
sacrificing the advantages of an interpreter.

daveb@geac.UUCP (David Collier-Brown) (07/08/88)

In article <429@uwovax.uwo.ca> 16012_3045@uwovax.uwo.ca (Paul Gomme) writes:
[discussion of execute-only code segments]
>	Besides, I thought that self-modifying code was (a) extremely difficult
>to write, and (b) considered poor programming practice.

  Yes, it is and it is.

  What is really wanted here is architectural (and language) support
of the "generate and execute" paradigm, which grew out of the old
self-modifying-code techniques when
	a) people started using HLLs like lisp, and
	b) people started trying to deal with complex/unconstrained
	   problems.
  What you tend to see now is something like the "sort generators"
of the self-modifying-code era, except they're generators for all
sorts of special-purpose functions.

  The architectural support should include the ability to write code
to a logical segment, then make it executable (and normally
read-only!), as mentioned early in this thread.
  The language support ranges from a facility to call something
which one has a pointer to (C), to integrated compiler/interpreter
sets (lisp, prolog, etc)...  For obvious reasons, persons wanting to
write the latter in the former want to be able to create such
pointers, preferably without requiring language/OS changes.

 --dave ((lisp has (too many) (parentheses))) c-b
-- 
 David Collier-Brown.  {mnetor yunexus utgpu}!geac!daveb
 Geac Computers Ltd.,  | "His Majesty made you a major 
 350 Steelcase Road,   |  because he believed you would 
 Markham, Ontario.     |  know when not to obey his orders"

chase@Ozona.orc.olivetti.com (David Chase) (07/09/88)

In article <429@uwovax.uwo.ca> 16012_3045@uwovax.uwo.ca (Paul Gomme) writes:
>in order to allow for "orderly" multitasking - i.e. if one process runs
>amok, it shouldn't affect the other processes, which could occur if a program
>can alter its code segment.
>	Besides, I thought that self-modifying code was (a) extremely difficult
>to write, and (b) considered poor programming practice.

Let me try to put this discussion back on track (or shelve it
permanently).  Note the subject: "Partial application in C".

Modifying code segments is totally unrelated; what I wanted to know
was, "what machines allow or don't allow execution within the data
segment".  This is NOT self-modifying code, this is a technique for
implementing partial application.  Because I am only creating code,
then executing it (perhaps) several times, it is possible for me to

  (a) execute NOPS to clean out the instruction pipeline
or
  (b) execute cache flush instructions
or
  (c) execute a special block of some number of nops/jumps to be sure
      to touch the instruction cache line where the data might be
      stored (in the event that we are reusing some data addresses for
      a second round of code generation).
or
  (d) call the OS to modify or copy the data into the code segment

before returning from the subroutine which performs the partial
application.

The code (which is not self-modifying) is easy to write; it is
generated by a subroutine call.
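
For comparison, a sketch of the same idea on a present-day machine,
using handshake (d): build the code in a writable page, then ask the OS
to make that page executable.  Everything here is an assumption rather
than the Sun code from the original post: it targets x86-64 Linux with
the System V calling convention (first two integer arguments in rdi and
rsi), assumes a 4096-byte page, and uses the hypothetical names
make_p_app_x64 and sub2.  On any other platform the function simply
returns NULL.

```c
#include <stddef.h>
#include <string.h>
#if defined(__linux__)
#include <sys/mman.h>
#endif

typedef long (*fn2)(long, long);   /* callee: f(bound, x)        */
typedef long (*fn1)(long);         /* result: g(x) = f(bound, x) */

/* Sample callee, used only for illustration. */
long sub2(long bound, long x) { return bound - x; }

/* Returns a callable that prepends `bound` to the argument list of f,
 * or NULL if the platform is unsupported or a system call fails.
 * The page is never freed here; the original relied on a collector. */
fn1 make_p_app_x64(fn2 f, long bound)
{
#if defined(__x86_64__) && defined(__LP64__) && defined(__linux__) \
    && defined(MAP_ANONYMOUS)
    unsigned char code[25] = {
        0x48, 0x89, 0xFE,               /* mov rsi, rdi  (x -> arg 2)   */
        0x48, 0xBF, 0, 0, 0, 0, 0, 0, 0, 0, /* movabs rdi, bound        */
        0x48, 0xB8, 0, 0, 0, 0, 0, 0, 0, 0, /* movabs rax, f            */
        0xFF, 0xE0                      /* jmp rax       (tail call)    */
    };
    void *page;
    fn1 thunk;

    memcpy(code + 5, &bound, 8);        /* patch in the bound argument  */
    memcpy(code + 15, &f, 8);           /* patch in the target address  */

    page = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return NULL;
    memcpy(page, code, sizeof code);

    /* Writable data becomes read-only, executable code. */
    if (mprotect(page, 4096, PROT_READ | PROT_EXEC) != 0) {
        munmap(page, 4096);
        return NULL;
    }
    memcpy(&thunk, &page, sizeof thunk); /* avoid a data-to-code cast   */
    return thunk;
#else
    (void)f;
    (void)bound;
    return NULL;
#endif
}
```

If the call succeeds, make_p_app_x64(sub2, 10) yields an ordinary
pointer-to-function g with g(3) == sub2(10, 3).  A system that refuses
the mprotect request is one of the "totally impossible" cases, at least
by this route.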

Partial application is not poor programming practice.  Again, what I
(still) wish to know is what machines/operating systems allow me to
use this technique for implementing partial application, and which
ones require one of the magic handshakes (a,b,c,d) listed above, and
which machines make it totally impossible.

David Chase
Olivetti Research Center, Menlo Park

djones@megatest.UUCP (Dave Jones) (07/09/88)

From article <429@uwovax.uwo.ca>, by 16012_3045@uwovax.uwo.ca (Paul Gomme):
> In article <619@goofy.megatest.UUCP>, djones@megatest.UUCP (Dave Jones) writes:
>> 
>> From article <3353@cognos.UUCP>, by jimp@cognos.uucp (Jim Patterson):
...

>>> there are machines that don't allow you to execute data as
>>> code.
>> 

...

> 	Unless my memory is failing me completely, I believe that OS/2 will
> absolutely prohibit "executing data".

...

> 	Besides, I thought that self-modifying code was (a) extremely difficult
> to write, and (b) considered poor programming practice.
>


(a) "Extremely difficult" is what makes this job fun.
(b) I'm sure it is considered poor practice by many. And for most
    applications, I agree.  But that will not slow me down if it's the 
    best engineering solution available.
    
    What tends to be frustrating is to come upon a situation where you
    have been prevented from using some technique only because some
    I-know-your-job-better-than-you-do has decided to prevent anybody
    from ever using the technique.

    However, the kind of stuff I am talking about is not what is
    traditionally called "self-modifying" code.

    I'll give you an example from actual practice:

    You have a Pascal compiler's source code.   You want to provide
    a debugging tool which will compile and execute Pascal code on-the-fly,
    as it is typed in by the user at runtime at a "breakpoint".
    You must be able to call the user-defined procedures and inspect
    and modify user-defined variables.  You even want to be
    able to redefine a procedure (when you find a bug) and link in
    the new version, without recompiling the entire program or losing
    the program's context.  (I think PPI and Sabre Software have such 
    facilities for Objective C and C, respectively.)

    Solution: When the user prepares the program, "freeze" the compiler
    after it has parsed the program.  Use the "frozen" compiler to 
    compile the interactive code.  Read it into the program (as data),
    link it, and then branch to it.  What you have done is exactly what a
    normal compiler and linker would do, except that you have done it
    at runtime.

    We have been using this technique, with great success, for over five
    years now.  We've even been able to get the interactive response time
    on short code sequences down to a respectable 200 Ms or so.  
    (Works good. Lasts a long time.)


    

			-- Dave J.

ok@quintus.uucp (Richard A. O'Keefe) (07/09/88)

In article <429@uwovax.uwo.ca> 16012_3045@uwovax.uwo.ca (Paul Gomme) writes:
>	Unless my memory is failing me completely, I believe that OS/2 will
>absolutely prohibit "executing data".

The book "Inside OS/2" claims that OS/2 _does_ permit the execution of data,
and tells you the names of the system calls you need to do it.  What you do
is to assign one segment two segment numbers, one of which describes it as
code, the other of which describes it as data.  As the book points out,
such a facility is very important if you want e.g. a data base system which
compiles queries on the fly.

>	Besides, I thought that self-modifying code was (a) extremely difficult
>to write, and (b) considered poor programming practice.

Self-modifying code is not the point:  the point is programs which write
_other_ programs and call them.

smryan@garth.UUCP (Steven Ryan) (07/10/88)

Chant for the day: `Let's hear it for our guru!' `Gee! You are you!'

>Partial application is not poor programming practice.  Again, what I
>(still) wish to know is what machines/operating systems allow me to
>use this technique for implementing partial application, and which
>ones require one of the magic handshakes (a,b,c,d) listed above, and
>which machines make it totally impossible.

From personal experience, CDC 170s use a single address space for code
and data. Any data can be executed and any code can be read/written. In fact,
the IO library does this extensively to load and unload subroutines during
execution. All 6x00s, 7600, 7x, and 17x cpus guarantee the instruction stack
(cache) is flushed by an RJ (return jump--call a subroutine) instruction.

200s use a single address space and permit individual pages to be locked,
but nobody bothers. The 205 cpu has a VSB (void stack and branch) instruction
to flush the instruction stack (cache).

180s do use locked pages in a segmented address space. I don't recall seeing
any operating system calls to change the page keys.

rob@kaa.eng.ohio-state.edu (Rob Carriere) (07/10/88)

In article <644@goofy.megatest.UUCP> djones@megatest.UUCP (Dave Jones) writes:
>    [...]  We've even been able to get the interactive response time
>    on short code sequences down to a respectable 200 Ms or so.  

200 *Mega*seconds?  That's about 5 years; quite respectable indeed!
:-) :-) :-)

Rob Carriere
"Mega shall be abbreviated by ``M''; milli by ``m''; see your friendly
local SI standard."