[comp.lang.c] "Numerical Recipes in C" is nonportable code

rob@kaa.eng.ohio-state.edu (Rob Carriere) (08/28/88)

In article <13258@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
> [ still on the b = malloc( foo );  bb = b - 1; code in NumRecipes ]
>Such an implementation will ABORT ON THE COMPUTATION `b - 1',
>possibly (indeed, preferably) at compile time.  And it is legal!

So the standard says, they tell me.  It is also one of the more flagrant
violations of the Principle of Least Astonishment I've seen in a
while.  In fact, while we're at it, it would seem to violate the idea
that you give the programmer all the rope she asks for, because she
just might be needing it to pull herself out of a bog.  Gentlemen
system programmers, surely you too have algorithms that are more
accurately expressed with arrays from other than base zero?

bill@proxftl.UUCP (T. William Wells) (08/29/88)

In article <531@accelerator.eng.ohio-state.edu> rob@kaa.eng.ohio-state.edu (Rob Carriere) writes:
: In article <13258@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
: > [ still on the b = malloc( foo );  bb = b - 1; code in NumRecipes ]
: >Such an implementation will ABORT ON THE COMPUTATION `b - 1',
: >possibly (indeed, preferably) at compile time.  And it is legal!
:
: So the standard says, they tell me.  It is also one of the more flagrant
: violations of the Principle of Least Astonishment I've seen in a
: while.

Actually, on a segmented architecture I might be astonished if it
*didn't* bomb.  The principle is rather subjective I'm afraid.

:         In fact, while we're at it, it would seem to violate the idea
: that you give the programmer all the rope she asks for, because she
: just might be needing it to pull herself out of a bog.

Note that the standard does *not* say that you can't do this, it
just says that it is nonportable.  So, unless this bog is a
portable bog, she (Ugh. I prefer s/h/it for a neutered pronoun :-)
won't need a portable rope!

:                                                         Gentlemen
: system programmers, surely you too have algorithms that are more
: accurately expressed with arrays from other than base zero?

Well, actually, no.  One of the characteristics of being *very*
experienced with a language is that you tend to think of
solutions in terms of what that language most easily supplies.

Hmmmm.  Now that I think about it, I do seem to recall some Shell
sort where a zero base made the code more complex.

However, since there *is* a portable way to do this (if you don't
mind the syntax), I'll show it.

func()
{
	int     foo_array[SIZE][SIZE];
#define foo(n,m)  (foo_array[(n)-1][(m)-1])
	...
}

Ugly, but it works.  And it can be used to make the NR programs
portable.

---
Bill
novavax!proxftl!bill
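
A minimal sketch of the macro approach above, filled out so that it compiles.
The value of SIZE and the helper name fill_and_trace are made up for
illustration; only foo_array and the foo(n,m) macro come from the post.

```c
#define SIZE 3  /* assumed size, for illustration only */

/* Storage is an ordinary zero-based array; only the accessor is one-based. */
static int foo_array[SIZE][SIZE];
#define foo(n, m) (foo_array[(n) - 1][(m) - 1])

/* Hypothetical helper: fill the table using indices 1..SIZE, then return
   the sum of the diagonal.  No pointer ever leaves foo_array. */
static int fill_and_trace(void)
{
    int n, m, trace = 0;

    for (n = 1; n <= SIZE; n++)
        for (m = 1; m <= SIZE; m++)
            foo(n, m) = 10 * n + m;

    for (n = 1; n <= SIZE; n++)
        trace += foo(n, n);
    return trace;
}
```

The subtraction happens at every access, which is the cost of staying
inside the array at all times.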

henry@utzoo.uucp (Henry Spencer) (08/29/88)

In article <531@accelerator.eng.ohio-state.edu> rob@kaa.eng.ohio-state.edu (Rob Carriere) writes:
>>Such an implementation will ABORT ON THE COMPUTATION `b - 1',
>>possibly (indeed, preferably) at compile time.  And it is legal!
>
>So the standard says, they tell me.  It is also one of the more flagrant
>violations of the Principle of Least Astonishment I've seen in a
>while.  In fact, while we're at it, it would seem to violate the idea
>that you give the programmer all the rope she asks for, because she
>just might be needing it to pull herself out of a bog.  Gentlemen
>system programmers, surely you too have algorithms that are more
>accurately expressed with arrays from other than base zero?

Yes, certainly.  However, if one wants such code to be portable, one must
be careful how one computes addresses into such arrays.  The only fully
portable way to compute a[b] when you want "a" to start at subscript "s"
is a[b-s].  (a-s)[b] certainly is appealing, since it permits doing the 
subtraction once rather than every time, but it is *NOT PORTABLE*.  Thanks
primarily (but not exclusively) to Intel, it is not safe to back a pointer
up past the beginning of an array and then advance it again.  C has never
guaranteed this to work; indeed, there have always been explicit warnings
that once the pointer goes outside the array, all bets are off.  X3J11 has
legitimized pointers just past the end of an array, since this is very
common and is cheap to do, even on difficult machines, but the beginning
of an array remains an absolute barrier to portable pointers.  This is
simply a fact of life in the portability game.
-- 
Intel CPUs are not defective,  |     Henry Spencer at U of Toronto Zoology
they just act that way.        | uunet!attcan!utzoo!henry henry@zoo.toronto.edu
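
The two rules in the post above can be sketched as follows.  The function
names get_at and sum_all are hypothetical; the point is only that the base
subscript is subtracted at each access, and that a pointer one past the end
may be formed and compared against but not dereferenced.

```c
#include <stddef.h>

/* Portable: subtract the base subscript s at every access, a[b - s].
   The non-portable alternative would be to compute (a - s) once and
   index that, which forms a pointer before the start of the array. */
static double get_at(const double *a, int s, int b)
{
    return a[b - s];
}

/* Walking up to the one-past-the-end pointer is allowed; here it is
   only compared against, never dereferenced. */
static double sum_all(const double *a, size_t n)
{
    const double *p, *end = a + n;
    double total = 0.0;

    for (p = a; p != end; p++)
        total += *p;
    return total;
}
```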

bright@Data-IO.COM (Walter Bright) (08/30/88)

In article <531@accelerator.eng.ohio-state.edu> rob@kaa.eng.ohio-state.edu (Rob Carriere) writes:
<In article <13258@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
<< [ still on the b = malloc( foo );  bb = b - 1; code in NumRecipes ]
<<Such an implementation will ABORT ON THE COMPUTATION `b - 1',
<<possibly (indeed, preferably) at compile time.  And it is legal!
<So the standard says, they tell me.  It is also one of the more flagrant
<violations of the Principle of Least Astonishment I've seen in a
<while.

On a segmented architecture, like 8086's, malloc can and does return
a value that is a pointer to the beginning of a segment. That is, there
is a 16 bit selector and a 16 bit offset, the offset portion is 0 or a
very small number. Thus, subtracting a value from the pointer could result
in a segment wrap. Trouble occurs when you do things like:
	array = malloc(MAX * sizeof(array[0]));
	for (p = &array[MAX-1]; p >= &array[0]; p--)
		...
The >= will fail, because the last p-- will cause an underflow and now
p is greater than &array[MAX]! I've encountered this many times in
porting code from Unix to PCs. The correct way to write the loop is:
	for (p = &array[MAX]; p-- > &array[0]; )
or something similar.

Please, no flames about Intel's architecture. I've heard them all for years.

The best way to learn to write portable code is to be required to port
your applications to Vaxes, 68000s, and PCs. (I have all 3 on my desk!)
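
A sketch of one way to write the descending loop so that the pointer is
never moved below the first element at all.  As far as I can tell, the
`p-- > &array[0]' form above still performs one final decrement below
array[0] on exit (the result is simply never compared or used), so the
variant here tests first and decrements afterwards.  The function name
reverse_copy is made up for the example.

```c
#include <stddef.h>

/* Copy n ints from src to dst in reverse order by walking src backwards.
   p starts one past the end (valid to form) and is decremented only
   after the test has shown it is still above &src[0]. */
static void reverse_copy(int *dst, const int *src, size_t n)
{
    const int *p = src + n;

    while (p > src) {
        --p;            /* p now points at an element of src[0..n-1] */
        *dst++ = *p;
    }
}
```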

dhesi@bsu-cs.UUCP (Rahul Dhesi) (08/30/88)

In article <1673@dataio.Data-IO.COM> bright@dataio.Data-IO.COM (Walter Bright)
writes:
>The best way to learn to write portable code is to be required to port
>your applications to Vaxes, 68000s, and PCs. (I have all 3 on my desk!)

And VAX/VMS specifically.  Until you've ported to VMS you haven't
ported.  Really.
-- 
Rahul Dhesi         UUCP:  <backbones>!{iuvax,pur-ee,uunet}!bsu-cs!dhesi

pdc@otter.hple.hp.com (Damian Cugley) (08/31/88)

... if you want to index an array starting at one, but

	int	b[4], *bb = &b[-1];

and variations thereof are interdit, why not use

	int	bb[5];

Before I am flamed to death for wasting *four* *whole* *bytes* of memory,
I think I can claim exemption under the `speed-vs-space' banner.
Using a pointer as an array probably involves an extra instruction or
CPU cycle somewheres - and `#define bb(x) (b[(x)-1])' does countless
`invisible' subtractions...

pdc

rob@kaa.eng.ohio-state.edu (Rob Carriere) (08/31/88)

In article <1673@dataio.Data-IO.COM> bright@dataio.Data-IO.COM (Walter Bright)
writes:
>In article <531@accelerator.eng.ohio-state.edu> rob@kaa.eng.ohio-state.edu 
>(Rob Carriere) writes:
><In article <13258@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
><< [ still on the b = malloc( foo );  bb = b - 1; code in NumRecipes ]
>< [ claiming astonishment in accordance with the Least of.. Principle ]
>On a segmented architecture, like 8086's, malloc can and does return
>a value that is a pointer to the beginning of a segment. 
Yes, and as I've written to people who e-mailed me this argument, that
spells out ``broken compiler'' if it gives problems ('cause you can't
malloc more than 64K that way).  Of course on a machine with large
segments, the counterargument doesn't quite hold water either, so...
>[ unItelligent CPU explanation deleted ]
>	array = malloc(MAX * sizeof(array[0]));
>	for (p = &array[MAX-1]; p >= &array[0]; p--)

No!  That wasn't the problem!  (Wish it was, that'd be easy to
avoid!).

The problem is that the authors of Numerical Recipes (NR) observe,
correctly, that many numerical problems are naturally non-zero based.
This gives you the choice between carrying around boatloads of index
arithmetic (inefficient and error-prone), or making non-zero based
arrays.  They opt for the latter, in the following way:

float *my_vec;  /* this is going to be a vector */
int nl, nh;
...
my_vec = vector( nl, nh );  /* allocates a vector with lowest valid
                               index nl, and highest valid index nh
                            */
...
my_vec[3] = foo(bar);
...

Where we have:

float *vector( nl, nh )
     int nl;
     int nh;
{
    float *v;

    v = (float *)malloc( ( nh-nl +1 )* sizeof(float) );
    if( v == 0 ) nrerror( "Allocation error in vector()" );

    return v - nl;
}

This is quite a bit more disciplined than the example above; it is
also quite a bit more fundamental.  Fortunately, as far as I've checked
at least, NR only uses vectors and matrices with either 0 or unit
offset, so on broken architectures you could always do 

                  malloc( (nh   + 1 )* sizeof(float) );

    return v;

This would waste a float per vector, and a pointer-to-float plus n
floats for an n-by-something matrix.  Ugly, but it works.  (and we
*are* the throw-away culture after all :-)

Rob Carriere
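
A minimal sketch of the unit-offset workaround described above, assuming
the caller only ever needs indices 1..nh.  The name vector1 is
hypothetical and is not the NR routine; it simply allocates one extra
float and returns the pointer unshifted.

```c
#include <stdlib.h>

/* Hypothetical helper: allocate a float vector with valid indices 1..nh.
   Element 0 is allocated but unused, so the pointer that malloc returned
   is handed back unchanged and can be passed straight to free().
   Returns NULL on a negative size or on allocation failure. */
static float *vector1(int nh)
{
    if (nh < 0)
        return NULL;
    return (float *)malloc(((size_t)nh + 1) * sizeof(float));
}
```

Because no offset is applied, v[1] through v[nh] are ordinary in-bounds
accesses and no pointer below the allocation is ever computed.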

gwyn@smoke.ARPA (Doug Gwyn ) (08/31/88)

In article <531@accelerator.eng.ohio-state.edu> rob@kaa.eng.ohio-state.edu (Rob Carriere) writes:
>>Such an implementation will ABORT ON THE COMPUTATION `b - 1',
>So the standard says, they tell me.  It is also one of the more flagrant
>violations of the Principle of Least Astonishment I've seen in a
>while.

Sorry, but reality is sometimes astonishing.
That is not an X3J11 invention, just an acknowledgement of the
way the world is.  (For example, segmented architectures.)

>Gentlemen system programmers, surely you too have algorithms that are
>more accurately expressed with arrays from other than base zero?

I doubt that even lady system programmers have much trouble with
0-based arrays.

gwyn@smoke.ARPA (Doug Gwyn ) (08/31/88)

In article <547@accelerator.eng.ohio-state.edu> rob@kaa.eng.ohio-state.edu (Rob Carriere) writes:
>The problem is that the authors of Numerical Recipes (NR) observe,
>correctly, that many numerical problems are naturally non-zero based.

INcorrectly!  I've written a lot of array/matrix code in both
Fortran and C, and have found that it normally doesn't matter
and in those cases where it does matter, it doesn't matter much.

I've known mathematicians who have switched over to starting
enumerating at 0 instead of 1.  They argued that THAT was "more
natural".  One can certainly get used to either convention.

rob@raksha.eng.ohio-state.edu (Rob Carriere) (09/01/88)

In article <8400@smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>In article <547@accelerator.eng.ohio-state.edu> rob@kaa.eng.ohio-state.edu 
>(Rob Carriere) writes:
>>The problem is that the authors of Numerical Recipes (NR) observe,
>>correctly, that many numerical problems are naturally non-zero based.
                       ^^^^^^^^^^^^^^^^^^
>INcorrectly!  I've written a lot of array/matrix code in both
                                     ^^^^^^^^^^^^^^^^^
>Fortran and C, and have found that it normally doesn't matter
>and in those cases where it does matter, it doesn't matter much.

Trivial refutation time!  Surely it is obvious that ``numerical
problems'' forms a (large) superset of ``array/matrix code'' as far as
numerical analysis is concerned?

Believe it or not, but there are *many* algorithms out there where
it's either base-1 indexing or index arithmetic all over the place.
Not with your traditional LU-decomposition stuff and so on, but with
algorithms where the contents or properties of the matrix elements are
computed from the indices.

Rob Carriere

gwyn@smoke.ARPA (Doug Gwyn ) (09/01/88)

In article <554@accelerator.eng.ohio-state.edu> rob@raksha.eng.ohio-state.edu (Rob Carriere) writes:
>Trivial refutation time!  Surely it is obvious that ``numerical
>problems'' forms a (large) superset of ``array/matrix code'' as far as
>numerical analysis is concerned?

Trivial indeed!  If the code does not involve arrays/matrices,
the issue of 0-based or 1-based indexing doesn't even arise.

vkr@osupyr.mast.ohio-state.edu (Vidhyanath K. Rao) (09/03/88)

In article <8395@smoke.ARPA>, gwyn@smoke.ARPA (Doug Gwyn ) writes:
> [From way past]
    Such an implementation will ABORT ON THE COMPUTATION `b - 1',
> That is not an X3J11 invention, just an acknowledgement of the
> way the world is.  (For example, segmented architectures.)

But why should it abort? If the address is sr:0, (sr = segment register)
subtract 1 to get (sr-1):ffff [or whatever number of 'f's]. Memory
protection, it seems to me, should not notice attempts to compute addresses
but only attempts to access forbidden addresses. 

Of course, this approach levies heavy penalties on segmented architectures.
If you are using the 'small' model (in the 8088 meaning of the word),
sr:0 - 1 = sr:ffff. Now you got to worry about the model. But doesn't the
philosophy of C say 'programmer knows best'? If you want to diddle with
segmented architectures, you got to put up with headaches.

So what am I missing?
-Nath
vkr@osupyr.mast.ohio-state.edu
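
A toy model of the `sr:0 - 1 = sr:ffff' case, written in portable C
rather than for a real 8086 compiler.  It assumes pointer arithmetic that
touches only the 16-bit offset, as the posts describe; the struct and
function names are invented for the illustration.

```c
#include <stdint.h>

/* Toy model of an 8086 real-mode far pointer: 16-bit segment (selector)
   plus 16-bit offset. */
struct farptr {
    uint16_t seg;
    uint16_t off;
};

/* Subtract n bytes the cheap way: adjust only the offset, which wraps
   modulo 65536 and leaves the segment untouched. */
static struct farptr far_sub(struct farptr p, uint16_t n)
{
    p.off = (uint16_t)(p.off - n);
    return p;
}

/* Real-mode linear address: segment * 16 + offset. */
static uint32_t linear(struct farptr p)
{
    return ((uint32_t)p.seg << 4) + p.off;
}
```

With b at 2000:0000, far_sub(b, 1) gives 2000:FFFF, which lies 65535
bytes above b rather than one byte below it.  That is why a test such as
`p >= &array[0]' never becomes false in the loop quoted earlier in the
thread.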

vkr@osupyr.mast.ohio-state.edu (Vidhyanath K. Rao) (09/03/88)

In article <8400@smoke.ARPA>, gwyn@smoke.ARPA (Doug Gwyn ) writes:
> I've known mathematicians who have switched over to starting
> enumerating at 0 instead of 1.  They argued that THAT was "more
> natural".  One can certainly get used to either convention.

A mathematician is one who starts counting at 0 :-)
Historically, people were suspicious of 'nothing' which is why 0 was not
a number by itself (as opposed to being used in place value notation) till
about 6th century A.D.
As far as indexing goes where one starts makes a difference in terms of
typography :-) More seriously, one may have several things to be indexed,
over a big range (-infinity to infinity even) and each thing is indexed
over some subrange not starting at 0. Changing every origin to 0 is
painful and likely to lead to bugs. Ideally this must be fixed up at the
preprocessor level rather than at the code level. Anybody want to write these
macros?
-Nath
vkr@osupyr.mast.ohio-state.edu
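
One possible shape for such macros, offered as a sketch only.
RANGED_ARRAY and AT are hypothetical names; the lower bound is recorded in
an enum constant next to the storage, and every access subtracts it, so
each array can have its own origin (negative ones included) without any
pointer leaving its array.

```c
/* Hypothetical macro: declare storage for indices lo..hi and remember lo. */
#define RANGED_ARRAY(type, name, lo, hi)            \
    static type name##_store[(hi) - (lo) + 1];      \
    enum { name##_lo = (lo) }

/* Hypothetical macro: element i of a ranged array; usable as an lvalue. */
#define AT(name, i) (name##_store[(i) - name##_lo])

RANGED_ARRAY(int, coeff, -2, 2);      /* valid indices -2..2 */
RANGED_ARRAY(int, year, 1985, 1988);  /* valid indices 1985..1988 */

/* Fill coeff[i] = i*i over its own range and return the total. */
static int sum_coeff(void)
{
    int i, total = 0;

    for (i = -2; i <= 2; i++)
        AT(coeff, i) = i * i;
    for (i = -2; i <= 2; i++)
        total += AT(coeff, i);
    return total;
}
```

The macros do no bounds checking; an index outside lo..hi is as
undefined as it would be with a plain array.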

daveb@geac.UUCP (David Collier-Brown) (09/04/88)

> In article <8395@smoke.ARPA>, gwyn@smoke.ARPA (Doug Gwyn ) writes:
>> [From way past]
>     Such an implementation will ABORT ON THE COMPUTATION `b - 1',
>> That is not an X3J11 invention, just an acknowledgement of the
>> way the world is.  (For example, segmented architectures.)

From article <867@osupyr.mast.ohio-state.edu>, by vkr@osupyr.mast.ohio-state.edu (Vidhyanath K. Rao):
> But why should it abort? If the address is sr:0, (sr = segment register)
> subtract 1 to get (sr-1):ffff [or whatever number of 'f's]. Memory
> protection, it seems to me, should not notice attempts to compute addresses
> but only attempts to access forbidden addresses. 

  Regrettably, some architectures prohibit this: (sr-1):ffff may
mean <undefined segment>:ffff, and the loading of the selector into a
selector register will cause a fault. The basic idea here is that
the operating system pre-fetches a page or segment on being informed
that the program is "about" to need it, as indicated by loading its
selector into a distinguished register.
  This behavior is possible on the Honeywell DPS-6[1], and certainly
on an Intel machine running a non-DOS operating system.

--dave (@lethe) c-b
[1] I think the compiler writers watch out for this happening, but
    I do know that it makes compiler- & debugger-writing **difficult**.
    Anyone from SDG want to comment?
-- 
 David Collier-Brown.  | yunexus!lethe!dave
 78 Hillcrest Ave,.    | He's so smart he's dumb.
 Willowdale, Ontario.  |        --Joyce C-B

dricej@drilex.UUCP (Craig Jackson) (09/05/88)

In article <867@osupyr.mast.ohio-state.edu> vkr@osupyr.mast.ohio-state.edu (Vidhyanath K. Rao) writes:
>In article <8395@smoke.ARPA>, gwyn@smoke.ARPA (Doug Gwyn ) writes:
>> [From way past]
>    Such an implementation will ABORT ON THE COMPUTATION `b - 1',
>> That is not an X3J11 invention, just an acknowledgement of the
>> way the world is.  (For example, segmented architectures.)
>
>But why should it abort? If the address is sr:0, (sr = segment register)
>subtract 1 to get (sr-1):ffff [or whatever number of 'f's]. Memory
>protection, it seems to me, should not notice attempts to compute addresses
>but only attempts to access forbidden addresses. 

There exist machines whose protection philosophy is to prevent you from
even thinking something illegal.  In particular, on the Unisys A-series,
the compiler must implement all memory addressing protection--there is
no kernel/user state protection on memory.*  A program cannot be allowed
to form an invalid address, as there is nothing to stop it from using it,
and nothing in the hardware to stop you from stomping on another user
if you do.  Therefore, the compiler and the operating system would be
written so as to cause an interrupt if computing 'b - 1' were attempted.
The ANSI rules were written to allow C to be implemented on such an
architecture.

Note that there is no C compiler for the A-series today, although one is
rumored.  The rumors say that arrays and pointers will not be implemented
this way, however.  In order to get around some other problems, and to allow
more old programs to run, a linear-address space machine will be simulated,
using a large array.  (Arrays are hardware concepts on the A-series.)

>Of course, this approach levies heavy penalties on segmented architectures.

On some architectures, it may be an infinite penalty--C could not be
implemented.  Or maybe only by simulating a more PDP-11-like machine
(as discussed above).

>If you are using the 'small' model (in the 8088 meaning of the word), 
>sr:0 - 1 = sr:ffff. Now you got to worry about the model. But doesn't the
>philosophy of C say 'programmer knows best'. If you want to diddle with
>segmented architectures, you got to put up with headaches.

You sometimes have to, in order to get some benefits (like having your
OS written in a really high-level language, with no assembler, etc.)

>So what am I missing?

A broad education in the corners of the computer architecture world.

>-Nath
>vkr@osupyr.mast.ohio-state.edu

* Note that putting the protection in the compiler was also an idea
of Per Brinch-Hansen's in the 1970s, with Concurrent Pascal.  Burroughs
had been doing it for many years, even then.
-- 
Craig Jackson
UUCP: {harvard!axiom,linus!axiom,ll-xn}!drilex!dricej
BIX:  cjackson

chris@mimsy.UUCP (Chris Torek) (09/06/88)

[me, paraphrased by me:]
>An implementation may ABORT ON THE COMPUTATION of an illegal address.

In article <867@osupyr.mast.ohio-state.edu> vkr@osupyr.mast.ohio-state.edu
(Vidhyanath K. Rao) asks:
>But why should it abort? If the address is sr:0, (sr = segment register)
>subtract 1 to get (sr-1):ffff [or whatever number of 'f's].

On many machines, addresses are unsigned numbers.  The domain and range
of an unsigned 16-bit number is 0..65535.  What is the (mathematical)
result of 0 - 1?  Answer: -1.  Is it in range?  No.  So what happens?
Integer underflow, which on many machines is a trap.

You can even do this on a VAX, although there you must first enable the
trap (use bispsw or set the appropriate flag in the subroutine entry
mask), and then it only fires on integer computations outside the range
-2 147 483 648..2 147 483 647; so if (for instance) you were to write

	main()
	{
		char *p;

		p = (char *)0x7fffffff;
		asm("bispsw	$0x20");	/* PSL_IV */
		p++;
	}

This program, when run, aborts with a `floating exception' (SIGFPE).
It would be legal for the C compiler to set IV in the entry point
of each subroutine, although it would probably break too much code
that expects integer overflow/underflow to be ignored, and the code
that does C's `unsigned' arithmetic would have to turn it off temporarily.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris
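
The range argument above can be sketched in portable C.  C's own
`unsigned' arithmetic is defined to wrap modulo 2^N, which is why the
post says a trapping implementation would have to disable the trap for
it.  The second function is a hypothetical checked subtraction that
reports the out-of-range case instead of wrapping, roughly what a
trapping machine does in hardware.

```c
#include <stdint.h>

/* C defines unsigned arithmetic modulo 2^N, so this wraps, never traps. */
static uint16_t wrapping_sub(uint16_t a, uint16_t b)
{
    return (uint16_t)(a - b);
}

/* Hypothetical checked variant: returns 1 and stores a - b when the
   mathematical result lies in 0..65535, otherwise returns 0 and leaves
   *out untouched. */
static int checked_sub(uint16_t a, uint16_t b, uint16_t *out)
{
    if (b > a)
        return 0;       /* mathematical result would be negative */
    *out = (uint16_t)(a - b);
    return 1;
}
```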

peter@ficc.uu.net (Peter da Silva) (09/08/88)

In article <640@drilex.UUCP>, dricej@drilex.UUCP (Craig Jackson) writes:
> * Note that putting the protection in the compiler was also an idea
> of Per Brinch-Hansen's in the 1970s, with Concurrent Pascal.  Burroughs
> had been doing it for many years, even then.

What's to stop you from doing the following:

	Generate code in an array.
	Jump to the beginning of the array. *

Now you've blown the protection. You can do anything. I hope this isn't a
multiuser machine...

* this may involve such things as passing a pointer to an array to a
function that's declared that argument as a pointer to a function, or
even by writing the array out as a file and executing it... I can't see
how you could write a valid 'C' compiler that wouldn't let you violate
this protection.
-- 
Peter da Silva  `-_-'  Ferranti International Controls Corporation.
"Have you hugged  U  your wolf today?"            peter@ficc.uu.net

peter@ficc.uu.net (Peter da Silva) (09/08/88)

In article <3200@geac.UUCP>, daveb@geac.UUCP (David Collier-Brown) writes:
>   Regrettably, some architectures prohibit this: (sr-1):ffff may
> mean <undefined segment>:ffff, and the loading of the selector into a
> selector register will cause a fault.

But nobody says you have to load the selector into a selector register
just to compute an address. Why should the address calculation hardware
be involved at all?
-- 
Peter da Silva  `-_-'  Ferranti International Controls Corporation.
"Have you hugged  U  your wolf today?"            peter@ficc.uu.net

tanner@cdis-1.uucp (Dr. T. Andrews) (09/08/88)

In article <13402@mimsy.UUCP>, chris@mimsy.UUCP (Chris Torek) writes:
) 	main()
) 	{
) 		char *p;
) 
) 		p = (char *)0x7fffffff;
) 		asm("bispsw	$0x20");	/* PSL_IV */
) 		p++;
) 	}
) This program, when run, aborts with a `floating exception' (SIGFPE).

I may not be the first one to cast a stone at this example, but have
you considered the possibility that a floating point exception is
manifestly the *wrong* thing to do in your example?  There is no
floating-point math in there.  Complain to your vendor.
-- 
...!bikini.cis.ufl.edu!ki4pv!cdis-1!tanner  ...!bpa!cdin-1!cdis-1!tanner
or...  {allegra killer gatech!uflorida decvax!ucf-cs}!ki4pv!cdis-1!tanner

chris@mimsy.UUCP (Chris Torek) (09/09/88)

-In article <640@drilex.UUCP> dricej@drilex.UUCP (Craig Jackson) writes:
->* Note that putting the protection in the compiler was also an idea
->of Per Brinch-Hansen's in the 1970s, with Concurrent Pascal.  Burroughs
->had been doing it for many years, even then.

In article <1429@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:
-What's to stop you from doing the following:
-
-	Generate code in an array.
-	Jump to the beginning of the array. *

Whenever the compiler is forced to generate `iffy' code, it also generates
tests such as tags to make sure that you do not do something like this.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

gwyn@smoke.ARPA (Doug Gwyn ) (09/09/88)

In article <1429@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:
>What's to stop you from doing the following:
>	Generate code in an array.
>	Jump to the beginning of the array. *
>... I can't see how you could write a valid 'C' compiler that wouldn't
>let you violate this protection.

That's simple.  All the compiler has to do is detect any attempt to
use a data object as a function.  The only way to even attempt this in
standard C is via an explicit cast to a function pointer somewhere,
which is where the compiler would enforce the constraint.

peter@ficc.uu.net (Peter da Silva) (09/10/88)

In article <13454@mimsy.UUCP>, chris@mimsy.UUCP (Chris Torek) writes:
> -In article <640@drilex.UUCP> dricej@drilex.UUCP (Craig Jackson) writes
   about Burroughs putting protection in the compiler...

> In article <1429@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:
> -What's to stop you from doing the following:
> -
> -	Generate code in an array.
> -	Jump to the beginning of the array. *

Chris Torek noted:

> Whenever the compiler is forced to generate `iffy' code, it also generates
> tests such as tags to make sure that you do not do something like this.

So what's to stop me from writing out a load module and subverting
the protection mechanism, as I noted in my (deleted) footnote? I would
think that the perversions necessary to make 'C' safe to run on this machine
would make it sufficiently useless that a little thing like calculating
a pointer to a position before the beginning of an array is a minor
detail...

That is to say, yes... this construct is non-portable. But only to machines
you would have severe problems porting to in the first place.
-- 
Peter da Silva  `-_-'  Ferranti International Controls Corporation.
"Have you hugged  U  your wolf today?"            peter@ficc.uu.net

blarson@skat.usc.edu (Bob Larson) (09/11/88)

In article <1450@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:
>In article <13454@mimsy.UUCP>, chris@mimsy.UUCP (Chris Torek) writes:
>> In article <1429@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:
>> -What's to stop you from doing the following:
>> -	Generate code in an array.
>> -	Jump to the beginning of the array. *

Decent memory protection.  (There are those of us who believe that
executable and writable memory should be mutually exclusive.  (with a
provision to change from one to the other.))

>So what's to stop me from writing out a load module and subverting
>the protection mechanism, as I noted in my (deleted) footnote?

The same type of protection mechanism that makes it impossible
(or hopefully at least difficult) to alter other users' files.
Writing out executable files may be considered a privileged
function reserved to compilers.

(Please note I am not saying that I think that compilers are the proper
place to enforce system security, just that portably written code shouldn't
have undue hardship running on such a machine.)
-- 
Bob Larson	Arpa: Blarson@Ecla.Usc.Edu	blarson@skat.usc.edu
Uucp: {sdcrdcf,cit-vax}!oberon!skat!blarson
Prime mailing list:	info-prime-request%ais1@ecla.usc.edu
			oberon!ais1!info-prime-request

dricej@drilex.UUCP (Craig Jackson) (09/11/88)

In article <1429@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:
>In article <640@drilex.UUCP>, dricej@drilex.UUCP (Craig Jackson) writes:
>> * Note that putting the protection in the compiler was also an idea
>> of Per Brinch-Hansen's in the 1970s, with Concurrent Pascal.  Burroughs
>> had been doing it for many years, even then.
>
>What's to stop you from doing the following:
>
>	Generate code in an array.
>	Jump to the beginning of the array. *
>
>Now you've blown the protection. You can do anything. I hope this isn't a
>multiuser machine...

Two things stop this:

1. There's no way to 'say it'; see below.

2. There is a tag field on each word of memory.  Data has a tag of 0 or 2;
code has a tag of 3.  It is the responsibility of the compiler to make sure
that a user program cannot set its own tags.  Only the operator can turn
a program into a compiler, and only a compiler can create an object program.

(There are, of course, holes for people with super-user-like privileges.
Just like Unix.)

>* this may involve such things as passing a pointer to an array to a
>function that's declared that argument as a pointer to a function, or
>even by writing the array out as a file and executing it... I can't see
>how you could write a valid 'C' compiler that wouldn't let you violate
>this protection.

Another feature of this system is a type-checking linker.  All functions
must agree in number of arguments and type of arguments with their calls.
The linker, called the binder on the A-series, enforces this.  (This makes
varargs be a pain in the behind, BTW.  One reason why A-series C most likely
will not fully use the hardware, and therefore be a slow, undesirable
language.  Much like their PL/I.)

>Peter da Silva  `-_-'  Ferranti International Controls Corporation.

-- 
Craig Jackson
UUCP: {harvard!axiom,linus!axiom,ll-xn}!drilex!dricej
BIX:  cjackson

sho@pur-phy (Sho Kuwamoto) (09/11/88)

I *want* to be able to create an array and jump to it.  I do this all
the time.  Granted, I do this on a micro (a Mac) so first of all, it's
just more feasible, and second of all, there's no sophisticated memory
management (or for that matter, not nearly as much need to worry about
crashing the system) but still, I think it's a bit severe to say that
such a thing should never be done.  Maybe it would be OK if the
compiler gave you the option of explicitly coercing some piece of data
into becoming code.

						-Sho

mtr@eagle.ukc.ac.uk (M.T.Russell) (09/15/88)

In article <8470@smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>In article <1429@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:
>>What's to stop you from doing the following:
>>	Generate code in an array.
>>	Jump to the beginning of the array. *
>>... I can't see how you could write a valid 'C' compiler that wouldn't
>>let you violate this protection.
>
>That's simple.  All the compiler has to do is detect any attempt to
>use a data object as a function.  The only way to even attempt this in
>standard C is via an explicit cast to a function pointer somewhere,
>which is where the compiler would enforce the constraint.

There is another way to treat a data object as a function:

	union foo {
		char *data;
		int (*func)();
	};

The compiler would either have to prohibit unions with both text and
data pointers or do runtime bookkeeping to remember what was last
stored in such unions.

Mark Russell
mtr@ukc.ac.uk
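
A sketch of the `runtime bookkeeping' option, done by hand rather than by
a compiler.  The tag, the struct name, and the helper functions are all
invented for the example; the union members are the ones from the post,
with the function pointer given a (void) prototype.

```c
/* Hypothetical tag recording which union member was stored last. */
enum ptr_kind { HOLDS_DATA, HOLDS_FUNC };

struct tagged_ptr {
    enum ptr_kind kind;
    union {
        char *data;
        int (*func)(void);
    } u;
};

static void set_data(struct tagged_ptr *t, char *d)
{
    t->kind = HOLDS_DATA;
    t->u.data = d;
}

static void set_func(struct tagged_ptr *t, int (*f)(void))
{
    t->kind = HOLDS_FUNC;
    t->u.func = f;
}

/* Call through the union only if a function pointer was the last thing
   stored.  Returns 1 and fills *result on success, 0 otherwise. */
static int call_if_func(const struct tagged_ptr *t, int *result)
{
    if (t->kind != HOLDS_FUNC)
        return 0;
    *result = t->u.func();
    return 1;
}

/* Trivial function used to exercise the wrapper. */
static int forty_two(void)
{
    return 42;
}
```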

chip@ateng.uucp (Chip Salzenberg) (09/16/88)

According to peter@ficc.uu.net (Peter da Silva):
>But nobody says you have to load the selector into a selector register
>just to compute an address.

More to the point:  The dpANS says you (the implementor) are _allowed_ to
load the selector into a selector register when computing the address.  To
do otherwise could slow down register-intensive pointer manipulation.

-- 
Chip Salzenberg                <chip@ateng.uu.net> or <uunet!ateng!chip>
A T Engineering                My employer may or may not agree with me.
	  The urgent leaves no time for the important.

news@ism780c.isc.com (News system) (09/17/88)

In article <1988Sep15.145026.20325@ateng.uucp> chip@ateng.UUCP (Chip Salzenberg) writes:
>According to peter@ficc.uu.net (Peter da Silva):
>>But nobody says you have to load the selector into a selector register
>>just to compute an address.
>
>More to the point:  The dpANS says you (the implementor) are _allowed_ to
>load the selector into a selector register when computing the address.  To
>do otherwise could slow down register-intensive pointer manipulation.
>

But consider what might have happened had dpANS mandated that the computation
of a pointer to x[-1] be a valid operation.  Then machines for which the
mandated behavior is slow would not be used by people interested in high
performance.  The net effect could be salubrious for the computer industry in
the long run.

   Marv Rubinstein

peter@ficc.uu.net (Peter da Silva) (09/17/88)

In article <1988Sep15.145026.20325@ateng.uucp>, chip@ateng.uucp (Chip Salzenberg) writes:
> According to peter@ficc.uu.net (Peter da Silva):
> >But nobody says you have to load the selector into a selector register
> >just to compute an address.

> More to the point:  The dpANS says you (the implementor) are _allowed_ to
> load the selector into a selector register when computing the address.  To
> do otherwise could slow down register-intensive pointer manipulation.

OK, then, I withdraw my objection to the original message. Since
there is no portable method of declaring non-zero-based arrays in 'C',
and since the code generation task for using such is trivial, they should
be added.
-- 
Peter da Silva  `-_-'  Ferranti International Controls Corporation.
"Have you hugged  U  your wolf today?"            peter@ficc.uu.net

gwyn@smoke.ARPA (Doug Gwyn) (09/18/88)

In article <16041@ism780c.isc.com> marv@ism780.UUCP (Marvin Rubenstein) writes:
-But consider what might have happened had dpANS mandated that the computation
-of a pointer to x[-1] be a valid operation.  Then machines for which the
-mandated behavior is slow would not be used by people interested in high
-performance.  The net effect could be salubrious for the computer industry in
-the long run.

I doubt that any effect on the computer industry would have occurred
other than reduced adherence to the postulated C standard.  People
writing portable applications would still not be able to compute
&array[-1], since several compilers would ignore that requirement
(benchmark speed is a far greater driving factor than the desires of
a few sloppy programmers to compute non-existent addresses).  What
good would that situation accomplish?  Better that the standard be
widely followed and that programmers become better educated about
actual portability considerations, than to encourage false hopes for
availability of features that are difficult or detrimental to provide.
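
A minimal sketch of one portable alternative to the `bb = b - 1` idiom,
assuming the goal is simply indices 1..n: allocate one extra element and
leave slot 0 unused, so no address outside the allocated object is ever
computed.  The names alloc_1based and sum_1based are hypothetical; they
come from neither Numerical Recipes nor the dpANS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: returns storage usable as v[1]..v[n].
   Slot v[0] is allocated but unused, so the returned pointer is the
   start of the object and no out-of-range address is formed.
   Returns NULL on allocation failure or if the size would overflow. */
double *alloc_1based(size_t n)
{
    if (n >= (size_t)-1 / sizeof(double))
        return NULL;
    return calloc(n + 1, sizeof(double));
}

/* Sum elements 1..n using the natural one-based loop bounds. */
double sum_1based(const double *v, size_t n)
{
    size_t i;
    double sum = 0.0;

    assert(v != NULL);
    for (i = 1; i <= n; i++)
        sum += v[i];
    return sum;
}
```

A caller would fill v[1] through v[n], pass v to sum_1based, and release
it with free(v).  The cost is one wasted element per array.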

henry@utzoo.uucp (Henry Spencer) (09/18/88)

In article <5514@eagle.ukc.ac.uk> mtr@arthur.UUCP (M.T.Russell) writes:
>	union foo {
>		char *data;
>		int (*func)();
>	};
>
>The compiler would either have to prohibit unions with both text and
>data pointers or do runtime bookkeeping to remember what was last
>stored in such unions.

No, it is sufficient if any attempt to use this trick malfunctions badly.
(For example, if the two kinds of pointers are not the same size, it is
almost guaranteed to.)
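
A minimal sketch of why the trick is fragile, assuming a hosted C
compiler: nothing requires a data pointer and a function pointer to have
the same size, and in some 8086 memory models they do not.  The helper
names pointer_sizes_match and foo_size are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* The union from the quoted article, with a prototype added to func. */
union foo {
    char *data;
    int (*func)(void);
};

/* Returns 1 if the two pointer kinds happen to have the same size on
   this implementation, 0 otherwise.  Equal size still does not make it
   valid to store one member and read the other. */
int pointer_sizes_match(void)
{
    return sizeof(char *) == sizeof(int (*)(void));
}

/* A union is at least as large as its largest member. */
size_t foo_size(void)
{
    assert(sizeof(union foo) >= sizeof(char *));
    return sizeof(union foo);
}
```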
-- 
NASA is into artificial        |     Henry Spencer at U of Toronto Zoology
stupidity.  - Jerry Pournelle  | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

henry@utzoo.uucp (Henry Spencer) (09/18/88)

In article <16041@ism780c.isc.com> marv@ism780.UUCP (Marvin Rubenstein) writes:
>But consider what might have happened had dpANS mandated that the computation
>of a pointer to x[-1] be a valid operation.  Then machines for which the
>mandated behavior is slow would not be used by people interested in high
>performance.  The net effect could be salubrious for the computer industry in
>the long run.

No.  A much more probable result would be widespread rejection of the C
standard, making things worse than before.  ANSI does not have the power
to legislate conformance to standards -- that has to be voluntary.  If
too many manufacturers, especially big ones, decline to conform to a
standard, it falls into disuse and is forgotten.  Let us not forget that
the machine whose segmented architecture causes the biggest headaches for
pointer trickery is also the biggest-selling computer of all time.  To get
a standard accepted (by the world, not just by ANSI), it is necessary --
distasteful, but necessary -- to restrain desires for social engineering,
and produce something that will work even on systems one does not like.
-- 
NASA is into artificial        |     Henry Spencer at U of Toronto Zoology
stupidity.  - Jerry Pournelle  | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

atbowler@watmath.waterloo.edu (Alan T. Bowler [SDG]) (09/23/88)

In article <531@accelerator.eng.ohio-state.edu> rob@kaa.eng.ohio-state.edu (Rob Carriere) writes:
>Gentlemen system programmers, surely you too have algorithms that are
>more accurately expressed with arrays from other than base zero?

I feel like the world has gone through some strange warp.  Back when I
was studying numerical analysis the complaint from the mathematicians
and numerical analysts was about how awkward it was to code algorithms
in Fortran-IV because it used origin 1 indexing and origin 0 would
clearly have been so much more "natural".

peter@ficc.uu.net (Peter da Silva) (09/24/88)

In article <21058@watmath.waterloo.edu>, atbowler@watmath.waterloo.edu (Alan T. Bowler [SDG]) writes:
> In article <531@accelerator.eng.ohio-state.edu> rob@kaa.eng.ohio-state.edu (Rob Carriere) writes:
> >Gentlemen system programmers, surely you too have algorithms that are
> >more accurately expressed with arrays from other than base zero?

  [ complaints from programmers ]
> and numerical analysts was about how awkward it was to code algorithms
> in Fortran-IV because it used origin 1 indexing and origin 0 would
> clearly have been so much more "natural".

In most cases 0 is more natural. For some cases 1 is more natural. For other
cases -63 might be more natural, and for others 7 might be the best base.

Fortran now allows these other bases (we use a lot of 0-based arrays here).
'C' doesn't. There is some question whether it should.
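
A minimal sketch of what an arbitrary base amounts to in today's C,
assuming the programmer does by hand what a compiler would do: subtract
the lower bound on every access.  The type ranged and the function
ranged_at are hypothetical, not a proposed language feature.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical array of doubles indexed lo..hi inclusive.
   data points at the element for index lo, so every address formed
   stays inside the underlying storage. */
typedef struct {
    int     lo;
    int     hi;
    double *data;
} ranged;

/* Map a user index in lo..hi to the zero-based element it names. */
double *ranged_at(const ranged *a, int i)
{
    assert(a != NULL && a->data != NULL);
    assert(i >= a->lo && i <= a->hi);
    return &a->data[i - a->lo];
}
```

With lo = -63 and hi = 7 the underlying storage needs 71 elements;
index -63 lands on element 0 and index 7 on element 70.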
-- 
Peter da Silva  `-_-'  Ferranti International Controls Corporation.
"Have you hugged  U  your wolf today?"            peter@ficc.uu.net