[comp.sys.ibm.pc] Turbo C far pointers

elrond@titan.tsd.arlut.utexas.edu (Brad Hlista) (10/28/89)

I am having a problem of getting an allocated block of far memory
to return float values.  Here is how variables are declared and dereferenced:

far *ptr;

main()
{
int i;

	ptr=(float *) farmalloc(100000);

	f(i=0;i<1000;i++)
		*(ptr+i)=3.1415927;

	for(i=0;i<10;i++)
		printf(" %f", (float) *(ptr+i) ) ;  /* is yielding 3.00000 */
}

Can someone please help me understand what is going on?
Thanks.

Brad
elrond@titan.tsd.arlut.utexas.edu

jwright@atanasoff.cs.iastate.edu (Jim Wright) (10/29/89)

In a recent posting elrond@titan.tsd.arlut.utexas.edu (Brad Hlista) writes:
| I am having a problem of getting an allocated block of far memory
| to return float values.  Here is how variables are declared and dereferenced:

| far *ptr;

I think this is the big part.  An int pointer is not the same as a float
pointer.  The following program works for me (MSC).

#include <stdio.h>
#include <malloc.h>
int main(void);

int main()
{
    int i;
    float far *start;
    float far *p;

    start = (float far *) _fmalloc(1000*sizeof(float));

    if (start != NULL) {
        printf("Good.\n");
        for (p=start,i=0 ; i<1000 ; i++)
            *(p++) = 3.1415927;
        for (p=start,i=0 ; i<10 ; i++)
            printf("%f\n", *(p++));
    }
    else
        printf("Oh shit.\n");
    return(0);
}

-- 
Jim Wright
jwright@atanasoff.cs.iastate.edu

bdb@becker.UUCP (Bruce Becker) (10/29/89)

In article <565@titan.tsd.arlut.utexas.edu> elrond@titan.tsd.arlut.utexas.edu (Brad Hlista) writes:
|I am having a problem of getting an allocated block of far memory
|to return float values.  Here is how variables are declared and dereferenced:
|
|far *ptr;
|
|main()
|{
|int i;
|
|	ptr=(float *) farmalloc(100000);
|
|	f(i=0;i<1000;i++)
|		*(ptr+i)=3.1415927;

Should be:	*((float *)(ptr+i)) = 3.1415927;

	Otherwise the "far" declaration of "ptr"
	will force a type conversion to the default
	of type "int". The way you have it,
	"*(ptr+i)" contains an int value, which is
	coerced back to float in the printf
	statement.

|	for(i=0;i<10;i++)
|		printf(" %f", (float) *(ptr+i) ) ;  /* is yielding 3.00000 */

Should be	printf(" %f", *((float *)(ptr+i)) ) ;  /* is yielding 3.1415927 */

|}
|
|Can someone please help me understand what is going on?
|Thanks.

	You need to ensure, either by default
	or by explicit declaration, that
	sizeof(far) == sizeof(float); is there a
	"far long" type? Better yet, a "far float"?

	I didn't actually try this so if your compiler
	barfs on "*((float *)ptr)" where "ptr" is a
	"far *" declaration, well... you could try
	"far float * ptr;", etc...

Cheers,
-- 
  .::.	 Bruce Becker	Toronto, Ont.
w \@@/	 Internet: bdb@becker.UUCP, bruce@gpu.utcs.toronto.edu
 `/c/-e	 BitNet:   BECKER@HUMBER.BITNET
_/  \_	 In the future the term "wife" will have no gender significance

CMH117@PSUVM.BITNET (Charles Hannum) (10/29/89)

Two suggestions:

  1)  Your main problem is that you need to use the declaration:

      far float *ptr;

      I won't even try to explain what happens otherwise.

  2)  You really should use "ptr[i]" rather than "*(ptr+i)".  It effectively
      does the same thing, but it makes for much cleaner and much more
      readable code.

hp0p+@andrew.cmu.edu (Hokkun Pang) (10/31/89)

>  2)  You really should use "ptr[i]" rather than "*(ptr+i)".  It effectively
>      does the same thing, but it makes for much cleaner and much more
>      readable code.

I read from a book that "ptr[i]" will be converted to "*(ptr+i)" by the
compiler, so the "*(ptr+1)" is faster than "ptr[1]". Is that right?

spolsky-joel@CS.YALE.EDU (Joel Spolsky) (10/31/89)

In article <wZHEDrW00VoH479lIM@andrew.cmu.edu> hp0p+@andrew.cmu.edu (Hokkun Pang) writes:
>>  2)  You really should use "ptr[i]" rather than "*(ptr+i)".  It effectively
>>      does the same thing, but it makes for much cleaner and much more
>>      readable code.
>
>I read from a book that "ptr[i]" will be converted to "*(ptr+i)" by the
>compiler, so the "*(ptr+1)" is faster than "ptr[1]". Is that right?

That's ridiculous. *(A+1), A[1], and 1[A] all produce exactly the same
code. There is no reason to believe that the compiler prefers one over
the other, or that the compiler implements A[1] by first expanding
that to *(A+1). For all you know it does the opposite. 

And even if it did make a difference, it would be so small as to be
imperceptible even on the world's slowest C compiler running on an
HP-41C.

+----------------+----------------------------------------------------------+
|  Joel Spolsky  | bitnet: spolsky@yalecs.bitnet     uucp: ...!yale!spolsky |
|                | internet: spolsky@cs.yale.edu     voicenet: 203-436-1538 |
+----------------+----------------------------------------------------------+
                                                      #include <disclaimer.h>

levitte@garbo.bion.kth.se (Tommy Levitte) (10/31/89)

In article <990@becker.UUCP> bdb@becker.UUCP (Bruce Becker) writes:

Bruce> In article <565@titan.tsd.arlut.utexas.edu> elrond@titan.tsd.arlut.utexas.edu (Brad Hlista) writes:

Bruce> |far *ptr;

Bruce> |main()

[... unimportant things deleted ...]

Bruce> |	f(i=0;i<1000;i++)
Bruce> |		*(ptr+i)=3.1415927;

Bruce> Should be:	*((float *)(ptr+i)) = 3.1415927;

(* GAACK *) This means you should use the i'th INTEGER location, not float!
Wow. The effect when printing out would be quite a sight...

No, the correct syntax would be : *((float *)ptr + i)

This way, you tell the compiler that ptr really is a float pointer, then
offset the whole thing with i.

In your solution, the offset would be 2*i bytes (2 = sizeof(int)). In mine,
the offset would be 4*i bytes (4=sizeof(float)) !!!!

Bruce> |	for(i=0;i<10;i++)
Bruce> |		printf(" %f", (float) *(ptr+i) ) ;  /* is yielding 3.00000 */

Bruce> Should be	printf(" %f", *((float *)(ptr+i)) ) ;  /* is yielding 3.1415927 */

Likewise here. Use *((float *)ptr + i) instead!

Bruce> 	sizeof(far) == sizeof(float); is there a
Bruce> 	"far long" type? Better yet, a "far float"?

Wow. sizeof(far)! Never seen that before. This would mean 'far int', which is
not (as far as I know) permitted in TC.
far is only applied to pointers and functions, to tell the compiler you need a
4-byte address to get at those objects. Thus, far is a modifier, not a type.
Now, about int. int is the same as short in TC, a 2-byte integer. float is a
4-byte float.

One IMPORTANT rule when you program in C: NEVER assume anything about the size
of different types. NEVER !!!!!

Bruce> 	I didn't actually try this so if your compiler
Bruce> 	barfs on "*((float *)ptr)" where "ptr" is a
Bruce> 	"far *" declaration, well... you could try
Bruce> 	"far float * ptr;", etc...

Quite surprising. I just tried it with Turbo C v 2.0, writing
*((float *)ptr+i), and it worked just fine. Writing *((float *)(ptr+i)) gave
me 0.000000 when I tried it...


--
Tommy Levitte 
	gizmo@nada.kth.se
	gizmo@kicki.stacken.kth.se
	gizmo@ttt.kth.se

hollen@eta.megatek.uucp (Dion Hollenbeck) (11/01/89)

From article <990@becker.UUCP>, by bdb@becker.UUCP (Bruce Becker):
> In article <565@titan.tsd.arlut.utexas.edu> elrond@titan.tsd.arlut.utexas.edu (Brad Hlista) writes:
> |I am having a problem of getting an allocated block of far memory
> |to return float values.  Here is how variables are declared and dereferenced:
> |
> |far *ptr;
> |
> |main()
> |{
> |int i;
> |
> |	ptr=(float *) farmalloc(100000);
> |
> |	f(i=0;i<1000;i++)
> |		*(ptr+i)=3.1415927;
> 
> Should be:	*((float *)(ptr+i)) = 3.1415927;
> 
> 	Otherwise the "far" declaration of "ptr"
> 	will force a type conversion to the default
> 	of type "int".

If the pointer is declared to be of type "float far *", then the
expression "ptr + i" means "the address in ptr plus i * sizeof(float)
bytes", even though i is an int.  If this does not occur, then the compiler
is broken.  The program could be written thus:

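#include <alloc.h>	/* Turbo C header that declares farmalloc() */

main()
{
	float far *ptr;
	int i;

	/* cast to the same far pointer type that ptr is declared with */
	ptr = (float far *) farmalloc(100000);

	for(i=0;i<1000;i++)
		*(ptr+i)=3.1415927;

}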

The key is to declare the pointer to be a far ptr to a type of float
and incrementation and pointer arithmetic will work.

	Dion Hollenbeck             (619) 455-5590 x2814
	Megatek Corporation, 9645 Scranton Road, San Diego, CA  92121

        uunet!megatek!hollen       or  hollen@megatek.uucp

c9h@psuecl.bitnet (11/01/89)

In article <4037@cs.yale.edu>, spolsky-joel@CS.YALE.EDU (Joel Spolsky) writes:
> In article <wZHEDrW00VoH479lIM@andrew.cmu.edu> hp0p+@andrew.cmu.edu (Hokkun Pang) writes:
>>>  2)  You really should use "ptr[i]" rather than "*(ptr+i)".  It effectively
>>>      does the same thing, but it makes for much cleaner and much more
>>>      readable code.
>>
>>I read from a book that "ptr[i]" will be converted to "*(ptr+i)" by the
>>compiler, so the "*(ptr+1)" is faster than "ptr[1]". Is that right?
>
>
> That's ridiculous. *(A+1), A[1], and 1[A] all produce exactly the same
> code. There is no reason to believe that the compiler prefers one over
> the other, or that the compiler implements A[1] by first expanding
> that to *(A+1). For all you know it does the opposite.
>
> And even if it did make a difference, it would be so small as to be
> imperceptible even on the world's slowest C compiler running on an
> HP-41C.

The point is that "ptr[i]" makes more sense to the average person than "*(ptr+i)",
and will, in fact, produce *exactly* the same code.

And it most certainly WILL NOT produce the same code as "(*ptr)+i"!  If
your compiler does this, then it is VERY non-standard.

BTW:  How long it takes the compiler to resolve an expression has very
      little to do with execution speed.

Anyway, this has turned into an argument about semantics, and if anywhere,
that should be constrained to comp.lang.c.


- Charles Hannum        |  Klein bottle for sale ...  |  Live long and prosper.
  c9h@psuecl.psu.edu    |  inquire within.            |
  cmh117@psuvm.psu.edu  |                             |  To life immortal!

dbin@norsat.UUCP (Dave Binette) (11/01/89)

In article <wZHEDrW00VoH479lIM@andrew.cmu.edu> hp0p+@andrew.cmu.edu (Hokkun Pang) writes:
>I read from a book that "ptr[i]" will be converted to "*(ptr+i)" by the
>compiler, so the "*(ptr+1)" is faster than "ptr[1]". Is that right?

It's possible but not always determinable.  In fact it may depend on the
compiler and the native CPU.

Some CPU's handle array indexing more efficiently than pointers so in either
case your kind of at the mercy of those who ported the compiler to your
machine.
-- 
OS2... The nightmare continues.
uucp:  {uunet,ubc-cs}!van-bc!norsat!dbin | 302-12886 78th Ave
bbs:   (604)597-4361     24/12/PEP/3     | Surrey BC CANADA
voice: (604)597-6298     (Dave Binette)  | V3W 8E7

johnl@esegue.segue.boston.ma.us (John R. Levine) (11/02/89)

In article <150@norsat.UUCP> dbin@norsat.UUCP (Dave Binette) writes:
>In article <wZHEDrW00VoH479lIM@andrew.cmu.edu> hp0p+@andrew.cmu.edu (Hokkun Pang) writes:
>>I read from a book that "ptr[i]" will be converted to "*(ptr+i)" by the
>>compiler, so the "*(ptr+1)" is faster than "ptr[1]". Is that right?
>...
>Some CPU's handle array indexing more efficiently than pointers so in either
>case your [sic] kind of at the mercy of those who ported the compiler to your
>machine.

Sheesh.  The two expressions are defined by the language to mean exactly the
same thing.  Anywhere you can use one of them, you can use the other and get
exactly the same result.  A C compiler should generate exactly the same code
for both.

In practice, every C compiler that I have seen converts ptr[i] to *(ptr+i) at
compile time, and then generates the code.  (Well, actually, I once did see a
compiler that generated different code, but it turned out to be a mutant PL/I
compiler rather than a C compiler.) I suppose there might be some microscopic
difference in compile speed between the two, but the runtime performance that
most people worry about will be identical.

This is one of the most frequently misunderstood parts of the C language.  If
you still aren't sure why the two expressions are equivalent, this would be a
good time to go back and reread your C books.
-- 
John R. Levine, Segue Software, POB 349, Cambridge MA 02238, +1 617 864 9650
johnl@esegue.segue.boston.ma.us, {ima|lotus|spdcc}!esegue!johnl
Massachusetts has over 100,000 unlicensed drivers.  -The Globe

2179ak@gmuvax2.gmu.edu (JDPorter) (11/16/89)

In article <4037@cs.yale.edu> spolsky-joel@CS.YALE.EDU (Joel Spolsky) writes:
>In article <wZHEDrW00VoH479lIM@andrew.cmu.edu> hp0p+@andrew.cmu.edu (Hokkun Pang) writes:
>>I read from a book that "ptr[i]" will be converted to "*(ptr+i)" by the
>>compiler, so the "*(ptr+1)" is faster than "ptr[1]". Is that right?
>That's ridiculous. *(A+1), A[1], and 1[A] all produce exactly the same
>code. There is no reason to believe that the compiler prefers one over

Sorry, I must disagree. (item #0: '1[A]' looks very alien and indigestible
to me.)
But to get to the point:
*(A+1) does NOT produce the same code as A[1].  (not for MSC, anyway.)
The first form takes the pointer, increments it by one (as a pointer 
entity), and dereferences it.
The second form places the specified index into an index register (or
offset register, if you prefer) and dereferences the pointer PLUS the
offset.  
In general, the SECOND form executes in FEWER cycles (contrary to the
C programmer's notion that pointers are always the most efficient.)

John Porter

johnl@esegue.segue.boston.ma.us (John R. Levine) (11/16/89)

In article <579@gmuvax2.gmu.edu> 2179ak@gmuvax2.UUCP (JDPorter) writes:
>>That's ridiculous. *(A+1), A[1], and 1[A] all produce exactly the same code.

>*(A+1) does NOT produce the same code as A[1].  (not for MSC, anyway.)

Congratulations, you've found an optimization bug in MSC.  In Turbo, assuming
that A is an int * passed as an argument, they all generate these two
instructions (picked verbatim from the generated .ASM):

	mov	bx,word ptr [bp+4]
	mov	ax,word ptr [bx+2]

Can we stop arguing about this now?
-- 
John R. Levine, Segue Software, POB 349, Cambridge MA 02238, +1 617 864 9650
johnl@esegue.segue.boston.ma.us, {ima|lotus|spdcc}!esegue!johnl
"Now, we are all jelly doughnuts."

alanf@bruce.OZ (Alan Grant Finlay) (11/17/89)

In Article <6689@esegue.segue.boston.ma.us> John R. Levine writes:

>In article <579@gmuvax2.gmu.edu> 2179ak@gmuvax2.UUCP (JDPorter) writes:
>>>>That's ridiculous. *(A+1), A[1], and 1[A] all produce exactly the same code.
>
>>*(A+1) does NOT produce the same code as A[1].  (not for MSC, anyway.)
>
>Congratulations, you've found an optimization bug in MSC.  In Turbo, assuming

As I am doing research in programming language semantics I can't resist putting
my oar in.  This is an issue I have much meditated upon in the past.  As I see
it the semantics of high level programming languages are best left independent
of efficiency specifications.  This is partly due to the need for machine
independence but also for more philosophical reasons.  With high level languages
the programmer wants to be able to say what is to be done without being
concerned about how it is done.  With assembly languages the reverse is the
case (i.e. if you can't do what you want efficiently then you change the
requirements).  Hence for high level languages we have optimisation of the 
generated code.  I have never heard of an optimiser intended for hand written
assembly code (except maybe rumours from the AI community).

This brings me to the problem child C which has characteristics of both
high and low level languages.  C is undoubtedly a popular language much to
my disgust.  If C replaces COBOL we will be no better off.  There, I've said
it, and will probably never live it down.  More seriously though what does
"Kernighan and Ritchie" say?  Page 94 (1978 edition): 

   "Rather more surprising, at least at first sight, is the fact that a
    reference to a[i] can also be written as *(a+i).  In evaluating a[i],
    C converts it to *(a+i) immediately; the two forms are completely
    equivalent."

Although the meaning of "equivalent" is not further specified we are given
the additional clue that a conversion on a syntactic level can be presumed
to have taken place.  A similar discussion in the appendix page 210 states:

   "By definition, the subscript operator [] is interpreted in such a
    way that E1[E2] is identical to *((E1)+(E2)).  Because of the
    conversion rules which apply to +, if E1 is an array and E2 an 
    integer, then E1[E2] refers to the E2-th member of E1.  Therefore
    despite its asymmetric appearance, subscripting is a commutative
    operation."

From the tone of the discussion I draw the following conclusions:

1) The equivalence referred to is "equivalence in effect" and does not
   dictate the means by which this effect is produced.  The language 
   manual occasionally refers to machine dependencies but hardly
   presumes to dictate the code generated.  In fact it states that:
   (page 212) "Some difficulties arise only when dubious coding 
   practices are used.  It is exceedingly unwise to write programs 
   which depend on any of these properties."

2) The C language assumes it is implemented on a certain class of machine 
   which we may broadly classify as "Von Neumann" or perhaps more accurately
   as "linear addressable data and code".  We may not presume that
   the instruction set architecture has index registers.  Some form of
   indirect addressing must be achievable. 

I think to assume that source code which is equivalent "by definition"
must generate the same (essentially) object code in a single code object,
is a dubious coding practice.  Although the C language is clearly defined
for efficient programming on a certain class of machine there are no
guarantees written into the language definition that such and such a way of
doing something will be more efficient than any other way.

C lovers please post your flames to "comp.lang.c" which I agree to read
for the next few weeks.

bl@infovax.UUCP (Bj|rn Larsson) (11/19/89)

In article <1697@bruce.OZ> alanf@bruce.OZ (Alan Grant Finlay) writes:
>
>In Article <6689@esegue.segue.boston.ma.us> John R. Levine writes:
>
>>In article <579@gmuvax2.gmu.edu> 2179ak@gmuvax2.UUCP (JDPorter) writes:
>>>>>That's ridiculous. *(A+1), A[1], and 1[A] all produce exactly the same code.
>>
>>>*(A+1) does NOT produce the same code as A[1].  (not for MSC, anyway.)
>>
>>Congratulations, you've found an optimization bug in MSC.  In Turbo, assuming
>
>As I am doing research in programming language semantics I can't resist putting
>my oar in.  This is an issue I have much meditated upon in the past.  As I see
>it the semantics of high level programming languages are best left independent
>of efficiency specifications.  This is partly due to the need for machine
>independence but also for more philosophical reasons.  With high level languages

I have not read the original article or most of the followups but as I
see it, it is best to use the syntax

	A[i]

to access an array element if A is actually an array which is accessed
directly, and

	*(A+i)

if you access an array element via a pointer (A in this case) which points
to the start of the array. In other words, in the first example, A is the
ARRAY ITSELF - in the second, A is a POINTER TO the array. To me, any
other practice is MISLEADING as to what is really going on in the hardware,
although the same effects are achieved. To me it is actually somewhat
unfortunate that C allows this kind of 'aliasing' for fundamentally
different access methods. The example

	i[A]

mentioned above should NOT work. In fact it is awful!  The compiler would
have to assume that i is a pointer, but of which type?  Besides this being
a syntax error, the compiler needs to know the size of the element type
that i points to, so it knows how much to multiply A (which is also not
an integer type) to yield the offset in BYTES from where i points... YUCK!!!
Still, this might actually work on machines that have the same size of
integers and pointers... and a very forgiving compiler indeed.

This is also the area where 'C' programmers who don't know assembly
language make the most mistakes. Sometimes you can see horrible errors
being made, and when one tries to explain why it is an error, they don't
understand - they just don't have the concept of 'primary memory' which
you can either write into directly or access indirectly via pointers.

Note I don't discuss optimizer efficiency above. But why would you expect
a compiler to actually generate the exact same code if the source code
is different? Wouldn't it be possible that an optimizer WILL find other
optimization methods (based on the surrounding statements) if the source
code is different? The important thing must be that if you give a function
some input, you get the correct output, independent of the algorithm and
coding practices.

To rely on the knowledge of how one specific compiler generates code I
consider bad practice - your assumptions will not hold if you port to
another compiler/environment. Of course, I'm stretching my point somewhat
here - you usually code in a way that is generally efficient - but be
aware that there may be machine architectures where 'normally efficient'
coding practices are LESS efficient. For example, DSPs (signal
processors) often need to be coded in a very different way, since they
allow a greater degree of parallel data transfer and instruction
pipelining, and 'normal' coding will not make use of these possible
advantages.

Furthermore, the speed differences caused by different optimizations
will probably be extremely minor. Usually, if you use code size
optimization with MSC or Turbo C (all versions) you'll only save a few
hundred bytes in a typical 32k program. When I have tried speed optimization,
it usually hasn't even been possible to measure any resulting improvements
when executing the code, although the machine code WAS indeed different
(but alas, not faster). Note: this is my experience for PCs - I think
under UNIX you will probably get a measurable improvement by using the
optimization flags (5-10% maybe?). Just don't expect miraculous
benefits!
-- 
 ====================== InfoVox = Speech Technology =======================
 Bjorn Larsson, INFOVOX AB      :      ...seismo!mcvax!kth!sunic!infovax!bl
 Box 2503                       :         bl@infovox.se
 S-171 02 Solna, Sweden         :         Phone (+46) 8 735 80 90