[comp.arch] FORTRAN Dhrystone for i860? [NO

mcdonald@uxe.cso.uiuc.edu (03/25/89)

>"Several years ago I saw the 386 assembly language output generated by the
>Greenhills C compiler for Dhrystone.  They inlined all the string functions,
>and turned constant strings into fixed length copies (no NUL-character
>recognition).

>I wouldn't be surprised if you're seeing the same thing for the i860
>results."

>Now, according to the letter of the law of Herr Doktor Weicker's Dhrystone 2.1
>writeup, it's OK to in-line strcpy and strcmp.  Unfortunately, in this
>particular case, a set of conditions exists that is not particularly frequent:
>	a) The source of the strcpy is a constant string, so the compiler
>		knows how long it is.
>	b) The target of the strcpy is something whose alignment is known
>		at compile time, or even better, can be aligned as the
>		compiler chooses.  [i.e., NOT a pointer to some unknown place.]
>Given these conditions, you can easily turn the copy into a structure-assignment
>equivalent, that need not inspect any bytes at all.

>1) Does anybody KNOW if the Green Hills compilers do this, in general?
>(or specifically:
>	the 386?

I don't know about Green Hills. But, I use the MicroWay NDPC compiler
on my PC. A discussion with a programmer there about a problem
I had elicited a comment: (This is a paraphrase) 
    "Our front end is basically a 4.3 front end [whether clone or
purchased I didn't ask JDM]. The back end ....
The optimizer we bought. I can't tell you where we bought it, but
it is a company well known in the Unix C field. They seem to make a really
nice optimizer."


>2) Could you post the assembly code, if there is something like this?

Certainly - the .S file is full of things like (to copy 31 bytes 
including the trailing zero:)

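	mov	ecx,7			; 7 dwords = 28 bytes
	lea	esi,dword ptr ds:L98	; source (L98, presumably the constant string)
	push	edi			; save edi around the copy
	lea	edi,dword ptr [eax]+16	; destination: offset 16 from eax
	rep	movsd			; copy 28 bytes, 4 at a time
	movsw				; 2 more bytes (30 so far)
	movsb				; last byte, the trailing zero (31)
	pop	edi			; restore edi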

For the 386 and one particular compiler (with the -OLM switch),
the smoking gun. 
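
For anyone who doesn't read 386 assembler, here is a rough C sketch of
what that sequence amounts to. It is only an illustration, not the
compiler's actual source-level transformation: the function names and the
SRC array are made up, and I am assuming the string being copied is one
of Dhrystone's 30-character constants.

```c
#include <assert.h>
#include <string.h>

/* Assumed source string: 30 characters plus the trailing zero = 31 bytes. */
static const char SRC[31] = "DHRYSTONE PROGRAM, SOME STRING";

/* What the benchmark source asks for: copy until a NUL is seen. */
static void copy_with_strcpy(char *dst)
{
    strcpy(dst, "DHRYSTONE PROGRAM, SOME STRING");
}

/* What the generated code does: a fixed 31-byte copy, no NUL test. */
static void copy_fixed(char *dst)
{
    assert(sizeof SRC == 31);
    memcpy(dst,      SRC,      28);  /* rep movsd with ecx = 7 */
    memcpy(dst + 28, SRC + 28, 2);   /* movsw */
    memcpy(dst + 30, SRC + 30, 1);   /* movsb, the trailing zero */
}
```

Both routines leave the same 31 bytes in the destination; the second one
never looks at the data to decide when to stop.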

Doug McDonald

grunwald@flute.cs.uiuc.edu (03/26/89)

re: does greenhills auto-inline strcpy & strlen, etc?

yes, they do, on the 386 compiler we have. In addition, they also inline
cos/sin/tan automatically, and possibly some other functions (memcpy, etc)
as well.

I noticed this when comparing Gnu C to Greenhills C on a 386. Once you discount
the strcpy hacks, Gnu C is about the same speed (this was 1.30, 1.34 and up
should be better due to loop opt. changes).
--
Dirk Grunwald
Univ. of Illinois
grunwald@flute.cs.uiuc.edu

meissner@dg-rtp.dg.com (Michael Meissner) (04/06/89)

In article <GRUNWALD.89Mar26010129@flute.cs.uiuc.edu> grunwald@flute.cs.uiuc.edu writes:
| 
| re: does greenhills auto-inline strcpy & strlen, etc?
| 
| yes, they do, on the 386 compiler we have. In addition, they also inline
| cos/sin/tan automatically, and possibly some other functions (memcpy, etc)
| as well.
| 
| I noticed this when comparing Gnu C to Greenhills C on a 386. Once you discount
| the strcpy hacks, Gnu C is about the same speed (this was 1.30, 1.34 and up
| should be better due to loop opt. changes).

I've been meaning to post for some time about my experiences with GNU
and dhrystone.  For our 88k machines, Data General decided to go with
the GNU compiler for its default compiler, for various reasons,
instead of using Greenhills.  Anyway, when the GNU compiler was
complete enough to run benchmarks, we naturally ran dhrystone, and
found that for the 20 MHz Motorola Anglefire, the then current
Greenhills compiler came in at about 31,000 dhrystones, whereas the
best I could get GNU was around 19,000.

Needless to say this caused some soul searching, and poring over the
generated code.  The only thing that we found that was substantially
different was that the Greenhills compiler had changed the 4 strcpy
calls into calls to an internal function with the two pointers and the
length.  GNU was just calling strcpy directly.  By editing the
assembly language output to call the same routine Greenhills did, the
GNU compiler suddenly spurted to 33,000 dhrystones.
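
To make the rewrite concrete, here is a minimal C sketch of the idea.
The names copy_known_length and STRCPY_LITERAL are invented for
illustration (they are not the Greenhills routine or its interface), and
the macro only stands in for what the compiler does internally when it
sees a string literal as the second argument.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the compiler's internal copy routine:
 * two pointers and a length, so no byte needs to be tested for NUL. */
static char *copy_known_length(char *dst, const char *src, size_t len)
{
    return memcpy(dst, src, len);
}

/* Models the compile-time rewrite.  Only valid when `lit` is a string
 * literal (or array), so that sizeof gives its length including the NUL. */
#define STRCPY_LITERAL(dst, lit) \
    copy_known_length((dst), (lit), sizeof(lit))
```

A call such as STRCPY_LITERAL(buf, "DHRYSTONE PROGRAM, 2'ND STRING")
then passes a length of 31, fixed at compile time.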

In addition to high amounts of time being consumed by strcpy, we also
discovered that naive single-byte strcmp's will dramatically lower
your dhrystones as well.

Note, I think it perfectly legitimate to optimize calls to strcpy
and friends if you can do it in a safe fashion.  I don't think that
strcpy's where the second argument is a constant string are as
frequent as normal strcpy's, so I tend to think that dhrystone in this
case is not measuring typical integer performance.

We eventually modified the compiler to optimize strcpy, and memcpy
(and to a limited extent, strcmp, and memcmp).


Michael Meissner, Data General.
Uucp:		...!mcnc!rti!xyzzy!meissner		If compiles were much
Internet:	meissner@dg-rtp.DG.COM			faster, when would we
Old Internet:	meissner%dg-rtp.DG.COM@relay.cs.net	have time for netnews?