[comp.lang.c] Using

cik@l.cc.purdue.edu (Herman Rubin) (08/03/88)

In article <11699@steinmetz.ge.com>, davidsen@steinmetz.ge.com (William E. Davidsen Jr) writes:
> In article <37406@linus.UUCP> munck@faron.UUCP (Robert Munck) writes:

> | Mixing languages is not a terrific idea if your program is to be
> | maintained and enhanced over the years.  Languages change, too, and
> | trying to keep up with diverging languages...
  
>   A few words on that... One way to preserve portability is to get the
> whole program working in C, and identify the problem areas. Then the
> assembler output of C can be massaged by hand for efficiency. By always
> starting with compiler output you avoid having the C source and the
> assembler source out of phase to the point where the C won't work
> anymore.
> 
>   I know this is clumsy, that's why I recommend it. It will make you
> think before spending a lot of time trying to make small gains in
> performance, as opposed to rethinking the whole algorithm.

But what do you recommend if your languages (C, FORTRAN, etc.) are
incapable of doing even a remotely reasonable job with the program?
I can provide relatively simple examples of programs which can be
handled in a reasonably portable manner, but for which C does not 
have the concepts needed at all.  These are not esoteric concepts;
anyone who understands that machine words are composed of bits in
a structured manner, and that the bits of a structure can be used in
any manner compatible with the structure, can understand them.

Another problem is the fact that what one can do with a single instruction
on one machine takes many on another.  An example is to round a double 
precision number to the nearest integer.  If you think this is unimportant,
every trigonometric and exponential subroutine, or a subroutine computing
elliptic functions, etc., uses this or something similar.  On some machines,
this should be a moderately lengthy series of instructions.  On others, there
is a machine instruction which does precisely this.  Now many (all that I have
seen, but I have been told that there are ones which do not have this problem)
C compilers will not allow asm instructions to use the compiler's knowledge of
which registers are assigned to which variables.
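
For concreteness, here is a minimal C sketch of the rounding operation under
discussion.  It is the usual add-one-half-and-truncate idiom, not any
particular library's routine.  The name round_nearest is made up for
illustration, and the sketch assumes the result fits in a long and ignores
the boundary cases noted in the comments.  Whether a given compiler turns
this into the single instruction that some machines provide is not something
the source can control.

```c
/* round_nearest: hypothetical helper, for illustration only.
 * Rounds x to the nearest integer, with halves going away from zero.
 * Assumes the result fits in a long.  Not exact at the boundaries:
 * x + 0.5 is itself rounded, so an input a hair below one half
 * (or a very large one) can come out one too high. */
static long round_nearest(double x)
{
    if (x >= 0.0)
        return (long)(x + 0.5);   /* conversion truncates toward zero */
    return (long)(x - 0.5);
}
```

On a machine with a round-to-integer instruction the whole body is one
operation; elsewhere it is an add, a compare, and a conversion.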

C cannot do the job.  The present assemblers have the disadvantage that they
are atrociously difficult to read and write.  This is not necessary; COMPASS
for the CDC6x00 and its descendants were easier, but not easy enough, and CAL
on the CRAY easier still, but more can be done.  I wonder how difficult it 
would be to use an overloaded operator weakly typed assembler.  Some think that
C is this; maybe it was intended as a replacement for such an object, but it
fails badly.

-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet, UUCP)

ok@quintus.uucp (Richard A. O'Keefe) (08/04/88)

In article <856@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes:
>Another problem is the fact that what one can do with a single instruction
>on one machine takes many on another.  An example is to round a double 
>precision number to the nearest integer.  If you think this is unimportant,
>every trigonometric and exponential subroutine, or a subroutine computing
>elliptic functions, etc., uses this or something similar.  On some machines,
>this should be a moderately lengthy series of instructions.  On others, there
>is a machine instruction which does precisely this.  Now many (all that I have
>seen, but I have been told that there are ones which do not have this problem)
>C compilers will not allow asm instructions to use the compiler's knowledge of
>which registers are assigned to which variables.

I am sympathetic to Rubin's position, but I can't quite tell which side
he is arguing on here.  Let's replace his example by a still humbler one:
integer multiplication.  I have to include the operation of integer
multiplication in assembly code generated by a "smart" macro processor.
MUL doesn't quite take the non-portability prize; I think DIV does that.
But even MUL is a subroutine call on some machines (not always obeying
the protocol used elsewhere), an instruction on some, a sequence of 20-40
instructions on others.  It is a relief to turn to C (pathetic though a
language which doesn't report overflows may be) and not have to worry
about that[*].  The situation with floating-point is worse.  Once you start
porting generated assembly code between machines with 32/16/11/8 general
registers and varying numbers and arrangements of floating-point registers,
it pretty soon occurs to you that it doesn't matter whether the compiler
will "allow asm instructions to use the compiler's knowledge of which
registers are assigned to which variables" or not, it's not going to port
_nohow_.

If you want to mingle small chunks of assembly code with C code, I'm
convinced that asm("..") is obsolete: /usr/lib/inline is so much better
(e.g. you can switch from C implementations of such functions to inline
assembly code without changing your source, and better still, change _back_).
A trap for asm("...") users: some C compilers switch off optimisation in a
function if you use asm(), even if all you did was plant a label or comment.

[*] I have seriously considered writing C routines to do integer *, /, %,
and calling them from the generated assembly code.  What was that about using
assembly code to get access to machine instructions?
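
As a sketch of what such a C routine might look like, including the overflow
report that C itself does not give, the following checks the operands against
INT_MAX and INT_MIN before multiplying.  The name checked_mul and its return
convention are assumptions for illustration only, and the range checks assume
integer division truncates toward zero.

```c
#include <limits.h>

/* checked_mul: hypothetical helper, for illustration only.
 * Stores a*b in *out and returns 1 if the product fits in an int;
 * returns 0 and leaves *out untouched if it would overflow.
 * Assumes integer division truncates toward zero. */
static int checked_mul(int a, int b, int *out)
{
    if (a > 0) {
        if (b > 0) {
            if (a > INT_MAX / b) return 0;
        } else {
            if (b < INT_MIN / a) return 0;
        }
    } else {
        if (b > 0) {
            if (a < INT_MIN / b) return 0;
        } else {
            if (a != 0 && b < INT_MAX / a) return 0;
        }
    }
    *out = a * b;   /* safe: the range checks above passed */
    return 1;
}
```

The generated assembly code would then call this routine instead of emitting
a machine-specific MUL sequence.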

smryan@garth.UUCP (Steven Ryan) (08/05/88)

>                                                 I wonder how difficult it 
>would be to use an overloaded operator weakly typed assembler.  Some think that
>C is this; maybe it was intended as a replacement for such an object, but it
>fails badly.

Compass macros can implement something akin to overloaded operators and typed
operands, although the 8-character limitation is difficult.

On the 205 assembler, Meta, our project had procs (=macros) to do things like

          WHILE      P,EQ,TRUE                    while p do
          IF         [T_EXP,L,EXP_OPCD],EQ,O_RADD   if t_exp[l].exp_opcd=o_radd
          THEN                                       then
          GFIELD     T_EXP,L,EXP_OP1,L                l:=t_exp[l].exp_op1
          ELSE                                       else
          CALL       XYZ,[L,IN],[P,OUT]               xyz(l in,p out)
          IFEND                                     ifend
          WHILEND                                  whileend

daveb@geac.UUCP (David Collier-Brown) (08/05/88)

From article <856@l.cc.purdue.edu>, by cik@l.cc.purdue.edu (Herman Rubin):
> But what do you recommend if your languages (C, FORTRAN, etc.) are
> incapable of doing even a remotely reasonable job with the program?

>...   Now many (all that I have
> seen, but I have been told that there are ones which do not have this problem)
> C compilers will not allow asm instructions to use the compiler's knowledge of
> which registers are assigned to which variables.

  There are two problems here: expressive power of the language and
quality of the compiler/embedded-assembler interface.

Problem 1: Language.
  Get a better language.  No smileys here, it's a serious suggestion.
C may be my favorite language, but I won't claim it is either modern
or consistently well-implemented.  Use FORTRAN for numerical
problems if C gets in the way, C for structure-shuffling if FORTRAN
gets in the way.  Use Ada or C++ if the non-extensibility of either
gets in the way.  

Problem 2: Compilers.
  Get a good one, for whatever language you buy.  Make sure it will
use the machine's standard[1] call interface if one exists, so you can
mix languages.  

   An anecdote: I once used an Ada[2] cross-compiler, back when Ada
was **real** new and implementations tended to be weak.  When I had
to generate an in-line trap, I discovered I could state it all in
the HLL.  One declaration of a bit pattern (which was the
instruction), one package definition to define the interface to it
and umpty-dozen pragmas to say 
	put it in-line
	parameter A must be in register An and An+1
	parameter B must be in register R1
   Net result? It generated almost "perfect" code for the trap,
without any register-moves or call-return linkages.  This is
extensible to modeling almost any machine instruction[3], and one can
always re-write the package to turn it into a different instruction
(or sequence of instructions) if you have to port it.


  --OR--
  You can try to invalidate the problem, and make the compiler
generate optimal code.  Yes, I said MAKE.

  Another anecdote: last year, I had to speed up a critical inner
loop in a very large system written in Pascal.  The compiler **would
not** accept a declaration for the construct that was needed (a
pointer to an array greater than its normal maximum size).  It was
initially done by allocating the array in the non-pascal startup
code and using inline assembler to access it.  It was ***slow***.
  When I looked at the code generated, inlining had only saved me
the call-return instruction pair, not the required register/stack
setup for the call.
  So I wrote out the addressing expression in Pascal, as arithmetic.
The code it generated was one instruction "poorer" than the optimum,
because it moved the result from the general registers to the
address registers at the end.  This was 6 instructions better than
the inline assembler, and 8 better than out-of line instructions.
(The move occurred as a result of my using an explicit union: if the
language understood casts, it might have disappeared too).
  Result?  We got our speedup, even though Pascal "knew" it wasn't
supposed to be that helpful.
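
  The Pascal source is not shown here, so the following is only a C
sketch of the general idea of writing an addressing expression out as
arithmetic: the address of element i is the base address plus i times
the element size.  The names elem_addr, base, i and elem_size are
hypothetical.

```c
#include <stddef.h>

/* elem_addr: hypothetical helper, for illustration only.
 * Computes the address of element i in a block of equal-sized
 * elements starting at base, using plain arithmetic rather than
 * a declared array type. */
static void *elem_addr(void *base, size_t i, size_t elem_size)
{
    return (char *)base + i * elem_size;
}
```

  Because the expression is ordinary arithmetic, the compiler is free
to optimize it like any other expression.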

  So don't despair, you can get out of almost any problem in
Confuser Science: you just have to go back to first principles and
attack the meta-problem.

--dave (no cute statement today) c-b
[1] Including de-facto ones, if the hardware one is yucky...
[2] Ada is a trademark of the U.S. Gov't (Ada Joint Project Office)
[3] This is generally a super-non-portable idea, though.
-- 
 David Collier-Brown.  |{yunexus,utgpu}!geac!daveb
 Geac Computers Ltd.,  |  Computer science loses its
 350 Steelcase Road,   |  memory, if not its mind,
 Markham, Ontario.     |  every six months.

dsill@NSWC-OAS.ARPA (Dave Sill) (08/16/88)

l.cc.purdue.edu!cik@k.cc.purdue.edu  (Herman Rubin) writes:
>Another problem is the fact that what one can do with a single instruction
>on one machine takes many on another.  An example is to round a double 
>precision number to the nearest integer.  If you think this is unimportant,
>every trigonometric and exponential subroutine, or a subroutine computing
>elliptic functions, etc., uses this or something similar.

It's not that "we" think it's unimportant.  It's kinda like saying
"This hammer is inadequate.  I can't even drive screws with it."

>I wonder how difficult it would be to use an overloaded operator
>weakly typed assembler.  Some think that C is this; maybe it was
>intended as a replacement for such an object, but it fails badly.
 
Yes and no.  As we all know, C was designed for systems programming.
It provides the system programmer with an easier to use interface to
the machine than assembler, is far more portable than assembler, and
in most cases obviates the need to use assembler for system
programming.  If you expect C to do more, i.e., be a portable
assembler for all machines, of course it will have some problems.
 
As Kernighan and Ritchie say in their recent article in Byte (I'm
paraphrasing), it's a tribute to C that there is so little assembler
programming being done by systems programmers today.  If you've ever
had the "honor" of doing systems programming on a system that didn't
have C, you can really appreciate this.

smryan@garth.UUCP (Steven Ryan) (08/19/88)

>As Kernighan and Ritchie say in their recent article in Byte (I'm
>paraphrasing), it's a tribute to C that there is so little assembler
>programming being done by systems programmers today.  If you've ever
>had the "honor" of doing systems programming on a system that didn't
>have C, you can really appreciate this.

Well, I've done a few thousand lines of Compass (assembly) for my own
programming environment. (Actually I'm not sure how much: three binders
of double-sided laser printer listings.)

I'm not sure if that constitutes system programming, because part of what I did
was build a multitasking executive between the rest of my code and NOS. Oh, and
I rewrote the I/O interface to handle various character sets, command line
parsing, strange file structures (no EOR before EOI, as when SFM attaches
the job dayfile).

It was an honour. And alotta fun. Now I have the `honour' of learning C and
Unix. And I thought Cybil and NOS 2 had problems.

----------------------------------------------------------------------------
Well, that was perhaps more bitchy than it really had to be. Just like
Wirth does for his toys, K+R built their prejudices into C. I happen to
disagree. That's fine. But I don't appreciate this waxing rapturously over
some particular language and casting every other language into the Outer
Darkness.

Perhaps, I should go away if nobody wants to listen. Problem is I've got
a bit of that old mad prophet.

hashem@mars.jpl.nasa.gov (Basil Hashem) (04/12/90)

I am not sure this is the most appropriate group, but let me try.

I'm using the cpp program (the SunOS C pre-processor) for filtering some 
files which are not C programs.

I have been successful in doing #include's, #ifdef's, and #define's. What I
need to do is pass a file such as this one through cpp.  (Don't worry
about the extra lines and comment fields.)

----
#define LANGUAGE English

#if (LANGUAGE == French)
#define GREETING       Bonjour
#define FAREWELL       Salut
#define GENTLEMEN      Messieurs
#else
#define GREETING       Good Morning
#define FAREWELL       Bye
#define GENTLEMEN      Sirs
#endif

Dear GENTLEMEN,

GREETING

blah blah

FAREWELL
----

The results are not what is expected since cpp doesn't do the compare properly.
What I really need is a strcmp(LANGUAGE, "French") but I can't really do
that considering this is not C.

Anyone have suggestions?

                "WOOOSH!  If they only knew what I was up to."
                                 Basil Hashem                                   
                            hashem@mars.jpl.nasa.gov
            Jet Propulsion Laboratory     La Canada Flintridge, CA

diamond@tkou02.enet.dec.com (diamond@tkovoa) (04/12/90)

In article <3358@jato.Jpl.Nasa.Gov> hashem@mars.jpl.nasa.gov (Basil Hashem) writes:

>I am not sure this is the most appropriate group, but let me try.

It is.

>#if (LANGUAGE == French)
>#define GREETING       Bonjour
>#else
>#define GREETING       Good Morning
>#endif

>What I really need is a [preprocessing-time] strcmp(LANGUAGE, "French")
>but I can't really do that considering this is not C.

In fact, some programs (including a certain famous C-compiler itself, pcc)
need a similar feature.  "If you want PL/I, you know where to find it."

The way pcc solves it is by #include of a header file with such things as:
  #define English 1
  #define French 2
  #define German 3

Since un"define"d preprocessor identifiers turn into zeroes in #if expressions,
there is an advantage to not using 0 for a valid option.  You can check:
  #if (LANGUAGE == 0)
  #error You set LANGUAGE to an unknown language (or you forgot to set it).
  #else
  ...
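
Filled out, the scheme might look like the sketch below.  The German greeting
text, the choice of French as the setting, and the greeting() helper are
assumptions added for illustration.  String literals are used so that the
result compiles as C; a plain-text file fed through cpp would use bare words
instead, as in the original question.

```c
/* Codes start at 1: an identifier that is not #define'd evaluates
 * to 0 in an #if expression, so 0 is reserved to mean "unknown". */
#define English 1
#define French  2
#define German  3

#define LANGUAGE French   /* the setting; chosen for this sketch */

#if (LANGUAGE == 0)
#error You set LANGUAGE to an unknown language (or you forgot to set it).
#elif (LANGUAGE == French)
#define GREETING "Bonjour"
#elif (LANGUAGE == German)
#define GREETING "Guten Morgen"
#else
#define GREETING "Good Morning"
#endif

/* Hypothetical accessor so the selected text can be inspected from C. */
static const char *greeting(void)
{
    return GREETING;
}
```

Without the three numeric #define's, both LANGUAGE (expanding to English) and
French would evaluate to 0, so the comparison would always succeed.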

-- 
Norman Diamond, Nihon DEC     diamond@tkou02.enet.dec.com
This_blank_intentionally_left_underlined________________________________________

davidm@uunet.UU.NET (David S. Masterson) (04/13/90)

In article <3358@jato.Jpl.Nasa.Gov> hashem@mars.jpl.nasa.gov (Basil Hashem) writes:

   #define LANGUAGE English

   #if (LANGUAGE == French)

   [etc...]
   ----

Hmmmm...  Has the M4 preprocessor lost favor?  I know there is a portable
implementation of it. 
--
===================================================================
David Masterson					Consilium, Inc.
uunet!cimshop!davidm				Mt. View, CA  94043
===================================================================
"If someone thinks they know what I said, then I didn't say it!"